Functions

Functions in BioLang are first-class values. They can be assigned to variables, passed as arguments, and returned from other functions. BioLang supports named function declarations, anonymous lambdas, closures, default parameters, and variadic arguments.

Function Declarations

Functions are declared with the fn keyword. The return type is optional and can be inferred:

# Simple function
fn greet(name: String) -> String {
  f"Hello, {name}!"
}

# The last expression is the return value (no semicolon needed)
fn add(a: Int, b: Int) -> Int {
  a + b
}

# Function that wraps a built-in
fn gc_ratio(seq: DNA) -> Float {
  gc_content(seq)
}

Type Inference for Return Types

When the return type is omitted, BioLang infers it from the function body:

# Return type inferred as Float
fn average(values) {
  sum(values) / len(values)
}

# Return type inferred as String
fn format_gene(name, chrom, pos) {
  f"{name} at {chrom}:{pos}"
}

# Return type inferred as List[DNA]
fn extract_codons(seq) {
  seq |> kmers(3) |> filter(|c| len(c) == 3)
}

Lambda Expressions

Lambdas are anonymous functions written with pipe-delimited parameters. They are used extensively with higher-order functions like map, filter, and reduce:

# Single-parameter lambda
let double = |x| x * 2
print(double(21))   # 42

# Multi-parameter lambda
let add = |a, b| a + b
print(add(3, 4))    # 7

# Lambda with block body
let classify = |score| {
  if score >= 0.9 { "high" }
  else if score >= 0.5 { "medium" }
  else { "low" }
}

# Lambdas in pipe chains
let results = scores
  |> map(|s| s * 100.0)
  |> filter(|s| s >= 50.0)
  |> map(|s| round(s, 2))

Function Expression Syntax

Inline function definitions can carry type annotations on their parameters. The examples below use the pipe-delimited lambda form with typed parameters:

# Lambda with typed parameters
let normalize = |values: List[Float]| {
  let mu = mean(values)
  let sd = stdev(values)
  values |> map(|x| (x - mu) / sd)
}

# Short lambda expression
let square = |x: Int| x * x

Closures

Lambdas and function expressions capture variables from their enclosing scope, forming closures:

# Closure captures 'threshold'
fn make_filter(threshold: Float) {
  |value| value >= threshold
}

let high_qual = make_filter(30.0)
let reads = all_reads |> filter(high_qual)

# Closure captures 'reference'
fn make_aligner(reference: DNA) {
  |query| align(query, reference, method = "sw")
}

let align_to_hg38 = make_aligner(load_reference("hg38.fa"))
let alignments = reads |> map(align_to_hg38)
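
Each call to a closure-returning function produces a closure with its own captured value. The sketch below reuses make_filter from above; the names lenient and strict are illustrative, and the comments describe the comparison each closure performs rather than any printed output:

# Two closures built from the same function, each with its own threshold
let lenient = make_filter(20.0)
let strict = make_filter(30.0)

lenient(25.0)   # evaluates 25.0 >= 20.0
strict(25.0)    # evaluates 25.0 >= 30.0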

Default Parameters

Parameters can have default values. Parameters with defaults must come after required parameters:

fn trim_reads(reads, min_quality = 20, min_length = 50, adapter = "AGATCGGAAGAG") {
  # adapter is declared to illustrate a String default; this body does not use it
  reads
    |> map(|r| trim_quality(r, min_quality))
    |> filter(|r| r.length >= min_length)
}

# Call with defaults
let trimmed = trim_reads(raw_reads)

# Override specific defaults
let trimmed = trim_reads(raw_reads, min_quality = 30)
let trimmed = trim_reads(raw_reads, min_length = 100, adapter = "CTGTCTCTTAT")

Named Arguments

Any function can be called with named arguments for clarity. Named arguments can appear in any order after positional arguments:

fn call_variants(bam, reference, min_depth = 10, min_qual = 20.0, caller = "gatk") {
  # ...
}

# Positional
call_variants("sample.bam", "hg38.fa", 15, 30.0, "bcftools")

# Named — much clearer
call_variants("sample.bam", "hg38.fa",
  caller = "bcftools",
  min_depth = 15,
  min_qual = 30.0
)
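
Named arguments also make it possible to override a later default while leaving earlier ones untouched. The call below is a sketch that reuses call_variants from above and assumes that omitted parameters keep their declared defaults, as in the Default Parameters section:

# Override only the last default; min_depth stays 10 and min_qual stays 20.0
call_variants("sample.bam", "hg38.fa", caller = "bcftools")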

Variadic Functions

Functions can accept a variable number of arguments using the ... syntax. The variadic parameter is received as a List:

fn merge_tables(...tables: List[Table]) -> Table {
  tables |> reduce(|acc, t| concat(acc, t))
}

let combined = merge_tables(table_a, table_b, table_c)

# Variadic with required params
fn log(level: String, ...messages: List[String]) {
  let text = messages |> join(" ")
  print(f"[{level}] {text}")
}

log("INFO", "Processing", "sample", "001")
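
Because the variadic parameter arrives as a List, it can be fed straight into the usual list functions. The following is a minimal sketch; total_length and the variables exon_1, exon_2, and exon_3 are hypothetical names. It assumes at least one argument is passed, since the behavior of reduce on an empty list is not covered here:

# Hypothetical helper: sums the lengths of any number of sequences
fn total_length(...seqs: List[DNA]) -> Int {
  seqs |> map(|s| len(s)) |> reduce(|a, b| a + b)
}

let n = total_length(exon_1, exon_2, exon_3)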

Higher-Order Functions

Functions that take or return other functions are a core pattern in BioLang:

# map, filter, reduce — the classic trio
let lengths = sequences |> map(|s| len(s))
let long_seqs = sequences |> filter(|s| len(s) > 1000)
let total_len = sequences |> map(|s| len(s)) |> reduce(|a, b| a + b)

# compose — combine functions
fn compose(f, g) {
  |x| f(g(x))
}

let process = compose(upper, trim)
let clean = " hello " |> process   # "HELLO"

# apply to each element with index
let indexed = items |> enumerate() |> map(|(i, item)| f"{i}: {item}")
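
A user-defined higher-order function follows the same pattern as compose. The sketch below uses a hypothetical helper, apply_twice, that takes a function and a value and applies the function two times:

# Hypothetical helper: applies f to x, then applies f to the result
fn apply_twice(f, x) {
  f(f(x))
}

let double = |x| x * 2
print(apply_twice(double, 5))   # 20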

Recursive Functions

Functions can call themselves by name:

# Fibonacci
fn fib(n: Int) -> Int {
  if n <= 1 { n }
  else { fib(n - 1) + fib(n - 2) }
}

# Tree traversal
fn count_leaves(node) -> Int {
  match node {
    Leaf(_) => 1,
    Branch(left, right) => count_leaves(left) + count_leaves(right)
  }
}

Early Return

Use return to exit a function before its last expression. Without an explicit return, the last expression is the implicit return value:

fn find_gene(name: String, database: Map[String, Gene]) -> Option[Gene] {
  if name == "" {
    return None
  }

  let normalized = upper(trim(name))

  if normalized in database {
    Some(database[normalized])
  } else {
    None
  }
}
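
A caller would typically branch on the returned Option. The sketch below assumes match accepts Some and None patterns in the same way it accepts Leaf and Branch in count_leaves; genes is a hypothetical Map[String, Gene] whose keys are upper-case symbols:

# " tp53 " is non-empty, so the early return is skipped;
# it is then normalized to "TP53" before the lookup
match find_gene(" tp53 ", genes) {
  Some(_) => print("found"),
  None => print("no match")
}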

Pipeline Functions

A common pattern is defining reusable pipeline stages as functions:

# Define pipeline stages
fn quality_filter(reads, min_q = 20) {
  reads |> filter(|r| mean_phred(r.quality) >= min_q)
}

fn length_filter(reads, min_len = 50) {
  reads |> filter(|r| r.length >= min_len)
}

fn adapter_trim(reads, adapter = "AGATCGGAAGAG") {
  # Placeholder body: applies quality trimming only; adapter is not used here
  reads |> map(|r| trim_quality(r, 20))
}

# Compose into a pipeline
let processed = raw_reads
  |> quality_filter(min_q = 30)
  |> adapter_trim()
  |> length_filter(min_len = 75)
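
The stages can also be wrapped in a single function so the whole pipeline is reusable. This is a sketch built only from the stage functions defined above; preprocess and other_reads are hypothetical names, and it assumes a named argument may be given a local variable of the same name:

# Hypothetical wrapper that bundles the three stages
fn preprocess(reads, min_q = 30, min_len = 75) {
  reads
    |> quality_filter(min_q = min_q)
    |> adapter_trim()
    |> length_filter(min_len = min_len)
}

let processed_a = preprocess(raw_reads)
let processed_b = preprocess(other_reads, min_len = 100)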

Functions as Values

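Because functions are values, they can be stored in data structures such as maps and selected at runtime:

# Store functions in a map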
let normalizers = {
  "zscore": |vals| {
    let mu = mean(vals)
    let sd = stdev(vals)
    vals |> map(|x| (x - mu) / sd)
  },
  "minmax": |vals| {
    let lo = min(vals)
    let hi = max(vals)
    vals |> map(|x| (x - lo) / (hi - lo))
  },
  "log2": |vals| vals |> map(|x| log2(x + 1.0))
}

# Select normalizer dynamically
let method = "zscore"
let normed = normalizers[method](raw_values)
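
Indexing a map with a key it does not contain is not covered here, so a lookup driven by user input may be worth guarding. The following sketch assumes the in operator works on this map the same way it does in find_gene above; normalize_with is a hypothetical helper name:

# Hypothetical helper: returns None for an unknown method name
fn normalize_with(method: String, values) {
  if method in normalizers {
    Some(normalizers[method](values))
  } else {
    None
  }
}

let scaled = normalize_with("minmax", raw_values)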

Where Clauses

Functions can declare preconditions using where clauses. If the condition fails at call time, an assertion error is raised:

fn normalize(counts) where len(counts) > 0 {
  let total = sum(counts)
  counts |> map(|c| c / total)
}

fn align_reads(reads, reference) where len(reads) > 0 && len(reference) > 100 {
  # alignment logic
}
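
A where clause is a convenient way to rule out a division by zero before the body runs. The sketch below applies one to a function shaped like average from the Type Inference section; mean_depth is a hypothetical name:

# Hypothetical helper: the precondition guards the division by len(depths)
fn mean_depth(depths) where len(depths) > 0 {
  sum(depths) / len(depths)
}

With this precondition in place, calling mean_depth on an empty list would raise the assertion error described above instead of reaching the division.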

Decorators

BioLang supports decorators that modify function behavior. The @memoize decorator caches results keyed by the call's arguments, which is useful for expensive computations such as sequence alignment scores or API calls:

@memoize
fn fetch_gene_info(symbol) {
  ncbi_gene(symbol)
}

# First call fetches from NCBI, subsequent calls return the cached result
let tp53 = fetch_gene_info("TP53")
let tp53_again = fetch_gene_info("TP53")  # served from the cache
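
Memoization is also a natural fit for recursive functions with overlapping subproblems, such as fib from the Recursive Functions section. The sketch below assumes that recursive calls inside the body go through the memoized function, which this page does not confirm:

# Same fib as before, with results cached per argument
@memoize
fn fib(n: Int) -> Int {
  if n <= 1 { n }
  else { fib(n - 1) + fib(n - 2) }
}

print(fib(10))   # 55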

Multi-line Lambdas

For lambdas that need more than one expression, use a block body:

let process = |reads| {
  let filtered = reads |> filter(|r| mean_phred(r.quality) > 20)
  let gc = filtered |> map(|r| gc_content(r.seq))
  {count: len(filtered), mean_gc: mean(gc)}
}

# Use in pipe chains
samples |> map(|s| {
  let data = read_fastq(s.path)
  let stats = describe(data)
  {sample: s.name, ...stats}
})