Functions
Functions in BioLang are first-class values. They can be assigned to variables, passed as arguments, and returned from other functions. BioLang supports named function declarations, anonymous lambdas, closures, default parameters, and variadic arguments.
Function Declarations
Functions are declared with the fn keyword. The return type is optional
and can be inferred:
# Simple function
fn greet(name: String) -> String {
f"Hello, {name}!"
}
# The last expression is the return value (no semicolon needed)
fn add(a: Int, b: Int) -> Int {
a + b
}
# Function that wraps a built-in
fn gc_ratio(seq: DNA) -> Float {
gc_content(seq)
}
Type Inference for Return Types
When the return type is omitted, BioLang infers it from the function body:
# Return type inferred as Float
fn average(values) {
sum(values) / len(values)
}
# Return type inferred as String
fn format_gene(name, chrom, pos) {
f"{name} at {chrom}:{pos}"
}
# Return type inferred as List[DNA]
fn extract_codons(seq) {
seq |> kmers(3) |> filter(|c| len(c) == 3)
}
Lambda Expressions
Lambdas are anonymous functions written with pipe-delimited parameters. They are used
extensively with higher-order functions like map, filter,
and reduce:
# Single-parameter lambda
let double = |x| x * 2
print(double(21)) # 42
# Multi-parameter lambda
let add = |a, b| a + b
print(add(3, 4)) # 7
# Lambda with block body
let classify = |score| {
if score >= 0.9 { "high" }
else if score >= 0.5 { "medium" }
else { "low" }
}
# Lambdas in pipe chains
let results = scores
|> map(|s| s * 100.0)
|> filter(|s| s >= 50.0)
|> map(|s| round(s, 2))
Function Expression Syntax
To give an inline function typed parameters, annotate each parameter inside the
pipes, just as in a named declaration:
# Lambda with typed parameters
let normalize = |values: List[Float]| {
let mu = mean(values)
let sd = stdev(values)
values |> map(|x| (x - mu) / sd)
}
# Short lambda expression
let square = |x: Int| x * x
Closures
Lambdas and function expressions capture variables from their enclosing scope, forming closures:
# Closure captures 'threshold'
fn make_filter(threshold: Float) {
|value| value >= threshold
}
let high_qual = make_filter(30.0)
let reads = all_reads |> filter(high_qual)
# Closure captures 'reference'
fn make_aligner(reference: DNA) {
|query| align(query, reference, method = "sw")
}
let align_to_hg38 = make_aligner(load_reference("hg38.fa"))
let alignments = reads |> map(align_to_hg38)
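A closure can capture more than one variable, and each call to the factory produces an independent closure. The following is a minimal sketch rather than a canonical recipe: make_gc_filter and sequences are hypothetical names, and gc_content is assumed to return a fraction between 0 and 1:
# Each call captures its own 'lo' and 'hi'
fn make_gc_filter(lo: Float, hi: Float) {
    |seq| {
        let gc = gc_content(seq)
        gc >= lo && gc <= hi
    }
}
# Two independent closures built from the same factory
let balanced = sequences |> filter(make_gc_filter(0.4, 0.6))
let gc_rich = sequences |> filter(make_gc_filter(0.6, 1.0))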
Default Parameters
Parameters can have default values. Parameters with defaults must come after required parameters:
fn trim_reads(reads, min_quality = 20, min_length = 50, adapter = "AGATCGGAAGAG") {
# Simplified body: 'adapter' is accepted but not applied here
reads
|> map(|r| trim_quality(r, min_quality))
|> filter(|r| r.length >= min_length)
}
# Call with defaults
let trimmed = trim_reads(raw_reads)
# Override specific defaults
let trimmed = trim_reads(raw_reads, min_quality = 30)
let trimmed = trim_reads(raw_reads, min_length = 100, adapter = "CTGTCTCTTAT")
Named Arguments
Any function can be called with named arguments for clarity. Named arguments can appear in any order after positional arguments:
fn call_variants(bam, reference, min_depth = 10, min_qual = 20.0, caller = "gatk") {
# ...
}
# Positional
call_variants("sample.bam", "hg38.fa", 15, 30.0, "bcftools")
# Named — much clearer
call_variants("sample.bam", "hg38.fa",
caller = "bcftools",
min_depth = 15,
min_qual = 30.0
)
Variadic Functions
Functions can accept a variable number of arguments using the ... syntax.
The variadic parameter is received as a List:
fn merge_tables(...tables: List[Table]) -> Table {
tables |> reduce(|acc, t| concat(acc, t))
}
let combined = merge_tables(table_a, table_b, table_c)
# Variadic with required params
fn log(level: String, ...messages: List[String]) {
let text = messages |> join(" ")
print(f"[{level}] {text}")
}
log("INFO", "Processing", "sample", "001")
Higher-Order Functions
Functions that take or return other functions are a core pattern in BioLang:
# map, filter, reduce — the classic trio
let lengths = sequences |> map(|s| len(s))
let long_seqs = sequences |> filter(|s| len(s) > 1000)
let total_len = sequences |> map(|s| len(s)) |> reduce(|a, b| a + b)
# compose — combine functions
fn compose(f, g) {
|x| f(g(x))
}
let process = compose(upper, trim)
let clean = " hello " |> process # "HELLO"
# apply to each element with index
let indexed = items |> enumerate() |> map(|(i, item)| f"{i}: {item}")
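Because functions are values, a built-in or named function can probably be passed directly wherever a lambda is expected, in the same way that compose receives upper and trim above. A short sketch, assuming sequences is a list of sequences as in the earlier examples:
# Pass the function itself instead of wrapping it in a lambda
let lengths = sequences |> map(len)
let gc_values = sequences |> map(gc_content)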
Recursive Functions
Functions can call themselves by name:
# Fibonacci
fn fib(n: Int) -> Int {
if n <= 1 { n }
else { fib(n - 1) + fib(n - 2) }
}
# Tree traversal
fn count_leaves(node) -> Int {
match node {
Leaf(_) => 1,
Branch(left, right) => count_leaves(left) + count_leaves(right)
}
}
Early Return
Use return to exit a function before its final expression. Otherwise, the last
expression is the implicit return value:
fn find_gene(name: String, database: Map[String, Gene]) -> Option[Gene] {
if name == "" {
return None
}
let normalized = upper(trim(name))
if normalized in database {
Some(database[normalized])
} else {
None
}
}
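A caller then has to handle both outcomes. The sketch below assumes that match accepts Some(...) and None patterns in the same way it accepts the constructor patterns in count_leaves, which this page does not confirm; gene_db is a hypothetical Map[String, Gene]:
# The padded, lowercase name is normalized to "TP53" inside find_gene
let label = match find_gene(" tp53 ", gene_db) {
    Some(_) => "found",
    None => "missing"
}
print(label)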
Pipeline Functions
A common pattern is defining reusable pipeline stages as functions:
# Define pipeline stages
fn quality_filter(reads, min_q = 20) {
reads |> filter(|r| mean_phred(r.quality) >= min_q)
}
fn length_filter(reads, min_len = 50) {
reads |> filter(|r| r.length >= min_len)
}
fn adapter_trim(reads, adapter = "AGATCGGAAGAG") {
# Placeholder body: trims by quality only; 'adapter' is not used here
reads |> map(|r| trim_quality(r, 20))
}
# Compose into a pipeline
let processed = raw_reads
|> quality_filter(min_q = 30)
|> adapter_trim()
|> length_filter(min_len = 75)
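When the same chain is needed for several inputs, the stages can themselves be wrapped in a function. This is one possible arrangement, not a prescribed one; preprocess, reads_a, and reads_b are hypothetical names:
# Bundle the stages into one reusable function
fn preprocess(reads, min_q = 30, min_len = 75) {
    reads
        |> quality_filter(min_q = min_q)
        |> adapter_trim()
        |> length_filter(min_len = min_len)
}
let processed_a = preprocess(reads_a)
let processed_b = preprocess(reads_b, min_len = 100)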
Functions as Values
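Because functions are values, they can be stored in data structures such as maps and looked up at run time:
# Store functions in a map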
let normalizers = {
"zscore": |vals| {
let mu = mean(vals)
let sd = stdev(vals)
vals |> map(|x| (x - mu) / sd)
},
"minmax": |vals| {
let lo = min(vals)
let hi = max(vals)
vals |> map(|x| (x - lo) / (hi - lo))
},
"log2": |vals| vals |> map(|x| log2(x + 1.0))
}
# Select normalizer dynamically
let method = "zscore"
let normed = normalizers[method](raw_values)
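If the key comes from user input, it may not be present in the map. A hedged sketch of a guarded lookup, reusing the in operator shown under Early Return and assuming that an if expression can appear on the right-hand side of let:
# Fall back to the raw values when the requested normalizer is unknown
let requested = "quantile"
let guarded = if requested in normalizers {
    normalizers[requested](raw_values)
} else {
    raw_values
}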
Where Clauses
Functions can declare preconditions using where clauses. If the
condition fails at call time, an assertion error is raised:
fn normalize(counts) where len(counts) > 0 {
let total = sum(counts)
counts |> map(|c| c / total)
}
fn align_reads(reads, reference) where len(reads) > 0 && len(reference) > 100 {
# alignment logic
}
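The effect is easiest to see at the call site. In this sketch, mean_coverage is a hypothetical function and the numbers are made up for illustration:
# The precondition guards against division by zero
fn mean_coverage(total_bases, genome_size) where genome_size > 0.0 {
    total_bases / genome_size
}
mean_coverage(3000.0, 100.0) # precondition holds, the body runs
mean_coverage(3000.0, 0.0) # precondition fails, an assertion error is raised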
Decorators
BioLang supports decorators that modify function behavior. The @memoize
decorator caches results keyed by the call's arguments, which helps with expensive
computations such as sequence alignment scores or API calls:
@memoize
fn fetch_gene_info(symbol) {
ncbi_gene(symbol)
}
# First call fetches from NCBI, subsequent calls return cached result
let tp53 = fetch_gene_info("TP53")
let tp53_again = fetch_gene_info("TP53") # instant — cached
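Memoization also suits recursive functions that recompute the same subproblems. In the sketch below, fib_memo is a hypothetical name; whether recursive calls go through the cache is an assumption this page does not confirm, but the result is the same either way:
@memoize
fn fib_memo(n: Int) -> Int {
    if n <= 1 { n }
    else { fib_memo(n - 1) + fib_memo(n - 2) }
}
print(fib_memo(30)) # 832040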
Multi-line Lambdas
For lambdas that need more than one expression, use a block body:
let process = |reads| {
let filtered = reads |> filter(|r| mean_phred(r.quality) > 20)
let gc = filtered |> map(|r| gc_content(r.seq))
{count: len(filtered), mean_gc: mean(gc)}
}
# Use in pipe chains
samples |> map(|s| {
let data = read_fastq(s.path)
let stats = describe(data)
{sample: s.name, ...stats}
})