Language Overview

BioLang is a pipe-first programming language for bioinformatics. Data flows left to right through transformation chains, biological sequences are first-class values with literal syntax, and tables have built-in dplyr-style verbs. BioLang is dynamically typed: type annotations are neither needed nor supported.

A Quick Taste

DNA Sequences

DNA, RNA, and protein sequences are first-class values with dedicated literal syntax and built-in operations:

let seq = dna"ATCGATCGATCG"
println(f"Sequence: {seq}")
println(f"Length: {seq_len(seq)} bp")
println(f"GC content: {gc_content(seq)}")

let rc = reverse_complement(seq)
println(f"Reverse complement: {rc}")

let rna_seq = transcribe(seq)
println(f"RNA: {rna_seq}")
println(f"Protein: {translate(rna_seq)}")
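For readers who want to see what these operations compute, here is a rough Python model of three of them. It is an illustration only, not BioLang's implementation, and it rests on two assumptions the text above does not confirm: that gc_content returns a fraction between 0 and 1, and that transcribe performs coding-strand transcription (T replaced by U).

```python
# Rough Python model of the sequence operations; not BioLang's implementation.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (assumed to be a 0-1 fraction)."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]

def transcribe(seq: str) -> str:
    """Coding-strand transcription: replace T with U (an assumption)."""
    return seq.replace("T", "U")

seq = "ATCGATCGATCG"
print(len(seq))                 # 12
print(gc_content(seq))          # 0.5
print(reverse_complement(seq))  # CGATCGATCGAT
print(transcribe(seq))          # AUCGAUCGAUCG
```

Plain Python strings would accept any characters here, which is the kind of mistake a typed dna"..." literal is meant to rule out.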

Pipe Chains

The pipe operator |> passes the left-hand value as the first argument to the right-hand function. This creates readable, linear data flows:

let result = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
  |> filter(|x| x % 2 == 0)
  |> map(|x| x * x)
  |> sort()
  |> reverse()

println(f"Even squares (descending): {result}")
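The first-argument rule can be made concrete with a small Python model of the chain above. The pipe, keep, transform, and reverse helpers are hypothetical names invented for this sketch; they only mimic the way |> threads a value through each step.

```python
from functools import reduce

def pipe(value, *steps):
    """Thread value through steps; each step is a tuple (func, *extra_args)."""
    def apply(acc, step):
        func, *args = step
        return func(acc, *args)  # the piped value becomes the first argument
    return reduce(apply, steps, value)

def keep(xs, pred):
    """Model of filter: keep the items for which pred is true."""
    return [x for x in xs if pred(x)]

def transform(xs, fn):
    """Model of map: apply fn to every item."""
    return [fn(x) for x in xs]

def reverse(xs):
    """Model of reverse: return the items in the opposite order."""
    return xs[::-1]

result = pipe(
    list(range(1, 11)),
    (keep, lambda x: x % 2 == 0),
    (transform, lambda x: x * x),
    (sorted,),
    (reverse,),
)
print(result)  # [100, 64, 36, 16, 4]
```

Written without the pipe, the same computation nests inside out as reverse(sorted(transform(keep(...)))), which is the reading order the pipe operator avoids.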

Statistics

BioLang includes built-in functions for common statistical operations:

let data = [23.1, 45.7, 12.3, 67.8, 34.5, 56.2, 19.8]
println(f"Mean:   {mean(data)}")
println(f"Median: {median(data)}")
println(f"Stdev:  {stdev(data)}")
println(f"Min:    {min(data)}")
println(f"Max:    {max(data)}")
println(f"Sum:    {sum(data)}")
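As a cross-check, the same numbers can be computed with Python's standard statistics module. This is a comparison sketch, not a statement about BioLang internals. In particular, the text above does not say whether stdev uses the sample (n - 1) or population (n) formula, so both are shown.

```python
import statistics

data = [23.1, 45.7, 12.3, 67.8, 34.5, 56.2, 19.8]

print(statistics.mean(data))    # arithmetic mean
print(statistics.median(data))  # 34.5, the middle of the 7 sorted values
print(statistics.stdev(data))   # sample standard deviation (divides by n - 1)
print(statistics.pstdev(data))  # population standard deviation (divides by n)
print(min(data), max(data))     # 12.3 67.8
print(sum(data))                # total of all values
```

If the BioLang and Python standard deviations disagree, comparing against both formulas shows which convention is in use.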

Tables

Tables are a built-in type with familiar dplyr-style verbs for data manipulation:

let t = from_records([
  {gene: "BRCA1", score: 0.95, chrom: "chr17"},
  {gene: "TP53",  score: 0.87, chrom: "chr17"},
  {gene: "EGFR",  score: 0.72, chrom: "chr7"},
  {gene: "MYC",   score: 0.91, chrom: "chr8"},
  {gene: "KRAS",  score: 0.65, chrom: "chr12"},
])

let top = t
  |> filter(|r| r.score > 0.8)
  |> arrange("score")

println(top)
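The filter-then-arrange chain can be modeled in Python with a list of dictionaries. This sketch assumes that arrange("score") sorts in ascending order, as dplyr's arrange does by default. The text above does not state the sort direction, so treat that as an assumption.

```python
# Rows modeled as dictionaries; a rough stand-in for a BioLang table.
rows = [
    {"gene": "BRCA1", "score": 0.95, "chrom": "chr17"},
    {"gene": "TP53",  "score": 0.87, "chrom": "chr17"},
    {"gene": "EGFR",  "score": 0.72, "chrom": "chr7"},
    {"gene": "MYC",   "score": 0.91, "chrom": "chr8"},
    {"gene": "KRAS",  "score": 0.65, "chrom": "chr12"},
]

# filter: keep rows scoring above 0.8
high = [r for r in rows if r["score"] > 0.8]

# arrange: sort by the score column (ascending is an assumption)
top = sorted(high, key=lambda r: r["score"])

print([r["gene"] for r in top])  # ['TP53', 'MYC', 'BRCA1']
```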

Key Features

  • Pipe-first composition: Data flows left to right using |>, mirroring how bioinformatics pipelines are conceptualized.
  • Biology-native types: DNA, RNA, and protein sequences are typed values with literal syntax (dna"ATCG", rna"AUGC", protein"MKT"), not plain strings.
  • Lambdas everywhere: Concise lambda syntax |x| expr for inline transformations.
  • Tables as data: Built-in table type with filter, select, mutate, arrange, group_by, and summarize verbs.
  • Dynamic typing: Every value carries its type at runtime. Type annotations are neither needed nor supported.
  • Streaming: Lazy streams for memory-efficient processing of large genomic datasets.
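The streaming item is the least self-explanatory of these, so here is the general idea of lazy evaluation modeled with Python generators. This is an analogy, not BioLang's stream API: the sequences and gc_fraction helpers and the FASTA-style input are hypothetical and chosen only for illustration.

```python
from itertools import islice

def sequences(lines):
    """Yield sequence lines one at a time, skipping FASTA-style headers."""
    for line in lines:
        line = line.strip()
        if line and not line.startswith(">"):
            yield line

def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)

# An iterator stands in for a large file that is read line by line.
lines = iter([">read1", "ATCG", ">read2", "GGCC",
              ">read3", "ATAT", ">read4", "GCGC"])

# Nothing is read yet: this only describes the work to be done.
gc_rich = (s for s in sequences(lines) if gc_fraction(s) > 0.6)

# Pulling one result reads just enough input to produce it.
first = list(islice(gc_rich, 1))
print(first)        # ['GGCC']

remaining = next(lines)
print(remaining)    # >read3 (the rest of the input was never read)
```

Because each record is processed as it is pulled, memory use depends on the size of a single record, not on the size of the whole dataset.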

Type System

BioLang is dynamically typed. Every value has a runtime type that you can inspect with typeof():

println(typeof(42))                # Int
println(typeof(3.14))              # Float
println(typeof("hello"))           # Str
println(typeof(true))              # Bool
println(typeof(dna"ATCG"))         # DNA
println(typeof([1, 2, 3]))         # List
println(typeof({name: "BRCA1"}))   # Record

Built-in Types

Category      Types
Primitives    Int, Float, Str, Bool, Nil
Biology       DNA, RNA, Protein, Interval
Collections   List, Record, Table, Set
Control       Stream, Function

Error Handling

BioLang provides try/catch for error handling, error() for throwing errors, and ?? for nil coalescing:

# Nil coalescing with ??
let value = nil
let safe = value ?? "default"
println(f"Safe value: {safe}")

# try/catch for fallible operations
let result = try {
  int("not_a_number")
} catch e {
  println(f"Caught error: {e}")
  0
}
println(f"Result: {result}")
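For comparison, the two patterns above map onto Python roughly as follows. The coalesce and parse_int helpers are hypothetical names for this sketch. The sketch assumes that ?? falls back only when the left side is nil, which is what "nil coalescing" usually means, so a value such as 0 would be kept.

```python
def coalesce(value, default):
    """Model of `value ?? default`: fall back only when value is None."""
    return default if value is None else value

def parse_int(text, fallback=0):
    """Model of a try/catch expression that yields a fallback on failure."""
    try:
        return int(text)
    except ValueError as e:
        print(f"Caught error: {e}")
        return fallback

print(coalesce(None, "default"))   # default
print(coalesce(0, "default"))      # 0 (a present value is kept)
print(parse_int("not_a_number"))   # prints the caught error, then 0
print(parse_int("42"))             # 42
```

One difference is that BioLang's try/catch is shown above as an expression that can be assigned directly, whereas Python needs a function wrapper to get the same effect.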

Next Steps

Explore the language reference in depth: