v0.3.0 — Now Available

A domain-specific language for bioinformatics

A DSL purpose-built for genomics and bioinformatics — native DNA/RNA/protein types, 750+ built-in functions, streaming I/O, 15 bio API clients, and Rust performance with a clean, pipe-first syntax.

analysis.bl

# Stream FASTQ — constant memory
read_fastq("data/reads.fastq")
  |> filter(|r| mean_phred(r.quality) >= 30)
  |> map(|r| r.id)
  |> each(|id| println(id))

# Native DNA literal + operations
let seq = dna"ATCGATCGATCG"
println(gc_content(seq))            # 0.5
println(reverse_complement(seq))    # DNA(CGATCGATCGAT)

# Query NCBI for gene info
println("Fetching BRCA1 from NCBI...")
ncbi_gene("BRCA1") |> println()

Why a domain-specific language?

General-purpose languages like Python and R require stitching together dozens of packages to do bioinformatics. BioLang is a DSL — every design decision, from the type system to the syntax, is made for genomics workflows. Here's what that means in practice:

✓ No import ceremony: DNA types, FASTA readers, GC content, k-mers, interval queries, and 750+ functions are available immediately. No pip install, no import boilerplate.

✓ Types that match the domain: dna"ATCG", Interval, Variant, AlignedRead, Quality are first-class values with built-in methods, not strings pretending to be sequences.

✓ Pipes match how bioinformatics thinks: reads |> filter |> map |> summarize. Data flows through a pipeline, just like the biological workflow it models.

✓ Safe by default: no null pointer exceptions, no silent type coercions. Errors point to the genomic operation that failed, not a stack trace in pandas internals.

✓ Fast without C extensions: compiled to native bytecode via Rust. No NumPy wheel issues, no Cython compilation step. Single binary, runs everywhere.

✓ Streaming by design: process 100 GB FASTQ files in constant memory. Lazy evaluation is the default, not an afterthought bolted onto eager collections.

BioLang is not a general-purpose language, and that's the point. It does one thing — bioinformatics scripting — and does it well.

Who is BioLang for?

🧬

Biologists

Analyze sequences, variants, and expression data without first learning a general-purpose programming language.

🎓

Students

The fastest on-ramp to bioinformatics. Write your first analysis in minutes instead of spending days on environment setup.

⚡

Researchers

Quick one-off analyses without spinning up a Jupyter notebook. One command, instant results.

🔬

Anyone tired of setup

No conda environments, no dependency conflicts, no wheel compilation failures. Single binary, works everywhere.

BioLang isn't here to replace your existing tools — it's the fastest path from raw data to first result. Start here, then grow into Python or R when your analysis demands it.

What's included

Everything you need from read to result — batteries included.

Pipe-First Syntax

Chain operations naturally with |>. No nested function calls, no temp variables. Data flows left to right.

Bio-Native Types

First-class dna"...", rna"...", protein"..." literals with built-in methods for complement, translate, GC content.

750+ Builtins

Statistics, tables, matrices, file I/O (FASTA/FASTQ/VCF/BAM/BED/GFF), plotting, k-mers, alignment, motifs — all built in.

Streaming I/O

Process multi-GB FASTQ/BAM files without loading into memory. Lazy streams + pipes = constant memory usage.
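
To make the constant-memory idea concrete, here is a minimal Python sketch of lazy FASTQ streaming with generators. It is not BioLang's implementation: `iter_fastq`, `high_quality_ids`, the 4-lines-per-record layout, and the Phred+33 quality encoding are assumptions made for illustration.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class Read(NamedTuple):
    id: str
    seq: str
    quality: str


def iter_fastq(handle: TextIO) -> Iterator[Read]:
    """Yield one record at a time from a 4-lines-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:
            return  # end of stream (or a blank line)
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().strip()
        # Drop the leading '@' and keep the first whitespace-delimited token.
        yield Read(header[1:].split()[0], seq, quality)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score, assuming Phred+33 ASCII encoding."""
    return sum(ord(c) - offset for c in quality) / len(quality)


def high_quality_ids(handle: TextIO, min_q: float = 30.0) -> Iterator[str]:
    # A generator expression is lazy: only one record is alive at a time,
    # so memory use does not grow with file size.
    return (r.id for r in iter_fastq(handle) if mean_phred(r.quality) >= min_q)


# 'I' encodes Q40 and '!' encodes Q0 under Phred+33.
demo = io.StringIO("@r1 desc\nACGT\n+\nIIII\n@r2\nACGT\n+\n!!!!\n")
print(list(high_quality_ids(demo)))  # ['r1']
```

Because every stage is a generator, nothing is read until the final consumer (here `list`) asks for the next item.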

15 API Clients

NCBI, Ensembl, UniProt, UCSC, KEGG, STRING, PDB, Reactome, GO, COSMIC, BioMart, nf-core, BioContainers, Galaxy ToolShed, NCBI Datasets — query any database in one line.

Rust Performance

Native Rust I/O via noodles. Up to 7.1x faster than Python on ENCODE overlap, 7.0x on protein k-mers, 6.7x on FASTA parsing, and 3.2x on k-mer counting, with 50–70% fewer lines of code. See the benchmarks below.

Try it in your browser

BioLang runs right here via WebAssembly. Click Run on any example below — no install needed.

First click downloads the runtime (~4 MB), then it's cached for the session. Every code block across the docs is interactive too.

DNA Operations

let seq = dna"ATCGATCGATCG"
println(f"GC content: {gc_content(seq)}")
println(f"Complement: {complement(seq)}")
println(f"Rev-comp:   {reverse_complement(seq)}")
println(f"Transcribe: {transcribe(seq)}")
println(f"Length:     {seq_len(seq)} bp")
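
For readers who want to check the values by hand, the following Python sketch reproduces what these operations conventionally compute on the same sequence. The function bodies are assumptions about the usual definitions, not BioLang's source; in particular, `transcribe` is assumed to follow the coding-strand convention (T replaced by U).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def complement(seq: str) -> str:
    """Swap each base for its Watson-Crick partner."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complement, then reverse, giving the opposite strand 5' to 3'."""
    return complement(seq)[::-1]


def transcribe(seq: str) -> str:
    """Coding-strand convention (assumed here): replace T with U."""
    return seq.replace("T", "U")


seq = "ATCGATCGATCG"
print(gc_content(seq))          # 0.5
print(complement(seq))          # TAGCTAGCTAGC
print(reverse_complement(seq))  # CGATCGATCGAT
print(transcribe(seq))          # AUCGAUCGAUCG
print(len(seq))                 # 12
```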

Translation & K-mers

let coding = dna"ATGAAAGCTTTTGACTGA"
let prot = translate(coding)
println(f"Protein: {prot}")

let seq = dna"ATCGATCGATCG"
let kmer_list = kmers(seq, 4)
println(f"4-mers: {kmer_list}")
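
As a reference point, here is a Python sketch of translation with the standard genetic code (NCBI translation table 1) and of sliding-window k-mer extraction. Whether BioLang's `translate` keeps or drops the stop codon is not stated above, so the `to_stop` flag is a hypothetical parameter of this sketch.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, codons enumerated in TCAG order.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str, to_stop: bool = True) -> str:
    """Translate DNA in frame 0; trailing bases that do not fill a codon are ignored."""
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*" and to_stop:
            break  # stop codon reached
        protein.append(aa)
    return "".join(protein)


def kmers(seq: str, k: int) -> list[str]:
    """All overlapping substrings of length k, in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


print(translate("ATGAAAGCTTTTGACTGA"))  # MKAFD
print(kmers("ATCGATCGATCG", 4))
# ['ATCG', 'TCGA', 'CGAT', 'GATC', 'ATCG', 'TCGA', 'CGAT', 'GATC', 'ATCG']
```

A sequence of length n yields n - k + 1 overlapping k-mers, so the 12-base example produces nine 4-mers.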

Statistics

let normal = [5.2, 4.8, 5.1, 4.9, 5.3]
let tumor = [8.1, 7.9, 8.5, 7.6, 8.3]
let result = ttest(normal, tumor)
println(f"t = {round(result.statistic, 3)}")
println(f"p = {result.p_value}")
println(f"Significant: {result.p_value < 0.05}")
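
The t statistic for these two groups can be checked independently with Python's standard library. This sketch is an illustration, not BioLang's `ttest`: the p-value is omitted because it needs the t-distribution CDF, which the standard library does not provide, and the sign of the result depends on argument order. With equal group sizes, the pooled (Student) and Welch forms give the same statistic, so no assumption about which variant BioLang uses is needed here.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(a: list[float], b: list[float]) -> float:
    """Two-sample t statistic from per-group sample variances (Welch form)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


normal = [5.2, 4.8, 5.1, 4.9, 5.3]
tumor = [8.1, 7.9, 8.5, 7.6, 8.3]

# Means are 5.06 and 8.08; sample variances are 0.043 and 0.122.
print(round(t_statistic(normal, tumor), 2))  # -16.62
```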

Pipes & Lambdas

# Pipe-first: data flows left to right
let genes = ["BRCA1", "TP53", "EGFR", "KRAS"]
genes
  |> filter(|g| len(g) <= 4)
  |> map(|g| f"{g} ({len(g)} chars)")
  |> each(|g| println(g))
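
For comparison, the same left-to-right flow can be approximated in Python with a small helper. `pipe` is a hypothetical function written for this sketch, and it assumes that each `|>` stage simply receives the previous stage's result.

```python
from functools import reduce


def pipe(value, *stages):
    """Feed `value` through each stage in order, left to right."""
    return reduce(lambda acc, stage: stage(acc), stages, value)


genes = ["BRCA1", "TP53", "EGFR", "KRAS"]
labels = pipe(
    genes,
    lambda gs: (g for g in gs if len(g) <= 4),         # filter
    lambda gs: (f"{g} ({len(g)} chars)" for g in gs),  # map
    list,                                              # materialize the lazy stages
)
for label in labels:
    print(label)
# TP53 (4 chars)
# EGFR (4 chars)
# KRAS (4 chars)
```

Without such a helper, the Python version would read inside-out as nested calls, which is the pattern the pipe operator is meant to avoid.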

BioLang vs Python

A similar read-filtering task in less code.

Python + BioPython + pandas (15 lines)

from Bio import SeqIO
import pandas as pd

records = []
for rec in SeqIO.parse("reads.fq", "fastq"):
    quals = rec.letter_annotations[
        "phred_quality"
    ]
    if sum(quals)/len(quals) >= 30:
        gc = (rec.seq.count("G")
              + rec.seq.count("C")) \
              / len(rec.seq)
        records.append({"id": rec.id,
                        "gc": gc})
df = pd.DataFrame(records)
print(df.describe())

BioLang (3 lines)

read_fastq("data/reads.fastq")
  |> filter(|r| mean_phred(r.quality) >= 30)
  |> each(|r| println(f"{r.id}: len={r.length}"))

Benchmarked against BioPython & Bioconductor

30 bioinformatics tasks on real-world data (NCBI, UniProt, ClinVar, ENCODE). Correctness validated on both synthetic and real biological data (E. coli, yeast, ClinVar) against BioPython and Bioconductor.

7.1x ENCODE Overlap
7.0x Protein K-mers
6.7x FASTA Parse
3.2x K-mer Counting

Task	BioLang	Python	R	Speedup vs Python
ENCODE Peak Overlap	0.363s	2.574s	—	7.1x
Protein K-mers	0.191s	1.331s	1.298s	7.0x
FASTA Parse (30 KB)	0.138s	0.926s	1.243s	6.7x
E. coli Genome	0.176s	1.081s	1.354s	6.1x
GC Content (51 MB)	0.830s	2.771s	2.358s	3.3x
K-mer Counting (21-mers)	6.551s	21.01s	—	3.2x
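
The speedup figures are consistent with dividing the Python time by the BioLang time and rounding to one decimal place. The short Python check below recomputes them from the timings in the table; the dictionary layout is just a convenience for this sketch.

```python
# (BioLang seconds, Python seconds), copied from the table above.
timings = {
    "ENCODE Peak Overlap": (0.363, 2.574),
    "Protein K-mers": (0.191, 1.331),
    "FASTA Parse (30 KB)": (0.138, 0.926),
    "E. coli Genome": (0.176, 1.081),
    "GC Content (51 MB)": (0.830, 2.771),
    "K-mer Counting (21-mers)": (6.551, 21.01),
}

speedups = {
    task: round(python_s / biolang_s, 1)
    for task, (biolang_s, python_s) in timings.items()
}

for task, factor in speedups.items():
    print(f"{task}: {factor}x")
```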

Linux (WSL2) — Intel i9-12900K, 16 GB RAM. Python wins on VCF/CSV text parsing where C extensions dominate. K-mer counting uses canonical (strand-agnostic) 21-mers — BioLang does strictly more work.
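
Canonical counting treats a k-mer and its reverse complement as the same key, which is why it costs extra work per k-mer. A minimal Python sketch follows, assuming the common convention of keeping the lexicographically smaller of the two strands; the benchmark's exact convention is not stated here, and k = 4 is used only to keep the example readable.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_canonical_kmers(seq: str, k: int) -> Counter:
    """Count k-mers so that both strands contribute to the same key."""
    return Counter(canonical(seq[i:i + k]) for i in range(len(seq) - k + 1))


counts = count_canonical_kmers("ATCGATCGATCG", 4)
print(dict(counts))  # {'ATCG': 5, 'TCGA': 2, 'GATC': 2}
```

In this example ATCG and CGAT are reverse complements and collapse into one key, while TCGA and GATC are their own reverse complements.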

Full benchmark results (30 tasks, Linux & Windows)

Browser Tools — Installable PWAs

Everything runs in your browser

No installation, no server, no uploads. BioLang compiles to WebAssembly so you can analyze bioinformatics data entirely client-side. All tools work offline as installable PWAs.

Playground

Run code instantly

Write and execute BioLang code blocks with persistent state, inline SVG charts, and syntax highlighting. Great for experimenting and learning.

Viewer

Inspect bio files

Drop FASTA, FASTQ, VCF, BED, GFF, CSV files for instant parsing, statistics (N50, GC%, Q30, Ti/Tv), sortable tables, column filters, multi-format export, URL loading, and BioLang analysis. Data never leaves your machine.
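
Of the statistics listed, N50 is the least self-explanatory: it is the length L such that sequences of length L or longer contain at least half of all bases. The Python sketch below uses that standard definition; how the Viewer computes it internally is not documented here, so treat this as an illustration only.

```python
def n50(lengths: list[int]) -> int:
    """Smallest length L such that sequences of length >= L cover half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50() needs at least one sequence length")


# Total is 54 bases; 10 + 9 + 8 = 27 reaches half, so N50 is 8.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```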

BioGist

Gene intelligence sidebar

Auto-detects genes, variants, accessions, and species on any webpage. Click any entity for instant details from NCBI, UniProt, gnomAD, and ClinVar. Chrome sidebar extension + PWA.

BioKhoj

Research radar

Personal literature monitor. Watch genes, drugs, and variants across PubMed and bioRxiv. Signal scoring ranks papers by relevance. Background checks, co-mention detection, and weekly digest. Chrome sidebar extension + PWA.

Free reference book

The BioLang Language

Get started in seconds
