One-Liners

Short snippets that show how BioLang expresses common bioinformatics tasks. Each one fits on a single line (or a short pipe chain) and solves a real problem.

Sequence Basics

Reverse complement a DNA sequence

dna"ATCGATCG" |> reverse_complement |> print

GC content as a percentage

dna"ATCGATCGCCGG" |> gc_content |> |x| x * 100.0 |> print

Transcribe DNA to RNA

dna"ATCGATCG" |> transcribe |> print

Translate DNA to protein

dna"ATGCGATCGTGA" |> translate |> print

Count bases in a sequence

dna"AATTCCGG" |> base_counts |> print

Check whether a sequence is a reverse-complement palindrome

let seq = dna"GAATTC"; (seq == reverse_complement(seq)) |> print

K-mer Analysis

Count all 3-mers

dna"ATCGATCGATCG" |> kmers(3) |> frequencies |> print

Find unique k-mers

dna"ATCGATCGATCG" |> kmers(4) |> unique |> len |> print

Most frequent k-mer

dna"ATCGATCGATCG" |> kmers(3) |> frequencies |> sort_by(|a, b| b.1 - a.1) |> first |> print
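
For readers who want to see what the k-mer one-liners compute, here is a minimal sketch in Python rather than BioLang. It assumes that `kmers` yields every overlapping window of length k and that `frequencies` counts occurrences; the function below is illustrative and is not BioLang's implementation.

```python
from collections import Counter


def kmers(seq: str, k: int) -> list[str]:
    """Return every overlapping substring of length k, left to right."""
    # A sequence of length n has n - k + 1 windows (none if k > n).
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


seq = "ATCGATCGATCG"

# Count all 3-mers.
counts = Counter(kmers(seq, 3))
print(dict(counts))

# Number of distinct 4-mers.
print(len(set(kmers(seq, 4))))

# Highest count among the 3-mers (several k-mers may share it).
print(max(counts.values()))
```

Note that "most frequent" is ambiguous when counts tie, as they do for this sequence; which k-mer a sort-then-first pipeline returns then depends on how the sort orders equal elements.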

FASTQ Operations

Count reads in a FASTQ file

read_fastq("data/reads.fastq") |> len |> print

Average read length

read_fastq("data/reads.fastq") |> map(|r| r.length) |> mean |> print

Keep reads with a mean Phred quality of at least 30

read_fastq("data/reads.fastq") |> filter(|r| mean_phred(r.quality) >= 30.0) |> write_fastq("hq.fq")
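
As an illustration of what a mean-quality filter does, the following Python sketch decodes a quality string and averages the scores. It assumes Phred+33 ASCII encoding and a plain arithmetic mean of the scores; whether BioLang's `mean_phred` takes a string or a list, and how it averages, is not stated here, so treat the function and the sample reads as hypothetical.

```python
def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of an ASCII-encoded quality string (Phred+33 assumed)."""
    if not quality:
        raise ValueError("empty quality string")
    # Each character encodes one score: ASCII code minus the offset.
    return sum(ord(c) - offset for c in quality) / len(quality)


# Hypothetical reads as (id, sequence, quality) tuples.
reads = [
    ("read1", "ACGT", "IIII"),  # 'I' is ASCII 73, so every base scores 40
    ("read2", "ACGT", "5555"),  # '5' is ASCII 53, so every base scores 20
]

# Keep only reads whose mean score reaches the threshold.
high_quality = [r for r in reads if mean_phred(r[2]) >= 30.0]
print([r[0] for r in high_quality])
```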

Extract the first 10 read IDs

read_fastq("data/reads.fastq") |> map(|r| r.id) |> take(10) |> print

Total bases in a FASTQ file

read_fastq("data/reads.fastq") |> map(|r| r.length) |> sum |> print

VCF Operations

Count variants per chromosome

read_vcf("variants.vcf") |> group_by("chrom") |> summarize(|chrom, rows| {chrom: chrom, count: len(rows)}) |> print

Filter to PASS variants only

read_vcf("in.vcf") |> filter(|v| v.filter == "PASS") |> write_vcf("pass.vcf")

Extract SNPs only

read_vcf("in.vcf") |> filter(|v| is_snp(v)) |> len |> print

Transition/transversion ratio

read_vcf("in.vcf") |> filter(|v| is_snp(v)) |> tstv_ratio |> print
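
The transition/transversion ratio divides the number of purine-to-purine or pyrimidine-to-pyrimidine substitutions (A<->G, C<->T) by the number of all other substitutions. A minimal Python sketch of that definition follows; it assumes biallelic SNPs given as (ref, alt) pairs and is not BioLang's `tstv_ratio` implementation.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True when both bases are purines or both are pyrimidines."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def tstv_ratio(snps: list[tuple[str, str]]) -> float:
    """Transitions divided by transversions for a list of (ref, alt) SNPs."""
    ts = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ts  # assumes every entry is a real single-base change
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ts / tv


# Hypothetical SNPs: three transitions and one transversion (A>C).
snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
print(tstv_ratio(snps))
```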

BED and Intervals

Total interval length in a BED file (bases in overlapping regions are counted more than once)

read_bed("regions.bed") |> map(|r| r.end - r.start) |> sum |> print

Sort intervals by chromosome and start

read_bed("regions.bed") |> sort_by(|a, b| [a.chrom, a.start]) |> write_bed("sorted.bed")
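
Sorting by chromosome and start is only the first step of a merge: overlapping intervals still have to be collapsed in a single pass over the sorted list. The Python sketch below illustrates that pass under stated assumptions (BED-style half-open coordinates, intervals as hypothetical (chrom, start, end) tuples, adjacent intervals merged as well). It is not a BioLang built-in.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent (chrom, start, end) intervals."""
    merged = []
    # Tuple sort orders by chrom (lexicographically), then start, then end.
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps or touches the previous interval: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def covered_bases(intervals):
    """Bases covered at least once, with overlaps counted a single time."""
    return sum(end - start for _, start, end in merge_intervals(intervals))


# Hypothetical regions; the first two overlap on chr1.
regions = [("chr1", 150, 300), ("chr1", 100, 200), ("chr1", 400, 500), ("chr2", 50, 60)]
print(merge_intervals(regions))
print(covered_bases(regions))
```

Summing `end - start` over the merged intervals gives true coverage, whereas summing over the raw intervals counts overlapping bases once per interval.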

FASTA Operations

Sequence lengths from a FASTA

read_fasta("data/sequences.fasta") |> map(|r| [r.id, len(r.seq)]) |> print

Filter sequences by length

read_fasta("data/sequences.fasta") |> filter(|r| len(r.seq) >= 1000) |> write_fasta("long.fa")

Mean GC content across all sequences

read_fasta("data/sequences.fasta") |> map(|r| gc_content(r.seq)) |> mean |> print

Text and Data

Count lines in a file

read_lines("data.txt") |> len |> print

Unique values of a TSV column

tsv("data.tsv") |> map(|row| row["gene_name"]) |> unique |> print

Sum a numeric column

read_csv("data/counts.csv") |> map(|row| row["count"] |> int) |> sum |> print