Tutorials

Step-by-step guides to help you learn BioLang from the ground up. Each tutorial builds a real project, so you write meaningful code from the very first lesson. Start with the beginner track and work your way through intermediate and advanced topics.

Sample data included: BioLang ships with sample FASTA, FASTQ, VCF, BED, GFF, CSV, TSV, and SAM files in examples/sample-data/. Run bl run examples/quickstart.bl to verify your setup. Tutorials reference these files where applicable.

Beginner

Intermediate

Intermediate ~30 min

Variant Analysis

Read a VCF file, filter variants by quality, annotate with gene information, and summarize variants per chromosome.

Intermediate ~35 min

RNA-seq Differential Expression

Load a count matrix, normalize, find differentially expressed genes, create volcano plots, and run pathway enrichment.

Intermediate ~30 min

Protein Analysis

Fetch sequences from UniProt, analyze properties, search for motifs, explore PDB structures, and visualize results.

Intermediate ~25 min

Streaming Large Files

Process multi-gigabyte files in constant memory using lazy streams, parallel mapping, and chunked output.

Intermediate ~30 min

Querying Databases

Connect to NCBI, Ensembl, UniProt, and KEGG. Run cross-database queries to enrich your analysis.

Intermediate ~30 min

Statistical Analysis

Run t-tests, ANOVA, correlation, PCA, and clustering on experimental data with built-in stats functions.

Intermediate ~25 min

Visualization

Create Manhattan plots, volcano plots, heatmaps, and more. Render ASCII output for the terminal or SVG for publication-quality figures.

Intermediate ~30 min

Knowledge Graphs

Build protein interaction networks, find shortest paths, extract subgraphs, and combine with STRING API data.

Intermediate ~30 min

Enrichment Analysis

Run ORA and GSEA on gene lists. Load GMT files, apply BH correction, and combine results with knowledge graphs.

Intermediate ~20 min

LLM Chat

Use chat() and chat_code() with Anthropic, OpenAI, or Ollama. Pass data context, generate code, and interpret results with AI.

Advanced

Suggested Learning Path

  1. Hello Genomics: Get comfortable with DNA/RNA literals and core operations.
  2. FASTQ QC Pipeline: Learn the pipe operator and build your first analysis pipeline.
  3. Tables: Master the dataframe for structured data analysis.
  4. Variant Analysis + RNA-seq DE: Apply your skills to real-world genomics workflows.
  5. Streaming + Databases: Scale up to production-sized datasets.
  6. Multi-species + Plugins: Advanced workflows and extending BioLang.