Tutorial: Literate Notebooks
In this tutorial you'll create a BioLang notebook (.bln) that
documents a complete GC content analysis. You'll learn the notebook format,
cell directives, HTML export, and Jupyter conversion.
Prerequisites: BioLang installed (check with bl --version) and
familiarity with basic BioLang syntax from Hello Genomics.
What you'll build
A reproducible analysis notebook that loads FASTA sequences, computes GC content statistics, identifies outlier contigs, and exports a shareable HTML report. The final notebook will use cell directives to keep the narrative clean.
bl run examples/tutorials/notebooks.bl
Step 1: Create a basic notebook
Create a file called gc_analysis.bln. A .bln file
is just text — Markdown prose interleaved with BioLang code blocks.
Start with a title and a single code block:
# GC Content Analysis
This notebook analyzes per-contig GC content from a FASTA file.
```biolang
let seq = dna"ATCGATCGATCG"
print(f"GC content: {gc_content(seq)}")
```
Run it:
bl notebook gc_analysis.bln
You'll see the heading rendered in bold, the prose text, then the output of the code block. The terminal uses ANSI colors for a clean reading experience.
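For readers who want to sanity-check the number the cell prints, here is a minimal Python sketch of a GC content calculation. It is not BioLang's implementation: the function name gc_fraction is hypothetical, and it assumes gc_content reports a fraction between 0 and 1 rather than a percentage.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C (case-insensitive).

    Assumption: an empty sequence yields 0.0 rather than raising an error.
    """
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# The same 12-base sequence used in the notebook cell: 6 of 12 bases are G or C.
print(f"GC content: {gc_fraction('ATCGATCGATCG')}")  # prints "GC content: 0.5"
```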
Step 2: Add multiple cells
Notebooks carry state between code blocks. Variables defined in one cell are available in all later cells. Expand the notebook:
# GC Content Analysis
This notebook analyzes per-contig GC content from a FASTA file.
## Load Data
Read the sample FASTA file shipped with BioLang.
```biolang
let seqs = read_fasta("examples/sample-data/contigs.fa")
print(f"Loaded {len(seqs)} sequences")
```
## Compute Statistics
Calculate GC content for each sequence and summarize.
```biolang
let gc_values = seqs |> map(|s| gc_content(s.seq))
let mu = mean(gc_values)
let sigma = stdev(gc_values)
print(f"Mean GC: {mu:.3f} +/- {sigma:.4f}")
print(f"Range: {min(gc_values):.3f} - {max(gc_values):.3f}")
```
## Find Outliers
Flag contigs with GC more than 2 standard deviations from the mean.
These may indicate contamination or horizontal gene transfer.
```biolang
let outliers = seqs
|> filter(|s| abs(gc_content(s.seq) - mu) > 2.0 * sigma)
print(f"Found {len(outliers)} outlier contigs")
outliers |> each(|s| print(f" {s.id}: GC={gc_content(s.seq):.3f}"))
```
Run it again. Each section prints its heading, prose, and code output in sequence — a narrative that tells the story of your analysis.
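The outlier rule in the last cell is a plain z-score cutoff. The Python sketch below restates it outside BioLang so the arithmetic is easy to check. The function name find_outliers and the contig values are made up for illustration, and it assumes stdev means the sample standard deviation (n - 1 in the denominator), which the tutorial does not specify.

```python
from statistics import mean, stdev


def find_outliers(gc_by_id: dict[str, float], threshold: float = 2.0) -> list[str]:
    """Return the IDs whose GC value lies more than `threshold` standard deviations from the mean."""
    values = list(gc_by_id.values())
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation; an assumption about BioLang's stdev
    return [cid for cid, gc in gc_by_id.items() if abs(gc - mu) > threshold * sigma]


# Hypothetical per-contig GC fractions: nine similar contigs and one GC-rich contig.
sample = {
    "contig_1": 0.49, "contig_2": 0.50, "contig_3": 0.51, "contig_4": 0.50,
    "contig_5": 0.49, "contig_6": 0.51, "contig_7": 0.50, "contig_8": 0.50,
    "contig_9": 0.50, "contig_10": 0.70,
}
print(find_outliers(sample))  # prints ['contig_10']
```

One consequence worth knowing: a single value can sit at most (n - 1) / sqrt(n) sample standard deviations from the mean of n values, so with fewer than six contigs nothing can exceed the 2.0 cutoff. The same limit of six holds if the population standard deviation is used instead.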
Step 3: Use cell directives
Cell directives are special comments at the top of a code block that control how it behaves. Add them to clean up the notebook.
# @hide — silent setup
Use this for configuration code that readers don't need to see. The cell runs but doesn't appear in the output:
## Setup
```biolang
# @hide
let threshold = 2.0
let min_length = 100
```
# @echo — show your work
For key analysis steps, show both the code and its output:
## Analysis
```biolang
# @echo
let gc_values = seqs |> map(|s| gc_content(s.seq))
let mu = mean(gc_values)
print(f"Mean GC: {mu:.3f}")
```
This prints the code first (dimmed), then executes it and shows the output. Readers see exactly what ran.
# @skip — draft cells
Temporarily disable a cell without deleting it. Useful during development:
```biolang
# @skip
# TODO: add k-mer analysis once data is ready
let kmers = seqs |> map(|s| kmer_count(s.seq, 21))
```
# @hide-output — quiet execution
Show the code but suppress printed output. Good for assignments that produce verbose intermediate results:
```biolang
# @hide-output
let gc_table = seqs
|> map(|s| {id: s.id, length: seq_len(s.seq), gc: gc_content(s.seq)})
|> table()
```
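The four directives can be summarized as a small lookup table. The Python sketch below is an interpretation of the descriptions above, not how bl implements directives. The names BEHAVIOR and split_directive are hypothetical, and two entries are assumptions: that a cell without a directive shows its output but not its code, and that a skipped cell shows nothing.

```python
# Directive names come from the tutorial; the flags are an interpretation of its prose.
BEHAVIOR = {
    None:          {"run": True,  "show_code": False, "show_output": True},   # assumed default
    "hide":        {"run": True,  "show_code": False, "show_output": False},
    "echo":        {"run": True,  "show_code": True,  "show_output": True},
    "skip":        {"run": False, "show_code": False, "show_output": False},  # display is assumed
    "hide-output": {"run": True,  "show_code": True,  "show_output": False},
}


def split_directive(cell: str):
    """Return (directive, body); directive is None unless the first line is a known '# @name'."""
    lines = cell.splitlines()
    if lines and lines[0].strip().startswith("# @"):
        name = lines[0].strip()[3:].strip()
        if name in BEHAVIOR:
            return name, "\n".join(lines[1:])
    # No directive, or an unrecognized one: treat the whole cell as code.
    return None, cell


directive, body = split_directive("# @hide\nlet threshold = 2.0")
print(directive, BEHAVIOR[directive])
```

Because the directive must be the first line of the cell, an ordinary comment such as the TODO in the skipped cell above is left in the body untouched.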
Step 4: Export to HTML
Generate a self-contained HTML report with syntax highlighting:
bl notebook gc_analysis.bln --export html > gc_report.html
Open gc_report.html in a browser. You'll see:
- Rendered Markdown headings and prose
- Syntax-highlighted code blocks (keywords in purple, strings in green, pipes in cyan)
- Code output in a separate block
- A dark-themed design with no external dependencies
The HTML is a single file — share it via email, put it on a web server, or include it in a lab notebook. No BioLang installation needed to view it.
Step 5: Jupyter interop
If you have existing Jupyter notebooks, convert them to .bln:
# Import: .ipynb to .bln
bl notebook experiment.ipynb --from-ipynb > experiment.bln
Markdown cells become prose sections. Code cells become fenced ```biolang
blocks. Outputs are discarded (they'll regenerate when you run the
notebook).
Going the other way:
# Export: .bln to .ipynb
bl notebook gc_analysis.bln --to-ipynb > gc_analysis.ipynb
The resulting .ipynb uses nbformat v4 and opens in JupyterLab,
VS Code, or any notebook viewer. Code cells are tagged with
"language": "biolang".
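To make the target format concrete, here is a minimal Python sketch that assembles an nbformat v4 document from a list of cells using only the standard library. It is not the converter bl uses. The function name to_ipynb is hypothetical, and placing "language": "biolang" in each code cell's metadata is an assumption, since the tutorial does not say where the tag lives.

```python
import json


def to_ipynb(cells: list[tuple[str, str]]) -> str:
    """Build a minimal nbformat v4 notebook from (kind, text) pairs, kind being 'markdown' or 'code'."""
    nb_cells = []
    for kind, text in cells:
        source = text.splitlines(keepends=True)  # nbformat stores source as a list of lines
        if kind == "markdown":
            nb_cells.append({"cell_type": "markdown", "metadata": {}, "source": source})
        else:
            nb_cells.append({
                "cell_type": "code",
                "execution_count": None,              # never executed
                "metadata": {"language": "biolang"},  # assumption: tag stored in cell metadata
                "outputs": [],                        # outputs regenerate on the next run
                "source": source,
            })
    # nbformat_minor 4 is used because cell ids only become mandatory from 4.5 onward.
    notebook = {"cells": nb_cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}
    return json.dumps(notebook, indent=1)


print(to_ipynb([
    ("markdown", "# GC Content Analysis"),
    ("code", 'let seq = dna"ATCGATCGATCG"\nprint(gc_content(seq))'),
]))
```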
Step 6: Dash delimiters (alternative syntax)
Instead of fenced code blocks, you can use --- on its own line
to delimit code. This is the original BioLang notebook format:
## Load Data
---
let seqs = read_fasta("contigs.fa")
print(f"Loaded {len(seqs)} sequences")
---
## Results
The output above shows the sequence count.
Both styles can be mixed in the same file. Fenced blocks are recommended for new notebooks since they're compatible with standard Markdown renderers.
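The Python sketch below shows one way a reader could split a mixed-style notebook into prose and code cells. It illustrates the two delimiter styles and is not BioLang's actual parser; the function name split_cells is hypothetical. It assumes a fenced block must be closed by a fence and a dash block by dashes.

```python
FENCE = "`" * 3  # three backticks, built this way so the sketch can live inside a fenced block


def split_cells(text: str) -> list[tuple[str, str]]:
    """Split notebook text into ("prose", body) and ("code", body) cells."""
    cells: list[tuple[str, str]] = []
    buf: list[str] = []
    in_code = False
    closer = None  # the delimiter that ends the current code cell

    def flush(kind: str) -> None:
        body = "\n".join(buf).strip("\n")
        if body:
            cells.append((kind, body))
        buf.clear()

    for line in text.splitlines():
        stripped = line.strip()
        if not in_code and stripped in (FENCE + "biolang", "---"):
            flush("prose")
            in_code = True
            closer = FENCE if stripped.startswith(FENCE) else "---"
        elif in_code and stripped == closer:
            flush("code")
            in_code = False
        else:
            buf.append(line)
    flush("code" if in_code else "prose")  # keep an unterminated trailing cell
    return cells


demo = "\n".join([
    "## Load Data", "---", 'let seqs = read_fasta("contigs.fa")', "---",
    "## Results", FENCE + "biolang", "print(len(seqs))", FENCE,
])
for kind, body in split_cells(demo):
    print(kind, "|", body)
```

The sketch also shows why fenced blocks are the safer choice: a line containing only --- is a horizontal rule in standard Markdown, so a parser like this one cannot tell a rule in the prose apart from the start of a code cell.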
Complete notebook
Here's the final version with all features combined:
# GC Content Analysis
A reproducible analysis of per-contig GC content.
Outlier contigs may indicate contamination or HGT events.
## Setup
```biolang
# @hide
let threshold = 2.0
```
## Load Data
Read the FASTA file and report basic counts.
```biolang
let seqs = read_fasta("examples/sample-data/contigs.fa")
print(f"Loaded {len(seqs)} sequences")
```
## Compute GC Statistics
```biolang
# @echo
let gc_values = seqs |> map(|s| gc_content(s.seq))
let mu = mean(gc_values)
let sigma = stdev(gc_values)
print(f"Mean GC: {mu:.3f} +/- {sigma:.4f}")
```
## Build Results Table
```biolang
# @hide-output
let gc_table = seqs
|> map(|s| {id: s.id, length: seq_len(s.seq), gc: gc_content(s.seq)})
|> table()
```
## Identify Outliers
Contigs more than **2 standard deviations** from the mean
may represent contamination or horizontal gene transfer.
```biolang
let outliers = gc_table
|> filter(|row| abs(row.gc - mu) > threshold * sigma)
|> arrange("-gc")
print(f"Found {len(outliers)} outlier contigs:")
outliers |> each(|row| print(f" {row.id}: GC={row.gc:.3f}, length={row.length}"))
```
## Summary
> Review flagged contigs before downstream assembly.
> Consider BLAST against nt database to confirm contamination.
Run, export, or convert:
# Terminal
bl notebook gc_analysis.bln
# HTML report
bl notebook gc_analysis.bln --export html > report.html
# Jupyter
bl notebook gc_analysis.bln --to-ipynb > gc_analysis.ipynb
Tips and best practices
| Practice | Why |
|---|---|
| Use # @hide for setup | Keeps the narrative focused on the science, not boilerplate |
| Use # @echo for key steps | Readers see exactly what code produced each result |
| Use # @skip during development | Disable expensive cells without deleting them |
| Prefer fenced blocks over --- | Standard Markdown: GitHub, editors, and viewers render them correctly |
| One concept per cell | Easier to understand, debug, and reorder |
| Version control .bln files | They're plain text, so git diff works perfectly |
| Export HTML for sharing | Self-contained file, no BioLang needed to view |
What's next
- Notebooks reference — full format spec and directive details
- Variant Analysis — try writing it as a notebook instead of a script
- Visualization — add plots to your notebook reports