Collections

BioLang provides two core collection types: Lists (ordered sequences) and Records (collections of named fields). Both support functional operations via pipes and appear throughout typical bioinformatics workflows.

Lists

Lists are ordered, indexable sequences of values. Create them with bracket syntax:

# Create lists
let numbers = [1, 2, 3, 4, 5]
let genes = ["BRCA1", "TP53", "EGFR", "KRAS"]
let mixed = [42, "hello", true, dna"ATCG"]
let empty = []

# Nested lists
let matrix = [
  [1, 2, 3],
  [4, 5, 6],
  [7, 8, 9]
]

Indexing

let items = ["a", "b", "c", "d", "e"]

# Zero-based indexing
let first = items[0]       # "a"
let third = items[2]       # "c"

# Length
let n = len(items)         # 5
println(f"List has {n} elements")
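
Index expressions can be chained to reach into nested lists. The sketch below assumes that indexing the result of another index expression works the same way as indexing a variable, which the examples on this page do not show directly:

# Assumption: chained indexing such as matrix[1][2] is supported
let matrix = [
  [1, 2, 3],
  [4, 5, 6],
  [7, 8, 9]
]

let second_row = matrix[1]    # [4, 5, 6]
let cell = matrix[1][2]       # 6
println(len(matrix))          # 3 (number of rows)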

push and pop

let a = [1, 2, 3]

# push returns a new list with the element appended
let b = push(a, 4)
println(b)   # [1, 2, 3, 4]

# pop returns a new list with the last element removed
let c = pop(a)
println(c)   # [1, 2]

# Original list is unchanged
println(a)   # [1, 2, 3]
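
Because push and pop return new lists rather than modifying their argument, calls can be nested to build a list step by step. A minimal sketch using only the two functions shown above (the names base, extended, and trimmed are illustrative):

let base = [1, 2, 3]

# Each push returns a new list, so the inner result feeds the outer call
let extended = push(push(base, 4), 5)
println(extended)   # [1, 2, 3, 4, 5]

# pop on the new list leaves both earlier lists untouched
let trimmed = pop(extended)
println(trimmed)    # [1, 2, 3, 4]
println(base)       # [1, 2, 3]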

List Operations

let a = [1, 2, 3]
let b = [4, 5, 6]

# Concatenation with ++ operator
let combined = a ++ b
println(combined)   # [1, 2, 3, 4, 5, 6]

# Or use concat()
let also_combined = concat(a, b)
println(also_combined)   # [1, 2, 3, 4, 5, 6]

# Reverse, sort, unique, flatten
let reversed = reverse(a)
println(reversed)   # [3, 2, 1]

let sorted = sort([3, 1, 2])
println(sorted)   # [1, 2, 3]

let deduped = unique([1, 2, 2, 3, 3])
println(deduped)   # [1, 2, 3]

let flat = flatten([[1, 2], [3, 4]])
println(flat)   # [1, 2, 3, 4]

Functional Operations on Lists

let scores = [85, 92, 78, 95, 88]

# map -- transform each element
let doubled = scores |> map(|s| s * 2)
println(doubled)   # [170, 184, 156, 190, 176]

# filter -- keep elements matching a predicate
let high = scores |> filter(|s| s >= 90)
println(high)   # [92, 95]

# reduce -- combine all elements into one value
let total = scores |> reduce(|acc, s| acc + s)
println(total)   # 438

# each -- iterate for side effects
scores |> each(|s| println(f"Score: {s}"))

# Chaining with pipes
let result = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
  |> filter(|x| x % 2 == 0)
  |> map(|x| x * x)
println(result)   # [4, 16, 36, 64, 100]

Comprehension-like Patterns

Combine range() with map and filter to get comprehension-like patterns. range(10) produces the integers 0 through 9:

# Generate squares of 0..9
let squares = range(10) |> map(|i| i * i)
println(squares)   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# Filter and transform in a pipeline
let even_squares = range(10)
  |> filter(|i| i % 2 == 0)
  |> map(|i| i * i)
println(even_squares)   # [0, 4, 16, 36, 64]
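
A pipeline like this can also end in reduce to collapse the generated values into a single result. This sketch reuses only functions already shown on this page; the variable name sum_even_squares is illustrative:

# Sum the squares of the even numbers in 0..9
let sum_even_squares = range(10)
  |> filter(|i| i % 2 == 0)
  |> map(|i| i * i)
  |> reduce(|acc, x| acc + x)
println(sum_even_squares)   # 120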

Records

Records are collections of named fields. They use brace syntax with identifier keys:

# Create a record
let gene = {name: "BRCA1", score: 0.95, chrom: "chr17"}

# Access fields with dot notation
println(gene.name)    # BRCA1
println(gene.score)   # 0.95
println(gene.chrom)   # chr17

Records with String Keys

# Keys may also be written as quoted strings; fields are still read with dot notation
let info = {"gene": "TP53", "impact": "HIGH", "frequency": 0.034}
println(info.gene)      # TP53
println(info.impact)    # HIGH

Record Spread

Use the spread operator (...) to copy the fields of one record into a new record. Fields listed later override earlier ones with the same name:

let defaults = {quality: 30, depth: 10, filter: "PASS"}
let custom = {...defaults, quality: 50, name: "sample_1"}
println(custom.quality)   # 50
println(custom.depth)     # 10
println(custom.name)      # sample_1

Nested Structures

Lists of records are a common pattern, especially for structured bioinformatics data:

# List of records
let variants = [
  {gene: "BRCA1", chrom: "chr17", impact: "HIGH"},
  {gene: "TP53", chrom: "chr17", impact: "HIGH"},
  {gene: "EGFR", chrom: "chr7", impact: "MODERATE"}
]

# Filter and iterate
let high_impact = variants |> filter(|v| v.impact == "HIGH")
high_impact |> each(|v| println(f"{v.gene} on {v.chrom}"))

# Transform to extract fields
let gene_names = variants |> map(|v| v.gene)
println(gene_names)   # [BRCA1, TP53, EGFR]
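
Filtering and field extraction can also be combined in a single pipeline. A minimal sketch that repeats the variants data so it stands alone (the name chr17_genes is illustrative):

let variants = [
  {gene: "BRCA1", chrom: "chr17", impact: "HIGH"},
  {gene: "TP53", chrom: "chr17", impact: "HIGH"},
  {gene: "EGFR", chrom: "chr7", impact: "MODERATE"}
]

# Keep the chr17 variants, then extract just the gene names
let chr17_genes = variants
  |> filter(|v| v.chrom == "chr17")
  |> map(|v| v.gene)
println(chr17_genes)        # [BRCA1, TP53]
println(len(chr17_genes))   # 2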

Sorting Lists

# Sort with default ordering
let sorted_nums = sort([3, 1, 4, 1, 5])
println(sorted_nums)   # [1, 1, 3, 4, 5]

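# Sort with a custom comparator using sort_by
# (the closure receives two elements; here a negative result puts a before b)
let genes = ["EGFR", "BRCA1", "TP53"]
let by_length = genes |> sort_by(|a, b| len(a) - len(b))
println(by_length)   # [EGFR, TP53, BRCA1] if the sort is stable (EGFR and TP53 tie at length 4)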

# Reverse after sorting for descending order
let descending = sort([3, 1, 4, 1, 5]) |> reverse()
println(descending)   # [5, 4, 3, 1, 1]

Useful List Builtins

let data = [10, 20, 30, 40, 50]

# Length
println(len(data))   # 5

# take keeps the first n elements; skip drops the first n
let first_three = take(data, 3)
println(first_three)   # [10, 20, 30]

let after_two = skip(data, 2)
println(after_two)   # [30, 40, 50]

# zip -- combine two lists element-wise
let names = ["Alice", "Bob", "Carol"]
let ages = [30, 25, 35]
let pairs = zip(names, ages)
println(pairs)   # [[Alice, 30], [Bob, 25], [Carol, 35]]

# enumerate -- get index and value
let items = ["a", "b", "c"]
let indexed = enumerate(items)
println(indexed)   # [[0, a], [1, b], [2, c]]