Other API Clients
Beyond the major databases, BioLang includes clients for KEGG, STRING, PDB, Reactome, Gene Ontology, COSMIC, and BioMart. All are available as global functions with no import required.
KEGG (Kyoto Encyclopedia of Genes and Genomes)
Search databases, retrieve entries, and find cross-references with three
functions: kegg_find, kegg_get, and kegg_link.
# Search KEGG for pathways related to apoptosis
let results = kegg_find("pathway", "apoptosis")
results |> map(|r| print(r.id, r.description))
# Get the raw KEGG entry for a specific pathway
let entry = kegg_get("hsa04110") # Cell cycle
print(entry) # raw KEGG flat-file text
# Search the genes database for human BRCA1 entries
results = kegg_find("genes", "brca1+hsa")
results |> map(|r| print(r.id, r.description))
# Cross-reference: find genes linked to a pathway
let links = kegg_link("hsa", "pathway:hsa04110")
links |> map(|l| print(l.source, "->", l.target))
# Cross-reference: find pathways for a gene
let pathways = kegg_link("pathway", "hsa:672") # BRCA1
pathways |> map(|p| print(p.target))
# Get compound information
let compound_text = kegg_get("C00002") # ATP
print(compound_text)
STRING (Protein-Protein Interaction Network)
Query protein interaction networks and functional enrichment with
string_network and string_enrichment.
The species argument is a numeric NCBI taxonomy ID (9606 = human).
# Get interaction network for a protein (first arg must be a list)
let network = string_network(["TP53"], 9606)
# Returns a list of {protein_a, protein_b, score} records
network |> each(|i| print(i.protein_a + " <-> " + i.protein_b + " score: " + str(i.score)))
# Get network for multiple proteins
let proteins = ["BRCA1", "BRCA2", "RAD51", "ATM", "ATR"]
network = string_network(proteins, 9606)
# Filter by confidence score
let high_conf = network |> filter(|i| i.score > 0.9)
print(high_conf |> len(), "high-confidence interactions")
# Functional enrichment analysis
let enrichment = string_enrichment(proteins, 9606)
enrichment
|> filter(|e| e.category == "Process")
|> sort(.fdr)
|> take(10)
|> map(|e| print(e.term, e.description, "FDR:", e.fdr))
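Because the species argument is an NCBI taxonomy ID, the same calls should carry over to other organisms. The following is a minimal sketch, assuming string_network resolves mouse gene symbols the same way it resolves human ones. 10090 is the NCBI taxonomy ID for Mus musculus, and Trp53 is the mouse ortholog of TP53; the 0.7 cutoff is an illustrative choice, not a documented default.

```
# Assumption: string_network accepts any species covered by STRING
let mouse_net = string_network(["Trp53", "Mdm2"], 10090)
# Keep interactions above an illustrative confidence cutoff
mouse_net
  |> filter(|i| i.score > 0.7)
  |> each(|i| print(i.protein_a + " <-> " + i.protein_b + " score: " + str(i.score)))
```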
PDB (Protein Data Bank)
Look up 3D protein structures with pdb_entry and search
the archive with pdb_search.
# Get a structure entry
let entry = pdb_entry("4HHB") # Hemoglobin
print(entry.title)
print(entry.resolution) # => 1.74 Angstroms
print(entry.method) # => "X-RAY DIFFRACTION"
print(entry.deposition_date)
# List entities (chains / molecules)
entry.entities |> map(|e| print(e.description, e.type))
# Search PDB by keyword
let ids = pdb_search("CRISPR Cas9")
print(ids |> len(), "structures found")
# Look up details for each hit
ids |> take(5) |> map(|id| {
let e = pdb_entry(id)
print(e.id, e.title, e.resolution)
})
Reactome (Pathway Database)
Search curated biological pathways with reactome_search
and find pathways for specific genes with reactome_pathways.
# Search Reactome for DNA repair pathways
let results = reactome_search("DNA repair")
results
|> filter(|r| r.species == "Homo sapiens")
|> map(|r| print(r.id, r.name))
# Find pathways for a gene
let pathways = reactome_pathways("TP53")
pathways |> map(|p| print(p.id, p.name))
# Find pathways for multiple genes
let genes = ["BRCA1", "BRCA2", "RAD51"]
genes |> map(|g| {
let paths = reactome_pathways(g)
print("{g}: {paths |> len()} pathways")
})
Gene Ontology (GO)
Look up GO terms, retrieve annotations, and navigate the ontology hierarchy.
Functions include: go_term, go_annotations, go_children, go_parents, go_ancestors, and go_descendants.
# Look up a GO term by ID
let term = go_term("GO:0006915") # Apoptotic process
print(term.name) # => "apoptotic process"
print(term.namespace) # => "biological_process"
print(term.definition)
# Get annotations for a gene
let annotations = go_annotations("TP53")
annotations |> map(|a| {
go_id: a.go_id,
name: a.term,
aspect: a.aspect
}) |> to_table() |> print()
# Get parent terms in the GO hierarchy
let parents = go_parents("GO:0006915")
parents |> map(|p| print(p.id, p.name))
# Get child terms in the GO hierarchy
let children = go_children("GO:0008150") # biological_process
print(children |> len(), "child terms")
children |> map(|c| print(c.id, c.name))
COSMIC (Catalogue of Somatic Mutations in Cancer)
Query somatic mutation data for a gene with cosmic_gene. The
COSMIC_API_KEY environment variable must be set before calling it.
# Query mutations in a gene
let mutations = cosmic_gene("TP53")
mutations |> map(|m| {
mutation_id: m.id,
position: m.genomic_position,
aa_change: m.aa_mutation,
cancer_type: m.primary_site,
count: m.count
}) |> to_table() |> print()
# Show the 20 most frequent mutations (sorted by count, descending)
mutations
|> sort(.count)
|> reverse()
|> take(20)
|> map(|m| print(m.aa_mutation, "count:", m.count))
# Check another gene
let egfr = cosmic_gene("EGFR")
print("EGFR: {egfr |> len()} mutations found")
BioMart (Ensembl BioMart)
Retrieve data in bulk from Ensembl BioMart with biomart_query.
It takes three positional arguments: the dataset name, a list of attributes, and a record of filters.
# Query gene attributes in a genomic region
let genes = biomart_query(
"hsapiens_gene_ensembl",
["ensembl_gene_id", "external_gene_name", "gene_biotype", "start_position", "end_position"],
{chromosome_name: "17", start: 43044295, end: 43170245}
)
genes |> to_table() |> print()
# Get GO annotations for a gene list
let go_annot = biomart_query(
"hsapiens_gene_ensembl",
["hgnc_symbol", "go_id", "name_1006", "namespace_1003"],
{hgnc_symbol: ["BRCA1", "TP53", "EGFR"]}
)
go_annot |> to_table() |> print()
# Get ortholog mappings
let orthologs = biomart_query(
"hsapiens_gene_ensembl",
["external_gene_name", "mmusculus_homolog_ensembl_gene", "mmusculus_homolog_orthology_type"],
{hgnc_symbol: ["BRCA1", "TP53"]}
)
orthologs |> to_table() |> print()
GO Graph Traversal
Navigate the Gene Ontology hierarchy with traversal functions.
| Function | Description |
|---|---|
| go_term(id) | Look up a term by GO ID; returns {id, name, namespace, definition} |
| go_annotations(gene) | Get GO annotations for a gene; returns a list of {go_id, term, aspect} |
| go_children(id) | Direct child terms |
| go_parents(id) | Direct parent terms |
| go_ancestors(id) | All ancestor terms |
| go_descendants(id) | All descendant terms |
# Navigate the GO hierarchy
let children = go_children("GO:0008150")
print("Child terms: {children |> len()}")
children |> map(|c| print(c.id, c.name))
# Look up a specific term and its children
let term = go_term("GO:0005634") # nucleus
let kids = go_children("GO:0005634")
print("{term.name} has {kids |> len()} children")
# Get annotations and cross-reference with term details
let annotations = go_annotations("BRCA1")
annotations |> map(|a| {
let detail = go_term(a.go_id)
print(a.go_id, detail.name, "({detail.namespace})")
})
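The examples above cover only direct parents and children. The transitive functions go_ancestors and go_descendants can presumably be used the same way. This is a minimal sketch, assuming both take a GO ID and return lists of records with id and name fields, as go_parents and go_children do; the return shape is not confirmed here.

```
# Assumption: go_ancestors / go_descendants return lists of {id, name} records,
# mirroring go_parents / go_children
let ancestors = go_ancestors("GO:0006915")   # apoptotic process
print("Ancestors: {ancestors |> len()}")
ancestors |> map(|a| print(a.id, a.name))

# Descendant sets can be large, so print only the first few
let descendants = go_descendants("GO:0006915")
print("Descendants: {descendants |> len()}")
descendants |> take(10) |> map(|d| print(d.id, d.name))
```

For a term near the root, such as GO:0008150, go_descendants may return a very large list, so limiting the output with take is a sensible default.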