UniProt
The UniProt client provides access to the Universal Protein Resource (UniProt), a comprehensive knowledge base of protein sequences and functional annotation. It supports entry retrieval, full-text search, sequence BLAST, feature annotations, GO terms, and ID mapping between databases. No API key is required.
uniprot_entry
Fetch a complete UniProt entry by accession number. Returns detailed protein information including sequence, function, and cross-references. Feature annotations are fetched separately with uniprot_features():
# Fetch BRCA1 protein
let entry = uniprot_entry("P38398")
print(entry.accession) # => "P38398"
print(entry.name) # => "BRCA1_HUMAN"
print(entry.organism) # => "Homo sapiens"
print(entry.sequence_length) # => 1863
print(entry.gene_names) # => ["BRCA1"]
# Function annotation
print(entry.function)
# Features are fetched separately with uniprot_features()
let features = uniprot_features("P38398")
features
|> filter(|f| f.type == "Domain")
|> map(|f| { type: f.type, location: f.location, description: f.description })
|> to_table()
|> print()
uniprot_search
Search UniProt using free text or structured queries. The function takes a query string and an optional maximum number of results, and supports the full UniProt query syntax:
# Simple text search (query, [max_results])
let results = uniprot_search("BRCA1 AND organism_id:9606")
results |> map(|r| print(r.accession, r.name))
# Structured query with limit
results = uniprot_search("kinase AND reviewed:true AND organism_id:9606", 50)
# Search by gene name
results = uniprot_search("gene:TP53 AND organism_id:9606 AND reviewed:true")
# Search by GO term: apoptosis (GO:0006915)
results = uniprot_search("go:0006915 AND organism_id:9606")
results
|> map(|r| { accession: r.accession, name: r.name, length: r.sequence_length })
|> to_table()
|> print()
# Broad kinase search
results = uniprot_search("kinase AND organism_id:9606 AND reviewed:true", 100)
results |> to_table() |> print()
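The query strings above share one shape: an optional free-text term followed by field:value clauses joined with AND. As a rough illustration, written in Python rather than the client's own language, the sketch below assembles such strings. The helper name build_query is hypothetical and not part of the client; the field names are the ones already used in the examples above.

```python
def build_query(text=None, **fields):
    """Join an optional free-text term and field:value clauses with AND.

    Hypothetical helper for illustration only; it just builds the string
    that would be passed to uniprot_search().
    """
    clauses = [text] if text else []
    for field, value in fields.items():
        # The examples above write booleans in lowercase (reviewed:true)
        if isinstance(value, bool):
            value = "true" if value else "false"
        clauses.append(f"{field}:{value}")
    return " AND ".join(clauses)


print(build_query("kinase", reviewed=True, organism_id=9606))
# kinase AND reviewed:true AND organism_id:9606
```

Keyword arguments keep their call order, so the clauses appear in the order they are written.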
Features and Annotations
UniProt entries contain rich feature annotations including domains, active sites, binding sites, variants, and post-translational modifications:
# Get all features for a protein with uniprot_features(accession)
let features = uniprot_features("P38398")
# Filter by feature type; each feature has type, location, and description fields
let domains = features |> filter(|f| f.type == "Domain")
domains |> map(|d| print(d.description, ":", d.location))
# Variants
let variants = features |> filter(|f| f.type == "Natural variant")
variants |> map(|v| print(v.description, "@", v.location))
# Post-translational modifications
let ptms = features |> filter(|f| f.type == "Modified residue")
ptms |> map(|p| print(p.type, "@", p.location, ":", p.description))
# GO terms are fetched with uniprot_go(accession)
let go = uniprot_go("P38398")
go
|> filter(|g| g.aspect == "biological_process")
|> map(|g| print(g.id, g.term))
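The filters above each select a single feature type. To see how many features of every type a protein has, one pass over the list is enough. The sketch below shows the idea in Python rather than the client's own language. It assumes each feature is a record with type, location, and description fields, as described above; the sample records are placeholders, not real UniProt annotations, and the string form of location is an assumption.

```python
from collections import Counter


def count_by_type(features):
    """Return a mapping of feature type -> number of features of that type."""
    return Counter(f["type"] for f in features)


# Placeholder records for illustration only (not real UniProt data)
features = [
    {"type": "Domain", "location": "1-10", "description": "placeholder domain"},
    {"type": "Domain", "location": "20-30", "description": "placeholder domain"},
    {"type": "Modified residue", "location": "15", "description": "placeholder PTM"},
]

counts = count_by_type(features)
print(counts["Domain"])            # 2
print(counts["Modified residue"])  # 1
```

A Counter returns 0 for types that never occur, so looking up an absent type such as "Natural variant" does not raise an error.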
Practical Example: Protein Comparison
# Compare properties of cancer-related proteins
let cancer_genes = ["BRCA1", "TP53", "EGFR", "KRAS", "PIK3CA", "PTEN"]
let comparison = cancer_genes |> map(|gene| {
  # Take the top reviewed human hit; assumes the search returns at least one result
  let results = uniprot_search("gene:{gene} AND organism_id:9606 AND reviewed:true")
  let entry = uniprot_entry(results[0].accession)
  {
    gene: gene,
    accession: entry.accession,
    length: entry.sequence_length,
    domains: uniprot_features(entry.accession) |> filter(|f| f.type == "Domain") |> len(),
    go_terms: uniprot_go(entry.accession) |> len()
  }
})
comparison |> to_table() |> print()
comparison |> to_table() |> write_csv("protein_comparison.csv")
ID Mapping
# Map UniProt accessions to entry names and gene names
# Fetch each entry and tabulate its identifiers
let accessions = ["P38398", "P04637", "P00533"]
accessions |> map(|acc| {
  let entry = uniprot_entry(acc)
  {
    uniprot: acc,
    name: entry.name,
    genes: entry.gene_names,
    length: entry.sequence_length
  }
}) |> to_table() |> print()
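Mapping accessions to identifiers in another database, such as Ensembl, goes through UniProt's ID mapping service, which runs as an asynchronous job: submit the identifiers, poll for completion, then fetch the results. This section does not show a client function for that, so the sketch below only illustrates, in Python, the form fields such a job would carry. The endpoint URL and the database names "UniProtKB_AC-ID" and "Ensembl" reflect the public REST service as best understood here and should be checked against the current UniProt documentation. The helper name idmapping_payload is hypothetical, and no request is sent.

```python
# Assumed submission endpoint of UniProt's public REST ID mapping service
IDMAPPING_RUN_URL = "https://rest.uniprot.org/idmapping/run"


def idmapping_payload(accessions, from_db="UniProtKB_AC-ID", to_db="Ensembl"):
    """Build the form fields for an ID mapping job (nothing is sent).

    Hypothetical helper; the database names are assumptions to verify
    against the service's list of supported databases.
    """
    if not accessions:
        raise ValueError("at least one accession is required")
    # The service expects the identifiers as one comma-separated string
    return {"from": from_db, "to": to_db, "ids": ",".join(accessions)}


payload = idmapping_payload(["P38398", "P04637", "P00533"])
print(payload["ids"])  # P38398,P04637,P00533
```

The payload would be posted to the submission URL; the response is expected to carry a job identifier used to poll for status and retrieve the mapped identifiers.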