Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Knowledge Graphs

BioLang includes a built-in graph data structure for modeling biological networks — protein interactions, gene regulatory networks, metabolic pathways, and more.

Creating Graphs

# Empty undirected graph
let g = graph()

# Directed graph
let g = graph(true)

Adding Nodes and Edges

Nodes are identified by string IDs and can carry arbitrary attributes:

let g = graph()
let g = add_node(g, "BRCA1", {biotype: "protein_coding", chrom: "chr17"})
let g = add_node(g, "TP53", {biotype: "protein_coding", chrom: "chr17"})
let g = add_edge(g, "BRCA1", "TP53", {score: 0.99, source: "STRING"})

Adding an edge auto-creates nodes that don’t exist yet:

let g = graph()
let g = add_edge(g, "A", "B")  # both A and B are created
has_node(g, "A")                # true

Querying the Graph

# Build a small PPI network
let g = graph()
let g = add_edge(g, "BRCA1", "TP53", {score: 0.99})
let g = add_edge(g, "TP53", "MDM2", {score: 0.97})
let g = add_edge(g, "BRCA1", "BARD1", {score: 0.95})
let g = add_edge(g, "MDM2", "CDKN2A", {score: 0.85})

# Direct neighbors
neighbors(g, "TP53")           # ["BRCA1", "MDM2"]

# Node degree
degree(g, "BRCA1")             # 2

# Shortest path (BFS)
shortest_path(g, "BARD1", "CDKN2A")  # ["BARD1", "BRCA1", "TP53", "MDM2", "CDKN2A"]

# All nodes and edges
nodes(g)                       # ["BARD1", "BRCA1", "CDKN2A", "MDM2", "TP53"]
edges(g)                       # Table with from, to, weight columns

Graph Analysis

Connected Components

Find disconnected subnetworks:

let g = graph()
let g = add_edge(g, "BRCA1", "TP53")
let g = add_edge(g, "BRCA1", "BARD1")
let g = add_node(g, "ISOLATED_GENE")
let g = add_edge(g, "CDK2", "CCND1")

let components = connected_components(g)
print("Number of components: " + str(len(components)))
# 3: [BRCA1, TP53, BARD1], [ISOLATED_GENE], [CDK2, CCND1]

Induced Subgraph

Extract a subgraph containing only specified nodes and their connecting edges:

let g = graph()
let g = add_edge(g, "A", "B")
let g = add_edge(g, "B", "C")
let g = add_edge(g, "C", "D")

let sub = subgraph(g, ["A", "B", "C"])
has_edge(sub, "A", "B")    # true
has_edge(sub, "C", "D")    # false (D not in subgraph)

Node Attributes

let g = graph()
let g = add_node(g, "EGFR", {
    biotype: "protein_coding",
    chrom: "chr7",
    pathway: "EGFR signaling"
})

let attrs = node_attr(g, "EGFR")
print(attrs.pathway)    # "EGFR signaling"

Removing Nodes and Edges

let g = graph()
let g = add_edge(g, "A", "B")
let g = add_edge(g, "B", "C")

# Remove a single edge
let g = remove_edge(g, "A", "B")
has_edge(g, "A", "B")    # false

# Remove a node (also removes all connected edges)
let g = remove_node(g, "B")
has_node(g, "B")          # false

Directed vs Undirected

By default, graphs are undirected — edges go both ways:

let g = graph()
let g = add_edge(g, "A", "B")
neighbors(g, "B")    # ["A"] — B sees A as neighbor

For directed graphs (e.g., regulatory networks):

let g = graph(true)
let g = add_edge(g, "TF", "TARGET")
neighbors(g, "TF")      # ["TARGET"]
neighbors(g, "TARGET")  # [] — directed, no reverse edge

Real-World Example: STRING Network Analysis

# Fetch protein interactions from STRING
let network = string_network(["BRCA1"], 9606)

# Build graph from API results — network is list of {protein_a, protein_b, score}
let g = graph()
let g = network |> reduce(g, |g, edge|
    add_edge(g, edge.protein_a, edge.protein_b, {score: edge.score})
)

# Find hub genes (highest degree)
let gene_degrees = nodes(g) |> map(|n| {gene: n, deg: degree(g, n)})
gene_degrees
  |> sort_by(|r| r.deg, desc: true)
  |> take(10)
  |> each(|r| print(r.gene + ": " + str(r.deg) + " interactions"))

# Check connectivity
let components = connected_components(g)
print("Connected components: " + str(len(components)))

Builtin Reference

FunctionArgsDescription
graph()[directed]Create empty graph (default: undirected)
add_node(g, id, [attrs])graph, Str, [Record]Add node with optional attributes
add_edge(g, from, to, [attrs])graph, Str, Str, [Record]Add edge (auto-creates nodes)
remove_node(g, id)graph, StrRemove node and its edges
remove_edge(g, from, to)graph, Str, StrRemove first matching edge
neighbors(g, id)graph, StrList of neighbor node IDs
degree(g, id)graph, StrNumber of edges for a node
shortest_path(g, from, to)graph, Str, StrBFS shortest path (nil if none)
connected_components(g)graphList of node-ID lists per component
nodes(g)graphSorted list of all node IDs
edges(g)graphTable with from, to, weight columns
has_node(g, id)graph, StrBool
has_edge(g, from, to)graph, Str, StrBool
subgraph(g, ids)graph, List[Str]Induced subgraph
node_attr(g, id)graph, StrAttributes record for a node