Math & Statistics
27 functions for descriptive statistics, hypothesis testing, mathematical operations, and ODE integration. Built for bioinformatics data analysis workflows.
Descriptive Statistics
mean
Arithmetic mean of a numeric list.
mean(list) -> float
| Parameter | Type | Description |
|---|---|---|
| list | list<number> | Numeric values |
mean([30, 28, 35, 22, 31]) # 29.2
let coverage = [12, 45, 89, 102, 67, 54]
println("Mean coverage:", mean(coverage)) # Mean coverage: 61.5
Edge case: Empty list returns NaN. Filter out nil values before computing.
median
Middle value of a sorted list. For even-length lists, returns the mean of the two middle values.
median(list) -> float
median([1, 3, 5, 7, 9]) # 5.0
median([1, 3, 5, 7]) # 4.0 (mean of 3 and 5)
stdev / variance
Sample standard deviation and variance (Bessel-corrected, divides by n-1).
stdev(list) -> float
variance(list) -> float
let data = [10, 12, 23, 23, 16, 23, 21, 16]
stdev(data) # 5.2372...
variance(data) # 27.4285...
sum
Sum all elements in a numeric list.
sum(list) -> number
sum([1, 2, 3, 4, 5]) # 15
let reads_per_sample = [1200000, 980000, 1500000, 870000]
println("Total reads:", sum(reads_per_sample)) # Total reads: 4550000
quantile
Compute the q-th quantile (0.0 to 1.0) using linear interpolation.
quantile(list, q) -> float
| Parameter | Type | Description |
|---|---|---|
| list | list<number> | Numeric values |
| q | float | Quantile (0.0 to 1.0) |
let data = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
quantile(data, 0.25) # 5.5 (Q1)
quantile(data, 0.50) # 11.0 (median)
quantile(data, 0.75) # 16.5 (Q3)
quantile(data, 0.95) # 19.1
Hypothesis Testing
ttest
Two-sample Student's t-test. Returns a map with t-statistic, p-value, and degrees of freedom.
ttest(a, b) -> map
let control = [5.1, 4.9, 5.3, 5.0, 4.8]
let treated = [6.2, 5.8, 6.5, 6.1, 5.9]
let result = ttest(control, treated)
# {"t_stat": -5.27, "p_value": 0.0006, "df": 8, "significant": true}
if result.p_value < 0.05 {
println("Significant difference (p =", result.p_value, ")")
}
anova
One-way ANOVA test across multiple groups.
anova(groups) -> map
let g1 = [23, 25, 28, 22]
let g2 = [30, 32, 35, 29]
let g3 = [18, 20, 22, 19]
let result = anova([g1, g2, g3])
# {"f_stat": 15.8, "p_value": 0.001, "df_between": 2, "df_within": 9}
chi_square
Chi-squared test for independence on a contingency table (2D list).
chi_square(observed, expected) -> map
let observed = [45, 55, 30, 70]
let expected = [50, 50, 50, 50]
let result = chi_square(observed, expected)
# {"chi2": 4.76, "p_value": 0.029, "df": 1}
cor
Pearson correlation coefficient between two numeric lists of equal length.
cor(a, b) -> float
let expression = [2.1, 4.3, 3.8, 6.1, 5.5]
let methylation = [0.8, 0.3, 0.4, 0.1, 0.2]
cor(expression, methylation) # -0.956 (strong negative)
Mathematical Functions
| Function | Signature | Description | Example |
|---|---|---|---|
| sqrt | sqrt(n) -> float | Square root | sqrt(16) # 4.0 |
| pow | pow(base, exp) -> float | Exponentiation | pow(2, 10) # 1024.0 |
| log | log(n) -> float | Natural logarithm (ln) | log(2.718) # ~1.0 |
| log2 | log2(n) -> float | Base-2 logarithm | log2(1024) # 10.0 |
| log10 | log10(n) -> float | Base-10 logarithm | log10(1000) # 3.0 |
| exp | exp(n) -> float | e raised to the power n | exp(1) # 2.718... |
| sin | sin(radians) -> float | Sine | sin(0) # 0.0 |
| cos | cos(radians) -> float | Cosine | cos(0) # 1.0 |
| tan | tan(radians) -> float | Tangent | tan(0) # 0.0 |
| ceil | ceil(n) -> int | Round up to integer | ceil(3.2) # 4 |
| floor | floor(n) -> int | Round down to integer | floor(3.8) # 3 |
| round | round(n, places?) -> float | Round to decimal places | round(3.14159, 2) # 3.14 |
| random | random() -> float | Random float in [0, 1) | random() # 0.7321... |
| sample | sample(list, n) -> list | Random sample without replacement | sample([1,2,3,4,5], 3) |
Common Bioinformatics Patterns
# log2 fold change
let control_mean = mean([5.1, 4.9, 5.3])
let treated_mean = mean([10.2, 9.8, 10.5])
let log2fc = log2(treated_mean / control_mean)
println("log2FC:", round(log2fc, 3)) # log2FC: 0.989
# -log10 p-value for volcano plots
let pvals = [0.001, 0.05, 0.0001, 0.5]
let neg_log10 = map(pvals, |p| -log10(p))
# [3.0, 1.301, 4.0, 0.301]
# Phred quality score conversion
let error_rate = 0.001
let phred = -10 * log10(error_rate) # 30.0
# Bootstrap confidence interval
let data = [23, 25, 28, 22, 30, 27]
let boots = range(0, 1000) |> map(|_| {
let s = sample(data, len(data))
mean(s)
})
println("95% CI:", quantile(boots, 0.025), "-", quantile(boots, 0.975))
into
Convert a value between types. Supports conversion to "List", "Set", "Table", "Str", "Int", "Float", and "Record".
into(value, type_str) -> value
into([1, 2, 3], "Set") # {1, 2, 3}
into({a: 1, b: 2}, "List") # [["a", 1], ["b", 2]]
into(3.14, "Int") # 3
into(42, "Str") # "42"
into("3.14", "Float") # 3.14
Kolmogorov-Smirnov Test
Two-sample KS test — tests whether two samples come from the same distribution.
let result = ks_test(sample_a, sample_b)
println("KS statistic:", result.statistic)
println("p-value:", result.pvalue)
# Compare quality distributions between lanes
let lane1_quals = map(reads_lane1, |r| r.mean_quality) |> collect()
let lane2_quals = map(reads_lane2, |r| r.mean_quality) |> collect()
let ks = ks_test(lane1_quals, lane2_quals)
if ks.pvalue < 0.05 { println("Significant difference between lanes") }
Correlation: Spearman & Kendall
Rank-based correlation coefficients for non-linear relationships.
# Spearman rank correlation
let sp = spearman(expression_a, expression_b)
println("Spearman rho:", sp.coefficient, "p:", sp.pvalue)
# Kendall's tau — robust to outliers
let kt = kendall(drug_dose, response)
println("Kendall tau:", kt.coefficient, "p:", kt.pvalue)
# Quick comparison — reassigning sp and kt
sp = spearman(x, y)
kt = kendall(x, y)
println("Spearman:", sp.coefficient, "Kendall:", kt.coefficient)
Survival Analysis
Kaplan-Meier estimator and Cox proportional hazards regression for time-to-event data.
# Kaplan-Meier survival curve
let km = kaplan_meier(times, events)
# Returns: {times, survival, ci_lower, ci_upper, at_risk}
# Access survival probabilities
for i in range(0, len(km.times)) {
println("t=", km.times[i], " S=", km.survival[i],
" [", km.ci_lower[i], ",", km.ci_upper[i], "]",
" n=", km.at_risk[i])
}
# Cox proportional hazards
let cox = cox_ph(time, event, covariates_table)
# Returns: {coefficients, hazard_ratios, pvalues, concordance}
println("Concordance:", cox.concordance)
for i in range(0, len(cox.coefficients)) {
println("HR:", cox.hazard_ratios[i], "p:", cox.pvalues[i])
}
Formula-Based Linear Models
The lm function fits a linear model from response and predictor data.
# Linear model: lm(response, predictors)
let model = lm(expression_vec, predictor_matrix)
# Returns: {coefficients, r_squared, adj_r_squared, residuals, predictions, ...}
println("R²:", model.r_squared)
println("Coefficients:", model.coefficients)