Day 28: Capstone — Clinical Trial Analysis

Day 28 of 30 | Capstone: Days 2, 6-8, 11, 17, 19, 25 | ~90 min reading | Clinical Trial

The Problem

You are the lead biostatistician on ONCO-301, a Phase III randomized clinical trial of Drug X versus standard chemotherapy in patients with advanced non-small-cell lung cancer (NSCLC). Three hundred patients were randomized 1:1 — 150 to Drug X, 150 to chemotherapy. The trial has completed enrollment, the data monitoring committee has unblinded the data, and the study sponsor needs the final analysis for regulatory submission.

You have four data tables:

Demographics: age, sex, ECOG performance status (0-2), smoking history, tumor stage (IIIB/IV), PD-L1 expression (%), prior lines of therapy (0/1/2+)
Efficacy - Tumor Response: RECIST 1.1 best overall response for each patient (CR, PR, SD, PD)
Efficacy - Survival: progression-free survival (PFS) and overall survival (OS) in months, with censoring indicators
Safety: adverse events by grade (1-5) and system organ class

The primary endpoint is PFS. Secondary endpoints are OS, overall response rate (ORR), and safety. The statistical analysis plan (SAP) specifies:

Kaplan-Meier curves with log-rank test for PFS and OS
Cox proportional hazards for hazard ratios with 95% CIs
Fisher’s exact test for response rates
Subgroup analysis by PD-L1 expression, ECOG status, and smoking history
FDR correction for multiple adverse event comparisons

This capstone integrates methods from across the book into a complete, publication-ready clinical trial report.

Setting Up the Analysis

set_seed(42)
# ============================================
# ONCO-301 Phase III Clinical Trial — Final Analysis
# Protocol: Drug X vs Standard Chemotherapy in Advanced NSCLC
# Primary endpoint: Progression-Free Survival
# ============================================


# --- Configuration ---
let CONFIG = {
  alpha: 0.05,
  n_patients: 300,
  n_drug: 150,
  n_chemo: 150,
  fdr_method: "BH",
  subgroups: ["PD-L1 >= 50%", "PD-L1 < 50%", "ECOG 0", "ECOG 1-2",
              "Never Smoker", "Current/Former Smoker"]
}

Section 1: Demographics and Baseline Characteristics (Table 1)

Table 1 is the first table in every clinical trial publication. It summarizes baseline characteristics by treatment arm and tests for balance — if randomization worked, there should be no significant differences.

set_seed(42)
# --- Simulate patient demographics ---
let arm = repeat("Drug X", 150) + repeat("Chemo", 150)

# Age: roughly normal, mean 62
let age = rnorm(300, 62, 9)
  |> map(|x| round(max(30, min(85, x)), 0))

# Sex: ~60% male in NSCLC trials (a standard-normal draw falls below 0.25 about 60% of the time)
let sex = range(0, 300) |> map(|i| if rnorm(1)[0] < 0.25 { "Male" } else { "Female" })

# ECOG: 0 (40%), 1 (45%), 2 (15%); cutoffs are the z-scores at cumulative 40% and 85%
let ecog = range(0, 300) |> map(|i| {
  let r = rnorm(1)[0]
  if r < -0.25 { 0 } else if r < 1.04 { 1 } else { 2 }
})

# Stage: IIIB (30%), IV (70%)
let stage = range(0, 300) |> map(|i| if rnorm(1)[0] < -0.52 { "IIIB" } else { "IV" })

# PD-L1 expression: normal draw (mean 25, SD 15) clipped to 0-100%,
# so only about 5% of patients land in the PD-L1 >= 50% subgroup
let pdl1 = rnorm(300, 25, 15) |> map(|x| round(max(0, min(100, x)), 0))

# Smoking: Never (25%), Former (50%), Current (25%)
let smoking = range(0, 300) |> map(|i| {
  let r = rnorm(1)[0]
  if r < -0.67 { "Never" } else if r < 0.67 { "Former" } else { "Current" }
})

# Prior therapy lines: 0 (40%), 1 (40%), 2+ (20%)
let prior_lines = range(0, 300) |> map(|i| {
  let r = rnorm(1)[0]
  if r < -0.25 { 0 } else if r < 0.84 { 1 } else { 2 }
})

# === Table 1: Baseline Characteristics ===
print("=" * 65)
print("Table 1. Baseline Patient Characteristics")
print("=" * 65)
print("                         Drug X (n=150)   Chemo (n=150)    p-value")
print("-" * 65)

# Age
let age_drug = age |> select(0..150)
let age_chemo = age |> select(150..300)
let age_test = ttest(age_drug, age_chemo)
print("Age, mean (SD)           " +
  str(round(mean(age_drug), 1)) + " (" + str(round(sd(age_drug), 1)) + ")       " +
  str(round(mean(age_chemo), 1)) + " (" + str(round(sd(age_chemo), 1)) + ")        " +
  str(round(age_test.p_value, 3)))

# Sex
let sex_drug = count(sex |> select(0..150), |s| s == "Male")
let sex_chemo = count(sex |> select(150..300), |s| s == "Male")
let sex_observed = [sex_drug, 150 - sex_drug, sex_chemo, 150 - sex_chemo]
let sex_expected = [150 * (sex_drug + sex_chemo) / 300, 150 * (300 - sex_drug - sex_chemo) / 300,
                    150 * (sex_drug + sex_chemo) / 300, 150 * (300 - sex_drug - sex_chemo) / 300]
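# Caution: a 2x2 test of independence has 1 degree of freedom. If chi_square treats
# these four cells as a goodness-of-fit test (df = 3), the p-value will be too large.
let sex_test = chi_square(sex_observed, sex_expected)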
print("Male, n (%)              " +
  str(sex_drug) + " (" + str(round(sex_drug / 150 * 100, 1)) + "%)         " +
  str(sex_chemo) + " (" + str(round(sex_chemo / 150 * 100, 1)) + "%)          " +
  str(round(sex_test.p_value, 3)))

# ECOG
let ecog_drug = ecog |> select(0..150)
let ecog_chemo = ecog |> select(150..300)
for e in [0, 1, 2] {
  let n_d = count(ecog_drug, |x| x == e)
  let n_c = count(ecog_chemo, |x| x == e)
  print("ECOG " + str(e) + ", n (%)            " +
    str(n_d) + " (" + str(round(n_d / 150 * 100, 1)) + "%)         " +
    str(n_c) + " (" + str(round(n_c / 150 * 100, 1)) + "%)")
}

# PD-L1
let pdl1_drug = pdl1 |> select(0..150)
let pdl1_chemo = pdl1 |> select(150..300)
let pdl1_test = ttest(pdl1_drug, pdl1_chemo)
print("PD-L1 %, median (IQR)    " +
  str(round(median(pdl1_drug), 0)) + " (" +
  str(round(quantile(pdl1_drug, 0.25), 0)) + "-" +
  str(round(quantile(pdl1_drug, 0.75), 0)) + ")       " +
  str(round(median(pdl1_chemo), 0)) + " (" +
  str(round(quantile(pdl1_chemo, 0.25), 0)) + "-" +
  str(round(quantile(pdl1_chemo, 0.75), 0)) + ")        " +
  str(round(pdl1_test.p_value, 3)))

print("-" * 65)
print("p-values: t-test for continuous, chi-square for categorical")

Key insight: Table 1 is descriptive, not inferential. Significant p-values in Table 1 do not mean randomization failed — with many comparisons, some p < 0.05 results are expected by chance. However, large imbalances in prognostic factors should be noted and adjusted for in sensitivity analyses.
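The "expected by chance" point can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes the baseline tests are independent and that every null hypothesis is true, and the function names are hypothetical helpers rather than part of the chapter's toolkit.

```python
def prob_any_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n independent true-null tests gives p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of p < alpha results when every null hypothesis is true."""
    return n_tests * alpha


# Seven baseline variables are listed for the demographics table
for n_tests in (1, 7, 20):
    print(n_tests,
          round(prob_any_false_positive(n_tests), 3),
          round(expected_false_positives(n_tests), 2))
```

With seven baseline variables the chance of at least one nominally significant imbalance is roughly 30%, which is why a single p < 0.05 in Table 1 is not evidence that randomization failed.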

Section 2: Primary Endpoint — Progression-Free Survival

set_seed(42)
# --- Simulate PFS data ---
# Drug X: median PFS ~8 months, Chemo: median PFS ~5 months
# Under exponential survival, HR = 5/8 ~ 0.63 (about a 37% reduction in hazard of progression)

# Exponential survival: time = -ln(U) * median / ln(2)
let pfs_drug = rnorm(150, 0, 1) |> map(|z| {
  let u = pnorm(z)
  max(0.5, min(36, -log(max(0.001, u)) * 8 / 0.693))
})
let pfs_chemo = rnorm(150, 0, 1) |> map(|z| {
  let u = pnorm(z)
  max(0.5, min(36, -log(max(0.001, u)) * 5 / 0.693))
})

# Event indicators (1 = progression observed, 0 = censored); ~20% censored
let censor_drug = rnorm(150, 0, 1) |> map(|z| if z < 0.84 { 1 } else { 0 })
let censor_chemo = rnorm(150, 0, 1) |> map(|z| if z < 0.84 { 1 } else { 0 })

let pfs_time = concat(pfs_drug, pfs_chemo)
let pfs_event = concat(censor_drug, censor_chemo)

# === Survival Analysis ===
# Median PFS by arm: naive sample median of the sorted times. This ignores censoring;
# a Kaplan-Meier median would use the event indicators as well.
let sorted_drug = sort(pfs_drug)
let sorted_chemo = sort(pfs_chemo)
let med_pfs_drug = sorted_drug[round(len(sorted_drug) * 0.5, 0)]
let med_pfs_chemo = sorted_chemo[round(len(sorted_chemo) * 0.5, 0)]

print("\n=== Primary Endpoint: Progression-Free Survival ===")
print("Median PFS — Drug X: " + str(round(med_pfs_drug, 1)) + " months")
print("Median PFS — Chemo:  " + str(round(med_pfs_chemo, 1)) + " months")

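# Compare arms with a t-test as a rough stand-in for the log-rank test named in the SAP
# (the t-test ignores censoring, so treat this p-value as illustrative only)
let lr = ttest(pfs_drug, pfs_chemo)
print("Comparison p = " + str(round(lr.p_value, 6)))

# Approximate hazard ratio from the median ratio (exact only under exponential survival)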
let hr_pfs = med_pfs_chemo / med_pfs_drug
print("Approximate HR: " + str(round(hr_pfs, 2)))

# === Survival plot ===
let km_rows = range(0, len(pfs_time)) |> map(|i| {
  time: pfs_time[i], event: pfs_event[i], group: arm[i]
}) |> to_table()

plot(km_rows, {type: "line", x: "time", y: "event",
  color: "group",
  title: "Progression-Free Survival — ITT Population",
  xlabel: "Months",
  ylabel: "Survival Probability"})

Clinical relevance: The hazard ratio is the primary effect measure regulators examine for time-to-event endpoints. HR < 1 means the experimental arm has a lower rate of progression. Under proportional hazards, HR = 0.65 means a 35% reduction in the instantaneous risk of progression at every time point. For a statistically significant benefit, the entire 95% confidence interval must lie below 1.0; a point estimate below 1.0 is not enough on its own.
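The listing above uses a sample median and a t-test as stand-ins, whereas the SAP calls for Kaplan-Meier estimates and a log-rank test. As a minimal sketch of what those two methods compute, here is a standard-library Python version. The function names and the toy data are hypothetical, and a validated survival package should be used for any real analysis.

```python
import math


def km_curve(times, events):
    """Kaplan-Meier estimate as a list of (time, S(t)) at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_leaving  # events and censored subjects both leave the risk set
    return curve


def km_median(times, events):
    """First time the KM curve is at or below 0.5; None if it never gets there."""
    for t, s in km_curve(times, events):
        if s <= 0.5:
            return t
    return None


def logrank(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test; returns (chi-square statistic, p-value) on 1 df."""
    event_times = sorted({t for t, e in zip(times_a, events_a) if e} |
                         {t for t, e in zip(times_b, events_b) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("log-rank variance is zero; the arms cannot be compared")
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square upper tail, 1 df
    return chi2, p_value


# Hypothetical toy data: time in months, event 1 = progression, 0 = censored
drug_t, drug_e = [6, 7, 10, 15, 19, 25], [1, 0, 1, 1, 0, 1]
chemo_t, chemo_e = [1, 2, 3, 4, 5, 8], [1, 1, 1, 1, 0, 1]
print("KM median, Drug X:", km_median(drug_t, drug_e))
print("KM median, Chemo: ", km_median(chemo_t, chemo_e))
print("Log-rank (chi2, p):", logrank(drug_t, drug_e, chemo_t, chemo_e))
```

Unlike the sample median, the KM median keeps censored patients in the risk set until their censoring time, so follow-up that ends early does not bias the estimate downward.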

Section 3: Secondary Endpoint — Tumor Response

set_seed(42)
# --- Simulate RECIST responses ---
# Drug X: CR 8%, PR 32%, SD 35%, PD 25%
# Chemo:  CR 3%, PR 20%, SD 37%, PD 40%
# Simulate responses using cumulative probability thresholds
let response_drug = rnorm(150, 0, 1) |> map(|z| {
  let u = pnorm(z)
  if u < 0.08 { "CR" } else if u < 0.40 { "PR" } else if u < 0.75 { "SD" } else { "PD" }
})
let response_chemo = rnorm(150, 0, 1) |> map(|z| {
  let u = pnorm(z)
  if u < 0.03 { "CR" } else if u < 0.23 { "PR" } else if u < 0.60 { "SD" } else { "PD" }
})

let response = response_drug + response_chemo

# Overall Response Rate (ORR = CR + PR)
let orr_drug = count(response_drug, |r| r == "CR" || r == "PR")
let orr_chemo = count(response_chemo, |r| r == "CR" || r == "PR")

print("\n=== Secondary Endpoint: Tumor Response (RECIST 1.1) ===")
print("\nResponse Category      Drug X         Chemo")
print("-" * 50)
for cat in ["CR", "PR", "SD", "PD"] {
  let n_d = count(response_drug, |r| r == cat)
  let n_c = count(response_chemo, |r| r == cat)
  print(cat + "                      " +
    str(n_d) + " (" + str(round(n_d / 150 * 100, 1)) + "%)       " +
    str(n_c) + " (" + str(round(n_c / 150 * 100, 1)) + "%)")
}

# ORR comparison
print("\nOverall Response Rate:")
print("  Drug X: " + str(orr_drug) + "/150 (" +
  str(round(orr_drug / 150 * 100, 1)) + "%)")
print("  Chemo:  " + str(orr_chemo) + "/150 (" +
  str(round(orr_chemo / 150 * 100, 1)) + "%)")

# Fisher's exact test for ORR
let fisher = fisher_exact(orr_drug, 150 - orr_drug, orr_chemo, 150 - orr_chemo)
print("  Fisher's exact p = " + str(round(fisher.p_value, 4)))

# Odds ratio for response (inline)
let or_val = (orr_drug * (150 - orr_chemo)) / ((150 - orr_drug) * orr_chemo)
let log_or_se = sqrt(1/orr_drug + 1/(150 - orr_drug) + 1/orr_chemo + 1/(150 - orr_chemo))
print("  Odds ratio: " + str(round(or_val, 2)) +
  " [" + str(round(exp(log(or_val) - 1.96 * log_or_se), 2)) + ", " +
  str(round(exp(log(or_val) + 1.96 * log_or_se), 2)) + "]")

# Bar chart of response rates
let categories = ["CR", "PR", "SD", "PD"]
let drug_pcts = categories |> map(|c| count(response_drug, |r| r == c) / 150 * 100)
let chemo_pcts = categories |> map(|c| count(response_chemo, |r| r == c) / 150 * 100)

bar_chart(categories, drug_pcts,
  {title: "Best Overall Response (RECIST 1.1)",
  ylabel: "Patients (%)"})
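To show what the Fisher's exact test and the odds-ratio interval above are doing, here is a small standard-library sketch. The function names are hypothetical, and the responder counts (60/150 and 35/150) are illustrative values chosen to sit near the simulated target rates, not trial results.

```python
import math


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Sum every table that is no more probable than the observed one
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-7)))


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald interval on the log scale (all cells must be > 0)."""
    or_hat = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (or_hat,
            math.exp(math.log(or_hat) - z * se),
            math.exp(math.log(or_hat) + z * se))


# Hypothetical counts: responders / non-responders in each arm
resp_drug, resp_chemo, n_per_arm = 60, 35, 150
print("Fisher p:", fisher_exact_two_sided(resp_drug, n_per_arm - resp_drug,
                                          resp_chemo, n_per_arm - resp_chemo))
print("OR (95% CI):", odds_ratio_ci(resp_drug, n_per_arm - resp_drug,
                                    resp_chemo, n_per_arm - resp_chemo))
```

The Wald interval breaks down when any cell is zero, which is one reason exact or continuity-corrected intervals are often preferred for rare responses.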

Section 4: Secondary Endpoint — Overall Survival

set_seed(42)
# --- Simulate OS data ---
# Drug X: median OS ~14 months, Chemo: median OS ~10 months
let os_drug = rnorm(150, 0, 1) |> map(|z| {
  let u = pnorm(z)
  max(1.0, min(48, -log(max(0.001, u)) * 14 / 0.693))
})
let os_chemo = rnorm(150, 0, 1) |> map(|z| {
  let u = pnorm(z)
  max(1.0, min(48, -log(max(0.001, u)) * 10 / 0.693))
})

# OS censoring: ~35% censored (still alive at data cutoff)
let os_censor_drug = rnorm(150, 0, 1) |> map(|z| if z < 0.39 { 1 } else { 0 })
let os_censor_chemo = rnorm(150, 0, 1) |> map(|z| if z < 0.39 { 1 } else { 0 })

let os_time = concat(os_drug, os_chemo)
let os_event = concat(os_censor_drug, os_censor_chemo)

# Median OS by arm
let sorted_os_drug = sort(os_drug)
let sorted_os_chemo = sort(os_chemo)
let med_os_drug = sorted_os_drug[round(len(sorted_os_drug) * 0.5, 0)]
let med_os_chemo = sorted_os_chemo[round(len(sorted_os_chemo) * 0.5, 0)]

print("\n=== Secondary Endpoint: Overall Survival ===")
print("Median OS — Drug X: " + str(round(med_os_drug, 1)) + " months")
print("Median OS — Chemo:  " + str(round(med_os_chemo, 1)) + " months")

let lr_os = ttest(os_drug, os_chemo)
print("Comparison p = " + str(round(lr_os.p_value, 6)))

let hr_os = med_os_chemo / med_os_drug
print("Approximate HR = " + str(round(hr_os, 2)))

# OS survival plot
let os_tbl = range(0, len(os_time)) |> map(|i| {
  time: os_time[i], event: os_event[i], group: arm[i]
}) |> to_table()

plot(os_tbl, {type: "line", x: "time", y: "event",
  color: "group",
  title: "Overall Survival — ITT Population",
  xlabel: "Months",
  ylabel: "Overall Survival Probability"})
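The hazard ratio above is approximated from the ratio of medians. The SAP instead specifies a Cox proportional hazards model, which yields both the HR and its 95% CI. The sketch below fits a Cox model with a single covariate (for example, a 0/1 treatment indicator) by Newton-Raphson on the partial likelihood, using the Breslow convention for tied times. It is a teaching sketch under those assumptions, with hypothetical function names and toy data, and is not a substitute for a validated implementation.

```python
import math


def score_and_information(beta, times, events, x):
    """Score and observed information of the Cox partial log-likelihood at beta."""
    score, info = 0.0, 0.0
    for t_i, e_i, x_i in zip(times, events, x):
        if not e_i:
            continue  # censored subjects contribute only through risk sets
        s0 = s1 = s2 = 0.0
        for t_j, x_j in zip(times, x):
            if t_j >= t_i:  # subject j is still at risk at time t_i
                w = math.exp(beta * x_j)
                s0 += w
                s1 += w * x_j
                s2 += w * x_j * x_j
        score += x_i - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return score, info


def cox_single_covariate(times, events, x, max_iter=50, tol=1e-10):
    """Return (log hazard ratio, standard error) for one covariate."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, times, events, x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_information(beta, times, events, x)
    return beta, 1.0 / math.sqrt(info)


# Hypothetical toy data: x = 1 for Drug X, 0 for Chemo
times = [5, 8, 12, 3, 6, 9, 15, 2]
events = [1, 1, 0, 1, 1, 1, 0, 1]
x = [1, 1, 1, 0, 0, 1, 0, 0]

beta, se = cox_single_covariate(times, events, x)
print("HR:", math.exp(beta))
print("95% CI:", math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
```

The double loop is quadratic in the number of patients, which is fine for 300 subjects but would need a sorted, cumulative-sum formulation for larger data.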

Section 5: Safety Analysis — Adverse Events

# --- Simulate adverse events ---
let ae_types = ["Nausea", "Fatigue", "Neutropenia", "Rash", "Diarrhea",
                "Anemia", "Peripheral Neuropathy", "Alopecia",
                "Hepatotoxicity", "Pneumonitis", "Hypertension",
                "Hand-Foot Syndrome"]

# Drug X AE rates (proportion experiencing each)
let ae_rates_drug = [0.45, 0.52, 0.25, 0.30, 0.28, 0.18, 0.08, 0.10,
                     0.12, 0.15, 0.20, 0.05]
# Chemo AE rates
let ae_rates_chemo = [0.55, 0.60, 0.40, 0.08, 0.20, 0.35, 0.25, 0.45,
                      0.05, 0.03, 0.08, 0.02]

print("\n=== Safety Analysis: Adverse Events (All Grades) ===")
print("\nAdverse Event            Drug X       Chemo        p-value")
print("-" * 75)

let ae_pvalues = []

for i in 0..len(ae_types) {
  let n_drug = round(ae_rates_drug[i] * 150, 0)
  let n_chemo = round(ae_rates_chemo[i] * 150, 0)

  # Named ae_test so it cannot overwrite the ORR result stored in `fisher`
  let ae_test = fisher_exact(n_drug, 150 - n_drug, n_chemo, 150 - n_chemo)

  ae_pvalues = ae_pvalues + [ae_test.p_value]

  print(ae_types[i] + "  " +
    str(n_drug) + " (" + str(round(n_drug / 150 * 100, 1)) + "%)    " +
    str(n_chemo) + " (" + str(round(n_chemo / 150 * 100, 1)) + "%)    " +
    str(round(ae_test.p_value, 4)))
}

# FDR correction for multiple AE comparisons
let ae_fdr = p_adjust(ae_pvalues, "BH")

print("\n=== FDR-Adjusted Significant AEs (q < 0.05) ===")
for i in 0..len(ae_types) {
  if ae_fdr[i] < 0.05 {
    print(ae_types[i] + ": raw p = " + str(round(ae_pvalues[i], 4)) +
      ", FDR q = " + str(round(ae_fdr[i], 4)) +
      (if ae_rates_drug[i] > ae_rates_chemo[i] { " [higher in Drug X]" }
       else { " [higher in Chemo]" }))
  }
}

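Common pitfall: Safety analyses test many adverse events, making multiple comparison correction essential. Without correction, you might falsely conclude Drug X causes more headaches simply because you tested 50 AE categories: at alpha = 0.05, about 2.5 of 50 true-null comparisons are expected to reach p < 0.05 by chance. The Benjamini-Hochberg (BH) method controls the false discovery rate, the expected share of false positives among the flagged events, while retaining more power to detect true safety signals than family-wise corrections such as Bonferroni.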
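The BH adjustment itself is short enough to write out. The following is a minimal sketch of the step-up procedure in plain Python; `bh_adjust` is a hypothetical helper name and the p-values are made up for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four adverse event comparisons
raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, bh_adjust(raw)):
    print("raw p =", p, " BH-adjusted =", round(q, 4))
```

Each raw p-value is scaled by m / rank, so the smallest p-values pay the largest penalty, and the running minimum keeps the adjusted values in the same order as the raw ones.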

Section 6: Subgroup Analysis — Forest Plot

Subgroup analysis examines whether the treatment effect is consistent across predefined patient subgroups. The forest plot displays the HR and CI for each subgroup.

# --- Subgroup analysis for PFS ---
# Approximate HR in each subgroup using median time ratio
print("\n=== Subgroup Analysis: PFS Hazard Ratios ===")

# Helper: compute approximate HR for a subgroup
fn subgroup_hr(time_vec, arm_vec) {
  let drug_times = zip(time_vec, arm_vec)
    |> filter(|p| p[1] == "Drug X") |> map(|p| p[0])
  let chemo_times = zip(time_vec, arm_vec)
    |> filter(|p| p[1] == "Chemo") |> map(|p| p[0])
  let med_d = sort(drug_times)[round(len(drug_times) * 0.5, 0)]
  let med_c = sort(chemo_times)[round(len(chemo_times) * 0.5, 0)]
  # HR approximation: ratio of median survivals (inverted)
  med_c / med_d
}

# Build subgroup table (the smoking-history subgroups named in the SAP are not included here)
let subgroups = [
  {name: "Overall (n=300)", hr: hr_pfs},
  {name: "PD-L1 >= 50%", hr: subgroup_hr(
    zip(pfs_time, pdl1) |> filter(|p| p[1] >= 50) |> map(|p| p[0]),
    zip(arm, pdl1) |> filter(|p| p[1] >= 50) |> map(|p| p[0]))},
  {name: "PD-L1 < 50%", hr: subgroup_hr(
    zip(pfs_time, pdl1) |> filter(|p| p[1] < 50) |> map(|p| p[0]),
    zip(arm, pdl1) |> filter(|p| p[1] < 50) |> map(|p| p[0]))},
  {name: "ECOG 0", hr: subgroup_hr(
    zip(pfs_time, ecog) |> filter(|p| p[1] == 0) |> map(|p| p[0]),
    zip(arm, ecog) |> filter(|p| p[1] == 0) |> map(|p| p[0]))},
  {name: "ECOG 1-2", hr: subgroup_hr(
    zip(pfs_time, ecog) |> filter(|p| p[1] >= 1) |> map(|p| p[0]),
    zip(arm, ecog) |> filter(|p| p[1] >= 1) |> map(|p| p[0]))}
]

for sg in subgroups {
  print(sg.name + ": HR ~ " + str(round(sg.hr, 2)))
}

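# Forest plot
# Note: ci_lower and ci_upper below are placeholder bounds (HR x 0.7 and HR x 1.3),
# not estimated 95% confidence intervals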
let forest_tbl = subgroups |> map(|sg| {
  study: sg.name, estimate: sg.hr,
  ci_lower: sg.hr * 0.7, ci_upper: sg.hr * 1.3, weight: 20
}) |> to_table()

forest_plot(forest_tbl,
  {null_value: 1.0,
  title: "PFS Subgroup Analysis — Hazard Ratios",
  xlabel: "Hazard Ratio (95% CI)"})
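The forest plot above draws its intervals as HR x 0.7 to HR x 1.3, which does not reflect subgroup size. One common approximation, sketched here, takes the standard error of log(HR) as sqrt(1/d_drug + 1/d_chemo), where d is the number of observed events in each arm of the subgroup. This is an approximation rather than the Cox model interval, and the function name and example numbers are hypothetical.

```python
import math


def hr_ci_from_events(hr, events_drug, events_chemo, z=1.96):
    """Approximate CI for a hazard ratio from the event counts in each arm."""
    se_log_hr = math.sqrt(1.0 / events_drug + 1.0 / events_chemo)
    log_hr = math.log(hr)
    return math.exp(log_hr - z * se_log_hr), math.exp(log_hr + z * se_log_hr)


# Hypothetical subgroups with the same HR but different numbers of events
for label, hr, d_drug, d_chemo in [("Large subgroup", 0.65, 100, 110),
                                   ("Small subgroup", 0.65, 6, 8)]:
    low, high = hr_ci_from_events(hr, d_drug, d_chemo)
    print(label, "HR", hr, "CI", round(low, 2), "to", round(high, 2))
```

The interval widens sharply as events become scarce, which is exactly the information a forest plot is meant to convey for small subgroups.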

Interaction Tests

Subgroup differences should be tested with interaction terms, not by comparing p-values across subgroups.

# Interaction test: compare subgroup HRs
# If HRs are similar across subgroups, no interaction
let pdl1_group = pdl1 |> map(|x| if x >= 50 { "High" } else { "Low" })

print("\n=== Interaction Tests (qualitative) ===")
print("PD-L1 High HR vs Low HR — compare above forest plot")
print("If HRs are similar, no treatment x PD-L1 interaction")

let ecog_group = ecog |> map(|x| if x == 0 { "0" } else { "1-2" })
print("ECOG 0 HR vs ECOG 1-2 HR — compare above forest plot")
print("If HRs are similar, no treatment x ECOG interaction")

Clinical relevance: A significant interaction test suggests the treatment effect truly differs between subgroups — for example, Drug X might work better in PD-L1-high patients. A non-significant interaction test means the observed subgroup differences are consistent with chance variation. Many immunotherapy approvals are restricted to PD-L1-high populations based on subgroup analyses showing differential benefit.
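The listing above only compares subgroup HRs by eye. A simple quantitative version, sketched below, compares the two log hazard ratios with a z-test, using the same events-based approximation for the standard error of log(HR). In practice the interaction is usually tested by adding a treatment-by-subgroup term to the Cox model; this sketch is a rough stand-in, and the function name and inputs are hypothetical.

```python
import math


def interaction_z_test(hr_a, events_a, hr_b, events_b):
    """Compare two subgroup HRs on the log scale.

    events_a and events_b are (events in drug arm, events in control arm)
    within each subgroup. Returns (z statistic, two-sided p-value).
    """
    var_a = sum(1.0 / d for d in events_a)  # approximate variance of log(HR_a)
    var_b = sum(1.0 / d for d in events_b)  # approximate variance of log(HR_b)
    z = (math.log(hr_a) - math.log(hr_b)) / math.sqrt(var_a + var_b)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value


# Hypothetical subgroup results: PD-L1 high vs PD-L1 low
z, p = interaction_z_test(0.50, (25, 25), 1.00, (25, 25))
print("z =", round(z, 3), " interaction p =", round(p, 3))
```

Note that an HR of 0.50 in one subgroup and 1.00 in the other can still fail to reach p < 0.05 with 50 events per subgroup, which illustrates how underpowered interaction tests usually are.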

Section 7: Multivariate Cox Model

# --- Adjusted model with covariates ---
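# Use linear regression on PFS time as a rough approximation for multivariate analysis.
# It ignores censoring and is not a Cox model; `male` is built into the table but not fitted.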
let tbl = range(0, 300) |> map(|i| {
  pfs: pfs_time[i], arm_drug: if arm[i] == "Drug X" { 1 } else { 0 },
  age: age[i], male: if sex[i] == "Male" { 1 } else { 0 },
  ecog: ecog[i], pdl1: pdl1[i]
}) |> to_table()

let model = lm(tbl.pfs, [tbl.arm_drug, tbl.age, tbl.ecog, tbl.pdl1])

print("\n=== Multivariate Model (PFS) ===")
# Assumes coefficients[0] is the arm_drug term; if the intercept is stored first, use index 1
print("Treatment effect (adjusted): coef = " +
  str(round(model.coefficients[0], 3)))
print("R-squared: " + str(round(model.r_squared, 3)))

Section 8: Executive Summary

# --- Compile report ---
print("\n" + "=" * 65)
print("ONCO-301 FINAL ANALYSIS — EXECUTIVE SUMMARY")
print("=" * 65)

print("\nPrimary Endpoint (PFS):")
print("  Drug X vs Chemo: HR ~ " + str(round(hr_pfs, 2)))
print("  Median PFS: " + str(round(med_pfs_drug, 1)) + " vs " +
  str(round(med_pfs_chemo, 1)) + " months")
print("  Comparison p = " + str(round(lr.p_value, 6)))

print("\nSecondary Endpoints:")
print("  ORR: " + str(round(orr_drug / 150 * 100, 1)) + "% vs " +
  str(round(orr_chemo / 150 * 100, 1)) + "% (p = " +
  str(round(fisher.p_value, 4)) + ")")
print("  OS HR ~ " + str(round(hr_os, 2)))
print("  Median OS: " + str(round(med_os_drug, 1)) + " vs " +
  str(round(med_os_chemo, 1)) + " months")

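# The subgroup and safety statements below are hard-coded narrative, not computed
# from the simulated data; check them against the results before reporting
print("\nSubgroup Consistency:")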
print("  Treatment benefit observed across all predefined subgroups")
print("  No significant treatment-by-subgroup interactions")

print("\nSafety:")
print("  Drug X showed lower rates of neutropenia and alopecia")
print("  Drug X showed higher rates of rash and pneumonitis")
print("  All pneumonitis events were Grade 1-2 and manageable")

print("\n" + "=" * 65)

Python:

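import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter, CoxPHFitter
from lifelines.statistics import logrank_test
from scipy.stats import fisher_exact

# Assumes arm, pfs_time, pfs_event are NumPy arrays of equal length, and df is a
# pandas DataFrame with pfs_time, pfs_event, and numerically coded covariates.
drug = arm == 'Drug X'
chemo = arm == 'Chemo'

# KM curves, both arms on the same axes
ax = plt.subplot(111)
kmf = KaplanMeierFitter()
for group, mask in [('Drug X', drug), ('Chemo', chemo)]:
    kmf.fit(pfs_time[mask], pfs_event[mask], label=group)
    kmf.plot_survival_function(ax=ax)

# Cox PH (every column of df other than duration and event enters as a covariate)
cph = CoxPHFitter()
cph.fit(df, duration_col='pfs_time', event_col='pfs_event')
cph.print_summary()
cph.plot()

# Log-rank
result = logrank_test(pfs_time[drug], pfs_time[chemo],
                      pfs_event[drug], pfs_event[chemo])
print(result.p_value)

# Fisher's exact for ORR
odds_ratio, p_value = fisher_exact([[orr_drug, 150 - orr_drug],
                                    [orr_chemo, 150 - orr_chemo]])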

R:

library(survival)
library(survminer)

# KM + log-rank
fit <- survfit(Surv(pfs_time, pfs_event) ~ arm, data = df)
ggsurvplot(fit, data = df, risk.table = TRUE, pval = TRUE,
           conf.int = TRUE, ggtheme = theme_minimal())

# Cox PH
cox <- coxph(Surv(pfs_time, pfs_event) ~ arm + age + sex + ecog + pdl1, data = df)
summary(cox)
ggforest(cox)

# Fisher's exact for ORR
fisher.test(matrix(c(orr_drug, 150-orr_drug, orr_chemo, 150-orr_chemo), nrow=2))

Exercises

Adjust the Cox model. Add age, sex, ECOG, and PD-L1 as covariates to the PFS Cox model. Does the treatment HR change meaningfully after adjustment? What does this tell you about the quality of randomization?

# Your code: multivariate Cox, compare HR to unadjusted

Landmark analysis. Some patients progress or die early, before Drug X has time to work. Perform a landmark analysis at 3 months: exclude patients who progressed, died, or were censored before 3 months, then re-estimate the HR among those still at risk. Is it stronger or weaker?

# Your code: filter to patients alive and event-free at 3 months

Sensitivity analysis. Re-run the primary PFS analysis with three different random seeds. Do the conclusions change? What is the range of HRs across seeds?

# Your code: three seeds, compare HRs

Number needed to treat. Calculate the NNT for response (how many patients need to receive Drug X instead of chemo for one additional responder?).

# Your code: NNT = 1 / (ORR_drug - ORR_chemo)

Publication figure panel. Create a 2x2 figure panel with: (a) PFS KM curves, (b) OS KM curves, (c) Response waterfall plot, (d) Subgroup forest plot. This is a typical Figure 1 for a clinical trial manuscript.

# Your code: four publication-quality figures

Key Takeaways

A complete clinical trial analysis follows a structured pipeline: Table 1 (demographics), primary endpoint (survival), secondary endpoints (response, OS), safety, subgroup analysis, and multivariate modeling.
Table 1 uses t-tests for continuous variables and chi-square/Fisher’s for categorical variables to assess randomization balance.
Kaplan-Meier curves with log-rank tests are the primary visualization and test for time-to-event endpoints; Cox PH provides the hazard ratio with CI.
Fisher’s exact test compares response rates; odds ratios quantify the magnitude of the response difference.
Adverse event analyses require FDR correction because many events are tested simultaneously.
Subgroup analyses use forest plots to display consistency of treatment effect; interaction tests (not subgroup-specific p-values) determine whether differences between subgroups are real.
The multivariate Cox model adjusts the treatment effect for prognostic covariates, checking whether the benefit persists after accounting for baseline imbalances.
Clinical trial reporting follows strict guidelines (CONSORT checklist) to ensure transparency and completeness.

What’s Next

Tomorrow we shift from clinical trials to molecular biology: a complete differential expression analysis of RNA-seq data from tumor versus normal tissue. You will apply normalization, PCA quality control, genome-wide t-testing with FDR correction, volcano plots, and heatmaps — integrating methods from across the entire book into a standard computational genomics pipeline.

Practical Biostatistics in 30 Days