Awesome-Agent-Skills-for-Empirical-Research clinical-trial-design-guide

Clinical trial methodology, biostatistics, and study design guidance

install

source · Clone the upstream repo

git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/43-wentorai-research-plugins/skills/domains/pharma/clinical-trial-design-guide" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-clinical-trial-de && rm -rf "$T"

manifest: skills/43-wentorai-research-plugins/skills/domains/pharma/clinical-trial-design-guide/SKILL.md

source content

Clinical Trial Design Guide

A skill for designing and analyzing clinical trials, covering study design selection, sample size calculation, randomization methods, interim analysis, survival endpoints, and regulatory considerations. Essential for pharmaceutical researchers, biostatisticians, and clinical scientists.

Clinical Trial Phases

Phase Overview

Phase	Objective	Typical N	Duration	Primary Endpoints
Phase I	Safety, dose-finding	20-80	Months	MTD, DLT, PK profile
Phase II	Efficacy signal, dosing	100-300	1-2 years	Response rate, biomarker
Phase III	Confirmatory efficacy	300-3,000+	2-4 years	OS, PFS, clinical outcome
Phase IV	Post-marketing surveillance	1,000+	Ongoing	Safety, real-world effectiveness

Study Design Selection

Common Designs

Parallel Group (most common Phase III):
  R --> Treatment A --> Outcome assessment
  R --> Treatment B --> Outcome assessment

Crossover:
  R --> Treatment A --> Washout --> Treatment B --> Outcome
  R --> Treatment B --> Washout --> Treatment A --> Outcome

Factorial (2x2):
  R --> Drug A + Drug B
  R --> Drug A + Placebo B
  R --> Placebo A + Drug B
  R --> Placebo A + Placebo B

Adaptive:
  Stage 1: Enroll n1 patients --> Interim analysis
  Stage 2: Modify design (dose, sample size, arm dropping) --> Continue

Design Selection Criteria

Factor	Recommended Design
Chronic disease, stable condition	Crossover (within-subject comparison)
Acute condition, one-time treatment	Parallel group
Multiple drugs to evaluate	Factorial or multi-arm
High uncertainty in effect size	Adaptive (sample size re-estimation)
Rare disease, limited patients	Bayesian adaptive, single-arm with historical control

Sample Size Calculation

Two-Sample Comparison of Means

from scipy.stats import norm
import numpy as np

def sample_size_two_means(delta: float, sigma: float,
                           alpha: float = 0.05, power: float = 0.80,
                           ratio: float = 1.0) -> dict:
    """
    Sample size for comparing two group means (two-sided test).
    delta: minimum clinically important difference
    sigma: pooled standard deviation
    alpha: type I error rate
    power: desired power (1 - beta)
    ratio: allocation ratio (n2/n1)
    """
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    effect = delta / sigma

    n1 = ((z_alpha + z_beta) ** 2 * (1 + 1 / ratio)) / effect ** 2
    n2 = ratio * n1

    return {
        "n_per_group_1": int(np.ceil(n1)),
        "n_per_group_2": int(np.ceil(n2)),
        "total": int(np.ceil(n1) + np.ceil(n2)),
        "effect_size": round(effect, 3),
    }

# Example: detect 5-point difference, SD=15, 80% power
result = sample_size_two_means(delta=5, sigma=15)
print(f"Required: {result['total']} total patients")
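
As a cross-check on the formula above, the calculation can be inverted to give the power achieved by a chosen sample size. The sketch below uses only the standard library. `power_two_means` is a hypothetical helper name, not part of the original skill, and it relies on the same normal approximation (ignoring the negligible rejection probability in the opposite tail).

```python
from math import sqrt
from statistics import NormalDist


def power_two_means(n1: int, n2: int, delta: float, sigma: float,
                    alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test for a difference in means."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference in group means
    se = sigma * sqrt(1 / n1 + 1 / n2)
    return std_normal.cdf(abs(delta) / se - z_alpha)


# 142 per group is the rounded-up answer for delta=5, sigma=15;
# one patient fewer per group falls just short of 80% power.
for n in (141, 142):
    print(n, round(power_two_means(n, n, delta=5, sigma=15), 4))
```

The same helper can be used to see how much power is lost when a fixed total is split unequally between arms.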

Sample Size for Survival Endpoints

def sample_size_logrank(hazard_ratio: float, alpha: float = 0.05,
                         power: float = 0.80, ratio: float = 1.0,
                         median_control: float = 12.0,
                         accrual_time: float = 24.0,
                         followup_time: float = 12.0) -> dict:
    """
    Sample size for log-rank test comparing two survival curves.
    hazard_ratio: expected HR (treatment/control), <1 means treatment better
    median_control: median survival in control arm (months)
    """
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)

    # Required number of events (Schoenfeld formula)
    d = ((z_alpha + z_beta) ** 2 * (1 + ratio) ** 2) / (
        ratio * (np.log(hazard_ratio)) ** 2
    )
    d = int(np.ceil(d))

    # Estimate probability of event during study
    lambda_c = np.log(2) / median_control
    lambda_t = lambda_c * hazard_ratio

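    # Average probability of an event per patient, assuming uniform accrual
    # over accrual_time, followup_time of additional follow-up after the last
    # enrollment, and exponential survival in each arm
    def p_event(lam: float) -> float:
        mean_survival = (
            np.exp(-lam * followup_time)
            - np.exp(-lam * (accrual_time + followup_time))
        ) / (lam * accrual_time)
        return 1 - mean_survival

    p_event_c = p_event(lambda_c)
    p_event_t = p_event(lambda_t)
    p_event_avg = (p_event_c + ratio * p_event_t) / (1 + ratio)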

    n_total = int(np.ceil(d / p_event_avg))

    return {
        "events_required": d,
        "total_patients": n_total,
        "hazard_ratio": hazard_ratio,
        "p_event_avg": round(p_event_avg, 3),
    }
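
The event count in the Schoenfeld formula depends only on the hazard ratio, the error rates, and the allocation ratio, so it is worth checking how sensitive it is to the assumed effect before fixing a design. The following is a minimal standard-library sketch of that check. `required_events` is a hypothetical helper that repeats the events step of the function above and is not part of the original skill.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(hazard_ratio: float, alpha: float = 0.05,
                    power: float = 0.80, ratio: float = 1.0) -> int:
    """Events needed for a two-sided log-rank test (Schoenfeld approximation)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    d = (z_alpha + z_beta) ** 2 * (1 + ratio) ** 2 / (
        ratio * log(hazard_ratio) ** 2
    )
    return ceil(d)


# Event counts grow quickly as the assumed hazard ratio approaches 1
for hr in (0.5, 0.6, 0.7, 0.8):
    print(f"HR={hr}: {required_events(hr)} events")
```

Because the log hazard ratio is squared in the denominator, a modest overestimate of the treatment effect can leave a trial badly short of events.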

Randomization Methods

Implementation

import random

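def stratified_block_randomization(strata: list[str],
                                     block_sizes: tuple[int, ...] = (4, 6),
                                     ratio: tuple[int, int] = (1, 1),
                                     seed: int = 42) -> list[str]:
    """
    Stratified permuted block randomization.
    strata: list of stratum labels for each patient (in enrollment order)
    block_sizes: possible block sizes (one is chosen at random per block);
        each must be a multiple of sum(ratio)
    ratio: allocation ratio (e.g., (1,1) for 1:1, (2,1) for 2:1)
    Returns list of treatment assignments ('A' or 'B').
    """
    # A block can only honor the ratio if its size is a multiple of sum(ratio);
    # e.g., a block of 4 cannot hold a 2:1 allocation.
    if any(size % sum(ratio) != 0 for size in block_sizes):
        raise ValueError("each block size must be a multiple of sum(ratio)")

    rng = random.Random(seed)
    stratum_queues = {}
    assignments = []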

    for stratum in strata:
        if stratum not in stratum_queues:
            stratum_queues[stratum] = []

        if not stratum_queues[stratum]:
            # Generate new block
            block_size = rng.choice(block_sizes)
            n_a = block_size * ratio[0] // sum(ratio)
            n_b = block_size - n_a
            block = ["A"] * n_a + ["B"] * n_b
            rng.shuffle(block)
            stratum_queues[stratum] = block

        assignments.append(stratum_queues[stratum].pop(0))

    return assignments
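
After generating an allocation list, it is worth confirming that the arms are balanced within each stratum. The sketch below is one way to tally this. `allocation_by_stratum` is a hypothetical helper, and the strata and assignments are made-up illustration data, not output from the function above.

```python
from collections import Counter, defaultdict


def allocation_by_stratum(strata: list[str],
                          assignments: list[str]) -> dict[str, Counter]:
    """Tally treatment assignments within each stratum."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for stratum, arm in zip(strata, assignments):
        counts[stratum][arm] += 1
    return dict(counts)


# Hypothetical enrollment order and assignments, for illustration only
strata = ["site1", "site2", "site1", "site1", "site2", "site1"]
assignments = ["A", "B", "B", "A", "A", "B"]

for stratum, tally in allocation_by_stratum(strata, assignments).items():
    print(stratum, dict(tally), "imbalance:", abs(tally["A"] - tally["B"]))
```

With permuted blocks, the imbalance within a stratum should never exceed half of the largest block size under 1:1 allocation.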

Interim Analysis and Monitoring

Group Sequential Design

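def obrien_fleming_boundary(n_looks: int, alpha: float = 0.05) -> list[dict]:
    """
    Approximate O'Brien-Fleming-type boundaries at equally spaced looks.
    Provides very conservative early stopping with near-nominal final alpha.
    z_boundary uses z_{1-alpha/2} / sqrt(t), and p_boundary is the matching
    two-sided nominal p-value. Because the final critical value is left at the
    unadjusted z_{1-alpha/2}, the overall type I error slightly exceeds alpha;
    exact boundaries need a numerical search (e.g., gsDesign in R).
    """
    from scipy.stats import norm
    import numpy as np
    boundaries = []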
    for k in range(1, n_looks + 1):
        info_fraction = k / n_looks
        z_boundary = norm.ppf(1 - alpha / 2) / np.sqrt(info_fraction)
        p_boundary = 2 * (1 - norm.cdf(z_boundary))
        boundaries.append({
            "look": k,
            "info_fraction": round(info_fraction, 3),
            "z_boundary": round(z_boundary, 4),
            "p_boundary": round(p_boundary, 6),
        })
    return boundaries

# Example: 3 interim looks + 1 final
boundaries = obrien_fleming_boundary(4)
for b in boundaries:
    print(f"Look {b['look']}: Z={b['z_boundary']}, p={b['p_boundary']}")
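
The `p_boundary` values computed above have the same form as the Lan-DeMets O'Brien-Fleming-type spending function evaluated at each information fraction, so they can also be read as cumulative alpha spent. The sketch below, using only the standard library, makes the per-look increments explicit. `obf_alpha_spending` is a hypothetical helper and assumes equally spaced looks. It does not compute exact boundaries, which require numerical integration over the joint distribution of the test statistics.

```python
from math import sqrt
from statistics import NormalDist


def obf_alpha_spending(n_looks: int, alpha: float = 0.05) -> list[dict]:
    """Cumulative and per-look alpha for an O'Brien-Fleming-type spending function."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)
    schedule = []
    previous = 0.0
    for k in range(1, n_looks + 1):
        t = k / n_looks  # information fraction, assuming equally spaced looks
        cumulative = 2 * (1 - std_normal.cdf(z / sqrt(t)))
        schedule.append({
            "look": k,
            "info_fraction": t,
            "cumulative_alpha": cumulative,
            "alpha_this_look": cumulative - previous,
        })
        previous = cumulative
    return schedule


for row in obf_alpha_spending(4):
    print(f"Look {row['look']}: cumulative={row['cumulative_alpha']:.5f}, "
          f"increment={row['alpha_this_look']:.5f}")
```

Almost all of the alpha is held back for the later looks, which is why this family is popular when early stopping should require overwhelming evidence.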

Survival Analysis

Kaplan-Meier and Log-Rank Test

import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

def analyze_survival(time: pd.Series, event: pd.Series,
                      group: pd.Series) -> dict:
    """
    Perform Kaplan-Meier estimation and log-rank test.
    time: follow-up duration
    event: 1=event occurred, 0=censored
    group: treatment group labels
    """
    groups = group.unique()
    kmf_results = {}

    for g in groups:
        mask = group == g
        kmf = KaplanMeierFitter()
        kmf.fit(time[mask], event[mask], label=str(g))
        kmf_results[g] = {
            "median_survival": kmf.median_survival_time_,
            "survival_at_12m": kmf.predict(12),
        }

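    # Log-rank test: compares the first group against all remaining patients,
    # so this is only a valid two-arm comparison when there are exactly two groups
    mask_a = group == groups[0]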
    lr = logrank_test(
        time[mask_a], time[~mask_a],
        event[mask_a], event[~mask_a],
    )

    return {
        "group_results": kmf_results,
        "logrank_statistic": lr.test_statistic,
        "logrank_p_value": lr.p_value,
    }
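
To make explicit what the Kaplan-Meier fit computes, here is a dependency-free sketch of the product-limit estimator. `kaplan_meier` is a hypothetical helper and the data are made up for illustration. For real analyses a tested library such as lifelines remains the better choice.

```python
def kaplan_meier(times: list[float], events: list[int]) -> list[tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing this time; censored subjects still
        # count as at risk at the time they are censored
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event flags (1=event, 0=censored)
for t, s in kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]):
    print(f"t={t}: S(t)={s:.3f}")
```

The curve only steps down at event times. Censored patients shrink the risk set without changing the estimate at the moment they leave.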

Regulatory Considerations

Key regulatory documents for clinical trial design:

ICH E6 (R2): Good Clinical Practice guidelines (revised as E6 (R3), adopted in 2025)
ICH E9 and its E9 (R1) addendum: Statistical Principles for Clinical Trials; estimands and sensitivity analysis
ICH E8 (R1): General Considerations for Clinical Studies
FDA 21 CFR Part 312: Investigational New Drug regulations
EMA Scientific Guidelines: Disease-specific guidance documents

Tools and Software

R survival package: Kaplan-Meier, Cox regression, log-rank test
lifelines (Python): Survival analysis library
gsDesign (R): Group sequential design and monitoring boundaries
PASS / nQuery: Commercial sample size software
EAST (Cytel): Adaptive and group sequential design software
REDCap: Electronic data capture for clinical research
ClinicalTrials.gov API: Trial registry search and data access