Medical-research-skills outcome-extraction-for-clinical-trials

Clinical research outcome extraction for meta-analysis. Use when users need to extract outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis. Handles both database lookup by PMID and real-time LLM extraction.

install
source · Clone the upstream repo
git clone https://github.com/aipoch/medical-research-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/aipoch/medical-research-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/scientific-skills/Data Analysis/outcome-extraction-for-clinical-trials" ~/.claude/skills/aipoch-medical-research-skills-outcome-extraction-for-clinical-trials && rm -rf "$T"
manifest: scientific-skills/Data Analysis/outcome-extraction-for-clinical-trials/SKILL.md
source content

Source: https://github.com/aipoch/medical-research-skills

Clinical Outcome Extraction

Extract structured outcome data from clinical research papers for meta-analysis.

When to Use

  • Use this skill when you need to extract outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis, through either database lookup by PMID or real-time LLM extraction, in a reproducible workflow.
  • Use this skill when a data analytics task needs a packaged method instead of ad-hoc freeform output.
  • Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
  • Use this skill when scripts/extract_pdf.py is the most direct path to complete the request.
  • Use this skill when you need the outcome-extraction-for-clinical-trials package behavior rather than a generic answer.

Key Features

  • Scope-focused workflow for extracting outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis, covering both database lookup by PMID and real-time LLM extraction.
  • Packaged executable path(s): scripts/extract_pdf.py.
  • Reference material available in references/ for task-specific guidance.
  • Structured execution path designed to keep outputs consistent and reviewable.

Dependencies

  • Python: 3.10+. Repository baseline for current packaged skills.
  • Third-party packages: not explicitly version-pinned in this skill package. Add pinned versions if this skill needs stricter environment control.

Example Usage

cd "20260316/scientific-skills/Data Analytics/outcome-extraction-for-clinical-trials"
python -m py_compile scripts/extract_pdf.py
python scripts/extract_pdf.py --help

Example run plan:

  1. Confirm the user input, output path, and any required config values.
  2. Edit the in-file CONFIG block or documented parameters if the script uses fixed settings.
  3. Run python scripts/extract_pdf.py with the validated inputs.
  4. Review the generated output and return the final artifact with any assumptions called out.

Implementation Details

See the Workflow section below for related details.

  • Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
  • Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
  • Primary implementation surface: scripts/extract_pdf.py.
  • Reference guidance: references/ contains supporting rules, prompts, or checklists.
  • Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
  • Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.

Workflow

  1. Input Processing

    • User provides: full paper text + optional PMID
    • If PMID provided: query database first for existing results
    • If no PMID or no database match: proceed to LLM extraction
  2. Outcome Identification (LLM)

    • Extract all outcome measures from the paper
    • Determine outcome types: binary, continuous, or survival
    • Identify measurement time points
    • Output JSON format with outcome classification
  3. Data Classification (Code)

    • Separate outcomes into three categories:
      • bi_outcomes: Binary/dichotomous outcomes
      • con_outcomes: Continuous outcomes
      • sur_outcomes: Survival outcomes
  4. Data Extraction by Type

Binary Outcomes

Extract for each intervention group:

  • Sample size (n)
  • Number of events (event)

Continuous Outcomes

Extract for each intervention group:

  • Sample size (n)
  • Mean (mean)
  • Standard deviation (sd)

Survival Outcomes

Extract for each intervention group:

  • Sample size (n)
  • Hazard ratio (HR)
  • 95% Lower CI
  • 95% Upper CI
  5. Output Formatting
    • Combine all extracted data
    • Ensure consistent JSON structure
    • Convert values to strings
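
As a rough illustration of steps 1 and 3 above, the lookup-then-fallback logic and the three-way classification might be sketched as follows. This is not the code in scripts/extract_pdf.py: the function names, the dict standing in for the database, the llm_extract callable, and the flat shape of the step-2 output (one dict per outcome with an outcome_type field) are all assumptions made for the example.

```python
from typing import Callable, Optional

# Map each outcome type to the category name used in step 3.
CATEGORY_BY_TYPE = {
    "binary": "bi_outcomes",
    "continuous": "con_outcomes",
    "survival": "sur_outcomes",
}


def get_outcomes(
    paper_text: str,
    pmid: Optional[str],
    database: dict,          # hypothetical stand-in: PMID -> stored outcome list
    llm_extract: Callable[[str], list],  # hypothetical LLM extraction callable
) -> list:
    """Step 1: use stored results for a known PMID, else fall back to the LLM."""
    if pmid and pmid in database:
        return database[pmid]
    return llm_extract(paper_text)


def classify_outcomes(outcomes: list) -> dict:
    """Step 3: split identified outcomes into the three categories."""
    buckets = {name: [] for name in CATEGORY_BY_TYPE.values()}
    for outcome in outcomes:
        # Assumed field: each identified outcome carries an "outcome_type".
        outcome_type = str(outcome.get("outcome_type", "")).strip().lower()
        category = CATEGORY_BY_TYPE.get(outcome_type)
        if category is None:
            raise ValueError(f"Unknown outcome type: {outcome_type!r}")
        buckets[category].append(outcome)
    return buckets
```

Raising on an unknown type, rather than silently dropping the outcome, is a design choice of this sketch so that misclassified outcomes surface during review.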

Output Format

[
  {
    "outcome_name": "PFS",
    "detection_time_point": "12 months",
    "groups": [
      {
        "group_name": "Treatment A",
        "sample_size": "100",
        "outcome_type": "Binary|Continuous|Survival",
        "data": [
          {"value_type": "Events|Mean|SD|HR|95%Lower CI|95%Upper CI", "value": "25"}
        ]
      }
    ]
  }
]
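
A structure like the one above can be checked before it is handed to a meta-analysis step. The sketch below is a hypothetical validator, not part of the packaged script. It assumes that each outcome type should carry the value types listed under "Data Extraction by Type" (Events for binary, Mean and SD for continuous, HR and both CI bounds for survival); the reference file may define stricter per-type structures.

```python
# Value types expected per outcome type (assumed pairing, see lead-in).
EXPECTED_VALUE_TYPES = {
    "Binary": {"Events"},
    "Continuous": {"Mean", "SD"},
    "Survival": {"HR", "95%Lower CI", "95%Upper CI"},
}


def validate_outcomes(outcomes: list) -> list:
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    for i, outcome in enumerate(outcomes):
        for key in ("outcome_name", "detection_time_point", "groups"):
            if key not in outcome:
                problems.append(f"outcome {i}: missing '{key}'")
        for j, group in enumerate(outcome.get("groups", [])):
            where = f"outcome {i}, group {j}"
            outcome_type = group.get("outcome_type")
            expected = EXPECTED_VALUE_TYPES.get(outcome_type)
            if expected is None:
                problems.append(f"{where}: unknown outcome_type {outcome_type!r}")
                continue
            data = group.get("data", [])
            found = {item.get("value_type") for item in data}
            missing = expected - found
            if missing:
                problems.append(f"{where}: missing value types {sorted(missing)}")
            # Step 5 of the workflow converts every value to a string.
            values = [group.get("sample_size")] + [item.get("value") for item in data]
            if not all(isinstance(v, str) for v in values):
                problems.append(f"{where}: all values must be strings")
    return problems
```

Returning a list of messages instead of raising on the first error lets a reviewer see every problem in one pass.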

‼️‼️‼️See references (extraction-promots.md) for detailed JSON structures for each outcome type (binary, continuous, survival)‼️‼️‼️

Requirements

  • Extract from full text, not just abstract
  • Consider ALL intervention groups in the paper
  • Include ALL outcome measures of interest
  • Report all data regardless of statistical significance
  • Use specific group names (intervention names in English), not generic terms like "treatment group"
  • Output in JSON format
  • Output language: English for all field values
  • If data is not found: output an empty string ("")
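
The string-conversion and missing-data rules can be applied in one place when a group record is assembled. The helper names below (to_field, make_group) are hypothetical and the example values are made up for illustration. The sketch assumes missing values arrive as None.

```python
import json


def to_field(value) -> str:
    """Convert an extracted value to its output string; missing data becomes ""."""
    if value is None:
        return ""
    return str(value).strip()


def make_group(group_name: str, outcome_type: str, sample_size, values: dict) -> dict:
    """Build one entry of "groups" with every value stored as a string."""
    return {
        "group_name": group_name,
        "sample_size": to_field(sample_size),
        "outcome_type": outcome_type,
        "data": [
            {"value_type": value_type, "value": to_field(value)}
            for value_type, value in values.items()
        ],
    }


# Illustrative values only: the SD was not reported, so it is emitted as "".
group = make_group("Drug X 10 mg", "Continuous", 48, {"Mean": 12.5, "SD": None})
print(json.dumps(group, indent=2))
```

Keeping the missing-value rule in a single helper means every field follows the same convention, including sample_size.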