Medical-research-skills outcome-extraction-for-clinical-trials

Clinical research outcome extraction for meta-analysis. Use when users need to extract outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis. Handles both database lookup by PMID and real-time LLM extraction.

install

source · Clone the upstream repo

git clone https://github.com/aipoch/medical-research-skills

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/aipoch/medical-research-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/scientific-skills/Data Analysis/outcome-extraction-for-clinical-trials" ~/.claude/skills/aipoch-medical-research-skills-outcome-extraction-for-clinical-trials && rm -rf "$T"

manifest: scientific-skills/Data Analysis/outcome-extraction-for-clinical-trials/SKILL.md

source content

Source: https://github.com/aipoch/medical-research-skills

Clinical Outcome Extraction

Extract structured outcome data from clinical research papers for meta-analysis.

When to Use

Use this skill when you need to extract outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis, either by database lookup by PMID or by real-time LLM extraction, in a reproducible workflow.
Use this skill when a data analytics task needs a packaged method instead of ad-hoc freeform output.
Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
Use this skill when `scripts/extract_pdf.py` is the most direct path to complete the request.
Use this skill when you need the `outcome-extraction for clinical trials` package behavior rather than a generic answer.

Key Features

Scope-focused workflow for extracting outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis, using database lookup by PMID or real-time LLM extraction.
Packaged executable path(s): `scripts/extract_pdf.py`.
Reference material available in `references/` for task-specific guidance.
Structured execution path designed to keep outputs consistent and reviewable.

Dependencies

`Python`: `3.10+`. Repository baseline for current packaged skills.
`Third-party packages`: `not explicitly version-pinned in this skill package`. Add pinned versions if this skill needs stricter environment control.

Example Usage

cd "20260316/scientific-skills/Data Analytics/outcome-extraction-for-clinical-trials"
python -m py_compile scripts/extract_pdf.py
python scripts/extract_pdf.py --help

Example run plan:

Confirm the user input, output path, and any required config values.
Edit the in-file `CONFIG` block or documented parameters if the script uses fixed settings.
Run `python scripts/extract_pdf.py` with the validated inputs.
Review the generated output and return the final artifact with any assumptions called out.

Implementation Details

See the Workflow section below for related details.

Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
Primary implementation surface: `scripts/extract_pdf.py`.
Reference guidance: `references/` contains supporting rules, prompts, or checklists.
Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.

Workflow

Input Processing
- User provides: full paper text + optional PMID
- If PMID provided: query database first for existing results
- If no PMID or no database match: proceed to LLM extraction
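
The lookup-then-fallback logic in this step can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `scripts/extract_pdf.py`: `get_outcomes`, the in-memory `database` mapping, and the `llm_extract` callable are hypothetical names, and both the real database and the real LLM call are stubbed out.

```python
from typing import Callable, Optional

# Hypothetical stand-in for the results database, keyed by PMID.
Database = dict[str, list[dict]]


def get_outcomes(
    paper_text: str,
    pmid: Optional[str],
    database: Database,
    llm_extract: Callable[[str], list[dict]],
) -> tuple[list[dict], str]:
    """Return (outcomes, source), where source is "database" or "llm"."""
    if pmid:
        cached = database.get(pmid)
        if cached is not None:
            # Existing results for this PMID: no LLM call needed.
            return cached, "database"
    # No PMID, or no database match: fall back to LLM extraction.
    return llm_extract(paper_text), "llm"


def fake_llm(text: str) -> list[dict]:
    # Stub for this example only; a real call would send `text` to a model.
    return [{"outcome_name": "PFS"}]
```

Returning the source alongside the outcomes makes it easy to record whether a result came from the database or from a fresh extraction.
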
Outcome Identification (LLM)
- Extract all outcome measures from the paper
- Determine outcome types: binary, continuous, or survival
- Identify measurement time points
- Output JSON format with outcome classification
Data Classification (Code)
- Separate outcomes into three categories:
  - `bi_outcomes`: Binary/dichotomous outcomes
  - `con_outcomes`: Continuous outcomes
  - `sur_outcomes`: Survival outcomes
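
A minimal sketch of the classification step, assuming each identified outcome is a dict carrying a top-level `outcome_type` label of "Binary", "Continuous", or "Survival". The function name `classify_outcomes` and that input shape are assumptions for illustration; only the three bucket names come from this document.

```python
def classify_outcomes(outcomes: list[dict]) -> dict[str, list[dict]]:
    """Split outcomes into bi_outcomes, con_outcomes, and sur_outcomes."""
    buckets: dict[str, list[dict]] = {
        "bi_outcomes": [],
        "con_outcomes": [],
        "sur_outcomes": [],
    }
    # Assumed mapping from the outcome_type label to the bucket name.
    key_for_type = {
        "binary": "bi_outcomes",
        "continuous": "con_outcomes",
        "survival": "sur_outcomes",
    }
    for outcome in outcomes:
        label = str(outcome.get("outcome_type", "")).strip().lower()
        key = key_for_type.get(label)
        if key is None:
            # Fail loudly rather than silently dropping an outcome.
            raise ValueError(f"Unknown outcome_type: {outcome.get('outcome_type')!r}")
        buckets[key].append(outcome)
    return buckets
```

Raising on an unrecognized label, instead of skipping it, keeps a misclassified outcome from disappearing before the per-type extraction runs.
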
Data Extraction by Type

Binary Outcomes

Extract for each intervention group:

Sample size (n)
Number of events (event)

Continuous Outcomes

Extract for each intervention group:

Sample size (n)
Mean (mean)
Standard deviation (sd)

Survival Outcomes

Extract for each intervention group:

Sample size (n)
Hazard ratio (HR)
95% Lower CI
95% Upper CI
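
The per-type fields listed above (n and events; n, mean, and sd; n, HR, and the 95% CI bounds) can be assembled into group entries along these lines. This is a sketch, not the packaged implementation: the helper names are hypothetical, the `value_type` labels are taken from the Output Format section, and the range checks are added assumptions about what counts as plausible data.

```python
def binary_data(n: int, events: int) -> list[dict]:
    # Assumed sanity check: events cannot exceed the sample size.
    if not 0 <= events <= n:
        raise ValueError("events must be between 0 and n")
    return [{"value_type": "Events", "value": str(events)}]


def continuous_data(mean: float, sd: float) -> list[dict]:
    # Assumed sanity check: a standard deviation is never negative.
    if sd < 0:
        raise ValueError("sd must be non-negative")
    return [
        {"value_type": "Mean", "value": str(mean)},
        {"value_type": "SD", "value": str(sd)},
    ]


def survival_data(hr: float, lower: float, upper: float) -> list[dict]:
    # Assumed sanity check: the point estimate lies inside its interval.
    if not 0 < lower <= hr <= upper:
        raise ValueError("expected 0 < lower CI <= HR <= upper CI")
    return [
        {"value_type": "HR", "value": str(hr)},
        {"value_type": "95%Lower CI", "value": str(lower)},
        {"value_type": "95%Upper CI", "value": str(upper)},
    ]


def make_group(group_name: str, n: int, outcome_type: str, data: list[dict]) -> dict:
    """Wrap per-type data in a group entry, with the sample size as a string."""
    return {
        "group_name": group_name,
        "sample_size": str(n),
        "outcome_type": outcome_type,
        "data": data,
    }
```

For example, `make_group("Treatment A", 100, "Binary", binary_data(100, 25))` yields a group entry with one `Events` item whose value is `"25"`.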

Output Formatting
- Combine all extracted data
- Ensure consistent JSON structure
- Convert values to strings
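
One way to apply the string conversion uniformly is a small recursive helper. The name `stringify` is hypothetical, and mapping a missing value (`None`) to `""` is an assumption that follows the rule in the Requirements section.

```python
import json


def stringify(value):
    """Recursively convert leaf values to strings; None becomes ""."""
    if isinstance(value, dict):
        return {key: stringify(item) for key, item in value.items()}
    if isinstance(value, list):
        return [stringify(item) for item in value]
    if value is None:
        return ""
    return str(value)


def to_json(outcomes: list) -> str:
    # ensure_ascii=False keeps characters such as "±" readable in the output.
    return json.dumps(stringify(outcomes), ensure_ascii=False, indent=2)
```

Running the conversion once, at the end, lets the earlier steps work with real numbers and keeps the final JSON structure consistent.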

Output Format

[
  {
    "outcome_name": "PFS",
    "detection_time_point": "12 months",
    "groups": [
      {
        "group_name": "Treatment A",
        "sample_size": "100",
        "outcome_type": "Binary|Continuous|Survival",
        "data": [
          {"value_type": "Events|Mean|SD|HR|95%Lower CI|95%Upper CI", "value": "25"}
        ]
      }
    ]
  }
]
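
A lightweight structural check against the format above might look like the following. It is a sketch under assumptions: `find_problems` is a hypothetical helper, and the required keys are read off the example rather than from a published schema.

```python
REQUIRED_OUTCOME_KEYS = {"outcome_name", "detection_time_point", "groups"}
REQUIRED_GROUP_KEYS = {"group_name", "sample_size", "outcome_type", "data"}
ALLOWED_TYPES = {"Binary", "Continuous", "Survival"}


def find_problems(outcomes: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means the shape is OK."""
    problems: list[str] = []
    for i, outcome in enumerate(outcomes):
        missing = REQUIRED_OUTCOME_KEYS - set(outcome)
        if missing:
            problems.append(f"outcome {i}: missing {sorted(missing)}")
        for j, group in enumerate(outcome.get("groups", [])):
            where = f"outcome {i}, group {j}"
            missing = REQUIRED_GROUP_KEYS - set(group)
            if missing:
                problems.append(f"{where}: missing {sorted(missing)}")
            if group.get("outcome_type") not in ALLOWED_TYPES:
                problems.append(f"{where}: unexpected outcome_type")
            for item in group.get("data", []):
                # The workflow converts all values to strings.
                if not isinstance(item.get("value"), str):
                    problems.append(f"{where}: non-string value")
    return problems
```

Collecting problems in a list, rather than raising on the first one, gives a reviewer the full picture of what needs fixing in one pass.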

‼️‼️‼️See references (extraction-promots.md) for detailed JSON structures for each outcome type (binary, continuous, survival)‼️‼️‼️

Requirements

Extract from full text, not just abstract
Consider ALL intervention groups in the paper
Include ALL outcome measures of interest
Report all data regardless of statistical significance
Use specific group names (intervention names in English), not generic terms like "treatment group"
Output in JSON format
Output language: English for all field values
If data is not found: output an empty string ("")