Medical-research-skills outcome-extraction-for-clinical-trials
Clinical research outcome extraction for meta-analysis. Use when users need to extract outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis. Handles both database lookup by PMID and real-time LLM extraction.
install
source · Clone the upstream repo
git clone https://github.com/aipoch/medical-research-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/aipoch/medical-research-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/scientific-skills/Data Analysis/outcome-extraction-for-clinical-trials" ~/.claude/skills/aipoch-medical-research-skills-outcome-extraction-for-clinical-trials && rm -rf "$T"
manifest:
scientific-skills/Data Analysis/outcome-extraction-for-clinical-trials/SKILL.mdsource content
Clinical Outcome Extraction
Extract structured outcome data from clinical research papers for meta-analysis.
When to Use
- Use this skill when you need clinical research outcome extraction for meta-analysis. use when users need to extract outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis. handles both database lookup by pmid and real-time llm extraction in a reproducible workflow.
- Use this skill when a data analytics task needs a packaged method instead of ad-hoc freeform output.
- Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
- Use this skill when
is the most direct path to complete the request.scripts/extract_pdf.py - Use this skill when you need the
package behavior rather than a generic answer.outcome-extraction for clinical trials
Key Features
- Scope-focused workflow aligned to: Clinical research outcome extraction for meta-analysis. Use when users need to extract outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis. Handles both database lookup by PMID and real-time LLM extraction.
- Packaged executable path(s):
.scripts/extract_pdf.py - Reference material available in
for task-specific guidance.references/ - Structured execution path designed to keep outputs consistent and reviewable.
Dependencies
:Python
. Repository baseline for current packaged skills.3.10+
:Third-party packages
. Add pinned versions if this skill needs stricter environment control.not explicitly version-pinned in this skill package
Example Usage
cd "20260316/scientific-skills/Data Analytics/outcome-extraction-for-clinical-trials" python -m py_compile scripts/extract_pdf.py python scripts/extract_pdf.py --help
Example run plan:
- Confirm the user input, output path, and any required config values.
- Edit the in-file
block or documented parameters if the script uses fixed settings.CONFIG - Run
with the validated inputs.python scripts/extract_pdf.py - Review the generated output and return the final artifact with any assumptions called out.
Implementation Details
See
## Workflow above for related details.
- Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
- Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
- Primary implementation surface:
.scripts/extract_pdf.py - Reference guidance:
contains supporting rules, prompts, or checklists.references/ - Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
- Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.
Workflow
-
Input Processing
- User provides: full paper text + optional PMID
- If PMID provided: query database first for existing results
- If no PMID or no database match: proceed to LLM extraction
-
Outcome Identification (LLM)
- Extract all outcome measures from the paper
- Determine outcome types: binary, continuous, or survival
- Identify measurement time points
- Output JSON format with outcome classification
-
Data Classification (Code)
- Separate outcomes into three categories:
: Binary/dichotomous outcomesbi_outcomes
: Continuous outcomescon_outcomes
: Survival outcomessur_outcomes
- Separate outcomes into three categories:
-
Data Extraction by Type
Binary Outcomes
Extract for each intervention group:
- Sample size (n)
- Number of events (event)
Continuous Outcomes
Extract for each intervention group:
- Sample size (n)
- Mean (mean)
- Standard deviation (sd)
Survival Outcomes
Extract for each intervention group:
- Sample size (n)
- Hazard ratio (HR)
- 95% Lower CI
- 95% Upper CI
- Output Formatting
- Combine all extracted data
- Ensure consistent JSON structure
- Convert values to strings
Output Format
[ { "outcome_name": "PFS", "detection_time_point": "12 months", "groups": [ { "group_name": "Treatment A", "sample_size": "100", "outcome_type": "Binary|Continuous|Survival", "data": [ {"value_type": "Events|Mean|SD|HR|95%Lower CI|95%Upper CI", "value": "25"} ] } ] } ]
‼️‼️‼️See references (extraction-promots.md) for detailed JSON structures for each outcome type (binary, continuous, survival)‼️‼️‼️
Requirements
- Extract from full text, not just abstract
- Consider ALL intervention groups in the paper
- Include ALL outcome measures of interest
- Report all data regardless of statistical significance
- Use specific group names (intervention names in English), not generic terms like "treatment group"
- Output in JSON format
- Output language: English for all field values
- If data not found: output blank space ""