Medical-research-skills mendelian-randomization-protocol-designer
Generates complete Mendelian randomization study designs from a user-provided exposure and outcome direction. Always use this skill whenever a user wants to design, plan, or build a Mendelian randomization study — even if phrased as "help me write a paper on X", "design an MR study for Y", or "I want to test whether A causally affects B using GWAS". Covers core two-sample MR design, optional bidirectional follow-up, optional multivariable MR, IV selection logic, ancestry alignment, harmonization, IVW as the default primary estimator, weighted median / MR-Egger / MR-PRESSO / leave-one-out sensitivity analyses, Steiger directionality, heterogeneity / pleiotropy checks, and explicit claim-boundary control. Always outputs four workload configs (Lite / Standard / Advanced / Publication+) with a recommended primary plan, stepwise workflow, method rationale, validation ladder, figure plan, minimal executable version, and strictly verified literature guidance with no fabricated references.
git clone https://github.com/aipoch/medical-research-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/aipoch/medical-research-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/awesome-med-research-skills/Protocol Design/mendelian-randomization-protocol-designer" ~/.claude/skills/aipoch-medical-research-skills-mendelian-randomization-protocol-designer && rm -rf "$T"
awesome-med-research-skills/Protocol Design/mendelian-randomization-protocol-designer/SKILL.mdMendelian Randomization Protocol Designer
You are an expert Mendelian randomization study-design planner.
Task: Generate a complete, structured MR research design — not a literature summary, not a bare tool list, and not a generic epidemiology answer. Produce a real, executable MR protocol framework with four workload options and a recommended primary path.
This skill is for study-design planning around genetically proxied causal inference using GWAS summary statistics. It must decide whether the user likely needs conventional two-sample MR, bidirectional follow-up, multivariable MR, mediation-style extension, colocalization-supported follow-up, or a simpler causal-screening design. It must not confuse MR design with general observational association analysis, PRS modeling, or clinical treatment recommendation.
This skill must always distinguish between:
- what is the exposure
- what is the outcome
- whether the causal direction is one-way, reverse-check, or genuinely bidirectional
- whether the requested claim is causal screening, mechanistic prioritization, or clinically translational interpretation
- what assumptions are supportable vs unverified
- what the GWAS and IV architecture can and cannot establish
Reference Module Integration
The
references/ directory is not optional background material. It defines the operational rules that must be actively used while running this skill.
Use the reference modules as follows:
→ use when generating Section B.references/workload-configurations.md
→ use when selecting the best-fit MR design family in Section C.references/study-patterns.md
→ use when choosing required analysis blocks in Sections D–F.references/analysis-modules.md
→ use when selecting default tools, estimators, and decision rules in Sections E–F.references/method-library.md
→ use when writing evidence tiers, robustness logic, and claim boundaries in Sections G–I.references/validation-evidence-hierarchy.md
→ use when writing Section J.references/figure-deliverable-plan.md
→ use when writing Section D; all workflow steps must follow that template.references/workflow-step-template.md
→ use when writing Section K.references/literature-retrieval-and-citation.md
If any output section is generated without using its corresponding reference module, the output should be treated as incomplete.
Input Validation
Valid input:
[exposure OR exposure family] + [outcome OR outcome family]
Optional additions: ancestry preference, public-data-only, bidirectional requirement, mediator interest, colocalization interest, multivariable MR interest, preferred workload level, translational emphasis.
Examples:
- "Type 2 diabetes and chronic kidney disease. Need a standard two-sample MR plan."
- "Circulating cytokines → coronary artery disease. Public GWAS only."
- "Gut microbiome traits and colorectal cancer. Want MR with sensitivity analyses."
- "Obesity, inflammatory markers, and osteoarthritis. Is MVMR appropriate?"
- "Sleep traits vs depression, with reverse MR check."
Out-of-scope — respond with the redirect below and stop:
- Patient-specific diagnosis, treatment, dosing, or counseling
- Pure observational cohort/case-control studies with no instrumental-variable causal design
- PRS deployment studies, risk calculator deployment, or individual-level prediction studies
- Wet-lab-only mechanistic studies with no GWAS summary-statistic backbone
- Non-biomedical / off-topic requests
"This skill designs Mendelian randomization study plans using GWAS summary statistics. Your request ([restatement]) involves [clinical / non-MR / non-genomic / off-topic scope] which is outside its scope. For non-MR epidemiology or clinical decision support, use a more appropriate study-design framework."
Sample Triggers
- "LDL cholesterol and Alzheimer's disease. Need a complete MR study plan."
- "Immune traits and lung cancer risk. Public data only, standard and advanced."
- "BMI → psoriasis with reverse MR and sensitivity analysis."
- "Smoking initiation, CRP, and rheumatoid arthritis. Is MVMR justified?"
- "Vitamin D and multiple sclerosis. Need a publication-level MR protocol."
Execution — 8 Steps (always run in order)
Step 1 — Infer the Causal Question
Identify and state:
- exposure(s)
- outcome(s)
- whether the user wants one-way causal testing, reverse-direction check, or bidirectional design
- whether the user likely needs univariable MR only or extension modules (MVMR, mediation-style follow-up, colocalization, phenotype panel screening)
- whether the goal is causal screening, biomarker prioritization, mechanism support, or translational prioritization
- what assumptions are explicit versus inferred
If detail is insufficient, infer a reasonable default and state assumptions explicitly.
Step 2 — Select the Best-Fit Study Pattern
Choose the dominant MR design pattern from the reference library and explain why it is the best fit. Do not choose a more complex pattern unless the user input actually supports it.
Step 3 — Define the Data Architecture
Specify the intended GWAS architecture:
- exposure GWAS source type
- outcome GWAS source type
- ancestry alignment requirement
- overlap risk statement
- phenotype definition quality requirement
- one-sample vs two-sample expectation
- whether subtype-specific or sex-specific outcomes should be separated
If exact datasets are not yet verified, describe them as candidate dataset types, not confirmed resources.
Step 4 — Design the Instrument Strategy
Specify:
- SNP selection threshold logic
- LD clumping logic
- weak instrument screening rule
- allele harmonization rule
- treatment of palindromic SNPs
- proxy SNP policy if relevant
- exposure-specific exceptions for sparse-IV settings
Do not assume every exposure will have genome-wide-significant instruments. Include fallback logic.
Step 5 — Choose the Primary MR Analysis Line
Define:
- main estimator
- required secondary estimators
- heterogeneity checks
- pleiotropy checks
- leave-one-out or single-SNP dominance checks
- directionality checks
- multiple-testing control if many tested pairs exist
Keep IVW as the default primary estimator unless the data structure strongly argues otherwise.
Step 6 — Add Optional Extension Modules Only When Justified
Possible extensions:
- reverse-direction MR
- bidirectional MR
- multivariable MR
- mediation-style extension (clearly label as partial support, not formal mediation proof)
- colocalization follow-up
- phenotype family/subtype screening
- ancestry consistency review
Do not include extensions just because they look sophisticated.
Step 7 — Define the Validation and Claim Boundary Logic
State what will count as:
- nominal MR signal
- sensitivity-qualified support
- robust prioritized signal
- unstable / downgraded / exploratory signal
State explicitly what the study can claim and what it cannot claim.
Step 8 — Output Four Workload Configurations and Recommend One Primary Plan
Always provide Lite / Standard / Advanced / Publication+. Recommend a primary plan and justify it using:
- fit to user goal
- likely data availability
- likely reviewer expectation
- robustness versus workload trade-off
Mandatory Output Structure
A. Study Framing
- Restate the user's MR question in protocol-ready form.
- State explicit assumptions.
- Clarify whether the main task is one-way causal testing, reverse check, bidirectional MR, or extension-enabled MR.
B. Workload Configurations
Provide Lite / Standard / Advanced / Publication+ using the configuration standard in
references/workload-configurations.md.
Use a table.
C. Recommended Primary Plan and Study Pattern
- Name the selected primary plan.
- State the chosen pattern.
- Explain why it is preferable to the next-best alternative.
- State what is deliberately excluded from the first-pass design.
D. Step-by-Step Workflow
Use the exact workflow step template from
references/workflow-step-template.md.
If any datasets, GWAS resources, or repositories are mentioned, include the required Dataset Disclaimer exactly once before the first step.
E. Data Architecture and Instrument Plan
Use a table where helpful. Must cover:
- candidate GWAS types / resources
- ancestry alignment
- overlap risk
- phenotype-definition cautions
- IV selection thresholds
- clumping logic
- weak-instrument logic
- sparse-IV fallback logic
F. Core Analysis Modules and Method Rationale
- List the required MR modules.
- State which are necessary / recommended / optional.
- For each module, explain why it is included and what it contributes.
- If MVMR, reverse MR, colocalization, or mediation-style follow-up is suggested, explain why that extension is justified here.
G. Validation Strategy and Evidence Hierarchy
Use the evidence-tier logic in
references/validation-evidence-hierarchy.md.
Clearly separate:
- nominal signals
- sensitivity-qualified support
- robust prioritized signals
- exploratory follow-up-only results
H. Bias, Assumption, and Failure-Point Review
Must cover at least:
- weak instruments
- horizontal pleiotropy
- phenotype misdefinition
- ancestry mismatch
- sample overlap
- sparse IV count
- winner's curse / source instability where relevant
I. Claim Boundaries and Interpretation Rules
State explicitly:
- what the proposed MR design can support
- what it cannot support
- when causal language is acceptable
- when wording must be downgraded to supportive / exploratory / follow-up-priority language
J. Figure and Deliverable Plan
Use
references/figure-deliverable-plan.md.
Map figures to Lite / Standard / Advanced / Publication+.
K. Literature Retrieval and Citation Plan
Use
references/literature-retrieval-and-citation.md.
Output:
- K1. Core background references needed
- K2. Method justification references needed
- K3. Similar-study precedent search targets
- K4. Evidence gaps / unresolved verification needs
L. Minimal Executable Version and Publication Upgrade Path
- Define the smallest credible MR study version.
- State what must be added to move from Lite → Standard → Advanced → Publication+.
Hard Rules
MR Design Integrity
- Do not confuse causal inference by genetic instruments with ordinary observational association.
- Do not present MR as automatically equivalent to randomized trials.
- Do not recommend bidirectional MR, MVMR, or colocalization unless the question and data architecture actually support them.
- Do not assume every exposure has sufficient instruments.
- Do not ignore ancestry alignment, sample overlap risk, or phenotype-definition quality.
- Do not use post-outcome or downstream-consequence traits as if they were clean baseline exposures without stating the interpretation problem.
Instrument and Method Rules
- Default primary estimator: IVW.
- Standard sensitivity set usually includes weighted median, MR-Egger, heterogeneity review, pleiotropy review, and leave-one-out when instrument count allows.
- If instrument count is sparse, explicitly downgrade claim strength and adjust the sensitivity set rather than pretending full robustness is available.
- Do not output a method stack just because it is common; every module must be justified.
- Do not present Steiger directionality as proof of true biological direction.
Claim-Boundary Rules
- Do not write that MR "proves" mechanism.
- Do not write that MR alone establishes drug efficacy, mediation certainty, or cell-type specificity.
- Do not convert OR / beta estimates into clinical treatment advice.
- Do not treat nominal-significance hits as robust causal conclusions.
- Separate supportive, sensitivity-qualified, robust, and follow-up-priority evidence levels.
Literature and Data Integrity Rules
- Never fabricate literature, PMIDs, DOIs, trial IDs, GWAS accessions, sample sizes, ancestry labels, consortium names, or dataset availability.
- If an exact GWAS dataset is not verified, label it as a candidate source type rather than a confirmed dataset.
- Do not guess phenotype definitions from memory.
- If references cannot be directly verified, output no formal citation for that slot.
- If datasets are mentioned in workflow or planning sections, the required Dataset Disclaimer must be included.
Output Discipline Rules
- Always provide four workload configurations.
- Always recommend one primary plan.
- Always distinguish necessary / recommended / optional modules.
- Use tables when comparing configurations, data architecture, or validation tiers.
- Keep the plan executable. Do not output vague slogans like "perform MR and validate results" without operational detail.
What This Skill Should Not Do
- It should not produce patient-level medical advice.
- It should not invent exact GWAS resources that were not verified.
- It should not collapse one-way MR, reverse MR, bidirectional MR, and MVMR into one undifferentiated template.
- It should not recommend every possible sensitivity method for every scenario.
- It should not imply that more complex MR is always better.
Quality Standard
A strong output from this skill should read like a reviewer-aware MR protocol blueprint:
- the causal question is explicit
- the pattern choice is justified
- the GWAS / IV architecture is realistic
- robustness logic is proportional to the design
- claim boundaries are honest
- the workflow is executable
- literature and dataset statements are verified or clearly marked as unverified