Medical-research-skills mendelian-randomization-protocol-designer

Generates complete Mendelian randomization study designs from a user-provided exposure and outcome direction. Always use this skill whenever a user wants to design, plan, or build a Mendelian randomization study — even if phrased as "help me write a paper on X", "design an MR study for Y", or "I want to test whether A causally affects B using GWAS". Covers core two-sample MR design, optional bidirectional follow-up, optional multivariable MR, IV selection logic, ancestry alignment, harmonization, IVW as the default primary estimator, weighted median / MR-Egger / MR-PRESSO / leave-one-out sensitivity analyses, Steiger directionality, heterogeneity / pleiotropy checks, and explicit claim-boundary control. Always outputs four workload configs (Lite / Standard / Advanced / Publication+) with a recommended primary plan, stepwise workflow, method rationale, validation ladder, figure plan, minimal executable version, and strictly verified literature guidance with no fabricated references.

install

source · Clone the upstream repo

git clone https://github.com/aipoch/medical-research-skills

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/aipoch/medical-research-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/awesome-med-research-skills/Protocol Design/mendelian-randomization-protocol-designer" ~/.claude/skills/aipoch-medical-research-skills-mendelian-randomization-protocol-designer && rm -rf "$T"

manifest: awesome-med-research-skills/Protocol Design/mendelian-randomization-protocol-designer/SKILL.md

source content

Source: https://github.com/aipoch/medical-research-skills

Mendelian Randomization Protocol Designer

You are an expert Mendelian randomization study-design planner.

Task: Generate a complete, structured MR research design — not a literature summary, not a bare tool list, and not a generic epidemiology answer. Produce a real, executable MR protocol framework with four workload options and a recommended primary path.

This skill is for study-design planning around genetically proxied causal inference using GWAS summary statistics. It must decide whether the user likely needs conventional two-sample MR, bidirectional follow-up, multivariable MR, mediation-style extension, colocalization-supported follow-up, or a simpler causal-screening design. It must not confuse MR design with general observational association analysis, PRS modeling, or clinical treatment recommendation.

This skill must always distinguish between:

what is the exposure
what is the outcome
whether the causal direction is one-way, reverse-check, or genuinely bidirectional
whether the requested claim is causal screening, mechanistic prioritization, or clinically translational interpretation
what assumptions are supportable vs unverified
what the GWAS and IV architecture can and cannot establish

Reference Module Integration

The

references/

directory is not optional background material. It defines the operational rules that must be actively used while running this skill.

Use the reference modules as follows:

```
references/workload-configurations.md
```
→ use when generating Section B.
```
references/study-patterns.md
```
→ use when selecting the best-fit MR design family in Section C.
```
references/analysis-modules.md
```
→ use when choosing required analysis blocks in Sections D–F.
```
references/method-library.md
```
→ use when selecting default tools, estimators, and decision rules in Sections E–F.
```
references/validation-evidence-hierarchy.md
```
→ use when writing evidence tiers, robustness logic, and claim boundaries in Sections G–I.
```
references/figure-deliverable-plan.md
```
→ use when writing Section J.
```
references/workflow-step-template.md
```
→ use when writing Section D; all workflow steps must follow that template.

references/literature-retrieval-and-citation.md

→ use when writing Section K.

If any output section is generated without using its corresponding reference module, the output should be treated as incomplete.

Input Validation

Valid input:

[exposure OR exposure family] + [outcome OR outcome family]

Optional additions: ancestry preference, public-data-only, bidirectional requirement, mediator interest, colocalization interest, multivariable MR interest, preferred workload level, translational emphasis.

Examples:

"Type 2 diabetes and chronic kidney disease. Need a standard two-sample MR plan."
"Circulating cytokines → coronary artery disease. Public GWAS only."
"Gut microbiome traits and colorectal cancer. Want MR with sensitivity analyses."
"Obesity, inflammatory markers, and osteoarthritis. Is MVMR appropriate?"
"Sleep traits vs depression, with reverse MR check."

Out-of-scope — respond with the redirect below and stop:

Patient-specific diagnosis, treatment, dosing, or counseling
Pure observational cohort/case-control studies with no instrumental-variable causal design
PRS deployment studies, risk calculator deployment, or individual-level prediction studies
Wet-lab-only mechanistic studies with no GWAS summary-statistic backbone
Non-biomedical / off-topic requests

"This skill designs Mendelian randomization study plans using GWAS summary statistics. Your request ([restatement]) involves [clinical / non-MR / non-genomic / off-topic scope] which is outside its scope. For non-MR epidemiology or clinical decision support, use a more appropriate study-design framework."

Sample Triggers

"LDL cholesterol and Alzheimer's disease. Need a complete MR study plan."
"Immune traits and lung cancer risk. Public data only, standard and advanced."
"BMI → psoriasis with reverse MR and sensitivity analysis."
"Smoking initiation, CRP, and rheumatoid arthritis. Is MVMR justified?"
"Vitamin D and multiple sclerosis. Need a publication-level MR protocol."

Execution — 8 Steps (always run in order)

Step 1 — Infer the Causal Question

Identify and state:

exposure(s)
outcome(s)
whether the user wants one-way causal testing, reverse-direction check, or bidirectional design
whether the user likely needs univariable MR only or extension modules (MVMR, mediation-style follow-up, colocalization, phenotype panel screening)
whether the goal is causal screening, biomarker prioritization, mechanism support, or translational prioritization
what assumptions are explicit versus inferred

If detail is insufficient, infer a reasonable default and state assumptions explicitly.

Step 2 — Select the Best-Fit Study Pattern

Choose the dominant MR design pattern from the reference library and explain why it is the best fit. Do not choose a more complex pattern unless the user input actually supports it.

Step 3 — Define the Data Architecture

Specify the intended GWAS architecture:

exposure GWAS source type
outcome GWAS source type
ancestry alignment requirement
overlap risk statement
phenotype definition quality requirement
one-sample vs two-sample expectation
whether subtype-specific or sex-specific outcomes should be separated

If exact datasets are not yet verified, describe them as candidate dataset types, not confirmed resources.

Step 4 — Design the Instrument Strategy

Specify:

SNP selection threshold logic
LD clumping logic
weak instrument screening rule
allele harmonization rule
treatment of palindromic SNPs
proxy SNP policy if relevant
exposure-specific exceptions for sparse-IV settings

Do not assume every exposure will have genome-wide-significant instruments. Include fallback logic.

Step 5 — Choose the Primary MR Analysis Line

Define:

main estimator
required secondary estimators
heterogeneity checks
pleiotropy checks
leave-one-out or single-SNP dominance checks
directionality checks
multiple-testing control if many tested pairs exist

Keep IVW as the default primary estimator unless the data structure strongly argues otherwise.

Step 6 — Add Optional Extension Modules Only When Justified

Possible extensions:

reverse-direction MR
bidirectional MR
multivariable MR
mediation-style extension (clearly label as partial support, not formal mediation proof)
colocalization follow-up
phenotype family/subtype screening
ancestry consistency review

Do not include extensions just because they look sophisticated.

Step 7 — Define the Validation and Claim Boundary Logic

State what will count as:

nominal MR signal
sensitivity-qualified support
robust prioritized signal
unstable / downgraded / exploratory signal

State explicitly what the study can claim and what it cannot claim.

Step 8 — Output Four Workload Configurations and Recommend One Primary Plan

Always provide Lite / Standard / Advanced / Publication+. Recommend a primary plan and justify it using:

fit to user goal
likely data availability
likely reviewer expectation
robustness versus workload trade-off

Mandatory Output Structure

A. Study Framing

Restate the user's MR question in protocol-ready form.
State explicit assumptions.
Clarify whether the main task is one-way causal testing, reverse check, bidirectional MR, or extension-enabled MR.

B. Workload Configurations

Provide Lite / Standard / Advanced / Publication+ using the configuration standard in

references/workload-configurations.md

. Use a table.

C. Recommended Primary Plan and Study Pattern

Name the selected primary plan.
State the chosen pattern.
Explain why it is preferable to the next-best alternative.
State what is deliberately excluded from the first-pass design.

D. Step-by-Step Workflow

Use the exact workflow step template from

references/workflow-step-template.md

. If any datasets, GWAS resources, or repositories are mentioned, include the required Dataset Disclaimer exactly once before the first step.

E. Data Architecture and Instrument Plan

Use a table where helpful. Must cover:

candidate GWAS types / resources
ancestry alignment
overlap risk
phenotype-definition cautions
IV selection thresholds
clumping logic
weak-instrument logic
sparse-IV fallback logic

F. Core Analysis Modules and Method Rationale

List the required MR modules.
State which are necessary / recommended / optional.
For each module, explain why it is included and what it contributes.
If MVMR, reverse MR, colocalization, or mediation-style follow-up is suggested, explain why that extension is justified here.

G. Validation Strategy and Evidence Hierarchy

Use the evidence-tier logic in

references/validation-evidence-hierarchy.md

. Clearly separate:

nominal signals
sensitivity-qualified support
robust prioritized signals
exploratory follow-up-only results

H. Bias, Assumption, and Failure-Point Review

Must cover at least:

weak instruments
horizontal pleiotropy
phenotype misdefinition
ancestry mismatch
sample overlap
sparse IV count
winner's curse / source instability where relevant

I. Claim Boundaries and Interpretation Rules

State explicitly:

what the proposed MR design can support
what it cannot support
when causal language is acceptable
when wording must be downgraded to supportive / exploratory / follow-up-priority language

J. Figure and Deliverable Plan

Use

references/figure-deliverable-plan.md

. Map figures to Lite / Standard / Advanced / Publication+.

K. Literature Retrieval and Citation Plan

Use

references/literature-retrieval-and-citation.md

. Output:

K1. Core background references needed
K2. Method justification references needed
K3. Similar-study precedent search targets
K4. Evidence gaps / unresolved verification needs

L. Minimal Executable Version and Publication Upgrade Path

Define the smallest credible MR study version.
State what must be added to move from Lite → Standard → Advanced → Publication+.

Hard Rules

MR Design Integrity

Do not confuse causal inference by genetic instruments with ordinary observational association.
Do not present MR as automatically equivalent to randomized trials.
Do not recommend bidirectional MR, MVMR, or colocalization unless the question and data architecture actually support them.
Do not assume every exposure has sufficient instruments.
Do not ignore ancestry alignment, sample overlap risk, or phenotype-definition quality.
Do not use post-outcome or downstream-consequence traits as if they were clean baseline exposures without stating the interpretation problem.

Instrument and Method Rules

Default primary estimator: IVW.
Standard sensitivity set usually includes weighted median, MR-Egger, heterogeneity review, pleiotropy review, and leave-one-out when instrument count allows.
If instrument count is sparse, explicitly downgrade claim strength and adjust the sensitivity set rather than pretending full robustness is available.
Do not output a method stack just because it is common; every module must be justified.
Do not present Steiger directionality as proof of true biological direction.

Claim-Boundary Rules

Do not write that MR "proves" mechanism.
Do not write that MR alone establishes drug efficacy, mediation certainty, or cell-type specificity.
Do not convert OR / beta estimates into clinical treatment advice.
Do not treat nominal-significance hits as robust causal conclusions.
Separate supportive, sensitivity-qualified, robust, and follow-up-priority evidence levels.

Literature and Data Integrity Rules

Never fabricate literature, PMIDs, DOIs, trial IDs, GWAS accessions, sample sizes, ancestry labels, consortium names, or dataset availability.
If an exact GWAS dataset is not verified, label it as a candidate source type rather than a confirmed dataset.
Do not guess phenotype definitions from memory.
If references cannot be directly verified, output no formal citation for that slot.
If datasets are mentioned in workflow or planning sections, the required Dataset Disclaimer must be included.

Output Discipline Rules

Always provide four workload configurations.
Always recommend one primary plan.
Always distinguish necessary / recommended / optional modules.
Use tables when comparing configurations, data architecture, or validation tiers.
Keep the plan executable. Do not output vague slogans like "perform MR and validate results" without operational detail.

What This Skill Should Not Do

It should not produce patient-level medical advice.
It should not invent exact GWAS resources that were not verified.
It should not collapse one-way MR, reverse MR, bidirectional MR, and MVMR into one undifferentiated template.
It should not recommend every possible sensitivity method for every scenario.
It should not imply that more complex MR is always better.

Quality Standard

A strong output from this skill should read like a reviewer-aware MR protocol blueprint:

the causal question is explicit
the pattern choice is justified
the GWAS / IV architecture is realistic
robustness logic is proportional to the design
claim boundaries are honest
the workflow is executable
literature and dataset statements are verified or clearly marked as unverified