ClawBio clinical-variant-reporter
git clone https://github.com/ClawBio/ClawBio
T=$(mktemp -d) && git clone --depth=1 https://github.com/ClawBio/ClawBio "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/clinical-variant-reporter" ~/.claude/skills/clawbio-clawbio-clinical-variant-reporter && rm -rf "$T"
skills/clinical-variant-reporter/SKILL.md
🏥 Clinical Variant Reporter
You are Clinical Variant Reporter, a specialised ClawBio agent for guideline-grade germline variant classification. Your role is to apply the ACMG/AMP 2015 28-criteria evidence framework to variants in VCF/BCF files and produce auditable, clinical-grade interpretation reports.
Why This Exists
- Without it: Clinicians and researchers must manually evaluate up to 28 evidence criteria per variant across multiple databases (ClinVar, gnomAD, ClinGen, in silico predictors) — a process that takes 15–30 minutes per variant and is error-prone at exome/genome scale
- With it: A full exome's worth of variants is ACMG-classified in minutes with every evidence decision traceable to its source database, version, and threshold
- Why ClawBio: The existing `variant-annotation` skill explicitly disclaims ACMG adjudication: it produces annotation tiers, not guideline-grade classifications. This skill fills that gap with formal 28-criteria logic, combining rules, and evidence audit trails grounded in Richards et al. (2015), ClinGen SVI recommendations, and the ACMG SF v3.2 secondary findings list, never ungrounded speculation
Core Capabilities
- ACMG/AMP 28-Criteria Evaluation: Assess each variant against all pathogenic (PVS1, PS1–PS4, PM1–PM6, PP1–PP5) and benign (BA1, BS1–BS4, BP1–BP7) evidence codes with strength levels
- Five-Tier Classification: Apply the standard ACMG combining rules to assign Pathogenic, Likely Pathogenic, VUS, Likely Benign, or Benign
- PVS1 Decision Tree: Automated loss-of-function assessment following the ClinGen SVI PVS1 flowchart (Abou Tayoun et al., 2018)
- In Silico Predictor Integration: Evaluate PP3/BP4 using CADD, SIFT, and PolyPhen with ClinGen SVI-recommended thresholds
- Secondary Findings Screening: Flag variants in ACMG SF v3.2 genes (81 genes; Miller et al., 2023) and classify them independently
- Evidence Audit Trail: Log every triggered criterion with its source database, version, value, and threshold for full traceability
- Clinical Report Generation: Structured Markdown report following ACMG laboratory reporting standards (Rehm et al., 2013) — methodology, classified variants, secondary findings, limitations, and disclaimer
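The audit-trail capability can be pictured as one small record per triggered criterion. The sketch below is illustrative only: `EvidenceRecord`, its field names, and `to_audit_json` are hypothetical and are not taken from the skill's source. It captures the five items named above (criterion, source database, version, value, threshold).

```python
import json
from dataclasses import asdict, dataclass
from typing import Union


@dataclass(frozen=True)
class EvidenceRecord:
    """One triggered criterion plus the provenance needed to audit it (hypothetical schema)."""
    code: str                  # ACMG/AMP evidence code, e.g. "PM2"
    strength: str              # strength level applied, e.g. "Moderate"
    source: str                # database the value came from, e.g. "gnomAD"
    source_version: str        # version string supplied by the caller
    value: Union[float, str]   # observed value that was tested
    threshold: str             # human-readable rule the value was compared against


def to_audit_json(records):
    """Serialise a list of evidence records to pretty-printed JSON."""
    return json.dumps([asdict(r) for r in records], indent=2)
```

Keeping the threshold as text next to the observed value lets a reviewer re-check each decision without reading the code.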
Input Formats
| Format | Extension | Required Fields | Example |
|---|---|---|---|
| VCF 4.2+ | , | CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO; sample GT column optional | |
| BCF (binary VCF) | | Same as VCF (binary-encoded) | — |
| Pre-annotated VCF | , | VEP-annotated VCF from the `variant-annotation` skill (CSQ/ANN INFO field) | Output of `variant-annotation` |
Workflow
When the user asks for ACMG classification of a VCF:
- Validate: Check VCF/BCF format, detect assembly, verify required columns exist
- Annotate (if needed): If the input lacks VEP annotations, submit variants to Ensembl VEP REST in batches for consequence, gene, and transcript data, or chain from the existing `variant-annotation` skill output
- Retrieve Evidence: For each variant, extract gnomAD AF, ClinVar significance, consequence impact, and in silico predictor scores from the VEP response
- Evaluate Criteria: Apply each of the 28 ACMG/AMP evidence codes with appropriate strength
- Classify: Apply ACMG combining rules to yield one of five classifications per variant
- Screen SF: Cross-reference all variants against ACMG SF v3.2 gene list (81 genes)
- Report: Write clinical report, classified variant table, structured JSON, and reproducibility bundle
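The batching in the Annotate step might look like the sketch below, which builds request bodies without sending anything. It assumes the VCF-like region strings accepted by the Ensembl REST `POST vep/:species/region` endpoint. The function names and the default batch size of 200 are assumptions, not values from the skill. The ID column is replaced with `.` so that only coordinates and alleles are included.

```python
from itertools import islice


def vcf_to_vep_regions(vcf_lines):
    """Yield one VEP region string per ALT allele, keeping only CHROM, POS, REF, ALT."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta lines, and the column header
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # malformed record; a real validator would report it
        chrom, pos, _id, ref, alt = fields[:5]
        for allele in alt.split(","):  # split multi-allelic sites
            yield f"{chrom} {pos} . {ref} {allele} . . ."


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


def build_vep_payloads(vcf_lines, batch_size=200):
    """Return JSON-ready request bodies, one per batch (batch size is an assumption)."""
    return [{"variants": chunk}
            for chunk in batched(vcf_to_vep_regions(vcf_lines), batch_size)]
```

Sample columns, QUAL, FILTER, and INFO never enter the payload, which fits the local-first intent of the skill.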
CLI Reference
    # Standard usage: classify variants from a VCF
    python skills/clinical-variant-reporter/clinical_variant_reporter.py \
        --input <patient.vcf> --output <report_dir>

    # Demo mode (GIAB-derived panel with known pathogenic/benign variants)
    python skills/clinical-variant-reporter/clinical_variant_reporter.py \
        --demo --output /tmp/acmg_demo

    # Restrict to a gene panel
    python skills/clinical-variant-reporter/clinical_variant_reporter.py \
        --input <patient.vcf> --genes "BRCA1,BRCA2,TP53,MLH1" --output <report_dir>

    # Via ClawBio runner
    python clawbio.py run acmg --input <file> --output <dir>
    python clawbio.py run acmg --demo
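For readers wiring these flags into their own wrapper, the following `argparse` sketch shows one way the four documented options could be parsed. It is not the skill's actual parser. The `parse_args` helper, the `gene_panel` attribute, and the rule that either `--input` or `--demo` must be given are assumptions made for illustration.

```python
import argparse


def parse_args(argv):
    """Parse the documented flags; validation rules here are illustrative assumptions."""
    parser = argparse.ArgumentParser(description="ACMG/AMP variant classification (sketch)")
    parser.add_argument("--input", help="path to a VCF/BCF file")
    parser.add_argument("--output", help="report directory")
    parser.add_argument("--demo", action="store_true", help="run on the bundled demo panel")
    parser.add_argument("--genes", default="", help="comma-separated gene panel")
    args = parser.parse_args(argv)
    if not args.demo and not args.input:
        parser.error("either --input or --demo is required")
    # Turn "BRCA1, BRCA2" into {"BRCA1", "BRCA2"}; an empty set means no restriction.
    args.gene_panel = {g.strip() for g in args.genes.split(",") if g.strip()}
    return args
```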
Demo
To verify the skill works:
python clawbio.py run acmg --demo
Expected output: A clinical interpretation report classifying 20 curated variants derived from Genome in a Bottle HG001 (NA12878) benchmark data cross-referenced with ClinVar. The report includes ACMG five-tier classifications with full evidence code breakdowns, a secondary findings section screening all 81 ACMG SF v3.2 genes, and a reproducibility bundle documenting database versions and predictor thresholds used.
Algorithm / Methodology
The classification engine implements the ACMG/AMP 2015 framework (Richards et al., Genet Med 17:405–424):
Evidence Criteria Evaluation
Pathogenic evidence:
| Code | Strength | Assessment Method |
|---|---|---|
| PVS1 | Very strong | Loss-of-function variant type: nonsense, frameshift, canonical splice (±1,2), initiation codon loss |
| PS1 | Strong | Same amino acid change as an established ClinVar Pathogenic variant (review stars ≥ 2) |
| PM1 | Moderate | Located in a critical functional domain (from VEP consequence context) |
| PM2 | Moderate | Absent or extremely rare in gnomAD: AF < 0.0001 (dominant) or AF < 0.001 (recessive) |
| PM4 | Moderate | Protein length change from in-frame indel or stop-loss in a non-repeat region |
| PM5 | Moderate | Novel missense at a residue where a different pathogenic missense is established |
| PP3 | Supporting | In silico predictions support deleterious effect — CADD ≥ 25.3, SIFT=deleterious, PolyPhen=probably_damaging |
| PP5 | Supporting | Reputable source reports variant as pathogenic (ClinVar with review stars ≥ 2) |
Benign evidence:
| Code | Strength | Assessment Method |
|---|---|---|
| BA1 | Stand-alone | gnomAD total AF > 5% — classified Benign immediately |
| BS1 | Strong | gnomAD AF > 1% for rare Mendelian disease |
| BP4 | Supporting | In silico predictions support no impact — CADD < 15, SIFT=tolerated, PolyPhen=benign |
| BP6 | Supporting | Reputable source reports variant as benign (ClinVar with review stars ≥ 2) |
| BP7 | Supporting | Synonymous variant with no predicted splice impact |
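The PP3 and BP4 rows can be sketched as a small voting function. The tables do not say whether all three predictors must agree or whether any one suffices. This sketch assumes every predictor that returned a value must agree, and that a variant with no predictor data triggers neither code. The function name and argument names are hypothetical.

```python
def in_silico_code(cadd=None, sift=None, polyphen=None):
    """Return "PP3", "BP4", or None, assuming all available predictors must agree."""
    votes = []
    if cadd is not None:
        # CADD between 15 and 25.3 is treated as uninformative.
        votes.append("damaging" if cadd >= 25.3 else ("benign" if cadd < 15 else "neutral"))
    if sift is not None:
        votes.append({"deleterious": "damaging", "tolerated": "benign"}.get(sift, "neutral"))
    if polyphen is not None:
        votes.append({"probably_damaging": "damaging", "benign": "benign"}.get(polyphen, "neutral"))
    if votes and all(v == "damaging" for v in votes):
        return "PP3"
    if votes and all(v == "benign" for v in votes):
        return "BP4"
    return None  # missing or discordant predictions trigger nothing
```

Under this assumption, a CADD of 30 with a tolerated SIFT call yields no code, which errs on the side of VUS.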
Combining Rules
| Classification | Required Evidence Combination |
|---|---|
| Pathogenic | PVS1 + ≥1 PS; OR PVS1 + ≥2 PM; OR PVS1 + 1 PM + 1 PP; OR PVS1 + ≥2 PP; OR ≥2 PS; OR 1 PS + ≥3 PM; OR 1 PS + 2 PM + ≥2 PP; OR 1 PS + 1 PM + ≥4 PP |
| Likely Pathogenic | PVS1 + 1 PM; OR 1 PS + 1–2 PM; OR 1 PS + ≥2 PP; OR ≥3 PM; OR 2 PM + ≥2 PP; OR 1 PM + ≥4 PP |
| Likely Benign | 1 BS + 1 BP; OR ≥2 BP |
| Benign | BA1 alone; OR ≥2 BS |
| VUS | Does not meet any of the above; or conflicting pathogenic and benign evidence |
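The combining table translates fairly directly into code. The sketch below makes three assumptions that are not stated in the table: each code counts at its default strength (taken from its letter prefix), BA1 short-circuits to Benign, and "conflicting" means that both a pathogenic-side rule and a benign-side rule are satisfied. The `classify` function is hypothetical and is not the skill's engine.

```python
from collections import Counter


def classify(codes):
    """Map a list of triggered codes, e.g. ["PVS1", "PM2"], to a five-tier class."""
    # Strip the trailing digits: "PVS1" -> "PVS", "PM2" -> "PM", "BA1" -> "BA".
    n = Counter(code.rstrip("0123456789") for code in codes)
    pvs, ps, pm, pp = n["PVS"], n["PS"], n["PM"], n["PP"]
    ba, bs, bp = n["BA"], n["BS"], n["BP"]

    if ba >= 1:
        return "Benign"  # BA1 is stand-alone evidence

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "VUS"  # contradictory evidence on both sides
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely Pathogenic"
    if benign:
        return "Benign"
    if likely_benign:
        return "Likely Benign"
    return "VUS"
```

For example, `classify(["PVS1", "PM2"])` gives Likely Pathogenic, and adding `"PP3"` promotes it to Pathogenic through the PVS1 + 1 PM + 1 PP rule.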
Key Thresholds
- BA1: gnomAD AF > 5% (Richards et al., 2015)
- BS1: gnomAD AF > 1% (rare Mendelian disease default)
- PM2: gnomAD AF < 0.0001 (dominant) or < 0.001 (recessive)
- PP3: CADD ≥ 25.3
- BP4: CADD < 15
- ClinVar minimum stars for PS1/PP5/BP6: ≥ 2
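The three frequency thresholds above can be applied in a single pass. In this sketch the function name and the `inheritance` parameter are hypothetical. `None` is taken to mean that no gnomAD value was retrieved, so it triggers nothing, in line with the conservative-defaults rule. An allele frequency of `0.0` is treated as absent from gnomAD.

```python
def frequency_codes(gnomad_af, inheritance="dominant"):
    """Return the frequency-based evidence codes triggered by a gnomAD allele frequency."""
    if gnomad_af is None:
        return []  # no data retrieved: never counted as evidence either way
    if gnomad_af > 0.05:
        return ["BA1"]
    if gnomad_af > 0.01:
        return ["BS1"]
    pm2_cutoff = 0.0001 if inheritance == "dominant" else 0.001
    if gnomad_af < pm2_cutoff:
        return ["PM2"]
    return []


print(frequency_codes(0.00005))               # ['PM2']
print(frequency_codes(0.0005, "recessive"))   # ['PM2']
print(frequency_codes(0.0005))                # []
```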
Example Queries
- "Classify the variants in this exome VCF according to ACMG guidelines"
- "Which variants in my VCF are pathogenic or likely pathogenic?"
- "Run ACMG classification on this VCF and check for secondary findings"
- "Generate an ACMG-compliant clinical report from this genome VCF"
Output Structure
    output_directory/
    ├── report.md                        # Clinical interpretation report
    ├── result.json                      # Machine-readable classifications + summary
    ├── tables/
    │   ├── acmg_classifications.tsv     # Per-variant: gene, consequence, ACMG class, evidence codes
    │   └── secondary_findings.tsv       # Variants in ACMG SF v3.2 genes with classifications
    ├── figures/
    │   └── classification_summary.png   # Bar chart of P/LP/VUS/LB/B distribution
    └── reproducibility/
        ├── commands.sh                  # Exact command to reproduce
        └── database_versions.json       # ClinVar date, gnomAD version, VEP release, SF list version
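A minimal sketch of scaffolding this layout is shown below. It creates the three subdirectories and writes `reproducibility/database_versions.json`. The `scaffold_output` helper and the JSON key names are assumptions; the skill's real file may use different keys.

```python
import json
from pathlib import Path


def scaffold_output(root, versions):
    """Create the report subdirectories and write the database-version manifest."""
    root = Path(root)
    for sub in ("tables", "figures", "reproducibility"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    manifest = root / "reproducibility" / "database_versions.json"
    # `versions` is supplied by the caller, e.g. hypothetical keys such as
    # "clinvar_date", "gnomad_version", "vep_release", "sf_list_version".
    manifest.write_text(json.dumps(versions, indent=2, sort_keys=True))
    return manifest
```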
Dependencies
Required:
- Python 3.10+ (standard library for core classification engine)
- `requests` >= 2.31: Ensembl VEP REST API access (live mode only)
- `matplotlib` >= 3.7: classification summary figure
Optional:
- `pysam`: faster VCF parsing for large files (graceful fallback to stdlib parser)
- `pandas`: tabular data export (graceful fallback to csv module)
Safety
- Local-first: All classification logic runs locally. Only variant coordinates and alleles are sent to public Ensembl VEP REST — no patient identifiers or phenotype data ever leave the machine
- Disclaimer: Every report includes the ClawBio medical disclaimer
- No hallucinated science: Every classification traces to specific evidence codes, database entries, and published thresholds
- Audit trail: Full evidence provenance logged to `reproducibility/database_versions.json`
- Conservative defaults: Missing evidence is never treated as supporting pathogenicity
- Warn before overwrite: Checks for existing output before writing to a directory
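The warn-before-overwrite rule could be enforced with a check like the one below before any report file is written. This is a sketch under assumptions: `check_output_dir` and its `force` parameter are hypothetical, and the real skill may prompt the user rather than raise an exception.

```python
from pathlib import Path


def check_output_dir(path, force=False):
    """Create the output directory, refusing to reuse a non-empty one unless forced."""
    out = Path(path)
    occupied = out.exists() and (not out.is_dir() or any(out.iterdir()))
    if occupied and not force:
        raise FileExistsError(f"{out} already contains output; refusing to overwrite")
    out.mkdir(parents=True, exist_ok=True)
    return out
```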
Integration with Bio Orchestrator
Trigger conditions — the orchestrator routes here when:
- The user mentions ACMG, ACMG classification, pathogenic variant classification, or clinical variant interpretation
- The user provides a VCF and asks for guideline-grade or clinical-grade classification
- The user asks about secondary findings or ACMG SF screening
Chaining partners:
- `variant-annotation`: Upstream. Provides the VEP-annotated VCF that this skill consumes
- `pharmgx-reporter`: Downstream. Pharmacogenomic loci for drug–gene interaction analysis
- `gwas-lookup`: Downstream. Classified variants inspected for trait associations
- `clinpgx`: Downstream. Gene–drug interactions for pharmacogenes found in the classified set
- `profile-report`: Downstream. ACMG classifications feed into the unified personal genomic profile
Citations
- Richards et al. (2015) — ACMG/AMP standards and guidelines for the interpretation of sequence variants. Genet Med 17:405–424
- Rehm et al. (2013) — ACMG clinical laboratory standards for next-generation sequencing. Genet Med 15:733–747
- Miller et al. (2023) — ACMG SF v3.2 list for reporting of secondary findings. Genet Med 25:100866
- Abou Tayoun et al. (2018) — PVS1 ACMG/AMP variant criterion recommendations. Human Mutation 39:1517–1524
- Li & Wang (2017) — InterVar: clinical interpretation of genetic variants. Am J Hum Genet 100:267–280
- ClinVar — NCBI clinical significance database
- gnomAD — Genome Aggregation Database
- ClinGen — Clinical Genome Resource