OpenClaw-Medical-Skills structural-variant-calling

<!--

install
source · Clone the upstream repo
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/variant-interpretation-acmg/bioSkills/structural-variant-calling" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-structural-variant-calling && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/variant-interpretation-acmg/bioSkills/structural-variant-calling" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-structural-variant-calling && rm -rf "$T"
manifest: skills/variant-interpretation-acmg/bioSkills/structural-variant-calling/SKILL.md
source content
<!-- # COPYRIGHT NOTICE # This file is part of the "Universal Biomedical Skills" project. # Copyright (c) 2026 MD BABU MIA, PhD <md.babu.mia@mssm.edu> # All Rights Reserved. # # This code is proprietary and confidential. # Unauthorized copying of this file, via any medium is strictly prohibited. # # Provenance: Authenticated by MD BABU MIA -->

name: bio-variant-calling-structural-variant-calling description: Call structural variants (SVs) from short-read sequencing using Manta, Delly, and LUMPY. Detects deletions, insertions, inversions, duplications, and translocations that are too large for standard SNV callers. Use when detecting structural variants from short-read data. tool_type: cli primary_tool: manta measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes. allowed-tools:

  • read_file
  • run_shell_command

Structural Variant Calling (Short Reads)

Manta (Recommended)

# Configure Manta run (creates runWorkflow.py)
configManta.py \
    --bam sample.bam \
    --referenceFasta reference.fa \
    --runDir manta_run

# Execute
manta_run/runWorkflow.py -j 8

# Output: manta_run/results/variants/
# - diploidSV.vcf.gz (germline SVs)
# - candidateSV.vcf.gz (all candidates)
# - candidateSmallIndels.vcf.gz (small indels)

Manta Tumor-Normal Mode

# Somatic SV calling
configManta.py \
    --tumorBam tumor.bam \
    --normalBam normal.bam \
    --referenceFasta reference.fa \
    --runDir manta_somatic

manta_somatic/runWorkflow.py -j 8

# Output includes:
# - somaticSV.vcf.gz (somatic SVs)
# - diploidSV.vcf.gz (germline SVs)

Manta Options

# WES mode (for exome data)
configManta.py \
    --bam sample.bam \
    --referenceFasta reference.fa \
    --exome \                          # Use exome settings
    --callRegions regions.bed.gz \     # Restrict to regions
    --runDir manta_exome

# RNA-seq mode
configManta.py \
    --bam rnaseq.bam \
    --referenceFasta reference.fa \
    --rna \                            # RNA-seq mode
    --runDir manta_rna

Delly

# Call SVs
delly call \
    -g reference.fa \
    -o sv_calls.bcf \
    sample.bam

# Convert to VCF
bcftools view sv_calls.bcf > sv_calls.vcf

# Multiple samples (joint calling)
delly call \
    -g reference.fa \
    -o joint_svs.bcf \
    sample1.bam sample2.bam sample3.bam

Delly Somatic Mode

# Call with tumor-normal
delly call \
    -g reference.fa \
    -o svs.bcf \
    tumor.bam normal.bam

# Create sample file
echo -e "tumor\ttumor\nnormal\tcontrol" > samples.tsv

# Filter for somatic
delly filter \
    -f somatic \
    -o somatic_svs.bcf \
    -s samples.tsv \
    svs.bcf

Delly SV Types

# Call specific SV type
delly call -t DEL -g ref.fa -o deletions.bcf sample.bam
delly call -t DUP -g ref.fa -o duplications.bcf sample.bam
delly call -t INV -g ref.fa -o inversions.bcf sample.bam
delly call -t BND -g ref.fa -o translocations.bcf sample.bam
delly call -t INS -g ref.fa -o insertions.bcf sample.bam

LUMPY

# Extract split reads and discordant pairs
samtools view -b -F 1294 sample.bam > discordant.bam
samtools view -h sample.bam | \
    /path/to/lumpy-sv/scripts/extractSplitReads_BwaMem -i stdin | \
    samtools view -Sb - > splitters.bam

# Run LUMPY
lumpyexpress \
    -B sample.bam \
    -S splitters.bam \
    -D discordant.bam \
    -o lumpy_svs.vcf

Smoove (LUMPY Wrapper)

# Simplified LUMPY workflow
smoove call \
    --name sample \
    --fasta reference.fa \
    --outdir smoove_output \
    -p 8 \
    sample.bam

# Output: smoove_output/sample-smoove.genotyped.vcf.gz

Merge Multiple Callers

# Use SURVIVOR to merge callsets
# Create file listing VCFs
ls manta_svs.vcf delly_svs.vcf lumpy_svs.vcf > vcf_list.txt

# Merge with parameters
SURVIVOR merge vcf_list.txt 1000 2 1 1 0 50 merged_svs.vcf

# Parameters: max_dist min_callers type_agree strand_agree estimate_dist min_size

Filter SV Calls

# Filter by quality
bcftools view -i 'QUAL >= 20' svs.vcf > svs.filtered.vcf

# Filter by size
bcftools view -i 'ABS(SVLEN) >= 50' svs.vcf > svs.min50.vcf

# Filter by SV type
bcftools view -i 'SVTYPE="DEL"' svs.vcf > deletions.vcf
bcftools view -i 'SVTYPE="INS"' svs.vcf > insertions.vcf
bcftools view -i 'SVTYPE="INV"' svs.vcf > inversions.vcf
bcftools view -i 'SVTYPE="DUP"' svs.vcf > duplications.vcf
bcftools view -i 'SVTYPE="BND"' svs.vcf > translocations.vcf

# Keep only PASS
bcftools view -f PASS svs.vcf > svs.pass.vcf

Annotate SVs

# AnnotSV annotation
AnnotSV \
    -SVinputFile svs.vcf \
    -genomeBuild GRCh38 \
    -outputFile annotated_svs

# Output includes: genes, DGV, gnomAD-SV, ClinVar

SV Types

TypeCodeDescription
DeletionDELSequence removed
InsertionINSSequence inserted
InversionINVSequence reversed
DuplicationDUPSequence duplicated
TranslocationBNDBreakend (inter-chromosomal)

Comparison: Manta vs Delly vs LUMPY

FeatureMantaDellyLUMPY
SpeedFastMediumMedium
SensitivityHighHighHigh
Small SVsGoodModerateGood
Large SVsGoodGoodGood
RNA-seqYesNoNo
SomaticYesYesLimited

Coverage Guidelines

CoverageDetection Ability
10xLarge SVs (>1kb)
30xMost SVs
50x+Small SVs, better breakpoints

Long-Read SV Callers

For long-read data (ONT/PacBio HiFi), use specialized callers with higher sensitivity:

CallerBest ForNotes
CuteSVONT/HiFiFast, accurate for all SV types
Sniffles2ONT/HiFiPopulation-scale, multisample
PBSVPacBioOfficial PacBio caller

See long-read-sequencing/structural-variants for long-read SV workflows.

Related Skills

  • long-read-sequencing/structural-variants - Long-read SV calling
  • copy-number/cnvkit-analysis - Copy number variants
  • variant-calling/filtering-best-practices - Filter VCF files
  • alignment-files/alignment-filtering - Prepare BAM files
<!-- AUTHOR_SIGNATURE: 9a7f3c2e-MD-BABU-MIA-2026-MSSM-SECURE -->