install
source · Clone the upstream repo
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bio-workflows-cnv-pipeline" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-bio-workflows-cnv-pipeline && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/bio-workflows-cnv-pipeline" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-bio-workflows-cnv-pipeline && rm -rf "$T"
manifest:
skills/bio-workflows-cnv-pipeline/SKILL.mdsource content
<!--
# COPYRIGHT NOTICE
# This file is part of the "Universal Biomedical Skills" project.
# Copyright (c) 2026 MD BABU MIA, PhD <md.babu.mia@mssm.edu>
# All Rights Reserved.
#
# This code is proprietary and confidential.
# Unauthorized copying of this file, via any medium is strictly prohibited.
#
# Provenance: Authenticated by MD BABU MIA
-->
name: bio-workflows-cnv-pipeline description: End-to-end copy number variant detection workflow from BAM files. Covers CNVkit analysis for exome/targeted sequencing with visualization and annotation. Use when detecting copy number alterations from sequencing data. tool_type: mixed primary_tool: CNVkit workflow: true depends_on:
- copy-number/cnvkit-analysis
- copy-number/cnv-visualization
- copy-number/cnv-annotation qc_checkpoints:
- after_coverage: "Uniform coverage across targets"
- after_calling: "Reasonable CNV count, expected ploidy"
- after_annotation: "Known CNVs detected if present" measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes. allowed-tools:
- read_file
- run_shell_command
CNV Pipeline
Complete workflow for detecting copy number variants from exome or targeted sequencing data.
Workflow Overview
BAM files (tumor/normal or germline) | v [1. Target Preparation] --> Create/access target BED | v [2. Coverage Calculation] --> Read depth per target | v [3. Reference Creation] --> Pool of normals | v [4. CNV Calling] --------> Log2 ratios, segmentation | v [5. Visualization] ------> Scatter plots, heatmaps | v [6. Annotation] ---------> Gene-level CNVs | v CNV calls with gene annotations
Primary Path: CNVkit
Step 1: Prepare Target Regions
# If using exome capture kit BED cnvkit.py target capture_targets.bed \ --annotate refFlat.txt \ --split \ -o targets.bed # Access regions (off-target for WGS-like sensitivity) cnvkit.py access genome.fa \ -o access.bed cnvkit.py antitarget targets.bed \ --access access.bed \ -o antitargets.bed
Step 2: Calculate Coverage
# For each sample for bam in *.bam; do sample=$(basename $bam .bam) # Target coverage cnvkit.py coverage $bam targets.bed \ -o coverage/${sample}.targetcoverage.cnn # Antitarget coverage cnvkit.py coverage $bam antitargets.bed \ -o coverage/${sample}.antitargetcoverage.cnn done
Step 3: Create Reference (Pool of Normals)
# From normal samples cnvkit.py reference \ coverage/normal*.targetcoverage.cnn \ coverage/normal*.antitargetcoverage.cnn \ --fasta genome.fa \ -o reference.cnn # Or flat reference (no normals available) cnvkit.py reference \ --fasta genome.fa \ --targets targets.bed \ --antitargets antitargets.bed \ -o flat_reference.cnn
Step 4: Call CNVs
for bam in tumor*.bam; do sample=$(basename $bam .bam) # Fix and segment cnvkit.py fix \ coverage/${sample}.targetcoverage.cnn \ coverage/${sample}.antitargetcoverage.cnn \ reference.cnn \ -o cnv/${sample}.cnr # Segment cnvkit.py segment cnv/${sample}.cnr \ -o cnv/${sample}.cns # Call integer copy numbers cnvkit.py call cnv/${sample}.cns \ -o cnv/${sample}.call.cns done
Step 5: Visualization
# Scatter plot for single sample cnvkit.py scatter cnv/tumor1.cnr \ -s cnv/tumor1.cns \ -o plots/tumor1_scatter.pdf # Chromosome-specific cnvkit.py scatter cnv/tumor1.cnr \ -s cnv/tumor1.cns \ -c chr17 \ -o plots/tumor1_chr17.pdf # Diagram (chromosome ideogram) cnvkit.py diagram cnv/tumor1.cnr \ -s cnv/tumor1.cns \ -o plots/tumor1_diagram.pdf # Heatmap for multiple samples cnvkit.py heatmap cnv/*.cns \ -o plots/cohort_heatmap.pdf
Step 6: Export and Annotation
# Export to various formats cnvkit.py export seg cnv/*.cns -o cnv/cohort.seg cnvkit.py export vcf cnv/tumor1.call.cns -o cnv/tumor1.vcf # Gene-level summary cnvkit.py genemetrics cnv/tumor1.cnr \ -s cnv/tumor1.cns \ --threshold 0.2 \ -o cnv/tumor1_genes.tsv # Filter for significant CNVs awk '$6 < -0.4 || $6 > 0.3' cnv/tumor1_genes.tsv > cnv/tumor1_significant_genes.tsv
Batch Processing Script
#!/bin/bash set -e TARGETS="targets.bed" REFERENCE="reference.cnn" OUTDIR="cnv_results" mkdir -p ${OUTDIR}/{coverage,cnv,plots} # Process all tumor samples for bam in tumor*.bam; do sample=$(basename $bam .bam) echo "Processing ${sample}..." # Coverage cnvkit.py coverage $bam ${TARGETS} \ -o ${OUTDIR}/coverage/${sample}.targetcoverage.cnn # Fix cnvkit.py fix \ ${OUTDIR}/coverage/${sample}.targetcoverage.cnn \ ${OUTDIR}/coverage/${sample}.antitargetcoverage.cnn \ ${REFERENCE} \ -o ${OUTDIR}/cnv/${sample}.cnr # Segment cnvkit.py segment ${OUTDIR}/cnv/${sample}.cnr \ -o ${OUTDIR}/cnv/${sample}.cns # Call cnvkit.py call ${OUTDIR}/cnv/${sample}.cns \ -o ${OUTDIR}/cnv/${sample}.call.cns # Plot cnvkit.py scatter ${OUTDIR}/cnv/${sample}.cnr \ -s ${OUTDIR}/cnv/${sample}.cns \ -o ${OUTDIR}/plots/${sample}.pdf done # Cohort heatmap cnvkit.py heatmap ${OUTDIR}/cnv/*.cns -o ${OUTDIR}/plots/heatmap.pdf
Germline CNV Calling
# For germline analysis (no tumor-normal) cnvkit.py batch sample*.bam \ --normal normal*.bam \ --targets targets.bed \ --fasta genome.fa \ --output-reference reference.cnn \ --output-dir cnv_output \ --scatter --diagram # Or use flat reference cnvkit.py batch sample.bam \ --method hybrid \ --targets targets.bed \ --fasta genome.fa \ --output-dir cnv_output
Parameter Recommendations
| Step | Parameter | Value |
|---|---|---|
| target | --split | Yes (for WES) |
| segment | --method | cbs (default) |
| call | --ploidy | 2 (adjust if known) |
| call | --purity | Estimate if tumor |
| genemetrics | --threshold | 0.2 |
Troubleshooting
| Issue | Likely Cause | Solution |
|---|---|---|
| Noisy signal | Low coverage | Increase sequencing depth |
| No CNVs | Flat reference, normal sample | Check reference creation |
| Many small CNVs | Over-segmentation | Increase segment min size |
| Batch effects | Different capture kits | Match samples to correct reference |
Complete Pipeline Script
#!/bin/bash set -e GENOME="genome.fa" TARGETS="capture_targets.bed" REFFLAT="refFlat.txt" NORMAL_BAMS="normal*.bam" TUMOR_BAMS="tumor*.bam" OUTDIR="cnv_results" mkdir -p ${OUTDIR}/{coverage,cnv,plots,annotation} # Step 1: Prepare targets cnvkit.py target ${TARGETS} --annotate ${REFFLAT} --split -o ${OUTDIR}/targets.bed cnvkit.py access ${GENOME} -o ${OUTDIR}/access.bed cnvkit.py antitarget ${OUTDIR}/targets.bed --access ${OUTDIR}/access.bed -o ${OUTDIR}/antitargets.bed # Step 2: Coverage (normals) for bam in ${NORMAL_BAMS}; do sample=$(basename $bam .bam) cnvkit.py coverage $bam ${OUTDIR}/targets.bed -o ${OUTDIR}/coverage/${sample}.targetcoverage.cnn cnvkit.py coverage $bam ${OUTDIR}/antitargets.bed -o ${OUTDIR}/coverage/${sample}.antitargetcoverage.cnn done # Step 3: Reference cnvkit.py reference ${OUTDIR}/coverage/normal*.cnn --fasta ${GENOME} -o ${OUTDIR}/reference.cnn # Step 4-5: Process tumors for bam in ${TUMOR_BAMS}; do sample=$(basename $bam .bam) cnvkit.py coverage $bam ${OUTDIR}/targets.bed -o ${OUTDIR}/coverage/${sample}.targetcoverage.cnn cnvkit.py coverage $bam ${OUTDIR}/antitargets.bed -o ${OUTDIR}/coverage/${sample}.antitargetcoverage.cnn cnvkit.py fix ${OUTDIR}/coverage/${sample}.targetcoverage.cnn \ ${OUTDIR}/coverage/${sample}.antitargetcoverage.cnn \ ${OUTDIR}/reference.cnn -o ${OUTDIR}/cnv/${sample}.cnr cnvkit.py segment ${OUTDIR}/cnv/${sample}.cnr -o ${OUTDIR}/cnv/${sample}.cns cnvkit.py call ${OUTDIR}/cnv/${sample}.cns -o ${OUTDIR}/cnv/${sample}.call.cns cnvkit.py scatter ${OUTDIR}/cnv/${sample}.cnr -s ${OUTDIR}/cnv/${sample}.cns -o ${OUTDIR}/plots/${sample}.pdf cnvkit.py genemetrics ${OUTDIR}/cnv/${sample}.cnr -s ${OUTDIR}/cnv/${sample}.cns -o ${OUTDIR}/annotation/${sample}_genes.tsv done echo "Pipeline complete. Results in ${OUTDIR}/"
Related Skills
- copy-number/cnvkit-analysis - CNVkit details
- copy-number/cnv-visualization - Plotting options
- copy-number/cnv-annotation - Gene annotations
- copy-number/gatk-cnv - GATK alternative