OpenClaw-Medical-Skills bio-workflows-methylation-pipeline

<!--

install
source · Clone the upstream repo
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bio-workflows-methylation-pipeline" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-bio-workflows-methylation-pipeline && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/bio-workflows-methylation-pipeline" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-bio-workflows-methylation-pipeline && rm -rf "$T"
manifest: skills/bio-workflows-methylation-pipeline/SKILL.md
source content
<!-- # COPYRIGHT NOTICE # This file is part of the "Universal Biomedical Skills" project. # Copyright (c) 2026 MD BABU MIA, PhD <md.babu.mia@mssm.edu> # All Rights Reserved. # # This code is proprietary and confidential. # Unauthorized copying of this file, via any medium is strictly prohibited. # # Provenance: Authenticated by MD BABU MIA -->

name: bio-workflows-methylation-pipeline description: End-to-end bisulfite sequencing workflow from FASTQ to differentially methylated regions. Covers Bismark alignment, methylation calling, and DMR detection with methylKit. Use when analyzing bisulfite sequencing data. tool_type: mixed primary_tool: Bismark workflow: true depends_on:

  • read-qc/fastp-workflow
  • methylation-analysis/bismark-alignment
  • methylation-analysis/methylation-calling
  • methylation-analysis/methylkit-analysis
  • methylation-analysis/dmr-detection qc_checkpoints:
  • after_qc: "Q30 >80%, adapter content removed"
  • after_alignment: "Mapping efficiency >50%, bisulfite conversion >99%"
  • after_calling: "Coverage distribution reasonable, no biased positions" measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes. allowed-tools:
  • read_file
  • run_shell_command

Methylation Pipeline

Complete workflow from bisulfite sequencing FASTQ to differentially methylated regions.

Workflow Overview

FASTQ files
    |
    v
[1. QC & Trimming] -----> fastp/Trim Galore
    |
    v
[2. Alignment] ---------> Bismark
    |
    v
[3. Deduplication] -----> deduplicate_bismark
    |
    v
[4. Methylation Calling] -> bismark_methylation_extractor
    |
    v
[5. Analysis] -----------> methylKit (R)
    |
    v
[6. DMR Detection] ------> methylKit/DSS
    |
    v
Differentially methylated regions

Primary Path: Bismark + methylKit

Step 1: Quality Control

# Trim Galore recommended for bisulfite data (handles adapter bias)
trim_galore --paired --fastqc \
    -o trimmed/ \
    sample_R1.fastq.gz sample_R2.fastq.gz

# Or fastp with conservative settings
fastp -i sample_R1.fastq.gz -I sample_R2.fastq.gz \
    -o trimmed/sample_R1.fq.gz -O trimmed/sample_R2.fq.gz \
    --detect_adapter_for_pe \
    --qualified_quality_phred 20 \
    --length_required 35 \
    --html qc/sample_fastp.html

Step 2: Bismark Alignment

# Prepare genome (once)
bismark_genome_preparation --bowtie2 genome/

# Align
bismark --genome genome/ \
    -1 trimmed/sample_R1_val_1.fq.gz \
    -2 trimmed/sample_R2_val_2.fq.gz \
    -o aligned/ \
    --parallel 4 \
    --temp_dir tmp/

# Output: sample_R1_val_1_bismark_bt2_pe.bam

QC Checkpoint: Check Bismark report

  • Mapping efficiency >50% (BS-seq has lower rates)
  • Bisulfite conversion rate >99%

Step 3: Deduplication

deduplicate_bismark \
    --bam \
    -p \
    -o deduplicated/ \
    aligned/sample_R1_val_1_bismark_bt2_pe.bam

Step 4: Methylation Calling

bismark_methylation_extractor \
    --paired-end \
    --comprehensive \
    --bedGraph \
    --cytosine_report \
    --genome_folder genome/ \
    -o methylation/ \
    deduplicated/sample_R1_val_1_bismark_bt2_pe.deduplicated.bam

# Generate summary report
bismark2report
bismark2summary

Step 5: Analysis with methylKit

library(methylKit)

# Read methylation calls
files <- list(
    'methylation/control_1.CpG_report.txt',
    'methylation/control_2.CpG_report.txt',
    'methylation/treated_1.CpG_report.txt',
    'methylation/treated_2.CpG_report.txt'
)

sample_ids <- c('control_1', 'control_2', 'treated_1', 'treated_2')
treatment <- c(0, 0, 1, 1)

# Read cytosine reports
meth_obj <- methRead(
    location = as.list(files),
    sample.id = as.list(sample_ids),
    assembly = 'hg38',
    treatment = treatment,
    context = 'CpG',
    pipeline = 'bismarkCytosineReport'
)

# Filter by coverage
meth_filtered <- filterByCoverage(meth_obj, lo.count = 10, hi.perc = 99.9)

# Normalize coverage
meth_norm <- normalizeCoverage(meth_filtered)

# Merge samples (keep sites covered in all)
meth_merged <- unite(meth_norm, destrand = TRUE)

# Sample statistics
getMethylationStats(meth_obj[[1]], plot = TRUE)
getCoverageStats(meth_obj[[1]], plot = TRUE)

Step 6: DMR Detection

# Calculate differential methylation (per CpG)
diff_meth <- calculateDiffMeth(meth_merged)

# Get significant DMCs
dmc <- getMethylDiff(diff_meth, difference = 25, qvalue = 0.01)

# Tile into regions (DMRs)
tiles <- tileMethylCounts(meth_merged, win.size = 1000, step.size = 1000)
diff_tiles <- calculateDiffMeth(tiles)
dmr <- getMethylDiff(diff_tiles, difference = 25, qvalue = 0.01)

# Export
write.csv(as.data.frame(dmc), 'dmc_results.csv')
write.csv(as.data.frame(dmr), 'dmr_results.csv')

# Annotate with genomic features
library(genomation)
gene_obj <- readTranscriptFeatures('genes.bed')
annotateWithGeneParts(as(dmr, 'GRanges'), gene_obj)

Parameter Recommendations

StepParameterValue
Trim GaloredefaultRecommended for BS-seq
Bismark--parallel4 (per sample parallelization)
methylKitlo.count10 (minimum coverage)
methylKitdifference25 (% methylation difference)
methylKitqvalue0.01
DMR tileswin.size500-1000 bp

Troubleshooting

IssueLikely CauseSolution
Low mapping rateNormal for BS-seqExpect 40-70%
Low conversionFailed bisulfite treatmentCheck spike-in controls
Few DMRsLow coverage, small differencesIncrease sequencing, relax thresholds
Biased positionsM-biasTrim 10bp from read ends

Complete Pipeline Script

#!/bin/bash
set -e

THREADS=4
GENOME="genome/"
SAMPLES="control_1 control_2 treated_1 treated_2"
OUTDIR="methylation_results"

mkdir -p ${OUTDIR}/{trimmed,aligned,deduplicated,methylation,qc}

# Step 1: QC
for sample in $SAMPLES; do
    trim_galore --paired --fastqc -o ${OUTDIR}/trimmed/ \
        ${sample}_R1.fastq.gz ${sample}_R2.fastq.gz
done

# Step 2: Alignment
for sample in $SAMPLES; do
    bismark --genome ${GENOME} \
        -1 ${OUTDIR}/trimmed/${sample}_R1_val_1.fq.gz \
        -2 ${OUTDIR}/trimmed/${sample}_R2_val_2.fq.gz \
        -o ${OUTDIR}/aligned/ \
        --parallel ${THREADS} --temp_dir tmp/
done

# Step 3: Deduplication
for sample in $SAMPLES; do
    deduplicate_bismark --bam -p \
        -o ${OUTDIR}/deduplicated/ \
        ${OUTDIR}/aligned/${sample}_R1_val_1_bismark_bt2_pe.bam
done

# Step 4: Methylation calling
for sample in $SAMPLES; do
    bismark_methylation_extractor --paired-end --comprehensive \
        --bedGraph --cytosine_report \
        --genome_folder ${GENOME} \
        -o ${OUTDIR}/methylation/ \
        ${OUTDIR}/deduplicated/${sample}_R1_val_1_bismark_bt2_pe.deduplicated.bam
done

bismark2report
echo "Pipeline complete. Run R script for DMR analysis."

Related Skills

  • methylation-analysis/bismark-alignment - Bismark parameters
  • methylation-analysis/methylation-calling - Calling details
  • methylation-analysis/methylkit-analysis - methylKit functions
  • methylation-analysis/dmr-detection - DMR algorithms
<!-- AUTHOR_SIGNATURE: 9a7f3c2e-MD-BABU-MIA-2026-MSSM-SECURE -->