install
source · Clone the upstream repo
git clone https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills-
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills- "$T" && mkdir -p ~/.claude/skills && cp -r "$T/Skills/Epigenomics/methylation-analysis/bismark-alignment" ~/.claude/skills/mdbabumiamssm-llms-universal-life-science-and-clinical-skills-bismark-alignment && rm -rf "$T"
manifest: Skills/Epigenomics/methylation-analysis/bismark-alignment/SKILL.md
<!--
# COPYRIGHT NOTICE
# This file is part of the "Universal Biomedical Skills" project.
# Copyright (c) 2026 MD BABU MIA, PhD <md.babu.mia@mssm.edu>
# All Rights Reserved.
#
# This code is proprietary and confidential.
# Unauthorized copying of this file, via any medium is strictly prohibited.
#
# Provenance: Authenticated by MD BABU MIA
-->
name: bio-methylation-bismark-alignment
description: Bisulfite sequencing read alignment using Bismark with bowtie2/hisat2. Handles genome preparation and produces BAM files with methylation information. Use when aligning WGBS, RRBS, or other bisulfite-converted sequencing reads to a reference genome.
tool_type: cli
primary_tool: bismark
measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes.
allowed-tools:
- read_file
- run_shell_command
Bismark Alignment
Prepare Genome Index
    # One-time genome preparation (creates the bisulfite-converted index)
    bismark_genome_preparation --bowtie2 /path/to/genome_folder/

    # The genome folder should contain FASTA files (e.g., hg38.fa, chr1.fa)
    # Creates a Bisulfite_Genome/ subdirectory with C->T and G->A converted indices
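Since genome preparation only works on a folder that contains FASTA files, a small pre-flight check can save a failed run. The sketch below is illustrative: `has_fasta` is a hypothetical helper, and the accepted extensions (`.fa`, `.fasta`, optionally gzipped) are an assumption to verify against your Bismark version.

```shell
# Return 0 if the folder holds at least one FASTA file, 1 otherwise.
# Assumption: Bismark picks up .fa/.fasta files, optionally gzipped.
has_fasta() {
    dir=$1
    for f in "$dir"/*.fa "$dir"/*.fasta "$dir"/*.fa.gz "$dir"/*.fasta.gz; do
        if [ -f "$f" ]; then
            return 0
        fi
    done
    return 1
}

# Dry run: report what would happen instead of building the index.
genome=/path/to/genome_folder
if has_fasta "$genome"; then
    echo "would run: bismark_genome_preparation --bowtie2 $genome/"
else
    echo "no FASTA files found in $genome" >&2
fi
```

Replace the `echo` with the real `bismark_genome_preparation` call once the check passes.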
Basic Single-End Alignment
bismark --genome /path/to/genome_folder/ reads.fastq.gz -o output_dir/
Paired-End Alignment
    bismark --genome /path/to/genome_folder/ \
        -1 reads_R1.fastq.gz \
        -2 reads_R2.fastq.gz \
        -o output_dir/
Common Options
    # --bowtie2               use Bowtie 2 (default)
    # --parallel 4            number of parallel Bismark instances
    # --temp_dir /tmp/        temporary directory
    # --non_directional       only for non-directional libraries
    # --nucleotide_coverage   generate nucleotide coverage report
    bismark --genome /path/to/genome_folder/ \
        --bowtie2 \
        --parallel 4 \
        --temp_dir /tmp/ \
        --non_directional \
        --nucleotide_coverage \
        -o output_dir/ \
        reads.fastq.gz
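In shell, a backslash continues a line only when it is the last character, so a comment cannot trail it. One way to keep a note next to each option is to build the option string step by step. This is a minimal sketch that prints the command rather than running it; the variable names `opts` and `cmd` are illustrative, and `--non_directional` is left out because it applies only to non-directional libraries.

```shell
# Build the option list one flag per line so each can carry its own comment.
opts="--bowtie2"                      # aligner (Bismark's default)
opts="$opts --parallel 4"             # parallel Bismark instances
opts="$opts --temp_dir /tmp/"         # where temporary files go
opts="$opts --nucleotide_coverage"    # also write a nucleotide coverage report
opts="$opts -o output_dir/"           # output directory

# Dry run: assemble and print the command instead of executing it.
cmd="bismark --genome /path/to/genome_folder/ $opts reads.fastq.gz"
echo "$cmd"
```

This string-based approach assumes none of the paths contain spaces.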
RRBS Mode
    # Reduced Representation Bisulfite Sequencing (MspI digestion is the RRBS standard)
    # Bismark handles MspI-digested libraries automatically; no RRBS-specific flag is needed
    # RRBS-specific handling happens at trimming (Trim Galore's --rrbs option)
    bismark --genome /path/to/genome_folder/ reads.fastq.gz
PBAT Libraries
    # Post-Bisulfite Adapter Tagging (e.g., scBS-seq)
    bismark --genome /path/to/genome_folder/ --pbat reads.fastq.gz
Non-Directional Libraries
    # For libraries where all 4 strands are present
    bismark --genome /path/to/genome_folder/ --non_directional reads.fastq.gz
With Quality/Adapter Trimming (Pre-alignment)
    # Trim adapters first with Trim Galore (recommended)
    trim_galore --illumina --paired reads_R1.fastq.gz reads_R2.fastq.gz

    # Then align the trimmed reads
    bismark --genome /path/to/genome_folder/ \
        -1 reads_R1_val_1.fq.gz \
        -2 reads_R2_val_2.fq.gz
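The trimmed file names passed to Bismark follow Trim Galore's paired-end naming pattern shown above (`_val_1.fq.gz`, `_val_2.fq.gz`). As a rough sketch, a hypothetical helper `trimmed_name` can derive them so the two steps stay in sync; it assumes default Trim Galore naming and inputs ending in `.fastq.gz` or `.fq.gz`.

```shell
# Derive the Trim Galore paired-end output name for a given mate (1 or 2).
# Assumption: default Trim Galore naming, gzipped FASTQ input.
trimmed_name() {
    base=$(basename "$1")
    base=${base%.fastq.gz}
    base=${base%.fq.gz}
    echo "${base}_val_$2.fq.gz"
}

r1=$(trimmed_name reads_R1.fastq.gz 1)
r2=$(trimmed_name reads_R2.fastq.gz 2)

# Dry run: print the alignment command that would follow trimming.
echo "bismark --genome /path/to/genome_folder/ -1 $r1 -2 $r2"
```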
Multicore Processing
    # --parallel sets the number of Bismark instances run concurrently
    # Each instance starts 2 aligner processes (directional) or 4 (non-directional),
    # so aligner processes = parallel * 2 or parallel * 4; memory use scales the same way
    bismark --genome /path/to/genome_folder/ \
        --parallel 4 \
        reads.fastq.gz
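The thread arithmetic above can be checked before a job is submitted. The function below is a sketch of that rule only (`aligner_processes` is a hypothetical name). It counts aligner processes and ignores the extra overhead from Bismark itself, samtools, and gzip.

```shell
# Estimate aligner processes for a given --parallel value and library type.
aligner_processes() {
    parallel=$1
    library=${2:-directional}
    case $library in
        directional)     per_instance=2 ;;
        non_directional) per_instance=4 ;;
        *) echo "unknown library type: $library" >&2; return 1 ;;
    esac
    echo $((parallel * per_instance))
}

aligner_processes 4 directional       # prints 8
aligner_processes 4 non_directional   # prints 16
```

Comparing the result with the cores available on the node helps avoid oversubscription when choosing `--parallel`.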
Output Files
    # Bismark produces:
    #   reads_bismark_bt2.bam             aligned reads
    #   reads_bismark_bt2_SE_report.txt   alignment report

    # View the alignment report
    cat output_dir/reads_bismark_bt2_SE_report.txt
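Downstream steps need the BAM name that Bismark derives from the input file. The sketch below follows the pattern listed above for Bowtie 2 runs. `bismark_bam_name` is a hypothetical helper, and stripping `.fq`, `.fq.gz`, and `.fastq` in addition to `.fastq.gz` is an assumption.

```shell
# Predict the Bowtie 2 BAM name Bismark writes for an input FASTQ.
# Second argument: "se" (default) or "pe"; for "pe" pass the read 1 file.
bismark_bam_name() {
    base=$(basename "$1")
    for ext in .fastq.gz .fq.gz .fastq .fq; do
        base=${base%"$ext"}
    done
    if [ "${2:-se}" = "pe" ]; then
        echo "${base}_bismark_bt2_pe.bam"
    else
        echo "${base}_bismark_bt2.bam"
    fi
}

bismark_bam_name reads.fastq.gz    # prints reads_bismark_bt2.bam
```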
Sort and Index BAM
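    # Bismark output is unsorted; coordinate-sort and index a copy for viewers and generic BAM tools
    samtools sort output.bam -o output.sorted.bam
    samtools index output.sorted.bam

    # Keep the unsorted BAM for deduplicate_bismark and methylation extraction;
    # paired-end mates must stay adjacent in those steps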
Deduplicate (Optional)
    # Remove PCR duplicates (recommended for WGBS, not RRBS)
    deduplicate_bismark --bam output_bismark_bt2.bam

    # For paired-end
    deduplicate_bismark --paired --bam output_bismark_bt2_pe.bam
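The WGBS-versus-RRBS rule can be encoded so a pipeline does not deduplicate RRBS data by accident. RRBS fragments start at MspI cut sites, so reads sharing a position are expected there. The helper below is a hypothetical sketch that prints the command to run, or nothing for RRBS; the assay and layout labels are illustrative.

```shell
# Print the deduplication command for a BAM, or nothing if it should be skipped.
# Arguments: assay (wgbs|rrbs), layout (se|pe), BAM path.
dedup_command() {
    assay=$1
    layout=$2
    bam=$3
    if [ "$assay" = "rrbs" ]; then
        return 0    # RRBS: skip deduplication
    fi
    case $layout in
        se) echo "deduplicate_bismark --bam $bam" ;;
        pe) echo "deduplicate_bismark --paired --bam $bam" ;;
        *)  echo "unknown layout: $layout" >&2; return 1 ;;
    esac
}

dedup_command wgbs pe output_bismark_bt2_pe.bam
```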
Check Alignment Statistics
    # Bismark generates a detailed report
    cat *_SE_report.txt

    # Key metrics:
    #   Sequences analyzed
    #   Unique alignments
    #   Mapping efficiency
    #   C methylated in CpG context
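For batch runs it is often handier to pull a single metric out of each report than to read the files by eye. The sketch below extracts the mapping efficiency. It assumes the report contains a line starting with `Mapping efficiency:` followed by a percentage, which should be confirmed against a report from your Bismark version; `mapping_efficiency` is a hypothetical helper.

```shell
# Print the mapping efficiency (without the % sign) from a Bismark report.
# Assumption: the report has a line like "Mapping efficiency:<tab>NN.N%".
mapping_efficiency() {
    awk -F':' '/^Mapping efficiency/ {
        gsub(/[ \t%]/, "", $2)   # drop whitespace and the percent sign
        print $2
        exit
    }' "$1"
}

# Example usage over all single-end reports in the current directory:
#   for r in *_SE_report.txt; do
#       echo "$r: $(mapping_efficiency "$r")"
#   done
```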
Genome Preparation with HISAT2 (Recommended for Large Genomes)
    # HISAT2 can be faster and use less memory than Bowtie 2 for large mammalian genomes
    bismark_genome_preparation --hisat2 /path/to/genome_folder/

    # Align with HISAT2 (the index must have been prepared with --hisat2)
    bismark --genome /path/to/genome_folder/ --hisat2 reads.fastq.gz

    # HISAT2 paired-end
    bismark --genome /path/to/genome_folder/ --hisat2 \
        -1 reads_R1.fastq.gz \
        -2 reads_R2.fastq.gz
Key Parameters
| Parameter | Description |
|---|---|
| --genome | Path to genome folder |
| --bowtie2 | Use Bowtie2 aligner (default) |
| --hisat2 | Use HISAT2 aligner |
| --parallel | Parallel alignment instances |
| --non_directional | Non-directional library |
| --pbat | PBAT library protocol |
| -o | Output directory |
| --temp_dir | Temporary file directory |
| --nucleotide_coverage | Generate nucleotide coverage report |
| -N | Mismatches in seed (0 or 1, default 0) |
| -L | Seed length (default 20) |
Library Types
| Type | Parameter | Description |
|---|---|---|
| Directional | (default) | Standard WGBS/RRBS |
| Non-directional | --non_directional | All 4 strands |
| PBAT | --pbat | Post-bisulfite adapter tagging |
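The table maps directly onto a small lookup, which keeps the library type in one place when scripting many samples. This is a minimal sketch: `library_flag` and the type labels are hypothetical names, while the flags are the ones listed in the table.

```shell
# Map a library type to the Bismark flag from the table above.
# Directional libraries need no flag, so an empty line is printed.
library_flag() {
    case $1 in
        directional)     echo "" ;;
        non_directional) echo "--non_directional" ;;
        pbat)            echo "--pbat" ;;
        *) echo "unknown library type: $1" >&2; return 1 ;;
    esac
}

# Dry run: print the command for a PBAT sample.
flag=$(library_flag pbat)
echo "bismark --genome /path/to/genome_folder/ $flag reads.fastq.gz"
```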
Related Skills
- methylation-calling - Extract methylation from Bismark BAM
- methylkit-analysis - Import Bismark output to R
- sequence-io/read-sequences - FASTQ handling
- alignment-files/sam-bam-basics - BAM manipulation