Claude-skill-registry bio-atac-seq-atac-peak-calling
Call accessible chromatin regions from ATAC-seq data using MACS3 with ATAC-specific parameters. Use when identifying open chromatin regions from aligned ATAC-seq BAM files, different from ChIP-seq peak calling.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/atac-peak-calling" ~/.claude/skills/majiayu000-claude-skill-registry-bio-atac-seq-atac-peak-calling && rm -rf "$T"
manifest:
skills/data/atac-peak-calling/SKILL.mdsource content
ATAC-seq Peak Calling
Basic MACS3 for ATAC-seq
# Standard ATAC-seq peak calling macs3 callpeak \ -t sample.bam \ -f BAMPE \ -g hs \ -n sample \ --outdir peaks/ \ -q 0.05 \ --nomodel \ --shift -75 \ --extsize 150 \ --keep-dup all \ -B
Key ATAC-seq Parameters
# Explained parameters macs3 callpeak \ -t sample.bam \ # Treatment BAM -f BAMPE \ # Paired-end BAM (uses fragment size) -g hs \ # Genome size: hs (human), mm (mouse) -n sample \ # Output name prefix --nomodel \ # Don't build shifting model --shift -75 \ # Shift reads to center on Tn5 cut site --extsize 150 \ # Extend reads to this size --keep-dup all \ # Keep duplicates (ATAC has natural duplicates) -B \ # Generate bedGraph for visualization --call-summits # Call peak summits
Why These Parameters?
| Parameter | Reason |
|---|---|
| --nomodel | ATAC doesn't have control, can't build model |
| --shift -75 | Centers on Tn5 insertion site |
| --extsize 150 | Smooths signal around cut sites |
| --keep-dup all | Tn5 creates duplicate cuts at accessible sites |
| -f BAMPE | Uses actual fragment size from paired-end |
Paired-End vs Single-End
# Paired-end (recommended for ATAC) macs3 callpeak -f BAMPE -t sample.bam ... # Single-end (less common) macs3 callpeak -f BAM -t sample.bam \ --nomodel --shift -75 --extsize 150 ...
Call Peaks on NFR Only
# First, filter to nucleosome-free reads (<100bp fragments) samtools view -h sample.bam | \ awk 'substr($0,1,1)=="@" || ($9>0 && $9<100) || ($9<0 && $9>-100)' | \ samtools view -b > nfr.bam # Call peaks on NFR macs3 callpeak \ -t nfr.bam \ -f BAMPE \ -g hs \ -n sample_nfr \ --nomodel \ --shift -37 \ --extsize 75 \ --keep-dup all \ -q 0.01
Broad Peaks (Optional)
# For broader accessible regions macs3 callpeak \ -t sample.bam \ -f BAMPE \ -g hs \ -n sample_broad \ --nomodel \ --shift -75 \ --extsize 150 \ --broad \ --broad-cutoff 0.1
Batch Processing
#!/bin/bash GENOME=hs # hs for human, mm for mouse OUTDIR=peaks mkdir -p $OUTDIR for bam in *.bam; do sample=$(basename $bam .bam) echo "Processing $sample..." macs3 callpeak \ -t $bam \ -f BAMPE \ -g $GENOME \ -n $sample \ --outdir $OUTDIR \ --nomodel \ --shift -75 \ --extsize 150 \ --keep-dup all \ -q 0.05 \ -B \ --call-summits done
Output Files
| File | Description |
|---|---|
| _peaks.narrowPeak | Peak locations (BED-like) |
| _summits.bed | Peak summit positions |
| _peaks.xls | Peak statistics (Excel format) |
| _treat_pileup.bdg | Signal track (bedGraph) |
| _control_lambda.bdg | Background (if control provided) |
narrowPeak Format
chr1 100 500 peak1 500 . 10.5 50.2 45.1 200
Columns: chrom, start, end, name, score, strand, signalValue, pValue, qValue, summit_offset
Convert to BigWig
# Sort bedGraph sort -k1,1 -k2,2n sample_treat_pileup.bdg > sample.sorted.bdg # Convert to BigWig bedGraphToBigWig sample.sorted.bdg chrom.sizes sample.bw
Merge Replicates
# Pool BAMs before peak calling (recommended for final peaks) samtools merge -@ 8 merged.bam rep1.bam rep2.bam rep3.bam # Call peaks on merged macs3 callpeak -t merged.bam -f BAMPE -g hs -n merged ...
IDR for Replicate Consistency
# Call peaks on each replicate macs3 callpeak -t rep1.bam -f BAMPE -g hs -n rep1 ... macs3 callpeak -t rep2.bam -f BAMPE -g hs -n rep2 ... # Run IDR idr --samples rep1_peaks.narrowPeak rep2_peaks.narrowPeak \ --input-file-type narrowPeak \ --output-file idr_peaks.txt \ --plot # Filter by IDR threshold awk '$5 >= 540' idr_peaks.txt > reproducible_peaks.bed
Related Skills
- read-alignment/bowtie2-alignment - Align ATAC-seq reads
- atac-seq/atac-qc - Quality control
- chip-seq/peak-calling - ChIP-seq comparison
- genome-intervals/bed-file-basics - Work with peak files