OpenClaw-Medical-Skills bio-longread-alignment

Align long reads using minimap2 for Oxford Nanopore and PacBio data. Supports various presets for different read types and applications. Use when aligning ONT or PacBio reads to a reference genome for variant calling, SV detection, or coverage analysis.

install
source · Clone the upstream repo
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bio-longread-alignment" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-bio-longread-alignment && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/bio-longread-alignment" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-bio-longread-alignment && rm -rf "$T"
manifest: skills/bio-longread-alignment/SKILL.md
source content

Version Compatibility

Reference examples tested with: minimap2 2.26+, samtools 1.19+

Before using code patterns, verify installed versions match. If versions differ:

  • CLI: run <tool> --version, then <tool> --help to confirm flags

If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
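
The version check above can be scripted. This is a minimal sketch, assuming sort -V is available (GNU coreutils or a recent BSD sort) and that minimap2 --version prints a string such as 2.26-r1175. The helper names version_ge and strip_build_suffix are hypothetical and not part of minimap2 or samtools.

```shell
# version_ge A B: succeeds when version A >= version B (relies on `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: reduce a string like "2.26-r1175" to "2.26".
strip_build_suffix() {
    printf '%s\n' "${1%%-*}"
}

# Only query the tool when it is actually on PATH.
if command -v minimap2 >/dev/null 2>&1; then
    installed=$(strip_build_suffix "$(minimap2 --version)")
    if version_ge "$installed" 2.26; then
        echo "minimap2 $installed meets the tested minimum (2.26)"
    else
        echo "minimap2 $installed is older than 2.26; confirm flags with --help" >&2
    fi
fi
```

The same pattern could be applied to samtools after extracting the version number from the first line of samtools --version.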

Long-Read Alignment with minimap2

"Align my long reads to the reference" → Map ONT or PacBio reads using minimap2 with technology-specific presets for optimal sensitivity and accuracy.

  • CLI: minimap2 -ax map-ont ref.fa reads.fq | samtools sort -o aligned.bam (ONT); use minimap2 -ax map-hifi for PacBio HiFi

Oxford Nanopore Alignment

# Basic ONT alignment
minimap2 -ax map-ont reference.fa reads.fastq.gz | \
    samtools sort -o aligned.bam
samtools index aligned.bam

PacBio HiFi Alignment

# PacBio HiFi reads (high accuracy)
minimap2 -ax map-hifi reference.fa reads.fastq.gz | \
    samtools sort -o aligned.bam
samtools index aligned.bam

PacBio CLR Alignment

# PacBio CLR (continuous long reads, lower accuracy)
minimap2 -ax map-pb reference.fa reads.fastq.gz | \
    samtools sort -o aligned.bam
samtools index aligned.bam
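
The three platform-specific commands above differ only in the preset, so they can be folded into one wrapper. The following is a sketch under that assumption; preset_for and align_long_reads are hypothetical names, and the platform labels (ont, hifi, clr) are chosen here for illustration.

```shell
# Hypothetical helper: map a platform label to the minimap2 preset used above.
preset_for() {
    case "$1" in
        ont)  echo "map-ont" ;;
        hifi) echo "map-hifi" ;;
        clr)  echo "map-pb" ;;
        *)    echo "unknown platform: $1" >&2; return 1 ;;
    esac
}

# Hypothetical wrapper: align, coordinate-sort, and index in one call.
# Usage: align_long_reads <ont|hifi|clr> <reference> <reads> <out.bam>
align_long_reads() {
    preset=$(preset_for "$1") || return 1
    minimap2 -ax "$preset" "$2" "$3" | samtools sort -o "$4" &&
        samtools index "$4"
}

# Example:
# align_long_reads hifi reference.fa reads.fastq.gz aligned.bam
```

Note that a plain pipeline reports only the exit status of samtools sort, so a failure in minimap2 can go unnoticed unless the shell's pipefail option is enabled.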

Pre-Build Index for Multiple Runs

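# Build index once. K-mer and window size are fixed in the index,
# so build it with the same preset you will align with.
minimap2 -x map-ont -d reference.mmi reference.fa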

# Use index for alignment
minimap2 -ax map-ont reference.mmi reads.fastq.gz | samtools sort -o aligned.bam

Common Options

# -t 8            threads
# -R '@RG...'     read group header line
# --secondary=no  no secondary alignments
# --MD            generate the MD tag (used for variant calling)
# -Y              use soft clipping for supplementary alignments
minimap2 -ax map-ont \
    -t 8 \
    -R '@RG\tID:sample\tSM:sample' \
    --secondary=no \
    --MD \
    -Y \
    reference.fa reads.fastq.gz | \
    samtools sort -@ 4 -o aligned.bam
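
When many samples are processed, the read group string passed to -R can be generated rather than typed by hand. This is a small sketch; make_rg is a hypothetical helper, and the PL field is an addition not used in the example above (ONT and PACBIO are among the platform values listed in the SAM specification).

```shell
# Hypothetical helper: build an @RG line with literal "\t" separators,
# the form minimap2 -R accepts on the command line.
# $1 = sample name (used for both ID and SM), $2 = platform (e.g. ONT, PACBIO)
make_rg() {
    printf '@RG\\tID:%s\\tSM:%s\\tPL:%s\n' "$1" "$1" "$2"
}

# Example:
# minimap2 -ax map-ont -R "$(make_rg sample1 ONT)" reference.fa reads.fastq.gz | \
#     samtools sort -o aligned.bam
```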

Splice-Aware Alignment (RNA)

# For direct RNA or cDNA sequencing
minimap2 -ax splice reference.fa reads.fastq.gz | \
    samtools sort -o aligned.bam

With Junction BED (Known Splice Sites)

# Provide known splice junctions
minimap2 -ax splice --junc-bed junctions.bed \
    reference.fa reads.fastq.gz | samtools sort -o aligned.bam
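
The plain splice preset works for both library types, but the minimap2 README suggests extra options depending on the protocol: -uf -k14 for Nanopore direct RNA, and splice:hq with -uf for PacBio Iso-Seq. The sketch below collects those suggestions; splice_args and its library labels are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: print the minimap2 options suggested in the
# minimap2 README for each long-read RNA library type.
splice_args() {
    case "$1" in
        ont-cdna)   echo "-ax splice" ;;
        direct-rna) echo "-ax splice -uf -k14" ;;
        isoseq)     echo "-ax splice:hq -uf" ;;
        *)          echo "unknown library type: $1" >&2; return 1 ;;
    esac
}

# Example (word splitting of the unquoted expansion is intentional):
# minimap2 $(splice_args direct-rna) reference.fa reads.fastq.gz | \
#     samtools sort -o aligned.bam
```

The -uf option restricts splice-site search to the forward transcript strand, which suits stranded reads such as direct RNA.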

Assembly to Reference Alignment

# Assembly with ~0.1% divergence
minimap2 -ax asm5 reference.fa assembly.fa > aligned.sam

# Assembly with higher divergence (~5%)
minimap2 -ax asm20 reference.fa assembly.fa > aligned.sam

Output PAF (Faster, No BAM)

# PAF format (faster, for quick analysis)
minimap2 -x map-ont reference.fa reads.fastq.gz > alignments.paf
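
PAF is tab-separated, so quick summaries need only awk. The sketch below reports, per alignment, the fraction of the read that aligned and the ratio of matching bases to alignment block length (columns 10 and 11). The function name paf_summary is hypothetical. Without base-level alignment (the -c option), columns 10 and 11 are estimates, so treat the second ratio as approximate.

```shell
# Hypothetical helper: summarize PAF records read from files or stdin.
# Output columns: read name, target name, aligned fraction of read, match ratio.
# PAF columns used: 1 query name, 2 query length, 3 query start, 4 query end,
#                   6 target name, 10 matching bases, 11 alignment block length.
paf_summary() {
    awk 'BEGIN { FS = "\t" }
         $2 > 0 && $11 > 0 {
             printf "%s\t%s\t%.3f\t%.3f\n", $1, $6, ($4 - $3) / $2, $10 / $11
         }' "$@"
}

# Example:
# paf_summary alignments.paf | head
```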

Keep Secondary and Supplementary

# Keep all alignments (for SV calling)
# -N 5 sets the maximum number of secondary alignments per read
minimap2 -ax map-ont \
    --secondary=yes \
    -N 5 \
    reference.fa reads.fastq.gz | samtools sort -o aligned.bam

Filter Alignments

# During alignment pipeline: keep alignments with mapping quality >= 10
minimap2 -ax map-ont reference.fa reads.fastq.gz | \
    samtools view -b -q 10 | \
    samtools sort -o aligned.bam

Multiple FASTQ Files

# Concatenate inputs
minimap2 -ax map-ont reference.fa reads1.fastq.gz reads2.fastq.gz | \
    samtools sort -o aligned.bam

# Or use a file list (whitespace-separated paths without spaces)
xargs minimap2 -ax map-ont reference.fa < file_list.txt | \
    samtools sort -o aligned.bam

Output Statistics

# Get alignment statistics
samtools flagstat aligned.bam

# Detailed stats
samtools stats aligned.bam | grep ^SN
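
The SN section can be reduced to a single mapping rate. This sketch assumes tab-separated SN lines labelled "raw total sequences:" and "reads mapped:", as printed by recent samtools stats releases; mapped_fraction is a hypothetical helper name.

```shell
# Hypothetical helper: read `samtools stats` output on stdin and print
# reads mapped / raw total sequences, or NA when the total is missing or zero.
mapped_fraction() {
    awk -F '\t' '
        $1 == "SN" && $2 == "raw total sequences:" { total = $3 }
        $1 == "SN" && $2 == "reads mapped:"        { mapped = $3 }
        END {
            if (total > 0) printf "%.4f\n", mapped / total
            else print "NA"
        }'
}

# Example:
# samtools stats aligned.bam | mapped_fraction
```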

Convert PAF to BED

# Extract alignments to BED (target name, start, end, read name, MAPQ, strand)
awk 'BEGIN { OFS = "\t" } { print $6, $8, $9, $1, $12, $5 }' alignments.paf > alignments.bed

Key Presets

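| Preset   | Description     | Best For               |
|----------|-----------------|------------------------|
| map-ont  | ONT reads       | Nanopore genomic       |
| map-hifi | PacBio HiFi     | PacBio genomic         |
| map-pb   | PacBio CLR      | PacBio CLR             |
| splice   | Long RNA reads  | cDNA, direct RNA       |
| asm5     | Low divergence  | Same species assembly  |
| asm20    | High divergence | Cross-species assembly |
| sr       | Short reads     | Illumina (basic)       |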

Key Parameters

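| Parameter   | Default | Description                     |
|-------------|---------|---------------------------------|
| -t          | 3       | CPU threads                     |
| -k          | 15      | K-mer size                      |
| -w          | 10      | Minimizer window                |
| -a          | off     | Output SAM                      |
| -x          | none    | Preset                          |
| --secondary | yes     | Output secondary                |
| -N          | 5       | Max secondary alignments        |
| --MD        | off     | Generate MD tag                 |
| -R          | none    | Read group header               |
| -Y          | off     | Soft clipping for supplementary |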

Output Formats

| Format | Flag          | Description               |
|--------|---------------|---------------------------|
| PAF    | (default)     | Pairwise Alignment Format |
| SAM    | -a            | Sequence Alignment Map    |
| BAM    | -a | samtools | Binary SAM                |

Related Skills

  • medaka-polishing - Polish consensus with medaka
  • structural-variants - Call SVs from alignments
  • alignment-files/sam-bam-basics - BAM manipulation