install
source · Clone the upstream repo
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bio-genome-assembly-assembly-polishing" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-bio-genome-assembly-assembly-polishi && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/bio-genome-assembly-assembly-polishing" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-bio-genome-assembly-assembly-polishi && rm -rf "$T"
manifest:
skills/bio-genome-assembly-assembly-polishing/SKILL.mdsource content
<!--
# COPYRIGHT NOTICE
# This file is part of the "Universal Biomedical Skills" project.
# Copyright (c) 2026 MD BABU MIA, PhD <md.babu.mia@mssm.edu>
# All Rights Reserved.
#
# This code is proprietary and confidential.
# Unauthorized copying of this file, via any medium is strictly prohibited.
#
# Provenance: Authenticated by MD BABU MIA
-->
name: bio-genome-assembly-assembly-polishing description: Polish genome assemblies to reduce errors using short reads (Pilon), long reads (Racon), or ONT-specific tools (medaka). Essential for improving long-read assembly accuracy. Use when improving assembly accuracy with polishing tools. tool_type: cli primary_tool: Pilon measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes. allowed-tools:
- read_file
- run_shell_command
Assembly Polishing
Improve assembly accuracy by correcting errors using additional sequencing data.
Polishing Strategies
| Tool | Input Reads | Best For |
|---|---|---|
| Pilon | Illumina | Final polishing |
| medaka | ONT | ONT assemblies |
| Racon | Long reads | Quick polishing |
| NextPolish | Both | Combined approach |
Recommended Workflows
ONT Assembly
- Racon (2-3 rounds with ONT)
- medaka (1 round)
- Pilon (2-3 rounds with Illumina)
PacBio CLR Assembly
- Racon (2-3 rounds)
- Pilon (2-3 rounds with Illumina)
PacBio HiFi Assembly
- Often no polishing needed (>99% accuracy)
- Optional Pilon if Illumina available
Pilon (Illumina Polishing)
Installation
conda install -c bioconda pilon
Basic Usage
# Map short reads to assembly bwa index assembly.fasta bwa mem -t 16 assembly.fasta R1.fq.gz R2.fq.gz | samtools sort -o aligned.bam samtools index aligned.bam # Run Pilon pilon --genome assembly.fasta --frags aligned.bam --output polished
Key Options
| Option | Description |
|---|---|
| Input assembly |
| Paired-end BAM |
| Output prefix |
| Write changes file |
| Write VCF of changes |
| What to fix (snps, indels, gaps, all) |
| Threads for alignment |
| Min depth for correction |
Multiple Rounds
#!/bin/bash ASSEMBLY=$1 R1=$2 R2=$3 ROUNDS=${4:-3} current=$ASSEMBLY for i in $(seq 1 $ROUNDS); do echo "=== Pilon round $i ===" bwa index $current bwa mem -t 16 $current $R1 $R2 | samtools sort -o round${i}.bam samtools index round${i}.bam pilon --genome $current --frags round${i}.bam --output pilon_round${i} --changes current=pilon_round${i}.fasta changes=$(wc -l < pilon_round${i}.changes) echo "Changes made: $changes" if [ $changes -eq 0 ]; then echo "No more changes, stopping" break fi done cp $current final_polished.fasta
Fix Specific Issues
# Only fix SNPs and small indels pilon --genome assembly.fa --frags aligned.bam --output polished --fix snps,indels # Only fill gaps pilon --genome assembly.fa --frags aligned.bam --output polished --fix gaps
medaka (ONT Polishing)
Installation
conda install -c bioconda medaka
Basic Usage
medaka_consensus -i reads.fastq.gz -d assembly.fasta -o medaka_output -t 8
Key Options
| Option | Description |
|---|---|
| Input reads |
| Draft assembly |
| Output directory |
| Threads |
| Model name |
Model Selection
# List available models medaka tools list_models # Use specific model (match your basecaller) medaka_consensus -i reads.fq.gz -d assembly.fa -o output -m r1041_e82_400bps_sup_v5.1.0
Models for Common Chemistries
| Chemistry | Model |
|---|---|
| R10.4.1 + SUP | r1041_e82_400bps_sup_* |
| R10.4.1 + HAC | r1041_e82_400bps_hac_* |
| R9.4.1 + SUP | r941_sup_* |
Output
medaka_output/ ├── consensus.fasta # Polished assembly ├── calls_to_draft.bam # Alignments └── *.hdf # Intermediate files
Racon (Long-Read Polishing)
Installation
conda install -c bioconda racon
Basic Usage
# Map reads to assembly minimap2 -ax map-ont assembly.fasta reads.fastq.gz > aligned.sam # Polish racon -t 16 reads.fastq.gz aligned.sam assembly.fasta > polished.fasta
Multiple Rounds
#!/bin/bash ASSEMBLY=$1 READS=$2 ROUNDS=${3:-3} current=$ASSEMBLY for i in $(seq 1 $ROUNDS); do echo "=== Racon round $i ===" minimap2 -ax map-ont $current $READS > round${i}.sam racon -t 16 $READS round${i}.sam $current > racon_round${i}.fasta current=racon_round${i}.fasta done cp $current racon_polished.fasta
Key Options
| Option | Description |
|---|---|
| Threads |
| Match score (default: 3) |
| Mismatch score (default: -5) |
| Gap penalty (default: -4) |
| Window size (default: 500) |
Complete Polishing Workflow
ONT Assembly Polishing
#!/bin/bash set -euo pipefail ASSEMBLY=$1 # Flye assembly ONT_READS=$2 # ONT reads ILLUMINA_R1=$3 # Illumina R1 ILLUMINA_R2=$4 # Illumina R2 OUTDIR=$5 mkdir -p $OUTDIR # Step 1: Racon polishing (2 rounds) echo "=== Racon Polishing ===" current=$ASSEMBLY for i in 1 2; do minimap2 -ax map-ont $current $ONT_READS > ${OUTDIR}/racon_${i}.sam racon -t 16 $ONT_READS ${OUTDIR}/racon_${i}.sam $current > ${OUTDIR}/racon_${i}.fasta current=${OUTDIR}/racon_${i}.fasta done # Step 2: medaka polishing echo "=== medaka Polishing ===" medaka_consensus -i $ONT_READS -d $current -o ${OUTDIR}/medaka -t 8 current=${OUTDIR}/medaka/consensus.fasta # Step 3: Pilon polishing (2 rounds) echo "=== Pilon Polishing ===" for i in 1 2; do bwa index $current bwa mem -t 16 $current $ILLUMINA_R1 $ILLUMINA_R2 | samtools sort -o ${OUTDIR}/pilon_${i}.bam samtools index ${OUTDIR}/pilon_${i}.bam pilon --genome $current --frags ${OUTDIR}/pilon_${i}.bam --output ${OUTDIR}/pilon_${i} current=${OUTDIR}/pilon_${i}.fasta done cp $current ${OUTDIR}/final_polished.fasta echo "Done: ${OUTDIR}/final_polished.fasta"
NextPolish (Combined Approach)
Installation
conda install -c bioconda nextpolish
Usage
# Create config file cat > run.cfg << EOF [General] job_type = local job_prefix = nextPolish task = best rewrite = yes rerun = 3 parallel_jobs = 2 multithread_jobs = 8 genome = assembly.fasta genome_size = auto workdir = ./01_rundir [lgs_option] lgs_fofn = lgs.fofn lgs_options = -min_read_len 1k -max_depth 100 lgs_minimap2_options = -x map-ont [sgs_option] sgs_fofn = sgs.fofn sgs_options = -max_depth 100 EOF # File of filenames ls reads.fastq.gz > lgs.fofn ls R1.fq.gz R2.fq.gz > sgs.fofn # Run nextPolish run.cfg
Quality Assessment
After polishing, assess improvement:
# Compare to reference (if available) quast.py -r reference.fa original.fa polished.fa -o quast_comparison # Check error rate minimap2 -ax map-ont polished.fa reads.fq.gz | samtools stats | grep "error rate"
Related Skills
- long-read-assembly - Initial assembly
- short-read-assembly - Source of polishing reads
- assembly-qc - Assess polishing improvement
- long-read-sequencing - medaka variant calling