OpenClaw-Medical-Skills bio-phylo-modern-tree-inference

<!--

install
source · Clone the upstream repo
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bio-phylo-modern-tree-inference" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-bio-phylo-modern-tree-inference && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/bio-phylo-modern-tree-inference" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-bio-phylo-modern-tree-inference && rm -rf "$T"
manifest: skills/bio-phylo-modern-tree-inference/SKILL.md
source content
<!-- # COPYRIGHT NOTICE # This file is part of the "Universal Biomedical Skills" project. # Copyright (c) 2026 MD BABU MIA, PhD <md.babu.mia@mssm.edu> # All Rights Reserved. # # This code is proprietary and confidential. # Unauthorized copying of this file, via any medium is strictly prohibited. # # Provenance: Authenticated by MD BABU MIA -->

name: bio-phylo-modern-tree-inference description: Build maximum likelihood phylogenetic trees using IQ-TREE2 and RAxML-ng. Use when inferring publication-quality trees with model selection, ultrafast bootstrap, or partitioned analyses from sequence alignments. tool_type: cli primary_tool: IQ-TREE2 measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes. allowed-tools:

  • read_file
  • run_shell_command

Modern ML Tree Inference

Build maximum likelihood phylogenetic trees with automatic model selection and ultrafast bootstrap.

IQ-TREE2 Basic Usage

# Simple ML tree with automatic model selection
iqtree2 -s alignment.fasta -m MFP -B 1000 -T AUTO

# -s: input alignment
# -m MFP: ModelFinder Plus (automatic model selection + tree inference)
# -B 1000: 1000 ultrafast bootstrap replicates (minimum recommended for publication)
# -T AUTO: automatic thread detection

IQ-TREE2 Output Files

FileDescription
.treefile
Best ML tree in Newick format
.iqtree
Full analysis report with model parameters
.log
Run log
.contree
Consensus tree with bootstrap support
.splits.nex
Bootstrap splits in Nexus format
.model.gz
Model parameters
.bionj
Initial BIONJ tree
.mldist
ML distance matrix
.ckp.gz
Checkpoint file for resuming

Model Selection

# ModelFinder only (no tree inference)
iqtree2 -s alignment.fasta -m MF

# Use specific model
iqtree2 -s alignment.fasta -m GTR+G4 -B 1000

# Test only specific models
iqtree2 -s alignment.fasta -m MF -mset GTR,HKY,K2P

# Protein models
iqtree2 -s protein.fasta -m MFP -B 1000 -st AA

Common DNA Substitution Models

ModelParametersUse Case
JCEqual ratesVery simple, rarely appropriate
K2P/K80Ti/Tv ratioSimple, some rate variation
HKYTi/Tv + base freqModerate complexity
GTR6 rates + base freqMost general, recommended default
+G4Gamma rate variation4 discrete rate categories
+IInvariant sitesSites that never change
+R4FreeRate modelMore flexible than Gamma

Ultrafast Bootstrap

# Standard ultrafast bootstrap (UFBoot2)
# B>=1000: Minimum for publication. Use 10000 for final analyses.
iqtree2 -s alignment.fasta -m GTR+G4 -B 1000

# Standard bootstrap (slower but more accurate for small datasets)
iqtree2 -s alignment.fasta -m GTR+G4 -b 100

# SH-aLRT test (fast approximate likelihood ratio test)
iqtree2 -s alignment.fasta -m GTR+G4 -alrt 1000

# Both UFBoot and SH-aLRT
iqtree2 -s alignment.fasta -m GTR+G4 -B 1000 -alrt 1000

Interpreting Bootstrap Values

UFBootSH-aLRTInterpretation
>= 95>= 80Strong support
80-9470-79Moderate support
< 80< 70Weak support

Partitioned Analysis

For multi-gene datasets with different evolutionary rates:

# Create partition file (partitions.nex)
cat > partitions.nex << 'EOF'
#nexus
begin sets;
    charset gene1 = 1-500;
    charset gene2 = 501-1200;
    charset gene3 = 1201-1800;
    charpartition mine = HKY:gene1, GTR:gene2, GTR+G:gene3;
end;
EOF

# Run partitioned analysis
iqtree2 -s concat.fasta -p partitions.nex -m MFP -B 1000

# Edge-linked partition model (proportional branch lengths)
iqtree2 -s concat.fasta -q partitions.nex -m MFP -B 1000

# Edge-unlinked (independent branch lengths per partition)
iqtree2 -s concat.fasta -Q partitions.nex -m MFP -B 1000

RAxML-ng Basic Usage

# Simple ML tree with GTR+G
raxml-ng --all --msa alignment.fasta --model GTR+G --bs-trees 100

# --all: ML search + bootstrapping
# --msa: input alignment
# --model: substitution model
# --bs-trees: number of bootstrap replicates

RAxML-ng Model Specification

# DNA models
raxml-ng --msa alignment.fasta --model GTR+G4+I

# Protein models (automatic detection)
raxml-ng --msa protein.fasta --model LG+G8+F

# Check alignment and determine model
raxml-ng --check --msa alignment.fasta --model GTR+G

RAxML-ng Output Files

FileDescription
.raxml.bestTree
Best ML tree
.raxml.support
Tree with bootstrap support values
.raxml.bootstraps
All bootstrap trees
.raxml.mlTrees
All ML trees from search
.raxml.log
Analysis log
.raxml.rba
Binary alignment (for restart)

RAxML-ng Advanced Options

# Multiple ML searches (find global optimum)
# --tree pars{10}: 10 starting parsimony trees recommended for thorough search
raxml-ng --msa alignment.fasta --model GTR+G --tree pars{10} --prefix ml_search

# Constrained tree search
raxml-ng --msa alignment.fasta --model GTR+G --tree-constraint constraint.tre

# Site likelihoods for topology tests
raxml-ng --sitelh --msa alignment.fasta --model GTR+G --tree candidate.tre

Comparing IQ-TREE2 vs RAxML-ng

FeatureIQ-TREE2RAxML-ng
Model selectionBuilt-in ModelFinderExternal (ModelTest-NG)
Ultrafast bootstrapYes (UFBoot2)No
Standard bootstrapYesYes
Partition modelsExtensiveGood
SpeedFaster for UFBootFaster for standard BS
MemoryLowerHigher
CheckpointingYesYes

Large Dataset Strategies

# IQ-TREE2 with reduced memory
iqtree2 -s large.fasta -m GTR+G -B 1000 -T 4 -mem 8G

# Use approximate NNI search
iqtree2 -s large.fasta -m GTR+G -B 1000 -fast

# RAxML-ng with parsimony starting trees
raxml-ng --msa large.fasta --model GTR+G --tree pars{5} --threads 8

Tree Topology Tests

# IQ-TREE2: AU test comparing trees
iqtree2 -s alignment.fasta -m GTR+G -z trees.nwk -n 0 -zb 10000 -au

# Output interpretation:
# p-AU < 0.05: Reject tree
# p-AU >= 0.05: Cannot reject tree

Constrained Analysis

# IQ-TREE2: Enforce monophyly constraint
iqtree2 -s alignment.fasta -m GTR+G -g constraint.tre -B 1000

# Constraint file format (Newick with taxa to constrain):
# ((Human,Chimp),Gorilla);

Complete Workflow Example

# 1. Check alignment
iqtree2 -s alignment.fasta -m GTR+G -n 0

# 2. Find best model
iqtree2 -s alignment.fasta -m MF -T AUTO

# 3. Full analysis with best model
iqtree2 -s alignment.fasta -m GTR+I+G4 -B 1000 -alrt 1000 -T AUTO

# 4. Visualize result
cat alignment.fasta.treefile

Resuming Interrupted Runs

# IQ-TREE2: Resume from checkpoint
iqtree2 -s alignment.fasta -m GTR+G -B 1000 --redo-tree

# RAxML-ng: Resume
raxml-ng --msa alignment.fasta --model GTR+G --redo

Reproducibility

# Set random seed for reproducible results
# seed=12345: Any fixed seed ensures reproducibility across runs
iqtree2 -s alignment.fasta -m GTR+G -B 1000 --seed 12345

raxml-ng --msa alignment.fasta --model GTR+G --seed 12345 --bs-trees 100

Related Skills

  • tree-io - Read and convert output tree files
  • tree-visualization - Visualize trees with bootstrap support
  • distance-calculations - Compare with distance-based methods
  • alignment/alignment-io - Prepare alignments for tree inference
<!-- AUTHOR_SIGNATURE: 9a7f3c2e-MD-BABU-MIA-2026-MSSM-SECURE -->