Claude-skill-registry bio-annotation
Functional annotation and taxonomy inference from sequence homology.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/bio-annotation" ~/.claude/skills/majiayu000-claude-skill-registry-bio-annotation && rm -rf "$T"
manifest: skills/data/bio-annotation/SKILL.md
Bio Annotation
When to use
- Use for functional annotation of protein sequences and taxonomy inference based on sequence homology.
Prerequisites
- Tools installed via pixi (see pixi.toml).
- Reference DB root: /media/shared-expansion/db/ (wsu; override per machine branch).
- Input FASTA and reference DBs are readable.
Inputs
- proteins.faa (FASTA protein sequences).
- reference_db/ (eggNOG, InterPro, and DIAMOND databases, plus the NCBI taxdump used by TaxonKit).
Outputs
- results/bio-annotation/annotations.parquet
- results/bio-annotation/taxonomy.parquet
- results/bio-annotation/annotation_report.md
- results/bio-annotation/logs/
Steps
- Run InterProScan for domain/family annotation.
- Run eggnog-mapper for orthology-based annotation.
- Run DIAMOND for homology search, then resolve the taxonomy of the hits with TaxonKit.
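The hand-off from DIAMOND to TaxonKit can be sketched as a best-hit selection over DIAMOND's tabular output. This is a minimal illustration, not the skill's actual implementation: the four-column layout (qseqid, sseqid, bitscore, staxids), the function name `best_hits`, and the choice to keep only the first taxid of a multi-taxid hit are all assumptions made for the example.

```python
import csv
import io
from typing import Dict, TextIO, Tuple


def best_hits(tsv: TextIO) -> Dict[str, Tuple[str, float, str]]:
    """Return {query: (subject, bitscore, taxid)}, keeping the top bitscore per query.

    Assumed (hypothetical) column order: qseqid, sseqid, bitscore, staxids.
    """
    best: Dict[str, Tuple[str, float, str]] = {}
    for row in csv.reader(tsv, delimiter="\t"):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        query, subject, bitscore = row[0], row[1], float(row[2])
        # staxids may list several IDs separated by ';'; taking the first
        # one is a simplification made for this sketch.
        taxid = row[3].split(";")[0]
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore, taxid)
    return best


# Tiny in-memory stand-in for a DIAMOND result file.
example = io.StringIO(
    "p1\tsubjA\t120.5\t562\n"
    "p1\tsubjB\t250.0\t9606;10090\n"
    "p2\tsubjC\t88.1\t4932\n"
)
hits = best_hits(example)
```

The resulting taxids could then be written one per line and passed to TaxonKit for lineage resolution against the taxdump.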
QC gates
- Annotation hit rate and taxonomy rank coverage meet project thresholds.
- On failure: retry with alternative parameters; if the gates still fail, record the failure in the report and exit with a non-zero status.
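The gate-and-retry behavior above might look roughly like the following. The threshold values, the function names, and the idea of passing each parameter set as a callable that returns (total, annotated, ranked) counts are all hypothetical; the source only refers to "project thresholds".

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical thresholds; the skill itself only says "project thresholds".
MIN_HIT_RATE = 0.50
MIN_RANK_COVERAGE = 0.50


def check_gates(total: int, annotated: int, ranked: int) -> List[str]:
    """Return failure messages; an empty list means both gates pass."""
    if total <= 0:
        return ["no input proteins"]
    failures = []
    hit_rate = annotated / total
    rank_cov = ranked / total
    if hit_rate < MIN_HIT_RATE:
        failures.append(f"annotation hit rate {hit_rate:.2f} < {MIN_HIT_RATE:.2f}")
    if rank_cov < MIN_RANK_COVERAGE:
        failures.append(f"taxonomy rank coverage {rank_cov:.2f} < {MIN_RANK_COVERAGE:.2f}")
    return failures


def run_with_retries(
    attempts: Iterable[Callable[[], Tuple[int, int, int]]]
) -> Tuple[int, List[str]]:
    """Try each parameter set in order; return (exit_code, report lines)."""
    report: List[str] = []
    for i, attempt in enumerate(attempts, start=1):
        failures = check_gates(*attempt())
        if not failures:
            report.append(f"attempt {i}: QC gates passed")
            return 0, report
        report.extend(f"attempt {i}: {msg}" for msg in failures)
    return 1, report  # every attempt failed: caller should exit non-zero


# Stand-in attempts: the first fails the hit-rate gate, the second passes.
code, report = run_with_retries([lambda: (100, 30, 80), lambda: (100, 70, 80)])
```

A caller would append `report` to the annotation report and finish with `sys.exit(code)`, so a run that never passes the gates still leaves a record before exiting non-zero.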
Validation
- Verify proteins.faa is non-empty and contains amino acid (not nucleotide) sequences.
- Verify required reference DBs exist under the reference root.
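These two checks can be sketched with the standard library alone. The nucleotide test below is a heuristic chosen for this example (flag the file if more than 90% of residues are A, C, G, T, U, or N), and the database subdirectory names `eggnog` and `diamond` are hypothetical; the real layout under the reference root may differ.

```python
import tempfile
from pathlib import Path
from typing import List

# Standard amino acids plus common ambiguity codes and the stop symbol.
PROTEIN_ALPHABET = set("ACDEFGHIKLMNPQRSTVWYBXZJUO*")
NUCLEOTIDE_LETTERS = "ACGTUN"


def validate_fasta(path: Path) -> List[str]:
    """Return error messages for an empty, malformed, or nucleotide-like FASTA."""
    if not path.is_file() or path.stat().st_size == 0:
        return [f"{path} is missing or empty"]
    n_records = 0
    chunks = []
    with path.open() as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                n_records += 1
            elif line:
                chunks.append(line.upper())
    seq = "".join(chunks)
    if n_records == 0 or not seq:
        return [f"{path} contains no sequences"]
    errors = []
    bad = sorted(set(seq) - PROTEIN_ALPHABET)
    if bad:
        errors.append(f"{path} has non-amino-acid characters: {''.join(bad)}")
    # Heuristic (an assumption of this sketch): mostly ACGTUN means nucleotides.
    nuc_fraction = sum(seq.count(c) for c in NUCLEOTIDE_LETTERS) / len(seq)
    if nuc_fraction > 0.9:
        errors.append(f"{path} looks like nucleotide sequence")
    return errors


def validate_dbs(root: Path, required: List[str]) -> List[str]:
    """Return one message per required DB path missing under the reference root."""
    return [f"missing reference DB: {root / name}" for name in required
            if not (root / name).exists()]


# Demonstration in a throwaway directory with stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "proteins.faa").write_text(">p1\nMKTAYIAKQRQISFVKSHFSRQ\n")
    (root / "dna.fna").write_text(">d1\nACGTACGTTTGACA\n")
    (root / "eggnog").mkdir()
    protein_errors = validate_fasta(root / "proteins.faa")
    dna_errors = validate_fasta(root / "dna.fna")
    db_errors = validate_dbs(root, ["eggnog", "diamond"])
```

Returning lists of messages rather than raising lets the caller collect every validation problem in one pass before deciding whether to stop.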
Tools
- interproscan v6.0.0
- eggnog-mapper v2.1.13
- diamond v2.1.16
- taxonkit v0.20.0
Paper summaries (2023-2025)
- summaries/ (include example use cases and tool settings used)
Tool documentation
- InterProScan - Domain and family annotation
- eggNOG-mapper - Orthology-based functional annotation
- DIAMOND - Fast sequence homology search
- TaxonKit - Taxonomy resolution and manipulation
References
- See ../bio-skills-references.md