Claude-skill-registry bio-annotation

Functional annotation and taxonomy inference from sequence homology.

Install

  • Source (clone the upstream repo):
    git clone https://github.com/majiayu000/claude-skill-registry
  • Claude Code (install into ~/.claude/skills/):
    T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/bio-annotation" ~/.claude/skills/majiayu000-claude-skill-registry-bio-annotation && rm -rf "$T"
  • Manifest: skills/data/bio-annotation/SKILL.md

source content

Bio Annotation

When to use

  • Use this skill to assign functional annotations to protein sequences and to infer their taxonomy from sequence homology.

Prerequisites

  • Tools installed via pixi (see pixi.toml).
  • Reference DB root: /media/shared-expansion/db/ (default for wsu; override on the per-machine branch).
  • Input FASTA and reference DBs are readable.

Inputs

  • proteins.faa (FASTA protein sequences).
  • reference_db/ (eggNOG, InterPro, DIAMOND databases + taxdump).

Outputs

  • results/bio-annotation/annotations.parquet
  • results/bio-annotation/taxonomy.parquet
  • results/bio-annotation/annotation_report.md
  • results/bio-annotation/logs/

Steps

  1. Run InterProScan for domain/family annotation.
  2. Run eggnog-mapper for orthology-based annotation.
  3. Run DIAMOND against the reference database and resolve the taxonomy of the hits with TaxonKit.
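
As an illustration of step 3, the sketch below picks one taxonomy ID per query protein from DIAMOND tabular output. It is not the skill's own implementation. It assumes DIAMOND was run with a custom tabular format whose columns are qseqid, sseqid, pident, evalue, bitscore and staxids, and that the DIAMOND database was built with taxonomy data so that staxids is populated. The function name best_hit_taxids and the best-bitscore rule are assumptions made for the example.

```python
import csv
import io

# Assumed column layout of the DIAMOND output, e.g. from
#   diamond blastp --query proteins.faa --db <db> --out hits.tsv \
#       --outfmt 6 qseqid sseqid pident evalue bitscore staxids
# staxids is only filled in when the DB carries taxonomy information.
COLUMNS = ["qseqid", "sseqid", "pident", "evalue", "bitscore", "staxids"]


def best_hit_taxids(tsv_text):
    """Return {query_id: taxid} using the highest-bitscore hit per query.

    Hypothetical helper: the selection rule (best bitscore, first taxid)
    is an assumption, not something the skill prescribes.
    """
    best = {}
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed lines
        hit = dict(zip(COLUMNS, row))
        if not hit["staxids"]:
            continue  # hit without taxonomy information
        score = float(hit["bitscore"])
        # staxids may list several IDs separated by ';'; keep the first
        taxid = hit["staxids"].split(";")[0]
        query = hit["qseqid"]
        if query not in best or score > best[query][0]:
            best[query] = (score, taxid)
    return {query: taxid for query, (score, taxid) in best.items()}
```

The resulting taxonomy IDs can be written one per line and passed to taxonkit lineage, pointing --data-dir at the taxdump directory, to obtain full lineages.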

QC gates

  • Annotation hit rate and taxonomy rank coverage meet project thresholds.
  • On failure: retry with alternative parameters; if still failing, record in report and exit non-zero.
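
The gate logic above can be sketched as follows. This is a minimal illustration under stated assumptions: the metric names hit_rate and rank_coverage, the threshold values passed in, and the functions qc_gate and run_with_retry are all hypothetical, since the project thresholds are not given here.

```python
def hit_rate(n_annotated, n_total):
    """Fraction of input proteins with at least one annotation."""
    return n_annotated / n_total if n_total else 0.0


def qc_gate(metrics, thresholds):
    """Return a list of failure messages; an empty list means the gate passed."""
    failures = []
    for name, minimum in thresholds.items():
        value = metrics.get(name, 0.0)  # a missing metric counts as 0
        if value < minimum:
            failures.append(f"{name}={value:.3f} below threshold {minimum:.3f}")
    return failures


def run_with_retry(attempts, thresholds):
    """Try each attempt in order; return (exit_code, failures).

    `attempts` is a list of zero-argument callables, each returning a
    metrics dict: the default parameters first, then the alternatives.
    Exit code 0 means some attempt passed; 1 means every attempt failed,
    and `failures` holds the messages from the last attempt so they can
    be recorded in the report.
    """
    failures = ["no attempts configured"]
    for attempt in attempts:
        failures = qc_gate(attempt(), thresholds)
        if not failures:
            return 0, []
    return 1, failures
```

A caller would write the returned failure messages to annotation_report.md and pass the exit code to sys.exit.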

Validation

  • Verify that proteins.faa is non-empty and contains amino-acid sequences (not nucleotide).
  • Verify required reference DBs exist under the reference root.
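
The two checks above might look like the sketch below. The accepted residue alphabet, the nucleotide heuristic, and the function names validate_protein_fasta and missing_reference_dbs are assumptions made for illustration. The database directory names to check are left to the caller because the layout under the reference root is not specified here.

```python
from pathlib import Path

# The 20 standard residues plus common ambiguity/extension codes and stop '*'.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWYBXZJUO*")
# Letters shared with the nucleotide alphabet; a file using only these
# is treated as nucleotide data (a heuristic, not a guarantee).
NUCLEOTIDE_ONLY = set("ACGTUN")


def validate_protein_fasta(path):
    """Return the record count, or raise ValueError if the file is empty,
    malformed, or looks like DNA/RNA."""
    n_records = 0
    residues = set()
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                n_records += 1
                continue
            if n_records == 0:
                raise ValueError("sequence data before first FASTA header")
            residues.update(line.upper())
    if n_records == 0:
        raise ValueError(f"{path} contains no FASTA records")
    if not residues:
        raise ValueError(f"{path} has headers but no sequence data")
    unknown = residues - AMINO_ACIDS
    if unknown:
        raise ValueError(f"non-amino-acid characters: {sorted(unknown)}")
    if residues <= NUCLEOTIDE_ONLY:
        raise ValueError("sequences look nucleotide, not amino acid")
    return n_records


def missing_reference_dbs(root, required):
    """Return the entries of `required` that do not exist under `root`."""
    root = Path(root)
    return [name for name in required if not (root / name).exists()]
```

Running both checks before any tool is launched lets the pipeline fail fast with a clear message instead of partway through a long annotation run.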

Tools

  • interproscan v6.0.0
  • eggnog-mapper v2.1.13
  • diamond v2.1.16
  • taxonkit v0.20.0

Paper summaries (2023-2025)

  • summaries/ (include example use cases and tool settings used)

Tool documentation

References

  • See ../bio-skills-references.md