claude-code science

Claude-skill-registry bio-annotation

Functional annotation and taxonomy inference from sequence homology.

install

source · Clone the upstream repo

git clone https://github.com/majiayu000/claude-skill-registry

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/bio-annotation" ~/.claude/skills/majiayu000-claude-skill-registry-bio-annotation && rm -rf "$T"

manifest: skills/data/bio-annotation/SKILL.md

tags

#sequence-analysis #functional-annotation #taxonomy-inference #bioinformatics #homology-search

source content

Bio Annotation

When to use

Use when protein sequences need functional annotation and taxonomy inference from sequence homology.

Prerequisites

Tools installed via pixi (see pixi.toml).
Reference DB root: /media/shared-expansion/db/ (the wsu default; override it on other machine branches).
Input FASTA and reference DBs are readable.
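
The reference DB prerequisite can be checked before any tool runs. The following is a minimal sketch, not the skill's own implementation: the subdirectory names in REQUIRED_DBS are hypothetical, since the source names the databases but not their layout under the reference root.

```python
import os
from pathlib import Path

# Hypothetical subdirectory names; adjust to the real layout under the DB root.
REQUIRED_DBS = ("eggnog", "interpro", "diamond", "taxdump")


def missing_reference_dbs(root, required=REQUIRED_DBS):
    """Return the required entries that are absent or unreadable under root."""
    root = Path(root)
    missing = []
    for name in required:
        path = root / name
        # Both existence and read permission matter for the prerequisite.
        if not path.exists() or not os.access(path, os.R_OK):
            missing.append(name)
    return missing
```

Calling missing_reference_dbs("/media/shared-expansion/db/") returns an empty list when every entry is present, so a non-empty result can be logged and used to stop the run early.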

Inputs

proteins.faa (FASTA protein sequences).
reference_db/ (eggNOG, InterPro, DIAMOND databases + taxdump).

Outputs

results/bio-annotation/annotations.parquet
results/bio-annotation/taxonomy.parquet
results/bio-annotation/annotation_report.md
results/bio-annotation/logs/

Steps

Run InterProScan for domain/family annotation.
Run eggnog-mapper for orthology-based annotation.
Run DIAMOND and resolve taxonomy with TaxonKit.
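
The eggnog-mapper and DIAMOND/TaxonKit steps above could be assembled as command lines along these lines. This is a sketch under assumptions: the function name, the intermediate file name, the output prefix, and the thread count are all hypothetical, and the flags should be checked against the pinned tool versions. InterProScan is left out because the source does not give its invocation.

```python
from pathlib import Path


def build_commands(faa, eggnog_dir, diamond_db, taxdump_dir, out_dir, threads=8):
    """Build (but do not run) command lines for steps 2 and 3."""
    out_dir = Path(out_dir)
    hits = out_dir / "diamond_hits.tsv"  # hypothetical intermediate file

    emapper = [
        "emapper.py",
        "-i", str(faa),
        "--itype", "proteins",
        "--data_dir", str(eggnog_dir),
        "--output_dir", str(out_dir),
        "-o", "eggnog",  # hypothetical output prefix
        "--cpu", str(threads),
    ]
    # staxids is only populated if the DIAMOND database was built with
    # taxonomy information.
    diamond = [
        "diamond", "blastp",
        "--query", str(faa),
        "--db", str(diamond_db),
        "--out", str(hits),
        "--outfmt", "6", "qseqid", "sseqid", "pident", "evalue", "bitscore", "staxids",
        "--max-target-seqs", "1",
        "--threads", str(threads),
    ]
    # Column 6 of the table above holds the taxid. A hit may list several
    # taxids separated by ';', which would need splitting before this step.
    taxonkit = [
        "taxonkit", "lineage",
        "--data-dir", str(taxdump_dir),
        "--taxid-field", "6",
        str(hits),
    ]
    return [emapper, diamond, taxonkit]
```

Each returned list can be passed to subprocess.run(cmd, check=True), with stdout and stderr redirected into results/bio-annotation/logs/.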

QC gates

Annotation hit rate and taxonomy rank coverage meet project thresholds.
On failure: retry with alternative parameters; if still failing, record in report and exit non-zero.
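
One way to express the gate and its retry policy is sketched below. The metric names, the threshold values, and the run callable are assumptions for illustration; the source only says that thresholds are set per project.

```python
def hit_rate(n_hits, n_total):
    """Fraction of input proteins with at least one annotation hit."""
    return n_hits / n_total if n_total else 0.0


def passes_gate(metrics, thresholds):
    """True when every thresholded metric meets or exceeds its minimum."""
    return all(metrics.get(name, 0.0) >= minimum
               for name, minimum in thresholds.items())


def run_with_retry(run, param_sets, thresholds):
    """Try each parameter set in order; return (exit_code, report_lines).

    run is a caller-supplied function mapping a parameter set to a dict
    of metrics. The first passing attempt yields exit code 0; if every
    attempt fails, all attempts are recorded and the exit code is 1.
    """
    report = []
    for params in param_sets:
        metrics = run(params)
        ok = passes_gate(metrics, thresholds)
        report.append(f"params={params} metrics={metrics} passed={ok}")
        if ok:
            return 0, report
    return 1, report
```

The report lines can be appended to annotation_report.md and the exit code handed to sys.exit, which matches the "record in report and exit non-zero" behaviour described above.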

Validation

Verify proteins.faa is non-empty and contains amino acid (not nucleotide) sequences.
Verify required reference DBs exist under the reference root.
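
The FASTA check can be sketched in plain Python. This is an illustrative heuristic rather than the skill's actual validator: the accepted alphabet and the 0.9 nucleotide-fraction cutoff are assumptions.

```python
# IUPAC amino acid codes plus ambiguity codes and the stop symbol.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWYBJOUXZ*")
NUCLEOTIDES = set("ACGTUN")


def validate_protein_fasta(path, max_nucleotide_fraction=0.9):
    """Return the record count, or raise ValueError if the file is
    empty, malformed, or looks like nucleotide data."""
    n_records = 0
    n_residues = 0
    n_nucleotide_like = 0
    with open(path) as handle:
        for line_no, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                n_records += 1
                continue
            if n_records == 0:
                raise ValueError(f"line {line_no}: sequence before first header")
            seq = line.upper()
            invalid = set(seq) - AMINO_ACIDS
            if invalid:
                raise ValueError(f"line {line_no}: invalid residues {sorted(invalid)}")
            n_residues += len(seq)
            n_nucleotide_like += sum(ch in NUCLEOTIDES for ch in seq)
    if n_records == 0 or n_residues == 0:
        raise ValueError("no sequences found")
    # Every nucleotide letter is also a valid amino acid code, so a file
    # made up almost entirely of ACGTUN is flagged as likely DNA/RNA.
    if n_nucleotide_like / n_residues > max_nucleotide_fraction:
        raise ValueError("input looks like nucleotide sequence")
    return n_records
```

The composition check is needed because an alphabet test alone cannot tell a nucleotide file from a protein file.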

Tools

interproscan v6.0.0
eggnog-mapper v2.1.13
diamond v2.1.16
taxonkit v0.20.0

Paper summaries (2023-2025)

summaries/ (include example use cases and tool settings used)

Tool documentation

InterProScan - Domain and family annotation
eggNOG-mapper - Orthology-based functional annotation
DIAMOND - Fast sequence homology search
TaxonKit - Taxonomy resolution and manipulation

References

See ../bio-skills-references.md