Claude-skill-registry bio-structure-annotation

Structure prediction and structure-based annotation.

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/bio-structure-annotation" ~/.claude/skills/majiayu000-claude-skill-registry-bio-structure-annotation && rm -rf "$T"
manifest: skills/data/bio-structure-annotation/SKILL.md
source content

Bio Structure Annotation

When to use

  • Structure prediction and structure-based annotation.

Prerequisites

  • Tools installed via pixi (see pixi.toml).
  • Reference DB root: /media/shared-expansion/db/ (wsu; override per machine branch).
  • Protein FASTA inputs are available.

Inputs

  • proteins.faa (FASTA protein sequences)

Outputs

  • results/bio-structure-annotation/structures/
  • results/bio-structure-annotation/structure_hits.tsv
  • results/bio-structure-annotation/structure_report.md
  • results/bio-structure-annotation/logs/

Steps

  1. Run fast embedding screen (tm-vec).
  2. Predict structures (boltz or colabfold) as needed.
  3. Search structures with Foldseek and annotate hits.

QC gates

  • Prediction success rate meets project thresholds.
  • Search hit thresholds meet project thresholds.
  • On failure: retry with alternative parameters; if still failing, record in report and exit non-zero.

Validation

  • Verify proteins.faa is non-empty and amino acid encoded.
  • Verify Foldseek databases exist under the reference root.

Tools

  • tm-vec v1.0.3
  • boltz v2.2.1
  • colabfold v1.5.5
  • foldseek v10-941cd33

Paper summaries (2023-2025)

  • summaries/ (include example use cases and tool settings used)

Tool documentation

  • TM-Vec - Fast protein structure embedding and similarity search
  • Boltz - AI-based protein structure prediction
  • ColabFold - Fast protein structure prediction using AlphaFold2
  • Foldseek - Fast structure-based protein search

References

  • See ../bio-skills-references.md