BioClaw blast-search

Run BLAST sequence similarity searches. Use when the user asks to BLAST a sequence, find similar sequences, identify a gene/protein, or do homology search. Triggers on "blast", "sequence similarity", "homology", "identify sequence".

install

source · Clone the upstream repo

git clone https://github.com/Runchuan-BU/BioClaw

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/Runchuan-BU/BioClaw "$T" && mkdir -p ~/.claude/skills && cp -r "$T/container/skills/blast-search" ~/.claude/skills/runchuan-bu-bioclaw-blast-search && rm -rf "$T"

manifest: container/skills/blast-search/SKILL.md

source content

BLAST Search

Run NCBI BLAST+ searches inside the BioClaw container.

When to Use

- The user provides a DNA, RNA, or protein sequence and wants to find similar sequences
- The user asks to identify an unknown sequence
- The user wants to check sequence conservation across species

How to Execute

1. Determine BLAST program

| Query | Database | Program |
|---|---|---|
| Nucleotide query | Nucleotide DB | `blastn` |
| Protein query | Protein DB | `blastp` |
| Nucleotide query | Protein DB | `blastx` |
| Protein query | Nucleotide DB | `tblastn` |
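The table above can be expressed as a small lookup. The following is a minimal sketch: `guess_molecule_type` and `choose_blast_program` are hypothetical helpers (not part of BioClaw or BLAST+), and the 90% alphabet threshold is an assumption that can misclassify short or highly ambiguous sequences, so confirm the type with the user when in doubt.

```python
# Letters treated as nucleotide codes (assumption: A/C/G/T/U plus N for unknown bases)
NUCLEOTIDE_CHARS = set("ACGTUN")

# (query type, database type) -> BLAST program, mirroring the table above
PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",
    ("protein", "nucleotide"): "tblastn",
}


def guess_molecule_type(sequence: str) -> str:
    """Heuristic: call it nucleotide if at least 90% of letters are nucleotide codes."""
    letters = [c for c in sequence.upper() if c.isalpha()]
    if not letters:
        raise ValueError("empty sequence")
    fraction = sum(c in NUCLEOTIDE_CHARS for c in letters) / len(letters)
    return "nucleotide" if fraction >= 0.9 else "protein"


def choose_blast_program(query_type: str, db_type: str) -> str:
    """Look up the BLAST program for a query/database type pair."""
    return PROGRAMS[(query_type, db_type)]
```

For example, `choose_blast_program(guess_molecule_type(seq), "protein")` selects `blastx` for a nucleotide query searched against a protein database.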

2. For local BLAST (sequences provided by user)

# Create query file
cat > /tmp/query.fa << 'EOF'
>query_sequence
ATGCGATCGATCGATCG...
EOF

# Create subject file (if user provides reference)
cat > /tmp/subject.fa << 'EOF'
>reference
ATGCGATCGATCGATCG...
EOF

# Run BLAST
blastn -query /tmp/query.fa -subject /tmp/subject.fa -outfmt 6 -evalue 1e-5
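With `-outfmt 6`, BLAST+ prints one tab-separated line per HSP using twelve default columns. A minimal sketch for turning that text into dictionaries is shown here; `parse_outfmt6` is a hypothetical helper name, and it assumes the default column set (no custom format specifiers were passed to `-outfmt`).

```python
import csv
import io

# Default column order for BLAST+ tabular output (-outfmt 6)
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
FLOAT_COLUMNS = ("pident", "evalue", "bitscore")
INT_COLUMNS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")


def parse_outfmt6(text: str) -> list[dict]:
    """Parse default -outfmt 6 text into a list of dicts, one per HSP line."""
    hits = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        # Skip blank lines and comment lines (comments appear with -outfmt 7)
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(OUTFMT6_COLUMNS, row))
        for key in FLOAT_COLUMNS:
            hit[key] = float(hit[key])
        for key in INT_COLUMNS:
            hit[key] = int(hit[key])
        hits.append(hit)
    return hits
```

The function takes the captured standard output of the `blastn` command above as its `text` argument.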

3. For remote BLAST (against NCBI databases)

Use Biopython's `NCBIWWW` module, which submits the query to NCBI over the network:

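from Bio.Blast import NCBIWWW, NCBIXML

# Query sequence (a raw sequence string or FASTA-formatted text)
sequence = "ATGCGATCGATCGATCG..."

# Submit the search to NCBI; this needs network access and can take several minutes.
# hitlist_size caps the number of hits returned to match the top-10 report below.
result_handle = NCBIWWW.qblast("blastn", "nt", sequence, hitlist_size=10)
blast_records = NCBIXML.parse(result_handle)

# Report each hit's title, then the score, E-value, and identity of every HSP
for record in blast_records:
    for alignment in record.alignments[:10]:
        print(f"Title: {alignment.title}")
        for hsp in alignment.hsps:
            print(f"  Score: {hsp.score}, E-value: {hsp.expect}")
            print(f"  Identity: {hsp.identities}/{hsp.align_length} ({hsp.identities/hsp.align_length*100:.1f}%)")

# Release the handle once all records have been consumed
result_handle.close()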
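When the hits need to be reused rather than printed, the same record attributes can be collected into plain dictionaries. This is a minimal sketch: `summarize_hits` is a hypothetical helper, and choosing the lowest-E-value HSP to represent each alignment is an assumption, not something the skill prescribes.

```python
def summarize_hits(record, max_hits=10):
    """Return one summary dict per alignment, based on its lowest-E-value HSP.

    `record` is expected to expose the attributes used in the loop above:
    record.alignments, alignment.title, alignment.hsps, and on each HSP
    expect, identities, and align_length.
    """
    summaries = []
    for alignment in record.alignments[:max_hits]:
        if not alignment.hsps:
            continue  # nothing to report for this alignment
        best = min(alignment.hsps, key=lambda hsp: hsp.expect)
        summaries.append({
            "title": alignment.title,
            "evalue": best.expect,
            "identity_pct": 100.0 * best.identities / best.align_length,
        })
    return summaries
```

Each record yielded by `NCBIXML.parse` can be passed to this function directly.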

4. Output format

Present the top hits as a clear, ranked list:

*BLAST Results (top 10 hits)*

• Hit 1: Homo sapiens TP53 gene (98.5% identity, E=1e-45)
• Hit 2: Mus musculus Trp53 gene (89.2% identity, E=1e-38)
...
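The layout above could be produced by a small formatter. This is a sketch under stated assumptions: `format_hits` is a hypothetical helper, and the input is assumed to be a list of dictionaries with `title`, `identity_pct`, and `evalue` keys, already sorted best hit first.

```python
def format_hits(hits, max_hits=10):
    """Render hit dicts as the bulleted summary shown above."""
    shown = hits[:max_hits]
    lines = [f"*BLAST Results (top {len(shown)} hits)*", ""]
    for rank, hit in enumerate(shown, start=1):
        lines.append(
            f"• Hit {rank}: {hit['title']} "
            f"({hit['identity_pct']:.1f}% identity, E={hit['evalue']:.0e})"
        )
    return "\n".join(lines)
```

The header counts the hits actually shown, so a search that returns only three hits is not labelled "top 10".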

5. Follow-up suggestions

After showing results, suggest:

- Multiple sequence alignment of the top hits
- Phylogenetic analysis
- Domain/motif analysis of the query
- Structural comparison, if the query is a protein