BioClaw query-uniprot
Query the UniProt protein database. Use when the user asks about protein sequences, functions, annotations, domains, or protein identifiers. Triggers on "uniprot", "protein function", "protein sequence", "gene product", "protein info".
install
source · Clone the upstream repo
git clone https://github.com/Runchuan-BU/BioClaw
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Runchuan-BU/BioClaw "$T" && mkdir -p ~/.claude/skills && cp -r "$T/container/skills/query-uniprot" ~/.claude/skills/runchuan-bu-bioclaw-query-uniprot && rm -rf "$T"
manifest: container/skills/query-uniprot/SKILL.md
source content
UniProt Protein Database Query
Query the UniProt REST API for protein information.
When to Use
- User asks about a protein's function, sequence, or annotation
- User provides a gene name and wants protein info
- User needs protein accession IDs
- User asks "what does gene X do" (protein level)
How to Execute
    import requests

    BASE_URL = "https://rest.uniprot.org"

    # 1. Search by gene name (default: human, reviewed/Swiss-Prot)
    def search_uniprot(gene_name, organism_id=9606, max_results=5):
        url = f"{BASE_URL}/uniprotkb/search"
        params = {
            "query": f"gene_exact:{gene_name} AND organism_id:{organism_id} AND reviewed:true",
            "format": "json",
            "size": max_results,
            "fields": "accession,id,gene_names,protein_name,organism_name,length,cc_function,ft_domain,sequence",
        }
        # A timeout keeps the call from hanging if the API is unreachable
        r = requests.get(url, params=params, timeout=30)
        r.raise_for_status()
        return r.json()

    # 2. Get by accession ID
    def get_uniprot_entry(accession):
        url = f"{BASE_URL}/uniprotkb/{accession}.json"
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        return r.json()

    # 3. Get FASTA sequence
    def get_fasta(accession):
        url = f"{BASE_URL}/uniprotkb/{accession}.fasta"
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        return r.text

    # Example usage
    data = search_uniprot("TP53")
    for entry in data.get("results", []):
        acc = entry["primaryAccession"]
        name = entry.get("proteinDescription", {}).get("recommendedName", {}).get("fullName", {}).get("value", "N/A")
        # "genes" may be missing or empty, so fall back to a single empty record
        gene = (entry.get("genes") or [{}])[0].get("geneName", {}).get("value", "N/A")
        length = entry.get("sequence", {}).get("length", "N/A")
        # Extract function comments, skipping any without text
        functions = [
            c["texts"][0]["value"]
            for c in entry.get("comments", [])
            if c.get("commentType") == "FUNCTION" and c.get("texts")
        ]
        func_text = functions[0][:200] if functions else "N/A"
        print(f"Accession: {acc}")
        print(f"Protein: {name}")
        print(f"Gene: {gene}")
        print(f"Length: {length} aa")
        print(f"Function: {func_text}")
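The FASTA text returned by `get_fasta` is a header line starting with `>` followed by wrapped sequence lines. As a minimal sketch, a helper like the one below could split that text into header and sequence before passing it on. The name `parse_fasta` is hypothetical and not part of the original skill.

```python
def parse_fasta(fasta_text):
    """Split FASTA text into a list of (header, sequence) tuples.

    Hypothetical helper, not part of the original skill. The header is
    returned without the leading '>' and wrapped sequence lines are joined.
    """
    records = []
    header = None
    chunks = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Flush the last record
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

For a single-entry response, `parse_fasta(get_fasta(accession))[0]` would give the header and the full sequence as one string.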
Common Search Patterns
- By gene: gene_exact:BRCA1 AND organism_id:9606
- By keyword: keyword:kinase AND organism_id:9606
- By disease: cc_disease:cancer AND organism_id:9606
- By GO term: go:apoptosis AND organism_id:9606
- Species IDs: Human=9606, Mouse=10090, Rat=10116, Zebrafish=7955, Fly=7227, Yeast=559292
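The patterns above all follow the same shape: one `field:value` term joined to an `organism_id` filter with `AND`. A small builder can assemble them from the species names listed. This is a sketch only: `SPECIES_IDS` and `build_query` are hypothetical names, and quoting multi-word values in double quotes is an assumption about the query syntax, not something the original skill states.

```python
# Taxonomy IDs copied from the species list above
SPECIES_IDS = {
    "human": 9606,
    "mouse": 10090,
    "rat": 10116,
    "zebrafish": 7955,
    "fly": 7227,
    "yeast": 559292,
}


def build_query(field, value, species="human", reviewed_only=False):
    """Build a query string such as 'gene_exact:BRCA1 AND organism_id:9606'.

    Hypothetical helper, not part of the original skill.
    """
    key = species.lower()
    if key not in SPECIES_IDS:
        raise ValueError(f"Unknown species: {species!r}")
    # Assumption: values containing whitespace are wrapped in double quotes
    if any(ch.isspace() for ch in value):
        value = f'"{value}"'
    parts = [f"{field}:{value}", f"organism_id:{SPECIES_IDS[key]}"]
    if reviewed_only:
        # Same Swiss-Prot filter that search_uniprot uses
        parts.append("reviewed:true")
    return " AND ".join(parts)
```

The resulting string can be passed as the `query` parameter of the `/uniprotkb/search` request in place of the hard-coded gene query.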
Output Format
Present: Accession, protein name, gene, organism, length, function summary, and UniProt link.
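One way to produce this summary from a single search result is sketched below. It assumes the JSON keys used in the example code (`primaryAccession`, `proteinDescription`, `genes`, `sequence`, `comments`) plus an `organism.scientificName` key, and it assumes the entry page lives at `https://www.uniprot.org/uniprotkb/<accession>/entry`. The name `format_entry` is hypothetical.

```python
def format_entry(entry, max_function_chars=200):
    """Render one UniProt result dict as a plain-text summary.

    Hypothetical helper, not part of the original skill. Missing fields
    are shown as 'N/A' instead of raising.
    """
    acc = entry.get("primaryAccession", "N/A")
    name = (
        entry.get("proteinDescription", {})
        .get("recommendedName", {})
        .get("fullName", {})
        .get("value", "N/A")
    )
    genes = entry.get("genes") or [{}]
    gene = genes[0].get("geneName", {}).get("value", "N/A")
    # Assumption: the organism name is stored under organism.scientificName
    organism = entry.get("organism", {}).get("scientificName", "N/A")
    length = entry.get("sequence", {}).get("length", "N/A")
    functions = [
        c["texts"][0]["value"]
        for c in entry.get("comments", [])
        if c.get("commentType") == "FUNCTION" and c.get("texts")
    ]
    function = functions[0][:max_function_chars] if functions else "N/A"
    # Assumption: entry pages follow this URL pattern
    link = f"https://www.uniprot.org/uniprotkb/{acc}/entry" if acc != "N/A" else "N/A"
    return "\n".join([
        f"Accession: {acc}",
        f"Protein: {name}",
        f"Gene: {gene}",
        f"Organism: {organism}",
        f"Length: {length} aa",
        f"Function: {function}",
        f"Link: {link}",
    ])
```

Calling `format_entry(entry)` for each item in `data.get("results", [])` yields one block per protein, in the field order listed above.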
Follow-up Suggestions
- "Want me to get the AlphaFold structure for this protein?"
- "Should I check protein-protein interactions on STRING?"
- "Want me to BLAST this protein sequence?"