BioClaw query-ensembl
Query Ensembl for genomic data. Use when user asks about gene coordinates, genomic sequences, variants, gene structure, exons, transcripts, or species comparison. Triggers on "ensembl", "gene coordinates", "genomic location", "exon", "transcript", "variant location", "rsid", "rs number".
install
source · Clone the upstream repo
git clone https://github.com/Runchuan-BU/BioClaw
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Runchuan-BU/BioClaw "$T" && mkdir -p ~/.claude/skills && cp -r "$T/container/skills/query-ensembl" ~/.claude/skills/runchuan-bu-bioclaw-query-ensembl && rm -rf "$T"
manifest:
container/skills/query-ensembl/SKILL.md
Ensembl REST API Query
Query the Ensembl REST API for genomic annotations, sequences, and variants.
When to Use
- User asks about a gene's genomic location, exons, or transcripts
- User wants to look up an rsID or variant
- User needs genomic/cDNA/protein sequences
- User asks about gene structure or regulatory features
- User wants cross-species gene information
How to Execute
```python
import requests
import json

BASE_URL = "https://rest.ensembl.org"
HEADERS = {"Content-Type": "application/json", "Accept": "application/json"}


# 1. Gene lookup by symbol
def lookup_gene(symbol, species="homo_sapiens"):
    url = f"{BASE_URL}/lookup/symbol/{species}/{symbol}"
    r = requests.get(url, headers=HEADERS, params={"expand": 1})
    r.raise_for_status()
    return r.json()


# 2. Get sequence
def get_sequence(ensembl_id, seq_type="genomic"):
    url = f"{BASE_URL}/sequence/id/{ensembl_id}"
    r = requests.get(url, headers=HEADERS, params={"type": seq_type})
    r.raise_for_status()
    return r.json()


# 3. Variant lookup by rsID
def lookup_variant(rsid, species="homo_sapiens"):
    url = f"{BASE_URL}/variation/{species}/{rsid}"
    r = requests.get(url, headers=HEADERS)
    r.raise_for_status()
    return r.json()


# 4. Get overlapping features in a region
def overlap_region(species, chrom, start, end, feature="gene"):
    url = f"{BASE_URL}/overlap/region/{species}/{chrom}:{start}-{end}"
    r = requests.get(url, headers=HEADERS, params={"feature": feature})
    r.raise_for_status()
    return r.json()


# 5. Cross-species homologs
def get_homologs(ensembl_id, target_species=None):
    url = f"{BASE_URL}/homology/id/{ensembl_id}"
    params = {}
    if target_species:
        params["target_species"] = target_species
    r = requests.get(url, headers=HEADERS, params=params)
    r.raise_for_status()
    return r.json()


# Example: look up BRCA2
gene = lookup_gene("BRCA2")
print(f"Gene: {gene['display_name']}")
print(f"Ensembl ID: {gene['id']}")
print(f"Location: chr{gene['seq_region_name']}:{gene['start']}-{gene['end']}")
print(f"Strand: {'+' if gene['strand'] == 1 else '-'}")
print(f"Biotype: {gene['biotype']}")
print(f"Description: {gene.get('description', 'N/A')}")
```
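The lookup and sequence helpers can be chained to go from a gene symbol to a protein sequence. The sketch below picks the canonical transcript out of a `lookup_gene` response. It assumes the expanded response carries a `Transcript` list whose entries have an `is_canonical` flag and, for coding transcripts, a `Translation` object; check this against a live response before relying on it. `canonical_transcript` and `protein_id` are hypothetical helpers, not part of the skill, and the example dictionary is a hand-written stand-in with placeholder IDs.

```python
def canonical_transcript(gene):
    """Return the canonical transcript dict from an expanded lookup, or None."""
    for tx in gene.get("Transcript", []):
        if tx.get("is_canonical") == 1:
            return tx
    return None


def protein_id(transcript):
    """Return the translation ID of a transcript, or None if it has none."""
    if transcript is None:
        return None
    return transcript.get("Translation", {}).get("id")


# Hand-written stand-in for a lookup_gene(...) response; IDs are placeholders.
example_gene = {
    "id": "GENE_ID",
    "Transcript": [
        {"id": "TX_1", "is_canonical": 0},
        {"id": "TX_2", "is_canonical": 1, "Translation": {"id": "PROT_2"}},
    ],
}

tx = canonical_transcript(example_gene)
print(tx["id"], protein_id(tx))
```

With a live response, `get_sequence(protein_id(tx), seq_type="protein")` would then fetch the amino-acid sequence.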
Key Endpoints
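| Endpoint | Use |
|---|---|
| `/lookup/symbol/{species}/{symbol}` | Gene info by symbol |
| `/lookup/id/{id}` | Info by Ensembl ID |
| `/sequence/id/{id}` | Get sequence |
| `/variation/{species}/{rsid}` | Variant info |
| `/overlap/region/{species}/{region}` | Features in region |
| `/homology/id/{id}` | Orthologs/paralogs |
| `/vep/{species}/id/{id}` | Variant effect prediction |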
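The functions in How to Execute do not cover the lookup-by-ID and variant effect prediction rows. A minimal sketch of both follows, using only the standard library so it runs without extra dependencies. The helper names are hypothetical, and the paths `/lookup/id/{id}` and `/vep/{species}/id/{rsid}` are assumed from the public Ensembl REST documentation rather than taken from this skill's code. The `most_severe_consequence` field is likewise assumed from the documented VEP output.

```python
import json
import urllib.request

BASE_URL = "https://rest.ensembl.org"


def lookup_id_url(ensembl_id):
    """Build the URL for looking up any Ensembl stable ID (gene, transcript, ...)."""
    return f"{BASE_URL}/lookup/id/{ensembl_id}"


def vep_id_url(rsid, species="homo_sapiens"):
    """Build the URL for variant effect prediction on a known variant ID."""
    return f"{BASE_URL}/vep/{species}/id/{rsid}"


def fetch_json(url, timeout=30):
    """GET a URL and decode the JSON body; urlopen raises HTTPError on 4xx/5xx."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def most_severe(vep_results):
    """Collect the most_severe_consequence of each VEP result (None if absent)."""
    return [result.get("most_severe_consequence") for result in vep_results]


# Live usage (requires network access):
#   results = fetch_json(vep_id_url("rs699"))
#   print(most_severe(results))
```

Keeping URL construction separate from the network call makes the paths easy to inspect and test offline; swapping `fetch_json` for the `requests`-based pattern used elsewhere in this skill should be equivalent.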
Notes
- Region queries max 4,900,000 bp
- Species: `homo_sapiens`, `mus_musculus`, `danio_rerio`, `drosophila_melanogaster`
- Always use the `application/json` Accept header
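
Because of the region size limit, longer intervals have to be queried in pieces. One way to do that, sketched here with a hypothetical `split_region` helper, is to cut the interval into consecutive 1-based inclusive windows of at most 4,900,000 bp and call `overlap_region` on each. Features that span a window boundary will come back more than once, so the sketch also de-duplicates by an `id` field, which is assumed to be present on each returned feature.

```python
MAX_REGION_BP = 4_900_000  # limit stated in the notes above


def split_region(start, end, max_len=MAX_REGION_BP):
    """Split [start, end] (1-based, inclusive) into windows of at most max_len bp."""
    if start > end:
        raise ValueError("start must be <= end")
    windows = []
    pos = start
    while pos <= end:
        stop = min(pos + max_len - 1, end)
        windows.append((pos, stop))
        pos = stop + 1
    return windows


def dedupe_features(batches):
    """Merge per-window feature lists, keeping the first copy of each 'id'."""
    seen = set()
    merged = []
    for batch in batches:
        for feat in batch:
            key = feat.get("id")
            if key is not None:
                if key in seen:
                    continue
                seen.add(key)
            merged.append(feat)  # features without an id are always kept
    return merged


print(split_region(1, 10_000_000))
# → [(1, 4900000), (4900001, 9800000), (9800001, 10000000)]
```

Combined with the skill's own function, a long interval could be fetched as `dedupe_features(overlap_region("homo_sapiens", "13", s, e) for s, e in split_region(start, end))`.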
Follow-up Suggestions
- "Want me to get the protein sequence for this gene?"
- "Should I check for known pathogenic variants?"
- "Want me to find orthologs in mouse?"