Visualize gene structure with exon-intron diagrams, domain annotations, and mutation position markers. Produces SVG, PNG, or PDF figures suitable for publication from a gene symbol input.
git clone https://github.com/openclaw/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/aipoch-ai/gene-structure-mapper" ~/.claude/skills/openclaw-skills-gene-structure-mapper && rm -rf "$T"
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/aipoch-ai/gene-structure-mapper" ~/.openclaw/skills/openclaw-skills-gene-structure-mapper && rm -rf "$T"
skills/aipoch-ai/gene-structure-mapper/SKILL.md
Gene Structure Mapper
Generate exon-intron structure diagrams for any gene symbol using the Ensembl REST API. Optionally overlay protein domain annotations (UniProt) and mark mutation hotspot positions. Outputs publication-ready SVG, PNG, or PDF figures.
✅ IMPLEMENTED: `scripts/main.py` is fully functional. Ensembl REST API lookup, caching, matplotlib visualization, `--domains`, `--mutations`, and `--demo` are all implemented.
Quick Check
    python -m py_compile scripts/main.py
    python scripts/main.py --help
    python scripts/main.py --demo --output demo.png
When to Use
- Creating gene structure figures for manuscripts or presentations
- Visualizing splice variants and isoform differences
- Marking mutation positions on a gene diagram for functional annotation
- Overlaying domain boundaries on exon-intron maps
Workflow
- Confirm the user objective, required inputs, and non-negotiable constraints before doing detailed work.
- Validate that the request matches the documented scope and stop early if the task would require unsupported assumptions.
- Use the packaged script path or the documented reasoning path with only the inputs that are actually available.
- Return a structured result that separates assumptions, deliverables, risks, and unresolved items.
- If execution fails or inputs are incomplete, switch to the fallback path and state exactly what blocked full completion.
Fallback template: If `scripts/main.py` fails or the gene symbol is unrecognized, report: (a) the failure point, (b) whether a manual Ensembl/UCSC lookup can substitute, (c) which output formats can still be generated.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `--gene` | string | Yes* | Gene symbol or Ensembl ID (e.g., `TP53`, `BRCA1`, `KRAS`) |
|  | string | No | Species name for Ensembl lookup (default: ) |
| `--format` | string | No | Output format: `svg`, `png`, or `pdf` (default: ) |
| `--output` | string | No | Output file path (default: ) |
| `--domains` | flag | No | Fetch and overlay UniProt protein domain annotations |
| `--mutations` | string | No | Comma-separated codon positions to mark (e.g., `12,13,61`) |
| `--demo` | flag | No | Use hardcoded TP53 GRCh38 data; no internet required |
*Required unless `--demo` is used.
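The "required unless `--demo`" rule can be expressed with `argparse` roughly as follows. This is a minimal sketch, not the code in `scripts/main.py`: the flag spellings are taken from the Usage examples, the species option is omitted because its flag name is not shown here, and defaults are left unset because they are not shown either. The function names `build_parser` and `parse_args` are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser with the flags that appear in the Usage examples."""
    parser = argparse.ArgumentParser(
        prog="main.py", description="Draw a gene structure diagram."
    )
    parser.add_argument("--gene", help="gene symbol or Ensembl ID")
    parser.add_argument("--format", choices=["svg", "png", "pdf"], help="output format")
    parser.add_argument("--output", help="output file path")
    parser.add_argument("--domains", action="store_true", help="overlay UniProt domains")
    parser.add_argument("--mutations", help="comma-separated codon positions")
    parser.add_argument("--demo", action="store_true", help="use built-in TP53 data")
    return parser


def parse_args(argv=None):
    """Parse arguments and enforce that --gene is given unless --demo is set."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.demo and not args.gene:
        # parser.error prints a usage message and exits with status 2.
        parser.error("--gene is required unless --demo is used")
    return args
```

Checking the rule after parsing, rather than marking `--gene` as `required=True`, is what lets `--demo` run without a gene symbol.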
Usage
    python scripts/main.py --gene TP53 --format png
    python scripts/main.py --gene BRCA1 --format png --domains --output brca1_structure.png
    python scripts/main.py --gene KRAS --mutations 12,13,61 --format pdf
    python scripts/main.py --demo
    python scripts/main.py --demo --output demo.png --format svg
Implementation Notes (for script developer)
The script must implement:
- Gene lookup: `GET https://rest.ensembl.org/lookup/symbol/homo_sapiens/{gene}?expand=1` to fetch exon coordinates. Cache the response to `.cache/{gene}_ensembl.json` to avoid repeated API calls. Add a 0.1 s delay between requests for batch lookups. The unauthenticated rate limit is 15 requests/second.
- Unknown gene handling: catch HTTP 400/404 from Ensembl and exit with code 1: `Error: Gene not found: {gene_name}. Check the gene symbol and try again.`
- SVG/PNG/PDF output: use `matplotlib` or `svgwrite` to draw exon blocks (filled rectangles) and intron lines scaled to genomic coordinates.
- `--domains` flag: fetch UniProt domain annotations and overlay colored domain blocks on the gene structure.
- `--mutations` flag: accept comma-separated codon positions; map to exon coordinates and draw vertical markers.
- `--demo` flag: use hardcoded TP53 GRCh38 exon coordinates (no internet required) to generate a demo visualization.
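The lookup, caching, and unknown-gene requirements above could fit together along these lines. This is a sketch using only the standard library, not the actual contents of `scripts/main.py`; the names `http_get_json` and `lookup_gene`, and the injectable `fetch` argument (added so the function can be exercised without network access), are assumptions made for illustration.

```python
import json
import time
import urllib.error
import urllib.request
from pathlib import Path

ENSEMBL_URL = "https://rest.ensembl.org/lookup/symbol/homo_sapiens/{gene}?expand=1"


def http_get_json(url):
    """Fetch a URL and decode its JSON body."""
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def lookup_gene(gene, cache_dir=".cache", fetch=http_get_json, delay=0.1):
    """Return the Ensembl lookup record for `gene`, reading the cache when possible."""
    cache_file = Path(cache_dir) / f"{gene}_ensembl.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    try:
        record = fetch(ENSEMBL_URL.format(gene=gene))
    except urllib.error.HTTPError as err:
        if err.code in (400, 404):
            # SystemExit with a string prints it to stderr and exits with code 1.
            raise SystemExit(
                f"Error: Gene not found: {gene}. Check the gene symbol and try again."
            )
        raise
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(record))
    time.sleep(delay)  # pause after a live request so batch runs stay under the rate limit
    return record
```

Because the delay only follows a live request, cached genes return immediately and repeated runs do not count against the rate limit.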
Known Limitations
- For genes with multiple isoforms, the script uses the canonical transcript (Ensembl `is_canonical` flag). Other isoforms are not visualized.
- Domain overlay (`--domains`) maps UniProt amino acid positions to genomic coordinates using CDS length; accuracy may vary for genes with complex splicing.
- Ensembl API responses are cached to `.cache/{gene}_ensembl.json`. Delete the cache file to force a fresh lookup.
- The unauthenticated Ensembl REST API rate limit is 15 requests/second; a 0.1 s delay is applied between batch requests.
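To make the first two limitations concrete, the sketch below picks the canonical transcript from a lookup record and maps a codon number to a genomic coordinate by walking the coding segments. It assumes the expanded Ensembl record carries a `Transcript` list whose entries have `is_canonical` and an `Exon` list with `start` and `end`. The segment-walking approach is one possible technique and may differ from the CDS-length mapping the script uses; both function names are hypothetical.

```python
def canonical_exons(record):
    """Return sorted (start, end) exon tuples for the canonical transcript."""
    transcripts = record.get("Transcript", [])
    canonical = next((t for t in transcripts if t.get("is_canonical")), None)
    if canonical is None:
        raise ValueError("no canonical transcript in record")
    return sorted((exon["start"], exon["end"]) for exon in canonical.get("Exon", []))


def codon_to_genomic(codon, cds_segments, strand=1):
    """Map a 1-based codon number to the genomic position of its first base.

    cds_segments: inclusive (start, end) genomic intervals of coding sequence.
    strand: 1 for forward, -1 for reverse (transcription runs high to low).
    """
    if codon < 1:
        raise ValueError("codon positions are 1-based")
    offset = (codon - 1) * 3  # 0-based offset into the spliced CDS
    ordered = sorted(cds_segments, reverse=(strand == -1))
    for start, end in ordered:
        length = end - start + 1
        if offset < length:
            return start + offset if strand == 1 else end - offset
        offset -= length
    raise ValueError(f"codon {codon} lies beyond the CDS")
```

Note that `cds_segments` must already be clipped to the coding region; passing raw exons would shift every position by the length of the 5' UTR.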
Features
- Exon-intron visualization scaled to genomic coordinates
- Protein domain annotation overlay via UniProt (optional, `--domains`)
- Mutation position markers with configurable labels (`--mutations`)
- Publication-ready output in SVG, PNG, or PDF
- Demo mode for offline testing (`--demo`)
- Ensembl API response caching to avoid rate-limit issues
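As an illustration of what "scaled to genomic coordinates" means for the exon-intron drawing, the sketch below converts exon intervals into SVG markup. The skill itself is described as drawing with matplotlib or svgwrite; this version writes the SVG by hand only to stay dependency-free. The function name `exons_to_svg` and the layout constants are assumptions.

```python
def exons_to_svg(exons, width=800, height=60, margin=20):
    """Render exons as rectangles on an intron baseline, scaled to the gene span."""
    gene_start = min(start for start, _ in exons)
    gene_end = max(end for _, end in exons)
    # Pixels per base pair; max() guards against a zero-length span.
    scale = (width - 2 * margin) / max(gene_end - gene_start, 1)

    def x(pos):
        """Convert a genomic position to a horizontal pixel coordinate."""
        return margin + (pos - gene_start) * scale

    mid = height / 2
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
        # One baseline across the whole gene stands in for the introns.
        f'<line x1="{x(gene_start):.1f}" y1="{mid}" x2="{x(gene_end):.1f}" y2="{mid}" stroke="black"/>',
    ]
    for start, end in sorted(exons):
        box_width = max(x(end) - x(start), 1.0)  # keep very short exons visible
        parts.append(
            f'<rect x="{x(start):.1f}" y="{mid - 10}" width="{box_width:.1f}" height="20" fill="steelblue"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)
```

A mutation marker would reuse the same `x` conversion on the genomic position of the codon and draw a vertical line at that coordinate.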
Output Requirements
Every response must make these explicit:
- Objective and deliverable
- Inputs used and assumptions introduced (e.g., genome build, transcript isoform selected)
- Workflow or decision path taken
- Core result: gene structure figure file path
- Constraints, risks, caveats (e.g., multi-isoform genes, annotation version)
- Unresolved items and next-step checks
Input Validation
This skill accepts: gene symbol inputs for structure visualization, with optional domain and mutation overlays.
If the request does not involve gene structure visualization — for example, asking to perform sequence alignment, predict protein structure, or analyze expression data — do not proceed. Instead respond:
"`gene-structure-mapper` is designed to visualize gene exon-intron structure. Your request appears to be outside this scope. Please provide a gene symbol and desired output format, or use a more appropriate tool for your task."
Error Handling
- If `--gene` is missing, state that the gene symbol is required and provide an example.
- If the gene symbol is not found in Ensembl (HTTP 400/404), print: `Error: Gene not found: {gene_name}. Check the gene symbol and try again.` and exit with code 1.
- If `--mutations` contains non-numeric values, reject with: `Error: --mutations must be comma-separated integers (codon positions).`
- If the task goes outside the documented scope, stop instead of guessing or silently widening the assignment.
- If `scripts/main.py` fails, report the failure point, summarize what still can be completed safely, and provide a manual fallback.
- Do not fabricate files, citations, data, search results, or execution outcomes.
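The `--mutations` check described above might look like the following. This is a minimal sketch, assuming the raw flag value arrives as a single string; the function name `parse_mutations` is hypothetical, and the exit status of 1 is a side effect of `sys.exit` with a message, not something the rule above specifies.

```python
import sys


def parse_mutations(raw):
    """Parse '12,13,61' into [12, 13, 61], exiting with the documented message on bad input."""
    try:
        # int() tolerates surrounding whitespace but rejects empty or non-numeric tokens.
        return [int(token) for token in raw.split(",")]
    except ValueError:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit("Error: --mutations must be comma-separated integers (codon positions).")
```

For example, `parse_mutations("12,13,61")` yields a list of three integers, while `parse_mutations("12,G13")` terminates with the error message.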
Response Template
- Objective
- Inputs Received
- Assumptions
- Workflow
- Deliverable
- Risks and Limits
- Next Checks