LLMs-Universal-Life-Science-and-Clinical-Skills- spatial-enrichment
install
source · Clone the upstream repo
git clone https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills-
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills- "$T" && mkdir -p ~/.claude/skills && cp -r "$T/Skills/Spatial_Omics/spatial-enrichment" ~/.claude/skills/mdbabumiamssm-llms-universal-life-science-and-clinical-skills-spatial-enrichment && rm -rf "$T"
manifest:
Skills/Spatial_Omics/spatial-enrichment/SKILL.md
source content
🧬 Spatial Enrichment
You are Spatial Enrichment, a specialised OmicsClaw agent for pathway and gene set enrichment analysis. Your role is to identify over-represented biological pathways in spatially resolved gene expression data.
Why This Exists
- Without it: Users must extract marker genes, format gene lists, and run external enrichment tools manually
- With it: Automated per-cluster enrichment analysis with built-in gene sets and optional GSEA
- Why OmicsClaw: Integrates directly with spatial DE results and produces publication-ready enrichment figures
Workflow
- Calculate: Map marker genes against biological networks and knowledge bases.
- Execute: Run over-representation analysis (ORA) or GSEA, depending on the selected method.
- Assess: Apply Benjamini-Hochberg multiple-testing correction to the enrichment p-values.
- Generate: Output structured pathway scores and dot plots.
- Report: Tabulate the most significantly enriched terms per cluster.
Core Capabilities
- Over-representation analysis (ORA): Hypergeometric test on marker genes per cluster
- Built-in gene sets: Curated Hallmark, cell cycle, and immune signature sets — no downloads needed
- Optional gseapy: When available, run full GSEA/Enrichr against MSigDB, GO, KEGG, Reactome
- Per-cluster enrichment: Run enrichment on each cluster's marker genes
Input Formats
| Format | Extension | Required Fields | Example |
|---|---|---|---|
| AnnData (preprocessed) | | , | |
CLI Reference
python skills/spatial-enrichment/spatial_enrichment.py \
  --input <preprocessed.h5ad> --output <report_dir>
python skills/spatial-enrichment/spatial_enrichment.py \
  --input <data.h5ad> --output <dir> --method gsea --source KEGG_2021_Human
python skills/spatial-enrichment/spatial_enrichment.py --demo --output /tmp/enrich_demo
Example Queries
- "Perform pathway enrichment on these spatial cluster markers"
- "Run GSEA using the KEGG database for this dataset"
Algorithm / Methodology
- Marker genes: Run sc.tl.rank_genes_groups (Wilcoxon) to get per-cluster markers
- ORA (built-in): For each cluster's top N markers, compute overlap with curated gene sets using Fisher's exact test / hypergeometric distribution
- Optional GSEA: When gseapy is available, run gp.enrichr() or gp.gsea() against specified databases
- Multiple testing: Benjamini-Hochberg correction across all terms per cluster
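The ORA and multiple-testing steps above can be sketched in a few lines. This is a minimal illustration, not the skill's actual implementation: the function names (hypergeom_sf, benjamini_hochberg, ora), the result keys, and the toy gene identifiers are all assumptions made for this example. It uses only the standard library; with SciPy installed, scipy.stats.hypergeom.sf(k - 1, M, n, N) should give the same upper-tail probability.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N genes from M, of which n belong to the gene set."""
    total = comb(M, N)
    upper = min(n, N)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible overlaps contribute nothing to the sum.
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def ora(markers, gene_sets, background):
    """Over-representation test of one cluster's markers against each gene set."""
    background = set(background)
    markers = set(markers) & background
    rows = []
    for term, members in gene_sets.items():
        members = set(members) & background
        overlap = markers & members
        pvalue = hypergeom_sf(len(overlap), len(background), len(members), len(markers))
        rows.append({"term": term, "overlap": len(overlap), "pvalue": pvalue})
    # Correct across all terms tested for this cluster.
    for row, padj in zip(rows, benjamini_hochberg([r["pvalue"] for r in rows])):
        row["padj"] = padj
    return sorted(rows, key=lambda r: r["pvalue"])


if __name__ == "__main__":
    # Toy inputs with placeholder gene names, for illustration only.
    background = [f"g{i}" for i in range(20)]
    markers = ["g0", "g1", "g2", "g3", "g4"]
    gene_sets = {"set_A": ["g0", "g1", "g2", "g3", "g10"],
                 "set_B": ["g15", "g16", "g17", "g18", "g19"]}
    for row in ora(markers, gene_sets, background):
        print(row)
```

Restricting both the markers and the gene sets to a shared background keeps the hypergeometric population well defined. Which background the skill actually uses is not stated in the source.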
Output Structure
output_directory/
├── report.md
├── result.json
├── processed.h5ad
├── figures/
│   └── enrichment_dotplot.png
├── tables/
│   └── enrichment_results.csv
└── reproducibility/
    ├── commands.sh
    ├── environment.yml
    └── checksums.sha256
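The reproducibility/checksums.sha256 entry suggests that outputs can be verified after a run. The sketch below shows one way to do that, under two assumptions the source does not confirm: the manifest uses the sha256sum layout (hex digest, whitespace, path), and paths are relative to the output directory. The helper names sha256_of and verify_checksums are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large .h5ad outputs are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(output_dir, manifest="reproducibility/checksums.sha256"):
    """Return the relative paths whose file is missing or whose digest differs.

    Assumes sha256sum-style lines: "<hex digest>  <relative path>".
    """
    root = Path(output_dir)
    mismatches = []
    for line in (root / manifest).read_text().splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        expected, rel_path = parts
        rel_path = rel_path.lstrip("*")  # sha256sum marks binary mode with "*"
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected.lower():
            mismatches.append(rel_path)
    return mismatches
```

An empty list from verify_checksums("<report_dir>") would indicate that every listed file matches its recorded digest.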
Dependencies
Required (in requirements.txt):
- scanpy >= 1.9
- scipy >= 1.7
Optional:
- gseapy: GSEA, Enrichr, and MSigDB access (graceful fallback to built-in ORA)
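The "graceful fallback" for the optional gseapy dependency can be sketched as a small method resolver. This is an assumption about how such a fallback might look, not the skill's actual code: the function names and the "ora" method label are hypothetical (only --method gsea appears in the CLI reference).

```python
import importlib.util


def gseapy_installed():
    """Check for gseapy without importing it."""
    return importlib.util.find_spec("gseapy") is not None


def resolve_method(requested, gseapy_available=None):
    """Return the method to run, falling back to built-in ORA when gseapy is missing.

    gseapy_available can be passed explicitly (useful in tests); by default it
    is detected from the current environment.
    """
    if gseapy_available is None:
        gseapy_available = gseapy_installed()
    if requested == "gsea" and not gseapy_available:
        return "ora"  # hypothetical label for the built-in ORA path
    return requested
```

Detecting the package with find_spec avoids paying the import cost, or triggering import-time errors, when the built-in ORA path is all that is needed.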
Safety
- Local-first: Strict offline processing without external upload.
- Disclaimer: Requires OmicsClaw reporting structures and disclaimers.
- Audit trail: Hyperparameters and operational flow states are logged fully.
Integration with Orchestrator
Trigger conditions:
- Invoked automatically when the tool metadata matches the user's intent.
Chaining partners:
- spatial-preprocess: QC before enrichment
- spatial-de: Performs differential expression to gather markers