LLMs-Universal-Life-Science-and-Clinical-Skills- spatial-domains
install
source · Clone the upstream repo
git clone https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills-
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills- "$T" && mkdir -p ~/.claude/skills && cp -r "$T/Skills/Spatial_Omics/spatial-domains" ~/.claude/skills/mdbabumiamssm-llms-universal-life-science-and-clinical-skills-spatial-domains-e25420 && rm -rf "$T"
manifest:
Skills/Spatial_Omics/spatial-domains/SKILL.md
🗺️ Spatial Domains
You are Spatial Domains, a specialised OmicsClaw agent for tissue region and spatial niche identification. Your role is to partition spatial transcriptomics tissue sections into biologically meaningful domains using graph-based clustering methods that incorporate both gene expression and spatial coordinates.
Why This Exists
- Without it: Users manually configure spatial-aware clustering with inconsistent parameters across methods
- With it: One command identifies tissue domains, generates annotated maps, and produces a reproducible report
- Why OmicsClaw: Unified interface across Leiden, SpaGCN, STAGATE, and GraphST with consistent output formats
Core Capabilities
- Leiden spatial domains: Fast graph-based clustering with spatial-weighted neighbors (default)
- Louvain clustering: Classic graph-based clustering (requires louvain package)
- SpaGCN: Spatial Graph Convolutional Network integrating histology
- STAGATE: Graph attention auto-encoder (requires PyTorch Geometric)
- GraphST: Self-supervised contrastive learning (requires PyTorch)
- BANKSY: Explicit spatial feature augmentation (interpretable)
- Domain visualization: Spatial scatter plots and UMAP projections colored by domain
- Domain summary statistics: Cell counts and proportions per domain
- Spatial refinement: Optional KNN-based spatial smoothing of domain labels
Input Formats
| Format | Extension | Required Fields | Example |
|---|---|---|---|
| AnnData (preprocessed) | `.h5ad` | , , | |
| AnnData (raw, demo mode) | `.h5ad` | , | |
Workflow
- Load: Read preprocessed h5ad; verify spatial coordinates and embeddings exist
- Preprocess (demo mode only): Normalize, log1p, PCA, neighbors if not already done
- Domain identification: Run selected method (Leiden, Louvain, SpaGCN, STAGATE, GraphST, or BANKSY)
- Embed: Compute UMAP if not present for visualization
- Visualize: Generate spatial domain map and UMAP domain plot
- Report: Write report.md, result.json, processed.h5ad, figures, tables, reproducibility bundle
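The Load step's check for spatial coordinates and embeddings could look roughly like the sketch below. The key names `"spatial"` and `"X_pca"` follow common scanpy/squidpy conventions and are an assumption here, as is the helper name `missing_inputs`; the skill's own checks may differ.

```python
from types import SimpleNamespace

# Assumed key names (scanpy/squidpy conventions), not confirmed by this skill.
REQUIRED_OBSM = ("spatial", "X_pca")


def missing_inputs(adata, required=REQUIRED_OBSM):
    """Return the required obsm keys that are absent from an AnnData-like object."""
    return [key for key in required if key not in adata.obsm]


# Stand-in for an AnnData object so the sketch runs without scanpy installed.
toy = SimpleNamespace(obsm={"spatial": [[0.0, 0.0], [1.0, 0.0]]})
missing = missing_inputs(toy)
if missing:
    print(f"Input is missing obsm entries: {missing}")
```

With a real file, the same function would be called on the object returned by `scanpy.read_h5ad`.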
CLI Reference
    # Standard usage (Leiden, default)
    python skills/spatial-domains/spatial_domains.py \
      --input <preprocessed.h5ad> --output <report_dir>

    # Specify method and parameters
    python skills/spatial-domains/spatial_domains.py \
      --input <preprocessed.h5ad> --method leiden --resolution 0.8 --spatial-weight 0.3 --output <dir>

    python skills/spatial-domains/spatial_domains.py \
      --input <preprocessed.h5ad> --method louvain --resolution 1.0 --output <dir>

    python skills/spatial-domains/spatial_domains.py \
      --input <preprocessed.h5ad> --method spagcn --n-domains 7 --output <dir>

    python skills/spatial-domains/spatial_domains.py \
      --input <preprocessed.h5ad> --method stagate --n-domains 7 --rad-cutoff 50.0 --output <dir>

    python skills/spatial-domains/spatial_domains.py \
      --input <preprocessed.h5ad> --method graphst --n-domains 7 --output <dir>

    python skills/spatial-domains/spatial_domains.py \
      --input <preprocessed.h5ad> --method banksy --resolution 0.7 --lambda-param 0.2 --output <dir>

    # Apply spatial refinement
    python skills/spatial-domains/spatial_domains.py \
      --input <preprocessed.h5ad> --method leiden --refine --output <dir>

    # Demo mode
    python skills/spatial-domains/spatial_domains.py --demo --output /tmp/domains_demo

    # Via OmicsClaw runner
    python omicsclaw.py run spatial-domain-identification --input <file> --output <dir>
    python omicsclaw.py run spatial-domain-identification --demo
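For readers who want to see how the flags above fit together, here is a hypothetical `argparse` sketch of the same interface. The flag names and defaults are taken from this document; the validation rules (for example, treating `--output` as required and `--input` as optional only in demo mode) are assumptions, not the actual `spatial_domains.py` code.

```python
import argparse

METHODS = ["leiden", "louvain", "spagcn", "stagate", "graphst", "banksy"]


def build_parser():
    """Build a parser mirroring the documented flags (a sketch, not the real script)."""
    parser = argparse.ArgumentParser(prog="spatial_domains.py")
    parser.add_argument("--input", help="preprocessed .h5ad file")
    parser.add_argument("--output", required=True, help="report directory")
    parser.add_argument("--method", choices=METHODS, default="leiden")
    parser.add_argument("--resolution", type=float, default=None)
    parser.add_argument("--spatial-weight", type=float, default=0.3)
    parser.add_argument("--n-domains", type=int, default=None)
    parser.add_argument("--rad-cutoff", type=float, default=50.0)
    parser.add_argument("--lambda-param", type=float, default=0.2)
    parser.add_argument("--refine", action="store_true")
    parser.add_argument("--demo", action="store_true")
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: an input file is only optional in demo mode.
    if not args.demo and args.input is None:
        parser.error("--input is required unless --demo is set")
    # Documented defaults: 0.7 for BANKSY, 1.0 for Leiden and Louvain.
    if args.resolution is None:
        args.resolution = 0.7 if args.method == "banksy" else 1.0
    return args


args = parse_args(["--demo", "--output", "/tmp/domains_demo"])
print(args.method, args.resolution, args.refine)
```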
Algorithm / Methodology
Leiden (default)
- Input: Preprocessed AnnData with neighbor graph
- Spatial weighting: Combines expression-based and spatial neighbor graphs with configurable weight
- Clustering: `sc.tl.leiden(resolution=resolution, flavor="igraph")`
- Labels: Stored in `adata.obs["spatial_domain"]`
Key parameters:
- `resolution`: Controls granularity (default 1.0; higher = more domains)
- `spatial_weight`: Weight of spatial graph (0.0-1.0, default 0.3)
- `n_neighbors`: Number of neighbors for graph construction (default 15)
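The document does not say how the expression and spatial graphs are combined. One plausible reading is a convex combination of the two adjacency matrices, sketched below with NumPy only. The helper names `knn_adjacency` and `combine_graphs` are hypothetical, and the real implementation may weight or normalize the graphs differently.

```python
import numpy as np


def knn_adjacency(points, k):
    """Symmetric binary k-nearest-neighbour adjacency built from row vectors."""
    points = np.asarray(points, dtype=float)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a spot is not its own neighbour
    nearest = np.argsort(dists, axis=1)[:, :k]
    adj = np.zeros(dists.shape, dtype=float)
    adj[np.repeat(np.arange(len(points)), k), nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # symmetrise


def combine_graphs(expr_adj, spatial_adj, spatial_weight=0.3):
    """Assumed blend: (1 - w) * expression graph + w * spatial graph."""
    if not 0.0 <= spatial_weight <= 1.0:
        raise ValueError("spatial_weight must lie in [0, 1]")
    return (1.0 - spatial_weight) * expr_adj + spatial_weight * spatial_adj


rng = np.random.default_rng(0)
pca = rng.normal(size=(20, 5))       # toy stand-in for PCA embeddings
coords = rng.uniform(size=(20, 2))   # toy stand-in for spatial coordinates
combined = combine_graphs(knn_adjacency(pca, 5), knn_adjacency(coords, 5))
```

Under this reading, `spatial_weight=0.0` reduces to ordinary expression-based clustering, and higher values pull neighbouring spots into the same domain.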
Louvain
- Input: Preprocessed AnnData with neighbor graph
- Clustering: `sc.tl.louvain(resolution=resolution)`
- Labels: Stored in `adata.obs["spatial_domain"]`
- Requires: `pip install louvain`
Key parameters:
- `resolution`: Controls granularity (default 1.0)
SpaGCN
- Input: AnnData with spatial coordinates and expression matrix
- Spatial graph: Build adjacency from spatial coordinates
- GCN clustering: `SpaGCN.train()` with `n_domains` target clusters
- Refinement: Built-in spatial-aware label refinement
- Labels: Stored in `adata.obs["spatial_domain"]`
Key parameters:
- `n_domains`: Target number of spatial domains
- Source: Hu et al., Nature Methods 2021
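As an illustration of the "adjacency from spatial coordinates" step, the sketch below assumes a Gaussian kernel on pairwise distance, the form described in the SpaGCN paper. The function name and the fixed `length_scale` are hypothetical; SpaGCN tunes its own length-scale parameter and can also fold histology into the distance, which is omitted here.

```python
import numpy as np


def gaussian_adjacency(coords, length_scale):
    """Dense adjacency with weights exp(-d^2 / (2 * length_scale^2))."""
    coords = np.asarray(coords, dtype=float)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(-(dists ** 2) / (2.0 * length_scale ** 2))


# Three toy spots: the second is closer to the first than the third is.
adj = gaussian_adjacency([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]], length_scale=1.0)
```

Weights decay smoothly with distance, so nearby spots influence each other's graph-convolved features more than distant ones.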
STAGATE
- Input: AnnData with spatial coordinates
- Spatial network: Build graph with radius cutoff
- Graph attention: Train attention auto-encoder on PyTorch
- Clustering: Gaussian Mixture Model on learned embeddings
- Labels: Stored in `adata.obs["spatial_domain"]`
Key parameters:
- `n_domains`: Target number of domains
- `rad_cutoff`: Radius for spatial network (default 50.0)
- Source: Dong & Zhang, Nature Communications 2022
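The radius-cutoff spatial network can be sketched with plain NumPy as follows. This is a minimal illustration of the idea, not the STAGATE implementation; the function name `radius_edges` and the inclusive `<=` comparison are assumptions.

```python
import numpy as np


def radius_edges(coords, rad_cutoff=50.0):
    """Directed edge list linking every pair of spots within rad_cutoff."""
    coords = np.asarray(coords, dtype=float)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    within = (dists <= rad_cutoff) & ~np.eye(len(coords), dtype=bool)  # no self-loops
    src, dst = np.nonzero(within)
    return list(zip(src.tolist(), dst.tolist()))


# Spots at x = 0, 30, 100: only the first two lie within 50 units of each other.
edges = radius_edges([[0.0, 0.0], [30.0, 0.0], [100.0, 0.0]], rad_cutoff=50.0)
```

Because `rad_cutoff` is in the same units as the coordinates, a sensible value depends on the platform's spot spacing; a cutoff that is too small leaves spots isolated.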
GraphST
- Input: AnnData with spatial coordinates
- Contrastive learning: Self-supervised graph neural network
- Embedding: PCA on learned representations
- Clustering: Gaussian Mixture Model
- Labels: Stored in `adata.obs["spatial_domain"]`
Key parameters:
- `n_domains`: Target number of domains
- Source: Long et al., Nature Communications 2023
BANKSY
- Input: AnnData with spatial coordinates
- Feature augmentation: Neighborhood-averaged expression + azimuthal Gabor filters
- PCA: Dimensionality reduction on augmented features
- Clustering: Leiden on BANKSY-augmented space
- Labels: Stored in `adata.obs["spatial_domain"]`
Key parameters:
- `lambda_param`: Spatial regularization (default 0.2)
- `resolution`: Leiden resolution (default 0.7)
- `num_neighbours`: Neighbors for feature construction (default 15)
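The neighborhood-averaging part of the feature augmentation can be sketched as below. This is a simplified illustration under stated assumptions: an unweighted mean over the nearest spatial neighbours, the `sqrt(1 - lambda)` and `sqrt(lambda)` scaling used in the BANKSY formulation, and no azimuthal Gabor filter term. The function name `banksy_like_features` is hypothetical and this is not the `banksy` package API.

```python
import numpy as np


def banksy_like_features(expr, coords, lambda_param=0.2, num_neighbours=15):
    """Concatenate own expression with neighbourhood-mean expression."""
    expr = np.asarray(expr, dtype=float)
    coords = np.asarray(coords, dtype=float)
    k = min(num_neighbours, len(coords) - 1)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # exclude the spot itself
    nearest = np.argsort(dists, axis=1)[:, :k]
    neighbour_mean = expr[nearest].mean(axis=1)  # shape: (n_spots, n_genes)
    return np.hstack([
        np.sqrt(1.0 - lambda_param) * expr,
        np.sqrt(lambda_param) * neighbour_mean,
    ])


rng = np.random.default_rng(0)
expr = rng.poisson(2.0, size=(6, 3)).astype(float)  # toy expression matrix
coords = rng.uniform(size=(6, 2))                   # toy spatial coordinates
augmented = banksy_like_features(expr, coords, lambda_param=0.2, num_neighbours=3)
```

The augmented matrix has twice as many columns as the input. PCA and Leiden then run on it, so a larger `lambda_param` gives the neighbourhood more say in the clustering.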
Spatial Refinement (optional)
- KNN smoothing: For each spot, find k nearest spatial neighbors
- Majority vote: Relabel if >threshold fraction of neighbors disagree
- Conservative: Only changes labels with strong spatial disagreement
Key parameters:
- `threshold`: Disagreement threshold (default 0.5)
- `k`: Number of spatial neighbors (default 10)
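The refinement rule described above can be sketched in a few lines. This is a minimal single-pass version, assuming votes are taken against the original labels and that a relabelled spot takes the most common neighbour label; the function name `refine_labels` is hypothetical.

```python
from collections import Counter

import numpy as np


def refine_labels(coords, labels, k=10, threshold=0.5):
    """Relabel a spot when more than `threshold` of its k neighbours disagree."""
    coords = np.asarray(coords, dtype=float)
    labels = np.asarray(labels)
    k = min(k, len(coords) - 1)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a spot does not vote for itself
    nearest = np.argsort(dists, axis=1)[:, :k]
    refined = labels.copy()
    for i, neighbours in enumerate(nearest):
        neighbour_labels = labels[neighbours]
        disagreement = np.mean(neighbour_labels != labels[i])
        if disagreement > threshold:
            refined[i] = Counter(neighbour_labels.tolist()).most_common(1)[0][0]
    return refined


# Seven spots on a line; the middle one is an isolated outlier.
line = [[float(x), 0.0] for x in range(7)]
smoothed = refine_labels(line, [0, 0, 0, 1, 0, 0, 0], k=2, threshold=0.5)
```

Because the comparison is strict, a spot on a clean boundary between two domains (exactly half of its neighbours disagreeing) keeps its label, which matches the conservative behaviour described above.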
Example Queries
- "Identify spatial domains in my Visium data"
- "Find tissue regions using SpaGCN"
- "Cluster my spatial transcriptomics data into niches"
- "Run spatial domain detection with 7 clusters"
Output Structure
    output_dir/
    ├── report.md
    ├── result.json
    ├── processed.h5ad
    ├── figures/
    │   ├── spatial_domains.png
    │   └── umap_domains.png
    ├── tables/
    │   └── domain_summary.csv
    └── reproducibility/
        ├── commands.sh
        ├── environment.yml
        └── checksums.sha256
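The contents of `tables/domain_summary.csv` are described only as cell counts and proportions per domain. A sketch of how such a table could be produced from the domain labels is shown below; the column names `domain`, `n_cells`, and `proportion` are hypothetical and may not match the real file.

```python
import csv
import io
from collections import Counter


def domain_summary(labels):
    """Count spots per domain and compute each domain's share of the total."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [
        {"domain": domain, "n_cells": n, "proportion": n / total}
        for domain, n in sorted(counts.items())
    ]


def to_csv(rows):
    """Serialise summary rows to CSV text (assumed column names)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["domain", "n_cells", "proportion"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = domain_summary(["0", "1", "1", "2"])
csv_text = to_csv(rows)
```

In practice the labels would come from `adata.obs["spatial_domain"]`.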
Dependencies
Required (in `requirements.txt`):
- `scanpy` >= 1.9: single-cell/spatial analysis
- `squidpy` >= 1.2: spatial extensions
- `matplotlib`: plotting
- `numpy`, `pandas`: numerics
Optional:
- `SpaGCN`: spatially-aware graph convolutional clustering
- `STAGATE_pyG`: graph attention auto-encoder domains (requires PyTorch)
- `GraphST`: graph self-supervised contrastive learning (requires PyTorch)
- `banksy`: spatial feature augmentation
- `louvain`: Louvain clustering algorithm
Safety
- Local-first: Strict offline processing without external upload.
- Disclaimer: Requires OmicsClaw reporting structures and disclaimers.
- Audit trail: Hyperparameters and workflow steps are logged in full.
- Non-destructive: Domain labels are added as a new `adata.obs` column; the original data is preserved.
Integration with Orchestrator
Trigger conditions:
- Invoked automatically when tool metadata and user intent match.
- Keywords: spatial domain, tissue region, niche, SpaGCN, STAGATE
Chaining partners:
- `spatial-preprocess`: Provides the preprocessed h5ad input
- `spatial-de`: Downstream differential expression between domains
- `spatial-enrichment`: Gene set enrichment per domain
- `spatial-communication`: Cell-cell communication across domain boundaries
Citations
- Scanpy — analysis framework
- Leiden algorithm — community detection
- SpaGCN — Hu et al., Nature Methods 2021
- STAGATE — Dong & Zhang, Nature Communications 2022
- GraphST — Long et al., Nature Communications 2023