LLMs-Universal-Life-Science-and-Clinical-Skills- spatial-statistics

install

source · Clone the upstream repo

git clone https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills-

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills- "$T" && mkdir -p ~/.claude/skills && cp -r "$T/Skills/Spatial_Omics/spatial-statistics" ~/.claude/skills/mdbabumiamssm-llms-universal-life-science-and-clinical-skills-spatial-statistics-1eda32 && rm -rf "$T"

manifest: Skills/Spatial_Omics/spatial-statistics/SKILL.md

📊 Spatial Statistics

You are Spatial Statistics, the spatial autocorrelation and neighborhood analysis skill for OmicsClaw. Your role is to quantify spatial patterns in tissue sections — measuring cluster co-localisation via neighborhood enrichment, point-pattern regularity via Ripley's functions, and cell-type co-occurrence.

Why This Exists

Without it: Users manually call squidpy functions with inconsistent parameters and no structured output
With it: One command produces neighborhood enrichment heatmaps, Ripley's curves, and co-occurrence matrices with reproducible reports
Why OmicsClaw: Standardised spatial statistics ensure consistent methodology across spatial analysis pipelines

Workflow

Calculate: Map out local point processes from coordinates.
Execute: Evaluate cross-pair relationships across graph networks.
Assess: Perform Ripley's K or spatial autocorrelation permutation.
Generate: Output structured metric arrays or interaction heatmaps.
Report: Tabulate key statistical significances.

Core Capabilities

Cluster-level (require --cluster-key):

Neighborhood enrichment: Pairwise cluster co-localisation z-scores
Ripley's L function: Point-pattern analysis per cluster
Co-occurrence: Pairwise co-occurrence across distances

Gene-level (require --genes or --n-top-genes): 4. Moran's I: Global spatial autocorrelation per gene 5. Geary's C: Global spatial autocorrelation (alternative to Moran) 6. Local Moran's I (LISA): Spatial hotspots per gene 7. Getis-Ord Gi*: Local hot/cold spot detection 8. Bivariate Moran: Spatial cross-correlation between two genes

Network-level: 9. Network properties: Graph topology metrics (degree, clustering coefficient) 10. Spatial centrality: Betweenness/closeness centrality per cluster

Input Formats

Format	Extension	Required	Example
Preprocessed AnnData	`.h5ad`	Normalised, clustered, with spatial coordinates	`processed.h5ad`
Demo	n/a	`--demo` flag	Built-in via spatial-preprocess

Workflow

Load: Read preprocessed h5ad (output of spatial-preprocess)
Validate: Ensure spatial coordinates and cluster column exist; convert cluster key to categorical if needed
Spatial neighbors: Build spatial connectivity graph via
```
squidpy.gr.spatial_neighbors
```
Analyze: Run the selected analysis type (neighborhood_enrichment, ripley, or co_occurrence)
Figures: Heatmap of enrichment z-scores (for neighborhood_enrichment)
Report: Write report.md, result.json, tables/enrichment_zscore.csv, processed.h5ad, figures, reproducibility bundle

CLI Reference

# Neighborhood enrichment (default, cluster-level)
python skills/spatial-statistics/spatial_statistics.py \
  --input <processed.h5ad> --output <report_dir>

# Ripley's L function (cluster-level)
python skills/spatial-statistics/spatial_statistics.py \
  --input <processed.h5ad> --analysis-type ripley --output <dir>

# Co-occurrence analysis (cluster-level)
python skills/spatial-statistics/spatial_statistics.py \
  --input <processed.h5ad> --analysis-type co_occurrence --output <dir>

# Moran's I (gene-level)
python skills/spatial-statistics/spatial_statistics.py \
  --input <processed.h5ad> --analysis-type moran --genes "EPCAM,VIM,CD3D" --output <dir>

# Geary's C (gene-level)
python skills/spatial-statistics/spatial_statistics.py \
  --input <processed.h5ad> --analysis-type geary --n-top-genes 50 --output <dir>

# Local Moran's I / LISA (gene-level hotspots)
python skills/spatial-statistics/spatial_statistics.py \
  --input <processed.h5ad> --analysis-type local_moran --genes "EPCAM" --output <dir>

# Getis-Ord Gi* (gene-level hot/cold spots)
python skills/spatial-statistics/spatial_statistics.py \
  --input <processed.h5ad> --analysis-type getis_ord --genes "CD3D,CD8A" --output <dir>

# Bivariate Moran (gene cross-correlation)
python skills/spatial-statistics/spatial_statistics.py \
  --input <processed.h5ad> --analysis-type bivariate_moran --genes "EPCAM,VIM" --output <dir>

# Network properties
python skills/spatial-statistics/spatial_statistics.py \
  --input <processed.h5ad> --analysis-type network_properties --output <dir>

# Spatial centrality
python skills/spatial-statistics/spatial_statistics.py \
  --input <processed.h5ad> --analysis-type spatial_centrality --cluster-key leiden --output <dir>

# Demo mode
python skills/spatial-statistics/spatial_statistics.py --demo --output /tmp/spatial_stats_demo

# Via OmicsClaw runner
python omicsclaw.py run spatial-statistics --input <file> --output <dir>
python omicsclaw.py run spatial-statistics --demo

Example Queries

"Calculate Ripley's K for these specific cell types"
"Compute neighborhood enrichment between annotated clusters"

Algorithm / Methodology

Cluster-level analyses

Neighborhood Enrichment:

squidpy.gr.nhood_enrichment(adata, cluster_key)

computes z-scores by permutation testing. Positive z-scores indicate enrichment (co-localisation), negative indicate depletion.

Ripley's L Function:

squidpy.gr.ripley(adata, cluster_key, mode="L")

computes Ripley's L statistic per cluster. L(r) > r indicates clustering at distance r; L(r) < r indicates regularity/dispersion.

Co-occurrence:

squidpy.gr.co_occurrence(adata, cluster_key)

measures pairwise cluster co-occurrence across spatial distance intervals.

Gene-level analyses

Moran's I: Global spatial autocorrelation. Range: −1 (dispersion) to +1 (clustering); 0 = random.

Geary's C: Alternative autocorrelation measure. Range: 0 (clustering) to 2 (dispersion); 1 = random.

Local Moran's I (LISA): Identifies spatial hotspots (high-high) and coldspots (low-low) for individual genes.

Getis-Ord Gi*: Local hot/cold spot statistic. Positive Gi* = hotspot, negative = coldspot.

Bivariate Moran: Spatial cross-correlation between two genes.

Network-level analyses

Network properties: Degree distribution, clustering coefficient, path length from spatial graph.

Spatial centrality: Betweenness and closeness centrality per cluster.

Parameters

Parameter	Default	Description
`--analysis-type`	`neighborhood_enrichment`	Analysis type (see list above)
`--cluster-key`	`leiden`	Column in `adata.obs` for cluster-level analyses
`--genes`	(none)	Comma-separated gene names for gene-level analyses
`--n-top-genes`	(none)	Number of top variable genes for gene-level analyses

Output Structure

output_dir/
├── report.md
├── result.json
├── processed.h5ad
├── figures/
│   └── nhood_enrichment_heatmap.png   (neighborhood_enrichment only)
├── tables/
│   └── enrichment_zscore.csv          (neighborhood_enrichment only)
└── reproducibility/
    ├── commands.sh
    ├── environment.yml
    └── checksums.sha256

Dependencies

Required: squidpy >= 1.2, scanpy >= 1.9, anndata >= 0.11, matplotlib, numpy, pandas

Safety

Local-first: Strict offline processing without external upload.
Disclaimer: Requires OmicsClaw reporting structures and disclaimers.
Audit trail: Hyperparameters and operational flow states are logged fully.

Integration with Orchestrator

Trigger conditions:

Automatically invoked dynamically based on tool metadata and user intent matching.
Keywords: spatial statistics, autocorrelation, Moran, Ripley, neighborhood enrichment, spatial pattern, co-occurrence

Chaining: Expects

processed.h5ad

from spatial-preprocess as input

Citations

Squidpy — spatial analysis framework
Moran's I — spatial autocorrelation
Ripley's K/L function — point-pattern analysis
Neighborhood enrichment — squidpy methodology (Palla et al., 2022)