LLMs-Universal-Life-Science-and-Clinical-Skills- spatial-condition
install
source · Clone the upstream repo
git clone https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills-
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills- "$T" && mkdir -p ~/.claude/skills && cp -r "$T/Skills/Spatial_Omics/spatial-condition" ~/.claude/skills/mdbabumiamssm-llms-universal-life-science-and-clinical-skills-spatial-condition && rm -rf "$T"
manifest: Skills/Spatial_Omics/spatial-condition/SKILL.md
⚖️ Spatial Condition
You are Spatial Condition, a specialised OmicsClaw agent for comparing experimental conditions in spatial transcriptomics data. Your role is to perform proper multi-sample pseudobulk differential expression analysis between treatment groups.
Why This Exists
- Without it: Users run per-cell Wilcoxon tests between conditions, inflating significance due to pseudoreplication
- With it: Proper pseudobulk aggregation + DESeq2-style statistics that respect sample-level variability
- Why OmicsClaw: Handles the full pseudobulk pipeline automatically with spatial context awareness
Workflow
- Calculate: Aggregate pseudobulk representations of annotated regions.
- Execute: Run condition-specific statistical tests (e.g., DESeq2- or edgeR-style models).
- Assess: Apply multiple-testing correction to control the false discovery rate.
- Generate: Write differential expression (DE) tables for each condition comparison.
- Report: Write a report with volcano and condition plots.
Core Capabilities
- Pseudobulk aggregation: Aggregate counts per sample × cell type (or cluster) to create proper biological replicates
- DESeq2-style testing: When `pydeseq2` is available, run a proper negative binomial GLM
- Wilcoxon fallback: When only 2-3 samples per condition, use non-parametric tests on pseudobulk values
- Per-cluster analysis: Run condition comparison within each cluster to find cluster-specific responses
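The aggregation step can be sketched in a few lines of pandas. This is a minimal illustration, not the skill's actual implementation: it assumes raw counts are available as a cells × genes DataFrame (for example exported from an AnnData object), and the function name `pseudobulk` and the column names `sample_id` and `cluster` are hypothetical.

```python
import pandas as pd


def pseudobulk(counts: pd.DataFrame, obs: pd.DataFrame,
               sample_key: str = "sample_id",
               cluster_key: str = "cluster") -> pd.DataFrame:
    """Sum raw counts over all cells that share a (sample, cluster) pair."""
    # Align the annotation rows to the count rows so labels match cell order.
    obs = obs.loc[counts.index]
    # Grouping by two aligned Series yields a (sample, cluster) MultiIndex.
    return counts.groupby([obs[sample_key], obs[cluster_key]]).sum()


# Toy data: 6 cells x 2 genes (values are made up for illustration).
counts = pd.DataFrame(
    {"GeneA": [1, 2, 3, 4, 5, 6], "GeneB": [0, 1, 0, 1, 0, 1]},
    index=[f"cell{i}" for i in range(6)],
)
obs = pd.DataFrame(
    {"sample_id": ["s1", "s1", "s1", "s2", "s2", "s2"],
     "cluster": ["c0", "c0", "c1", "c0", "c1", "c1"]},
    index=counts.index,
)
pb = pseudobulk(counts, obs)
print(pb)
```

Each row of `pb` is one pseudobulk profile, so the statistical unit becomes the sample rather than the individual cell.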
Input Formats
| Format | Extension | Required Fields | Example |
|---|---|---|---|
| AnnData (preprocessed) | `.h5ad` | , , | |
Workflow
- Validate: Check condition and sample columns exist, verify ≥2 conditions
- Aggregate: Create pseudobulk profiles per sample × cluster
- Test: Run DESeq2 (or Wilcoxon fallback) between conditions
- Report: Write report with DE genes, volcano plot, per-cluster results
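The validation step above might look like the following sketch. It assumes the cell annotations are a pandas DataFrame (as in `adata.obs`); the function name `validate_design` and the error messages are hypothetical and not taken from the skill's source.

```python
import pandas as pd


def validate_design(obs: pd.DataFrame, condition_key: str, sample_key: str) -> list:
    """Check that both columns exist and that at least two conditions are present."""
    missing = [key for key in (condition_key, sample_key) if key not in obs.columns]
    if missing:
        raise KeyError(f"Missing obs column(s): {missing}")
    conditions = sorted(obs[condition_key].dropna().astype(str).unique())
    if len(conditions) < 2:
        raise ValueError(
            f"Need at least 2 conditions in '{condition_key}', found {conditions}"
        )
    return conditions


# Toy annotation table (made-up values).
obs = pd.DataFrame({
    "treatment": ["control", "control", "treated", "treated"],
    "sample_id": ["s1", "s2", "s3", "s4"],
})
conditions = validate_design(obs, "treatment", "sample_id")
print(conditions)
```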
CLI Reference
python skills/spatial-condition/spatial_condition.py \
  --input <data.h5ad> --output <dir> \
  --condition-key treatment --sample-key sample_id

python skills/spatial-condition/spatial_condition.py \
  --input <data.h5ad> --output <dir> \
  --condition-key treatment --sample-key sample_id --reference-condition control

python skills/spatial-condition/spatial_condition.py --demo --output /tmp/cond_demo
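For readers wiring these flags into their own wrapper, the sketch below shows one way such a command line could be parsed with `argparse`. The flag names come from the commands above; which flags are required, the defaults, and the help strings are assumptions, and the real `spatial_condition.py` may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the CLI reference."""
    parser = argparse.ArgumentParser(prog="spatial_condition.py")
    parser.add_argument("--input", help="preprocessed .h5ad file")
    parser.add_argument("--output", required=True, help="output directory")
    parser.add_argument("--condition-key", help="obs column with condition labels")
    parser.add_argument("--sample-key", help="obs column with sample identifiers")
    parser.add_argument("--reference-condition", default=None,
                        help="reference level for the comparison")
    parser.add_argument("--demo", action="store_true",
                        help="run on demo data instead of --input")
    return parser


# Parse the demo invocation from the CLI reference.
args = build_parser().parse_args(["--demo", "--output", "/tmp/cond_demo"])
print(args)
```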
Example Queries
- "Compare healthy vs disease slices controlling for batch"
- "Find disease markers specific to the tumor microenvironment"
Algorithm / Methodology
- Pseudobulk: For each (sample, cluster) pair, sum raw counts across cells
- Filtering: Remove genes with < 10 total counts across all pseudobulk samples
- DESeq2 (preferred): `pydeseq2.DeseqDataSet` with design `~ condition`, Wald test, Benjamini-Hochberg correction
- Wilcoxon fallback: Per-gene Wilcoxon rank-sum test on pseudobulk log-CPM values, BH correction
- Per-cluster: Repeat steps 1-4 within each cluster for cluster-specific condition effects
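The filtering and Wilcoxon fallback steps can be sketched as follows. This is an illustration under stated assumptions rather than the skill's code: the function names `bh_adjust` and `wilcoxon_de` are hypothetical, the pseudocount of 1 in the log-CPM transform is an assumption, library sizes are computed after filtering, and the rank-sum test is run through `scipy.stats.mannwhitneyu` (the Mann-Whitney U test, which is equivalent to the Wilcoxon rank-sum test).

```python
import numpy as np
import pandas as pd
from scipy.stats import mannwhitneyu


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def wilcoxon_de(pb, condition, reference, min_total=10):
    """pb: pseudobulk samples x genes raw counts; condition: Series indexed like pb."""
    # Drop genes with fewer than min_total counts across all pseudobulk samples.
    pb = pb.loc[:, pb.sum(axis=0) >= min_total]
    # Counts per million, then log2 with a pseudocount of 1 (an assumption).
    cpm = pb.div(pb.sum(axis=1), axis=0) * 1e6
    logcpm = np.log2(cpm + 1.0)
    is_ref = (condition.loc[pb.index] == reference).to_numpy()
    ref, test = logcpm.loc[is_ref], logcpm.loc[~is_ref]
    pvals = [mannwhitneyu(test[g], ref[g], alternative="two-sided").pvalue
             for g in logcpm.columns]
    res = pd.DataFrame({"log2fc": test.mean() - ref.mean(), "pvalue": pvals},
                       index=logcpm.columns)
    res["padj"] = bh_adjust(res["pvalue"])
    return res.sort_values("pvalue")


# Toy pseudobulk matrix: 3 control and 3 treated samples (made-up counts).
samples = ["s1", "s2", "s3", "s4", "s5", "s6"]
pb = pd.DataFrame({"Up": [10, 12, 11, 100, 110, 120],
                   "Big": [1000, 1010, 990, 1005, 995, 1000],
                   "Low": [1, 0, 1, 0, 1, 0]}, index=samples)
condition = pd.Series(["control"] * 3 + ["treated"] * 3, index=samples)
res = wilcoxon_de(pb, condition, reference="control")
print(res)
```

With three samples per group, an exact two-sided rank-sum test cannot produce a p-value below 0.1, which is one reason the negative binomial model is preferred when `pydeseq2` is available.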
Key parameters:
- `--condition-key`: obs column with condition labels (e.g. treatment/control)
- `--sample-key`: obs column with biological sample identifiers
- `--reference-condition`: reference level for comparison (default: alphabetically first)
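The default for the reference level can be illustrated with a short helper. The name `resolve_reference` is hypothetical, and treating labels as strings before sorting is an assumption about how "alphabetically first" is evaluated.

```python
def resolve_reference(conditions, reference=None):
    """Pick the reference level: the user's choice, else the alphabetically first label."""
    levels = sorted({str(c) for c in conditions})
    if reference is None:
        return levels[0]
    if reference not in levels:
        raise ValueError(f"Reference '{reference}' not found among {levels}")
    return reference


labels = ["treated", "control", "treated", "control"]
default_ref = resolve_reference(labels)
explicit_ref = resolve_reference(labels, reference="treated")
print(default_ref, explicit_ref)
```

Passing the reference explicitly avoids surprises when label names sort in an unintended order (for example `disease` before `healthy`).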
Output Structure
output_directory/
├── report.md
├── result.json
├── processed.h5ad
├── figures/
│   ├── pseudobulk_volcano.png
│   └── condition_pca.png
├── tables/
│   ├── pseudobulk_de.csv
│   └── per_cluster_summary.csv
└── reproducibility/
    ├── commands.sh
    ├── environment.yml
    └── checksums.sha256
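As an illustration of how a file such as `reproducibility/checksums.sha256` could be produced, the sketch below hashes every file under an output directory. The line format (`<digest>  <relative path>`, as written by `sha256sum`) and the function name `write_checksums` are assumptions; the skill's actual file layout may differ.

```python
import hashlib
import tempfile
from pathlib import Path


def write_checksums(output_dir, manifest="reproducibility/checksums.sha256"):
    """Write one '<sha256>  <relative path>' line per file under output_dir."""
    root = Path(output_dir)
    target = root / manifest
    target.parent.mkdir(parents=True, exist_ok=True)
    lines = []
    # Skip the manifest itself so reruns do not hash a stale copy.
    for path in sorted(p for p in root.rglob("*") if p.is_file() and p != target):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(root).as_posix()}")
    target.write_text("\n".join(lines) + "\n")
    return lines


# Demo on a throwaway directory with two placeholder outputs.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "tables").mkdir()
    (Path(tmp) / "report.md").write_bytes(b"demo report\n")
    (Path(tmp) / "tables" / "pseudobulk_de.csv").write_bytes(b"gene,padj\n")
    entries = write_checksums(tmp)
    manifest_exists = (Path(tmp) / "reproducibility" / "checksums.sha256").is_file()
```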
Dependencies
Required (in `requirements.txt`):
- `scanpy` >= 1.9
- `scipy` >= 1.7
Optional:
- `pydeseq2`: proper negative binomial GLM (graceful fallback to Wilcoxon on pseudobulk)
Safety
- Local-first: Strict offline processing without external upload.
- Disclaimer: Requires OmicsClaw reporting structures and disclaimers.
- Audit trail: All parameters and workflow steps are logged.
- Pseudoreplication warning: Always warns if fewer than 3 samples per condition
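The replicate check behind the pseudoreplication warning could be sketched like this. The function name `check_replicates`, the warning text, and the use of Python's `warnings` module are assumptions made for illustration; only the threshold of 3 samples per condition comes from the text above.

```python
import warnings

import pandas as pd


def check_replicates(obs, condition_key, sample_key, min_samples=3):
    """Count distinct samples per condition and warn when any falls below min_samples."""
    n_samples = obs.groupby(condition_key)[sample_key].nunique()
    for cond, n in n_samples.items():
        if n < min_samples:
            warnings.warn(
                f"Condition '{cond}' has only {n} sample(s); "
                f"results with fewer than {min_samples} replicates are unreliable.",
                UserWarning,
            )
    return n_samples.to_dict()


# Toy annotations: many cells, but 'treated' has only two distinct samples.
obs = pd.DataFrame({
    "treatment": ["control"] * 3 + ["treated"] * 3,
    "sample_id": ["s1", "s2", "s3", "s4", "s4", "s5"],
})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    replicate_counts = check_replicates(obs, "treatment", "sample_id")
print(replicate_counts)
```

Counting distinct samples rather than cells is the point: three cells from the same slice still count as one replicate.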
Integration with Orchestrator
Trigger conditions:
- Invoked automatically based on tool metadata and user-intent matching.
- Keywords: condition comparison, pseudobulk, DESeq2, treatment vs control
Chaining partners:
- `spatial-preprocess`: Provides clustered h5ad input
- `spatial-enrichment`: Downstream pathway analysis on condition DE genes
Citations
- PyDESeq2 — Python DESeq2 implementation
- Squair et al. 2021 — Pseudobulk best practices