LLMs-Universal-Life-Science-and-Clinical-Skills- sc-multiome
install
source · Clone the upstream repo
git clone https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills-
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills- "$T" && mkdir -p ~/.claude/skills && cp -r "$T/Skills/Transcriptomics/sc-multiome" ~/.claude/skills/mdbabumiamssm-llms-universal-life-science-and-clinical-skills-sc-multiome && rm -rf "$T"
manifest:
Skills/Transcriptomics/sc-multiome/SKILL.md
🧩 Single-Cell Multi-Omics Integration
Jointly analyze multiple modalities (RNA + protein, RNA + ATAC) measured in the same cells.
Common Modalities
| Technology | Modalities | Package |
|---|---|---|
| CITE-seq | RNA + surface proteins (ADT) | Seurat |
| 10X Multiome | RNA + ATAC | Seurat, Signac, ArchR |
| SHARE-seq | RNA + ATAC | Seurat, Signac |
| Spatial (Visium) | RNA + spatial | Seurat, Squidpy |
Workflow
- Calculate: Match cell identifiers (barcodes) across modalities so that every modality describes the same set of cells.
- Normalize: Normalize, scale, and reduce dimensionality for each modality separately.
- Execute: Integrate the modalities with WNN or MOFA.
- Visualise: Plot the joint UMAP embedding.
- Report: Tabulate integration consistency metrics.
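The first workflow step, matching identifiers, can be sketched in plain Python. This is only an illustration of the idea, not the skill's actual implementation; the barcodes and the `align_barcodes` helper are made up for the example.

```python
def align_barcodes(rna_barcodes, adt_barcodes):
    """Return barcodes present in both modalities, keeping the RNA order."""
    adt_set = set(adt_barcodes)
    return [bc for bc in rna_barcodes if bc in adt_set]


# Made-up barcodes for illustration only
rna_barcodes = ["AAAC-1", "AAAG-1", "AACT-1", "AAGG-1"]
adt_barcodes = ["AACT-1", "AAAC-1", "TTTG-1"]

shared = align_barcodes(rna_barcodes, adt_barcodes)

# Column positions to pull from the ADT matrix so it follows the shared order
adt_index = {bc: i for i, bc in enumerate(adt_barcodes)}
adt_order = [adt_index[bc] for bc in shared]

print(shared)     # ['AAAC-1', 'AACT-1']
print(adt_order)  # [1, 0]
```

Cells found in only one modality are dropped, which is the usual choice when every downstream step expects paired measurements.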
CLI Reference
    python skills/singlecell/multiome/sc_multiome.py \
      --input <data.h5ad> --output <dir>
    python omicsclaw.py run sc-multiome --demo
Algorithm / Methodology
CITE-seq Analysis (Seurat R)
Load Data
    library(Seurat)

    # Read10X returns one matrix per feature type for CITE-seq data
    data <- Read10X('filtered_feature_bc_matrix/')
    rna_counts <- data$`Gene Expression`
    adt_counts <- data$`Antibody Capture`

    obj <- CreateSeuratObject(counts = rna_counts, assay = 'RNA')
    obj[['ADT']] <- CreateAssayObject(counts = adt_counts)
QC and Normalization
    obj <- PercentageFeatureSet(obj, pattern = '^MT-', col.name = 'percent.mt')
    obj <- subset(obj, subset = nFeature_RNA > 200 & percent.mt < 20)

    # Normalize RNA
    obj <- NormalizeData(obj, assay = 'RNA')
    obj <- FindVariableFeatures(obj, assay = 'RNA')
    obj <- ScaleData(obj, assay = 'RNA')

    # Normalize ADT (CLR normalization, applied per cell with margin = 2)
    obj <- NormalizeData(obj, assay = 'ADT', normalization.method = 'CLR', margin = 2)
    obj <- ScaleData(obj, assay = 'ADT')
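To make the CLR step concrete, here is a minimal pure-Python sketch of a centered log-ratio transform for the ADT counts of a single cell. It uses a `log1p`-based variant (each count is divided by a geometric-mean-like term, then passed through `log1p`); that this matches Seurat's CLR formula exactly is an assumption, and the counts are made up.

```python
import math


def clr(counts):
    """CLR-transform one vector of counts (log1p-based variant, an assumption)."""
    # Geometric-mean-like term: zeros contribute nothing to the log sum
    log_sum = sum(math.log1p(c) for c in counts if c > 0)
    geo_mean = math.exp(log_sum / len(counts))
    return [math.log1p(c / geo_mean) for c in counts]


# Made-up ADT counts for one cell (four antibodies)
cell_counts = [120, 0, 15, 900]
print([round(v, 3) for v in clr(cell_counts)])
```

With `margin = 2`, the transform is applied to each cell across its antibodies, as in this sketch; the ordering of antibodies within a cell is preserved and zero counts stay at zero.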
Weighted Nearest Neighbor (WNN) Clustering
Goal: Jointly cluster cells using both modalities, weighting each per cell.
    # PCA for each modality
    obj <- RunPCA(obj, assay = 'RNA', reduction.name = 'pca')
    obj <- RunPCA(obj, assay = 'ADT', reduction.name = 'apca', features = rownames(obj[['ADT']]))

    # WNN graph combining both modalities
    # (dims 1:18 for ADT assumes the panel has more than 18 antibodies)
    obj <- FindMultiModalNeighbors(obj, reduction.list = list('pca', 'apca'), dims.list = list(1:30, 1:18))

    # Cluster on WNN graph
    obj <- FindClusters(obj, graph.name = 'wsnn', resolution = 0.5)

    # UMAP on WNN
    obj <- RunUMAP(obj, nn.name = 'weighted.nn', reduction.name = 'wnn.umap')
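The per-cell weighting can be illustrated with a simplified sketch of the idea described by Hao et al.: for each cell, compare how well its neighbors within a modality predict it against how well neighbors from the other modality do, then turn the two ratios into weights with a softmax. The affinity values and the `modality_weights` helper below are hypothetical; Seurat's implementation involves further steps that are not shown.

```python
import math

EPS = 1e-4  # small constant guarding against division by zero


def modality_weights(within_rna, cross_rna, within_adt, cross_adt):
    """Return (rna_weight, adt_weight) for one cell from four affinities."""
    # Ratio of within-modality to cross-modality prediction affinity
    s_rna = within_rna / (cross_rna + EPS)
    s_adt = within_adt / (cross_adt + EPS)
    # Softmax over the two scores, shifted by the max for numerical stability
    m = max(s_rna, s_adt)
    e_rna = math.exp(s_rna - m)
    e_adt = math.exp(s_adt - m)
    total = e_rna + e_adt
    return e_rna / total, e_adt / total


# Made-up affinities: RNA neighbors predict this cell better than ADT neighbors
w_rna, w_adt = modality_weights(0.8, 0.4, 0.5, 0.5)
print(round(w_rna, 3), round(w_adt, 3))
```

The two weights always sum to 1, so a cell whose RNA profile is more informative than its protein profile leans on the RNA neighbor graph, and vice versa.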
Visualize
    DimPlot(obj, reduction = 'wnn.umap', label = TRUE)
    FeaturePlot(obj, features = c('adt_CD3', 'adt_CD19', 'adt_CD14'), reduction = 'wnn.umap')

    # Compare modality weights
    VlnPlot(obj, features = 'RNA.weight', group.by = 'seurat_clusters')
10X Multiome (RNA + ATAC, Seurat + Signac)
    library(Seurat)
    library(Signac)
    library(magrittr)  # provides %>%

    # Load data (read the 10x HDF5 file once)
    counts <- Read10X_h5('filtered_feature_bc_matrix.h5')
    rna_counts <- counts$`Gene Expression`
    atac_counts <- counts$Peaks

    obj <- CreateSeuratObject(counts = rna_counts, assay = 'RNA')
    obj[['ATAC']] <- CreateChromatinAssay(counts = atac_counts, fragments = 'atac_fragments.tsv.gz', genome = 'hg38', min.cells = 5)

    # ATAC processing (the QC functions act on the default assay)
    DefaultAssay(obj) <- 'ATAC'
    obj <- NucleosomeSignal(obj)
    # TSSEnrichment() needs gene annotations; set Annotation(obj) beforehand
    obj <- TSSEnrichment(obj)
    obj <- RunTFIDF(obj, assay = 'ATAC')
    obj <- FindTopFeatures(obj, assay = 'ATAC', min.cutoff = 'q0')
    obj <- RunSVD(obj, assay = 'ATAC')

    # RNA processing
    DefaultAssay(obj) <- 'RNA'
    obj <- NormalizeData(obj) %>% FindVariableFeatures() %>% ScaleData() %>% RunPCA()

    # WNN integration (the first LSI component is skipped because it usually tracks sequencing depth)
    obj <- FindMultiModalNeighbors(obj, reduction.list = list('pca', 'lsi'), dims.list = list(1:30, 2:30))
    obj <- RunUMAP(obj, nn.name = 'weighted.nn', reduction.name = 'wnn.umap')
    obj <- FindClusters(obj, graph.name = 'wsnn')
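For readers unfamiliar with the TF-IDF step applied to the peak matrix before SVD, the following sketch shows one common log-scaled variant: term frequency is a peak's count divided by the cell's total, inverse document frequency is the number of cells divided by the peak's total, and the product is scaled and passed through `log1p`. Whether this matches the default of `RunTFIDF` exactly is an assumption; the `tfidf` helper and the toy matrix are made up.

```python
import math


def tfidf(counts, scale_factor=1e4):
    """counts: list of peaks, each a list of per-cell counts (peaks x cells)."""
    n_peaks, n_cells = len(counts), len(counts[0])
    cell_totals = [sum(counts[p][c] for p in range(n_peaks)) for c in range(n_cells)]
    peak_totals = [sum(row) for row in counts]
    out = []
    for p, row in enumerate(counts):
        # Peaks open in few cells get a larger inverse document frequency
        idf = n_cells / peak_totals[p] if peak_totals[p] else 0.0
        out.append([
            math.log1p((x / cell_totals[c]) * idf * scale_factor) if cell_totals[c] else 0.0
            for c, x in enumerate(row)
        ])
    return out


# Toy matrix: 3 peaks x 3 cells
toy = [
    [1, 1, 1],  # open in every cell
    [1, 0, 0],  # open in one cell only
    [0, 1, 1],
]
weighted = tfidf(toy)
print([[round(v, 2) for v in row] for row in weighted])
```

In the toy matrix, the peak that is open in a single cell receives a higher weight in that cell than the peak that is open everywhere, which is the behavior that makes TF-IDF useful for sparse accessibility data.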
MuData / muon (Python)
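    import scanpy as sc
    import muon as mu
    from muon import prot as pt

    mdata = mu.read_10x_h5('filtered_feature_bc_matrix.h5')
    rna = mdata.mod['rna']
    prot = mdata.mod['prot']

    # Process RNA
    sc.pp.filter_cells(rna, min_genes=200)
    sc.pp.normalize_total(rna, target_sum=1e4)
    sc.pp.log1p(rna)
    sc.pp.highly_variable_genes(rna)
    sc.tl.pca(rna)
    sc.pp.neighbors(rna)
    sc.tl.leiden(rna)  # creates the 'rna:leiden' labels plotted below

    # Process protein (CLR normalization)
    pt.pp.clr(prot)

    # Keep only cells that passed filtering in every modality
    mu.pp.intersect_obs(mdata)
    mdata.update()

    # Multi-omics factor analysis (stores factors in mdata.obsm['X_mofa'])
    mu.tl.mofa(mdata, n_factors=20)

    # Joint UMAP on the MOFA factors
    sc.pp.neighbors(mdata, use_rep='X_mofa')
    sc.tl.umap(mdata)
    # Feature names depend on the antibody panel; adjust 'prot:CD3' as needed
    mu.pl.umap(mdata, color=['rna:leiden', 'prot:CD3'])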
Multi-Modal Marker Discovery
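    # Markers per cluster from the RNA assay
    DefaultAssay(obj) <- 'RNA'
    rna_markers <- FindAllMarkers(obj, only.pos = TRUE)

    # Markers per cluster from the ADT assay
    DefaultAssay(obj) <- 'ADT'
    adt_markers <- FindAllMarkers(obj, only.pos = TRUE)

    # Combine both tables, tagging each row with its modality
    all_markers <- rbind(
      transform(rna_markers, modality = 'RNA'),
      transform(adt_markers, modality = 'ADT')
    )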
Modality Weight Inspection
    # FindMultiModalNeighbors stores per-cell weights as metadata columns
    weights <- obj[[c('RNA.weight', 'ADT.weight')]]
    aggregate(weights, by = list(cluster = obj$seurat_clusters), FUN = mean)
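The same per-cluster summary can be sketched in plain Python for cases where the weights have been exported as simple lists. The cluster labels, weights, and the `mean_weight_by_cluster` helper are made up for illustration.

```python
from collections import defaultdict


def mean_weight_by_cluster(clusters, weights):
    """Average a per-cell weight within each cluster label."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cluster, weight in zip(clusters, weights):
        sums[cluster] += weight
        counts[cluster] += 1
    return {cluster: sums[cluster] / counts[cluster] for cluster in sums}


# Made-up cluster labels and RNA weights for four cells
clusters = ["0", "0", "1", "1"]
rna_weights = [0.25, 0.75, 0.125, 0.375]

print(mean_weight_by_cluster(clusters, rna_weights))  # {'0': 0.5, '1': 0.25}
```

A cluster with a mean RNA weight well below 0.5 is one where the other modality contributed more to the neighbor graph.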
Parameters
| Parameter | Default | Description |
|---|---|---|
| | | Method: wnn, mofa, or standard |
| | | Comma-separated list of modalities |
| | | Number of factors (MOFA) |
Example Queries
- "Run multimodal WNN integration across my protein and transcript data"
- "Use MOFA to derive multi-omic factors"
Output Structure
    output_dir/
    ├── report.md
    ├── result.json
    ├── processed.h5ad
    ├── figures/
    │   └── summary_plot.png
    ├── tables/
    │   └── metrics.csv
    └── reproducibility/
        ├── commands.sh
        ├── environment.yml
        └── checksums.sha256
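A run's outputs could be checked against `reproducibility/checksums.sha256` along the following lines. This is a sketch under two assumptions that the source does not confirm: the file uses the `sha256sum` layout (`<hex digest>  <path>` per line) and paths are relative to the output directory. The helper names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_checksums(output_dir, manifest="reproducibility/checksums.sha256"):
    """Return relative paths that are missing or whose digest does not match."""
    root = Path(output_dir)
    failures = []
    for line in (root / manifest).read_text().splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(maxsplit=1)
        rel_path = rel_path.lstrip("*")  # sha256sum marks binary mode with '*'
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(rel_path)
    return failures


# Demo on a throwaway directory that mimics the layout above
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "reproducibility").mkdir()
    (root / "report.md").write_text("demo report\n")
    digest = sha256_of(root / "report.md")
    (root / "reproducibility" / "checksums.sha256").write_text(f"{digest}  report.md\n")
    ok_before = verify_checksums(root)
    (root / "report.md").write_text("edited\n")  # simulate a modified output
    bad_after = verify_checksums(root)

print(ok_before, bad_after)  # [] ['report.md']
```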
Version Compatibility
Reference examples tested with: scanpy 1.10+, muon 0.1+, numpy 1.26+
Dependencies
Required: scanpy, numpy
Optional: muon, Seurat (R), Signac (R), ArchR (R)
Citations
- WNN: Hao et al., Cell 2021
- MOFA+: Argelaguet et al., Genome Biology 2020
- muon: Bredikhin et al., Genome Biology 2022
- CITE-seq: Stoeckius et al., Nature Methods 2017
Safety
- Local-first: All processing runs offline; no data is uploaded to external services.
- Disclaimer: Reports must follow the OmicsClaw reporting structure and include its disclaimers.
- Audit trail: Hyperparameters and the steps executed are logged in full.
Integration with Orchestrator
Trigger conditions:
- Invoked automatically when the tool metadata matches the user's intent.
Chaining partners:
- sc-preprocess: Data loading and QC
- sc-integrate: Batch integration (single modality)
- sc-annotate: Cell type annotation after WNN