OpenClaw-Medical-Skills single-cell-multi-omics-integration
Quick-reference sheet for OmicVerse tutorials spanning MOFA, GLUE pairing, SIMBA integration, TOSICA transfer, and StaVIA cartography.
install
source · Clone the upstream repo
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/single-multiomics" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-single-cell-multi-omics-integration && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/single-multiomics" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-single-cell-multi-omics-integration && rm -rf "$T"
manifest: skills/single-multiomics/SKILL.md
Single-Cell Multi-Omics Tutorials Cheat Sheet
This skill walk-through summarizes the OmicVerse notebooks that cover paired and unpaired multi-omic integration, multi-batch embedding, reference transfer, and trajectory cartography.
MOFA on paired scRNA + scATAC (`t_mofa.ipynb`)
- Data preparation: Load preprocessed AnnData objects for RNA (`rna_p_n_raw.h5ad`) and ATAC (`atac_p_n_raw.h5ad`) with `ov.utils.read`, and initialise `pyMOFA` with matching `omics` and `omics_name` lists.
- Model training: Call `mofa_preprocess()` to select highly variable features and run the factor model with `mofa_run(outfile=...)`, which exports the learned MOFA+ factors to an HDF5 model file.
- Result inspection: Reload downstream AnnData, append factor scores via `ov.single.factor_exact`, and explore factor–cluster associations using `factor_correlation`, `get_weights`, and the plotting helpers in `pyMOFAART` (`plot_r2`, `plot_cor`, `plot_factor`, `plot_weights`, etc.).
- Export workflow: Persist factors and weights through the MOFA HDF5 artifact and reuse them by instantiating `pyMOFAART(model_path=...)` for later annotation or visualisation sessions.
- Dependencies & hardware: Requires `mofapy2`; plots optionally rely on `pymde`/`scvi-tools` but run on CPU.
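The training steps above can be sketched roughly as follows. This is a minimal outline, not the notebook's exact code: the `ov.single.pyMOFA` namespace, the `"RNA"`/`"ATAC"` view labels, and the helper names `check_views` and `run_paired_mofa` are assumptions made for illustration. OmicVerse is imported lazily so the sanity check can be used on its own.

```python
def check_views(omics, omics_name):
    """Return the view names after confirming they line up with the data."""
    names = list(omics_name)
    if len(omics) != len(names):
        raise ValueError(
            f"{len(omics)} omics layers but {len(names)} names were given"
        )
    if len(set(names)) != len(names):
        raise ValueError("view names must be unique")
    return names


def run_paired_mofa(rna_path, atac_path, outfile):
    """Fit MOFA on paired RNA/ATAC AnnData files and return the model path."""
    # Imported lazily so this module loads even where OmicVerse is absent.
    import omicverse as ov

    rna = ov.utils.read(rna_path)
    atac = ov.utils.read(atac_path)
    omics = [rna, atac]
    names = check_views(omics, ["RNA", "ATAC"])  # assumed view labels

    # Assumption: pyMOFA is exposed under ov.single.
    model = ov.single.pyMOFA(omics=omics, omics_name=names)
    model.mofa_preprocess()          # select highly variable features
    model.mofa_run(outfile=outfile)  # write the learned factors to HDF5
    return outfile
```

A call such as `run_paired_mofa("rna_p_n_raw.h5ad", "atac_p_n_raw.h5ad", "mofa_model.hdf5")` would then produce the HDF5 artifact that `pyMOFAART(model_path=...)` reloads; the output file name here is hypothetical.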
MOFA after GLUE pairing (`t_mofa_glue.ipynb`)
- Data preparation: Start from GLUE-derived embeddings (`rna-emb.h5ad`, `atac.emb.h5ad`), build a `GLUE_pair` object, and run `correlation()` to align unpaired cells before subsetting to highly variable features.
- Model training: Instantiate `pyMOFA` with the aligned AnnData objects, run `mofa_preprocess()`, and save the joint factors through `mofa_run(outfile='models/chen_rna_atac.hdf5')`.
- Result inspection: Use `pyMOFAART` plus AnnData that now contains the GLUE embeddings to compute factors (`get_factors`) and visualise variance explained, factor–cluster correlations, and ranked feature weights.
- Export workflow: Reuse the saved MOFA HDF5 model for downstream inspection; GLUE embeddings can be embedded with `scvi.model.utils.mde` (GPU-accelerated MDE is optional, `sc.tl.umap` works on CPU).
- Dependencies & hardware: Requires both `mofapy2` and the GLUE tooling (`scglue`, `scvi-tools`, `pymde`); GPU acceleration only affects optional MDE visualisation.
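To make the idea of correlation-based pairing concrete, here is a toy, standard-library illustration. It is not how `GLUE_pair` is implemented; `pearson` and `pair_cells` are hypothetical helpers, and the greedy one-to-one matching rule is an assumption chosen for simplicity. Each cell is represented by its embedding vector, and the most correlated RNA/ATAC pairs are matched first.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # raises ZeroDivisionError for constant vectors


def pair_cells(rna_emb, atac_emb):
    """Greedily pair cells, taking the highest-correlation matches first.

    Each RNA cell and each ATAC cell is used at most once.
    Returns a sorted list of (rna_index, atac_index) tuples.
    """
    scored = sorted(
        (
            (pearson(r, a), i, j)
            for i, r in enumerate(rna_emb)
            for j, a in enumerate(atac_emb)
        ),
        reverse=True,
    )
    used_rna, used_atac, pairs = set(), set(), []
    for _score, i, j in scored:
        if i not in used_rna and j not in used_atac:
            pairs.append((i, j))
            used_rna.add(i)
            used_atac.add(j)
    return sorted(pairs)
```

Scoring every RNA/ATAC combination is quadratic in the number of cells, so this only works for small examples; it is meant to show why aligned embeddings make cross-modality matching possible, not to replace the library routine.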
SIMBA batch integration (`t_simba.ipynb`)
- Data preparation: Fetch the concatenated AnnData (`simba_adata_raw.h5ad`) derived from multiple pancreas studies and pass it, alongside a results directory, to `pySIMBA`.
- Model training: Execute `preprocess(...)` to bin features and build a SIMBA-compatible graph, then call `gen_graph()` followed by `train(num_workers=...)` to launch PyTorch-BigGraph optimisation (can scale with CPU workers) and `load(...)` to resume trained checkpoints.
- Result inspection: Apply `batch_correction()` to obtain the harmonised AnnData with SIMBA embeddings (`X_simba`) and visualise using `mde`/`sc.tl.umap` coloured by cell type or batch.
- Export workflow: Training outputs reside in the workdir (e.g., `result_human_pancreas/pbg/graph0`); reuse them with `simba_object.load(...)` for later analyses.
- Dependencies & hardware: Requires installing `simba` and `simba_pbg` (PyTorch BigGraph backend). GPU is optional; make sure adequate CPU threads and memory are available for graph training.
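The SIMBA call order described above might look like the sketch below. The `ov.single.pySIMBA` namespace and the helper names (`pick_workers`, `graph_dir`, `integrate_with_simba`) are assumptions for illustration. The `preprocess(...)` arguments are passed through unchanged because the summary above does not list them.

```python
import os


def pick_workers(cpu_count=None, reserve=1):
    """Choose a worker count, leaving `reserve` cores free (minimum 1)."""
    total = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, total - reserve)


def graph_dir(workdir):
    """Trained-graph location, following the <workdir>/pbg/graph0 layout."""
    return os.path.join(workdir, "pbg", "graph0")


def integrate_with_simba(adata_path, workdir, **preprocess_kwargs):
    """Run preprocess -> gen_graph -> train -> load -> batch_correction."""
    # Imported lazily so the helpers above work without OmicVerse installed.
    import omicverse as ov

    adata = ov.utils.read(adata_path)
    # Assumption: pySIMBA lives under ov.single and takes (adata, workdir).
    simba_object = ov.single.pySIMBA(adata, workdir)
    simba_object.preprocess(**preprocess_kwargs)  # caller supplies settings
    simba_object.gen_graph()
    simba_object.train(num_workers=pick_workers())
    simba_object.load(graph_dir(workdir))
    return simba_object.batch_correction()  # AnnData with X_simba
```

Leaving one core free is only a convention for keeping the machine responsive during graph training; pass an explicit `cpu_count` to `pick_workers` when running inside a container with a CPU quota.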
TOSICA reference transfer (`t_tosica.ipynb`)
- Data preparation: Download demo AnnData references (`demo_train.h5ad`, `demo_test.h5ad`) and required gene-set GMT files via `ov.utils.download_tosica_gmt()`; confirm datasets are log-normalised before training.
- Model training: Create `pyTOSICA` with the reference AnnData, chosen pathway mask, label key, project directory, and batch size; train with `train(epochs=...)`, then persist weights with `save()` and optionally reload via `load()`.
- Result inspection: Generate predictions on query AnnData through `predicted(pre_adata=...)`, embed with OmicVerse preprocessing and GPU-enabled `mde` (UMAP fallback available), and explore pathway attention to interpret transformer heads.
- Export workflow: Saved project folder keeps model checkpoints and attention summaries; reuse the exported assets to annotate future datasets without retraining from scratch.
- Dependencies & hardware: Needs TOSICA (PyTorch transformer) plus downloaded gene-set masks; avoid setting `depth=2` if memory is constrained. GPU acceleration improves embedding (`mde`) but training runs on standard PyTorch (CPU/GPU depending on environment).
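A rough sketch of the TOSICA flow follows, with a quick heuristic for the "log-normalised" check. The constructor keyword names (`gmt_path`, `label_name`, `project_path`), the `ov.single` namespace, the default `epochs` and `batch_size` values, and the threshold in `looks_log_normalised` are all assumptions; verify them against the installed OmicVerse version before relying on them.

```python
def looks_log_normalised(values, max_expected=20.0):
    """Heuristic check on a sample of expression values.

    Log-transformed data is non-negative, stays in a modest range, and
    contains non-integer values; raw counts are typically all integers.
    The 20.0 ceiling is an arbitrary illustrative threshold.
    """
    values = [float(v) for v in values]
    if not values or min(values) < 0 or max(values) > max_expected:
        return False
    return any(v != int(v) for v in values)


def train_and_annotate(ref_adata, query_adata, gmt_path, label_key,
                       project_path, epochs=5, batch_size=8):
    """Train on the reference, save the weights, and label the query."""
    # Imported lazily so the heuristic above works without OmicVerse.
    import omicverse as ov

    # Keyword names here are assumed, not confirmed by the summary above.
    model = ov.single.pyTOSICA(
        adata=ref_adata,
        gmt_path=gmt_path,          # pathway mask (GMT gene sets)
        depth=1,                    # depth=2 needs more memory
        label_name=label_key,
        project_path=project_path,
        batch_size=batch_size,
    )
    model.train(epochs=epochs)
    model.save()
    return model.predicted(pre_adata=query_adata)
```

The heuristic is meant to be run on a small sample of the expression matrix; it catches the common mistake of passing raw counts but cannot prove that the correct normalisation was applied.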
StaVIA trajectory cartography (`t_stavia.ipynb`)
- Data preparation: Load example dentate gyrus velocity data via `scvelo.datasets.dentategyrus()`, preprocess with OmicVerse (`preprocess`, `scale`, `pca`, neighbours, UMAP) to populate the AnnData matrices used by VIA.
- Model training: Configure VIA hyperparameters (components, neighbours, seeds, root selection) and instantiate/run `VIA.core.VIA` on the chosen representation (`adata.obsm['scaled|original|X_pca']`).
- Result inspection: Store outputs such as pseudotime (`single_cell_pt_markov`), cluster graph abstractions, trajectory curves, atlas views, and stream plots through VIA plotting helpers.
- Export workflow: Persist derived visualisations and animations (e.g., `animate_streamplot_ov`, `animate_atlas`) to files (`.gif`) for reporting; recompute edge bundles via `make_edgebundle_milestone` when needed.
- Dependencies & hardware: Relies on `scvelo`, `pyVIA`, and OmicVerse plotting; computations are CPU-bound though producing large stream/animation outputs benefits from ample memory.
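One way to keep the VIA hyperparameters in a single place is sketched below. The default values, the helper names `via_params` and `run_via`, and the `pyVIA.core` import path are assumptions for illustration; OmicVerse may ship its own copy of VIA under a different module path. The `"clusters"` label key matches the scvelo dentate gyrus dataset.

```python
def via_params(n_components=30, knn=15, seed=4, root=None):
    """Collect components, neighbours, seed, and root choice in one dict."""
    if n_components < 1 or knn < 1:
        raise ValueError("n_components and knn must be positive")
    return {
        "n_components": n_components,
        "knn": knn,
        "random_seed": seed,
        # Root candidates are passed as a list; None leaves the choice unset.
        "root_user": None if root is None else [root],
    }


def run_via(adata, params, label_key="clusters",
            rep="scaled|original|X_pca"):
    """Run VIA on the leading PCA components and return pseudotime."""
    # Imported lazily; the import path is an assumption (see lead-in).
    from pyVIA.core import VIA

    data = adata.obsm[rep][:, : params["n_components"]]
    model = VIA(
        data,
        true_label=list(adata.obs[label_key]),
        knn=params["knn"],
        root_user=params["root_user"],
        random_seed=params["random_seed"],
    )
    model.run_VIA()
    return model.single_cell_pt_markov
```

Keeping the parameters in a plain dictionary makes it easy to log them next to exported figures, so a given animation or stream plot can be traced back to the settings that produced it.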