OpenClaw-Medical-Skills bulktrajblend-trajectory-interpolation
Extend scRNA-seq developmental trajectories with BulkTrajBlend by generating intermediate cells from bulk RNA-seq, training beta-VAE and GNN models, and interpolating missing states.
install
source · Clone the upstream repo
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bulk-trajblend-interpolation" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-bulktrajblend-trajectory-interpolati && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/bulk-trajblend-interpolation" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-bulktrajblend-trajectory-interpolati && rm -rf "$T"
manifest: skills/bulk-trajblend-interpolation/SKILL.md
source content
BulkTrajBlend trajectory interpolation
Overview
Invoke this skill when users need to bridge gaps in single-cell developmental trajectories using matched bulk RNA-seq. It follows `t_bulktrajblend.ipynb`, showcasing how BulkTrajBlend deconvolves PDAC bulk samples, identifies overlapping communities with a GNN, and interpolates "interrupted" cell states.
Instructions
- Prepare libraries and inputs
  - Import `omicverse as ov`, `scanpy as sc`, `scvelo as scv`, and helper functions like `from omicverse.utils import mde`; run `ov.plot_set()`.
  - Load the reference scRNA-seq AnnData (`scv.datasets.dentategyrus()`) and raw bulk counts with `ov.utils.read(...)` followed by `ov.bulk.Matrix_ID_mapping(...)` for gene ID harmonisation.
- Configure BulkTrajBlend
  - Instantiate `ov.bulk2single.BulkTrajBlend(bulk_seq=bulk_df, single_seq=adata, bulk_group=['dg_d_1','dg_d_2','dg_d_3'], celltype_key='clusters')`.
  - Explain that `bulk_group` names correspond to raw bulk columns and the method expects unscaled counts.
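The two points about `bulk_group` and unscaled counts can be checked before the object is built. The following is a minimal sketch, assuming `bulk_df` is a pandas DataFrame with genes as rows and samples as columns. `check_bulk_input` and `make_blender` are hypothetical helper names, and the integer test is only a heuristic for "unscaled"; the `BulkTrajBlend(...)` call is the one given in the step above.

```python
def check_bulk_input(columns, values, bulk_group):
    """Return a list of problems found in the bulk matrix (empty list means OK).

    columns    -- sample names of the bulk matrix, e.g. list(bulk_df.columns)
    values     -- rows of numeric counts, e.g. bulk_df.values.tolist()
    bulk_group -- sample names that will be passed to BulkTrajBlend
    """
    problems = []
    missing = [name for name in bulk_group if name not in columns]
    if missing:
        problems.append(f"bulk_group names not found in columns: {missing}")
    flat = [v for row in values for v in row]
    if any(v < 0 for v in flat):
        problems.append("negative values found; counts look scaled or centred")
    # Heuristic only: some quantifiers emit fractional "expected counts".
    if any(float(v) != int(v) for v in flat):
        problems.append("non-integer values found; counts may be normalised")
    return problems


def make_blender(bulk_df, adata, bulk_group, celltype_key="clusters"):
    """Validate the inputs, then build the BulkTrajBlend object."""
    import omicverse as ov  # imported lazily so the checks run without it

    problems = check_bulk_input(
        list(bulk_df.columns), bulk_df.values.tolist(), bulk_group
    )
    if problems:
        raise ValueError("; ".join(problems))
    return ov.bulk2single.BulkTrajBlend(
        bulk_seq=bulk_df,
        single_seq=adata,
        bulk_group=bulk_group,
        celltype_key=celltype_key,
    )
```

Running the check first turns a misspelled sample name into a readable error before any training time is spent.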
- Set beta-VAE expectations
  - Call `bulktb.vae_configure(cell_target_num=100)` (or pass a dictionary) to define expected cell counts per cluster. Mention that omitting the argument triggers TAPE-based estimation.
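When only a few clusters need a different target, a dictionary can be assembled from a default plus overrides. This is a sketch under the assumption that the dictionary is keyed by the labels found in the `celltype_key` column; `build_cell_target` is a hypothetical helper, and the cluster names in the usage lines are illustrative only.

```python
def build_cell_target(clusters, default=100, overrides=None):
    """Build a per-cluster target dictionary to pass as cell_target_num."""
    overrides = overrides or {}
    unknown = sorted(set(overrides) - set(clusters))
    if unknown:
        # Catch typos early instead of silently ignoring an override.
        raise KeyError(f"overrides name clusters that are not present: {unknown}")
    return {name: int(overrides.get(name, default)) for name in clusters}


# Illustrative cluster names; in practice take them from adata.obs['clusters'].
targets = build_cell_target(["OPC", "OL", "Granule mature"], overrides={"OPC": 300})
# bulktb.vae_configure(cell_target_num=targets)
```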
- Train or load the beta-VAE
  - Use `bulktb.vae_train(batch_size=512, learning_rate=1e-4, hidden_size=256, epoch_num=3500, vae_save_dir='...', vae_save_name='dg_btb_vae', generate_save_dir='...', generate_save_name='dg_btb')`.
  - Highlight resuming with `bulktb.vae_load('.../dg_btb_vae.pth')` and the need to regenerate cells with consistent random seeds for reproducibility.
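The train-or-resume decision can be wrapped in one helper. The sketch below assumes the checkpoint is written as `<vae_save_dir>/<vae_save_name>.pth`, which matches the `vae_load` path shown above but is not confirmed here. It also assumes that seeding the global Python, NumPy, and PyTorch generators is enough for BulkTrajBlend to reproduce its output. `seed_everything` and `train_or_load_vae` are hypothetical names.

```python
import os
import random


def seed_everything(seed):
    """Seed Python's RNG, plus NumPy and PyTorch when they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass


def train_or_load_vae(bulktb, save_dir, save_name="dg_btb_vae",
                      generate_name="dg_btb", seed=0, **train_kwargs):
    """Load the beta-VAE checkpoint if it exists, otherwise train a new one.

    Returns "loaded" or "trained" so the caller can log which path ran.
    """
    seed_everything(seed)
    checkpoint = os.path.join(save_dir, save_name + ".pth")  # assumed layout
    if os.path.exists(checkpoint):
        bulktb.vae_load(checkpoint)
        return "loaded"
    # Defaults mirror the values quoted in the step above.
    params = dict(batch_size=512, learning_rate=1e-4, hidden_size=256, epoch_num=3500)
    params.update(train_kwargs)
    bulktb.vae_train(vae_save_dir=save_dir, vae_save_name=save_name,
                     generate_save_dir=save_dir, generate_save_name=generate_name,
                     **params)
    return "trained"
```

Keyword overrides such as `train_or_load_vae(bulktb, 'models', epoch_num=500)` shorten a trial run without editing the defaults.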
- Generate synthetic cells
  - Produce filtered AnnData via `bulktb.vae_generate(leiden_size=25)` and inspect compositions with `ov.bulk2single.bulk2single_plot_cellprop(...)`.
  - Save outputs to disk for reuse (`adata.write_h5ad`).
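Generation and caching can be combined, with a plain numeric summary alongside the composition plot. This is a sketch: `generate_and_save` and `cell_proportions` are hypothetical helpers, and the usage comment assumes the generated AnnData carries its cell-type labels in `obs['clusters']`.

```python
from collections import Counter


def generate_and_save(bulktb, out_path, leiden_size=25):
    """Generate filtered synthetic cells and write them to an .h5ad file."""
    generated = bulktb.vae_generate(leiden_size=leiden_size)
    generated.write_h5ad(out_path)
    return generated


def cell_proportions(labels):
    """Return the fraction of cells per label, most frequent first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.most_common()}


# Usage (assumed column name):
# generated = generate_and_save(bulktb, 'dg_generated.h5ad')
# print(cell_proportions(generated.obs['clusters']))
```

Comparing these fractions before and after a change to `leiden_size` shows directly which clusters the filter removes.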
- Configure and train the GNN
  - Call `bulktb.gnn_configure(max_epochs=2000, use_rep='X', neighbor_rep='X_pca', gpu=0, ...)` to set hyperparameters.
  - Train using `bulktb.gnn_train()`; reload checkpoints with `bulktb.gnn_load('save_model/gnn.pth')`.
  - Generate overlapping community assignments through `bulktb.gnn_generate()`.
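The three GNN calls can be chained so that an existing checkpoint is reused. This is a sketch, not the library's own workflow: `run_gnn` is a hypothetical wrapper, only the hyperparameters named above are passed to `gnn_configure`, and it is an assumption that `gnn_train()` writes its weights to the `save_model/gnn.pth` path quoted above.

```python
import os


def run_gnn(bulktb, checkpoint="save_model/gnn.pth", **configure_kwargs):
    """Configure the GNN, load or train it, and return the community assignments."""
    settings = dict(max_epochs=2000, use_rep="X", neighbor_rep="X_pca", gpu=0)
    settings.update(configure_kwargs)  # caller-supplied values win
    bulktb.gnn_configure(**settings)
    if os.path.exists(checkpoint):
        bulktb.gnn_load(checkpoint)
    else:
        bulktb.gnn_train()
    return bulktb.gnn_generate()
```

Configuration always runs first, so a reloaded checkpoint is paired with the same settings that would have been used for training.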
- Visualise community structure
  - Create MDE embeddings: `bulktb.nocd_obj.adata.obsm['X_mde'] = mde(bulktb.nocd_obj.adata.obsm['X_pca'])`.
  - Plot clusters vs. discovered communities using `sc.pl.embedding(..., color=['clusters','nocd_n'], palette=ov.utils.pyomic_palette())` and filtered subsets excluding synthetic labels with hyphens.
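The hyphen filter amounts to a boolean mask over the community labels. A minimal sketch follows; `non_hyphen_mask` is a hypothetical helper, and the usage comment assumes the labels live in `obs['nocd_n']`, as the `color=` argument above suggests.

```python
def non_hyphen_mask(labels):
    """Return True for each label that does not contain a hyphen."""
    return ["-" not in str(label) for label in labels]


# Usage (assumed column name; numpy converts the list to a boolean index):
# import numpy as np
# adata = bulktb.nocd_obj.adata
# subset = adata[np.array(non_hyphen_mask(adata.obs['nocd_n']))]
```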
- Interpolate missing states
  - Run `bulktb.interpolation('OPC')` (replace with target lineage) to synthesise continuity, then preprocess the interpolated AnnData (HVG selection, scaling, PCA).
  - Compute embeddings with `mde`, visualise with `ov.utils.embedding`, and compare to the original atlas.
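The interpolation and preprocessing steps can be sketched as two small functions. This version uses plain scanpy calls with default parameters as a stand-in; the tutorial notebook may use different functions or thresholds. `scanpy_preprocess` and `interpolate_and_preprocess` are hypothetical names, and the default `highly_variable_genes` flavour expects log-normalised values, so check what `interpolation` returns before relying on it.

```python
def scanpy_preprocess(adata, n_comps=50):
    """HVG selection, scaling, and PCA with scanpy defaults (an assumption)."""
    import scanpy as sc  # imported lazily; only needed when this step runs

    adata = adata.copy()
    sc.pp.highly_variable_genes(adata)  # default flavour expects log data
    adata = adata[:, adata.var["highly_variable"]].copy()
    sc.pp.scale(adata)
    sc.pp.pca(adata, n_comps=n_comps)
    return adata


def interpolate_and_preprocess(bulktb, lineage, preprocess=scanpy_preprocess):
    """Interpolate the target lineage, then apply the preprocessing function."""
    interpolated = bulktb.interpolation(lineage)
    return preprocess(interpolated)
```

Passing `preprocess` as an argument keeps the interpolation call unchanged when the preprocessing recipe is swapped.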
- Analyse trajectories
  - Initialise `ov.single.pyVIA` on both original and interpolated data to derive pseudotime, followed by `get_pseudotime`, `sc.pp.neighbors`, `ov.utils.cal_paga`, and `ov.utils.plot_paga` for topology validation.
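Beyond side-by-side plots, the two PAGA graphs can be compared as edge sets. The sketch below works on plain nested lists and does not call omicverse; how to extract a dense connectivity matrix from the `ov.utils.cal_paga` result is left open here. `paga_edges`, `compare_paga`, and the 0.1 threshold are illustrative assumptions.

```python
def paga_edges(connectivities, names, threshold=0.1):
    """Return the undirected edges whose weight reaches the threshold.

    connectivities -- square, symmetric matrix as nested lists
    names          -- cluster names in the same order as the matrix rows
    """
    edges = set()
    for i, row in enumerate(connectivities):
        for j, weight in enumerate(row):
            if i < j and weight >= threshold:  # upper triangle only
                edges.add(frozenset((names[i], names[j])))
    return edges


def compare_paga(original_edges, interpolated_edges):
    """Report edges gained and lost after interpolation."""
    return {
        "gained": interpolated_edges - original_edges,
        "lost": original_edges - interpolated_edges,
    }
```

An edge that appears only after interpolation, connecting the two clusters on either side of the interrupted state, is the expected signature of a bridged gap.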
- Troubleshooting tips
  - If the VAE collapses (high reconstruction loss), lower `learning_rate` or reduce `hidden_size`.
  - Ensure the same generated dataset is used before calling `gnn_train`; regenerating cells changes the graph and can break checkpoint loading.
  - Sparse clusters may need adjusted `cell_target_num` thresholds or a smaller `leiden_size` filter to retain rare populations.
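One way to guard against mixing a GNN checkpoint with a regenerated dataset is to store a fingerprint of the generated cells next to the checkpoint. This is a sketch using only the standard library; `fingerprint` and `matches_recorded` are hypothetical helpers. Hashing cell IDs and labels detects a changed cell set, but not changed expression values under identical IDs.

```python
import hashlib
import json
import os


def fingerprint(items):
    """Hash an ordered sequence of identifiers, e.g. 'cell_id|cluster' strings."""
    digest = hashlib.sha256()
    for item in items:
        digest.update(str(item).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def matches_recorded(items, record_path):
    """Return True if the dataset matches the recorded fingerprint.

    When no record exists yet, write one and return True.
    """
    current = fingerprint(items)
    if not os.path.exists(record_path):
        with open(record_path, "w") as handle:
            json.dump({"sha256": current}, handle)
        return True
    with open(record_path) as handle:
        return json.load(handle)["sha256"] == current
```

Call `matches_recorded(...)` before `gnn_load`; a `False` result means the cells were regenerated and the GNN should be retrained rather than reloaded.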
Examples
- "Train BulkTrajBlend on PDAC cohorts, then interpolate missing OPC states in the trajectory."
- "Load saved beta-VAE and GNN weights to regenerate overlapping communities and plot cluster vs. nocd labels."
- "Run VIA on interpolated cells and compare PAGA graphs with the original scRNA-seq trajectory."
References
- Tutorial notebook: `t_bulktrajblend.ipynb`
- Example datasets and checkpoints: `omicverse_guide/docs/Tutorials-bulk2single/data/`
- Quick copy/paste commands: `reference.md`