OpenClaw-Medical-Skills bio-single-cell-splicing
Analyzes alternative splicing at single-cell resolution using BRIE2 for probabilistic PSI estimation or leafcutter2 for cluster-based analysis with NMD detection. Identifies cell-type-specific splicing patterns. Use when analyzing isoform usage in scRNA-seq or finding splicing differences between cell populations.
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bio-single-cell-splicing" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-bio-single-cell-splicing && rm -rf "$T"
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/bio-single-cell-splicing" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-bio-single-cell-splicing && rm -rf "$T"
skills/bio-single-cell-splicing/SKILL.md- shell exec via library
Version Compatibility
Reference examples tested with: anndata 0.10+, numpy 1.26+, pandas 2.2+, scanpy 1.10+
Before using code patterns, verify installed versions match. If versions differ:
- Python:
thenpip show <package>
to check signatureshelp(module.function)
If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
Single-Cell Splicing Analysis
Analyze alternative splicing at single-cell resolution.
Tool Selection
| Tool | Approach | Strengths |
|---|---|---|
| BRIE2 | Probabilistic PSI | Handles sparsity, regulatory features |
| leafcutter2 | Intron clustering | NMD detection, novel junctions |
Note: Avoid Whippet.jl (Julia 1.6.7 only, incompatible with Julia 1.9+)
BRIE2 Analysis
Goal: Estimate per-cell PSI values for splicing events with uncertainty quantification.
Approach: Prepare splicing events from annotation, count reads per cell barcode, then fit a Bayesian variational inference model for probabilistic PSI estimation.
"Analyze splicing in single-cell data" -> Estimate per-cell inclusion levels for splicing events with uncertainty.
- Python: BRIE2 (probabilistic PSI, handles sparsity)
- Python/R: leafcutter2 (intron clustering, NMD detection)
import brie import scanpy as sc import anndata as ad # Load single-cell data adata = sc.read_h5ad('scrnaseq.h5ad') # Prepare splicing events from annotation # BRIE2 uses pre-defined splicing events brie.preprocessing.get_events( gtf_file='annotation.gtf', out_file='splicing_events.gff3' ) # Count reads for splicing events from BAM files # Requires cell barcodes and UMIs brie.preprocessing.count( bam_file='possorted_genome_bam.bam', gff_file='splicing_events.gff3', out_dir='brie_counts/', cell_file='barcodes.tsv' # Filtered cell barcodes ) # Load BRIE count data adata_splice = brie.read_h5ad('brie_counts/brie_count.h5ad') # Run BRIE2 model for PSI estimation # Uses variational inference for probabilistic estimates brie.fit( adata_splice, layer='raw', n_epochs=400, batch_size=512 ) # PSI estimates stored in adata_splice.layers['Psi'] # Uncertainty in adata_splice.layers['Psi_var']
Cell-Type Specific Splicing
Goal: Identify splicing events that vary between cell types.
Approach: Compute mean PSI per cell type from BRIE2 output and rank events by cross-cell-type variance.
import numpy as np import pandas as pd # Add cell type annotations adata_splice.obs['cell_type'] = adata.obs['cell_type'] # Calculate mean PSI per cell type cell_types = adata_splice.obs['cell_type'].unique() psi_matrix = adata_splice.layers['Psi'] mean_psi = pd.DataFrame(index=adata_splice.var_names) for ct in cell_types: mask = adata_splice.obs['cell_type'] == ct mean_psi[ct] = np.nanmean(psi_matrix[mask, :], axis=0) # Find cell-type specific splicing events # Events with high variance across cell types psi_var = mean_psi.var(axis=1) variable_events = psi_var.nlargest(100) print('Top variable splicing events:') print(variable_events)
leafcutter2 Analysis
Goal: Detect differential intron usage in single-cell data with NMD-inducing splicing detection.
Approach: Extract junctions from 10X BAMs with cell barcodes, cluster introns, and run differential analysis between cell groups.
import subprocess # leafcutter2 (April 2025): Adds NMD-inducing splicing detection # Step 1: Extract junctions from BAM # Works with 10X BAMs with cell barcodes subprocess.run([ 'python', 'scripts/bam2junc.py', '-b', 'possorted_genome_bam.bam', '-o', 'junctions/', '--cb_tag', 'CB', # Cell barcode tag '--umi_tag', 'UB' # UMI tag ], check=True) # Step 2: Cluster introns subprocess.run([ 'python', 'clustering/leafcutter_cluster.py', '-j', 'junction_files.txt', '-o', 'leafcutter_sc', '-m', '10', # Min reads per junction '-l', '500000' # Max intron length ], check=True) # Step 3: Differential splicing between clusters # Pseudobulk approach for statistical power subprocess.run([ 'Rscript', 'scripts/leafcutter_ds.R', 'leafcutter_sc_perind_numers.counts.gz', 'groups.txt', '-o', 'differential_splicing', '-e', 'annotation_exons.txt.gz' ], check=True)
Pseudobulk Approach
Goal: Increase statistical power for splicing analysis by aggregating single cells into pseudobulk samples.
Approach: Sum junction counts within cell type groups, then apply bulk differential splicing methods to the aggregated counts.
import pandas as pd import numpy as np # For better statistical power, aggregate cells by type def pseudobulk_junctions(junction_counts, cell_metadata, groupby='cell_type'): '''Aggregate junction counts by cell group.''' groups = cell_metadata.groupby(groupby).groups pseudobulk = {} for group, cells in groups.items(): cell_mask = junction_counts.index.isin(cells) pseudobulk[group] = junction_counts.loc[cell_mask].sum() return pd.DataFrame(pseudobulk) # Run differential splicing on pseudobulk # Use leafcutter or rMATS on aggregated counts
Interpretation Considerations
| Challenge | Mitigation |
|---|---|
| Sparse data | BRIE2 probabilistic model, pseudobulk |
| Low reads per cell | Aggregate similar cells |
| 3' bias (10X) | Use 5' kit or full-length methods |
| Doublets | Filter before splicing analysis |
Quality Thresholds
| Metric | Recommendation |
|---|---|
| Min cells per event | >= 50 with reads |
| Min reads per junction | >= 5 per cell with coverage |
| PSI confidence | Variance < 0.1 |
Related Skills
- single-cell/preprocessing - QC before splicing analysis
- single-cell/clustering - Cell type annotation
- splicing-quantification - Bulk RNA-seq comparison