Claude-skill-registry loadingrnafromseurat
Load pre-existing Seurat objects into the immunopipe pipeline instead of starting from raw count matrices via SampleInfo. This enables analysis on already processed single-cell RNA-seq data stored in Seurat R objects.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/loadingrnafromseurat" ~/.claude/skills/majiayu000-claude-skill-registry-loadingrnafromseurat && rm -rf "$T"
skills/data/loadingrnafromseurat/SKILL.mdLoadingRNAFromSeurat Process Configuration
Purpose
Load pre-existing Seurat objects into the immunopipe pipeline instead of starting from raw count matrices via SampleInfo. This enables analysis on already processed single-cell RNA-seq data stored in Seurat R objects.
When to Use
- Starting from pre-processed Seurat objects (RDS or qs/qs2 format) instead of raw count matrices
- Re-analyzing existing Seurat objects with immunopipe's downstream analysis capabilities
- Alternative entry point when SampleInfo is not needed for RNA data input
- When metadata is already embedded in the Seurat object's meta.data slot
- When combining with TCR/BCR data - can use LoadingRNAFromSeurat for RNA + SampleInfo for VDJ data
Configuration Structure
Process Enablement
[LoadingRNAFromSeurat] cache = true
Input Specification
[LoadingRNAFromSeurat.in] # Path to Seurat object file (RDS or qs/qs2 format) # Can be single file or array of files for multiple samples infile = ["path/to/seurat_object.rds"] # Alternative: can use 'srtobj' alias (same as infile) # srtobj = ["path/to/seurat_object.rds"]
Environment Variables
[LoadingRNAFromSeurat.envs] # Whether the Seurat object is well-prepared for the pipeline # - If true: SeuratPreparing process will be skipped # - If false: SeuratPreparing will run for QC, normalization, integration prepared = false # Whether the Seurat object is already clustered # - If true: SeuratClustering (or SeuratClusteringOfAllCells) and SeuratMap2Ref will be skipped # - Forces 'prepared' to be true if set to true clustered = false # Column name in Seurat object's meta.data that contains sample identifiers # Used to create a "Sample" column in the output # Default is "Sample" - if meta.data already has "Sample", no action is taken # If column exists but named differently, specify here (e.g., "orig.ident", "sample_id") sample = "Sample"
Configuration Examples
Minimal Configuration (Single Seurat Object)
[LoadingRNAFromSeurat] [LoadingRNAFromSeurat.in] infile = ["path/to/sample1.rds"]
Pre-processed Seurat Object (Skip Preparation)
[LoadingRNAFromSeurat] [LoadingRNAFromSeurat.in] infile = ["data/preprocessed_seurat.rds"] [LoadingRNAFromSeurat.envs] # Object already normalized, QC'd, integrated - skip SeuratPreparing prepared = true
Fully Prepared Object with Clustering
[LoadingRNAFromSeurat] [LoadingRNAFromSeurat.in] infile = ["data/clustered_seurat.rds"] [LoadingRNAFromSeurat.envs] # Object is fully prepared and clustered # Skip both SeuratPreparing and SeuratClustering clustered = true # 'prepared' automatically set to true when clustered = true
Custom Sample Column Mapping
[LoadingRNAFromSeurat] [LoadingRNAFromSeurat.in] infile = ["data/seurat_objects/sample1.rds", "data/seurat_objects/sample2.rds"] [LoadingRNAFromSeurat.envs] # Seurat object uses "orig.ident" column for sample names sample = "orig.ident"
Loading Multiple Seurat Objects
[LoadingRNAFromSeurat] [LoadingRNAFromSeurat.in] infile = [ "data/sample1.rds", "data/sample2.rds", "data/sample3.rds" ] [LoadingRNAFromSeurat.envs] # Each object must have the sample column specified # Objects will be integrated by SeuratPreparing if prepared = false sample = "Sample"
RNA + TCR Combined Analysis
# Use LoadingRNAFromSeurat for RNA data [LoadingRNAFromSeurat] [LoadingRNAFromSeurat.in] infile = ["data/rna_seurat.rds"] [LoadingRNAFromSeurat.envs] prepared = true # Still use SampleInfo for TCR/BCR data paths [SampleInfo.in] infile = ["sample_info.txt"] # sample_info.txt should contain TCRData/BCRData columns (not RNAData)
Common Patterns
Pattern 1: Load and Start Analysis (Standard Workflow)
[LoadingRNAFromSeurat] [LoadingRNAFromSeurat.in] infile = ["data/seurat.rds"] # SeuratPreparing will run for QC, normalization, integration # SeuratClustering will run for clustering [SeuratClustering] [SeuratClusterStats]
Pattern 2: Load and Skip to Downstream Analysis
[LoadingRNAFromSeurat] [LoadingRNAFromSeurat.in] infile = ["data/prepared_seurat.rds"] [LoadingRNAFromSeurat.envs] prepared = true # Skip SeuratPreparing # Jump directly to clustering and marker analysis [SeuratClustering] [ClusterMarkers] [SeuratClusterStats]
Pattern 3: Fully Pre-processed (Skip Preparation + Clustering)
[LoadingRNAFromSeurat] [LoadingRNAFromSeurat.in] infile = ["data/final_seurat.rds"] [LoadingRNAFromSeurat.envs] clustered = true # Skip SeuratPreparing AND SeuratClustering # Jump directly to downstream analyses [CellTypeAnnotation] [ScFGSEA]
Pattern 4: TCR Analysis with Pre-processed RNA
[LoadingRNAFromSeurat] [LoadingRNAFromSeurat.in] infile = ["data/rna_seurat.rds"] [LoadingRNAFromSeurat.envs] prepared = true # Still load TCR/BCR data [ScRepLoading] # Continue with TCR-specific analyses [TOrBCellSelection] [CDR3Clustering] [ClonalStats]
Pattern 5: Multi-sample Integration
[LoadingRNAFromSeurat] [LoadingRNAFromSeurat.in] infile = [ "data/patient1.rds", "data/patient2.rds", "data/patient3.rds" ] [LoadingRNAFromSeurat.envs] # Each object has a "patient_id" column for sample identification sample = "patient_id" # SeuratPreparing will integrate multiple samples [SeuratPreparing] [SeuratClustering]
Dependencies
Upstream
- None (Entry point process)
- Can optionally work with
when TCR/BCR data is present (SampleInfo provides VDJ paths)SampleInfo
Downstream
- SeuratPreparing (if
)prepared = false- Performs QC, normalization, integration of loaded Seurat objects
- Required for standard analysis workflow
- SeuratClustering or SeuratClusteringOfAllCells (if
)clustered = false- Performs clustering analysis
- All downstream RNA analysis processes:
- SeuratClusterStats, ClusterMarkers, CellTypeAnnotation, SeuratMap2Ref, etc.
Validation Rules
File Format Requirements
- Supported formats: RDS (
/saveRDS()
) or qs/qs2 (readRDS()
/qs::qsave()
)qs::qread() - Content: Must contain a valid Seurat object
- File existence: Input files must exist at specified paths
- Sample column: If
parameter is not "Sample", the specified column must exist insampleobject@meta.data
Metadata Handling
- If
already contains a "Sample" column andmeta.data
:sample = "Sample"- No modification is made (symlink created to save space)
- If
column doesn't exist:sample- Error: Process fails with message "Sample column 'X' not found in metadata"
- If
column exists with custom name (not "Sample"):sample- A new "Sample" column is created by copying from the specified column
- Modified object is saved to output
SampleInfo Compatibility
- Mutually exclusive with RNAData: Cannot use both LoadingRNAFromSeurat and RNAData column in SampleInfo
- Compatible with TCRData/BCRData: Can use LoadingRNAFromSeurat for RNA + SampleInfo for VDJ data paths
- Required when: No SampleInfo section exists AND RNA data is needed
Environment Variable Validation
→ automatically setsclustered = true
(forced dependency)prepared = true
column must exist in Seurat object metadatasample- Boolean flags accept
/true
(case-insensitive in TOML)false
Troubleshooting
Issue: "Sample column not found in metadata"
Cause: The specified
sample column name doesn't exist in object@meta.data
Solution:
[LoadingRNAFromSeurat.envs] # Check your Seurat object's metadata: # colnames(seurat_obj@meta.data) sample = "actual_column_name" # Use the exact column name
Issue: SeuratPreparing still running despite prepared = true
prepared = trueCause: Configuration syntax error or caching issue Solution:
- Check TOML syntax (no quotes around boolean values)
- Clear cache:
[LoadingRNAFromSeurat] cache = "force" - Verify config is being loaded:
python -m immunopipe.validate_config config.toml
Issue: Multiple samples not being integrated
Cause: Sample column mapping incorrect or objects don't have the specified column Solution:
[LoadingRNAFromSeurat.in] infile = ["sample1.rds", "sample2.rds"] [LoadingRNAFromSeurat.envs] # Verify each object has this column before running sample = "orig.ident" # Common alternative to "Sample"
Issue: Want to use LoadingRNAFromSeurat but also have TCR data
Cause: Unclear how to specify TCR data paths Solution: Use both processes:
[LoadingRNAFromSeurat.in] infile = ["rna_seurat.rds"] [SampleInfo.in] infile = ["sample_info.txt"] # sample_info.txt only needs TCRData/BCRData columns (not RNAData)
Issue: Symlink error when Sample column already exists
Cause: Trying to create symlink when file exists Solution: This is handled automatically by the script - it removes existing file before creating symlink
Issue: Want to combine LoadingRNAFromSeurat with SampleInfo metadata
Cause: Need additional metadata columns Solution: Use
SeuratPreparing.envs.mutaters to add columns:
[SeuratPreparing.envs] mutaters = { "Condition" = "metadata$Condition", "Batch" = "metadata$Batch" }
Best Practices
- Always specify sample column: Even if default is "Sample", explicitly set it to avoid issues
- Check metadata before running: Use R to verify column names exist in
object@meta.data - Use
for re-analysis: Skip unnecessary preprocessing when objects are already preparedprepared = true - Use
cautiously: Only skip clustering if you're satisfied with existing clusteringclustered = true - Validate configuration: Run
before executing pipelinepython -m immunopipe.validate_config config.toml - Consider file size: Large RDS files can be slow to copy; use qs/qs2 format for better performance
Difference from SampleInfo
| Feature | SampleInfo | LoadingRNAFromSeurat |
|---|---|---|
| Input format | Raw count matrices (10X, loom) | Pre-processed Seurat objects |
| Data preparation | Always requires SeuratPreparing | Optional (can skip with ) |
| Metadata source | Sample info text file | Embedded in Seurat object |
| Multi-sample handling | Specified in text file | Multiple input files or single multi-sample object |
| TCR/BCR data support | Provides paths for RNA + VDJ | Only RNA (use SampleInfo for VDJ) |
| Integration | Required step | Depends on setting |
Workflow Integration
LoadingRNAFromSeurat replaces the standard
SampleInfo → SeuratPreparing entry point:
Standard workflow (raw data):
SampleInfo → SeuratPreparing → SeuratClustering → downstream analyses
With LoadingRNAFromSeurat (prepared data):
LoadingRNAFromSeurat → SeuratClustering → downstream analyses
With LoadingRNAFromSeurat (fully processed):
LoadingRNAFromSeurat → downstream analyses (skip SeuratClustering)
With TCR data:
LoadingRNAFromSeurat (RNA) + SampleInfo (VDJ paths) → ScRepLoading → TCR analyses