Claude-skill-registry immunopipe-config
Master skill for generating immunopipe pipeline configurations. Determines pipeline architecture based on data type (scRNA-seq with or without scTCR/BCR-seq) and analysis requirements. Routes to individual process skills for detailed configuration. Use this skill when starting a new immunopipe configuration or modifying pipeline-level options.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/immunopipe-config" ~/.claude/skills/majiayu000-claude-skill-registry-immunopipe-config && rm -rf "$T"
skills/data/immunopipe-config/SKILL.md
Immunopipe Configuration Generator (Main Skill)
Purpose: Master skill for generating immunopipe pipeline configurations. Routes to individual process skills and determines pipeline architecture based on analysis requirements.
When to Use This Skill
- User wants to create/modify immunopipe configuration files
- Need to determine which processes to enable based on analysis goals
- Need to configure pipeline-level options (name, outdir, forks, scheduler)
- Need routing to specific process configuration skills
Pipeline Architecture Decision Tree
Step 1: Data Type Assessment
Ask the user about their data:
- Do you have scRNA-seq data?
  - If YES → RNA analysis processes needed
  - If NO → Cannot proceed (RNA data required)
- Do you have scTCR-seq or scBCR-seq data?
  - If YES → Enable TCR/BCR processes (TCR route)
  - If NO → RNA-only analysis (No-TCR route)
- Is your RNA data already processed in a Seurat object?
  - If YES → Use LoadingRNAFromSeurat instead of SampleInfo + SeuratPreparing
  - If NO → Use standard input via SampleInfo
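The three questions above can be read as a small decision function. The following is a minimal sketch, not part of immunopipe itself: `DataAnswers` and `choose_route` are hypothetical names introduced only for illustration, while the route labels and process names come from the text above.

```python
from dataclasses import dataclass


@dataclass
class DataAnswers:
    """Hypothetical container for the three Step 1 answers."""
    has_rna: bool
    has_tcr_or_bcr: bool
    rna_in_seurat_object: bool


def choose_route(answers):
    """Return (route, entry_process) for the given answers."""
    if not answers.has_rna:
        # RNA data is required; without it the pipeline cannot be configured.
        raise ValueError("Cannot proceed: scRNA-seq data is required")
    route = "TCR" if answers.has_tcr_or_bcr else "No-TCR"
    entry = "LoadingRNAFromSeurat" if answers.rna_in_seurat_object else "SampleInfo"
    return route, entry


print(choose_route(DataAnswers(True, True, False)))  # ('TCR', 'SampleInfo')
```

Keeping the route and the entry process as two separate results mirrors the questions: TCR/BCR availability and Seurat-object availability are independent choices.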
Step 2: Analysis Goals
Ask what analyses they want to perform:
| Goal | Required Processes | Routing |
|---|---|---|
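| Basic clustering & visualization | SampleInfo, SeuratPreparing, SeuratClustering, SeuratClusterStats | Use sampleinfo, seuratpreparing, seuratclustering, seuratclusterstats skills |
| T/B cell selection | Add TOrBCellSelection | Use torbcellselection skill |
| Cell type annotation | Add CellTypeAnnotation or SeuratMap2Ref | Use celltypeannotation or seuratmap2ref skills |
| Marker finding | Add ClusterMarkers or MarkersFinder | Use clustermarkers or markersfinder skills |
| TCR clonotype analysis | Add the TCR/BCR analysis processes needed (see TCR/BCR Analysis under Routing to Process Skills) | Use the matching TCR/BCR skills |
| Cell-cell communication | Add CellCellCommunication | Use cellcellcommunication skill |
| Pathway enrichment | Add ScFGSEA | Use scfgsea skill |
| Metabolic analysis | Add ScrnaMetabolicLandscape | Use scrnametaboliclandscape skill |
| Differential expression | Add PseudoBulkDEG | Use pseudobulkdeg skill |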
Step 3: Essential vs Optional Processes
Essential Processes (always needed for TCR route):
- SampleInfo (or LoadingRNAFromSeurat)
- ScRepLoading (if TCR/BCR data present)
- SeuratPreparing (unless loading from prepared Seurat object)
- SeuratClustering
- SeuratClusterStats
Essential Processes (RNA-only route):
- SampleInfo (or LoadingRNAFromSeurat)
- SeuratPreparing
- SeuratClustering
- SeuratClusterStats
Optional Processes (enable only if requested):
- TOrBCellSelection: T/B cell separation
- SeuratClusteringOfAllCells: Clustering before T/B selection
- ClusterMarkersOfAllCells: Markers before T/B selection
- TopExpressingGenesOfAllCells: Top genes before T/B selection
- CellTypeAnnotation: Automated cell type annotation
- SeuratMap2Ref: Reference-based annotation
- SeuratSubClustering: Sub-clustering analysis
- ClusterMarkers: Differential expression between clusters
- TopExpressingGenes: Top expressed genes per cluster
- MarkersFinder: Flexible marker finding
- ModuleScoreCalculator: Module/pathway scoring
- ScRepCombiningExpression: TCR + RNA integration
- CDR3Clustering: TCR CDR3 clustering
- TESSA: TCR-specific analysis
- CDR3AAPhyschem: CDR3 physicochemical properties
- ClonalStats: Clonality statistics
- CellCellCommunication: Ligand-receptor analysis
- CellCellCommunicationPlots: Communication plots
- ScFGSEA: Fast gene set enrichment
- PseudoBulkDEG: Pseudo-bulk differential expression
- ScrnaMetabolicLandscape: Comprehensive metabolic analysis
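As an illustration of how the essential and optional lists combine, here is a minimal sketch that assembles the set of processes to enable. `select_processes` is a hypothetical helper, not an immunopipe API; it assumes that SeuratPreparing is skipped only when loading from an already prepared Seurat object, as described above.

```python
# Optional process names, copied from the list above.
OPTIONAL_PROCESSES = {
    "TOrBCellSelection", "SeuratClusteringOfAllCells", "ClusterMarkersOfAllCells",
    "TopExpressingGenesOfAllCells", "CellTypeAnnotation", "SeuratMap2Ref",
    "SeuratSubClustering", "ClusterMarkers", "TopExpressingGenes", "MarkersFinder",
    "ModuleScoreCalculator", "ScRepCombiningExpression", "CDR3Clustering", "TESSA",
    "CDR3AAPhyschem", "ClonalStats", "CellCellCommunication",
    "CellCellCommunicationPlots", "ScFGSEA", "PseudoBulkDEG",
    "ScrnaMetabolicLandscape",
}


def select_processes(has_tcr_or_bcr, prepared_seurat=False, requested=()):
    """Return essential processes for the route plus the requested optional ones."""
    unknown = sorted(set(requested) - OPTIONAL_PROCESSES)
    if unknown:
        raise ValueError(f"Unknown optional processes: {unknown}")
    processes = ["LoadingRNAFromSeurat" if prepared_seurat else "SampleInfo"]
    if has_tcr_or_bcr:
        processes.append("ScRepLoading")
    if not prepared_seurat:
        # Assumption: preprocessing is only skipped for a prepared Seurat object.
        processes.append("SeuratPreparing")
    processes += ["SeuratClustering", "SeuratClusterStats"]
    for name in requested:
        if name not in processes:
            processes.append(name)
    return processes


print(select_processes(True, requested=["TOrBCellSelection"]))
```

Rejecting unknown names early keeps typos in process names from silently producing an empty TOML section later.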
Pipeline-Level Configuration
Basic Pipeline Options
    name = "my_pipeline"        # Pipeline name (affects workdir and outdir)
    outdir = "./output"         # Output directory (default: ./<name>-output)
    loglevel = "info"           # Logging level: debug, info, warning, error
    forks = 4                   # Number of parallel jobs (adjust based on CPU cores)
    cache = true                # Enable caching (recommended)
    error_strategy = "halt"     # halt, ignore, or retry
    num_retries = 3             # Number of retries if error_strategy = "retry"
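Before writing these options to a file, it can help to sanity-check the values against the choices listed in the comments. The sketch below is one possible check, assuming the options are held in a plain dictionary; `check_pipeline_options` is a hypothetical helper and only inspects keys that are present. The accepted `cache` values (true, false, "force") follow the Cache Strategy notes in this document.

```python
LOGLEVELS = {"debug", "info", "warning", "error"}
ERROR_STRATEGIES = {"halt", "ignore", "retry"}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def check_pipeline_options(opts):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if "name" in opts and (not isinstance(opts["name"], str) or not opts["name"]):
        problems.append("name must be a non-empty string")
    if "forks" in opts and (not _is_int(opts["forks"]) or opts["forks"] < 1):
        problems.append("forks must be a positive integer")
    if "loglevel" in opts and opts["loglevel"] not in LOGLEVELS:
        problems.append("loglevel must be one of: " + ", ".join(sorted(LOGLEVELS)))
    if "error_strategy" in opts and opts["error_strategy"] not in ERROR_STRATEGIES:
        problems.append("error_strategy must be halt, ignore, or retry")
    if "num_retries" in opts and (not _is_int(opts["num_retries"]) or opts["num_retries"] < 0):
        problems.append("num_retries must be a non-negative integer")
    if "cache" in opts and not (isinstance(opts["cache"], bool) or opts["cache"] == "force"):
        problems.append('cache must be true, false, or "force"')
    return problems


print(check_pipeline_options({"name": "my_pipeline", "forks": 0, "loglevel": "verbose"}))
```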
Scheduler Configuration
Local execution (default):
scheduler = "local"
SLURM cluster:
    scheduler = "slurm"

    [scheduler_opts]
    qsub_opts = "-p general -q general -N {job.name} -t {job.index}"
SGE cluster:
    scheduler = "sge"

    [scheduler_opts]
    qsub_opts = "-V -cwd -j yes"
Google Cloud Batch:
    # Use: immunopipe gbatch instead of immunopipe
    # See gbatch skill for configuration
Plugin Options
    [plugin_opts.report]
    filters = ["name:Filter"]  # Filter processes in report

    [plugin_opts.runinfo]
    # Runinfo plugin enabled by default
Routing to Process Skills
When user needs specific process configuration, route to the appropriate skill:
Core Input Processes
- SampleInfo: Use sampleinfo skill
- LoadingRNAFromSeurat: Use loadingrnafromseurat skill
- ScRepLoading: Use screploading skill
Preprocessing Processes
- SeuratPreparing: Use seuratpreparing skill
Clustering Processes
- SeuratClustering: Use seuratclustering skill
- SeuratClusteringOfAllCells: Use seuratclusteringofallcells skill
- SeuratSubClustering: Use seuratsubclustering skill
Cell Selection
- TOrBCellSelection: Use torbcellselection skill
Annotation Processes
- CellTypeAnnotation: Use celltypeannotation skill
- SeuratMap2Ref: Use seuratmap2ref skill
Marker Analysis
- ClusterMarkers: Use clustermarkers skill
- ClusterMarkersOfAllCells: Use clustermarkersofallcells skill
- MarkersFinder: Use markersfinder skill
- TopExpressingGenes: Use topexpressinggenes skill
- TopExpressingGenesOfAllCells: Use topexpressinggenesofallcells skill
TCR/BCR Analysis
- ScRepCombiningExpression: Use screpcombiningexpression skill
- CDR3Clustering: Use cdr3clustering skill
- TESSA: Use tessa skill
- CDR3AAPhyschem: Use cdr3aaphyschem skill
- ClonalStats: Use clonalstats skill
Downstream Analysis
- ModuleScoreCalculator: Use modulescorecalculator skill
- CellCellCommunication: Use cellcellcommunication skill
- CellCellCommunicationPlots: Use cellcellcommunicationplots skill
- SeuratClusterStats: Use seuratclusterstats skill
- ScFGSEA: Use scfgsea skill
- PseudoBulkDEG: Use pseudobulkdeg skill
Metabolic Analysis
- ScrnaMetabolicLandscape: Use scrnametaboliclandscape skill
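In the routing lists above, every skill name is the process name in lowercase, so the lookup can be sketched in a few lines. `skill_for` is a hypothetical helper used here only to illustrate that naming convention.

```python
# Process names as listed in the routing sections above.
PROCESSES = {
    "SampleInfo", "LoadingRNAFromSeurat", "ScRepLoading", "SeuratPreparing",
    "SeuratClustering", "SeuratClusteringOfAllCells", "SeuratSubClustering",
    "TOrBCellSelection", "CellTypeAnnotation", "SeuratMap2Ref", "ClusterMarkers",
    "ClusterMarkersOfAllCells", "MarkersFinder", "TopExpressingGenes",
    "TopExpressingGenesOfAllCells", "ScRepCombiningExpression", "CDR3Clustering",
    "TESSA", "CDR3AAPhyschem", "ClonalStats", "ModuleScoreCalculator",
    "CellCellCommunication", "CellCellCommunicationPlots", "SeuratClusterStats",
    "ScFGSEA", "PseudoBulkDEG", "ScrnaMetabolicLandscape",
}


def skill_for(process_name):
    """Map a process name to its skill name (the lowercase process name)."""
    if process_name not in PROCESSES:
        raise KeyError(f"No routing entry for process: {process_name}")
    return process_name.lower()


print(skill_for("SeuratMap2Ref"))  # seuratmap2ref
```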
Configuration File Structure
A complete TOML configuration file has three sections:
    # 1. PIPELINE-LEVEL OPTIONS
    name = "my_pipeline"
    outdir = "./output"
    forks = 4

    # 2. PROCESS-LEVEL OPTIONS
    [ProcessName]
    cache = true
    forks = 2  # Override pipeline-level forks for this process

    [ProcessName.in]
    # Input files specification

    [ProcessName.envs]
    # Environment variables (process parameters)

    # 3. GOOGLE BATCH OPTIONS (if using immunopipe gbatch)
    [cli-gbatch]
    project = "my-gcp-project"
    region = "us-central1"
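To show how this layout could be produced programmatically, the following minimal sketch renders pipeline-level options and per-process tables to TOML text. `toml_value` and `render_config` are hypothetical helpers written for this illustration; they cover only strings, booleans, numbers, and flat lists, and a real TOML library would be a safer choice for anything more complex.

```python
def toml_value(value):
    """Render a Python value as a TOML literal (strings, bools, numbers, lists only)."""
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(toml_value(item) for item in value) + "]"
    raise TypeError(f"Unsupported value type: {type(value).__name__}")


def render_config(pipeline, processes):
    """Pipeline-level keys first, then one [Process] table per process.

    Inside a process, dict values become sub-tables such as [Process.in]
    or [Process.envs]; other values are process-level options.
    """
    lines = [f"{key} = {toml_value(val)}" for key, val in pipeline.items()]
    for proc, options in processes.items():
        lines += ["", f"[{proc}]"]
        subtables = {}
        for key, val in options.items():
            if isinstance(val, dict):
                subtables[key] = val
            else:
                lines.append(f"{key} = {toml_value(val)}")
        for sub, entries in subtables.items():
            lines += ["", f"[{proc}.{sub}]"]
            lines += [f"{key} = {toml_value(val)}" for key, val in entries.items()]
    return "\n".join(lines) + "\n"


print(render_config(
    {"name": "tcr_analysis", "forks": 4},
    {"SampleInfo": {"in": {"infile": ["sample_info.txt"]}}, "TOrBCellSelection": {}},
))
```

An empty dictionary for a process still emits its `[ProcessName]` header, which is how optional processes are enabled in the examples in this document.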
Example Workflows
Example 1: Basic TCR Analysis
User request: "I have scRNA-seq and scTCR-seq data. I want basic analysis with T cell selection."
Response:
- Enable essential TCR processes: SampleInfo, ScRepLoading, SeuratPreparing, SeuratClustering, SeuratClusterStats
- Enable T cell selection: SeuratClusteringOfAllCells, TOrBCellSelection
- Route to sampleinfo skill to configure input files
- Route to each process skill for configuration
Minimal config:
    name = "tcr_analysis"
    forks = 4

    [SampleInfo.in]
    infile = ["sample_info.txt"]

    [SeuratClusteringOfAllCells]

    [TOrBCellSelection]
Example 2: Advanced RNA-only Analysis
User request: "RNA-only data. I need clustering, cell type annotation, marker finding, and pathway enrichment."
Response:
- Enable essential RNA processes: SampleInfo, SeuratPreparing, SeuratClustering, SeuratClusterStats
- Add requested analyses: CellTypeAnnotation, ClusterMarkers, ScFGSEA
- Route to individual skills for configuration
Example 3: Loading from Prepared Seurat Object
User request: "I already have a processed Seurat object. I want to run TCR analysis."
Response:
- Use LoadingRNAFromSeurat instead of SampleInfo + SeuratPreparing
- Enable TCR processes: ScRepLoading, SeuratClustering, etc.
- Set prepared = true in LoadingRNAFromSeurat to skip preprocessing
Important Notes
Process Dependencies
Some processes have dependencies:
- ScRepCombiningExpression requires both ScRepLoading and RNA input
- ClusterMarkers requires SeuratClustering
- TOrBCellSelection usually follows SeuratClusteringOfAllCells
- CellCellCommunication requires clustering to be complete
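The hard dependencies above can be checked mechanically once the list of enabled processes is known. This is a minimal sketch under two stated assumptions: "RNA input" is taken to mean that one of the entry processes (SampleInfo or LoadingRNAFromSeurat) is enabled, and "clustering to be complete" is taken to mean SeuratClustering. `REQUIRES` and `missing_requirements` are hypothetical names; the softer "usually follows" relationship is left out.

```python
# Each requirement is a group of alternatives; one enabled member satisfies the group.
REQUIRES = {
    "ScRepCombiningExpression": [
        ("ScRepLoading",),
        ("SampleInfo", "LoadingRNAFromSeurat"),  # assumption: stands in for "RNA input"
    ],
    "ClusterMarkers": [("SeuratClustering",)],
    "CellCellCommunication": [("SeuratClustering",)],  # assumption: "clustering complete"
}


def missing_requirements(enabled):
    """Return {process: [unmet requirement groups]} for the enabled processes."""
    enabled = set(enabled)
    missing = {}
    for proc, groups in REQUIRES.items():
        if proc not in enabled:
            continue
        unmet = [group for group in groups if not enabled.intersection(group)]
        if unmet:
            missing[proc] = unmet
    return missing


print(missing_requirements(["SampleInfo", "ScRepCombiningExpression"]))
# {'ScRepCombiningExpression': [('ScRepLoading',)]}
```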
Mutually Exclusive Options
- Use EITHER SampleInfo OR LoadingRNAFromSeurat as entry point (not both)
- If using TOrBCellSelection, typically enable SeuratClusteringOfAllCells first
- CellTypeAnnotation and SeuratMap2Ref serve similar purposes (can use both, but one usually sufficient)
Cache Strategy
- Set cache = "force" at pipeline level to reuse all previous results
- Set cache = false for specific process to force re-run
- Useful when tweaking visualization parameters without re-running analysis
Configuration Validation
After generating configuration, validate with:
python -m immunopipe.validate_config config.toml
External References
When process options reference external packages, expand them:
Seurat Functions
- When seeing Seurat::FunctionName, check: https://satijalab.org/seurat/reference/
- Common functions: FindMarkers(), FindClusters(), SCTransform(), RunUMAP()
Plotthis Functions
- Plot types map to functions: bar → BarPlot, box → BoxPlot
- Full reference: https://pwwang.github.io/plotthis/reference/
DESeq2 Design
- For PseudoBulkDEG, design formulas use DESeq2 syntax
- Reference: https://bioconductor.org/packages/release/bioc/html/DESeq2.html
GSEA Databases
- For ScFGSEA, GMT files from MSigDB
- Reference: https://www.gsea-msigdb.org/gsea/msigdb/
CellChat Database
- For CellCellCommunication, CellChat databases
- Reference: http://www.cellchat.org/
Workflow Summary
- Assess data type (RNA-only vs TCR/BCR)
- Determine analysis goals (clustering, annotation, TCR analysis, etc.)
- Select essential processes based on data type
- Add optional processes based on goals
- Configure pipeline-level options (name, forks, scheduler)
- Route to individual process skills for detailed configuration
- Generate complete TOML file
- Validate configuration before running
Quick Start Templates
For quick starts, use these templates:
- Basic TCR: basic-tcr template skill
- Basic RNA-only: basic-rna template skill
- Advanced TCR: advanced-tcr template skill
- Metabolic analysis: metabolic template skill
- Cell communication: communication template skill
Error Prevention
Common configuration errors to avoid:
- Missing input specification: Always set [ProcessName.in] for entry processes
- TCR data without ScRepLoading: If TCRData/BCRData columns exist, enable ScRepLoading
- Contradictory process enablement: Don't enable both "OfAllCells" and regular versions without TOrBCellSelection
- Invalid gene names: Use human gene symbols (uppercase) or mouse (title case)
- Path issues: Use absolute paths or paths relative to config file location
- Resource limits: Set appropriate forks based on available CPU/memory
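Several of these errors can be caught with a simple check before the configuration is written. The sketch below covers three of them: the single entry point rule from the Mutually Exclusive Options section, TCR/BCR columns without ScRepLoading, and "OfAllCells" processes enabled alongside their regular versions without TOrBCellSelection. `lint_enabled_processes` and its `sample_columns` argument are hypothetical; the column names TCRData and BCRData are taken from the list above.

```python
ENTRY_PROCESSES = {"SampleInfo", "LoadingRNAFromSeurat"}
SUFFIX = "OfAllCells"


def lint_enabled_processes(enabled, sample_columns=()):
    """Return a list of problems found in the set of enabled processes."""
    enabled = set(enabled)
    problems = []

    entries = sorted(enabled & ENTRY_PROCESSES)
    if len(entries) != 1:
        found = ", ".join(entries) or "none"
        problems.append(f"Enable exactly one entry process; found: {found}")

    if {"TCRData", "BCRData"} & set(sample_columns) and "ScRepLoading" not in enabled:
        problems.append("TCRData/BCRData columns present but ScRepLoading is not enabled")

    if "TOrBCellSelection" not in enabled:
        for name in sorted(enabled):
            # e.g. SeuratClusteringOfAllCells together with SeuratClustering
            if name.endswith(SUFFIX) and name[: -len(SUFFIX)] in enabled:
                problems.append(f"{name} enabled with its regular version but without TOrBCellSelection")

    return problems


print(lint_enabled_processes(["SampleInfo", "SeuratClustering", "SeuratClusteringOfAllCells"]))
```

Gene-name and path checks are left out because they depend on the species and on the file system, which this sketch does not model.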
Next Steps
After generating config:
- Save to .toml file (e.g., config.toml)
- Run: immunopipe config.toml
- Or use web UI: pipen board @config.toml
- Or use Google Batch: immunopipe gbatch config.toml
For modifications, route to specific process skills based on what needs to change.