Claude-skill-registry immunopipe-config

Master skill for generating immunopipe pipeline configurations. Determines pipeline architecture based on data type (scRNA-seq with or without scTCR/BCR-seq) and analysis requirements. Routes to individual process skills for detailed configuration. Use this skill when starting a new immunopipe configuration or modifying pipeline-level options.

install

source · Clone the upstream repo

git clone https://github.com/majiayu000/claude-skill-registry

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/immunopipe-config" ~/.claude/skills/majiayu000-claude-skill-registry-immunopipe-config && rm -rf "$T"

manifest: skills/data/immunopipe-config/SKILL.md

source content

Immunopipe Configuration Generator (Main Skill)

Purpose: Master skill for generating immunopipe pipeline configurations. Routes to individual process skills and determines pipeline architecture based on analysis requirements.

When to Use This Skill

User wants to create/modify immunopipe configuration files
Need to determine which processes to enable based on analysis goals
Need to configure pipeline-level options (name, outdir, forks, scheduler)
Need routing to specific process configuration skills

Pipeline Architecture Decision Tree

Step 1: Data Type Assessment

Ask the user about their data:

Do you have scRNA-seq data?
- If YES → RNA analysis processes needed
- If NO → Cannot proceed (RNA data required)
Do you have scTCR-seq or scBCR-seq data?
- If YES → Enable TCR/BCR processes (TCR route)
- If NO → RNA-only analysis (No-TCR route)
Is your RNA data already processed in a Seurat object?
- If YES → Use
```
LoadingRNAFromSeurat
```
  instead of
```
SampleInfo
```
  +
```
SeuratPreparing
```
- If NO → Use standard input via
```
SampleInfo
```

Step 2: Analysis Goals

Ask what analyses they want to perform:

Goal	Required Processes	Routing
Basic clustering & visualization	`SampleInfo` , `SeuratPreparing` , `SeuratClustering` , `SeuratClusterStats`	Use `sampleinfo` , `seuratpreparing` , `seuratclustering` , `seuratclusterstats` skills
T/B cell selection	Add `TOrBCellSelection`	Use `torbcellselection` skill
Cell type annotation	Add `CellTypeAnnotation` or `SeuratMap2Ref`	Use `celltypeannotation` or `seuratmap2ref` skills
Marker finding	Add `ClusterMarkers` or `MarkersFinder`	Use `clustermarkers` or `markersfinder` skills
TCR clonotype analysis	Add `CDR3Clustering` , `TESSA` , `ClonalStats`	Use `cdr3clustering` , `tessa` , `clonalstats` skills
Cell-cell communication	Add `CellCellCommunication`	Use `cellcellcommunication` skill
Pathway enrichment	Add `ScFGSEA`	Use `scfgsea` skill
Metabolic analysis	Add `ScrnaMetabolicLandscape`	Use `scrnametaboliclandscape` skill
Differential expression	Add `PseudoBulkDEG`	Use `pseudobulkdeg` skill

Step 3: Essential vs Optional Processes

Essential Processes (always needed for TCR route):

```
SampleInfo
```
(or
```
LoadingRNAFromSeurat
```
)
```
ScRepLoading
```
(if TCR/BCR data present)
```
SeuratPreparing
```
(unless loading from prepared Seurat object)
```
SeuratClustering
```
```
SeuratClusterStats
```

Essential Processes (RNA-only route):

```
SampleInfo
```
(or
```
LoadingRNAFromSeurat
```
)
```
SeuratPreparing
```
```
SeuratClustering
```
```
SeuratClusterStats
```

Optional Processes (enable only if requested):

```
TOrBCellSelection
```
- T/B cell separation
```
SeuratClusteringOfAllCells
```
- Clustering before T/B selection
```
ClusterMarkersOfAllCells
```
- Markers before T/B selection
```
TopExpressingGenesOfAllCells
```
- Top genes before T/B selection
```
CellTypeAnnotation
```
- Automated cell type annotation
```
SeuratMap2Ref
```
- Reference-based annotation
```
SeuratSubClustering
```
- Sub-clustering analysis
```
ClusterMarkers
```
- Differential expression between clusters
```
TopExpressingGenes
```
- Top expressed genes per cluster
```
MarkersFinder
```
- Flexible marker finding
```
ModuleScoreCalculator
```
- Module/pathway scoring
```
ScRepCombiningExpression
```
- TCR + RNA integration
```
CDR3Clustering
```
- TCR CDR3 clustering
```
TESSA
```
- TCR-specific analysis
```
CDR3AAPhyschem
```
- CDR3 physicochemical properties
```
ClonalStats
```
- Clonality statistics
```
CellCellCommunication
```
- Ligand-receptor analysis
```
CellCellCommunicationPlots
```
- Communication plots
```
ScFGSEA
```
- Fast gene set enrichment
```
PseudoBulkDEG
```
- Pseudo-bulk differential expression
```
ScrnaMetabolicLandscape
```
- Comprehensive metabolic analysis

Pipeline-Level Configuration

Basic Pipeline Options

name = "my_pipeline"           # Pipeline name (affects workdir and outdir)
outdir = "./output"            # Output directory (default: ./<name>-output)
loglevel = "info"              # Logging level: debug, info, warning, error
forks = 4                      # Number of parallel jobs (adjust based on CPU cores)
cache = true                   # Enable caching (recommended)
error_strategy = "halt"        # halt, ignore, or retry
num_retries = 3                # Number of retries if error_strategy = "retry"

Scheduler Configuration

Local execution (default):

scheduler = "local"

SLURM cluster:

scheduler = "slurm"

[scheduler_opts]
qsub_opts = "-p general -q general -N {job.name} -t {job.index}"

SGE cluster:

scheduler = "sge"

[scheduler_opts]
qsub_opts = "-V -cwd -j yes"

Google Cloud Batch:

# Use: immunopipe gbatch instead of immunopipe
# See gbatch skill for configuration

Plugin Options

[plugin_opts.report]
filters = ["name:Filter"]  # Filter processes in report

[plugin_opts.runinfo]
# Runinfo plugin enabled by default

Routing to Process Skills

When user needs specific process configuration, route to the appropriate skill:

Core Input Processes

SampleInfo: Use
```
sampleinfo
```
skill
LoadingRNAFromSeurat: Use
```
loadingrnafromseurat
```
skill
ScRepLoading: Use
```
screploading
```
skill

Preprocessing Processes

SeuratPreparing: Use
```
seuratpreparing
```
skill

Clustering Processes

SeuratClustering: Use
```
seuratclustering
```
skill
SeuratClusteringOfAllCells: Use
```
seuratclusteringofallcells
```
skill
SeuratSubClustering: Use
```
seuratsubclustering
```
skill

Cell Selection

TOrBCellSelection: Use
```
torbcellselection
```
skill

Annotation Processes

CellTypeAnnotation: Use
```
celltypeannotation
```
skill
SeuratMap2Ref: Use
```
seuratmap2ref
```
skill

Marker Analysis

ClusterMarkers: Use
```
clustermarkers
```
skill
ClusterMarkersOfAllCells: Use
```
clustermarkersofallcells
```
skill
MarkersFinder: Use
```
markersfinder
```
skill
TopExpressingGenes: Use
```
topexpressinggenes
```
skill
TopExpressingGenesOfAllCells: Use
```
topexpressinggenesofallcells
```
skill

TCR/BCR Analysis

ScRepCombiningExpression: Use
```
screpcombiningexpression
```
skill
CDR3Clustering: Use
```
cdr3clustering
```
skill
TESSA: Use
```
tessa
```
skill
CDR3AAPhyschem: Use
```
cdr3aaphyschem
```
skill
ClonalStats: Use
```
clonalstats
```
skill

Downstream Analysis

ModuleScoreCalculator: Use
```
modulescorecalculator
```
skill
CellCellCommunication: Use
```
cellcellcommunication
```
skill
CellCellCommunicationPlots: Use
```
cellcellcommunicationplots
```
skill
SeuratClusterStats: Use
```
seuratclusterstats
```
skill
ScFGSEA: Use
```
scfgsea
```
skill
PseudoBulkDEG: Use
```
pseudobulkdeg
```
skill

Metabolic Analysis

ScrnaMetabolicLandscape: Use
```
scrnametaboliclandscape
```
skill

Configuration File Structure

A complete TOML configuration file has three sections:

# 1. PIPELINE-LEVEL OPTIONS
name = "my_pipeline"
outdir = "./output"
forks = 4

# 2. PROCESS-LEVEL OPTIONS
[ProcessName]
cache = true
forks = 2  # Override pipeline-level forks for this process

[ProcessName.in]
# Input files specification

[ProcessName.envs]
# Environment variables (process parameters)

# 3. GOOGLE BATCH OPTIONS (if using immunopipe gbatch)
[cli-gbatch]
project = "my-gcp-project"
region = "us-central1"

Example Workflows

Example 1: Basic TCR Analysis

User request: "I have scRNA-seq and scTCR-seq data. I want basic analysis with T cell selection."

Response:

Enable essential TCR processes:

SampleInfo

ScRepLoading

SeuratPreparing

SeuratClustering

SeuratClusterStats

Enable T cell selection:

SeuratClusteringOfAllCells

TOrBCellSelection

Route to
```
sampleinfo
```
skill to configure input files
Route to each process skill for configuration

Minimal config:

name = "tcr_analysis"
forks = 4

[SampleInfo.in]
infile = ["sample_info.txt"]

[SeuratClusteringOfAllCells]
[TOrBCellSelection]

Example 2: Advanced RNA-only Analysis

User request: "RNA-only data. I need clustering, cell type annotation, marker finding, and pathway enrichment."

Response:

Enable essential RNA processes:

SampleInfo

SeuratPreparing

SeuratClustering

SeuratClusterStats

Add requested analyses:
```
CellTypeAnnotation
```
,
```
ClusterMarkers
```
,
```
ScFGSEA
```
Route to individual skills for configuration

Example 3: Loading from Prepared Seurat Object

User request: "I already have a processed Seurat object. I want to run TCR analysis."

Response:

Use

LoadingRNAFromSeurat

instead of

SampleInfo

SeuratPreparing

Enable TCR processes:
```
ScRepLoading
```
,
```
SeuratClustering
```
, etc.
Set
```
prepared = true
```
in
```
LoadingRNAFromSeurat
```
to skip preprocessing

Important Notes

Process Dependencies

Some processes have dependencies:

```
ScRepCombiningExpression
```
requires both
```
ScRepLoading
```
and RNA input
```
ClusterMarkers
```
requires
```
SeuratClustering
```

TOrBCellSelection

usually follows

SeuratClusteringOfAllCells

```
CellCellCommunication
```
requires clustering to be complete

Mutually Exclusive Options

Use EITHER
```
SampleInfo
```
OR
```
LoadingRNAFromSeurat
```
as entry point (not both)

If using

TOrBCellSelection

, typically enable

SeuratClusteringOfAllCells

first

```
CellTypeAnnotation
```
and
```
SeuratMap2Ref
```
serve similar purposes (can use both, but one usually sufficient)

Cache Strategy

Set
```
cache = "force"
```
at pipeline level to reuse all previous results
Set
```
cache = false
```
for specific process to force re-run
Useful when tweaking visualization parameters without re-running analysis

Configuration Validation

After generating configuration, validate with:

python -m immunopipe.validate_config config.toml

External References

When process options reference external packages, expand them:

Seurat Functions

When seeing
```
Seurat::FunctionName
```
, check: https://satijalab.org/seurat/reference/

Common functions:

FindMarkers()

FindClusters()

SCTransform()

RunUMAP()

Plotthis Functions

Plot types map to functions:
```
bar
```
→
```
BarPlot
```
,
```
box
```
→
```
BoxPlot
```
Full reference: https://pwwang.github.io/plotthis/reference/

DESeq2 Design

For
```
PseudoBulkDEG
```
, design formulas use DESeq2 syntax
Reference: https://bioconductor.org/packages/release/bioc/html/DESeq2.html

GSEA Databases

For
```
ScFGSEA
```
, GMT files from MSigDB
Reference: https://www.gsea-msigdb.org/gsea/msigdb/

CellChat Database

For
```
CellCellCommunication
```
, CellChat databases
Reference: http://www.cellchat.org/

Workflow Summary

Assess data type (RNA-only vs TCR/BCR)
Determine analysis goals (clustering, annotation, TCR analysis, etc.)
Select essential processes based on data type
Add optional processes based on goals
Configure pipeline-level options (name, forks, scheduler)
Route to individual process skills for detailed configuration
Generate complete TOML file
Validate configuration before running

Quick Start Templates

For quick starts, use these templates:

Basic TCR:
```
basic-tcr
```
template skill
Basic RNA-only:
```
basic-rna
```
template skill
Advanced TCR:
```
advanced-tcr
```
template skill
Metabolic analysis:
```
metabolic
```
template skill
Cell communication:
```
communication
```
template skill

Error Prevention

Common configuration errors to avoid:

Missing input specification: Always set
```
[ProcessName.in]
```
for entry processes
TCR data without ScRepLoading: If TCRData/BCRData columns exist, enable
```
ScRepLoading
```
Contradictory process enablement: Don't enable both "OfAllCells" and regular versions without
```
TOrBCellSelection
```
Invalid gene names: Use human gene symbols (uppercase) or mouse (title case)
Path issues: Use absolute paths or paths relative to config file location
Resource limits: Set appropriate
```
forks
```
based on available CPU/memory

Next Steps

After generating config:

Save to
```
.toml
```
file (e.g.,
```
config.toml
```
)
Run:
```
immunopipe config.toml
```
Or use web UI:
```
pipen board @config.toml
```
Or use Google Batch:
```
immunopipe gbatch config.toml
```

For modifications, route to specific process skills based on what needs to change.