Claude-skill-registry clonalstats
Generate comprehensive clonality statistics and diversity visualizations for TCR/BCR repertoire analysis. Quantifies clonal expansion, measures diversity metrics (Shannon, Simpson, Gini), and creates publication-ready plots.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/clonalstats" ~/.claude/skills/majiayu000-claude-skill-registry-clonalstats && rm -rf "$T"
manifest: skills/data/clonalstats/SKILL.md
ClonalStats Process Configuration
Purpose
Generate comprehensive clonality statistics and diversity visualizations for TCR/BCR repertoire analysis. Quantifies clonal expansion, measures diversity metrics (Shannon, Simpson, Gini), and creates publication-ready plots.
When to Use
- To quantify clonal expansion patterns in TCR/BCR data
- For diversity analysis comparing multiple samples or conditions
- To identify hyperexpanded clones and their distribution
- For rarefaction analysis to assess sampling depth
- After ScRepCombiningExpression, to analyze integrated TCR+RNA data
Configuration Structure
Process Enablement
    [ClonalStats]
    cache = true
Input Specification
    [ClonalStats.in]
    screpfile = ["ScRepCombiningExpression"]
Core Environment Variables
    [ClonalStats.envs]
    # Clone definition: "gene" (VDJC), "aa" (CDR3 amino acid), "nt" (CDR3 nucleotide)
    clone_call = "aa"
    # Chain analysis: "both", "TRA", "TRB", "TRG", "IGH", "IGL"
    chain = "both"
    # Data transformations (dplyr::mutate syntax)
    mutaters = {}
    # Data filtering (dplyr::filter syntax)
    subset = null
    # Output device parameters
    devpars = {width = 800, height = 600, res = 100}
    # Save code and data (large files - use with caution)
    save_code = false
    save_data = false
Case-Based Plot Generation
    [ClonalStats.envs.cases."Case Name"]
    viz_type = "volume"  # volume, abundance, length, residency, stat,
                         # composition, overlap, diversity, geneusage,
                         # positional, kmer, rarefaction
Diversity Metrics
| Metric | Range | Interpretation | Best For |
|---|---|---|---|
| shannon | 0 - ∞ | Higher = more diversity | General comparison |
| inv.simpson | 1 - ∞ | Higher = more diversity | Common clones |
| gini.coeff | 0 - 1 | 0 = equality, 1 = inequality | Clonality dominance |
| norm.entropy | 0 - 1 | Higher = more diversity | Evenness-focused |
| chao1 | ≥ richness | Estimates total richness | Small samples |
| d50 | Count | Number of top clones making up 50% of the repertoire | Practical dominance |
Interpretation:
- High diversity = Many unique clones, even distribution (healthy repertoire)
- Low diversity = Few dominant clones (antigen-specific response, infection, cancer)
- Gini ≈ 1 = Very skewed, few clones dominate
- Gini ≈ 0 = Even distribution
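To make the table concrete, the indices can be computed from a list of per-clone cell counts. The following is a minimal sketch using textbook definitions (natural-log Shannon entropy, bias-corrected Chao1); the function name `diversity_metrics` is hypothetical, and the process itself may use different estimators, log bases, or resampling, so treat the numbers as illustrative only.

```python
import math


def diversity_metrics(counts):
    """Compute common diversity indices from per-clone cell counts.

    Textbook formulas only; this is an illustrative assumption, not the
    code the ClonalStats process runs.
    """
    counts = sorted((c for c in counts if c > 0), reverse=True)
    if not counts:
        raise ValueError("at least one clone with a positive count is required")

    n = sum(counts)   # total number of cells
    s = len(counts)   # observed richness (number of distinct clones)
    props = [c / n for c in counts]

    shannon = -sum(p * math.log(p) for p in props)
    inv_simpson = 1.0 / sum(p * p for p in props)
    norm_entropy = shannon / math.log(s) if s > 1 else 0.0

    # Gini coefficient via the rank formula on ascending counts.
    ascending = counts[::-1]
    gini = sum((2 * i - s - 1) * x for i, x in enumerate(ascending, start=1)) / (s * n)

    # Bias-corrected Chao1: S_obs + f1 * (f1 - 1) / (2 * (f2 + 1)),
    # where f1 and f2 are the numbers of singleton and doubleton clones.
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    chao1 = s + f1 * (f1 - 1) / (2 * (f2 + 1))

    # D50: smallest number of top clones covering at least half of all cells.
    running, d50 = 0, 0
    for c in counts:
        running += c
        d50 += 1
        if running >= n / 2:
            break

    return {
        "shannon": shannon,
        "inv.simpson": inv_simpson,
        "gini.coeff": gini,
        "norm.entropy": norm_entropy,
        "chao1": chao1,
        "d50": d50,
    }
```

For example, `diversity_metrics([10, 10, 10, 10])` describes a perfectly even repertoire (Gini of 0, normalized entropy of 1), while `diversity_metrics([97, 1, 1, 1])` describes one dominated by a single clone (Gini of 0.72, D50 of 1).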
Visualization Types
viz_type options:
- volume: Number of clones per sample/group
- abundance: Clone abundance distribution (trend/histogram/density)
- length: CDR3 sequence length distribution
- residency: Clones present across groups (venn/upset)
- stat: Expanded clone analysis (pies/sankey)
- diversity: Diversity metrics (bar/box/violin)
- geneusage: V/D/J gene usage frequency
- rarefaction: Sampling depth assessment
Configuration Examples
Minimal Configuration
    [ClonalStats.in]
    screpfile = ["ScRepCombiningExpression"]
Standard Diversity Analysis
    [ClonalStats.in]
    screpfile = ["ScRepCombiningExpression"]

    [ClonalStats.envs.cases."Diversity"]
    viz_type = "diversity"
    method = "shannon"
    plot_type = "box"
    group_by = "Diagnosis"
    comparisons = true

    [ClonalStats.envs.cases."Gini Coeff"]
    viz_type = "diversity"
    method = "gini.coeff"
    plot_type = "violin"
    group_by = "Diagnosis"
    add_box = true
Expanded Clone Analysis
    [ClonalStats.in]
    screpfile = ["ScRepCombiningExpression"]

    [ClonalStats.envs.cases."Expanded Clones"]
    viz_type = "stat"
    plot_type = "pies"
    group_by = "Diagnosis"
    subgroup_by = "seurat_clusters"
    clones = {"Expanded (>2)" = "sel(Colitis > 2)"}
Rarefaction Analysis
    [ClonalStats.in]
    screpfile = ["ScRepCombiningExpression"]

    [ClonalStats.envs.cases."Rarefaction"]
    viz_type = "rarefaction"
    group_by = "Patient"
    q = 1  # 0=richness, 1=shannon, 2=simpson
    n_boots = 20
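The `q` and `n_boots` settings can be understood through a small conceptual sketch: `q` selects the order of a Hill number (0 = richness, 1 = exponential of Shannon entropy, 2 = inverse Simpson), and `n_boots` is the number of random subsamples averaged at each depth. The helper names `hill_number` and `rarefaction_curve` are hypothetical, and the process presumably delegates to an R package with its own estimators and extrapolation, so this is an assumption-laden illustration rather than the actual implementation.

```python
import math
import random
from collections import Counter


def hill_number(counts, q):
    """Hill number (effective number of clones) of order q."""
    n = sum(counts)
    props = [c / n for c in counts if c > 0]
    if q == 1:
        # The order-1 case is defined as the limit: exp(Shannon entropy).
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


def rarefaction_curve(counts, depths, q=1, n_boots=20, seed=0):
    """Mean Hill number at each subsampling depth.

    Each depth is drawn n_boots times without replacement from the
    observed cells, and the resulting Hill numbers are averaged.
    """
    rng = random.Random(seed)
    # Expand counts into one clone label per cell.
    cells = [clone for clone, c in enumerate(counts) for _ in range(c)]
    curve = []
    for depth in depths:
        depth = min(depth, len(cells))
        values = []
        for _ in range(n_boots):
            sub = Counter(rng.sample(cells, depth))
            values.append(hill_number(list(sub.values()), q))
        curve.append((depth, sum(values) / n_boots))
    return curve
```

A curve that is still rising at the full sample depth suggests the repertoire has not been sampled deeply enough; a curve that flattens suggests additional cells would add few new clones. More subsamples per depth reduce noise in the averaged curve at the cost of runtime.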
Complete Analysis Suite
    [ClonalStats.in]
    screpfile = ["ScRepCombiningExpression"]

    [ClonalStats.envs.cases."Volume"]
    viz_type = "volume"

    [ClonalStats.envs.cases."Abundance"]
    viz_type = "abundance"
    plot_type = "density"

    [ClonalStats.envs.cases."Diversity"]
    viz_type = "diversity"
    method = "shannon"

    [ClonalStats.envs.cases."Rarefaction"]
    viz_type = "rarefaction"
Common Patterns
Disease vs Healthy
    [ClonalStats.envs.cases."Comparison"]
    viz_type = "diversity"
    method = "gini.coeff"
    plot_type = "box"
    group_by = "Condition"
    comparisons = true
Time Course
    [ClonalStats.envs.cases."Timepoint"]
    viz_type = "volume"
    x = "Timepoint"

    [ClonalStats.envs.cases."Diversity"]
    viz_type = "diversity"
    method = "shannon"
    group_by = "Timepoint"
Treatment Response
    [ClonalStats.envs.cases."Response"]
    viz_type = "diversity"
    method = "gini.coeff"
    group_by = "Response"
    plot_type = "box"
    comparisons = true
Dependencies
- Upstream: ScRepCombiningExpression (required)
- Related: ScRepLoading, CDR3Clustering, TESSA (optional)
Validation Rules
- Input must be valid scRepertoire object
- For viz_type = "diversity", method must be supported
- For rarefaction, n_boots should be ≥ 10
- Use sel() syntax in the clones parameter for filtering
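Some of these rules can be checked before launching a run. The sketch below validates an already-parsed `cases` mapping against the viz_type values and diversity metrics listed in this document. The function `check_cases` and both constant sets are hypothetical helpers, not part of the pipeline, and the process may accept additional values not documented here.

```python
# Values taken from this document; the process may support more.
VIZ_TYPES = {
    "volume", "abundance", "length", "residency", "stat", "composition",
    "overlap", "diversity", "geneusage", "positional", "kmer", "rarefaction",
}
DIVERSITY_METHODS = {
    "shannon", "inv.simpson", "gini.coeff", "norm.entropy", "chao1", "d50",
}


def check_cases(cases):
    """Return a list of human-readable problems found in a cases mapping.

    Only values that are explicitly set are checked, because the defaults
    used by the process are not documented here.
    """
    problems = []
    for name, case in cases.items():
        viz = case.get("viz_type")
        if viz is not None and viz not in VIZ_TYPES:
            problems.append(f"{name}: unknown viz_type {viz!r}")

        method = case.get("method")
        if viz == "diversity" and method is not None and method not in DIVERSITY_METHODS:
            problems.append(f"{name}: unsupported diversity method {method!r}")

        n_boots = case.get("n_boots")
        if viz == "rarefaction" and n_boots is not None and n_boots < 10:
            problems.append(f"{name}: n_boots is {n_boots}, should be >= 10")
    return problems
```

On Python 3.11 or later, the mapping could be obtained with the standard-library `tomllib` module by loading the config file and reading `config["ClonalStats"]["envs"]["cases"]`, assuming the file parses as strict TOML.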
Troubleshooting
Sample column not found: Input must have a Sample column, or specify the x parameter.
Strange diversity values: Small repertoire sizes cause bias. Use plot_type = "box".
Rarefaction curves noisy: Increase n_boots (try 50-100).
Too many clones in stat plots: Use subset or stricter clones thresholds.
Plot generation slow: Use clone_call = "gene" for speed, apply subset.
Missing comparisons: Set comparisons = true to add significance tests.
Best Practices
- Start with default cases to see standard visualizations
- Use multiple diversity metrics: Shannon + Gini
- Check rarefaction curves to ensure sufficient sampling
- Document clone thresholds when defining expanded clones
- Use clone_call = "gene" for speed, "aa" for granularity
- Set save_data = true for debugging (watch disk space)
- Validate findings with complementary diversity indices
- Consider sample size: small samples underestimate richness
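When several complementary metrics are wanted for the same grouping, writing one case per metric by hand becomes repetitive. A small generator such as the following can emit the case sections. The function `diversity_cases_toml` and the case-name pattern "Diversity (metric)" are hypothetical conventions chosen for illustration; only the keys already shown in the examples above are used.

```python
def diversity_cases_toml(methods, group_by, plot_type="box"):
    """Build one [ClonalStats.envs.cases."..."] section per diversity metric."""
    blocks = []
    for method in methods:
        blocks.append("\n".join([
            f'[ClonalStats.envs.cases."Diversity ({method})"]',
            'viz_type = "diversity"',
            f'method = "{method}"',
            f'plot_type = "{plot_type}"',
            f'group_by = "{group_by}"',
        ]))
    # Separate sections with a blank line and end with a trailing newline.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(diversity_cases_toml(["shannon", "gini.coeff"], group_by="Diagnosis"))
```

The printed text can be pasted below the `[ClonalStats.in]` section of a config file.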