Claude-skill-registry clonalstats

Generate comprehensive clonality statistics and diversity visualizations for TCR/BCR repertoire analysis. Quantifies clonal expansion, measures diversity metrics (Shannon, Simpson, Gini), and creates publication-ready plots.

install

source · Clone the upstream repo

git clone https://github.com/majiayu000/claude-skill-registry

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/clonalstats" ~/.claude/skills/majiayu000-claude-skill-registry-clonalstats && rm -rf "$T"

manifest: skills/data/clonalstats/SKILL.md

ClonalStats Process Configuration

Purpose

When to Use

To quantify clonal expansion patterns in TCR/BCR data
For diversity analysis comparing multiple samples or conditions
To identify hyperexpanded clones and their distribution
For rarefaction analysis to assess sampling depth
After `ScRepCombiningExpression`, to analyze integrated TCR+RNA data

Configuration Structure

Process Enablement

[ClonalStats]
cache = true

Input Specification

[ClonalStats.in]
screpfile = ["ScRepCombiningExpression"]

Core Environment Variables

[ClonalStats.envs]
# Clone definition: "gene" (VDJC), "aa" (CDR3 amino acid), "nt" (CDR3 nucleotide)
clone_call = "aa"
# Chain analysis: "both", "TRA", "TRB", "TRG", "IGH", "IGL"
chain = "both"
# Data transformations (dplyr::mutate syntax)
mutaters = {}
# Data filtering (dplyr::filter syntax); leave unset for no filtering (TOML has no null literal)
# subset = "<filter expression>"
# Output device parameters
devpars = {width = 800, height = 600, res = 100}
# Save code and data (large files - use with caution)
save_code = false
save_data = false
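
To make the effect of `clone_call` concrete, here is a minimal Python sketch, not the process's own code, that groups the same cells by each clone definition. The records, gene labels, and sequences are made-up toy values, and `clone_sizes` is a hypothetical helper.

```python
from collections import Counter

# Toy cells (hypothetical values): two share a CDR3 amino acid sequence
# but differ at the nucleotide level; three share the same gene call.
cells = [
    {"gene": "G1", "aa": "CASS", "nt": "TGTGCCAGCAGC"},
    {"gene": "G1", "aa": "CASS", "nt": "TGTGCAAGCAGC"},
    {"gene": "G1", "aa": "CASR", "nt": "TGTGCCAGCAGA"},
    {"gene": "G2", "aa": "CAST", "nt": "TGTGCCAGCACC"},
]

def clone_sizes(cells, clone_call):
    """Count cells per clone, where a clone is defined by the chosen field."""
    return Counter(cell[clone_call] for cell in cells)

for call in ("gene", "aa", "nt"):
    sizes = clone_sizes(cells, call)
    print(call, len(sizes), dict(sizes))
```

In this toy data the coarser `"gene"` definition merges cells that `"aa"` and `"nt"` keep apart, which is why the choice changes every downstream clone count and diversity value.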

Case-Based Plot Generation

[ClonalStats.envs.cases."Case Name"]
viz_type = "volume"  # volume, abundance, length, residency, stat,
                    # composition, overlap, diversity, geneusage,
                    # positional, kmer, rarefaction

Diversity Metrics

Metric	Range	Interpretation	Best For
shannon	0 - ∞	Higher = more diversity	General comparison
inv.simpson	1 - ∞	Higher = more diversity	Common clones
gini.coeff	0 - 1	0 = equality, 1 = inequality	Clonality dominance
norm.entropy	0 - 1	Higher = more diversity	Evenness-focused
chao1	≥ richness	Estimates total richness	Small samples
d50	Count	Number of top clones making up 50% of the repertoire	Practical dominance
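
As a rough illustration of what several of these metrics measure, the following Python sketch computes them from a list of clone sizes using their textbook definitions. It is an assumption that these match the formulas used by the underlying R packages (log base and bias corrections may differ), and the function names are chosen here for illustration only.

```python
import math

def shannon(counts):
    """Shannon entropy (natural log) of the clone frequency distribution."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def inv_simpson(counts):
    """Inverse Simpson index: 1 / sum of squared clone frequencies."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)

def norm_entropy(counts):
    """Shannon entropy divided by its maximum, log(observed richness)."""
    richness = sum(1 for c in counts if c > 0)
    if richness <= 1:
        return 0.0
    return shannon(counts) / math.log(richness)

def chao1(counts):
    """Bias-corrected Chao1 richness estimate from singletons and doubletons."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)  # clones seen exactly once
    f2 = sum(1 for c in counts if c == 2)  # clones seen exactly twice
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

even = [10, 10, 10, 10]    # toy repertoire, four equally sized clones
skewed = [97, 1, 1, 1]     # toy repertoire, one dominant clone
for name, counts in (("even", even), ("skewed", skewed)):
    print(name, shannon(counts), inv_simpson(counts),
          norm_entropy(counts), chao1(counts))
```

The many singletons in the skewed toy repertoire push the Chao1 estimate above the observed richness, which is the behaviour the "≥ richness" range in the table refers to.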

Interpretation:

High diversity = Many unique clones, even distribution (healthy repertoire)
Low diversity = Few dominant clones (antigen-specific response, infection, cancer)
Gini ≈ 1 = Very skewed, few clones dominate
Gini ≈ 0 = Even distribution
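
The Gini and d50 readings above can be checked on toy numbers. This is a minimal Python sketch using the standard rank-based Gini formula and a cumulative-sum d50; whether the process computes them in exactly this way is an assumption, and both function names are hypothetical.

```python
def gini(counts):
    """Gini coefficient of clone sizes (0 = all clones equal in size)."""
    xs = sorted(c for c in counts if c > 0)
    n, total = len(xs), sum(xs)
    if n == 0:
        return 0.0
    # Rank-weighted sum over sizes sorted in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

def d50(counts):
    """Smallest number of largest clones that together cover >= 50% of cells."""
    xs = sorted((c for c in counts if c > 0), reverse=True)
    half, running = sum(xs) / 2, 0
    for k, x in enumerate(xs, start=1):
        running += x
        if running >= half:
            return k
    return 0

even = [10, 10, 10, 10]   # toy repertoire, even distribution
skewed = [1, 1, 1, 97]    # toy repertoire, one dominant clone
print(gini(even), d50(even))
print(gini(skewed), d50(skewed))
```

With only n clones the Gini coefficient cannot exceed (n - 1) / n, so a four-clone toy example tops out at 0.75; values close to 1 require a repertoire with many clones and a few dominant ones.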

Visualization Types

viz_type options:

`volume` - Number of clones per sample/group
`abundance` - Clone abundance distribution (trend/histogram/density)
`length` - CDR3 sequence length distribution
`residency` - Clones present across groups (venn/upset)
`stat` - Expanded clone analysis (pies/sankey)
`diversity` - Diversity metrics (bar/box/violin)
`geneusage` - V/D/J gene usage frequency
`rarefaction` - Sampling depth assessment

Configuration Examples

Minimal Configuration

[ClonalStats.in]
screpfile = ["ScRepCombiningExpression"]

Standard Diversity Analysis

[ClonalStats.in]
screpfile = ["ScRepCombiningExpression"]

[ClonalStats.envs.cases."Diversity"]
viz_type = "diversity"
method = "shannon"
plot_type = "box"
group_by = "Diagnosis"
comparisons = true

[ClonalStats.envs.cases."Gini Coeff"]
viz_type = "diversity"
method = "gini.coeff"
plot_type = "violin"
group_by = "Diagnosis"
add_box = true

Expanded Clone Analysis

[ClonalStats.in]
screpfile = ["ScRepCombiningExpression"]

[ClonalStats.envs.cases."Expanded Clones"]
viz_type = "stat"
plot_type = "pies"
group_by = "Diagnosis"
subgroup_by = "seurat_clusters"
clones = {"Expanded (>2)" = "sel(Colitis > 2)"}
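
The `clones` entry above reads as "keep clones whose size in the Colitis group exceeds 2". Assuming that interpretation of `sel(Colitis > 2)`, the selection logic can be sketched in Python as follows. The helper name, the `(clone, group)` record layout, and the toy clone IDs are all hypothetical.

```python
from collections import Counter

def select_expanded(cells, group, min_size=2):
    """Return clones with more than `min_size` cells in the given group."""
    sizes = Counter(clone for clone, g in cells if g == group)
    return sorted(clone for clone, n in sizes.items() if n > min_size)

# Toy (clone, group) records; values are made up for illustration.
cells = (
    [("CASSL", "Colitis")] * 3
    + [("CASSP", "Colitis")] * 2
    + [("CASSL", "Control")] * 5
    + [("CASSQ", "Control")] * 4
)
print(select_expanded(cells, "Colitis"))
```

Only the clone with three Colitis cells passes the threshold; the clone with exactly two does not, because the comparison is strict.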

Rarefaction Analysis

[ClonalStats.in]
screpfile = ["ScRepCombiningExpression"]

[ClonalStats.envs.cases."Rarefaction"]
viz_type = "rarefaction"
group_by = "Patient"
q = 1  # 0=richness, 1=shannon, 2=simpson
n_boots = 20
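
To show what a rarefaction curve estimates and why more repetitions smooth it, here is a simplified Python sketch for richness (the `q = 0` case). It repeatedly subsamples cells without replacement and averages the number of distinct clones; this is a stand-in for illustration, not the estimator the process actually runs, and `n_boots` here simply means the number of random subsamples per depth.

```python
import random

def rarefaction_curve(counts, depths, n_boots=20, seed=0):
    """Mean observed richness at each subsampling depth."""
    rng = random.Random(seed)
    # Expand clone sizes into one entry per cell, labelled by clone index.
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    curve = []
    for depth in depths:
        depth = min(depth, len(pool))  # cannot sample more cells than exist
        richness = [len(set(rng.sample(pool, depth))) for _ in range(n_boots)]
        curve.append((depth, sum(richness) / n_boots))
    return curve

counts = [50, 30, 10, 5, 3, 1, 1]  # toy clone sizes, 100 cells in total
for depth, mean_richness in rarefaction_curve(counts, [10, 50, 100]):
    print(depth, mean_richness)
```

A curve that is still rising at the full sample size suggests that deeper sampling would reveal more clones; a plateau suggests the repertoire was sampled adequately.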

Complete Analysis Suite

[ClonalStats.in]
screpfile = ["ScRepCombiningExpression"]

[ClonalStats.envs.cases."Volume"]
viz_type = "volume"

[ClonalStats.envs.cases."Abundance"]
viz_type = "abundance"
plot_type = "density"

[ClonalStats.envs.cases."Diversity"]
viz_type = "diversity"
method = "shannon"

[ClonalStats.envs.cases."Rarefaction"]
viz_type = "rarefaction"

Common Patterns

Disease vs Healthy

[ClonalStats.envs.cases."Comparison"]
viz_type = "diversity"
method = "gini.coeff"
plot_type = "box"
group_by = "Condition"
comparisons = true
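
The source does not say which significance test `comparisons = true` applies. As a stand-in only, this Python sketch compares per-sample Gini values between two conditions with a two-sided permutation test on the difference in means. The function name and the numbers are made up for illustration.

```python
import random
from statistics import mean

def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign samples to the two groups
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)

disease = [0.81, 0.77, 0.85, 0.79, 0.83]  # toy per-sample Gini values
healthy = [0.42, 0.38, 0.47, 0.40, 0.44]  # toy per-sample Gini values
print(permutation_pvalue(disease, healthy))
```

With only a handful of samples per group the smallest attainable p-value is limited by the number of distinct group assignments, so small cohorts can never produce very small p-values regardless of effect size.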

Time Course

[ClonalStats.envs.cases."Timepoint"]
viz_type = "volume"
x = "Timepoint"

[ClonalStats.envs.cases."Diversity"]
viz_type = "diversity"
method = "shannon"
group_by = "Timepoint"

Treatment Response

[ClonalStats.envs.cases."Response"]
viz_type = "diversity"
method = "gini.coeff"
group_by = "Response"
plot_type = "box"
comparisons = true

Dependencies

Upstream: `ScRepCombiningExpression` (required)
Related: `ScRepLoading`, `CDR3Clustering`, `TESSA` (optional)

Validation Rules

Input must be a valid scRepertoire object
For `viz_type = "diversity"`, `method` must be supported
For rarefaction, `n_boots` should be ≥ 10
Use `sel()` syntax in the `clones` parameter for filtering
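
Some of these rules can be checked before running the pipeline. The sketch below is a hypothetical pre-flight check in Python, operating on a case as a plain dict (the parsed form of one `[ClonalStats.envs.cases."..."]` table). It assumes the `viz_type` values and diversity methods listed in this document are the complete supported sets, which the source does not confirm.

```python
# Values taken from this document; treating them as exhaustive is an assumption.
VIZ_TYPES = {
    "volume", "abundance", "length", "residency", "stat", "composition",
    "overlap", "diversity", "geneusage", "positional", "kmer", "rarefaction",
}
DIVERSITY_METHODS = {
    "shannon", "inv.simpson", "gini.coeff", "norm.entropy", "chao1", "d50",
}

def check_case(name, case):
    """Return a list of human-readable problems found in one case table."""
    problems = []
    viz = case.get("viz_type")
    if viz not in VIZ_TYPES:
        problems.append(f"{name}: unknown viz_type {viz!r}")
    # Only validate options that are explicitly set; defaults are not known here.
    if viz == "diversity" and "method" in case:
        if case["method"] not in DIVERSITY_METHODS:
            problems.append(f"{name}: unsupported method {case['method']!r}")
    if viz == "rarefaction" and "n_boots" in case and case["n_boots"] < 10:
        problems.append(f"{name}: n_boots should be >= 10")
    return problems

cases = {
    "Diversity": {"viz_type": "diversity", "method": "shannon"},
    "Rarefaction": {"viz_type": "rarefaction", "n_boots": 5},
}
for name, case in cases.items():
    print(name, check_case(name, case))
```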

Troubleshooting

Sample column not found: Input must have a `Sample` column, or specify the relevant parameter.

Strange diversity values: Small repertoire sizes bias diversity estimates. Use `plot_type = "box"`.

Rarefaction curves noisy: Increase `n_boots` (try 50-100).

Too many clones in stat plots: Use `subset` or stricter `clones` thresholds.

Plot generation slow: Use `clone_call = "gene"` for speed, and apply `subset`.

Missing comparisons: Set `comparisons = true` to add significance tests.

Best Practices

Start with default cases to see standard visualizations
Use multiple diversity metrics: Shannon + Gini
Check rarefaction curves to ensure sufficient sampling
Document clone thresholds when defining expanded clones
Use `clone_call = "gene"` for speed, `"aa"` for granularity
Set `save_data = true` for debugging (watch disk space)
Validate findings with complementary diversity indices
Consider sample size: small samples underestimate richness