Claude-skill-registry-data mechinterp-runner

Execute mechanistic interpretability experiments from JSON specs - family sweeps, itemsets, interactions, minimal cores, validation

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry-data
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry-data "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/mechinterp-runner" ~/.claude/skills/majiayu000-claude-skill-registry-data-mechinterp-runner && rm -rf "$T"
manifest: data/mechinterp-runner/SKILL.md

MechInterp Runner

Execute experiment specifications for mechanistic interpretability analysis. This skill takes JSON spec files and runs the appropriate analysis, producing structured result files.

Purpose

The runner skill:

  • Loads experiment specs from JSON files
  • Routes to the appropriate experiment runner
  • Executes the analysis with constraint enforcement
  • Produces structured JSON results with diagnostics

When to Use

Use this skill when:

  1. The planner has generated experiment specs
  2. You have manually created an experiment spec
  3. You need to re-run an experiment with different parameters

Supported Experiment Types

| Type | Description |
| --- | --- |
| family_1d_sweep | Test one ability family across AP rungs |
| family_2d_heatmap | Test two families in a 2D grid (e.g., SCU × ISS) |
| frequent_itemsets | Mine co-occurring token patterns |
| minimal_cores | Find irreducible activating token sets |
| pairwise_interactions | Compute token-pair synergy/redundancy |
| conditional_interactions | How a third token modulates interactions |
| split_half | Validation: correlation across random splits |
| shuffle_null | Validation: null distribution via shuffling |
| weapon_sweep | ⚠️ CORRELATIONAL: Analyze activation by weapon (observational grouping) |
| kit_sweep | ⚠️ CORRELATIONAL: Analyze activation by sub/special weapon |

Usage

Subcommand Interface (Recommended)

Run experiments directly without writing JSON spec files:

cd /root/dev/SplatNLP

# 1D family sweep
poetry run python -m splatnlp.mechinterp.cli.runner_cli family-sweep \
    --feature-id 6235 --family quick_respawn --model ultra

# 2D heatmap
poetry run python -m splatnlp.mechinterp.cli.runner_cli heatmap \
    --feature-id 6235 --family-x special_charge_up --family-y quick_respawn

# Weapon sweep (correlational)
poetry run python -m splatnlp.mechinterp.cli.runner_cli weapon-sweep \
    --feature-id 6235 --model ultra --top-k 20

# Kit sweep (correlational)
poetry run python -m splatnlp.mechinterp.cli.runner_cli kit-sweep \
    --feature-id 6235 --model ultra --analyze-combinations

# Binary ability presence analysis
poetry run python -m splatnlp.mechinterp.cli.runner_cli binary \
    --feature-id 6235 --model ultra

# Core coverage analysis
poetry run python -m splatnlp.mechinterp.cli.runner_cli coverage \
    --feature-id 6235 --tokens comeback,stealth_jump,respawn_punisher

# Token influence analysis
poetry run python -m splatnlp.mechinterp.cli.runner_cli token-influence \
    --feature-id 6235 --model ultra

Subcommand Reference

| Subcommand | Required Args | Optional Args |
| --- | --- | --- |
| family-sweep | --feature-id, --family | --model, --rungs |
| heatmap | --feature-id, --family-x, --family-y | --model, --rungs-x, --rungs-y |
| weapon-sweep | --feature-id | --model, --top-k, --min-examples |
| kit-sweep | --feature-id | --model, --top-k, --analyze-combinations |
| binary | --feature-id | --model, --tokens |
| coverage | --feature-id | --model, --tokens, --threshold |
| token-influence | --feature-id | --model, --high-percentile |

JSON Spec Mode (Legacy/Advanced)

For complex experiments or batch processing, use JSON spec files:

cd /root/dev/SplatNLP

# Run an experiment spec
poetry run python -m splatnlp.mechinterp.cli.runner_cli \
    --spec-path /mnt/e/mechinterp_runs/specs/20250607__f18712__family-1d-sweep.json

# With custom output directory
poetry run python -m splatnlp.mechinterp.cli.runner_cli \
    --spec-path my_spec.json \
    --output-dir ./my_results/

# Dry run (validate spec only)
poetry run python -m splatnlp.mechinterp.cli.runner_cli \
    --spec-path my_spec.json \
    --dry-run

# List available experiment types
poetry run python -m splatnlp.mechinterp.cli.runner_cli --list-types

When to Use Subcommands vs JSON Specs

| Use Subcommands When | Use JSON Specs When |
| --- | --- |
| Quick one-off experiments | Batch processing multiple specs |
| Standard experiment configs | Custom dataset slices needed |
| Interactive investigation | Need to track experiment provenance |
| You want to avoid writing JSON | Complex constraint configurations |
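
For the batch-processing case, a small wrapper can build one runner command per spec file. The sketch below only assembles the argument lists, using the --spec-path, --output-dir, and --dry-run flags shown above; the helper names build_runner_command and batch_commands are hypothetical and not part of SplatNLP.

```python
from pathlib import Path

RUNNER_MODULE = "splatnlp.mechinterp.cli.runner_cli"


def build_runner_command(spec_path, output_dir=None, dry_run=False):
    """Return the argv list for one runner invocation in JSON spec mode."""
    cmd = ["poetry", "run", "python", "-m", RUNNER_MODULE,
           "--spec-path", str(spec_path)]
    if output_dir is not None:
        cmd += ["--output-dir", str(output_dir)]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def batch_commands(spec_dir, output_dir=None, dry_run=False):
    """Build one command per *.json spec in spec_dir, in sorted filename order."""
    specs = sorted(Path(spec_dir).glob("*.json"))
    return [build_runner_command(p, output_dir, dry_run) for p in specs]
```

Each returned list could then be passed to subprocess.run(cmd, cwd=...) from the repository root. Building with dry_run=True first gives a cheap way to validate every spec before committing to a long batch.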

Programmatic

from splatnlp.mechinterp.schemas import ExperimentSpec, ExperimentType
from splatnlp.mechinterp.experiments import get_runner_for_type
from splatnlp.mechinterp.skill_helpers import load_context

# Create spec
spec = ExperimentSpec(
    type=ExperimentType.FAMILY_1D_SWEEP,
    feature_id=18712,
    model_type="ultra",
    variables={"family": "special_charge_up"},
)

# Load context and run
ctx = load_context("ultra")
runner = get_runner_for_type(spec.type)
result = runner.run(spec, ctx)

# Check result
print(result.get_summary())
if result.success:
    print(f"Mean delta: {result.aggregates.mean_delta}")

Spec File Format

{
  "id": "20250607_142531",
  "type": "family_1d_sweep",
  "feature_id": 18712,
  "model_type": "ultra",
  "dataset_slice": {
    "percentile_min": 10.0,
    "percentile_max": 90.0,
    "sample_size": 500
  },
  "variables": {
    "family": "special_charge_up",
    "rungs": [3, 12, 29, 41, 57],
    "include_absent": true
  },
  "constraints": ["one_rung_per_family"],
  "outputs": {
    "aggregates": true,
    "tables": true,
    "diagnostics": true,
    "figures": false
  },
  "description": "Test SCU response across rungs",
  "parent_hypothesis": "h001"
}
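
A hand-written spec can also be generated from Python. The sketch below writes a spec dict to disk under a filename of the form <id>__f<feature_id>__<type-with-hyphens>.json. That naming convention is inferred from the example paths in this document rather than taken from the runner's source, and spec_filename and write_spec are hypothetical helpers. The spec omits dataset_slice and outputs on the assumption that the runner supplies defaults for them.

```python
import json
from pathlib import Path


def spec_filename(spec):
    """Build '<id>__f<feature_id>__<type-with-hyphens>.json' (assumed convention)."""
    type_slug = spec["type"].replace("_", "-")
    return f"{spec['id']}__f{spec['feature_id']}__{type_slug}.json"


def write_spec(spec, spec_dir):
    """Serialize the spec as JSON into spec_dir and return the written path."""
    path = Path(spec_dir) / spec_filename(spec)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(spec, indent=2))
    return path


spec = {
    "id": "20250607_142531",
    "type": "family_1d_sweep",
    "feature_id": 18712,
    "model_type": "ultra",
    "variables": {"family": "special_charge_up", "rungs": [3, 12, 29, 41, 57]},
    "constraints": ["one_rung_per_family"],
}
print(spec_filename(spec))  # 20250607_142531__f18712__family-1d-sweep.json
```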

Result File Format

{
  "spec_id": "20250607_142531",
  "spec_path": "20250607_142531__f18712__family-1d-sweep.json",
  "feature_id": 18712,
  "experiment_type": "family_1d_sweep",
  "aggregates": {
    "mean_delta": 0.35,
    "std_delta": 0.12,
    "n_samples": 500,
    "custom": {
      "threshold_rung": 41,
      "max_rung_delta": 0.52
    }
  },
  "tables": {
    "rung_deltas": {
      "name": "rung_deltas",
      "columns": ["rung", "mean_delta", "std_error", "n"],
      "rows": [
        {"rung": 3, "mean_delta": 0.05, "std_error": 0.02, "n": 100},
        {"rung": 12, "mean_delta": 0.12, "std_error": 0.03, "n": 100}
      ]
    }
  },
  "diagnostics": {
    "relu_floor_detected": false,
    "relu_floor_rate": 0.02,
    "n_contexts_tested": 500,
    "warnings": []
  },
  "success": true,
  "duration_seconds": 45.3
}
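
Because results are plain JSON, they can be inspected without importing SplatNLP. The following sketch parses a trimmed copy of the example above and picks the rung with the largest mean_delta. The helper names table_rows and peak_row are illustrative only.

```python
import json

RESULT_JSON = """
{
  "spec_id": "20250607_142531",
  "feature_id": 18712,
  "experiment_type": "family_1d_sweep",
  "tables": {
    "rung_deltas": {
      "name": "rung_deltas",
      "columns": ["rung", "mean_delta", "std_error", "n"],
      "rows": [
        {"rung": 3, "mean_delta": 0.05, "std_error": 0.02, "n": 100},
        {"rung": 12, "mean_delta": 0.12, "std_error": 0.03, "n": 100}
      ]
    }
  },
  "success": true
}
"""


def table_rows(result, table_name):
    """Return the list of row dicts for a named table, or [] if it is absent."""
    return result.get("tables", {}).get(table_name, {}).get("rows", [])


def peak_row(rows, key="mean_delta"):
    """Return the row with the largest value for key, or None for an empty table."""
    return max(rows, key=lambda row: row[key]) if rows else None


result = json.loads(RESULT_JSON)
rows = table_rows(result, "rung_deltas")
best = peak_row(rows)
print(best["rung"], best["mean_delta"])  # 12 0.12
```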

File Locations

  • Specs:
    /mnt/e/mechinterp_runs/specs/
  • Results:
    /mnt/e/mechinterp_runs/results/
  • Figures:
    /mnt/e/mechinterp_runs/figures/

Constraint Enforcement

The runner enforces constraints specified in the spec:

  • one_rung_per_family: Prevents invalid multi-rung builds
  • no_weapon_gating_if_relu_floor: Warns/skips if base activation too low

Violations are logged in diagnostics.constraint_violations and diagnostics.warnings.
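
To make the one_rung_per_family idea concrete, here is a minimal sketch of such a check. It is not the runner's implementation. It assumes AP tokens look like family_name_AP (for example swim_speed_up_21) and that binary abilities carry no numeric suffix. split_token and rung_violations are hypothetical names.

```python
from collections import defaultdict


def split_token(token):
    """Split 'family_AP' into (family, AP); tokens without a numeric suffix give (token, None)."""
    head, sep, tail = token.rpartition("_")
    if sep and tail.isdigit():
        return head, int(tail)
    return token, None


def rung_violations(tokens):
    """Return {family: sorted rungs} for every family that appears at more than one rung."""
    rungs = defaultdict(set)
    for token in tokens:
        family, ap = split_token(token)
        if ap is not None:  # binary abilities have no rung, so they cannot violate
            rungs[family].add(ap)
    return {family: sorted(aps) for family, aps in rungs.items() if len(aps) > 1}


build = ["swim_speed_up_21", "swim_speed_up_3", "special_charge_up_12", "comeback"]
print(rung_violations(build))  # {'swim_speed_up': [3, 21]}
```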

Error Handling

If an experiment fails:

  • result.success is False
  • result.error_message contains the error
  • Partial results may still be available in aggregates / tables
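
When post-processing result files, it helps to handle the failure case explicitly so partial output is not silently discarded. The sketch below works on a result loaded as a dict. It assumes the JSON file carries an error_message key mirroring the result.error_message attribute, which the example result above does not show. describe_result is a hypothetical helper.

```python
def describe_result(result):
    """Summarize a result dict, noting any partial output kept from a failed run."""
    if result.get("success"):
        return f"ok: {result['experiment_type']} on feature {result['feature_id']}"
    # Only report sections that actually contain data.
    partial = [key for key in ("aggregates", "tables") if result.get(key)]
    message = result.get("error_message") or "unknown error"
    suffix = f" (partial: {', '.join(partial)})" if partial else ""
    return f"failed: {message}{suffix}"


failed = {
    "success": False,
    "error_message": "boom",
    "aggregates": {"n_samples": 10},
    "tables": {},
}
print(describe_result(failed))  # failed: boom (partial: aggregates)
```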

Weapon Analysis Workflow: weapon_sweep → kit_sweep

⚠️ IMPORTANT: Both weapon_sweep and kit_sweep are CORRELATIONAL analyses.

They show which weapons/kits are associated with high activation through observational grouping, NOT through counterfactual intervention. A weapon may show high activation because:

  • The weapon itself drives the feature
  • Players of that weapon tend to use certain abilities
  • The weapon's kit (sub/special) is the actual driver

Always cross-reference with ability analysis to distinguish weapon effects from ability effects.

When analyzing weapon-specific patterns, follow this workflow:

Step 1: Run weapon_sweep

{
  "type": "weapon_sweep",
  "feature_id": 18712,
  "model_type": "ultra",
  "variables": {"min_examples": 10, "top_k_weapons": 20}
}

Check the result diagnostics for a dominant-weapon warning:

  • If diagnostics.warnings contains "DOMINANT WEAPON", one weapon's delta is more than 2x that of the next-ranked weapon
  • aggregates.custom.dominant_weapon_detected will be true
  • aggregates.custom.recommended_followup will be "kit_sweep"

Step 2: Run kit_sweep (if dominant weapon detected)

{
  "type": "kit_sweep",
  "feature_id": 18712,
  "model_type": "ultra",
  "variables": {
    "min_examples": 10,
    "top_k": 10,
    "analyze_combinations": true
  }
}

Output tables:

  • sub_stats: Activation statistics by sub weapon (mean, std, n, delta_from_global)
  • special_stats: Activation statistics by special weapon
  • combo_stats: (if analyze_combinations=true) Statistics by sub+special pairs

Aggregates:

  • top_sub, top_sub_mean, top_sub_delta
  • top_special, top_special_mean, top_special_delta

Step 3: Cross-reference with splatoon3-meta

Use the splatoon3-meta skill to look up weapon kits:

  • Read .claude/skills/splatoon3-meta/references/weapons.md
  • Compare dominant weapon's kit with kit_sweep results
  • Check if high-activation weapons share sub or special

Example: Feature 18712

weapon_sweep results:
  - Octobrush Nouveau: +0.22 delta (DOMINANT - 2.4x second weapon)
  - Rapid Blaster: +0.09 delta

kit_sweep results:
  - Top special: Ink Storm (+0.18 delta)
  - Top sub: Squid Beakon (+0.08 delta)

splatoon3-meta lookup:
  - Octobrush Nouveau: Squid Beakon + Ink Storm

Conclusion: Feature encodes "Ink Storm spam builds" not "Octobrush Nouveau builds"
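
The reasoning in this example can be sketched in a few lines of Python. The code reuses the numbers above and makes two assumptions: that "dominant" means the top weapon's delta exceeds 2x the second-ranked delta, and that the kit lookup is available as a plain dict. dominant_weapon and kit_candidates are hypothetical helpers, and the output remains correlational evidence, not a causal test.

```python
def dominant_weapon(deltas, ratio=2.0):
    """Return (weapon, ratio_to_second) if the top delta exceeds ratio x the second, else None."""
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) < 2:
        return None
    (top, top_delta), (_, second_delta) = ranked[0], ranked[1]
    # This sketch skips the case where the second delta is not positive,
    # because the ratio is undefined there.
    if second_delta <= 0 or top_delta <= ratio * second_delta:
        return None
    return top, top_delta / second_delta


def kit_candidates(weapon_kit, top_sub, top_special):
    """Return kit slots of the weapon that also lead kit_sweep, largest delta first.

    weapon_kit is (sub_name, special_name); top_sub and top_special are (name, delta).
    """
    sub, special = weapon_kit
    hits = []
    if sub == top_sub[0]:
        hits.append(("sub", sub, top_sub[1]))
    if special == top_special[0]:
        hits.append(("special", special, top_special[1]))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


weapon_deltas = {"Octobrush Nouveau": 0.22, "Rapid Blaster": 0.09}
kits = {"Octobrush Nouveau": ("Squid Beakon", "Ink Storm")}  # from the splatoon3-meta lookup

dominant = dominant_weapon(weapon_deltas)
if dominant is not None:
    weapon, ratio_to_second = dominant
    candidates = kit_candidates(kits[weapon], ("Squid Beakon", 0.08), ("Ink Storm", 0.18))
    print(weapon, round(ratio_to_second, 1), candidates[0])
```

Here both kit slots match the kit_sweep leaders, and the special (Ink Storm) has the larger delta, which is what motivates the conclusion above.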

Known Limitations

Binary Tokens in family_2d_heatmap

LIMITATION: The family_2d_heatmap experiment type does NOT correctly handle binary abilities (comeback, stealth_jump, haunt, etc.).

The runner uses parse_token(), which expects tokens in family_name_AP format (e.g., swim_speed_up_21), but binary abilities appear as just the token name without an AP suffix (e.g., comeback, not comeback_10).

Workaround: Use manual 2D analysis code for binary abilities. See the Binary Ability Analysis Protocol in mechinterp-investigator.
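
As a rough illustration of what a manual 2D analysis for two binary abilities might look like, the sketch below groups examples into a 2x2 presence grid and averages activation per cell. This is not the protocol from mechinterp-investigator. The binary_grid helper is hypothetical and the activations are made-up toy values. Like the weapon sweeps, this grouping is observational.

```python
from statistics import mean


def binary_grid(examples, token_a, token_b):
    """Mean activation for each (has_a, has_b) cell; empty cells map to None.

    examples is an iterable of (tokens, activation) pairs.
    """
    cells = {(a, b): [] for a in (False, True) for b in (False, True)}
    for tokens, activation in examples:
        cells[(token_a in tokens, token_b in tokens)].append(activation)
    return {cell: (mean(values) if values else None) for cell, values in cells.items()}


# Toy data for illustration only.
examples = [
    ({"comeback", "stealth_jump"}, 1.0),
    ({"comeback", "stealth_jump"}, 0.5),
    ({"comeback"}, 0.25),
    (set(), 0.0),
]
grid = binary_grid(examples, "comeback", "stealth_jump")
print(grid[(True, True)], grid[(True, False)], grid[(False, True)])  # 0.75 0.25 None
```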

Future Enhancement: Stable Tokens

A useful enhancement would be adding "stable tokens" to sweep experiments - tokens that are held constant across all conditions in the sweep. This would allow testing questions like:

  • "How does SCU affect activation when Comeback is present?"
  • "How does ISM scale on Stamper builds?"

Proposed spec format (stable_tokens lists the tokens to hold constant):

{
  "type": "family_1d_sweep",
  "variables": {
    "family": "special_charge_up",
    "stable_tokens": ["comeback", "stealth_jump"]
  }
}

This is not currently implemented.
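
Until the runner supports this, the idea can be approximated by constructing the sweep conditions by hand. The sketch below shows one possible way to expand a family, a list of rungs, and a set of stable tokens into per-condition token lists. sweep_conditions is a hypothetical helper, and treating include_absent as "add a condition without the swept family" is an assumption about that field's meaning.

```python
def sweep_conditions(family, rungs, stable_tokens=(), include_absent=True):
    """Return one token list per sweep condition, with the stable tokens in every condition."""
    base = list(stable_tokens)
    conditions = [base + [f"{family}_{rung}"] for rung in rungs]
    if include_absent:
        # Baseline condition: stable tokens only, swept family absent.
        conditions.insert(0, list(base))
    return conditions


for tokens in sweep_conditions("special_charge_up", [3, 12], ["comeback", "stealth_jump"]):
    print(tokens)
```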

See Also

  • mechinterp-next-step-planner: Generate experiment specs
  • mechinterp-state: Track research progress
  • mechinterp-summarizer: Convert results to notes
  • mechinterp-glossary-and-constraints: Domain reference
  • splatoon3-meta: Weapon kit lookups and meta knowledge