# mechinterp-validation-suite

Run credibility checks on feature interpretations, including split-half stability and shuffle null tests.

## Install

Clone the upstream repo:

```shell
git clone https://github.com/majiayu000/claude-skill-registry-data
```

Or install into `~/.claude/skills/` for Claude Code:

```shell
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry-data "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/mechinterp-validation-suite" ~/.claude/skills/majiayu000-claude-skill-registry-data-mechinterp-validation-suite && rm -rf "$T"
```

Manifest: `data/mechinterp-validation-suite/SKILL.md`
# MechInterp Validation Suite
Run comprehensive credibility checks on feature interpretations to ensure findings are robust and not artifacts.
## Purpose
The validation suite skill:
- Tests stability of analysis results across data splits
- Creates null distributions to assess significance
- Generates validation reports with pass/fail criteria
- Helps identify unreliable interpretations early
## When to Use
Use this skill when:
- A hypothesis reaches high confidence (>0.7) and needs validation
- You want to verify that a pattern is real, not noise
- Before finalizing feature labels or interpretations
- As part of a standard research checkpoint
## Validation Tests

### 1. Split-Half Stability
Tests if analysis results are consistent across random splits of the data.
**What it measures:** Correlation of token-frequency rankings between the two halves.

**Pass criterion:** Mean correlation > 0.7.
```python
from splatnlp.mechinterp.schemas import ExperimentSpec, ExperimentType
from splatnlp.mechinterp.experiments import get_runner_for_type
from splatnlp.mechinterp.skill_helpers import load_context

# Create split-half spec
spec = ExperimentSpec(
    type=ExperimentType.SPLIT_HALF,
    feature_id=18712,
    model_type="ultra",
    variables={
        "n_splits": 10,
        "metric": "token_frequency_correlation"
    }
)

# Run validation
ctx = load_context("ultra")
runner = get_runner_for_type(spec.type)
result = runner.run(spec, ctx)

# Check results
mean_corr = result.aggregates.custom["mean_correlation"]
passed = result.aggregates.custom["stability_passed"]
print(f"Split-half correlation: {mean_corr:.3f} ({'PASS' if passed else 'FAIL'})")
```
### 2. Shuffle Null Test
Creates a null distribution by shuffling activations to test if observed patterns are significant.
**What it measures:** Whether top-token concentration exceeds the null expectation.

**Pass criterion:** p-value < 0.05.
```python
spec = ExperimentSpec(
    type=ExperimentType.SHUFFLE_NULL,
    feature_id=18712,
    model_type="ultra",
    variables={"n_shuffles": 100}
)

# Fetch the runner for this spec type (the split-half runner cannot be reused)
runner = get_runner_for_type(spec.type)
result = runner.run(spec, ctx)

p_value = result.aggregates.custom["p_value"]
significant = result.aggregates.custom["significant"]
print(f"Shuffle null p-value: {p_value:.4f} ({'SIGNIFICANT' if significant else 'NOT SIGNIFICANT'})")
```
## Running Full Validation Suite
```python
from datetime import datetime

from splatnlp.mechinterp.schemas import ExperimentSpec, ExperimentType
from splatnlp.mechinterp.experiments import get_runner_for_type
from splatnlp.mechinterp.skill_helpers import load_context


def run_validation_suite(feature_id: int, model_type: str = "ultra"):
    """Run all validation tests for a feature."""
    ctx = load_context(model_type)
    results = {}

    # Test 1: Split-half stability
    split_spec = ExperimentSpec(
        type=ExperimentType.SPLIT_HALF,
        feature_id=feature_id,
        model_type=model_type,
        variables={"n_splits": 10}
    )
    runner = get_runner_for_type(split_spec.type)
    split_result = runner.run(split_spec, ctx)
    results["split_half"] = {
        "mean_correlation": split_result.aggregates.custom.get("mean_correlation"),
        "passed": split_result.aggregates.custom.get("stability_passed", 0) == 1
    }

    # Test 2: Shuffle null
    null_spec = ExperimentSpec(
        type=ExperimentType.SHUFFLE_NULL,
        feature_id=feature_id,
        model_type=model_type,
        variables={"n_shuffles": 100}
    )
    runner = get_runner_for_type(null_spec.type)
    null_result = runner.run(null_spec, ctx)
    results["shuffle_null"] = {
        "p_value": null_result.aggregates.custom.get("p_value"),
        "passed": null_result.aggregates.custom.get("significant", 0) == 1
    }

    # Overall pass/fail
    all_passed = all(r["passed"] for r in results.values())
    return {
        "feature_id": feature_id,
        "model_type": model_type,
        "tests": results,
        "overall_passed": all_passed,
        "timestamp": datetime.now().isoformat()
    }


# Run suite
validation = run_validation_suite(18712, "ultra")
print(f"\nValidation Suite for Feature {validation['feature_id']}:")
print(f"  Split-half: {validation['tests']['split_half']['mean_correlation']:.3f} "
      f"({'PASS' if validation['tests']['split_half']['passed'] else 'FAIL'})")
print(f"  Shuffle null: p={validation['tests']['shuffle_null']['p_value']:.4f} "
      f"({'PASS' if validation['tests']['shuffle_null']['passed'] else 'FAIL'})")
print(f"\nOVERALL: {'PASS' if validation['overall_passed'] else 'FAIL'}")
```
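Persisting the suite output alongside other experiment results is a natural next step. A minimal sketch, where `RESULTS_DIR` is a plain local path standing in for `splatnlp.mechinterp.state.io.RESULTS_DIR` (whose actual type is unverified) and `save_validation` is an illustrative helper, not part of the splatnlp API:

```python
import json
from pathlib import Path

# Hypothetical stand-in for splatnlp.mechinterp.state.io.RESULTS_DIR
RESULTS_DIR = Path("results")


def save_validation(validation: dict) -> Path:
    """Write a run_validation_suite() result dict to a per-feature JSON file."""
    RESULTS_DIR.mkdir(parents=True, exist_ok=True)
    out = RESULTS_DIR / f"validation_{validation['feature_id']}.json"
    out.write_text(json.dumps(validation, indent=2))
    return out


path = save_validation({"feature_id": 18712, "overall_passed": True})
print(path)  # -> results/validation_18712.json
```

Keeping one JSON file per feature makes it easy to point `add_evidence`'s `result_path` at the saved report later.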
## Interpretation Guide

### Split-Half Results
| Correlation | Interpretation |
|---|---|
| > 0.8 | Excellent stability - results are highly reproducible |
| 0.7 - 0.8 | Good stability - results are reliable |
| 0.5 - 0.7 | Moderate stability - some patterns may be noisy |
| < 0.5 | Poor stability - interpret with caution |
### Shuffle Null Results
| p-value | Interpretation |
|---|---|
| < 0.01 | Highly significant - pattern very unlikely by chance |
| 0.01 - 0.05 | Significant - pattern unlikely by chance |
| 0.05 - 0.10 | Marginally significant - borderline |
| > 0.10 | Not significant - pattern may be noise |
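The two tables above can be collapsed into a small helper for report generation. A minimal sketch; the function names are illustrative and not part of the splatnlp API:

```python
def interpret_split_half(corr: float) -> str:
    """Map a split-half correlation onto the interpretation bands above."""
    if corr > 0.8:
        return "excellent stability"
    if corr > 0.7:
        return "good stability"
    if corr > 0.5:
        return "moderate stability"
    return "poor stability"


def interpret_shuffle_null(p: float) -> str:
    """Map a shuffle-null p-value onto the interpretation bands above."""
    if p < 0.01:
        return "highly significant"
    if p < 0.05:
        return "significant"
    if p < 0.10:
        return "marginally significant"
    return "not significant"


print(interpret_split_half(0.85))    # -> excellent stability
print(interpret_shuffle_null(0.03))  # -> significant
```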
## Workflow Integration
1. **Conduct research:** Build hypotheses, gather evidence
2. **Reach confidence threshold:** When hypothesis confidence > 0.7
3. **Run validation suite:** Execute this skill
4. **Update state:** Mark hypothesis as validated (or not)
5. **Document:** Add validation results to evidence
```python
from splatnlp.mechinterp.state import ResearchStateManager
# EvidenceStrength is assumed to live alongside HypothesisStatus
from splatnlp.mechinterp.schemas.research_state import EvidenceStrength, HypothesisStatus

# After validation passes
manager = ResearchStateManager(18712, "ultra")
manager.update_hypothesis(
    "h001",
    status=HypothesisStatus.SUPPORTED,
    confidence_absolute=0.9
)
manager.add_evidence(
    experiment_id="validation_suite",
    result_path="/mnt/e/mechinterp_runs/results/validation.json",
    summary="Passed split-half (r=0.85) and shuffle null (p<0.01)",
    strength=EvidenceStrength.STRONG,
    supports=["h001"]
)
```
## CLI Usage
```shell
cd /root/dev/SplatNLP

# Run split-half validation
poetry run python -m splatnlp.mechinterp.cli.runner_cli \
    --spec-path specs/split_half_spec.json

# Run shuffle null validation
poetry run python -m splatnlp.mechinterp.cli.runner_cli \
    --spec-path specs/shuffle_null_spec.json
```
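The CLI reads experiment specs from JSON files. A sketch of generating one, assuming the on-disk schema mirrors the `ExperimentSpec` fields used in the Python examples (the actual field names and enum spellings expected by `runner_cli` are unverified):

```python
import json

# Hypothetical spec file mirroring the ExperimentSpec fields shown earlier;
# the exact schema runner_cli expects may differ.
spec = {
    "type": "split_half",
    "feature_id": 18712,
    "model_type": "ultra",
    "variables": {"n_splits": 10, "metric": "token_frequency_correlation"},
}

with open("split_half_spec.json", "w") as f:
    json.dump(spec, f, indent=2)
```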
## See Also
- mechinterp-state: Update hypotheses after validation
- mechinterp-summarizer: Document validation results
- mechinterp-runner: Execute validation experiments