```bash
# Clone the full repository
git clone https://github.com/jkitchin/jaxsr

# Or install only this skill into ~/.claude/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/jkitchin/jaxsr "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/jaxsr-review" ~/.claude/skills/jkitchin-jaxsr-jaxsr-review && rm -rf "$T"
```
`.claude/skills/jaxsr-review/SKILL.md`

# JAXSR Review Skill — Red-Team, Engineering & Pedagogical Review
Systematically review JAXSR code, documentation, guides, and notebooks for API correctness, engineering quality, and pedagogical clarity.
## Skill Activation

Activate this skill when the user invokes `/jaxsr-review` or asks to review, audit, or check correctness of JAXSR-related files (notebooks, guides, templates, source code).
## Invocation Syntax

```
/jaxsr-review <TARGET>                      # all scopes on one file/dir
/jaxsr-review --scope api <TARGET>          # API correctness only
/jaxsr-review --scope engineering <TARGET>  # safety/robustness only
/jaxsr-review --scope pedagogy <TARGET>     # clarity/explanation only
```

`TARGET` can be a file path, directory, or glob pattern. If no target is given, review the whole project (`examples/`, `docs/`, `src/jaxsr/`).
## Scope Mapping

| Scope | What it checks | CLAUDE.md checklists |
|---|---|---|
| `api` | Signatures, return types, imports, ANOVA filtering, copy-paste safety | #1 Red-Team, #5 Cross-Ref, #6 Copy-Paste |
| `engineering` | Numerical hazards, destructive ops, packaging, dead code, docstrings | #2 Software Engineering |
| `pedagogy` | Progression, term definitions, "why" explanations, coverage gaps | #3 Pedagogical, #4 Coverage Gap |
When no `--scope` is specified, run all three scopes.
## Phase 1: Parse Arguments

- Extract the `--scope` value (default: all three scopes).
- Extract the `TARGET` path(s). Resolve globs. Verify files exist.
- Classify each file by type: `.ipynb` (notebook), `.md` (guide/doc), `.py` (source/template).
- Read each target file. For notebooks, examine every code cell.
## Phase 2: API Correctness Review (scope = `api`)

This is the highest-priority review. Incorrect examples teach users wrong patterns.
### API Truth Table

For every JAXSR API call found in the target files, verify against this authoritative reference. Each entry lists: function/class, correct usage, and common mistake.
#### Return Types

| API | Returns | Common Mistake |
|---|---|---|
|  | 4-tuple per term | Unpacking with fewer elements — it's a 4-tuple |
|  | Plain `dict` | Attribute access — it's a plain dict; use `[...]` indexing |
|  | Plain `dict` | Attribute access — it's a plain dict; use `[...]` indexing |
|  | `dict` | Tuple unpacking — returns dict |
| `predict_interval` | `(y_pred, lower, upper)` from the method | Treating like dict — the method returns a tuple, the standalone function returns dict |
|  | `dict` | Treating return as array |
|  | dataclass | — |
#### Class Attributes

| Class | Correct Attribute | Common Mistake |
|---|---|---|
|  | (property) | Same name without the trailing underscore |
|  | (property) | Wrong name entirely |
|  |  | Wrong name |
|  |  | — |
|  | (list) | — |
|  |  | `p_value` can be `None` for summary rows |
#### Constructor Signatures

| Class/Function | Correct Signature | Common Mistake |
|---|---|---|
|  | 2nd positional arg | Wrong arg order |
|  | List of tuples | No separate weights param |
|  | Required first arg | Missing required first arg |
|  | Required first arg | Missing required first arg |
|  | Correct parameter name | Wrong param name |
|  | Correct parameter name | Wrong param name |
#### Sklearn Compatibility Protocol

| API | Correct Usage | Common Mistake |
|---|---|---|
| `get_params()` | Returns `dict` of constructor params | Accessing non-constructor attrs |
| `get_params(deep=True)` | Includes nested keys for nested components | Expecting only flat keys |
| `set_params()` | Sets constructor params, returns `self` | Passing non-constructor param names |
| `set_params()` with nested params | Double-underscore syntax for nested params | Underscores at the wrong level |
|  | Uses `get_params()`/`set_params()` | Manual parameter listing |
| `n_jobs` in sklearn | Always use `n_jobs=1` | `n_jobs=-1` — conflicts with JAX parallelism |
|  (in `Pipeline`) | Not a transformer; configure before creating the estimator | Including as a Pipeline step |
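To make the protocol rows concrete, here is a minimal runnable sketch. The `DemoRegressor` stand-in is illustrative only, not jaxsr's estimator; the point is the sklearn conventions themselves: `get_params()` exposes constructor params, `set_params()` returns the estimator, and `n_jobs=1` avoids clashing with JAX's own parallelism.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.model_selection import GridSearchCV

class DemoRegressor(BaseEstimator, RegressorMixin):
    """Stand-in estimator (illustrative; not jaxsr API)."""

    def __init__(self, max_degree=2):
        self.max_degree = max_degree  # constructor params only, sklearn-style

    def fit(self, X, y):
        return self  # fit must return self

    def predict(self, X):
        return np.zeros(len(X))

print(DemoRegressor().get_params())             # {'max_degree': 2}: constructor params only
est = DemoRegressor().set_params(max_degree=3)  # set_params returns the estimator itself

# n_jobs=1: sklearn's process-based parallelism conflicts with JAX
grid = GridSearchCV(DemoRegressor(), {"max_degree": [2, 3]}, n_jobs=1, cv=3)
```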
#### Method Names

| Class | Correct Method | Common Mistake |
|---|---|---|
|  | Separate methods, one per operation | Wrong method name |
#### Parameter Values

| Parameter | Valid Values | Common Mistake |
|---|---|---|
| `information_criterion` |  | `"cv"` — not supported as IC |
| `criterion` (standalone `compute_information_criterion`) |  | Note: values such as `"hqc"` are valid HERE but NOT in `information_criterion` |
|  | Defaults to 4 functions | Listing 7 default funcs (sin, cos, tan are NOT default) |
|  | `dict` mapping dimension index to allowed values | Wrong type |
### ANOVA Filtering

When iterating over `anova_result.rows` to compute percentages or display tables:

- MUST filter out rows with `source` in `{"Model", "Residual", "Total"}` — these are summary rows.
- MUST null-guard `p_value` — it is `None` for `"Residual"` and `"Total"` rows.
- Correct pattern (a display example with the `p_value` guard follows):

```python
for row in result.rows:
    if row.source in {"Model", "Residual", "Total"}:
        continue  # skip summary rows
    pct = 100 * row.sum_sq / total_ss
```
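When the table is displayed in full (summary rows included), the `p_value` guard from the second rule applies. A minimal sketch, assuming each row exposes `source`, `sum_sq`, and `p_value` as above, and that the `"Total"` row carries the total sum of squares:

```python
# Total SS taken from the "Total" summary row (an assumption, noted above).
total_ss = sum(r.sum_sq for r in result.rows if r.source == "Total")

for row in result.rows:
    # p_value is None for "Residual" and "Total" rows; guard before formatting.
    p_str = f"{row.p_value:.4g}" if row.p_value is not None else ""
    print(f"{row.source:<16} {row.sum_sq:10.4g}  {p_str}")

# Percentages still use only the filtered term rows.
term_rows = [r for r in result.rows if r.source not in {"Model", "Residual", "Total"}]
pcts = [100 * r.sum_sq / total_ss for r in term_rows]
```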
### Anti-False-Positive Rules

Do NOT flag these as errors:

- `compute_information_criterion(..., criterion="hqc")` — valid for standalone function
- `model.predict_interval()` returning a tuple — the method returns `(y_pred, lower, upper)`
- `BayesianModelAverage.weights` without underscore — this IS correct (no trailing `_`)
- Imports from `jaxsr` that exist in `__all__` — even if not commonly used
- Using `.item()` on JAX arrays outside JIT — this is fine
### Procedure

For each target file:

- Extract all JAXSR API calls — imports, function calls, class instantiations, attribute access.
- Check each call against the truth table above. Flag mismatches as CRITICAL.
- Check imports — verify every imported symbol exists in `jaxsr.__init__.__all__` (a helper sketch follows this list).
- Check return type usage — if a function returns a dict and code unpacks it as a tuple, flag CRITICAL.
- Check ANOVA loops — verify summary rows are filtered before computing percentages.
- Copy-paste safety — verify each code block is self-contained OR explicitly references prerequisites.
- Cross-references — verify any "See guides/..." or "See templates/..." references point to files that exist.
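For the import check, a small stdlib helper can mechanize the comparison. This is a sketch, not part of the skill's required tooling; it assumes `__all__` is declared as a literal list in `src/jaxsr/__init__.py`, and the example target path is hypothetical.

```python
import ast
from pathlib import Path

def declared_all(init_path: str = "src/jaxsr/__init__.py") -> set[str]:
    """Parse __all__ from the package __init__ (assumes a literal list)."""
    tree = ast.parse(Path(init_path).read_text())
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "__all__":
                    return set(ast.literal_eval(node.value))
    return set()

def jaxsr_imports(source: str) -> set[str]:
    """Collect names imported via `from jaxsr import ...` in reviewed code."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module == "jaxsr":
            names.update(alias.name for alias in node.names)
    return names

# Any leftover name is an import of a non-existent symbol -> flag CRITICAL.
code = Path("examples/some_example.py").read_text()  # hypothetical target
unknown = jaxsr_imports(code) - declared_all()
```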
## Phase 3: Software Engineering Review (scope = `engineering`)

### Numerical Hazards

Check for:

- `jnp.linalg.inv()` on potentially singular matrices → suggest `lstsq`, `pinv`, or SVD
- `jnp.log(x)` without guarding `x > 0` → suggest `jnp.log(jnp.clip(x, 1e-30))`
- Division by zero without guards
- Empty array operations (`mean` of empty, indexing empty)
- `float()` or `int()` on JAX tracers inside `@jit`-decorated functions
- Python `if/else` on runtime values inside `@jit` → suggest `jax.lax.cond` or `jnp.where` (see the sketch below)
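The following sketch shows the suggested safe patterns together; it assumes nothing beyond standard JAX:

```python
import jax
import jax.numpy as jnp

@jax.jit
def safe_transform(x, y):
    # Guarded log: clip keeps the argument strictly positive.
    log_x = jnp.log(jnp.clip(x, 1e-30))
    # Guarded division: substitute a safe denominator, then mask the result.
    safe_y = jnp.where(y == 0.0, 1.0, y)
    ratio = jnp.where(y == 0.0, 0.0, x / safe_y)
    # Runtime branch inside @jit: jnp.where, not Python if/else.
    return jnp.where(x > 0.0, log_x, ratio)

# Least squares instead of inv() for potentially singular systems.
A = jnp.eye(3)
b = jnp.ones(3)
coef, residuals, rank, sv = jnp.linalg.lstsq(A, b)
```

The double `jnp.where` on the division is deliberate: it keeps the guarded branch NaN-free even under differentiation.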
### Destructive Operations

Check for:

- `shutil.rmtree()` without path validation against a deny-list (`/`, `/home`, `/usr`, `/etc`, `/var`, `/tmp`, `$HOME`) — see the sketch below
- `os.remove()` or `os.unlink()` without existence check
- File overwrites without backup or confirmation
- `subprocess` calls with unsanitized inputs
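A minimal sketch of the guards this checklist asks for; the deny-list entries are the ones listed above, and the file path is illustrative:

```python
import os
import shutil
from pathlib import Path

DENY_LIST = {"/", "/home", "/usr", "/etc", "/var", "/tmp", os.path.expanduser("~")}

def safe_rmtree(path: str) -> None:
    """Delete a directory tree only after validating against the deny-list."""
    resolved = Path(path).resolve()
    if str(resolved) in DENY_LIST:
        raise ValueError(f"refusing to delete protected path: {resolved}")
    if resolved.exists():
        shutil.rmtree(resolved)

# Existence check before removing a single file.
target = Path("scratch/output.csv")  # illustrative path
if target.exists():
    target.unlink()
```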
### Packaging & Sync

Check for:

- Symbols in `__all__` that are not imported in `__init__.py`
- Imports in `__init__.py` that are not in `__all__`
- Duplicate file inclusions in `pyproject.toml` (`packages` vs `force-include`)
- `.claude/skills/jaxsr/` and `src/jaxsr/skill/` out of sync
### Dead Code

Check for:

- Unused imports (F401)
- Unused variables (F841)
- Bare expressions with no side effects (`len(x)`, `list(x)` without assignment)
- Functions not exported and not called
### Docstring Quality (for .py files)

Check that public functions have:

- NumPy-style docstring with Parameters, Returns, and Raises sections
- Type annotations on all parameters and return type
- Docstring examples using actual function signatures (not stale names); a compliant example follows
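For reference, a function that satisfies all three checks looks like this; the function itself is illustrative, not jaxsr API:

```python
import numpy as np

def relative_error(y_true: np.ndarray, y_pred: np.ndarray) -> np.ndarray:
    """Compute element-wise relative error.

    Parameters
    ----------
    y_true : np.ndarray
        Reference values; must be nonzero.
    y_pred : np.ndarray
        Predictions, same shape as ``y_true``.

    Returns
    -------
    np.ndarray
        ``|y_pred - y_true| / |y_true|``, element-wise.

    Raises
    ------
    ValueError
        If the input shapes differ.
    """
    if y_true.shape != y_pred.shape:
        raise ValueError("shape mismatch")
    return np.abs(y_pred - y_true) / np.abs(y_true)
```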
### Notebook-Specific Checks

For `.ipynb` files:

- Guard against negative values in physical simulations
- Use `scipy.special.erfinv`, not `np.math.erfinv` (see snippet below)
- Add null-guards for ANOVA `p_value`
- Check for cross-cell variable dependencies that break when cells run out of order
- Check for temp files that should be cleaned up
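The `erfinv` item is a silent trap: the stdlib `math` module has no `erfinv`, and the `np.math` alias itself was removed in NumPy 2.0, so `np.math.erfinv` fails either way. The correct import:

```python
from scipy.special import erfinv

z = erfinv(0.95)  # inverse error function; fine outside @jit
print(z)
```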
## Phase 4: Pedagogical Review (scope = `pedagogy`)

Read the target as a first-time user and evaluate:
### Structure & Flow
- Is there a logical progression from simple to advanced?
- Are concepts introduced before they're used?
- Does the material start with a motivating example or question?
### Terminology
- Are domain-specific terms explained on first use?
- Examples: "profile likelihood", "Pareto front", "AICc correction", "conformal prediction", "canonical analysis", "stationary point"
- Is terminology consistent within the document?
- Check: "information criterion" vs "IC" vs "selection criterion"
- Check: "basis functions" vs "candidate terms" vs "features"
- Check: "selection strategy" vs "search strategy" vs "algorithm"
### Explanation Quality
- Are the "why" questions answered, not just the "how"?
- Example: Why AICc over BIC? When would you choose one over the other?
- Are decision points clearly marked with guidance?
- Are warnings and caveats placed before the code they apply to, not after?
### Code Block Self-Sufficiency
Every code block must either:
- Be self-contained (includes all imports and data setup), or
- Clearly state "Continuing from above..." with a reference to the prerequisite section
Common failures to flag:

- Missing `import numpy as np` or `from jaxsr import ...`
- Using variables defined in earlier code blocks without re-defining them
- Using a `model` variable before the fitting code block
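By contrast, a minimal self-contained block carries its own imports and data setup. Plain NumPy is used here; any jaxsr-specific setup would follow the same pattern:

```python
# Self-contained: imports and data setup included; runs top-to-bottom on its own.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 2))
y = 2.0 * X[:, 0] - 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.01, size=50)
print(X.shape, y.shape)
```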
### Coverage Gap Analysis
Check which JAXSR features are NOT covered by any guide, template, or notebook:
Currently covered:
- Basis library building, Model fitting & selection, Uncertainty quantification
- Constraints, DOE workflow, Active learning, RSM, Known-model fitting, CLI
- Scikit-learn integration (cross-validation, GridSearchCV, Pipeline, model comparison)
Known gaps (flag as INFO if relevant to the target):
- Metrics comparison guide (R^2 vs AIC vs cross-validation)
- Export & reporting guide (JSON, LaTeX, callable, Excel, Word)
- Categorical variables (indicators, interactions, encoding/decoding)
- SISSO / power-law / rational-form basis builders
- Model serialization round-trip
- BayesianModelAverage standalone workflow
- Conformal prediction standalone usage
- Multi-response / multi-objective workflows
## Phase 5: Generate Report

Produce a structured markdown report with this exact format:

```markdown
# Review Report

**Target:** <file or directory path>
**Scope:** <api | engineering | pedagogy | all>
**Files reviewed:** <count>

## Summary

| Severity | Count |
|----------|-------|
| CRITICAL | N |
| WARNING  | N |
| INFO     | N |

## CRITICAL

### [C1] <short title>
- **File:** <path>, <location (line N or cell N, line N)>
- **Code:** `<offending code snippet>`
- **Problem:** <what is wrong>
- **Fix:** `<corrected code>`
- **Source:** <source file:line where the correct API is defined>

### [C2] ...

## WARNING

### [W1] <short title>
- **File:** <path>, <location>
- **Code:** `<relevant code>`
- **Problem:** <what could go wrong>
- **Suggestion:** <recommended change>

### [W2] ...

## INFO

### [I1] <short title>
- **File:** <path>, <location>
- **Note:** <observation or suggestion>

---
*Review generated by `/jaxsr-review` skill*
```
### Severity Definitions
| Severity | Definition |
|---|---|
| CRITICAL | Will cause runtime errors, wrong results, or teaches incorrect API usage. Must fix before merging. |
| WARNING | Won't crash but is fragile, unclear, or inconsistent. Should fix. |
| INFO | Stylistic, coverage gap, or minor improvement opportunity. Nice to have. |
### Classification Rules

- Wrong return type usage (dict vs tuple, wrong attribute) → CRITICAL
- Wrong parameter name or argument order → CRITICAL
- Wrong parameter value (e.g., `information_criterion="cv"`) → CRITICAL
- Missing ANOVA summary row filter → CRITICAL
- Import of non-existent symbol → CRITICAL
- Cross-reference to non-existent file → WARNING
- Missing null-guard for ANOVA `p_value` → WARNING
- `jnp.linalg.inv()` without singularity guard → WARNING
- Unexplained domain term → WARNING
- Missing imports in code block → WARNING
- Coverage gap relevant to the target → INFO
- Inconsistent terminology → INFO
- Style suggestion → INFO
## Execution Notes

- Read each target file completely before analyzing. Do not guess at contents.
- For directories, recursively find all `.py`, `.md`, and `.ipynb` files.
- For notebooks, parse all code cells and markdown cells separately.
- When checking imports, read `src/jaxsr/__init__.py` to get the current `__all__` list.
- When verifying cross-references, use Glob to check file existence.
- If no issues are found for a scope, state "No issues found" in that section.
- Always include the Summary table, even if all counts are 0.