Claude-code-minoan test-harness-auditor

Audit a repo's test, lint, type-check, static analysis, build, and debug infrastructure for AI coding agents. Generate scored reports and optimized configs for the lint-on-write hook. Triggers on audit tests, test harness, lint setup, check test infrastructure, entering a new repo.

install

source · Clone the upstream repo

git clone https://github.com/tdimino/claude-code-minoan

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/tdimino/claude-code-minoan "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/core-development/test-harness-auditor" ~/.claude/skills/tdimino-claude-code-minoan-test-harness-auditor && rm -rf "$T"

manifest: skills/core-development/test-harness-auditor/SKILL.md

source content

Test Harness Auditor

Audit any repo's feedback infrastructure across six layers and generate optimized configs for AI coding agents.

When to Run

Entering a new repo with no `.claude/lint-rules.json`
User asks to audit tests, lint setup, or agent infrastructure
After cloning a repo to check what feedback loops exist
Periodically to catch configuration drift

Two-Phase Workflow

Phase 1: Audit (read-only)

Run the audit script to scan the current repo:

uv run ~/.claude/skills/test-harness-auditor/scripts/audit.py

Or target a specific directory:

uv run ~/.claude/skills/test-harness-auditor/scripts/audit.py /path/to/repo

For machine-readable output (consumed by Phase 2):

uv run ~/.claude/skills/test-harness-auditor/scripts/audit.py --json > /tmp/audit.json

To save a snapshot for drift detection (tracks score changes over time):

uv run ~/.claude/skills/test-harness-auditor/scripts/audit.py --save

Combine flags: `--json --save` saves the snapshot AND outputs JSON. On subsequent `--save` runs, the report includes a drift section showing score regressions, config changes, and residue file changes.
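The score-regression part of drift detection can be illustrated with a small comparison of two snapshots. This is a minimal sketch, not the logic in `audit.py`: it assumes each snapshot reduces to a mapping of layer name to 0-3 score, and the function names `score_drift` and `regressions` are hypothetical.

```python
from typing import Dict, List


def score_drift(previous: Dict[str, int], current: Dict[str, int]) -> Dict[str, int]:
    """Return {layer: delta} for every layer whose score changed.

    Assumption: a layer missing from a snapshot counts as score 0.
    """
    deltas: Dict[str, int] = {}
    for layer in sorted(set(previous) | set(current)):
        delta = current.get(layer, 0) - previous.get(layer, 0)
        if delta != 0:
            deltas[layer] = delta
    return deltas


def regressions(previous: Dict[str, int], current: Dict[str, int]) -> List[str]:
    """Return the layers whose score went down between two snapshots."""
    return sorted(
        layer for layer, delta in score_drift(previous, current).items() if delta < 0
    )
```

A negative delta marks a regression; layers with unchanged scores are left out so the drift section stays short.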

The script produces a structured Markdown report (or JSON with `--json`) with:

Stack summary: detected language, frameworks, package manager, actual scripts from package.json
Scorecard: a 0-3 score for each of the six layers (test, lint, type-check, static analysis, build, debug)
Findings: per-layer details on what was detected
Debugging residue: files matching `*_v2.*`, `*_backup.*`, `*_fixed.*` patterns
Recommendations: prioritized by impact on agent feedback quality (P0-P3)

Present the report to the user. Ask which recommendations to implement before proceeding to Phase 2.
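The debugging-residue patterns listed in the report contents are ordinary shell-style globs, so a check for them can be sketched with the standard library. This is an illustration under assumptions, not the detection code in `audit.py`: the helper names are hypothetical, and matching on the bare filename (rather than the full path) is a choice made here.

```python
from fnmatch import fnmatchcase
from pathlib import Path
from typing import List

# The three residue globs named in the report description.
RESIDUE_PATTERNS = ("*_v2.*", "*_backup.*", "*_fixed.*")


def is_residue(filename: str) -> bool:
    """True if a bare filename matches any residue glob (case-sensitive)."""
    return any(fnmatchcase(filename, pattern) for pattern in RESIDUE_PATTERNS)


def find_residue(root: str) -> List[Path]:
    """Walk a directory tree and collect files whose names look like residue."""
    return sorted(
        path for path in Path(root).rglob("*")
        if path.is_file() and is_residue(path.name)
    )
```

Each glob requires a dot after the suffix, so `parser_v2.py` matches while an extensionless `notes_v2` does not.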

Phase 1.5: Convention Extraction (optional)

Extract "never X"/"always Y" constraints from CLAUDE.md into candidate lint rules:

uv run ~/.claude/skills/test-harness-auditor/scripts/extract_conventions.py

Outputs JSON with candidate lint-rules.json entries derived from project constraints. Present candidates to the user for approval before merging.
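The first step of convention extraction, finding the "never X"/"always Y" sentences, can be sketched with a regular expression. This is a simplified illustration and not the behavior of `extract_conventions.py`: it assumes constraints sit at the start of a line or bullet, and the `kind`/`text` output fields are hypothetical. Turning each sentence into a grep pattern still needs human review.

```python
import re
from typing import Dict, List

# Matches lines such as "- Never use print()" or "Always run tests first".
# Assumption: the keyword opens the line, optionally after a list marker.
CONSTRAINT_RE = re.compile(
    r"^\s*(?:[-*]\s+)?(never|always)\b\s+(.*\S)", re.IGNORECASE
)


def extract_constraints(markdown: str) -> List[Dict[str, str]]:
    """Return one {"kind", "text"} entry per never/always line found."""
    found: List[Dict[str, str]] = []
    for line in markdown.splitlines():
        match = CONSTRAINT_RE.match(line)
        if match:
            found.append({"kind": match.group(1).lower(), "text": match.group(2)})
    return found
```

The word boundary after the keyword keeps words like "Nevertheless" from being treated as constraints.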

Phase 2: Config Generation (after user approval)

Run the generation script (optionally with audit JSON for accurate commands):

uv run ~/.claude/skills/test-harness-auditor/scripts/generate.py --audit /tmp/audit.json

Or without audit data (re-detects stack):

uv run ~/.claude/skills/test-harness-auditor/scripts/generate.py

When `--audit` is used, generate.py uses actual commands from package.json (vitest, playwright, biome, etc.) instead of generic templates, and detects separate E2E vs unit test runners.

This produces three outputs:

`.claude/lint-rules.json`: custom grep-based rules for the lint-on-write hook
- Stack-specific rules (security, debugging residue, error boundaries, observability)
- Auto-includes matching rule packs from `rule-library/` (react, rust-workspace, python-cli; functional-ts is opt-in only)
- Merges with existing config if present (preserves user customizations)
- Tagged rules (`_tag` field) enable idempotent re-runs
CLAUDE.md testing section: test/lint/typecheck/build/SA commands
- Follows claude-md-manager conventions (command-first, concise)
- Section-aware merge: when existing CLAUDE.md is found, surgically replaces only `## Commands` and `## Testing` sections, preserving all other content
- Present as a proposal; do not overwrite existing CLAUDE.md content
Hook recommendations: which PostToolUse hooks to enable
- lint-on-write (primary), test-on-fix, type-check-on-write

For each generated config, present it to the user and ask for approval before writing.
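The way tagged rules make re-runs idempotent while preserving user customizations can be sketched as a tag-keyed merge. This is a minimal sketch, not the merge in `generate.py`: only the `_tag` field comes from the description above, the other rule fields (`id`, `pattern`) are hypothetical, and it assumes every generated rule carries a `_tag`.

```python
from typing import Dict, List


def merge_rules(existing: List[Dict], generated: List[Dict]) -> List[Dict]:
    """Merge generated rules into an existing rule list.

    Untagged rules are treated as user-written and always kept. An existing
    rule whose `_tag` reappears in `generated` is replaced by the new version.
    """
    new_tags = {rule["_tag"] for rule in generated if "_tag" in rule}
    kept = [rule for rule in existing if rule.get("_tag") not in new_tags]
    return kept + generated
```

Because replacement is keyed on the tag, running the merge twice with the same generated rules yields the same list as running it once.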

Scoring System

Score	Meaning
0	Absent — agent is flying blind on this layer
1	Minimal — basic tool present but not configured for agents
2	Adequate — tool configured and runnable
3	Excellent — strict mode, mutation testing, or advanced config

Six Assessment Layers

Test suite: framework, runner command, coverage config, mutation testing
Linting: standard linter, custom rules, agent-specific rules
Type checking: type checker, strict mode, CI integration
Static analysis: security scanners, complexity checkers, dependency audit
Build/compilation: build command, incremental build, CI validation
Debugger/REPL: debugger availability, REPL access

Integration

lint-on-write hook: generated `lint-rules.json` is consumed by `~/.claude/hooks/lint-on-write.py` (violations are severity-tiered: BLOCKING > HIGH > MEDIUM)
claude-md-manager: generated CLAUDE.md sections follow its conventions (WHAT/WHY/HOW, command-first)
agents-md-manager: for cross-agent compatibility, consider also generating AGENTS.md
agnix: complementary tool that validates the agent config files themselves (385 rules for CLAUDE.md/AGENTS.md/SKILL.md covering stale paths, dead commands, context rot). This skill validates the codebase infrastructure instead.
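The severity tiering mentioned for the lint-on-write hook (BLOCKING > HIGH > MEDIUM) amounts to an ordering over violations. The sketch below shows one way to apply that ordering; it is not taken from `lint-on-write.py`, and the violation shape (a dict with `rule` and `severity` keys) is an assumption made for illustration.

```python
from typing import Dict, List

# Lower rank sorts first, mirroring BLOCKING > HIGH > MEDIUM.
SEVERITY_RANK = {"BLOCKING": 0, "HIGH": 1, "MEDIUM": 2}


def sort_by_severity(violations: List[Dict]) -> List[Dict]:
    """Order violations most severe first; unknown severities go last."""
    unknown = len(SEVERITY_RANK)
    return sorted(
        violations,
        key=lambda v: SEVERITY_RANK.get(v.get("severity"), unknown),
    )
```

`sorted` is stable, so violations within the same tier keep the order in which they were found.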

Rule Library

44 rules across 4 domain-specific packs in `rule-library/`. Auto-loaded packs are selected by `generate.py` based on detected frameworks and stack. All patterns are single-line `grep -En` detectable.

Pack	Matches	Rules	Highlights
`react.json`	react, next frameworks	10	disabled-exhaustive-deps, key-index, async-use-effect, disabled-hooks-rule, context-object-literal
`rust-workspace.json`	rust stack	8	expect-empty-msg, anyhow-in-lib, dbg-macro, panic-outside-tests, println-residue
`python-cli.json`	python stack	13	shell-true, insecure-deserialization, mutable-default-arg, requests-no-timeout, commonprefix
`functional-ts.json`	Opt-in only	13	array-mutation, sort-reverse, delete-operator, any-type, enum-declaration, namespace-declaration

Opt-in packs

Packs with `"_opt_in": true` are never auto-loaded. The functional-ts pack enforces strict-FP immutability patterns (Open Souls paradigm). To use it, manually copy its rules into your project's `.claude/lint-rules.json`.

Exclusion fields

Rules support two exclusion mechanisms:

`exclude_paths`: glob-matched against file paths (e.g. `"*/bin/*"`, `"*/main.rs"`). Skips the file entirely before grep runs.
`exclude_patterns`: regex-matched against grep output line text (e.g. `"test"`, `"// nosec"`). Filters matched lines after grep runs.
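The two exclusion mechanisms act at different stages, which the following sketch makes concrete. It is an illustration, not the hook's code: it assumes `fnmatch`-style glob semantics for `exclude_paths` and `re.search` semantics for `exclude_patterns`, and both helper names are hypothetical.

```python
import re
from fnmatch import fnmatchcase
from typing import Dict, List


def file_is_excluded(path: str, rule: Dict) -> bool:
    """Stage 1: decide before grep runs whether to skip the file entirely."""
    return any(fnmatchcase(path, glob) for glob in rule.get("exclude_paths", []))


def filter_matches(grep_lines: List[str], rule: Dict) -> List[str]:
    """Stage 2: drop grep output lines that match any exclude pattern."""
    patterns = [re.compile(p) for p in rule.get("exclude_patterns", [])]
    return [
        line for line in grep_lines
        if not any(pattern.search(line) for pattern in patterns)
    ]
```

With `fnmatch` semantics, `*` also matches path separators, so `"*/main.rs"` covers a `main.rs` at any depth below the root.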

To add a custom rule pack, create a JSON file in `rule-library/` with `_frameworks` (list) and/or `_stack` (string) matching fields, plus a `rules` array. Set `"_opt_in": true` to prevent auto-loading. Pack rules use a `pack:` prefix in `_tag` for dedup. See `rule-library/INDEX.md` for the full inventory.
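The pack fields described above suggest a simple selection rule, sketched here together with an example pack written as a Python dict. This is a guess at the shape, not the loader in `generate.py`: treating a framework match or a stack match as sufficient is an assumption, and the `django-example` pack with its `id`, `pattern`, and `message` fields is invented for illustration.

```python
import json
from typing import Dict, Iterable, Optional


def pack_matches(pack: Dict, frameworks: Iterable[str], stack: Optional[str]) -> bool:
    """Decide whether a pack would be auto-loaded for a detected stack.

    Assumption: opt-in packs never auto-load; otherwise a match on either
    `_frameworks` or `_stack` is enough.
    """
    if pack.get("_opt_in"):
        return False
    by_framework = bool(set(pack.get("_frameworks", [])) & set(frameworks))
    by_stack = pack.get("_stack") is not None and pack.get("_stack") == stack
    return by_framework or by_stack


# Hypothetical pack contents; only the underscore-prefixed fields, the
# `rules` array, and the `pack:` tag prefix come from the text above.
custom_pack = {
    "_frameworks": ["django"],
    "_stack": "python",
    "rules": [
        {
            "id": "raw-sql",
            "pattern": r"\.raw\(",
            "message": "Avoid raw SQL queries",
            "_tag": "pack:django-example:raw-sql",
        }
    ],
}

# Serialized form that could be saved as a JSON file in rule-library/.
custom_pack_json = json.dumps(custom_pack, indent=2)
```

Adding `"_opt_in": True` to the dict would make `pack_matches` return `False` regardless of the detected stack.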

References

Load these on-demand when deeper context is needed:

`references/stack-profiles.md`: per-stack detection rules and tool recommendations
`references/factory-lint-categories.md`: 7 Factory.ai agent lint categories with grep patterns
`references/anti-patterns.md`: 10 AI-specific anti-patterns with detection heuristics

Scope

First-class stacks: JavaScript/TypeScript, Rust, Python, Go, Ruby
Other stacks get basic detection with generic recommendations
Does not write or modify test files
Does not install tools (recommends what to install)
Does not modify CI/CD pipelines