Claude-code-minoan test-harness-auditor

Audit a repo's test, lint, type-check, static analysis, build, and debug infrastructure for AI coding agents. Generate scored reports and optimized configs for the lint-on-write hook. Triggers on the phrases "audit tests", "test harness", "lint setup", and "check test infrastructure", or when entering a new repo.

install
source · Clone the upstream repo
git clone https://github.com/tdimino/claude-code-minoan
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/tdimino/claude-code-minoan "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/core-development/test-harness-auditor" ~/.claude/skills/tdimino-claude-code-minoan-test-harness-auditor && rm -rf "$T"
manifest: skills/core-development/test-harness-auditor/SKILL.md

Test Harness Auditor

Audit any repo's feedback infrastructure across six layers and generate optimized configs for AI coding agents.

When to Run

  • Entering a new repo with no .claude/lint-rules.json
  • User asks to audit tests, lint setup, or agent infrastructure
  • After cloning a repo to check what feedback loops exist
  • Periodically to catch configuration drift

Two-Phase Workflow

Phase 1: Audit (read-only)

Run the audit script to scan the current repo:

uv run ~/.claude/skills/test-harness-auditor/scripts/audit.py

Or target a specific directory:

uv run ~/.claude/skills/test-harness-auditor/scripts/audit.py /path/to/repo

For machine-readable output (consumed by Phase 2):

uv run ~/.claude/skills/test-harness-auditor/scripts/audit.py --json > /tmp/audit.json

To save a snapshot for drift detection (tracks score changes over time):

uv run ~/.claude/skills/test-harness-auditor/scripts/audit.py --save

Flags can be combined: --json --save saves the snapshot and also outputs JSON. On subsequent --save runs, the report includes a drift section showing score regressions, config changes, and residue file changes.
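
The score-regression part of the drift section can be pictured as a comparison of two snapshots. The following is a minimal sketch, not the logic in audit.py: it assumes a snapshot is a dict with a "scores" mapping from layer name to a 0-3 score, and both that schema and the function name score_regressions are hypothetical.

```python
def score_regressions(previous: dict, current: dict) -> dict:
    """Return {layer: (old_score, new_score)} for every layer whose score dropped.

    Assumes a hypothetical snapshot shape: {"scores": {"<layer>": <0-3>}}.
    """
    old_scores = previous.get("scores", {})
    new_scores = current.get("scores", {})
    return {
        layer: (old, new_scores[layer])
        for layer, old in old_scores.items()
        if layer in new_scores and new_scores[layer] < old
    }


# Illustrative snapshots: lint regressed, type-check improved, test unchanged.
previous = {"scores": {"test": 2, "lint": 3, "type-check": 1}}
current = {"scores": {"test": 2, "lint": 1, "type-check": 2}}
regressions = score_regressions(previous, current)
print(regressions)
```

Only layers present in both snapshots are compared, so adding or removing a layer between runs is not reported as a regression in this sketch.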

The script produces a structured Markdown report (or JSON with --json) with:

  • Stack summary: detected language, frameworks, package manager, actual scripts from package.json
  • Scorecard: 0-3 score for each of the six layers (test, lint, type-check, static analysis, build, debug)
  • Findings: per-layer details on what was detected
  • Debugging residue: files matching *_v2.*, *_backup.*, *_fixed.* patterns
  • Recommendations: prioritized by impact on agent feedback quality (P0-P3)

Present the report to the user. Ask which recommendations to implement before proceeding to Phase 2.
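
The debugging-residue check in the report can be approximated with glob matching on file names. This is a hedged sketch rather than the audit script's actual code: the three patterns come from the list above, while the helper name find_residue and the sample paths are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns taken from the report description above.
RESIDUE_PATTERNS = ("*_v2.*", "*_backup.*", "*_fixed.*")


def find_residue(paths):
    """Return the paths whose file name matches any residue pattern."""
    return [
        path
        for path in paths
        if any(
            fnmatchcase(PurePosixPath(path).name, pattern)
            for pattern in RESIDUE_PATTERNS
        )
    ]


# Hypothetical file listing for illustration.
files = [
    "src/parser.py",
    "src/parser_v2.py",
    "lib/db_backup.sql",
    "docs/fixed_width.md",
]
residue = find_residue(files)
print(residue)
```

Matching is done on the file name only, so a directory called old_backup.d would not flag every file beneath it.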

Phase 1.5: Convention Extraction (optional)

Extract "never X"/"always Y" constraints from CLAUDE.md into candidate lint rules:

uv run ~/.claude/skills/test-harness-auditor/scripts/extract_conventions.py

Outputs JSON with candidate lint-rules.json entries derived from project constraints. Present candidates to the user for approval before merging.
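
As a rough illustration of what extracting "never X"/"always Y" constraints could look like, the sketch below scans Markdown text line by line with a regular expression. It is an assumption-laden stand-in for extract_conventions.py: the regex, the function name extract_constraints, and the output shape ("kind" and "text" keys) are all hypothetical and are not the script's real candidate format.

```python
import re

# Matches "never ..." or "always ..." anywhere in a line, case-insensitively.
CONSTRAINT_RE = re.compile(r"\b(never|always)\b\s+(.+)", re.IGNORECASE)


def extract_constraints(markdown_text):
    """Return one candidate dict per line that states a never/always constraint."""
    candidates = []
    for line in markdown_text.splitlines():
        match = CONSTRAINT_RE.search(line)
        if match:
            candidates.append(
                {
                    "kind": match.group(1).lower(),
                    "text": match.group(2).strip().rstrip("."),
                }
            )
    return candidates


# Hypothetical CLAUDE.md excerpt.
sample = """\
# Project rules
- Never use print() for logging.
- Always pin dependency versions.
- Prefer small functions.
"""
candidates = extract_constraints(sample)
print(candidates)
```

Turning a candidate such as "use print() for logging" into an actual grep pattern still needs human judgment, which is why candidates are presented for approval before merging.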

Phase 2: Config Generation (after user approval)

Run the generation script (optionally with audit JSON for accurate commands):

uv run ~/.claude/skills/test-harness-auditor/scripts/generate.py --audit /tmp/audit.json

Or without audit data (re-detects stack):

uv run ~/.claude/skills/test-harness-auditor/scripts/generate.py

When --audit is used, generate.py uses actual commands from package.json (vitest, playwright, biome, etc.) instead of generic templates, and detects separate E2E vs unit test runners.

This produces three outputs:

  1. .claude/lint-rules.json: custom grep-based rules for the lint-on-write hook

    • Stack-specific rules (security, debugging residue, error boundaries, observability)
    • Auto-includes matching rule packs from rule-library/ (react, rust-workspace, python-cli; functional-ts is opt-in only)
    • Merges with existing config if present (preserves user customizations)
    • Tagged rules (_tag field) enable idempotent re-runs
  2. CLAUDE.md testing section: test/lint/typecheck/build/SA commands

    • Follows claude-md-manager conventions (command-first, concise)
    • Section-aware merge: when existing CLAUDE.md is found, surgically replaces only ## Commands and ## Testing sections, preserving all other content
    • Present as a proposal; do not overwrite existing CLAUDE.md content
  3. Hook recommendations: which PostToolUse hooks to enable

    • lint-on-write (primary), test-on-fix, type-check-on-write

For each generated config, present it to the user and ask for approval before writing.
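
The combination of "merges with existing config" and "tagged rules enable idempotent re-runs" can be sketched as a merge keyed on the _tag field. This is one plausible reading, not generate.py's implementation: only the _tag field name comes from this document, while the "name" and "pattern" keys, the tag values, and the function merge_rules are hypothetical.

```python
def merge_rules(existing, generated):
    """Merge generated rules into an existing rule list, keyed on `_tag`.

    Untagged rules are treated as user customizations and always kept.
    A tagged rule is replaced when the generator emits the same tag again,
    so running the merge twice gives the same result as running it once.
    """
    generated_tags = {rule["_tag"] for rule in generated if "_tag" in rule}
    kept = [rule for rule in existing if rule.get("_tag") not in generated_tags]
    return kept + list(generated)


# Hypothetical rules: one hand-written, one produced by an earlier run.
existing = [
    {"name": "team-rule", "pattern": "TODO"},
    {"name": "no-debugger", "pattern": "debugger", "_tag": "auto:no-debugger"},
]
generated = [
    {"name": "no-debugger", "pattern": r"\bdebugger\b", "_tag": "auto:no-debugger"},
]

once = merge_rules(existing, generated)
twice = merge_rules(once, generated)
print(once == twice)
```

The design choice worth noting is that the tag, not the rule name, is the identity: a user can rename or rewrite an untagged rule freely without a later run duplicating or clobbering it.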

Scoring System

Score  Meaning
0      Absent: agent is flying blind on this layer
1      Minimal: basic tool present but not configured for agents
2      Adequate: tool configured and runnable
3      Excellent: strict mode, mutation testing, or advanced config

Six Assessment Layers

  1. Test suite: framework, runner command, coverage config, mutation testing
  2. Linting: standard linter, custom rules, agent-specific rules
  3. Type checking: type checker, strict mode, CI integration
  4. Static analysis: security scanners, complexity checkers, dependency audit
  5. Build/compilation: build command, incremental build, CI validation
  6. Debugger/REPL: debugger availability, REPL access

Integration

  • lint-on-write hook: generated lint-rules.json is consumed by ~/.claude/hooks/lint-on-write.py (violations are severity-tiered: BLOCKING > HIGH > MEDIUM)
  • claude-md-manager: generated CLAUDE.md sections follow its conventions (WHAT/WHY/HOW, command-first)
  • agents-md-manager: for cross-agent compatibility, consider also generating AGENTS.md
  • agnix: complementary tool that validates the agent config files themselves (385 rules covering stale paths, dead commands, and context rot in CLAUDE.md/AGENTS.md/SKILL.md). This skill validates the codebase infrastructure instead.
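
The severity tiering noted for the lint-on-write hook (BLOCKING > HIGH > MEDIUM) implies an ordering when violations are reported. A minimal sketch of that ordering follows; the violation dict shape, the function name sort_by_severity, and the severities assigned to the sample rules are assumptions for illustration, not the hook's actual data model.

```python
# Lower number means more severe; tier names are taken from the text above.
SEVERITY_ORDER = {"BLOCKING": 0, "HIGH": 1, "MEDIUM": 2}


def sort_by_severity(violations):
    """Return violations ordered most severe first (stable within a tier)."""
    return sorted(violations, key=lambda v: SEVERITY_ORDER[v["severity"]])


# Hypothetical violations; the severities here are illustrative only.
violations = [
    {"rule": "console-log", "severity": "MEDIUM"},
    {"rule": "shell-true", "severity": "BLOCKING"},
    {"rule": "any-type", "severity": "HIGH"},
]
ordered = sort_by_severity(violations)
print([v["rule"] for v in ordered])
```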

Rule Library

44 rules across 4 domain-specific packs in rule-library/. Auto-loaded packs are selected by generate.py based on detected frameworks and stack. All patterns are single-line and detectable with grep -En.

Pack                 Matches                 Rules  Highlights
react.json           react, next frameworks  10     disabled-exhaustive-deps, key-index, async-use-effect, disabled-hooks-rule, context-object-literal
rust-workspace.json  rust stack              8      expect-empty-msg, anyhow-in-lib, dbg-macro, panic-outside-tests, println-residue
python-cli.json      python stack            13     shell-true, insecure-deserialization, mutable-default-arg, requests-no-timeout, commonprefix
functional-ts.json   Opt-in only             13     array-mutation, sort-reverse, delete-operator, any-type, enum-declaration, namespace-declaration

Opt-in packs

Packs with "_opt_in": true are never auto-loaded. The functional-ts pack enforces strict-FP immutability patterns (Open Souls paradigm). To use it, manually copy its rules into your project's .claude/lint-rules.json.

Exclusion fields

Rules support two exclusion mechanisms:

  • exclude_paths: glob-matched against file paths (e.g. "*/bin/*", "*/main.rs"). Skips the file entirely before grep runs.
  • exclude_patterns: regex-matched against grep output line text (e.g. "test", "// nosec"). Filters matched lines after grep runs.
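
The two exclusion stages can be sketched as follows. This is an illustration under stated assumptions, not the hook's code: Python's re module stands in for grep -En, fnmatch-style globbing stands in for whatever glob matcher the hook uses, and the "pattern" key and the function name apply_rule are hypothetical. Only exclude_paths and exclude_patterns are field names from this document.

```python
import re
from fnmatch import fnmatchcase


def apply_rule(rule, path, text):
    """Return (line_number, line) hits for one rule, honoring both exclusions."""
    # Stage 1, exclude_paths: skip the file entirely before any matching runs.
    if any(fnmatchcase(path, glob) for glob in rule.get("exclude_paths", [])):
        return []

    pattern = re.compile(rule["pattern"])
    excludes = [re.compile(p) for p in rule.get("exclude_patterns", [])]

    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Stage 2, exclude_patterns: drop matched lines whose text also
        # matches any exclusion regex.
        if pattern.search(line) and not any(e.search(line) for e in excludes):
            hits.append((number, line))
    return hits


# Hypothetical rule and source text.
rule = {
    "pattern": r"\.unwrap\(\)",
    "exclude_paths": ["*/main.rs"],
    "exclude_patterns": ["// nosec"],
}
source = "let a = x.unwrap();\nlet b = y.unwrap(); // nosec\n"

lib_hits = apply_rule(rule, "crate/src/lib.rs", source)
main_hits = apply_rule(rule, "crate/src/main.rs", source)
print(lib_hits, main_hits)
```

The practical difference between the two fields is cost and granularity: a path exclusion avoids scanning the file at all, while a pattern exclusion lets a single line opt out with a marker comment.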

To add a custom rule pack, create a JSON file in rule-library/ with _frameworks (list) and/or _stack (string) matching fields, plus a rules array. Set "_opt_in": true to prevent auto-loading. Pack rules use a pack: prefix in _tag for dedup. See rule-library/INDEX.md for the full inventory.
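
Pack selection based on these fields might look like the sketch below. It assumes a pack is auto-loaded when either its _frameworks list overlaps the detected frameworks or its _stack equals the detected stack; that "either" reading of "and/or", the function name select_packs, and the field values given to each sample pack are assumptions, not a description of generate.py.

```python
def select_packs(packs, frameworks, stack):
    """Return the names of packs that would be auto-loaded, sorted by name."""
    selected = []
    for name, pack in packs.items():
        # Opt-in packs are never auto-loaded, whatever else matches.
        if pack.get("_opt_in"):
            continue
        framework_hit = bool(set(pack.get("_frameworks", [])) & set(frameworks))
        stack_hit = pack.get("_stack") is not None and pack["_stack"] == stack
        if framework_hit or stack_hit:
            selected.append(name)
    return sorted(selected)


# Pack names are from the table above; the matching values are illustrative.
packs = {
    "react.json": {"_frameworks": ["react", "next"], "rules": []},
    "rust-workspace.json": {"_stack": "rust", "rules": []},
    "python-cli.json": {"_stack": "python", "rules": []},
    "functional-ts.json": {"_stack": "typescript", "_opt_in": True, "rules": []},
}

next_project = select_packs(packs, frameworks=["next"], stack="typescript")
python_project = select_packs(packs, frameworks=[], stack="python")
print(next_project, python_project)
```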

References

Load these on-demand when deeper context is needed:

  • references/stack-profiles.md: per-stack detection rules and tool recommendations
  • references/factory-lint-categories.md: 7 Factory.ai agent lint categories with grep patterns
  • references/anti-patterns.md: 10 AI-specific anti-patterns with detection heuristics

Scope

  • First-class stacks: JavaScript/TypeScript, Rust, Python, Go, Ruby
  • Other stacks get basic detection with generic recommendations
  • Does not write or modify test files
  • Does not install tools (recommends what to install)
  • Does not modify CI/CD pipelines