Odin-claude-plugin test-driven

Test-Driven Development (TDD) - design tests from requirements, then execute RED -> GREEN -> REFACTOR cycle. Use when implementing features or fixes with TDD methodology, writing tests before code, or following XP-style development across any supported language.

install

source · Clone the upstream repo

    git clone https://github.com/OutlineDriven/odin-claude-plugin

Claude Code · Install into ~/.claude/skills/

    T=$(mktemp -d) && git clone --depth=1 https://github.com/OutlineDriven/odin-claude-plugin "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/test-driven" ~/.claude/skills/outlinedriven-odin-claude-plugin-test-driven && rm -rf "$T"

manifest: skills/test-driven/SKILL.md
source content

Test-driven development (XP-style)

Tests define the specification. Design them from requirements before any implementation. The RED-GREEN-REFACTOR cycle is the heartbeat: write a failing test, make it pass with minimal code, then clean up while green.

Modern insight (2025): pairing TDD with property-based testing is now standard practice -- example tests prevent regressions, while property tests discover edge cases. TDD also serves AI-assisted development: structural integrity keeps code understandable for both human and AI collaborators (Kent Beck, "Augmented Coding"). Mutation testing validates test quality beyond coverage metrics (TDD + mutation: 63.3% vs TDD alone: 39.4% mutation coverage).
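The example-plus-property pairing can be sketched as follows. In practice a PBT library such as Hypothesis generates and shrinks the random cases; this minimal sketch uses the standard-library `random` module to stay dependency-free, and `mid` is a hypothetical function under test:

```python
import random

def mid(a, b, c):
    """Return the middle value of three numbers (hypothetical function under test)."""
    return sorted([a, b, c])[1]

# Example test: pins one known input/output pair -- a regression guard.
def test_mid_of_three_example():
    assert mid(3, 1, 2) == 2

# Property test: random inputs probe edge cases (ties, negatives, extremes)
# that the single example never covers. A seeded RNG keeps runs reproducible.
def test_mid_property():
    rng = random.Random(0)
    for _ in range(200):
        a, b, c = (rng.randint(-10, 10) for _ in range(3))
        m = mid(a, b, c)
        assert min(a, b, c) <= m <= max(a, b, c)  # invariant: m is bounded
        assert m in (a, b, c)                      # invariant: m is one of the inputs
```

The example test documents intent; the property test states invariants that must hold for *all* inputs.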

See frameworks for language-specific test runners, property testing, coverage, and mutation tools. See examples for brief TDD cycle patterns per language.


When to Apply

  • New features with clear requirements (both inside-out and outside-in approaches valid)
  • Bug fixes -- write a failing test that proves the bug before fixing
  • Refactoring -- ensure coverage exists before restructuring
  • API contract enforcement -- test the interface, not internals
  • Property-based invariants -- complement example tests with PBT
  • Legacy code -- add characterization tests before modifying (Michael Feathers pattern)

When NOT to Apply

  • Exploratory prototyping or spike research
  • One-off scripts, data migrations, generated code
  • Purely visual UI layout work (prefer visual regression testing)
  • Highly experimental algorithmic research (but PBT still helps)
  • Throwaway code with <1 week lifespan

Anti-patterns

  • Test-last: Writing tests after implementation defeats the design benefit
  • Testing implementation details: Tests should verify behavior, not internal structure -- breaks refactoring confidence
  • Over-mocking: Testing the mocks instead of the code; mock external I/O, not core logic
  • Skipping RED: Tests that never fail aren't tests -- they verify nothing
  • 100% coverage obsession: Coverage does not equal quality. Mutation testing exposes gaps coverage cannot
  • Refactoring on RED: Never restructure with failing tests
  • Test-induced architectural damage: Letting mock boundaries dictate design
  • Snapshot bloat: Approval-style tests without curation become maintenance burden

Two Schools (decision guidance, not prescription)

  • Inside-Out (Classic/Detroit): Start with unit tests for smallest pieces, build upward. Minimizes mocks. Best for well-understood domains, algorithms, utility functions.
  • Outside-In (London/Mockist): Start with acceptance test for user-facing behavior, use mocks to discover interfaces. Best for layered systems, APIs, microservices.
  • Pragmatic teams use both depending on context. Neither is superior.

Test Doubles Hierarchy

  • Stubs: Return predefined data; verify outcomes (state-based)
  • Mocks: Verify interactions/calls were made (behavior-based)
  • Fakes: Working implementations (e.g., in-memory database)
  • Spies: Record calls while using real behavior
  • Rule: Mock external dependencies. Never mock core domain logic.
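The stub/mock distinction can be sketched with the standard library's `unittest.mock`; `notify_overdue` and its `repo`/`mailer` collaborators are hypothetical:

```python
import unittest.mock as mock

def notify_overdue(repo, mailer):
    """Email every overdue account; return how many were notified."""
    overdue = repo.find_overdue()
    for account in overdue:
        mailer.send(account, "Your account is overdue")
    return len(overdue)

# Stub: returns canned data; the assertion below checks the *outcome*
# (state-based verification).
repo_stub = mock.Mock()
repo_stub.find_overdue.return_value = ["alice", "bob"]

# Mock: afterwards we verify the *interactions* happened
# (behavior-based verification).
mailer_mock = mock.Mock()

count = notify_overdue(repo_stub, mailer_mock)
assert count == 2                                  # state-based check via the stub
assert mailer_mock.send.call_count == 2            # behavior-based check via the mock
mailer_mock.send.assert_called_with("bob", "Your account is overdue")
```

Both doubles stand in for external dependencies (a database, a mail gateway); the core logic in `notify_overdue` runs for real.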

Workflow (language-neutral)

  1. CREATE -- Write failing tests: error cases -> edge cases -> happy paths -> property tests
  2. RED -- Run tests, verify all fail. If any pass, the test is wrong or behavior already exists.
  3. GREEN -- Minimal code to pass. No extras, no optimization, no cleanup.
  4. REFACTOR -- Clean up while green. Separate structural changes from behavioral (Tidy First). Re-run tests after every change.
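One full cycle can be sketched with plain asserts (pytest would collect these tests as-is); `slugify` is a hypothetical function built through the cycle:

```python
import re

# 1. CREATE -- tests written first, ordered error -> edge -> happy path.
def test_empty_input_raises():             # error case first
    try:
        slugify("")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")

def test_collapses_whitespace():           # edge case
    assert slugify("  Hello   World  ") == "hello-world"

def test_happy_path():                     # happy path
    assert slugify("Hello World") == "hello-world"

# 2. RED -- running the tests before slugify exists fails (NameError),
#    proving each test can fail.
# 3. GREEN -- minimal implementation: just enough to pass, nothing more.
def slugify(text):
    if not text.strip():
        raise ValueError("empty input")
    return re.sub(r"\s+", "-", text.strip()).lower()

# 4. REFACTOR -- with everything green, restructure freely (structural
#    changes only) and re-run the whole suite after every change.
```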

Constitutional Rules (Non-Negotiable)

  1. Design Tests First: Plan all test cases from requirements before implementation; write each test iteratively in the RED-GREEN-REFACTOR loop
  2. RED Before GREEN: Each new test MUST fail before you write implementation for it
  3. Error Cases First: Implement error handling before success paths
  4. One Test at a Time: Write one failing test, make it pass, refactor, then add the next test
  5. Refactor Only on GREEN: Never refactor with failing tests

Validation Gates

| Gate | Pass Criteria | Blocking |
|---|---|---|
| Tests Created | Test files exist for target module | Yes |
| RED State | All new tests fail before implementation | Yes |
| GREEN State | All tests pass after implementation | Yes |
| Coverage | >= 80% line coverage | No |
| Mutation | Mutation score reviewed (no threshold enforced) | No |

Exit Codes

| Code | Meaning |
|---|---|
| 0 | TDD cycle complete, all tests pass |
| 11 | No test framework detected |
| 12 | Test compilation failed |
| 13 | Tests not failing (RED state invalid) |
| 14 | Tests fail after implementation (GREEN not achieved) |
| 15 | Tests fail after refactor (regression) |