Odin-claude-plugin test-driven

Test-Driven Development (TDD) - design tests from requirements, then execute RED -> GREEN -> REFACTOR cycle. Use when implementing features or fixes with TDD methodology, writing tests before code, or following XP-style development across any supported language.

install

source · Clone the upstream repo

    git clone https://github.com/OutlineDriven/odin-claude-plugin

Claude Code · Install into ~/.claude/skills/

    T=$(mktemp -d) && git clone --depth=1 https://github.com/OutlineDriven/odin-claude-plugin "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/test-driven" ~/.claude/skills/outlinedriven-odin-claude-plugin-test-driven && rm -rf "$T"

manifest: skills/test-driven/SKILL.md
source content

Test-driven development (XP-style)

Tests define the specification. Design them from requirements before any implementation. The RED-GREEN-REFACTOR cycle is the heartbeat: write a failing test, make it pass with minimal code, then clean up while green.

Modern insight (2025): pairing TDD with property-based testing is now standard practice -- example tests prevent regressions, while property tests discover edge cases. TDD also serves AI-assisted development: structural integrity keeps code understandable for both human and AI collaborators (Kent Beck, "Augmented Coding"). Mutation testing validates test quality beyond coverage metrics (TDD + mutation: 63.3% vs TDD alone: 39.4% mutation coverage).
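The example-plus-property pairing can be sketched as follows. In practice a PBT library such as Hypothesis generates and shrinks the random cases; this minimal sketch uses the standard-library `random` module to stay dependency-free, and `mid` is a hypothetical function under test:

```python
import random

def mid(a, b, c):
    """Return the middle value of three numbers (hypothetical function under test)."""
    return sorted([a, b, c])[1]

# Example test: pins one known input/output pair -- a regression guard.
def test_mid_of_three_example():
    assert mid(3, 1, 2) == 2

# Property test: random inputs probe edge cases (ties, negatives, extremes)
# that the single example never covers. A seeded RNG keeps runs reproducible.
def test_mid_property():
    rng = random.Random(0)
    for _ in range(200):
        a, b, c = (rng.randint(-10, 10) for _ in range(3))
        m = mid(a, b, c)
        assert min(a, b, c) <= m <= max(a, b, c)  # invariant: m is bounded
        assert m in (a, b, c)                      # invariant: m is one of the inputs
```

The example test documents intent; the property test states invariants that must hold for *all* inputs.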

See frameworks for language-specific test runners, property testing, coverage, and mutation tools. See examples for brief TDD cycle patterns per language.


When to Apply

  • New features with clear requirements (both inside-out and outside-in approaches valid)
  • Bug fixes -- write a failing test that proves the bug before fixing
  • Refactoring -- ensure coverage exists before restructuring
  • API contract enforcement -- test the interface, not internals
  • Property-based invariants -- complement example tests with PBT
  • Legacy code -- add characterization tests before modifying (Michael Feathers pattern)

When NOT to Apply

  • Exploratory prototyping or spike research
  • One-off scripts, data migrations, generated code
  • Purely visual UI layout work (prefer visual regression testing)
  • Highly experimental algorithmic research (but PBT still helps)
  • Throwaway code with <1 week lifespan

Anti-patterns

  • Test-last: Writing tests after implementation defeats the design benefit
  • Testing implementation details: Tests should verify behavior, not internal structure -- breaks refactoring confidence
  • Over-mocking: Testing the mocks instead of the code; mock external I/O, not core logic
  • Skipping RED: Tests that never fail aren't tests -- they verify nothing
  • 100% coverage obsession: Coverage does not equal quality. Mutation testing exposes gaps coverage cannot
  • Refactoring on RED: Never restructure with failing tests
  • Test-induced architectural damage: Letting mock boundaries dictate design
  • Snapshot bloat: Approval-style tests without curation become maintenance burden

Two Schools (decision guidance, not prescription)

  • Inside-Out (Classic/Detroit): Start with unit tests for smallest pieces, build upward. Minimizes mocks. Best for well-understood domains, algorithms, utility functions.
  • Outside-In (London/Mockist): Start with acceptance test for user-facing behavior, use mocks to discover interfaces. Best for layered systems, APIs, microservices.
  • Pragmatic teams use both depending on context. Neither is superior.

Test Doubles Hierarchy

  • Stubs: Return predefined data; verify outcomes (state-based)
  • Mocks: Verify interactions/calls were made (behavior-based)
  • Fakes: Working implementations (e.g., in-memory database)
  • Spies: Record calls while using real behavior
  • Rule: Mock external dependencies. Never mock core domain logic.
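The stub/mock distinction can be sketched with the standard library's `unittest.mock`; `notify_overdue` and its `repo`/`mailer` collaborators are hypothetical:

```python
import unittest.mock as mock

def notify_overdue(repo, mailer):
    """Email every overdue account; return how many were notified."""
    overdue = repo.find_overdue()
    for account in overdue:
        mailer.send(account, "Your account is overdue")
    return len(overdue)

# Stub: returns canned data; the assertion below checks the *outcome*
# (state-based verification).
repo_stub = mock.Mock()
repo_stub.find_overdue.return_value = ["alice", "bob"]

# Mock: afterwards we verify the *interactions* happened
# (behavior-based verification).
mailer_mock = mock.Mock()

count = notify_overdue(repo_stub, mailer_mock)
assert count == 2                                  # state-based check via the stub
assert mailer_mock.send.call_count == 2            # behavior-based check via the mock
mailer_mock.send.assert_called_with("bob", "Your account is overdue")
```

Both doubles stand in for external dependencies (a database, a mail gateway); the core logic in `notify_overdue` runs for real.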

Workflow (language-neutral)

  1. CREATE -- Write failing tests: error cases -> edge cases -> happy paths -> property tests
  2. RED -- Run tests, verify all fail. If any pass, the test is wrong or behavior already exists.
  3. GREEN -- Minimal code to pass. No extras, no optimization, no cleanup.
  4. REFACTOR -- Clean up while green. Separate structural changes from behavioral (Tidy First). Re-run tests after every change.
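One full cycle can be sketched with plain asserts (pytest would collect these tests as-is); `slugify` is a hypothetical function built through the cycle:

```python
import re

# 1. CREATE -- tests written first, ordered error -> edge -> happy path.
def test_empty_input_raises():             # error case first
    try:
        slugify("")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")

def test_collapses_whitespace():           # edge case
    assert slugify("  Hello   World  ") == "hello-world"

def test_happy_path():                     # happy path
    assert slugify("Hello World") == "hello-world"

# 2. RED -- running the tests before slugify exists fails (NameError),
#    proving each test can fail.
# 3. GREEN -- minimal implementation: just enough to pass, nothing more.
def slugify(text):
    if not text.strip():
        raise ValueError("empty input")
    return re.sub(r"\s+", "-", text.strip()).lower()

# 4. REFACTOR -- with everything green, restructure freely (structural
#    changes only) and re-run the whole suite after every change.
```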

Constitutional Rules (Non-Negotiable)

  1. Design Tests First: Plan all test cases from requirements before implementation; write each test iteratively in the RED-GREEN-REFACTOR loop
  2. RED Before GREEN: Each new test MUST fail before you write implementation for it
  3. Error Cases First: Implement error handling before success paths
  4. One Test at a Time: Write one failing test, make it pass, refactor, then add the next test
  5. Refactor Only on GREEN: Never refactor with failing tests

Validation Gates

| Gate | Pass Criteria | Blocking |
|---|---|---|
| Tests Created | Test files exist for target module | Yes |
| RED State | All new tests fail before implementation | Yes |
| GREEN State | All tests pass after implementation | Yes |
| Coverage | >= 80% line coverage | No |
| Mutation | Mutation score reviewed (no threshold enforced) | No |

Exit Codes

| Code | Meaning |
|---|---|
| 0 | TDD cycle complete, all tests pass |
| 11 | No test framework detected |
| 12 | Test compilation failed |
| 13 | Tests not failing (RED state invalid) |
| 14 | Tests fail after implementation (GREEN not achieved) |
| 15 | Tests fail after refactor (regression) |