Claude-Skills tdd-guide
```shell
# Clone the full repository
git clone https://github.com/borghei/Claude-Skills

# Or install just this skill into ~/.claude/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/borghei/Claude-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/engineering/tdd-guide" ~/.claude/skills/borghei-claude-skills-tdd-guide && rm -rf "$T"
```
engineering/tdd-guide/SKILL.md

TDD Guide
The agent guides red-green-refactor TDD workflows, generates framework-specific test stubs from requirements, parses coverage reports to identify prioritized gaps, and calculates test quality metrics including smell detection and assertion density. Supports Jest, Pytest, JUnit, Vitest, and Mocha.
Quick Start
```python
# Generate test cases from requirements (Python API)
from test_generator import TestGenerator, TestFramework
gen = TestGenerator(framework=TestFramework.PYTEST, language="python")
cases = gen.generate_from_requirements(requirements)

# Analyze coverage gaps from LCOV report
from coverage_analyzer import CoverageAnalyzer
analyzer = CoverageAnalyzer()
analyzer.parse_coverage_report(content, "lcov")
gaps = analyzer.identify_gaps(threshold=80.0)

# Guide TDD cycle
from tdd_workflow import TDDWorkflow
wf = TDDWorkflow()
wf.start_cycle("User can reset password via email")
```
Core Workflows
Workflow 1: TDD a New Feature
- Write a failing test for the feature requirement (RED phase)
- Call `validate_red_phase()` -- confirms the test exists and fails
- Write minimal code to make the test pass (GREEN phase)
- Call `validate_green_phase()` -- confirms all tests pass
- Refactor while keeping tests green (REFACTOR phase)
- Call `validate_refactor_phase()` -- confirms tests still pass after cleanup
- Validation checkpoint: each cycle completes in under 10 minutes; zero test smells introduced
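The RED and GREEN steps above can be sketched in plain, self-contained Python. Note that `validate_email` and the tiny `run_test` harness below are hypothetical illustrations, not part of this skill's API:

```python
# Illustrative sketch of one red-green cycle.

def run_test(fn):
    """Run a single test function; return 'passed' or 'failed'."""
    try:
        fn()
        return "passed"
    except (AssertionError, NameError):
        return "failed"

# RED: the test references validate_email before it exists, so it fails.
def test_rejects_bad_email():
    assert validate_email("not-an-email") is False

assert run_test(test_rejects_bad_email) == "failed"  # RED confirmed

# GREEN: minimal implementation to make the test pass.
def validate_email(address):
    return "@" in address and "." in address.split("@")[-1]

assert run_test(test_rejects_bad_email) == "passed"  # GREEN confirmed
```

The REFACTOR step would then clean up `validate_email` while `run_test` keeps reporting `"passed"`.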
Workflow 2: Analyze Coverage Gaps
- Generate a coverage report: `npm test -- --coverage` or `pytest --cov`
- Detect the format with `detect_format()` and parse with `parse_coverage_report()`
- Run `identify_gaps(threshold=80.0)` to get a prioritized file list (P0/P1/P2)
- Generate test stubs for P0 files (business-critical, lowest coverage)
- Validation checkpoint: line coverage >= 80%; branch coverage >= 70%; zero P0 gaps in critical paths
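For intuition, the gap analysis above can be sketched against raw LCOV text (`LF` = lines found, `LH` = lines hit per file record). This toy `find_gaps` is illustrative only, not the `coverage_analyzer.py` implementation:

```python
# Minimal LCOV gap detection sketch: flag files below a line-coverage
# threshold and sort worst-first.

def find_gaps(lcov_text, threshold=80.0):
    gaps, current = [], None
    for line in lcov_text.splitlines():
        if line.startswith("SF:"):                      # source file record starts
            current = {"file": line[3:], "lf": 0, "lh": 0}
        elif line.startswith("LF:"):                    # lines found
            current["lf"] = int(line[3:])
        elif line.startswith("LH:"):                    # lines hit
            current["lh"] = int(line[3:])
        elif line == "end_of_record" and current:
            pct = 100.0 * current["lh"] / current["lf"] if current["lf"] else 100.0
            if pct < threshold:
                gaps.append({"file": current["file"], "line_coverage": round(pct, 1)})
            current = None
    return sorted(gaps, key=lambda g: g["line_coverage"])

report = "SF:src/auth.ts\nLF:100\nLH:45\nend_of_record\nSF:src/util.ts\nLF:50\nLH:48\nend_of_record\n"
print(find_gaps(report))  # [{'file': 'src/auth.ts', 'line_coverage': 45.0}]
```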
Workflow 3: Generate Tests from Requirements
- Structure requirements as user stories with acceptance criteria
- Call `generate_from_requirements()` with the target framework
- Review generated test cases for completeness (happy path, error, edge cases)
- Generate the test file with `generate_test_file()`
- Validation checkpoint: each acceptance criterion has at least one test; all tests compile
Tools
| Tool | Purpose |
|---|---|
| `test_generator.py` | Generate test cases from requirements/specs |
| `coverage_analyzer.py` | Parse LCOV/JSON/XML reports, find gaps |
| `tdd_workflow.py` | Guide red-green-refactor cycles |
| `framework_adapter.py` | Convert tests between frameworks |
| `fixture_generator.py` | Generate test data and mocks with seeds |
| `metrics_calculator.py` | Calculate complexity and test quality |
| `format_detector.py` | Auto-detect language and framework |
| `output_formatter.py` | Format output for CLI/desktop/CI |
Anti-Patterns
- Tests that pass immediately -- a test with no real assertion or `assert True` skips the RED phase; every test must fail before implementation
- Testing implementation details -- coupling tests to internal method names makes refactoring break tests; test behavior and outputs, not internals
- Non-deterministic fixtures -- random data without a seed produces different failures across CI runs; always pass `seed=<int>` to `FixtureGenerator`
- Skipping the refactor phase -- GREEN code that works but is messy accumulates; refactoring is not optional in TDD
- Coverage theater -- writing tests that hit lines without meaningful assertions; use `metrics_calculator.py` to detect low assertion density
- Conditional test logic -- `if/else` inside tests masks failures; each test should have a single clear path
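The seeding anti-pattern can be demonstrated with the stdlib `random` module (used here instead of `FixtureGenerator` so the sketch is self-contained; `make_user` is a hypothetical fixture factory):

```python
# Why seeded fixtures matter: the same seed yields identical data on
# every run, so a CI failure is reproducible locally.
import random

def make_user(rng):
    # All values derive from the rng, so output is fully seed-determined.
    return {"id": rng.randint(1, 9999), "active": rng.random() < 0.5}

rng_a, rng_b = random.Random(42), random.Random(42)
run_a = [make_user(rng_a) for _ in range(3)]
run_b = [make_user(rng_b) for _ in range(3)]
assert run_a == run_b  # same seed -> identical fixtures across runs
```

Without the fixed seed, `run_a` and `run_b` would usually differ, which is exactly what makes unseeded CI failures hard to reproduce.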
Troubleshooting
| Problem | Cause | Solution |
|---|---|---|
| Generated tests pass immediately (no RED phase) | Test has no real assertion or asserts a trivially true value | Ensure every test contains an assertion against the actual unit under test; remove placeholder stubs before running |
| Coverage report fails to parse | Report format does not match the expected LCOV, JSON, or XML structure | Run `detect_format()` first to verify the detected format; convert non-standard reports (e.g., Clover) to Cobertura XML |
| Framework adapter produces wrong import style | Source and target framework were swapped, or language/framework mismatch | Verify the `framework` and `language` arguments match your project; use `detect_framework()` on existing test code to auto-detect |
| Fixture generator produces non-deterministic data | No random seed was supplied, so each run yields different values | Pass `seed=<int>` to `FixtureGenerator` for reproducible fixtures across CI runs |
| Metrics calculator reports 0 test functions | Test code uses an unsupported naming convention | Rename tests to follow standard conventions (such as `test_*` or `it()`), or extend the regex patterns in `metrics_calculator.py` |
| TDD workflow validates GREEN phase but tests still fail locally | The test result dict passed to `validate_green_phase()` had `status` set to `passed` without reflecting the real run | Ensure your test runner output is normalized to `{"status": "passed"}` or `{"status": "failed"}` before passing it in |
| Coverage gaps list is empty despite low overall coverage | All individual files meet the threshold even though the aggregate does not | Raise the `threshold` argument in `identify_gaps()` or inspect per-file coverage with `get_file_coverage()` |
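For the GREEN-phase row above, a small adapter can normalize a test runner's exit code into the status dict shape used by `tdd_workflow.py`'s validators (the `normalize` helper itself is hypothetical):

```python
# Map a runner exit code to the {"status": ...} dict expected by the
# phase validators: 0 means all tests passed, anything else is a failure.
def normalize(returncode):
    return {"status": "passed" if returncode == 0 else "failed"}

assert normalize(0) == {"status": "passed"}
assert normalize(2) == {"status": "failed"}
```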
Success Criteria
- Test-first ratio above 80% -- at least 4 out of every 5 features begin with a failing test before any implementation code is written.
- Red-green-refactor cycle under 10 minutes -- each TDD micro-cycle (write failing test, make it pass, refactor) completes within a single focused interval.
- Line coverage at or above 80% -- measured by `coverage_analyzer.py` against LCOV/JSON/XML reports, with branch coverage at or above 70%.
- Test quality score at or above 75/100 -- as reported by `metrics_calculator.py`, combining assertion density, isolation, naming quality, and absence of test smells.
- Zero P0 coverage gaps in critical paths -- business-critical modules (auth, payments, data persistence) have no files flagged P0 by `identify_gaps()`.
- Test smell count of zero for high-severity items -- no `missing_assertions`, `sleepy_test`, or `conditional_test_logic` smells detected at high severity.
- Fixture reproducibility across CI -- all generated fixtures use a fixed seed and produce identical output on every pipeline run.
Scope & Limitations
This skill covers:
- Unit test generation, scaffolding, and stub creation for Jest, Pytest, JUnit, Vitest, and Mocha
- Static coverage report parsing (LCOV, JSON/Istanbul, XML/Cobertura) with gap identification and prioritized recommendations
- Red-green-refactor workflow guidance with phase validation and cycle tracking
- Test quality assessment including complexity analysis, isolation scoring, naming quality, and test smell detection
This skill does NOT cover:
- Integration, end-to-end, or performance test generation -- see `senior-qa` for E2E patterns and `senior-devops` for load testing
- Runtime test execution or live coverage measurement -- scripts perform static analysis only; you must run your test suite externally
- Visual/snapshot testing or browser-based test workflows -- use Playwright, Cypress, or Storybook for UI-level testing
- Security-focused test generation (fuzz testing, penetration testing) -- see the `senior-security` and `senior-secops` skills
Integration Points
| Skill | Integration | Data Flow |
|---|---|---|
| `senior-qa` | Generated test stubs feed into QA review workflows; QA coverage standards inform threshold settings | `test_generator.py` output → QA review → approved test suite |
| | Metrics calculator output provides quantitative data for code review checklists | `metrics_calculator.py` quality report → code review scoring |
| | Scaffolded projects include test infrastructure; TDD guide generates tests for scaffolded modules | scaffold output → `test_generator.py` input |
| `senior-devops` | Coverage reports from CI pipelines are parsed by coverage analyzer; recommendations feed back into pipeline gates | CI coverage artifact → `coverage_analyzer.py` → pass/fail gate |
| `senior-security` | Edge-case fixtures for auth and API scenarios complement security-focused test plans | `fixture_generator.py` auth/API edge cases → security test plan |
| | Framework detection informs stack evaluation; test quality metrics feed into technology assessment | `format_detector.py` analysis → stack evaluation input |
Tool Reference
1. test_generator.py
Purpose: Generate test cases from requirements, user stories, and API specs, then produce framework-specific test stubs and complete test files.

Module: `TestGenerator` class
Usage:
```python
from test_generator import TestGenerator, TestFramework, TestType

gen = TestGenerator(framework=TestFramework.PYTEST, language="python")
cases = gen.generate_from_requirements(requirements, test_type=TestType.UNIT)
stub = gen.generate_test_stub(cases[0])
file_content = gen.generate_test_file("my_module", cases)
suggestions = gen.suggest_missing_scenarios(existing_tests, code_analysis)
```
Constructor Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `framework` | `TestFramework` | Yes | Target framework: `PYTEST`, `JEST`, `JUNIT`, `VITEST`, `MOCHA` |
| `language` | str | Yes | Programming language, e.g. `python`, `typescript` |
Key Methods:
| Method | Parameters | Returns |
|---|---|---|
| `generate_from_requirements()` | `requirements`: dict (e.g. `user_stories`, `api_specs` keys); `test_type`: enum (default `TestType.UNIT`) | list of test case specs |
| `generate_test_stub()` | a single test case dict | str -- framework-specific test stub code |
| `generate_test_file()` | module name: str; test cases: optional list (uses stored cases if omitted) | str -- complete test file with imports |
| `suggest_missing_scenarios()` | `existing_tests`: list of test name strings; `code_analysis`: dict | list of suggested test scenarios |
Output Formats: Python dict/list (test case specifications), string (generated code).
Example:
```python
requirements = {
    "user_stories": [{"action": "login", "given": ["valid credentials"],
                      "when": "submit form", "then": "redirect to dashboard"}],
    "api_specs": [{"method": "POST", "path": "/auth/login",
                   "requires_auth": False, "required_params": ["email", "password"]}],
}
gen = TestGenerator(framework=TestFramework.JEST, language="typescript")
cases = gen.generate_from_requirements(requirements)
print(gen.generate_test_file("auth_service", cases))
```
2. coverage_analyzer.py
Purpose: Parse coverage reports in LCOV, JSON (Istanbul/nyc), and XML (Cobertura) formats. Calculate summary metrics, identify files below threshold, and generate prioritized recommendations.

Module: `CoverageAnalyzer` class
Usage:
```python
from coverage_analyzer import CoverageAnalyzer

analyzer = CoverageAnalyzer()
data = analyzer.parse_coverage_report(report_content, format_type="lcov")
summary = analyzer.calculate_summary()
gaps = analyzer.identify_gaps(threshold=80.0)
recs = analyzer.generate_recommendations()
file_detail = analyzer.get_file_coverage("src/auth.ts")
detected = analyzer.detect_format(raw_content)
```
Constructor Parameters: None.
Key Methods:
| Method | Parameters | Returns |
|---|---|---|
| `parse_coverage_report()` | report content: str; `format_type`: `lcov`, `json`, or `xml` | dict of per-file coverage data |
| `calculate_summary()` | None | dict with `line_coverage`, `branch_coverage`, `function_coverage`, and totals |
| `identify_gaps()` | `threshold`: float (default `80.0`) | list of files below threshold with priority P0/P1/P2 |
| `generate_recommendations()` | None | list of prioritized recommendations |
| `get_file_coverage()` | file path: str | dict with per-file line/branch/function coverage |
| `detect_format()` | raw report content: str | str -- `lcov`, `json`, or `xml` |
Output Formats: Python dict/list. Use `output_formatter.py` for terminal/markdown/JSON rendering.
Example:
```python
with open("coverage/lcov.info") as f:
    content = f.read()

analyzer = CoverageAnalyzer()
fmt = analyzer.detect_format(content)
analyzer.parse_coverage_report(content, fmt)
summary = analyzer.calculate_summary()
# {'line_coverage': 76.5, 'branch_coverage': 62.3, ...}
gaps = analyzer.identify_gaps(threshold=80.0)
# [{'file': 'src/auth.ts', 'line_coverage': 45.0, 'priority': 'P0', ...}]
```
3. tdd_workflow.py
Purpose: Guide users through red-green-refactor TDD cycles with phase validation, workflow state tracking, and refactoring suggestions.

Module: `TDDWorkflow` class
Usage:
```python
from tdd_workflow import TDDWorkflow

wf = TDDWorkflow()
guidance = wf.start_cycle("User can reset password via email")
red_result = wf.validate_red_phase(test_code, test_result={"status": "failed"})
green_result = wf.validate_green_phase(impl_code, {"status": "passed"})
refactor_result = wf.validate_refactor_phase(original, refactored, {"status": "passed"})
phase_guide = wf.get_phase_guidance()
summary = wf.generate_workflow_summary()
```
Constructor Parameters: None.
Key Methods:
| Method | Parameters | Returns |
|---|---|---|
| `start_cycle()` | feature: str -- user story or feature description | dict with phase, instruction, checklist, tips |
| `validate_red_phase()` | `test_code`: str; `test_result`: optional dict with `status` key | dict with `phase_complete`, validations, next instruction |
| `validate_green_phase()` | implementation code: str; test result: dict with `status` key | dict with `phase_complete`, validations, `next_phase` |
| `validate_refactor_phase()` | original code: str; refactored code: str; test result: dict with `status` key | dict with `phase_complete`, validation details, next steps |
| `get_phase_guidance()` | phase: optional enum (uses current phase if omitted) | dict with goal, steps, common mistakes, tips |
| `generate_workflow_summary()` | None | str -- markdown summary of current state and completed cycles |
Output Formats: Python dict (validation results), string (summary).
Example:
```python
wf = TDDWorkflow()
wf.start_cycle("Add email validation to signup form")
result = wf.validate_red_phase(
    "def test_invalid_email():\n    assert validate('bad') == False",
    {"status": "failed"},
)
# {'phase_complete': True, 'next_phase': 'GREEN', ...}
```
4. framework_adapter.py
Purpose: Provide multi-framework support with adapters for Jest, Vitest, Pytest, unittest, JUnit, TestNG, Mocha, and Jasmine. Generate framework-specific imports, test suites, test functions, assertions, and setup/teardown hooks.

Module: `FrameworkAdapter` class
Usage:
```python
from framework_adapter import FrameworkAdapter, Framework, Language

adapter = FrameworkAdapter(framework=Framework.JEST, language=Language.TYPESCRIPT)
imports = adapter.generate_imports()
suite = adapter.generate_test_suite_wrapper("AuthService", test_content)
test_fn = adapter.generate_test_function("should reject invalid email", body, "Validates email format")
assertion = adapter.generate_assertion("result", "true", "true")
hooks = adapter.generate_setup_teardown(setup_code="db = create_test_db()", teardown_code="db.close()")
detected = adapter.detect_framework(existing_code)
```
Constructor Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `framework` | `Framework` | Yes | `JEST`, `VITEST`, `PYTEST`, `UNITTEST`, `JUNIT`, `TESTNG`, `MOCHA`, `JASMINE` |
| `language` | `Language` | Yes | e.g. `PYTHON`, `TYPESCRIPT` |
Key Methods:
| Method | Parameters | Returns |
|---|---|---|
| `generate_imports()` | None | str -- framework-specific import statements |
| `generate_test_suite_wrapper()` | suite name: str; test content: str | str -- complete test suite wrapping the content |
| `generate_test_function()` | test name: str; body: str; description: str (optional) | str -- complete test function |
| `generate_assertion()` | actual: str; expected: str; assertion type: str, e.g. `equals`, `true` (default `equals`) | str -- assertion statement |
| `generate_setup_teardown()` | `setup_code`: str (optional); `teardown_code`: str (optional) | str -- setup/teardown hooks |
| `detect_framework()` | existing test code: str | `Framework` enum or `None` |
Output Formats: String (generated code).
Example:
```python
adapter = FrameworkAdapter(Framework.PYTEST, Language.PYTHON)
print(adapter.generate_imports())
# import pytest
print(adapter.generate_assertion("calculate_total(items)", "150.0", "equals"))
# assert calculate_total(items) == 150.0
```
5. fixture_generator.py
Purpose: Generate realistic test data, boundary values, edge-case scenarios, and mock objects for various domains (auth, payment, form, API, file upload).

Module: `FixtureGenerator` class
Usage:
```python
from fixture_generator import FixtureGenerator

gen = FixtureGenerator(seed=42)
boundaries = gen.generate_boundary_values("int", {"min": 0, "max": 255})
edge_cases = gen.generate_edge_cases("auth")
mocks = gen.generate_mock_data(schema, count=5)
fixture_content = gen.generate_fixture_file("users", mocks, format="json")
```
Constructor Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `seed` | int or None | No | Random seed for reproducible output (default `None`) |
Key Methods:
| Method | Parameters | Returns |
|---|---|---|
| `generate_boundary_values()` | value type: str, e.g. `int`; `constraints`: optional dict (e.g. `min`, `max`) | list of boundary values |
| `generate_edge_cases()` | domain: one of auth, payment, form, API, or file upload; context: optional dict (required for some domains) | list of edge case scenarios |
| `generate_mock_data()` | `schema`: dict mapping field names to type definitions; `count`: int | list of mock objects |
| `generate_fixture_file()` | fixture name: str; data: any; `format`: `json`, `python`, or `yaml` | str -- fixture file content |
Supported Schema Field Types: `string`, `int`, `float`, `bool`, `email`, `date`, `array`.
Output Formats: Python list/dict (data), string (file content in JSON/Python/YAML).
Example:
```python
gen = FixtureGenerator(seed=123)
schema = {
    "id": {"type": "int", "min": 1, "max": 9999},
    "email": {"type": "email"},
    "active": {"type": "bool"},
}
users = gen.generate_mock_data(schema, count=3)
print(gen.generate_fixture_file("test_users", users, format="json"))
```
6. metrics_calculator.py
Purpose: Calculate comprehensive test and code quality metrics including cyclomatic/cognitive complexity, testability scoring, test quality assessment (assertions, isolation, naming, smells), and execution analysis.

Module: `MetricsCalculator` class
Usage:
```python
from metrics_calculator import MetricsCalculator

calc = MetricsCalculator()
all_metrics = calc.calculate_all_metrics(source_code, test_code, coverage_data, execution_data)
complexity = calc.calculate_complexity(source_code)
test_quality = calc.calculate_test_quality(test_code)
execution = calc.analyze_execution_metrics(execution_data)
summary = calc.generate_metrics_summary()
```
Constructor Parameters: None.
Key Methods:
| Method | Parameters | Returns |
|---|---|---|
| `calculate_all_metrics()` | `source_code`: str; `test_code`: str; `coverage_data`: optional dict; `execution_data`: optional dict | dict combining complexity, test quality, coverage, and execution metrics |
| `calculate_complexity()` | `source_code`: str | dict with `cyclomatic_complexity`, `cognitive_complexity`, `testability_score`, `assessment` |
| `calculate_test_quality()` | `test_code`: str | dict with `quality_score`, `test_smells`, and assertion/isolation/naming metrics |
| `analyze_execution_metrics()` | `execution_data`: dict with a list of test runs (each with name, duration, optional status) | dict with pass rate, timing stats, and slow-test details |
| `generate_metrics_summary()` | None | str -- human-readable markdown summary |
Output Formats: Python dict (metrics data), string (markdown summary).
Example:
```python
calc = MetricsCalculator()
complexity = calc.calculate_complexity(open("src/auth.py").read())
# {'cyclomatic_complexity': 8, 'cognitive_complexity': 12,
#  'testability_score': 82.0, 'assessment': 'Medium complexity - moderately testable'}
quality = calc.calculate_test_quality(open("tests/test_auth.py").read())
# {'quality_score': 78.5, 'test_smells': [], ...}
```
7. format_detector.py
Purpose: Automatically detect programming language, testing framework, coverage report format, and project structure from code content or file paths.

Module: `FormatDetector` class
Usage:
```python
from format_detector import FormatDetector

detector = FormatDetector()
language = detector.detect_language(code)
framework = detector.detect_test_framework(test_code)
cov_format = detector.detect_coverage_format(report_content)
input_info = detector.detect_input_format(raw_input)
file_info = detector.extract_file_info("/src/auth.service.ts")
test_name = detector.suggest_test_file_name("auth.service.ts", "jest")
patterns = detector.identify_test_patterns(test_code)
project = detector.analyze_project_structure(file_path_list)
env = detector.detect_environment()
```
Constructor Parameters: None.
Key Methods:
| Method | Parameters | Returns |
|---|---|---|
| `detect_language()` | code: str | str -- e.g. `python`, `javascript`, `typescript`, `java` |
| `detect_test_framework()` | test code: str | str -- e.g. `jest`, `vitest`, `pytest`, `junit`, `mocha` |
| `detect_coverage_format()` | report content: str | str -- `lcov`, `json`, or `xml` |
| `detect_input_format()` | raw input: str | dict describing the detected input type |
| `extract_file_info()` | file path: str | dict with language, extension, and related file details |
| `suggest_test_file_name()` | source file name: str; framework: str | str -- suggested test file name |
| `identify_test_patterns()` | test code: str | list of detected patterns (AAA, Given-When-Then, etc.) |
| `analyze_project_structure()` | file paths: list of str | dict summarizing project layout |
| `detect_environment()` | None | dict describing the runtime environment |
Output Formats: String (detection result), Python dict (detailed analysis).
Example:
```python
detector = FormatDetector()
print(detector.detect_language("const add = (a: number, b: number): number => a + b;"))
# "typescript"
print(detector.suggest_test_file_name("UserService.java", "junit"))
# "UserserviceTest.java"
print(detector.identify_test_patterns("// Arrange\nsetup()\n// Act\nresult = run()\n// Assert\nassert result"))
# ['AAA (Arrange-Act-Assert)']
```
8. output_formatter.py
Purpose: Context-aware output formatting for different environments (Desktop/markdown, CLI/terminal, API/JSON). Supports progressive disclosure, token-efficient summary reports, and output truncation.

Module: `OutputFormatter` class
Usage:
```python
from output_formatter import OutputFormatter

fmt = OutputFormatter(environment="cli", verbose=False)
cov_output = fmt.format_coverage_summary(summary, detailed=True)
rec_output = fmt.format_recommendations(recommendations, max_items=5)
test_output = fmt.format_test_results(results, show_details=True)
report = fmt.create_summary_report(coverage, metrics, recommendations)
should_detail = fmt.should_show_detailed(data_size=50)
truncated = fmt.truncate_output(long_text, max_lines=30)
```
Constructor Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `environment` | str | No | Target environment: `desktop`, `cli`, or `api` |
| `verbose` | bool | No | Include detailed output |
Key Methods:
| Method | Parameters | Returns |
|---|---|---|
| `format_coverage_summary()` | `summary`: dict; `detailed`: bool (optional) | str -- formatted coverage (markdown/terminal/JSON based on environment) |
| `format_recommendations()` | `recommendations`: list of dicts; `max_items`: optional int | str -- formatted recommendations grouped by priority |
| `format_test_results()` | `results`: dict with pass/fail counts and details; `show_details`: bool (optional) | str -- formatted test results |
| `create_summary_report()` | coverage: dict; metrics: dict; recommendations: list | str -- token-efficient summary (<200 tokens) |
| `should_show_detailed()` | `data_size`: int | bool -- whether to show detailed output |
| `truncate_output()` | text: str; `max_lines`: int | str -- truncated text with remaining-lines indicator |
Output Formats: String in markdown (desktop), plain text (CLI), or JSON (API), depending on the `environment` setting.
Example:
```python
fmt = OutputFormatter(environment="desktop", verbose=True)
print(fmt.format_coverage_summary({"line_coverage": 82.5, "branch_coverage": 71.0, "function_coverage": 90.0}))
# ## Test Coverage Summary
# ### Overall Metrics
# - **Line Coverage**: 82.5%
# ...
```