Taskflow regression-runner

Regression Runner Skill V3 - Pragmatic Regression Testing

Install

Clone the upstream repo:

git clone https://github.com/Brownbull/taskflow

Install into ~/.claude/skills/ with Claude Code:

T=$(mktemp -d) && git clone --depth=1 https://github.com/Brownbull/taskflow "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/quality/regression-runner" ~/.claude/skills/brownbull-taskflow-regression-runner && rm -rf "$T"

Manifest: .claude/skills/quality/regression-runner/skill.md

Skill content

Regression Runner Skill V3 - Pragmatic Regression Testing

Purpose

Runs accumulated regression tests with phase-appropriate expectations, focusing on high-value tests and providing escape hatches for rapid development.

Core Change from V2

Instead of running all tests all the time:

  • Runs critical path tests first
  • Skips low-value tests in early phases
  • Provides quick regression modes
  • Auto-fixes test-implementation mismatches

Phase-Aware Regression Strategy

Test Selection by Phase

prototype:
  run: "critical_only"
  tests:
    - smoke tests
    - login works
    - data saves
  skip:
    - edge cases
    - performance
    - visual tests
  time_budget: "5 minutes"

mvp:
  run: "critical_and_user_facing"
  tests:
    - critical paths
    - user workflows
    - auth/security basics
  skip:
    - rare edge cases
    - performance optimization
  time_budget: "15 minutes"

growth:
  run: "comprehensive"
  tests:
    - all critical
    - edge cases
    - performance baselines
  skip:
    - visual regression
    - rare scenarios
  time_budget: "30 minutes"

scale:
  run: "everything"
  time_budget: "60 minutes"
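The phase table above can be collapsed into a small lookup. This is an illustrative sketch, not part of the skill itself; names like `PHASE_PLANS` and `plan_for` are assumptions:

```python
# Hypothetical encoding of the phase table above; unknown phases fall back
# to the strictest plan ("scale") rather than silently skipping tests.
PHASE_PLANS = {
    "prototype": {"run": "critical_only", "budget_min": 5},
    "mvp": {"run": "critical_and_user_facing", "budget_min": 15},
    "growth": {"run": "comprehensive", "budget_min": 30},
    "scale": {"run": "everything", "budget_min": 60},
}

def plan_for(phase):
    """Return the test-selection plan for a phase, defaulting to 'scale'."""
    return PHASE_PLANS.get(phase, PHASE_PLANS["scale"])
```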

Smart Test Discovery

Find Only Relevant Tests

def discover_tests(phase):
    """Find tests worth running for the current phase"""

    test_patterns = {
        'prototype': ['*smoke*', '*critical*', '*login*'],
        'mvp': ['*auth*', '*payment*', '*user*', '*api*'],
        'growth': ['!*visual*', '!*ui_polish*'],  # Exclude
        'scale': ['*']  # Everything
    }

    patterns = test_patterns[phase]
    # An exclusion-only pattern list (e.g. growth) filters the full suite;
    # otherwise start from an empty selection and add matches.
    if all(p.startswith('!') for p in patterns):
        tests = find_tests('*')
    else:
        tests = []

    for pattern in patterns:
        if pattern.startswith('!'):
            # Exclude pattern
            tests = exclude_tests(tests, pattern[1:])
        else:
            tests.extend(find_tests(pattern))

    return tests
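The helpers `find_tests` and `exclude_tests` are left abstract in the skill. A self-contained sketch of the same include/exclude matching, using the stdlib `fnmatch` module over plain test names:

```python
import fnmatch

def select_tests(test_names, patterns):
    """Apply include patterns ('*auth*') and exclude patterns ('!*visual*').

    Exclusion-only pattern lists start from the full test set instead of
    an empty one, so a phase like 'growth' keeps everything it doesn't ban.
    """
    if all(p.startswith("!") for p in patterns):
        selected = list(test_names)  # exclusion-only: start from everything
    else:
        selected = []
    for pattern in patterns:
        if pattern.startswith("!"):
            selected = [t for t in selected if not fnmatch.fnmatch(t, pattern[1:])]
        else:
            selected.extend(t for t in test_names if fnmatch.fnmatch(t, pattern))
    return sorted(set(selected))
```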

Execution Modes

Quick Regression (Default for Prototype/MVP)

/regression-quick
# Runs in < 5 minutes
# Only critical paths
# Skips slow tests
# Perfect for rapid iteration

Standard Regression (Growth)

/regression-standard
# Runs in < 30 minutes
# Most tests except visual
# Good coverage/speed balance

Full Regression (Scale)

/regression-full
# Runs everything
# Can take an hour
# Only for major releases

Custom Regression

/regression-custom --only "auth,payment"
# Run specific domains
# Useful for targeted testing

Auto-Fix Test Mismatches

Fix Before Failing

def run_test_with_auto_fix(test):
    """Try to fix common mismatches before failing"""

    try:
        run_test(test)
    except FieldMismatchError:
        # Test expects 'uploaded_by', code has 'user'
        fixed_test = update_field_names(test, get_actual_fields())
        run_test(fixed_test)
        save_fixed_test(fixed_test)
    except SignatureMismatchError:
        # Wrong parameter order
        fixed_test = match_signature(test, get_actual_signature())
        run_test(fixed_test)
        save_fixed_test(fixed_test)
    # Any other exception propagates as a genuine failure
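The field-name repair can be made concrete with a small stand-alone sketch. `rename_expected_fields` and the alias map are assumptions for illustration, not the skill's API:

```python
def rename_expected_fields(expected, actual_fields):
    """Remap stale keys in a test's expected payload onto the field names
    the implementation actually returns; unknown keys are left untouched
    for a human to review."""
    aliases = {"uploaded_by": "user", "total": "amount"}  # example mismatches
    fixed = {}
    for key, value in expected.items():
        if key in actual_fields:
            fixed[key] = value                 # still valid, keep as-is
        elif aliases.get(key) in actual_fields:
            fixed[aliases[key]] = value        # known rename, auto-fix
        else:
            fixed[key] = value                 # unrecognized, don't guess
    return fixed
```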

Failure Handling V3

Progressive Response

IMPORTANCE = ['LOW', 'MEDIUM', 'HIGH', 'CRITICAL']

def handle_regression_failure(test, phase):
    """React appropriately to the phase"""

    if phase == 'prototype':
        if is_critical_path(test):
            return 'FIX_NOW'
        return 'LOG_AND_CONTINUE'

    if phase == 'mvp':
        if affects_users(test):
            return 'FIX_BEFORE_DEPLOY'
        return 'CREATE_TECH_DEBT_TICKET'

    if phase == 'growth':
        # Rank importance explicitly; comparing the strings directly
        # would order them alphabetically
        if IMPORTANCE.index(test_importance(test)) > IMPORTANCE.index('MEDIUM'):
            return 'FIX_NOW'
        return 'FIX_NEXT_SPRINT'

    # Scale - fix everything
    return 'FIX_NOW'

ROI-Based Test Prioritization

Run High-Value Tests First

def prioritize_tests(tests):
    """Order by value/cost ratio"""

    scored_tests = []
    for test in tests:
        score = {
            'test': test,
            'roi': calculate_roi(test),
            'runtime': estimate_runtime(test),
            'catches_bugs': bug_catch_rate(test)
        }
        scored_tests.append(score)

    # High ROI, fast tests first (guard against zero runtimes)
    return sorted(scored_tests,
                  key=lambda x: x['roi'] / max(x['runtime'], 0.001),
                  reverse=True)

Skip Low-Value Tests

LOW_VALUE_PATTERNS = [
    'test_button_color',
    'test_tooltip_text',
    'test_css_classes',
    'test_html_structure',
    'test_every_validation_message'
]

def should_skip_test(test, phase):
    if phase in ['prototype', 'mvp']:
        if any(pattern in test.name for pattern in LOW_VALUE_PATTERNS):
            return True
    return False

Escape Hatches

Emergency Skips

# Skip all regression tests
/regression-skip --reason "hotfix"

# Skip slow tests
/regression-fast --timeout 60

# Skip flaky tests
/regression-stable --skip-flaky

Partial Runs

# Only run what changed
/regression-changed --since "yesterday"

# Only run critical
/regression-critical

# Skip specific domains
/regression --skip "visual,performance"
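The "only run what changed" mode can be sketched with `git diff --name-only`. The revision argument and the test-file filter below are assumptions about how such a command might be wired up:

```python
import subprocess

def filter_test_files(paths):
    """Keep only Python test files from a list of changed paths."""
    return [p for p in paths
            if p.endswith(".py") and "test" in p.rsplit("/", 1)[-1]]

def changed_test_files(since="HEAD~1"):
    """List test files changed since a revision, via `git diff --name-only`."""
    out = subprocess.run(
        ["git", "diff", "--name-only", since],
        capture_output=True, text=True, check=True,
    ).stdout
    return filter_test_files(out.splitlines())
```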

Test Organization V3

Smart Test Classification

ai-state/regressions/
├── critical/          # Always run (login, payment, data)
│   ├── auth/
│   ├── payment/
│   └── data/
├── user-facing/       # Run for MVP+ (user workflows)
│   ├── onboarding/
│   ├── dashboard/
│   └── settings/
├── edge-cases/        # Run for Growth+ (boundaries)
│   ├── limits/
│   ├── errors/
│   └── recovery/
├── performance/       # Run for Scale (speed)
│   ├── load/
│   ├── stress/
│   └── benchmarks/
└── low-priority/      # Run when time permits
    ├── visual/
    ├── polish/
    └── rare/
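The directory layout above maps each category to the earliest phase that runs it. A sketch of that mapping (the category names follow the tree; the exact phase assignments are an assumption drawn from the comments):

```python
PHASE_ORDER = ["prototype", "mvp", "growth", "scale"]

CATEGORY_MIN_PHASE = {
    "critical": "prototype",     # always run
    "user-facing": "mvp",        # run for MVP+
    "edge-cases": "growth",      # run for Growth+
    "performance": "scale",      # run for Scale
    "low-priority": "scale",     # when time permits
}

def runs_in(category, phase):
    """True if a test category runs at the given phase or later."""
    return PHASE_ORDER.index(phase) >= PHASE_ORDER.index(CATEGORY_MIN_PHASE[category])
```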

Reporting V3

Actionable Reports

regression_report:
  phase: "mvp"
  duration: "4 minutes"

  summary:
    total_tests: 45
    ran: 20  # Only phase-appropriate
    skipped: 25  # Not relevant for MVP
    passed: 18
    failed: 2

  failures:
    - test: "test_payment_processing"
      importance: "CRITICAL"
      action: "FIX_NOW"
      suggested_fix: "Update field name from 'total' to 'amount'"

    - test: "test_tooltip_position"
      importance: "LOW"
      action: "DEFER"
      reason: "UI polish not critical for MVP"

  recommendations:
    - "Fix payment test before deploy"
    - "Defer UI tests until growth phase"
    - "Consider deleting 5 low-value tests"

Migration from V2

Reduce Test Suite

def migrate_regression_suite():
    """Reduce from 400+ tests to a manageable suite"""

    all_tests = discover_all_tests()
    phase = detect_project_phase()
    keep_tests = all_tests  # default: keep everything

    if phase == 'prototype':
        keep_tests = filter_critical_only(all_tests)
        archive_rest(set(all_tests) - set(keep_tests))
        print(f"Reduced from {len(all_tests)} to {len(keep_tests)} tests")

    organize_by_value(keep_tests)

Performance Optimization

Fast Feedback Loops

def optimize_regression_speed():
    """Make regressions fast"""

    # Store strategies as callables so they run when the suite is
    # configured, not when this list is built
    strategies = [
        lambda: run_in_parallel(workers=4),
        skip_slow_tests_in_dev,
        cache_test_fixtures,
        use_in_memory_database,
        mock_external_services,
    ]
    for strategy in strategies:
        strategy()

    # Target budgets:
    # Prototype: < 1 minute
    # MVP: < 5 minutes
    # Growth: < 15 minutes
    # Scale: < 30 minutes

Example Execution

Prototype Phase

$ /regression-quick

Running CRITICAL regression tests...
Phase: Prototype
Tests to run: 4 (out of 45 total)

✅ smoke_test_api (0.5s)
✅ test_user_login (1.2s)
✅ test_data_saves (0.8s)
✅ test_basic_auth (0.6s)
⚠️ Skipped 41 non-critical tests

Duration: 3.1 seconds
Result: PASS (4/4 critical tests passed)

MVP Phase

$ /regression-standard

Running MVP regression tests...
Phase: MVP
Tests to run: 25 (out of 120 total)

✅ Critical: 8/8 passed
✅ Auth: 4/4 passed
❌ User flows: 7/8 passed
✅ API: 5/5 passed
⚠️ Skipped 95 edge-case and performance tests

Duration: 12 minutes
Result: 1 failure (non-critical)
Action: Created tech debt ticket

Summary

Regression Runner V3:

  • Runs phase-appropriate tests only
  • Prioritizes by ROI
  • Auto-fixes mismatches
  • Provides escape hatches
  • Focuses on speed

Goal: Fast feedback with appropriate coverage for current phase.