Aiwg regression-report

Generate comprehensive regression analysis reports combining bisect, baseline, and metrics data with actionable recommendations

install

source · Clone the upstream repo

git clone https://github.com/jmagly/aiwg

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/jmagly/aiwg "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.agents/skills/regression-report" ~/.claude/skills/jmagly-aiwg-regression-report && rm -rf "$T"

manifest: .agents/skills/regression-report/SKILL.md

source content

regression-report

Generate comprehensive regression analysis reports combining bisect, baseline, and metrics data.

Triggers

Alternate expressions and non-obvious activations (primary phrases are matched automatically from the skill description):

"full regression report" → comprehensive regression analysis
"test health" → regression summary report

Purpose

This skill produces comprehensive regression reports by:

Synthesizing data from bisect, baseline, and metrics
Correlating regressions with code changes
Identifying systemic issues and patterns
Providing actionable recommendations
Creating executive summaries
Supporting postmortem analysis

Behavior

When triggered, this skill:

Gathers regression data:
- Load bisect reports for root cause
- Import baseline comparison results
- Fetch regression metrics and trends
- Retrieve issue tracker data
- Collect CI/CD logs
Correlates information:
- Link regressions to code changes
- Map to requirements and features
- Identify common root causes
- Connect to team/component ownership
Analyzes impact:
- Assess user impact
- Calculate downtime/degradation
- Measure business cost
- Evaluate reputation impact
Identifies patterns:
- Recurring regression types
- High-risk components
- Time-based patterns
- Process gaps
Generates insights:
- Root cause categories
- Prevention strategies
- Process improvements
- Tooling recommendations
Produces reports:
- Executive summary
- Technical deep-dive
- Action plan
- Lessons learned

Report Types

Incident Report

incident_report:
  description: Single regression deep-dive
  scope: One specific regression incident
  audience: Technical team and stakeholders

  sections:
    - incident_summary
    - timeline
    - root_cause_analysis
    - impact_assessment
    - resolution_steps
    - prevention_measures
    - lessons_learned

Sprint Regression Report

sprint_report:
  description: All regressions in a sprint
  scope: Sprint boundary
  audience: Development team

  sections:
    - sprint_summary
    - regression_list
    - metrics_summary
    - hotspot_analysis
    - recommendations
    - goals_for_next_sprint

Release Regression Report

release_report:
  description: Regression analysis for release
  scope: Release cycle
  audience: Release management and stakeholders

  sections:
    - release_summary
    - regression_timeline
    - escape_analysis
    - quality_gates_review
    - lessons_learned
    - release_readiness

Quarterly Regression Analysis

quarterly_report:
  description: Strategic regression analysis
  scope: 3 months
  audience: Leadership and engineering managers

  sections:
    - executive_summary
    - trend_analysis
    - strategic_insights
    - investment_recommendations
    - process_improvements
    - success_metrics

Comprehensive Report Format

# Comprehensive Regression Report

**Period**: Sprint 13 (2026-01-14 to 2026-01-28)
**Project**: User Service
**Report Type**: Sprint Regression Analysis
**Generated**: 2026-01-28 17:00:00
**Analyzer**: regression-report skill

---

## Executive Summary

**Overall Assessment**: ⚠️ Acceptable Quality with Concerns

| Metric | Value | Status |
|--------|-------|--------|
| Regressions Detected | 4 | ✅ Within target (< 5) |
| Production Escapes | 1 | ⚠️ Above target (0 preferred) |
| Mean Time to Detect | 8.5h | ✅ Excellent |
| Mean Time to Fix | 18.7h | ⚠️ Close to target |
| User Impact | 500 users | ⚠️ Medium |

**Key Findings**:
- Regression rate within acceptable range but auth module concerning
- One production escape indicates staging test gap
- Detection speed excellent, fix time acceptable
- Systemic issue: Integration testing coverage insufficient

**Priority Actions**:
1. Add integration tests for authentication flows
2. Improve staging environment test coverage
3. Implement regression test requirement for all fixes

---

## Regression Inventory

### REG-001: JWT Issuer Validation Breaks Existing Sessions

**Severity**: High
**Status**: Fixed (deployed 2026-01-16)
**Detection**: Automated test failure
**Environment**: Staging

**Timeline**:
- 2026-01-15 14:32: Breaking commit merged (pqr901)
- 2026-01-15 16:15: CI test failure detected (2h MTTD)
- 2026-01-15 18:45: Root cause identified via bisect
- 2026-01-16 09:30: Fix deployed (15h MTTF)

**Root Cause**: Required `iss` claim in JWT validation without backward compatibility

**Impact**: All existing user sessions invalidated

**Files Changed**:
- src/auth/validate-token.ts (+15/-5)
- src/auth/generate-token.ts (+8/-2)

**Fix**: Added feature flag for gradual rollout

**Prevention**:
- [ ] Add integration tests for auth changes
- [ ] Require backward compatibility review

**References**:
- Bisect Report: @.aiwg/testing/regression-bisect-jwt.md
- Issue: #456
- PR: #789

---

### REG-002: User Profile Update Returns 500 on Invalid Email

**Severity**: Critical
**Status**: Fixed (deployed 2026-01-18)
**Detection**: Production monitoring
**Environment**: Production

**Timeline**:
- 2026-01-17 08:00: Deployed to production
- 2026-01-17 20:15: Error spike detected (12h MTTD)
- 2026-01-18 02:30: Root cause identified
- 2026-01-18 04:15: Hotfix deployed (8h MTTF)

**Root Cause**: Validation middleware error handling broken, returns 500 instead of 400

**Impact**:
- 500 users affected
- 45 support tickets
- 12 hours partial service degradation

**Files Changed**:
- src/api/validation-middleware.ts (+3/-8)

**Cost**:
- Engineering: 8 hours incident response
- Support: 12 hours ticket triage
- Reputation: Customer satisfaction impact

**Fix**: Restored proper error handling

**Prevention**:
- [ ] Add error code validation tests
- [ ] Improve staging test coverage
- [ ] Add monitoring for error rate spikes

**References**:
- Incident Report: @.aiwg/incidents/INC-2026-001.md
- Issue: #478
- Hotfix PR: #791

---

### REG-003: Password Reset Email Fails for Gmail Users

**Severity**: Medium
**Status**: Fixed (deployed 2026-01-22)
**Detection**: Manual QA testing
**Environment**: Staging

**Timeline**:
- 2026-01-20 11:00: Code merged
- 2026-01-21 15:30: QA detected (28.5h MTTD)
- 2026-01-22 10:00: Fix deployed (18.5h MTTF)

**Root Cause**: Email template contained invalid HTML characters rejected by Gmail

**Impact**: Password reset broken for ~40% of user base (Gmail users)

**Files Changed**:
- templates/email/password-reset.html (+5/-3)

**Fix**: HTML entity encoding for special characters

**Prevention**:
- [ ] Add email template validation
- [ ] Test with multiple email providers
- [ ] Add email delivery monitoring

**References**:
- Issue: #482
- PR: #794

---

### REG-004: Dashboard Widget Load Time Increased

**Severity**: Low
**Status**: Fixed (deployed 2026-01-26)
**Detection**: Performance baseline comparison
**Environment**: Staging

**Timeline**:
- 2026-01-24 09:00: Performance test detected slowdown
- 2026-01-24 10:15: Root cause identified (1.25h MTTD)
- 2026-01-26 08:00: Optimization deployed (45.75h MTTF)

**Root Cause**: N+1 query introduced in dashboard widget

**Impact**: Dashboard load time increased from 0.8s to 2.3s

**Files Changed**:
- src/dashboard/widgets/activity-feed.ts (+12/-5)

**Fix**: Added eager loading to eliminate N+1

**Prevention**:
- [ ] Add performance regression tests
- [ ] Code review focus on query optimization

**References**:
- Baseline Comparison: @.aiwg/testing/baseline-comparison-perf.md
- Issue: #485
- PR: #797

---

## Impact Analysis

### User Impact

| Regression | Users Affected | Duration | Severity |
|------------|----------------|----------|----------|
| REG-001 | 0 (staging) | N/A | High (avoided) |
| REG-002 | 500 | 12h | Critical |
| REG-003 | 0 (staging) | N/A | Medium (avoided) |
| REG-004 | 0 (staging) | N/A | Low (avoided) |

**Total Production Impact**: 500 users, 12 hours
**Escapes Prevented**: 3 of 4 regressions caught pre-production

### Business Impact

Direct Costs:

Engineering incident response: 24 hours @ $150/h = $3,600
Support ticket triage: 12 hours @ $75/h = $900 Total Direct: $4,500

Indirect Costs:

Customer satisfaction impact (estimated)
Reputation damage (1 production incident)
Lost productivity (500 users × 12h) Estimated Indirect: $8,000

Total Estimated Cost: $12,500


### Component Impact

| Component | Regressions | User Impact | Risk Level |
|-----------|-------------|-------------|------------|
| src/auth/ | 1 | High (avoided) | 🔴 High |
| src/api/ | 1 | Critical (500 users) | 🔴 High |
| templates/ | 1 | Medium (avoided) | 🟡 Medium |
| src/dashboard/ | 1 | Low (avoided) | 🟢 Low |

**High-Risk Components**: Auth and API modules require increased testing

---

## Root Cause Analysis

### Root Cause Categories

| Category | Count | % | Examples |
|----------|-------|---|----------|
| Missing integration tests | 2 | 50% | REG-001, REG-002 |
| Missing validation | 1 | 25% | REG-003 |
| Performance not tested | 1 | 25% | REG-004 |

**Systemic Issue**: Integration testing coverage insufficient (50% of regressions)

### Prevention Success Rate

| Prevention Measure | Present? | Effective? |
|-------------------|----------|------------|
| Unit tests | ✅ Yes | ⚠️ Partial |
| Integration tests | ❌ No | N/A |
| Performance tests | ⚠️ Partial | ⚠️ Partial |
| Email validation | ❌ No | N/A |
| Staging environment | ✅ Yes | ✅ Yes (caught 3/4) |

**Gap**: Integration tests would have prevented 50% of regressions

---

## Metrics Summary

### Sprint Performance

| Metric | Sprint 13 | Sprint 12 | Change |
|--------|-----------|-----------|--------|
| Regressions | 4 | 4 | → Stable |
| MTTD | 8.5h | 12h | ↓ Improved |
| MTTF | 18.7h | 28.4h | ↓ Improved |
| Escape Rate | 25% (1/4) | 0% | ↑ Worsened |

**Trend**: Detection and fix times improving, but production escape concerning

### Quality Gates

| Gate | Target | Actual | Status |
|------|--------|--------|--------|
| Unit test coverage | 80% | 85% | ✅ Pass |
| Integration test coverage | 60% | 42% | ❌ Fail |
| Performance baseline | No degradation | 1 regression | ⚠️ Warning |
| Security scan | No critical | Pass | ✅ Pass |

**Failed Gate**: Integration test coverage below target

---

## Recommendations

### Immediate (This Week)

**Priority 1: Add Integration Tests for Auth**
- Reason: 1 auth regression, staging escape
- Impact: Prevent auth-related regressions
- Effort: 2 days
- Owner: Test Engineer
- Target: +20% integration coverage

**Priority 2: Fix Email Template Validation**
- Reason: REG-003 could have been automated
- Impact: Catch email issues pre-deployment
- Effort: 1 day
- Owner: DevOps Engineer

**Priority 3: Implement Regression Test Requirement**
- Reason: 25% recurrence rate on fixes without tests
- Impact: Prevent regression recurrence
- Effort: Process change (1 hour)
- Owner: Tech Lead

### Short-term (This Sprint)

**Priority 4: Improve Staging Test Coverage**
- Reason: Production escape indicates gap
- Impact: Reduce escape rate to <5%
- Effort: 1 week
- Owner: Test Architect

**Priority 5: Add Performance Regression Tests**
- Reason: REG-004 caught late
- Impact: Earlier performance issue detection
- Effort: 3 days
- Owner: Performance Engineer

**Priority 6: Review Error Handling Standards**
- Reason: REG-002 returned 500 instead of 400
- Impact: Consistent error behavior
- Effort: 2 days (review + guidelines)
- Owner: API Designer

### Ongoing

**Priority 7: Require Integration Tests for Auth PRs**
- Reason: Auth module high-risk
- Impact: No auth regressions without tests
- Effort: PR template update
- Owner: Tech Lead

**Priority 8: Weekly Regression Review**
- Reason: Early pattern identification
- Impact: Faster response to trends
- Effort: 30 min/week
- Owner: Test Lead

---

## Lessons Learned

### What Went Well

1. **Fast Detection**: MTTD of 8.5h shows automation working
2. **Staging Environment**: Caught 3 of 4 regressions before production
3. **Bisect Tooling**: Root cause identification very fast
4. **Team Response**: Fast fix times (18.7h MTTF)

### What Didn't Go Well

1. **Production Escape**: REG-002 bypassed all pre-production testing
2. **Integration Coverage**: Too low to catch cross-component issues
3. **Email Validation**: No automated testing for email templates
4. **Backward Compatibility**: REG-001 broke existing sessions

### Process Improvements

| Issue | Improvement | Timeline |
|-------|-------------|----------|
| Production escape | Add integration test gates | This sprint |
| Missing email validation | Automate template testing | This sprint |
| Auth regressions | Require integration tests | Immediate |
| Performance regressions | Add performance baselines | Next sprint |

---

## Sprint Goals for Sprint 14

Based on this analysis, Sprint 14 regression goals:

| Goal | Target | Success Criteria |
|------|--------|------------------|
| Regression Rate | < 4 | Fewer total regressions |
| Integration Coverage | 60% | Meet quality gate |
| Escape Rate | 0% | No production regressions |
| MTTD | < 8h | Maintain current level |
| MTTF | < 18h | Slightly faster fixes |

**Focus Areas**: Integration testing, staging coverage, auth module

---

## Appendices

### A. Regression Timeline

2026-01-15: REG-001 introduced (auth JWT) 2026-01-15: REG-001 detected (2h) 2026-01-16: REG-001 fixed (15h) 2026-01-17: REG-002 introduced (validation) 2026-01-17: REG-002 escaped to production 2026-01-18: REG-002 detected (12h) 2026-01-18: REG-002 fixed (8h) 2026-01-20: REG-003 introduced (email) 2026-01-21: REG-003 detected (28.5h) 2026-01-22: REG-003 fixed (18.5h) 2026-01-24: REG-004 introduced (performance) 2026-01-24: REG-004 detected (1.25h) 2026-01-26: REG-004 fixed (45.75h)


### B. References

- Bisect Reports: @.aiwg/testing/regression-bisect-*/
- Baseline Comparisons: @.aiwg/testing/baseline-comparisons/
- Metrics Dashboard: @.aiwg/testing/regression-metrics-dashboard.md
- Incident Reports: @.aiwg/incidents/
- Issues: GitHub Issues (label: regression)

### C. Data Sources

- Regression test results: `.aiwg/testing/regression-results/`
- CI/CD logs: GitHub Actions
- Issue tracker: GitHub Issues
- Monitoring: Datadog
- User reports: Support tickets

Usage Examples

Generate Sprint Report

User: "Generate regression report for this sprint"

Skill executes:
1. Identify sprint boundary (Sprint 13)
2. Collect all regression data
3. Correlate with code changes
4. Analyze patterns and impact
5. Generate recommendations

Output:
"Sprint Regression Report Generated

Sprint 13 Summary:
- 4 regressions detected
- 1 production escape
- 8.5h MTTD, 18.7h MTTF
- 500 users impacted

Key Findings:
- Integration testing gap (50% of regressions)
- Auth module high-risk (needs attention)
- Staging caught 75% (good)

Top Recommendations:
1. Add integration tests for auth
2. Improve staging coverage
3. Require regression tests for fixes

Full report: .aiwg/testing/regression-report-sprint-13.md"

Incident Postmortem

User: "Regression report for the validation incident"

Skill analyzes:
- REG-002 incident data
- Timeline and impact
- Root cause from bisect
- Related regressions

Output:
"Incident Postmortem: REG-002

Timeline: 12 hours (detection to fix)
Impact: 500 users, 45 support tickets
Cost: ~$12,500 (direct + indirect)

Root Cause: Validation middleware broken
Prevention: Integration tests missing

Recommendations:
- Add error code validation tests
- Improve staging coverage
- Add monitoring for error spikes

Lessons Learned:
- Fast response (8h MTTF)
- Staging gap allowed escape
- Need integration test gate

Full postmortem: .aiwg/testing/regression-postmortem-REG-002.md"

Quarterly Analysis

User: "Quarterly regression analysis"

Skill generates:
"Quarterly Regression Analysis (Q1 2026)

Executive Summary:
- 58 total regressions (-35% vs Q4 2025)
- 3 production escapes (-70% vs Q4)
- MTTD: 9.2h (↓ from 24h)
- MTTF: 22h (↓ from 42h)

Strategic Insights:
- Automation investment paying off (76% faster detection)
- Integration testing gap remains (40% of regressions)
- Auth and API modules highest risk

Investment Recommendations:
- $50k: Integration test automation
- $30k: Performance testing platform
- $20k: Staging environment expansion

Full analysis: .aiwg/testing/regression-analysis-Q1-2026.md"

Integration

This skill uses:

```
regression-bisect
```
: Import root cause analysis
```
regression-baseline
```
: Import drift data
```
regression-metrics
```
: Import statistical trends
```
project-awareness
```
: Detect sprints/releases
```
traceability-check
```
: Link to requirements

Agent Orchestration

agents:
  synthesis:
    agent: technical-writer
    focus: Report generation and clarity

  analysis:
    agent: metrics-analyst
    focus: Data correlation and insights

  recommendations:
    agent: test-architect
    focus: Prevention strategies

Configuration

Report Templates

report_templates:
  incident:
    template: templates/regression/incident-report.md
    sections: [summary, timeline, root_cause, impact, prevention]

  sprint:
    template: templates/regression/sprint-report.md
    sections: [summary, inventory, metrics, recommendations]

  quarterly:
    template: templates/regression/quarterly-report.md
    sections: [executive, trends, strategic, investments]

Data Aggregation

aggregation_config:
  sources:
    - bisect_reports: .aiwg/testing/regression-bisect-*/
    - baselines: .aiwg/testing/baseline-comparisons/
    - metrics: .aiwg/testing/regression-metrics-dashboard.md
    - issues: github_issues
    - incidents: .aiwg/incidents/

  correlation_rules:
    - link_bisect_to_issue
    - map_component_to_owner
    - calculate_business_impact

Output Locations

Reports:

.aiwg/testing/regression-report-{period}.md

Postmortems:

.aiwg/testing/regression-postmortem-{issue}.md

Quarterly:

.aiwg/testing/regression-analysis-Q{n}-{year}.md

References

@$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/schemas/test/regression-report-schema.yaml
@$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/templates/regression/incident-report.md
@$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/agents/technical-writer.md
@$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/agents/metrics-analyst.md