## Install

Source · Clone the upstream repo:

```shell
git clone https://github.com/Intense-Visions/harness-engineering
```

Claude Code · Install into `~/.claude/skills/`:

```shell
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/Intense-Visions/harness-engineering "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/agents/skills/codex/test-e2e-strategy" ~/.claude/skills/intense-visions-harness-engineering-test-e2e-strategy-39f2b8 \
  && rm -rf "$T"
```
Manifest: `agents/skills/codex/test-e2e-strategy/SKILL.md`
# Test E2E Strategy

Choose the right test layer (unit/integration/E2E) and prevent flaky tests in CI.
## When to Use
- Designing a test strategy for a new project or feature
- Deciding which test layer to write for a specific behavior
- Reducing test suite flakiness in CI
- Balancing test coverage with execution speed
## Instructions
- Apply the test trophy (not pyramid) — prioritize integration tests:

  ```
     /       E2E         \   — Few: critical user journeys
    /     Integration     \  — Many: service + database tests
   /      Unit Tests       \ — Some: pure logic, algorithms
  /     Static Analysis     \ — Always: TypeScript, ESLint
  ```
- Choose the right layer:
  - Static analysis — type errors, lint rules, formatting. Runs on every save
  - Unit tests — pure functions, calculations, data transforms. No I/O
  - Integration tests — service + database, API endpoints, module composition
  - E2E tests — critical user journeys through the full stack
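To make the unit layer concrete: a pure calculation with no I/O is the ideal unit-test candidate, because every edge case can be checked in milliseconds. A minimal sketch (the `applyDiscount` function and its rounding rule are hypothetical, not part of this skill):

```typescript
// Hypothetical pure function: no I/O, no clock, no network.
// This is exactly the kind of logic that belongs at the unit layer.
function applyDiscount(totalCents: number, discountPercent: number): number {
  if (discountPercent < 0 || discountPercent > 100) {
    throw new RangeError('discountPercent must be between 0 and 100');
  }
  // Keep money arithmetic in integer cents and round once at the end.
  return Math.round(totalCents * (1 - discountPercent / 100));
}

// Edge cases are cheap to enumerate at this layer.
const halfOff = applyDiscount(1000, 50); // 500
const untouched = applyDiscount(999, 0); // 999
```

Behavior that crosses a database or HTTP boundary moves up to the integration layer instead.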
- Write E2E tests for critical paths only:

  ```typescript
  // Good E2E candidates:
  // - User registration and login
  // - Checkout and payment flow
  // - Core feature happy path

  // Bad E2E candidates:
  // - Input validation (unit test)
  // - API error handling (integration test)
  // - Conditional rendering (component test)
  ```
- Prevent flakiness with deterministic test design:

  ```typescript
  // Bad: depends on timing
  await page.click('#submit');
  await new Promise((r) => setTimeout(r, 2000));
  expect(page.getByText('Success')).toBeVisible();

  // Good: waits for a specific condition
  await page.click('#submit');
  await expect(page.getByText('Success')).toBeVisible({ timeout: 10000 });
  ```
- Isolate test data — each test creates its own data:

  ```typescript
  test('user can update their profile', async ({ page }) => {
    // Create unique test user — no shared state with other tests
    const user = await createTestUser();
    await loginAs(page, user);
    // ... test continues
  });
  ```
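The `createTestUser` helper above is not defined by this skill; one plausible shape generates unique credentials on every call so no two tests can collide. A sketch under that assumption (a real helper would also be async and persist the user to the test database):

```typescript
import { randomUUID } from 'node:crypto';

interface TestUser {
  id: string;
  email: string;
  password: string;
}

// Hypothetical helper: every call fabricates a brand-new user, so tests
// never share state. A real implementation would also insert the user
// into the test database (omitted here).
function createTestUser(): TestUser {
  const id = randomUUID();
  return {
    id,
    email: `user-${id}@example.test`,
    password: randomUUID(),
  };
}
```

Uniqueness per call is the property that matters: it makes tests order-independent and safe to run in parallel.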
- Structure CI test runs by speed:

  ```yaml
  # .github/workflows/test.yml
  jobs:
    static:
      steps: [typecheck, lint]                        # 30 seconds
    unit:
      steps: [vitest --run]                           # 1-2 minutes
    integration:
      steps: [vitest --config vitest.integration.ts]  # 2-5 minutes
    e2e:
      steps: [playwright test]                        # 5-15 minutes
  ```
- Handle flaky tests:
  - First: fix the root cause (race condition, shared state, timing)
  - Temporary: add `retries: 2` in CI config
  - Track: annotate with `test.fixme()` or a tracking issue
  - Never: delete the test or ignore it indefinitely
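The temporary `retries: 2` mitigation typically lives in the E2E runner's config. A sketch of the relevant fragment, assuming a Playwright setup where CI is detected via the `CI` environment variable (a real `playwright.config.ts` would wrap this in `defineConfig` from `@playwright/test`):

```typescript
// Retry only in CI so local failures surface immediately and the root
// cause gets fixed rather than masked by retries.
const isCI = process.env.CI === 'true' || process.env.CI === '1';

const retryPolicy = {
  retries: isCI ? 2 : 0, // temporary mitigation, never the fix
};
```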
- Coverage targets by layer:
  - Unit: 80%+ branch coverage on business logic
  - Integration: cover all API endpoints and error paths
  - E2E: cover 3-5 critical user journeys
  - Total: 70-85% combined coverage
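Targets like these can be enforced by the runner rather than by convention. A sketch of a Vitest coverage fragment mirroring the numbers above (option names follow Vitest 1.x; verify against your version):

```typescript
// Fragment of the coverage section of a vitest.config.ts; a real config
// wraps this in defineConfig({ test: { coverage } }).
const coverage = {
  provider: 'v8',
  thresholds: {
    branches: 80, // unit-layer target: 80%+ branch coverage
    lines: 70,    // floor taken from the 70-85% combined target
  },
};
```

With thresholds in place, a drop below target fails the test run instead of silently eroding coverage.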
## Details
A test strategy defines which behaviors are tested at which layer. The goal is maximum confidence with minimum execution time.
Test trophy vs test pyramid: The traditional test pyramid (many unit, few integration, fewer E2E) optimizes for speed but misses integration bugs. The test trophy (coined by Kent C. Dodds) prioritizes integration tests because they catch the most bugs per test.
Cost of flaky tests:
- Developers lose trust in the test suite and ignore failures
- CI becomes unreliable — teams retry instead of investigating
- Flaky tests waste CI compute time
- Undetected flakes can mask real failures
Common flakiness causes:
- Shared mutable state between tests (global variables, database state)
- Race conditions (testing async behavior with timeouts instead of conditions)
- Time dependency (tests that break at midnight or on leap days)
- Network dependency (tests that call real external APIs)
- Order dependency (test A must run before test B)
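Of these causes, time dependency is the easiest to design away: pass the clock in as a parameter instead of reading it inside the function. A hypothetical sketch:

```typescript
// Bad: reads the real clock internally, so the result depends on when
// and where the test runs (midnight, leap days, CI timezone).
function isExpiredFlaky(expiresAt: Date): boolean {
  return expiresAt.getTime() <= Date.now();
}

// Good: the caller supplies "now", so tests pin it to a fixed instant.
function isExpired(expiresAt: Date, now: Date): boolean {
  return expiresAt.getTime() <= now.getTime();
}

// Deterministic regardless of when or where the test runs.
const now = new Date('2024-06-01T00:00:00Z');
const expired = isExpired(new Date('2024-05-31T23:59:59Z'), now); // true
const active = isExpired(new Date('2024-06-02T00:00:00Z'), now);  // false
```

The same injection pattern removes the other causes too: pass in the database handle, the random seed, or the HTTP client, and the test controls them all.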
Flakiness detection:
- Run the test suite 10x with `--repeat=10` to find intermittent failures
- Use `--bail=1` to stop at the first failure and investigate
- Track flaky test rates with a CI dashboard
Trade-offs:
- More E2E tests catch more real bugs — but are slow and expensive to maintain
- Integration tests have the best ROI — but require test databases and setup
- Unit tests are fast — but can pass while the system is broken
- Strict CI rules (no flakes allowed) maintain quality — but can slow development
## Source
https://testing-library.com/docs/guiding-principles
## Process
- Read the instructions and examples in this document.
- Apply the patterns to your implementation, adapting to your specific context.
- Verify your implementation against the details and edge cases listed above.
## Harness Integration
- Type: knowledge — this skill is a reference document, not a procedural workflow.
- No tools or state — consumed as context by other skills and agents.
## Success Criteria
- The patterns described in this document are applied correctly in the implementation.
- Edge cases and anti-patterns listed in this document are avoided.