## Install

Source · Clone the upstream repo:

```shell
git clone https://github.com/Intense-Visions/harness-engineering
```

Claude Code · Install into `~/.claude/skills/`:

```shell
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/Intense-Visions/harness-engineering "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/agents/skills/codex/test-e2e-strategy" ~/.claude/skills/intense-visions-harness-engineering-test-e2e-strategy-39f2b8 \
  && rm -rf "$T"
```
Manifest: `agents/skills/codex/test-e2e-strategy/SKILL.md`
# Test E2E Strategy

Choose the right test layer (unit/integration/E2E) and prevent flaky tests in CI.
## When to Use
- Designing a test strategy for a new project or feature
- Deciding which test layer to write for a specific behavior
- Reducing test suite flakiness in CI
- Balancing test coverage with execution speed
## Instructions
- Apply the test trophy (not pyramid) — prioritize integration tests:

  ```
     /       E2E         \   — Few: critical user journeys
    /     Integration     \  — Many: service + database tests
   /      Unit Tests       \ — Some: pure logic, algorithms
  /     Static Analysis     \ — Always: TypeScript, ESLint
  ```
- Choose the right layer:
  - Static analysis — type errors, lint rules, formatting. Runs on every save
  - Unit tests — pure functions, calculations, data transforms. No I/O
  - Integration tests — service + database, API endpoints, module composition
  - E2E tests — critical user journeys through the full stack
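To make the unit layer concrete: a pure calculation with no I/O is the ideal unit-test candidate, because every edge case can be checked in milliseconds. A minimal sketch (the `applyDiscount` function and its rounding rule are hypothetical, not part of this skill):

```typescript
// Hypothetical pure function: no I/O, no clock, no network.
// This is exactly the kind of logic that belongs at the unit layer.
function applyDiscount(totalCents: number, discountPercent: number): number {
  if (discountPercent < 0 || discountPercent > 100) {
    throw new RangeError('discountPercent must be between 0 and 100');
  }
  // Keep money arithmetic in integer cents and round once at the end.
  return Math.round(totalCents * (1 - discountPercent / 100));
}

// Edge cases are cheap to enumerate at this layer.
const halfOff = applyDiscount(1000, 50); // 500
const untouched = applyDiscount(999, 0); // 999
```

Behavior that crosses a database or HTTP boundary moves up to the integration layer instead.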
- Write E2E tests for critical paths only:

  ```typescript
  // Good E2E candidates:
  // - User registration and login
  // - Checkout and payment flow
  // - Core feature happy path

  // Bad E2E candidates:
  // - Input validation (unit test)
  // - API error handling (integration test)
  // - Conditional rendering (component test)
  ```
- Prevent flakiness with deterministic test design:

  ```typescript
  // Bad: depends on timing
  await page.click('#submit');
  await new Promise((r) => setTimeout(r, 2000));
  expect(page.getByText('Success')).toBeVisible();

  // Good: waits for a specific condition
  await page.click('#submit');
  await expect(page.getByText('Success')).toBeVisible({ timeout: 10000 });
  ```
- Isolate test data — each test creates its own data:

  ```typescript
  test('user can update their profile', async ({ page }) => {
    // Create unique test user — no shared state with other tests
    const user = await createTestUser();
    await loginAs(page, user);
    // ... test continues
  });
  ```
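The `createTestUser` helper above is not defined by this skill; one plausible shape generates unique credentials on every call so no two tests can collide. A sketch under that assumption (a real helper would also be async and persist the user to the test database):

```typescript
import { randomUUID } from 'node:crypto';

interface TestUser {
  id: string;
  email: string;
  password: string;
}

// Hypothetical helper: every call fabricates a brand-new user, so tests
// never share state. A real implementation would also insert the user
// into the test database (omitted here).
function createTestUser(): TestUser {
  const id = randomUUID();
  return {
    id,
    email: `user-${id}@example.test`,
    password: randomUUID(),
  };
}
```

Uniqueness per call is the property that matters: it makes tests order-independent and safe to run in parallel.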
- Structure CI test runs by speed:

  ```yaml
  # .github/workflows/test.yml
  jobs:
    static:
      steps: [typecheck, lint]                        # 30 seconds
    unit:
      steps: [vitest --run]                           # 1-2 minutes
    integration:
      steps: [vitest --config vitest.integration.ts]  # 2-5 minutes
    e2e:
      steps: [playwright test]                        # 5-15 minutes
  ```
- Handle flaky tests:
  - First: fix the root cause (race condition, shared state, timing)
  - Temporary: add `retries: 2` in CI config
  - Track: annotate with `test.fixme()` or a tracking issue
  - Never: delete the test or ignore it indefinitely
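The temporary `retries: 2` mitigation typically lives in the E2E runner's config. A sketch of the relevant fragment, assuming a Playwright setup where CI is detected via the `CI` environment variable (a real `playwright.config.ts` would wrap this in `defineConfig` from `@playwright/test`):

```typescript
// Retry only in CI so local failures surface immediately and the root
// cause gets fixed rather than masked by retries.
const isCI = process.env.CI === 'true' || process.env.CI === '1';

const retryPolicy = {
  retries: isCI ? 2 : 0, // temporary mitigation, never the fix
};
```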
- Coverage targets by layer:
  - Unit: 80%+ branch coverage on business logic
  - Integration: cover all API endpoints and error paths
  - E2E: cover 3-5 critical user journeys
  - Total: 70-85% combined coverage
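Targets like these can be enforced by the runner rather than by convention. A sketch of a Vitest coverage fragment mirroring the numbers above (option names follow Vitest 1.x; verify against your version):

```typescript
// Fragment of the coverage section of a vitest.config.ts; a real config
// wraps this in defineConfig({ test: { coverage } }).
const coverage = {
  provider: 'v8',
  thresholds: {
    branches: 80, // unit-layer target: 80%+ branch coverage
    lines: 70,    // floor taken from the 70-85% combined target
  },
};
```

With thresholds in place, a drop below target fails the test run instead of silently eroding coverage.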
## Details
A test strategy defines which behaviors are tested at which layer. The goal is maximum confidence with minimum execution time.
Test trophy vs test pyramid: The traditional test pyramid (many unit, few integration, fewer E2E) optimizes for speed but misses integration bugs. The test trophy (coined by Kent C. Dodds) prioritizes integration tests because they catch the most bugs per test.
Cost of flaky tests:
- Developers lose trust in the test suite and ignore failures
- CI becomes unreliable — teams retry instead of investigating
- Flaky tests waste CI compute time
- Undetected flakes can mask real failures
Common flakiness causes:
- Shared mutable state between tests (global variables, database state)
- Race conditions (testing async behavior with timeouts instead of conditions)
- Time dependency (tests that break at midnight or on leap days)
- Network dependency (tests that call real external APIs)
- Order dependency (test A must run before test B)
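Of these causes, time dependency is the easiest to design away: pass the clock in as a parameter instead of reading it inside the function. A hypothetical sketch:

```typescript
// Bad: reads the real clock internally, so the result depends on when
// and where the test runs (midnight, leap days, CI timezone).
function isExpiredFlaky(expiresAt: Date): boolean {
  return expiresAt.getTime() <= Date.now();
}

// Good: the caller supplies "now", so tests pin it to a fixed instant.
function isExpired(expiresAt: Date, now: Date): boolean {
  return expiresAt.getTime() <= now.getTime();
}

// Deterministic regardless of when or where the test runs.
const now = new Date('2024-06-01T00:00:00Z');
const expired = isExpired(new Date('2024-05-31T23:59:59Z'), now); // true
const active = isExpired(new Date('2024-06-02T00:00:00Z'), now);  // false
```

The same injection pattern removes the other causes too: pass in the database handle, the random seed, or the HTTP client, and the test controls them all.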
Flakiness detection:
- Run the test suite 10x with `--repeat=10` to find intermittent failures
- Use `--bail=1` to stop at the first failure and investigate
- Track flaky test rates with a CI dashboard
Trade-offs:
- More E2E tests catch more real bugs — but are slow and expensive to maintain
- Integration tests have the best ROI — but require test databases and setup
- Unit tests are fast — but can pass while the system is broken
- Strict CI rules (no flakes allowed) maintain quality — but can slow development
## Source
https://testing-library.com/docs/guiding-principles
## Process
- Read the instructions and examples in this document.
- Apply the patterns to your implementation, adapting to your specific context.
- Verify your implementation against the details and edge cases listed above.
## Harness Integration
- Type: knowledge — this skill is a reference document, not a procedural workflow.
- No tools or state — consumed as context by other skills and agents.
## Success Criteria
- The patterns described in this document are applied correctly in the implementation.
- Edge cases and anti-patterns listed in this document are avoided.