Ai-coding-project-boilerplate integration-e2e-testing

Designs integration and E2E tests with mock boundaries and behavior verification rules. Use when writing E2E or integration tests.

install
source · Clone the upstream repo
git clone https://github.com/shinpr/ai-coding-project-boilerplate
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/shinpr/ai-coding-project-boilerplate "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills-en/integration-e2e-testing" ~/.claude/skills/shinpr-ai-coding-project-boilerplate-integration-e2e-testing && rm -rf "$T"
manifest: .claude/skills-en/integration-e2e-testing/SKILL.md
source content

Integration Test & E2E Test Design/Implementation Rules

Test Types and Limits

| Type | Purpose | File Format | Limit |
|------|---------|-------------|-------|
| Integration Test | Component interaction verification | `*.int.test.ts` | 3 per feature |
| E2E Test | Critical user journey verification | `*.e2e.test.ts` | 1-2 per feature |

Critical User Journey: A feature with revenue impact, legal requirements, or daily use by the majority of users

Behavior-First Principle

Observability Check (All YES = Include)

| Check | Question | If NO |
|-------|----------|-------|
| Observable | Can user observe the result? | Exclude |
| System Context | Does it require integration of multiple components? | Exclude |
| Automatable | Can it run stably in CI environment? | Exclude |
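
The all-YES rule above can be sketched as a small predicate. This is only an illustration: the `ObservabilityCheck` shape, its field names, and the two sample candidates are assumptions, not part of the skill.

```typescript
// Hypothetical shape for the three screening answers; field names are illustrative.
interface ObservabilityCheck {
  observable: boolean;    // Can the user observe the result?
  systemContext: boolean; // Does it require integration of multiple components?
  automatable: boolean;   // Can it run stably in a CI environment?
}

// A candidate is included only when every answer is YES; a single NO excludes it.
function includeCandidate(check: ObservabilityCheck): boolean {
  return check.observable && check.systemContext && check.automatable;
}

// Invented candidates for illustration.
const checkoutTotal: ObservabilityCheck = { observable: true, systemContext: true, automatable: true };
const cacheWarmup: ObservabilityCheck = { observable: false, systemContext: true, automatable: true };

const checkoutIncluded = includeCandidate(checkoutTotal);
const warmupIncluded = includeCandidate(cacheWarmup);
```

Here `checkoutIncluded` is `true` and `warmupIncluded` is `false`, because the cache warm-up result is not observable by the user.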

Include/Exclude Criteria

Include: Business logic accuracy, data integrity, user-visible features, error handling

Exclude: External live connections, performance metrics, implementation details, UI layout

Skeleton Specification

Required Comment Format

Each test MUST include the following annotations.

// AC: "[Acceptance criteria original text]"
// ROI: [0-120] | Business Value: [0-10] | Frequency: [0-10]
// Behavior: [Trigger] -> [Process] -> [Observable Result]
// @category: core-functionality | integration | edge-case | ux | e2e
// @dependency: none | [component name] | full-system
// @complexity: low | medium | high
// @real-dependency: [component name] (optional, when Test Boundaries specify non-mock setup)
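
As a rough illustration of the format, the sketch below holds a filled-in skeleton as a string and checks that every required annotation is present. The checker (`missingAnnotations`), the sample feature, and the `OrderRepository` name are hypothetical; the skill itself does not prescribe any tooling for this.

```typescript
// Required annotation prefixes, taken from the comment format above.
// "@real-dependency" is optional, so it is not checked here.
const REQUIRED_PREFIXES = [
  "// AC:",
  "// ROI:",
  "// Behavior:",
  "// @category:",
  "// @dependency:",
  "// @complexity:",
];

// Returns the required prefixes that no line of the skeleton starts with.
function missingAnnotations(skeleton: string): string[] {
  const lines = skeleton.split("\n").map((line) => line.trim());
  return REQUIRED_PREFIXES.filter(
    (prefix) => !lines.some((line) => line.startsWith(prefix)),
  );
}

// Hypothetical skeleton for an invented order feature.
const exampleSkeleton = `
// AC: "A confirmed order is saved together with its total"
// ROI: 72 | Business Value: 8 | Frequency: 8
// Behavior: User confirms order -> Order service prices the cart -> Saved record contains the total
// @category: integration
// @dependency: OrderRepository
// @complexity: medium
it.todo('AC1: saves the confirmed order with its total')
`;

const missing = missingAnnotations(exampleSkeleton);
```

For this skeleton `missing` is an empty array; dropping any of the six required lines would make its prefix appear in the result.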

Property Annotations

// Property: `[Verification expression]`
// fast-check: fc.property(fc.[arbitrary], (input) => [invariant])

Multi-Step User Journey Definition

A feature qualifies as containing a multi-step user journey when ALL of the following are true:

  1. 2+ distinct interaction boundaries are traversed in sequence to complete a user goal. What counts as a boundary depends on the system type:
    • Web: distinct routes/pages
    • Mobile native: distinct screens/views
    • CLI: distinct command invocations or interactive prompts
    • API: distinct API calls forming a transaction (e.g., create → confirm → finalize)
  2. State carries across steps — data produced or actions taken in one step affect what the next step accepts or displays
  3. The journey has a completion point — a final state the user or caller reaches (e.g., confirmation page, saved record, API success response, completed workflow)
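
The three conditions can be read as a single conjunction. The sketch below assumes a hypothetical `JourneyCandidate` record; the field names and sample journeys are illustrative only.

```typescript
// Hypothetical description of a feature's journey; field names are illustrative.
interface JourneyCandidate {
  boundaries: string[];        // interaction boundaries traversed in order (routes, screens, commands, API calls)
  stateCarriesAcross: boolean; // does one step affect what the next step accepts or displays?
  completionPoint?: string;    // final state the user or caller reaches, if there is one
}

// All three conditions must hold for the feature to contain a multi-step user journey.
function isMultiStepJourney(journey: JourneyCandidate): boolean {
  const distinctBoundaries = new Set(journey.boundaries).size;
  return (
    distinctBoundaries >= 2 &&
    journey.stateCarriesAcross &&
    journey.completionPoint !== undefined
  );
}

// Invented examples: a web checkout and a single-page search.
const checkout: JourneyCandidate = {
  boundaries: ["/cart", "/payment", "/confirmation"],
  stateCarriesAcross: true,
  completionPoint: "confirmation page",
};
const search: JourneyCandidate = {
  boundaries: ["/search"],
  stateCarriesAcross: false,
};

const checkoutQualifies = isMultiStepJourney(checkout);
const searchQualifies = isMultiStepJourney(search);
```

The checkout qualifies; the search does not, since it stays on one route and has no completion point.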

User-Facing vs Service-Internal Journeys

Multi-step journeys are further classified for E2E budget decisions:

| Classification | Condition | E2E Reserved Slot | Example |
|----------------|-----------|-------------------|---------|
| User-facing | A human user directly triggers and observes the steps (via UI, CLI, or direct API interaction) | Eligible | Web checkout flow, CLI setup wizard, mobile onboarding |
| Service-internal | Steps are triggered by backend services without direct user interaction | Not eligible for reserved slot | Async job pipeline, service-to-service saga, scheduled batch processing |

Scope of this classification:

  • Reserved E2E slot: Only user-facing journeys qualify. Service-internal journeys are excluded from the reserved slot.
  • Normal ROI > 50 path: Both user-facing and service-internal journeys compete for the additional E2E slot (up to 1) on ROI merit alone. Classification does not affect this path.
  • E2E Gap Check: Only user-facing journeys trigger the gap warning. Service-internal journeys do not.
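
One way to picture how the classification feeds the three paths is the sketch below. The `E2EPaths` record and `e2ePathsFor` function are hypothetical names introduced for illustration; the skill describes the rules but not an API.

```typescript
type JourneyClass = "user-facing" | "service-internal";

// Hypothetical summary of which E2E paths are open to a multi-step journey.
interface E2EPaths {
  reservedSlot: boolean; // may take the reserved E2E slot
  roiPath: boolean;      // may compete for the additional slot on ROI merit
  gapWarning: boolean;   // can trigger the E2E gap warning
}

function e2ePathsFor(classification: JourneyClass, roiScore: number): E2EPaths {
  const userFacing = classification === "user-facing";
  return {
    reservedSlot: userFacing,
    // Classification does not affect the ROI path; only the score matters.
    roiPath: roiScore > 50,
    gapWarning: userFacing,
  };
}

const wizard = e2ePathsFor("user-facing", 40);
const saga = e2ePathsFor("service-internal", 72);
```

With these inputs, the user-facing wizard is eligible for the reserved slot despite its low score, while the service-internal saga can only compete through the ROI path.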

ROI Calculation

ROI is used to rank candidates within the same test type (integration candidates against each other, E2E candidates against each other). Cross-type comparison is unnecessary because integration and E2E budgets are selected independently.

ROI Score = Business Value × User Frequency + Legal Requirement × 10 + Defect Detection
              (range: 0–120)

Higher ROI Score = higher priority within its test type. No normalization or capping is applied — the raw score is used directly for ranking. Deduplication is a separate step that removes candidates entirely; it does not modify scores.
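
A minimal sketch of the formula and the ranking step, assuming Legal Requirement is a boolean counted as 0 or 1 and Defect Detection is scored 0-10 (the latter is inferred from the stated 0-120 range rather than spelled out). The type and function names are illustrative.

```typescript
interface RoiInputs {
  businessValue: number;     // 0-10
  userFrequency: number;     // 0-10
  legalRequirement: boolean; // counted as 1 when true, 0 when false (assumption)
  defectDetection: number;   // 0-10 (assumption, inferred from the 0-120 range)
}

// ROI Score = Business Value x User Frequency + Legal Requirement x 10 + Defect Detection
function roiScore(c: RoiInputs): number {
  const legal = c.legalRequirement ? 1 : 0;
  return c.businessValue * c.userFrequency + legal * 10 + c.defectDetection;
}

// Rank candidates of one test type, highest raw score first (no normalization or capping).
function rankByRoi<T extends RoiInputs>(candidates: T[]): T[] {
  return [...candidates].sort((a, b) => roiScore(b) - roiScore(a));
}

const maxScore = roiScore({
  businessValue: 10,
  userFrequency: 10,
  legalRequirement: true,
  defectDetection: 10,
});
```

`maxScore` evaluates to 120, the upper end of the stated range.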

ROI Threshold for E2E

E2E tests have high ownership cost (creation, execution, and maintenance are each 3-10× higher than integration tests). To justify creation, an E2E candidate (beyond the must-keep reserved slot) requires ROI Score > 50.
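
The threshold and the reserved slot could be combined roughly as follows. This sketch assumes a single reserved slot that goes to the highest-scoring user-facing multi-step journey; the skill does not state how ties or multiple eligible journeys are resolved, and the names and sample data are hypothetical.

```typescript
interface E2ECandidate {
  name: string;
  roiScore: number;
  userFacingJourney: boolean; // user-facing multi-step journey, eligible for the reserved slot
}

const E2E_ROI_THRESHOLD = 50;

function selectE2E(candidates: E2ECandidate[]): E2ECandidate[] {
  const ranked = [...candidates].sort((a, b) => b.roiScore - a.roiScore);
  // Assumption: the reserved slot goes to the highest-ranked eligible journey.
  const reserved = ranked.find((c) => c.userFacingJourney);
  // At most one additional slot, and only for a score strictly above the threshold.
  const additional = ranked.find(
    (c) => c !== reserved && c.roiScore > E2E_ROI_THRESHOLD,
  );
  return [reserved, additional].filter(
    (c): c is E2ECandidate => c !== undefined,
  );
}

// Invented candidates for one feature.
const selected = selectE2E([
  { name: "Checkout", roiScore: 109, userFacingJourney: true },
  { name: "Payment error handling", roiScore: 31, userFacingJourney: false },
  { name: "Profile save", roiScore: 48, userFacingJourney: true },
]);
```

Only "Checkout" is selected here: it takes the reserved slot, and neither remaining candidate exceeds 50.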

ROI Calculation Examples

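| Scenario | BV | Freq | Legal | Defect | ROI Score | Test Type | Selection Outcome |
|----------|----|------|-------|--------|-----------|-----------|-------------------|
| Core checkout flow | 10 | 9 | true | 9 | 109 | E2E | Selected (reserved slot: user-facing multi-step journey) |
| Payment error handling | 8 | 3 | false | 7 | 31 | E2E | Below threshold (31 < 50), not selected |
| Profile save flow | 7 | 6 | false | 6 | 48 | E2E | Below threshold (48 < 50), not selected |
| DB persistence check | 8 | 8 | false | 8 | 72 | Integration | Selected (rank 1 of 3) |
| Error message display | 5 | 3 | false | 4 | 19 | Integration | Selected (rank 2 of 3) |
| Optional filter toggle | 3 | 4 | false | 2 | 14 | Integration | Not selected (rank 4, budget full) |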

Implementation Rules

Property-Based Test Implementation

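When a Property annotation exists, the fast-check library is required:

import fc from 'fast-check'
import { it } from 'vitest'

// `client` is the component under test, assumed to be constructed elsewhere in the test file.
it('AC2-property: Model name is always gemini-3-pro-image-preview', () => {
  fc.assert(
    // For every generated prompt string, the returned model name must stay fixed.
    fc.property(fc.string(), (prompt) => {
      const result = client.generate(prompt)
      return result.model === 'gemini-3-pro-image-preview'
    })
  )
})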

Requirements:

  • Write in fc.assert(fc.property(...)) format
  • Reflect the skeleton's // fast-check: comment directly in the implementation
  • When a failure case is discovered, add it as a concrete unit test (regression prevention)

Behavior Verification Implementation

Behavior Description Verification Levels:

| Step Type | Verification Target | Example |
|-----------|---------------------|---------|
| Trigger | Reproduce in Arrange | API failure -> mockResolvedValue({ ok: false }) |
| Process | Intermediate state or call | Function call, state change |
| Observable Result | Final output value | Return value, error message, log output |

Pass Criteria: The test passes if the "observable result" is verified through the test target's return value or through the arguments of a mock call it makes.
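
The sketch below walks one behavior (API failure -> save attempt -> error message) through Arrange, Act, and Assert. Everything in it is hypothetical: `saveProfile`, `ProfileApi`, and the hand-rolled recording stub are stand-ins chosen so the example runs without a test framework. In a Vitest suite the stub's role would typically be played by `vi.fn()`.

```typescript
// Hypothetical component under test: saves a profile through an injected API client.
interface ProfileApi {
  save(name: string): { ok: boolean };
}

function saveProfile(api: ProfileApi, name: string): string {
  const response = api.save(name);
  return response.ok ? "Profile saved" : "Could not save profile";
}

// Hand-rolled recording stub that always fails and remembers its call arguments.
function failingApi(): ProfileApi & { calls: string[] } {
  const calls: string[] = [];
  return {
    calls,
    save(name: string) {
      calls.push(name);
      return { ok: false }; // Trigger: the API failure is reproduced in Arrange
    },
  };
}

// Arrange
const api = failingApi();
// Act
const message = saveProfile(api, "Ada");
// Assert: the observable result is checked via the return value and the stub's call argument
const observableResultVerified =
  message === "Could not save profile" && api.calls[0] === "Ada";
```

Both checks in `observableResultVerified` target something the caller can observe, which is what the pass criteria ask for.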

Verification Item Determination Rules

| Skeleton State | Verification Item Determination Method |
|----------------|----------------------------------------|
| `// Verification items:` listed | Implement all listed items with expect |
| No `// Verification items:` | Derive from "observable result" in "Behavior" description |
| Both present | Prioritize verification items, use behavior as supplement |

Integration Test Mock Boundaries

| Judgment Criteria | Mock | Actual |
|-------------------|------|--------|
| Part of test target? | No -> Can mock | Yes -> Actual required |
| Is the call a verification target of the test? | No -> Can mock | Yes -> Actual or verifiable mock |
| External network communication? | Yes -> Mock required | No -> Actual recommended |

Judgment Flow:

  1. External API (HTTP communication) -> Mock required
  2. Component interaction under test -> Actual required
  3. Log output verification needed -> Use verifiable mock (vi.fn())
  4. Log output verification not needed -> Actual or ignore
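
The judgment flow can be expressed as a small decision function. The `DependencyKind` union and the `Treatment` labels are assumptions made for this sketch; only the four outcomes come from the flow above.

```typescript
// Hypothetical classification of a dependency seen by an integration test.
type DependencyKind =
  | { kind: "external-api" }          // HTTP communication with an external service
  | { kind: "component-under-test" }  // part of the interaction being verified
  | { kind: "logger"; outputVerified: boolean };

type Treatment = "mock" | "actual" | "verifiable-mock" | "actual-or-ignore";

function decideTreatment(dep: DependencyKind): Treatment {
  switch (dep.kind) {
    case "external-api":
      return "mock"; // 1. External API -> mock required
    case "component-under-test":
      return "actual"; // 2. Component interaction under test -> actual required
    case "logger":
      // 3 and 4. Loggers are mocked verifiably only when the test asserts on log output.
      return dep.outputVerified ? "verifiable-mock" : "actual-or-ignore";
  }
}

const paymentGateway = decideTreatment({ kind: "external-api" });
const orderService = decideTreatment({ kind: "component-under-test" });
const auditLogger = decideTreatment({ kind: "logger", outputVerified: true });
```

The three sample calls yield `"mock"`, `"actual"`, and `"verifiable-mock"` respectively.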

E2E Test Execution Conditions

  • Execute only after all components are implemented
  • Do not use mocks (@dependency: full-system)

Review Criteria

Skeleton and Implementation Consistency

| Check | Failure Condition |
|-------|-------------------|
| Property Verification | Property annotation exists but fast-check not used |
| Behavior Verification | No expect for "observable result" |
| Verification Item Coverage | Listed verification items not included in expect |
| Mock Boundary | Internal components mocked in integration test |

Implementation Quality

| Check | Failure Condition |
|-------|-------------------|
| AAA Structure | Arrange/Act/Assert separation unclear |
| Independence | State sharing between tests, execution order dependency |
| Reproducibility | Depends on date/random, results vary |
| Readability | Test name and verification content don't match |
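
For the Reproducibility check, one common remedy is to pass the clock in as a parameter so a test can pin it. The sketch below uses a hypothetical `isExpired` function to show the idea; it is one possible approach, not a requirement of the skill.

```typescript
// Hypothetical function under test: the clock is a parameter instead of a hidden Date call.
function isExpired(expiresAt: Date, now: () => Date = () => new Date()): boolean {
  return now().getTime() >= expiresAt.getTime();
}

// In a test, pass a fixed clock so the result never depends on when the suite runs.
const fixedNow = () => new Date("2024-01-01T00:00:00Z");

const expired = isExpired(new Date("2023-12-31T23:59:59Z"), fixedNow);
const stillValid = isExpired(new Date("2024-01-01T00:00:01Z"), fixedNow);
```

With the pinned clock, `expired` is always `true` and `stillValid` is always `false`, regardless of the real date.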