# Test Specification Generator Skill
Generates production-grade test specifications for any application through multi-agent exploration, research, and specialist generation.
## UX Contract

- User runs `/test-spec-gen [optional-filter]`
- Skill enters plan mode, presents approach, gets approval
- Executes multi-agent pipeline silently
- Presents test spec document for review
- Iterates with quick-clarify until satisfied
- Offers Trello card conversion
## Phase 0: Plan Mode Entry (CRITICAL)
**HARD GATE**: Do NOT proceed with any exploration, research, or generation until plan mode is approved.
Upon invocation:
- Read existing codebase to understand project type (web/desktop/mobile)
- Present the multi-agent approach with estimated agent count
- Ask user approval to proceed
- Only AFTER approval, begin the exploration phase
### Plan Mode Presentation Template

```markdown
# Test Specification Generator

I will generate a comprehensive test specification for this project using multi-agent orchestration:

## Discovery Phase (5 parallel agents)
- Agent 1: Routes & Pages exploration
- Agent 2: Auth & RBAC mapping
- Agent 3: Data & Backend analysis
- Agent 4: Framework & Config discovery
- Agent 5: Navigation & UX flows

## Research Phase
- /research-before-coding based on discovery findings

## Generation Phase (5 specialist agents)
- Domains determined from exploration + research
- Each generates TC-XXX formatted test cases

## Verification Phase
- Doubt agent review for maintenance burden prevention
- Traceability matrix generation

## Output
- Hermes-style test specification document
- Optional Trello card conversion via /trello-test

Proceed?
```
### EnterPlanMode Trigger

Call `EnterPlanMode()` immediately after the UX Contract section with this prompt:

```
Generate test specification for [project_name] using multi-agent orchestration:
1. Discovery: 5 parallel explore agents map codebase
2. Research: /research-before-coding for best practices
3. Generation: 5 specialist agents create TC-XXX test cases
4. Verification: Doubt agent review + traceability matrix
5. Output: Test spec doc + optional Trello conversion
```

Wait for user approval before proceeding to Phase 1.
## Phase 0.5: Dependency Pre-flight Check
After plan approval, BEFORE any exploration:
### Check Required MCP Servers

```bash
# Check if MCP servers are configured
grep -E "perplexity|context7|playwright|trello" ~/.claude/mcp_config.json || echo "MISSING"
```
### Check Test Frameworks
Detect project type and verify:
```bash
# Node.js projects
[ -f "package.json" ] && grep -E "(playwright|pytest|vitest|jest)" package.json || echo "NO TEST FRAMEWORK"

# Python projects
[ -f "pyproject.toml" ] && grep -E "(pytest|playwright)" pyproject.toml || echo "NO TEST FRAMEWORK"
```
### Auto-fix with Confirmation
For each missing dependency:
```
❌ Missing: [dependency]
Impact: [what breaks without it]
Auto-fix: [command to install]
Approve? (Y/n)
```
If user declines:
- Log the refusal
- Continue only if non-critical
- Refuse to proceed if critical blocker
### Critical vs Non-critical
Critical (must refuse without):
- MCP servers for exploration (file read tools)
- Basic test framework detection
Non-critical (can continue with warning):
- Trello API (only needed for card conversion)
- Visual regression tools
- Performance monitoring tools
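A minimal sketch of this gate, assuming the pre-flight checks produce a set of missing dependency names; `CRITICAL` and `preflight_gate` are hypothetical names used for illustration, not part of the skill's actual implementation:

```python
# Hypothetical gating logic for the pre-flight check; the dependency
# names and the preflight_gate helper are illustrative only.
CRITICAL = {"file-read-mcp", "test-framework"}

def preflight_gate(missing: set[str], declined: set[str]) -> bool:
    """Return True if the pipeline may proceed after the pre-flight check."""
    for dep in missing & declined:
        print(f"Logged refusal: {dep}")  # log the refusal
        if dep in CRITICAL:
            print(f"Refusing to proceed: {dep} is a critical blocker")
            return False  # hard stop
        print(f"Warning: continuing without non-critical {dep}")
    return True
```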
### MCP Discovery Commands

```bash
# Check MCP config directly
grep -E "perplexity|context7|playwright|trello" ~/.claude/mcp_config.json 2>/dev/null && echo "MCP servers configured" || echo "Some MCP servers missing"

# Alternative: check ~/.claude.json (Claude Code's actual registry)
grep -E "perplexity|context7|playwright|trello" ~/.claude.json 2>/dev/null | grep -q "mcp" && echo "MCP servers registered" || echo "Some MCP servers missing"
```
## Phase 1: Codebase Discovery (5 Parallel Agents)

Spawn 5 independent explore agents using the Task tool. All agents run in parallel with no shared state.
### Agent 1: Routes & Pages Explorer

```
Task(
  subagent_type: "Explore",
  model: "haiku",
  description: "Explore routes and pages",
  prompt: `You are a route and page discovery agent.

OBJECTIVE: Map ALL routes, pages, and UI components in this codebase.

OUTPUT FORMAT (JSON):
{
  "routes": [
    {"path": "/path", "component": "ComponentName", "file": "src/file.tsx", "type": "page|api|static"}
  ],
  "pages": [
    {"name": "Dashboard", "route": "/", "file": "src/pages/Dashboard.tsx", "key_elements": ["stats", "charts"]}
  ],
  "navigation": [
    {"from": "Sidebar", "to": "Dashboard", "label": "Home"}
  ]
}

SEARCH STRATEGY:
1. Find router configuration (App.tsx, router.ts, routes/)
2. Find page components (pages/, views/, screens/)
3. Find navigation components (Nav.tsx, Sidebar.tsx, Header.tsx)
4. Extract route-to-component mappings

Return ONLY the JSON. No explanation.`
)
```
### Agent 2: Auth & RBAC Explorer

```
Task(
  subagent_type: "Explore",
  model: "haiku",
  description: "Explore auth and RBAC",
  prompt: `You are an authentication and authorization discovery agent.

OBJECTIVE: Map ALL auth mechanisms, roles, permissions, and access controls.

OUTPUT FORMAT (JSON):
{
  "auth_mechanism": "jwt|session|oauth|none",
  "roles": ["admin", "user", "guest"],
  "permissions": ["create:resource", "read:resource"],
  "auth_files": ["src/middleware/auth.ts"],
  "protected_routes": [
    {"route": "/admin", "roles": ["admin"], "guard": "requireAuth"}
  ],
  "login_endpoint": "/api/auth/login"
}

SEARCH STRATEGY:
1. Find auth middleware (auth.ts, middleware/, guards/)
2. Find role definitions (roles.ts, permissions.ts)
3. Find protected route decorators/middleware
4. Find login/logout endpoints

Return ONLY the JSON. No explanation.`
)
```
### Agent 3: Data & Backend Explorer

```
Task(
  subagent_type: "Explore",
  model: "haiku",
  description: "Explore data and backend",
  prompt: `You are a data and backend discovery agent.

OBJECTIVE: Map ALL data models, API endpoints, and persistence mechanisms.

OUTPUT FORMAT (JSON):
{
  "database": "postgres|mongodb|sqlite|none",
  "orm": "prisma|drizzle|sequelize|none",
  "models": [
    {"name": "User", "fields": ["id", "email", "role"], "file": "models/User.ts"}
  ],
  "api_endpoints": [
    {"method": "GET", "path": "/api/users", "controller": "UserController"}
  ],
  "persistence_files": ["db/schema.prisma"]
}

SEARCH STRATEGY:
1. Find schema definitions (schema.prisma, models/, entities/)
2. Find API routes (api/, routes/, controllers/)
3. Find database config (db.ts, database.ts)
4. Find ORM usage (prisma., drizzle., sequelize.)

Return ONLY the JSON. No explanation.`
)
```
### Agent 4: Framework & Config Explorer

```
Task(
  subagent_type: "Explore",
  model: "haiku",
  description: "Explore framework and config",
  prompt: `You are a framework and configuration discovery agent.

OBJECTIVE: Map framework, build tooling, CI config, and environment setup.

OUTPUT FORMAT (JSON):
{
  "framework": "react|vue|svelte|next|nuxt|custom",
  "language": "typescript|javascript|python",
  "build_tool": "vite|webpack|rollup|none",
  "test_framework": "playwright|jest|vitest|pytest|none",
  "ci_system": "github-actions|gitlab-ci|none",
  "env_files": [".env", ".env.example"],
  "config_files": ["next.config.js", "vite.config.ts"]
}

SEARCH STRATEGY:
1. Check package.json for dependencies
2. Check for framework config files
3. Check .github/workflows/ for CI
4. Check for .env files

Return ONLY the JSON. No explanation.`
)
```
### Agent 5: Navigation & UX Explorer

```
Task(
  subagent_type: "Explore",
  model: "haiku",
  description: "Explore navigation and UX",
  prompt: `You are a navigation and user experience discovery agent.

OBJECTIVE: Map navigation flows, state management, and user interactions.

OUTPUT FORMAT (JSON):
{
  "navigation_structure": {
    "main": ["Dashboard", "Settings"],
    "sidebar": ["Items", "Reports"]
  },
  "state_management": "redux|zustand|context|none",
  "key_flows": [
    {"name": "Login", "steps": ["Enter credentials", "Submit", "Redirect"]}
  ],
  "interactive_elements": ["forms", "modals", "dropdowns"]
}

SEARCH STRATEGY:
1. Find navigation components (Nav, Sidebar, Menu)
2. Find state management (store.ts, context/, redux/)
3. Find form components (Form.tsx, Input.tsx)
4. Find modal/dialog components

Return ONLY the JSON. No explanation.`
)
```
### Aggregate Discovery Results

After all 5 agents complete, aggregate their JSON outputs into a single discovery document:

```markdown
## Discovery Summary
### Routes & Pages: [count] routes mapped
### Auth & RBAC: [mechanism] with [count] roles
### Data & Backend: [database] with [orm]
### Framework: [framework] with [test_framework]
### Navigation & UX: [state_management] with [count] key flows
```
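A minimal aggregation sketch, assuming each explore agent returns exactly the JSON shape defined in its prompt; the helper name and error handling are illustrative:

```python
import json

def aggregate_discovery(agent_outputs: list[str]) -> dict:
    """Merge the five agents' JSON strings into one discovery dict.

    A failed or malformed output is skipped so partial discovery
    still yields a summary (see Error Handling below).
    """
    discovery: dict = {}
    for raw in agent_outputs:
        try:
            discovery.update(json.loads(raw))
        except json.JSONDecodeError:
            continue  # flag the missing domain downstream
    return discovery

# e.g. the first Discovery Summary line:
# f"### Routes & Pages: {len(discovery.get('routes', []))} routes mapped"
```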
## Phase 2: Targeted Research

After discovery, invoke `/research-before-coding` with context from all 5 explore agents.
### Research Agent Prompt

```
Task(
  subagent_type: "general-purpose",
  model: "haiku",
  description: "Research test generation best practices",
  prompt: `You are a research distillation agent for test generation.

CONTEXT FROM DISCOVERY:
Framework: {framework}
Auth: {auth_mechanism}
Database: {database}
Test Framework: {test_framework}

RESEARCH TOPIC: Test generation best practices for {framework} applications

CALLER NEEDS TO KNOW:
1. What are the test patterns for {framework}?
2. How to handle {auth_mechanism} in tests?
3. What are the common pitfalls for {database} testing?
4. What test coverage is industry standard for this stack?

LIBRARIES IN SCOPE: {framework}, {test_framework}, {orm}

## Step 1: Perplexity Research (3 queries)
perplexity_batch(queries: [
  {"query": "best practices {framework} {test_framework} testing 2026", "mode": "auto", "id": "practices"},
  {"query": "{framework} architecture patterns test automation", "mode": "auto", "id": "architecture"},
  {"query": "github {framework} {test_framework} examples production", "mode": "auto", "id": "examples"}
])

## Step 2: Context7 Documentation Queries (3 queries)
1. resolve-library-id("{framework}") -> query-docs(id, "testing fixtures patterns")
2. resolve-library-id("{test_framework}") -> query-docs(id, "test organization structure")
3. resolve-library-id("{orm}") -> query-docs(id, "database testing patterns")

## Step 3: Divergent WebSearch (exactly 1)
WebSearch("problems with {framework} automated testing flaky tests alternatives")

## Step 4: Distill
Produce this output format:

---
## Research: {Framework} Test Generation

### Recommended Approach
- **Pattern**: [pattern name]
- **Tools**: [recommended tools]
- **Why**: [rationale]

### Key Test Patterns
[code pattern]

### Pitfalls to Avoid
- [pitfall 1]
- [pitfall 2]

### Test Domains to Cover
Based on research, these test domains are recommended:
1. [Domain 1]
2. [Domain 2]
3. [Domain 3]
4. [Domain 4]
5. [Domain 5]
---

Return ONLY this distilled output.`
)
```
### Determine Test Domains

From the research output, extract the 5 test domains for specialist agents (a parsing sketch follows the fallback list below).

Fallback domains if research doesn't specify:
- UI Component Testing
- User Flow Testing
- Error Handling & Edge Cases
- Data Integrity & CRUD
- Access Control & Security
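A sketch of the extraction step, assuming the research output follows the "### Test Domains to Cover" format defined above; the regex-based parsing is an illustration, not the skill's mandated mechanism:

```python
import re

# Fallback domains, mirroring the list above
FALLBACK_DOMAINS = [
    "UI Component Testing",
    "User Flow Testing",
    "Error Handling & Edge Cases",
    "Data Integrity & CRUD",
    "Access Control & Security",
]

def extract_domains(research_output: str) -> list[str]:
    """Pull the numbered domain list from the research distillation."""
    section = research_output.split("### Test Domains to Cover")[-1]
    domains = re.findall(r"^\d+\.\s+(.+)$", section, flags=re.MULTILINE)
    return domains[:5] if len(domains) >= 5 else FALLBACK_DOMAINS
```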
## Phase 3: Test Specification Generation
Spawn 5 specialist agents (one per test domain from research). Each generates TC-XXX formatted test cases.
### Specialist Agent Template
For each domain, spawn an agent with this structure:
```
Task(
  subagent_type: "general-purpose",
  model: "sonnet",
  description: "Generate {domain} test cases",
  prompt: `You are a {domain} test specification specialist.

DISCOVERY CONTEXT:
{paste relevant discovery JSON}

RESEARCH CONTEXT:
{paste relevant research findings}

YOUR DOMAIN: {domain_name}

OBJECTIVE: Generate comprehensive test cases for the {domain_name} domain following Hermes TC-XXX format.

OUTPUT FORMAT (Markdown):

## TC-{XXX}: [Test Name]
- **Area**: {domain_name}
- **Priority**: [Critical|High|Medium|Low]
- **Preconditions**:
  - [condition 1]
  - [condition 2]

### Test Steps
1. [action with selector/endpoint]
2. [verification step]
3. [assertion]

### Expected Outcome
[clear description of expected result]

### Pass Criteria
- [specific condition 1]
- [specific condition 2]

### Performance Threshold (if applicable)
- [metric]: [value]
- Timeout: [ms]

### Notes
- [any special considerations]
---

GENERATION RULES:
1. Each test must be independently executable
2. Use specific selectors/endpoints from discovery
3. Include negative tests (error cases)
4. Cover edge cases for this domain
5. Reference specific files/components from discovery

Generate 5-15 test cases for this domain. Return ONLY the markdown.`
)
```
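A sketch of how the template above might be instantiated per domain before the five Task() calls are issued in parallel; `SPECIALIST_TEMPLATE` is abbreviated and the helper name is hypothetical:

```python
# Abbreviated stand-in for the full prompt template shown above
SPECIALIST_TEMPLATE = (
    "You are a {domain} test specification specialist.\n\n"
    "DISCOVERY CONTEXT:\n{discovery}\n\n"
    "RESEARCH CONTEXT:\n{research}\n\n"
    "YOUR DOMAIN: {domain}\n"
    # ...remainder of the template above...
)

def build_specialist_prompts(domains: list[str], discovery: str, research: str) -> list[str]:
    """One prompt per domain; each backs one parallel Task() call."""
    return [
        SPECIALIST_TEMPLATE.format(domain=d, discovery=discovery, research=research)
        for d in domains
    ]
```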
### Domain-Specific Guidelines
UI Component Testing:
- Focus on visibility, interaction, responsiveness
- Test all component states (loading, error, success, empty)
- Include accessibility checks
User Flow Testing:
- Cover complete user journeys
- Test happy path + alternate paths
- Include flow interruptions and recovery
Error Handling & Edge Cases:
- Network failures
- Invalid inputs
- Boundary conditions
- Concurrent operations
Data Integrity & CRUD:
- Create, Read, Update, Delete operations
- Data validation
- Foreign key relationships
- Transaction rollback
Access Control & Security:
- Role-based access
- Unauthorized access attempts
- Session management
- CSRF/XSS considerations
## Phase 4: Document Assembly
After all 5 specialist agents complete:
- Aggregate all TC-XXX outputs
- Load the test-spec.md template
- Replace placeholders:
  - `{PROJECT_NAME}` -> from discovery
  - `{DATE}` -> current date
  - `{FRAMEWORK}`, `{LANGUAGE}`, `{TEST_FRAMEWORK}` -> from discovery
- Insert specialist outputs into the `{SPECIALIST_OUTPUTS}` section
- Generate traceability matrix from TC references
- Save to `docs/test-specifications/{PROJECT_NAME}-test-spec.md`
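A minimal assembly sketch, assuming the discovery dict carries the keys shown; the `project_name` key is a hypothetical addition (only `framework`, `language`, and `test_framework` appear in the discovery outputs above):

```python
from datetime import date
from pathlib import Path

def assemble_spec(template: str, discovery: dict, specialist_outputs: list[str]) -> Path:
    """Fill the test-spec.md template; placeholder names match the list above."""
    project = discovery["project_name"]  # hypothetical key, set during discovery
    filled = (
        template
        .replace("{PROJECT_NAME}", project)
        .replace("{DATE}", date.today().isoformat())
        .replace("{FRAMEWORK}", discovery.get("framework", "unknown"))
        .replace("{LANGUAGE}", discovery.get("language", "unknown"))
        .replace("{TEST_FRAMEWORK}", discovery.get("test_framework", "none"))
        .replace("{SPECIALIST_OUTPUTS}", "\n\n".join(specialist_outputs))
    )
    out = Path(f"docs/test-specifications/{project}-test-spec.md")
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(filled)
    return out
```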
## Phase 5: Hierarchical Verification
Following Anthropic's testing practices, implement two-layer verification.
### Doubt Agent Review

```
Task(
  subagent_type: "general-purpose",
  model: "sonnet",
  description: "Review test spec for maintenance burden",
  prompt: `You are a doubt agent reviewing a generated test specification.

YOUR ROLE: Ruthlessly critique the test specification for:
1. Maintenance burden - will these tests break constantly?
2. Business logic validity - do tests actually verify real requirements?
3. Coverage gaps - what's missing?
4. Brittleness - what assumptions will fail?

TEST SPECIFICATION:
{paste full test spec}

OUTPUT FORMAT:

## Doubt Agent Review

### Critical Issues (must fix)
- [issue 1] with recommendation
- [issue 2] with recommendation

### Maintenance Risk Assessment
- **Risk Level**: [High|Medium|Low]
- **Projected Invalidation Rate**: [%] over 6 months
- **Mitigation Recommendations**: [list]

### Coverage Gaps
- [missing domain/feature]
- [missing edge case]

### Over-testing (remove to reduce burden)
- [redundant test]
- [too-specific test]

### Approval
[ ] APPROVED - Proceed to quick-clarify
[ ] REVISE - Address critical issues first`
)
```
### Finality Agent Verification (after user approval)

```
Task(
  subagent_type: "general-purpose",
  model: "sonnet",
  description: "Final verification of test spec",
  prompt: `You are a finality agent. Verify the test specification is production-ready.

CHECKLIST:
- [ ] All TC-XXX tests have unique IDs
- [ ] Preconditions are complete and achievable
- [ ] Expected outcomes are unambiguous
- [ ] Pass criteria are objectively verifiable
- [ ] Performance thresholds include units
- [ ] Traceability matrix is complete
- [ ] Exit criteria are measurable
- [ ] Doubt agent issues addressed

FINAL DECISION:
[ ] VERIFIED - Ready for Trello conversion
[ ] INCOMPLETE - Specify remaining issues

Test Specification:
{paste test spec}`
)
```
## Phase 6: Quick-Clarify Iteration
After verification, present the test specification to the user for review.
### Invoke Quick-Clarify

```
Skill("quick-clarify")
```

With this context:

```
Test specification generated for {PROJECT}.
{TOTAL_TESTS} test cases across {DOMAIN_COUNT} domains.

Key areas:
- {DOMAIN_1}: {COUNT} tests
- {DOMAIN_2}: {COUNT} tests
- {DOMAIN_3}: {COUNT} tests
- {DOMAIN_4}: {COUNT} tests
- {DOMAIN_5}: {COUNT} tests

Document saved to: {SPEC_PATH}
```
### Iteration Loop
If user requests changes:
- Identify which section/domain to modify
- Re-run ONLY the affected specialist agent
- Merge new output with existing document
- Re-run doubt agent verification
- Present updated version
Repeat until user satisfied.
### Loop Exit Conditions
User indicates satisfaction by:
- Explicit approval ("looks good", "approved")
- Choosing to proceed to Trello conversion
- No further modifications requested
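A small sketch of the exit check, assuming satisfaction is detected from the user's reply text; beyond "looks good" and "approved" (taken from the conditions above), the phrase list is an assumption:

```python
# Illustrative approval phrases; only "looks good" and "approved" come
# from the exit conditions above, the rest are assumptions.
APPROVAL_PHRASES = ("looks good", "approved", "lgtm", "proceed to trello")

def is_satisfied(user_reply: str) -> bool:
    """Heuristic exit check for the quick-clarify iteration loop."""
    return any(phrase in user_reply.lower() for phrase in APPROVAL_PHRASES)
```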
## Phase 7: Trello Card Conversion (Optional)
After user approves the test specification:
### Ask User

```
Test specification approved. Convert to Trello cards?

Options:
- "Yes" -> Create Trello cards via /trello-test
- "No" -> Exit with test spec document only
- "Show mapping first" -> Preview card structure
```
### Card Mapping Strategy
Each TC-XXX test case becomes one Trello card with:
```
Card Name: "Test: TC-XXX - [Test Name]"

Description:
## TC-{XXX}: [Test Name]
**Area**: {domain}
**Priority**: [Critical|High|Medium|Low]

### Preconditions
- [list]

### Test Steps
1. [step]
2. [step]

### Expected Outcome
[outcome]

### Pass Criteria
- [criteria]

Checklist:
- [ ] Test implemented
- [ ] Test passes locally
- [ ] Test passes in CI
- [ ] Documentation updated
```
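A sketch of the mapping, assuming each TC-XXX entry has been parsed into a dict with the fields shown; the key names and payload shape are illustrative, not the Trello API's actual schema:

```python
def tc_to_card(tc: dict) -> dict:
    """Map one parsed TC-XXX entry to a card payload for /trello-test."""
    steps = "\n".join(f"{i}. {s}" for i, s in enumerate(tc["steps"], 1))
    return {
        "name": f"Test: {tc['id']} - {tc['name']}",
        "desc": (
            f"## {tc['id']}: {tc['name']}\n"
            f"**Area**: {tc['domain']}\n"
            f"**Priority**: {tc['priority']}\n\n"
            f"### Test Steps\n{steps}\n\n"
            f"### Expected Outcome\n{tc['expected_outcome']}"
        ),
        "checklist": [
            "Test implemented",
            "Test passes locally",
            "Test passes in CI",
            "Documentation updated",
        ],
        "list": "Untested",  # new cards start in the Untested list
    }
```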
### Invoke Trello-Test Skill

```
Skill("trello-test")
```

Pass the parsed test cases for card creation.
### Board Structure
If new board needed:
- List: "Untested" - All new cards
- List: "In Progress" - Cards being implemented
- List: "Failed" - Cards with failing tests
- List: "Passed" - Cards with passing tests
- List: "Partial" - Cards with partial implementation
## Error Handling

### Agent Failures
If an explore agent fails:
- Log the failure with agent ID and error
- Continue with other agents
- Use partial discovery data
- Flag missing domains in output
If a specialist agent fails:
- Log the failure
- Create placeholder TC-XXX: "TODO: {domain} tests - generation failed"
- Document the error
- Continue with other domains
### Refusal Conditions

The skill MUST refuse to proceed if:

- **No Runnable App Detected**
  - No framework config found
  - No source code detected
  - Output: "Cannot generate tests - no application detected"
- **Auth Cannot Be Stabilized**
  - Multiple conflicting auth mechanisms
  - No clear auth pattern
  - Output: "Cannot generate tests - auth pattern unclear"
- **Data Layer Inaccessible**
  - No database connection info
  - ORM models not discoverable
  - Output: "Cannot generate tests - data layer unclear"
- **User Cancels**
  - At any phase, user may cancel
  - Save partial progress if requested
### Recovery
If generation is interrupted:
- Save completed sections to temp file
- Offer to resume or restart
- Log interruption point
## Traceability Matrix Generation
After test case generation:
- Parse all TC-XXX entries
- Extract component/file references
- Create requirement-to-test mapping
- Generate traceability matrix
- Include in test spec appendix
### Parse Template

```python
# Pseudo-code for traceability generation: build one row per parsed
# TC-XXX entry, defaulting unknown fields to "TBD"
traceability_rows = []
for test_case in test_cases:
    row = {
        "test_case": test_case.id,
        "requirement": test_case.requirement or "TBD",
        "component": test_case.component,
        "file": test_case.file,
        "line": test_case.line or "TBD",
        "status": "pending",
        "last_run": "-",
    }
    traceability_rows.append(row)
```
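As a follow-up, a sketch of rendering those rows into the appendix table; the column layout is an assumption, since the source does not show the final matrix format:

```python
def render_matrix(rows: list[dict]) -> str:
    """Render traceability rows as a markdown table for the spec appendix."""
    header = "| Test Case | Requirement | Component | File | Line | Status | Last Run |"
    divider = "|---|---|---|---|---|---|---|"
    body = [
        f"| {r['test_case']} | {r['requirement']} | {r['component']} "
        f"| {r['file']} | {r['line']} | {r['status']} | {r['last_run']} |"
        for r in rows
    ]
    return "\n".join([header, divider, *body])
```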
## User Experience

### Progress Reporting
Report progress at each phase completion:
```
✓ Discovery complete: 5 agents, X routes mapped
✓ Research complete: Best practices for {framework} identified
✓ Test domains determined: {domain1}, {domain2}, {domain3}, {domain4}, {domain5}
⏳ Generating tests: 2/5 specialist agents complete...
✓ Generation complete: {TOTAL} test cases generated
⏳ Verifying: Doubt agent review in progress...
✓ Verification complete: 0 critical issues
✓ Test spec saved to: {PATH}
```
### Silent Execution
- Don't narrate internal steps
- Don't announce "Starting Phase X"
- User sees ONLY progress checkpoints and final output
- All agent spawning happens silently
### Error Messages
Clear, actionable error messages:
```
❌ Discovery Failed: Agent 3 (Data & Backend) crashed
Cause: [error from agent]
Impact: Database-related tests may be incomplete
Action: Continuing with partial discovery...
```
## Usage Examples

### Basic invocation

```
/test-spec-gen
```

### With filter

```
/test-spec-gen --filter "auth"
```

### Specific domains

```
/test-spec-gen --domains "ui,flows,errors"
```

### Skip Trello prompt

```
/test-spec-gen --no-trello
```