Crucible spec

Use when you have a GitHub epic (or equivalent) with child tickets and want to autonomously produce design docs, implementation plans, and machine-readable contracts for each ticket without human interaction. Triggers on /spec, 'spec out', 'write specs for', 'spec this epic'.

Install

Source: clone the upstream repo:

    git clone https://github.com/raddue/crucible

Claude Code: install into ~/.claude/skills/:

    T=$(mktemp -d) && git clone --depth=1 https://github.com/raddue/crucible "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/spec" ~/.claude/skills/raddue-crucible-spec && rm -rf "$T"

Manifest: `skills/spec/SKILL.md`

Autonomous Spec Writer

Overview

<!-- CANONICAL: shared/dispatch-convention.md -->

All subagent dispatches use disk-mediated dispatch. See `shared/dispatch-convention.md` for the full protocol.

Fully autonomous skill that takes a GitHub epic (or equivalent issue tracker artifact), processes child tickets without human interaction, and produces complete design docs + implementation plans + machine-readable contracts per ticket. Designed to run unattended while a separate agent (or human) handles implementation.

The core insight: Separate the cognitive work (design, investigation, decision-making, planning) from the execution work (implementation, testing). One agent specs autonomously, another builds. The spec agent requires no human input after the initial invocation -- it investigates the codebase, makes design decisions, documents its reasoning, and flags uncertainty via terminal alerts rather than blocking on human answers. Contracts solve the hard problem of two async agents communicating through prose -- prose is ambiguous, contracts make inter-ticket interfaces structural and verifiable.

Invocation:

/spec https://github.com/org/repo/issues/123
/spec PROJ-456    # if Jira/Atlassian MCP is available

Announce at start: "I'm using the spec skill to autonomously produce design docs, implementation plans, and contracts for this epic."

Communication Requirement (Non-Negotiable)

After each wave completes and after each ticket within a wave reports back, output a status update to the terminal. This is NOT optional -- the user cannot see agent activity without your narration.

Every status update must include:

  1. Current wave -- Which wave is in progress or just completed
  2. Tickets completed / remaining -- Counts for the current wave and overall
  3. Alerts emitted -- Any medium/low/block confidence decisions from that wave
  4. Re-queued tickets -- Any tickets moved to a later wave due to dependency discovery

After compaction: If you just experienced context compaction, follow the Compaction Recovery procedure, re-read state from the scratch directory, and output current status before continuing. Do NOT proceed silently.

Example of GOOD narration:

"Wave 2 complete. 3/3 tickets committed. 1 medium-confidence alert on #45 (chose Redis over Postgres for session store). #67 re-queued to Wave 3 (new dependency on #45 discovered). Overall: 7/12 tickets done, 5 remaining across 2 waves."

Anti-Rationalization Table — spec

| Rationalization | Rebuttal | Rule |
|---|---|---|
| "This ticket is small, I can skip the per-document quality gate." | Ticket size does not predict specification defects. Small tickets frequently hide ambiguity that only the quality gate surfaces. | Run the quality gate on every design doc and every implementation plan, regardless of ticket size. |
| "All per-document gates passed, the integration check is unnecessary." | A per-document PASS does not imply cross-ticket consistency. Contracts drift between tickets even when each is individually clean. | Always run the end-of-run integration quality gate after the per-ticket gates pass. |
| "The ticket body looks clear, I can skip the investigation and go straight to writing." | Ticket bodies consistently under-specify. Without investigation, autonomous decisions are made without grounding and surface as block-confidence alerts downstream. | Investigate the codebase before writing any design content; cite investigation artifacts in the design doc. |
| "The decision looks obvious, I'll record it as high-confidence without alternatives." | Listing alternatives is a forcing function for honest confidence calibration. Skipping it hides the fact that no alternatives were considered. | Every decision logs ≥1 alternative or is explicitly marked `no-alternatives: true` with justification. |
| "I can save state in context memory instead of the scratch directory." | Context is lost on compaction. Scratch-directory state is load-bearing for recovery. | Every orchestrator state change writes to the scratch directory before narrating. |
| "The user's general 'looks good' counts as approval to skip a gate." | Only an unambiguous instruction specifically referencing the gate is skip approval. | Record `Status: SKIPPED` only after an explicit gate-referencing skip instruction. |

Pipeline Status

Write a status file to `~/.claude/projects/<hash>/memory/pipeline-status.md` at every narration point. This file is overwritten (not appended) and provides ambient awareness for the user in a second terminal.

Write Triggers

Write the status file at every point where the Communication Requirement mandates narration: before dispatch, after completion, phase transitions, health changes, escalations, and after compaction recovery.

Status File Format

The status file uses this structure (overwritten in full each time):

# Pipeline Status
**Updated:** <current timestamp>
**Started:** <timestamp from first write — persisted across compaction>
**Skill:** spec
**Phase:** <current phase, e.g. "Wave 2 (3/4 tickets in progress)">
**Health:** <GREEN|YELLOW|RED>
**Suggested Action:** <omit when GREEN; concrete one-sentence action when YELLOW/RED>
**Elapsed:** <computed from Started>

## Recent Events
- [HH:MM] <most recent event>
- [HH:MM] <previous event>
(last 5 events, newest first)

Skill-Specific Body

Append after the shared header:

## Tickets
- Wave 1: 3/3 complete
- Wave 2: 2/4 in progress (#45 writing, #67 investigating)
- Alerts: 1 medium-confidence on #45

## Compression State
Goal: [epic URL and description]
Key Decisions:
- [accumulated decisions, max 10]
Active Constraints:
- [dependency constraints, re-queued tickets]
Next Steps:
1. [immediate next action]
2. [subsequent actions]

Health State Machine

Health transitions are one-directional within a phase: GREEN -> YELLOW -> RED. Phase boundaries reset to GREEN.

  • Phase boundaries (reset to GREEN): each new wave
  • YELLOW: ticket re-queued more than once, teammate failure on a ticket, medium-confidence alert
  • RED: 2+ tickets failed in same wave, unresolvable dependency cycle, block-confidence alert

When health is YELLOW or RED, include **Suggested Action:** with a concrete, context-specific sentence (e.g., "Ticket #45 re-queued twice — may have an unresolvable dependency. Check dependency graph.").

Inline CLI Format

Output concise inline status alongside the status file write:

  • Minor transitions (dispatch, completion): a one-liner, e.g. `Wave 2 [7/12] #45 spec committed | GREEN | 45m`
  • Phase changes and escalations: an expanded block with `---` separators
  • Health transitions: always expanded, showing the old -> new health

Compaction Recovery

After compaction, before re-writing the status file:

  0. Read the `## Compression State` section from `pipeline-status.md` — recover Goal, Key Decisions, Active Constraints, and Next Steps. If absent, skip to step 1.
  1. Read the rest of `pipeline-status.md` to recover the `Started` timestamp and the `Recent Events` buffer.
  2. Reconstruct phase, health, and the skill-specific body from internal state files.
  3. Emit a Compression State Block into the conversation to seed the new context window.
  4. Write the updated status file.
  5. Output inline status to the CLI.

Epic Extraction

GitHub has no first-class "epic with child tickets" API. Use a fallback chain to extract scope units from the provided issue:

Extraction Strategy (ordered fallback)

  1. Sub-issues via GraphQL: Query the `trackedIssues` field on the issue. If the issue has sub-issues, use them as scope units. (A sketch query follows this list.)
  2. Task list checkboxes: Parse the issue body for task list items (`- [ ]` / `- [x]`) that reference issues via `#NNN` syntax. Extract the referenced issue numbers as scope units.
  3. Body issue references: Parse the issue body for any GitHub issue URLs (`https://github.com/.../issues/NNN`) or `#NNN` references not inside task lists. Extract them as scope units.
  4. Manual identification: If none of the above yield scope units, present the full issue body to the user and ask: "I couldn't find discrete child tickets. Can you identify the scope units for this work? You can provide issue numbers, paste URLs, or describe the work items and I'll create tickets for each."
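As a sketch of step 1, assuming an authenticated `gh` CLI and the `trackedIssues` field on GitHub's GraphQL Issue type (`org`, `repo`, and `123` are placeholders):

```bash
# List the sub-issues tracked by the epic via the GraphQL trackedIssues field.
gh api graphql -F owner=org -F repo=repo -F number=123 -f query='
  query($owner: String!, $repo: String!, $number: Int!) {
    repository(owner: $owner, name: $repo) {
      issue(number: $number) {
        trackedIssues(first: 50) {
          nodes { number title state }
        }
      }
    }
  }'
```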

Handling No Discrete Tickets

If the epic represents a single monolithic piece of work (no children, user confirms it's one unit), process it as a single-ticket run: one investigation, one design doc, one contract. The orchestration flow still applies, just with a single item in the queue.

Scratch Directory

All orchestrator and teammate state is persisted to `~/.claude/projects/<project-hash>/memory/spec/scratch/<run-id>/`. For the full file-by-file schema (shared files, per-ticket directories, JSON shapes), reconciliation rules, stale-cleanup policy, and project-hash recovery, see `scratch-directory.md`.

Context Budget Management

Processing 5+ tickets will exhaust the orchestrator's context window. The skill uses cascading context compression:

Preemptive Context Checkpoint

The orchestrator triggers a planned save-and-compact cycle: compact after every 2 waves, or after any single wave that contained 4+ tickets. These thresholds are tied to the amount of work processed rather than unreliable context capacity estimates.

When a checkpoint triggers:

  1. Persist all current state to the scratch directory (ticket statuses, dependency graph, wave schedule, decisions log).
  2. Emit a Compression State Block into the conversation capturing Goal, accumulated decisions, active constraints, and next steps.
  3. Trigger compaction explicitly between waves rather than hitting mid-ticket compaction.
  4. After compaction, recover state via the Compaction Recovery procedure below.
  5. Resume processing with the next wave.

This prevents mid-ticket compaction, which wastes partial investigation work. The checkpoint always occurs at a clean boundary between waves.

Per-Ticket Context Lifecycle

  1. Before ticket investigation: Read only the ticket body, dependency graph, and upstream contracts relevant to this ticket from the scratch directory. Do not load prior tickets' full investigation results into context.
  2. During investigation: Run investigation agents as sub-agents (Agent tool). They return summaries, not full search results.
  3. After ticket completion: Write all outputs to disk (design doc, contract, status update). Compress the ticket's context contribution to a single-paragraph summary appended to `decisions.md`. Release the full investigation context.

Ticket Complexity Triage

Teammates run as sub-agents with their own context windows. Complex tickets can exhaust a teammate's context before investigation completes. Mitigate by triaging complexity before dispatch:

  1. Complexity signal: Count the number of design dimensions requiring investigation (inferred from ticket body + upstream dependency count + codebase area size from cartographer). If a ticket has 5+ design dimensions or 3+ upstream contracts to consume, flag it as "complex."
  2. Simplified investigation for complex tickets: Complex tickets use quick-scan investigation for ALL dimensions (read recon brief per dimension instead of 3-agent deep dive), with more aggressive summarization. The teammate's task description includes: "This ticket is flagged as complex. Use quick-scan investigation for all dimensions. Summarize each finding to 2-3 sentences before proceeding to the next dimension."
  3. Two-phase split for very large tickets: If a ticket has 8+ design dimensions, the orchestrator splits investigation into two phases with an intermediate disk persist. Phase A investigates the first half of the dimensions, writes findings to `tickets/<ticket-number>/partial-investigation.md`, and completes. Phase B reads the partial investigation from disk, investigates the remaining dimensions, and proceeds to writing. This doubles the effective context budget at the cost of one extra sub-agent dispatch.

Compaction Recovery

After context compaction:

  0. Read `## Compression State` from `pipeline-status.md` — recover Goal, Key Decisions, Active Constraints, and Next Steps. If absent, skip to step 1.
  1. Read `scratch/<run-id>/invocation.md` first -- recover the epic URL, extraction method, and user preferences.
  2. Read `scratch/<run-id>/ticket-status.json` -- determine which tickets are complete, in-progress, or pending.
  3. Read `scratch/<run-id>/wave-schedule.json` -- recover the current wave schedule.
  4. Read `scratch/<run-id>/dependency-graph.json` -- recover the current dependency DAG.
  5. Read `scratch/<run-id>/decisions.md` -- recover the decision log for context cascading to the remaining tickets.
  6. For any ticket with status `investigating`, `dependency-check`, `writing`, or `validating`: restart from the beginning of its current phase.
  7. Emit a Compression State Block into the conversation to seed the new context window.
  7.5. Read the session index summary (supplementary): If the CSB Scratch State contains a `Session Index:` path, or if globbing `~/.claude/projects/<hash>/memory/session-index/*/summary.md` finds a recent file, read `summary.md`. Include the Activity Timeline, Files Modified, and Key Decisions sections in the post-compaction narration. If no session index exists, skip silently — this step is purely additive.
  8. Resume processing from the wave schedule, skipping completed/committed tickets.

Checkpoint Timing

Emit a Compression State Block at:

  • Wave boundaries: After each wave completes, before starting the next
  • Preemptive context checkpoints: After every 2 waves, or after any single wave with 4+ tickets
  • Ticket re-queues: When tickets are re-queued to later waves due to dependency discovery
  • Escalations: Before any escalation to user
  • Health transitions: On any GREEN->YELLOW or YELLOW->RED transition

Pipeline-Active Marker

Before any dispatch work, check for a crashed prior spec session:

  1. Check `<scratch>/.pipeline-active` (where `<scratch>` is `~/.claude/projects/<hash>/memory/`).
  2. Not found: Write the pipeline-active marker (JSON with `pipeline_id` set to the current session ID, `skill` set to `"spec"`, `phase` set to `"init"`, `start_time` set to the current ISO-8601 timestamp, `scratch_dir` and `dispatch_dir` paths, `branch` from `git branch --show-current`, and `baseline_sha` from `git rev-parse HEAD`). Proceed to Orchestration Flow.
  3. Found, same `pipeline_id`: Compaction recovery (existing behavior). Do not re-write the marker.
  4. Found, different `pipeline_id`: The previous spec session crashed. Check the marker's `branch` against the current branch — if they mismatch, warn the user which branch the crashed session was on. Present to user: "Previous spec session on branch [marker.branch] crashed. Start fresh? [yes]" Delete the stale marker. Write a fresh marker. Proceed to Orchestration Flow. (Full replay orchestration for spec is deferred -- detection and cleanup only for now.)

Marker cleanup: Delete `.pipeline-active` after the summary report (step [12]) completes.
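For illustration, a marker written at step 2 might look like the following — the field names come from the list above, the values are placeholders, and the exact shape (notably the `dispatch_dir` path) is whatever the orchestrator actually writes:

```json
{
  "pipeline_id": "<current-session-id>",
  "skill": "spec",
  "phase": "init",
  "start_time": "2026-03-21T14:02:07Z",
  "scratch_dir": "~/.claude/projects/<hash>/memory/spec/scratch/<run-id>/",
  "dispatch_dir": "~/.claude/projects/<hash>/memory/dispatch/<run-id>/",
  "branch": "spec/123",
  "baseline_sha": "<output of git rev-parse HEAD>"
}
```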

Orchestration Flow

/spec <epic-url>
  |
  +-- [1] Consult cartographer (once) + forge feed-forward (once)
  |
  +-- [2] Fetch epic, extract child tickets (fallback chain)
  |
  +-- [3] Read ALL tickets upfront
  |
  +-- [4] Content-analyze tickets, infer dependency graph
  |
  +-- [5] Build wave schedule from dependency graph
  |       Group independent tickets into waves. Within a wave,
  |       all tickets are guaranteed to have no cross-dependencies.
  |
  +-- [6] Present execution plan to user
  |       "Wave 1: #1, #3, #5 (independent). Wave 2: #2 (depends on #1), #4..."
  |
  +-- [7] Ask: "Auto-create a PR for the epic, or just commit to the branch?"
  |
  +-- [8] Persist initial state to scratch directory
  |       Write invocation.md, scope-units.json, dependency-graph.json,
  |       wave-schedule.json, ticket-status.json (all pending)
  |
  +-- [9] Create team + tasks (Agent Teams, with sequential fallback)
  |
  +-- [10] Process waves sequentially, tickets within each wave in parallel
  |        |
  |        +-- Per wave:
  |            +-- Preemptive context checkpoint (every 2 waves, or after large waves)
  |            +-- Dispatch all tickets in wave as parallel teammates
  |            +-- Per ticket (teammate writes to tickets/<ticket-number>/):
  |                +-- Skip if completed (silent)
  |                +-- Skip if spec docs exist (mention in output)
  |                +-- Update local status -> "investigating"
  |                +-- Run investigation (same depth as /design)
  |                +-- Dependency discovery check -> write discoveries.json
  |                +-- Update local status -> "writing"
  |                +-- Security signal scan (shared/security-signals.md) -> include security_review in contract if signals detected
  |                +-- Write design doc + implementation plan + contract to output/
  |                +-- Contract schema validation
  |                +-- Update local status -> "validating"
  |                +-- Lightweight per-ticket validation (5 checks)
  |                +-- Update local status -> "committed"
  |                +-- Persist outputs + local decisions to ticket dir
  |                +-- (On failure: local status -> "failed", log reason, continue)
  |            +-- After wave completes (orchestrator):
  |                +-- Reconcile per-ticket outputs into shared state files
  |                +-- Cascade contracts: copy to shared contracts/ directory
  |                +-- Copy outputs from ticket dirs to docs/plans/
  |                +-- Commit outputs to spec/<epic-number> branch (serialized)
  |                +-- Check for re-queued tickets, update wave schedule
  |                +-- Output status update to terminal (include security_review status per ticket if present)
  |
  +-- [11] End-of-run quality gate
  |        +-- Phase 1: Per-document gates (design + plan per ticket, in parallel)
  |        +-- Phase 2: Cross-ticket integration check (contracts + dep graph only)
  |
  +-- [12] Summary report

Step-by-Step Detail

[1] Consult cartographer + forge: Use `crucible:cartographer` (consult mode) to review the codebase map and `crucible:forge` (feed-forward mode) to consult past lessons. Run once at the start of the run.

[2] Fetch epic, extract child tickets: Use the extraction fallback chain (sub-issues, task list checkboxes, body references, manual identification). See Epic Extraction section.

[3] Read ALL tickets upfront: Fetch the full title, body, labels, and linked issues for every extracted ticket. This enables dependency analysis before any investigation begins.

[4] Content-analyze tickets, infer dependency graph: Read every ticket's content and identify explicit references ("after #123 is done", "depends on the interface from #456") and implicit dependencies (ticket A defines an interface, ticket B consumes it). Build a DAG. See Dependency Analysis section for cycle handling.

[5] Build wave schedule: Topological sort the dependency graph. Assign each ticket to the earliest wave where all upstream dependencies are in prior waves. No intra-wave dependencies. See Wave-Based Scheduling section.

[6] Present execution plan: Show the user the wave schedule with ticket groupings and dependency rationale. The user can override (reorder, force sequential, etc.) or approve.

[7] Ask about auto-PR: "Auto-create a PR for the epic, or just commit to the branch?"

[8] Persist initial state: Write `invocation.md`, `scope-units.json`, `dependency-graph.json`, `wave-schedule.json`, and `ticket-status.json` (all tickets as `pending`) to the scratch directory.

[9] Create team + tasks: Use Agent Teams (TeamCreate/TaskCreate) for parallel execution. If Agent Teams unavailable, fall back to sequential subagent dispatch. See Parallel Execution section.

[10] Process waves: Sequential between waves, parallel within waves. Per-wave details are in the flow diagram above. Post-wave reconciliation updates shared state, cascades contracts, copies outputs to `docs/plans/`, commits to the epic branch, and checks for re-queued tickets.

[11] End-of-run quality gate: Two phases -- per-document gates on each design doc and plan, then cross-ticket integration check on contracts and dependency graph. See End-of-Run Quality Gate section.

Decision Extraction (After All Waves Complete)

After all waves complete and before branch/PR operations:

  1. Read `scratch/<run-id>/decisions.md` (the shared decision log).
  2. For each ticket in committed status, read `tickets/<ticket-number>/decisions.md`.
  3. Collect all file paths from committed design docs (the `Path:` or file references within each design doc's Current State Analysis).
  4. Map decisions to cartographer modules using file-path prefix matching.
  5. Dispatch the cartographer recorder with the directive "Extract decisions for cartographer". Input: the collected decisions, module mapping, existing module files, and existing decisions.md.
  6. Write the recorder output to cartographer storage.
  7. This step is RECOMMENDED, not REQUIRED -- failure does not block the spec run.

[12] Summary report: Output a final report with: tickets completed, tickets failed (with reasons), tickets blocked (with decision context), alerts emitted, contracts produced, and the branch/PR URL.

Parallel Execution via Agent Teams

The orchestrator uses Agent Teams (TeamCreate/TaskCreate) to dispatch tickets within a wave in parallel:

  1. Create team at run start:

    TeamCreate: team_name="spec-<epic-number>", description="Speccing epic #NNN"
    
  2. Create tasks for each ticket via TaskCreate, with description containing the ticket body, upstream contracts, and relevant decisions log entries.

  3. Dispatch teammates for each ticket in the current wave. Each teammate writes all outputs to its isolated scratch directory (

    tickets/<ticket-number>/output/
    ). Teammates do not perform any git operations -- the orchestrator handles all git work after the wave completes.

  4. Track completion via TaskGet/TaskList. As teammates complete, the orchestrator collects results and updates the scratch directory.

Agent Teams Fallback

If `TeamCreate` fails (agent teams are not available), output a clear one-time warning:

    Agent teams are not available. Recommended: set CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
    Falling back to sequential subagent dispatch via Agent tool.

Then fall back to sequential subagent dispatch via the Agent tool. Each ticket in a wave is dispatched sequentially instead of in parallel. All other behavior (wave scheduling, dependency discovery, validation, quality gate) is unchanged -- the run is slower but functionally identical.

Wave-Based Scheduling

Tickets are grouped into execution waves based on the dependency graph. This eliminates the need for runtime cancellation of in-progress tickets -- a capability Claude Code does not support.

Wave Construction

  1. Topological sort the dependency graph.
  2. Assign each ticket to the earliest wave where all its upstream dependencies are in prior waves.
  3. All tickets within a single wave are guaranteed independent of each other -- no ticket in wave N depends on another ticket in wave N.
  4. Persist the wave schedule to `scratch/<run-id>/wave-schedule.json`. (A sketch of the wave-assignment step follows this list.)

Wave Execution

Waves execute sequentially. Within each wave, all tickets execute in parallel (via Agent Teams) or sequentially (via Agent tool fallback). A wave does not begin until all tickets in the prior wave have reached a terminal state (committed or failed).

Re-queuing on Dependency Discovery

If investigation reveals a new dependency between two tickets assigned to the same wave:

  1. The downstream ticket's status is set to `"re-queued"` with reason "new upstream dependency discovered from #NNN".
  2. The downstream ticket is removed from the current wave and added to the next wave (or a new wave is created if none exists).
  3. The wave schedule is persisted to disk.
  4. The downstream ticket's work products (if any) are discarded -- it will restart from scratch in its new wave.

Because all tickets within a wave are dispatched simultaneously, the re-queued ticket may already be in progress. This is acceptable: the teammate will complete its work, but the orchestrator discards the results and re-processes the ticket in the correct wave with the upstream dependency's outputs available. No runtime cancellation is needed -- the wasted work is bounded to a single ticket's investigation and writing.

Per-Ticket Spec Writing

Each ticket goes through the same investigation process as `/design`, but fully autonomously. The per-ticket flow is encoded in the prompt template `spec/spec-writer-prompt.md`.

Step 1: Investigation

At the start of each ticket's investigation, dispatch `/recon` for structural context:

/recon
  task: "<ticket title and description>"
  session_id: "<spec-epic-run-id>"
  modules: ["impact-analysis"]

The `session_id` is the epic run's session ID -- shared across all tickets for Structure Scout cache reuse. The Structure Scout runs once for the first ticket and is cached for all subsequent tickets.

On recon failure: "Recon failed: [reason]. Falling back to inline investigation." Proceed without recon context -- dimension investigations explore from scratch.

Same depth as `/design` Phase 2 -- for each design dimension:

  • Deep dive (architectural decisions): 2 parallel agents (domain researcher, impact analyst) with recon brief context + challenger
  • Quick scan (implementation approach): read relevant sections of the recon brief (no agent dispatch needed)
  • Direct resolution (no technical implications): decide immediately

All investigation results cascade -- prior ticket decisions inform subsequent investigations via the decisions log in the scratch directory.

If the ticket is flagged "complex" (5+ design dimensions or 3+ upstream contracts), use quick-scan for ALL dimensions (read recon brief). Summarize each finding to 2-3 sentences.

Step 2: Dependency Discovery

After investigation completes but before writing begins:

  1. Compare investigation findings against the current dependency graph.
  2. If new cross-ticket dependencies are found, write them to `tickets/<ticket-number>/discoveries.json` (an illustrative shape follows this list).
  3. If no new dependencies, write empty discoveries: `{ "dependencies": [] }`.

The orchestrator reconciles all discoveries after the wave completes:

  • Downstream ticket pending: Update the graph. Re-queue if in the same wave.
  • Downstream ticket in the same wave (already dispatched): Mark as `re-queued`. The orchestrator discards its results and re-processes it in a subsequent wave.
  • Downstream ticket already committed: Set to `needs-respec`. Emit a terminal alert. Add to the summary report.

Step 3: Autonomous Decision-Making

Where `/design` presents options and waits for the user, `/spec` decides:

  • Synthesizes investigation results
  • Picks the recommended option (or the only viable path)
  • Documents reasoning in the design doc
  • Assigns a confidence level to each decision

Decision thresholds:

| Confidence | Criteria | Action |
|---|---|---|
| High | One option clearly dominates on technical merit, codebase alignment, and risk | Decide silently, document in the design doc |
| Medium | 2+ viable options with trade-offs that could go either way | Decide, emit a terminal alert. Err on the side of alerting. |
| Low | Requires domain knowledge or business context, or has irreversible consequences | Decide with a strong recommendation to review. Emit a terminal alert. |
| Block | Irreversible AND has security/data-integrity implications (encryption, data migration, auth model) | Do NOT decide. Set the ticket to `blocked`. Document context and options. Emit an alert. |

Alert format:

SPEC ALERT [#123] (medium confidence): Chose X over Y -- see design doc for reasoning
SPEC ALERT [#123] (low confidence): Chose X over Y -- REVIEW RECOMMENDED before /build picks this up
SPEC ALERT [#123] (blocked): Cannot decide autonomously -- irreversible security/data-integrity decision. See scratch dir for options. Provide input on re-invocation.

Step 4: Document Generation

Produces three artifacts per ticket in `tickets/<ticket-number>/output/`. The orchestrator copies these to `docs/plans/` after the wave completes:

a. Design doc (`YYYY-MM-DD-<topic>-design.md`):

Frontmatter:

---
ticket: "#123"
epic: "#100"
title: "Brief ticket title"
date: "2026-03-21"
source: "spec"
---

Body sections:

  • Current state analysis
  • Target state
  • Key decisions with confidence scores and alternatives considered
  • Migration/implementation path (high-level direction, not task-level)
  • Risk areas
  • Acceptance criteria

b. Implementation plan (`YYYY-MM-DD-<topic>-implementation-plan.md`):

Same frontmatter as the design doc (`ticket`, `epic`, `title`, `date`, `source` fields). Task-level granularity: which files to touch, the approach per task, and dependencies between tasks. Uses the crucible:planning task metadata format (Files, Complexity, Dependencies). NOT bite-sized TDD steps -- `/build`'s Plan Writer fills in that detail. `/build` still runs Plan Review + the quality gate on this plan.

c. Contract (`YYYY-MM-DD-<topic>-contract.yaml`):

See `contract-schema.md` for the full schema.

Step 5: Contract Schema Validation

After generating the contract YAML, validate it against the schema. The rules below must stay consistent with the field definitions in `contract-schema.md` — if the schema changes, update both.

  1. Required fields present: Verify `version`, `ticket`, `epic`, `title`, `date`, `api_surface`, and `invariants` all exist.
  2. Field value validation:
     • `api_surface[].type` must be one of: `function`, `class`, `interface`, `endpoint`, `event`
     • `api_surface[].params` must be present for `function`, `class`, and `interface` types. Each param must have `name`, `type`, and `required` fields.
     • `invariants.checkable[].check_method` must be one of: `grep`, `code-inspection`, `file-structure`
     • `invariants.testable[].test_tag` must match the pattern `contract:<category>:<id>`
  3. Security review field validation (when present):
     • `security_review.status` must be one of: `required`, `recommended`
     • `security_review.signals_detected` must be a non-empty array
     • Each entry must have `category` (one of: `auth`, `crypto`, `external_input`, `secrets`, `network`, `pii_data`, `dependencies`) and `evidence` (a non-empty string)
     • `security_review.deployment_context` (if present) must be one of: `public`, `intranet`, `hybrid`
  4. Integration point validation: For each entry in `integration_points`, verify that the referenced contract file exists in `docs/plans/` or the scratch directory's `contracts/` folder. If the referenced contract does not yet exist (upstream ticket not yet processed), log a warning but do not block.
  5. On validation failure: Report the specific errors. Re-dispatch the contract generation step with the validation errors as feedback. If the second attempt also fails, log the errors, mark the contract as having validation warnings, and continue -- do not block the entire run on a malformed contract.
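For orientation, a contract that would pass checks 1-3 might look roughly like this. It is assembled only from the validation rules above — `contract-schema.md` remains authoritative, and any field not named in those rules (`name`, `description`) is an assumption:

```yaml
version: "1.0"
ticket: "#123"
epic: "#100"
title: "Session store refactor"
date: "2026-03-21"
api_surface:
  - type: function
    name: create_session            # "name" is an assumed field
    params:
      - { name: user_id, type: string, required: true }
      - { name: ttl_seconds, type: int, required: false }
invariants:
  checkable:
    - description: "No direct store access outside the session module"  # "description" assumed
      check_method: grep
  testable:
    - description: "Expired sessions are rejected"
      test_tag: "contract:session:expiry-01"
security_review:
  status: recommended
  signals_detected:
    - { category: auth, evidence: "ticket touches session validation paths" }
```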

Step 6: Lightweight Per-Ticket Validation

Five checks before committing:

  1. Contract schema check: Verify the contract passed Step 5 validation without errors.
  2. Acceptance criteria present: Verify the design doc contains an acceptance criteria section with at least one concrete criterion.
  3. Invariants defined: Verify the contract contains at least one checkable or testable invariant.
  4. Frontmatter complete: Verify all required frontmatter fields (`ticket`, `epic`, `title`, `date`, `source`) are present in both the design doc and the implementation plan.
  5. Cross-reference check: Verify the design doc, implementation plan, and contract all reference the same ticket number.

If any check fails, set the ticket status to `"failed"` with the specific validation errors. Log and continue.
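As a sketch of checks 4 and 5 over the two markdown docs (the skill performs these in-agent; the Python below is purely illustrative, and the contract's ticket field would be compared the same way):

```python
import re
from pathlib import Path

REQUIRED = ("ticket", "epic", "title", "date", "source")

def frontmatter(path: Path) -> dict[str, str]:
    """Parse the leading YAML frontmatter block of a markdown doc."""
    match = re.match(r"---\n(.*?)\n---", path.read_text(), re.S)
    fields: dict[str, str] = {}
    if match:
        for line in match.group(1).splitlines():
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip().strip('"')
    return fields

def check_frontmatter(design: Path, plan: Path) -> list[str]:
    """Check 4: required fields present; check 5: same ticket everywhere."""
    errors: list[str] = []
    docs = {"design": frontmatter(design), "plan": frontmatter(plan)}
    for name, fields in docs.items():
        missing = [f for f in REQUIRED if not fields.get(f)]
        if missing:
            errors.append(f"{name}: missing frontmatter fields {missing}")
    if docs["design"].get("ticket") != docs["plan"].get("ticket"):
        errors.append("cross-reference: design and plan cite different tickets")
    return errors
```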

Step 7: Error Handling

On any failure during per-ticket processing:

  1. Write status to `tickets/<ticket-number>/status.json`: set the status to `"failed"` and record the error reason.
  2. Log the failure to the terminal with the ticket number and error summary.
  3. Continue processing remaining tickets -- do not halt the entire run.
  4. Include failed tickets in the summary report with failure reasons.

Re-invocation resume logic:

On re-invocation of `/spec` with the same epic URL:

  1. Detect the existing scratch directory (match on the epic URL in `invocation.md`). If not found at the canonical path, use the project-hash recovery procedure.
  2. Read `ticket-status.json` to determine the resume point (an illustrative shape follows this list).
  3. Skip `committed` tickets. Retry `failed` and `re-queued` tickets. Resume `pending` tickets. Re-process `needs-respec` tickets with upstream contracts now available. Unblock `blocked` tickets: present the blocking decision context and options to the user, collect input, then resume.
  4. Present the resume plan to the user before proceeding.
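An illustrative `ticket-status.json` using the status values above (the shape is an assumption; `scratch-directory.md` defines the real schema):

```json
{
  "#12": { "status": "committed" },
  "#45": { "status": "failed", "reason": "contract validation errors after retry" },
  "#67": { "status": "re-queued", "reason": "new upstream dependency discovered from #45" },
  "#89": { "status": "pending" },
  "#90": { "status": "blocked", "reason": "irreversible auth-model decision" }
}
```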

Quality Gate Requirement (Non-Negotiable)

Every quality gate in this pipeline MUST run to completion. This is NOT optional — you may NOT self-assess whether a quality gate is "needed" based on ticket size, complexity, or scope. Spec dispatches quality gates on every committed ticket (potentially dozens), which creates strong temptation to skip on "simple" tickets. Do not yield to this temptation.

Fixing findings is NOT the same as passing the gate. The iteration loop must complete with a clean verification round (0 Fatal, 0 Significant on a fresh review). Spec is the highest-volume gate dispatcher — the short-circuit temptation is strongest here.

The only valid skip is an unambiguous user instruction specifically referencing the gate. General feedback is not skip approval.

Gate tracking: Before compiling the end-of-run summary, verify that every committed ticket has per-document gate round counts >= 1 with clean final rounds. If any gate was skipped with explicit user approval, record it as `USER_SKIP`. A zero without user approval indicates a gate was dropped — report this in the summary.

End-of-Run Quality Gate

After all waves complete and all tickets are in terminal states, run a two-phase quality gate.

Phase 1: Per-Document Quality Gates

For each committed ticket, dispatch two standard quality gate passes using existing artifact types:

  1. Design doc gate: (Non-negotiable — see Quality Gate Requirement.) Dispatch `crucible:quality-gate` with artifact type `design` on the ticket's design doc. Review scope: Are decisions well-reasoned? Are acceptance criteria testable? Is the current-state analysis accurate?
  2. Implementation plan gate: (Non-negotiable — see Quality Gate Requirement.) Dispatch `crucible:quality-gate` with artifact type `plan` on the ticket's implementation plan. Review scope: Are tasks concrete? Do they align with the design doc? Are dependencies between tasks identified?
These use the quality gate's existing iterative fix loop. Each gate runs within normal context budgets (one document per gate invocation). Per-document gates can run in parallel across tickets (via Agent Teams, or sequentially via Agent tool fallback).

Phase 2: Cross-Ticket Integration Check

After all per-document gates pass, run a mandatory integration check across ticket boundaries using the prompt template `spec/integration-check-prompt.md`. (Non-negotiable — see Quality Gate Requirement.) This check is mandatory but is NOT dispatched through `crucible:quality-gate`'s iterative loop — it is a focused consistency review with targeted remediation.

Input (kept small for context budget):

  • All contract YAML files (500-1000 tokens each)
  • The final dependency graph
  • The decisions log, filtered: only cross-ticket decisions and medium/low/block confidence decisions. Single-ticket high-confidence decisions are excluded to keep context within budget.

Review scope:

  • Do contracts at integration points agree on signatures, types, and params?
  • Are there contradictory decisions across tickets?
  • Does the dependency graph match the actual integration points declared in contracts?
  • Are there gaps -- tickets that should have integration points but don't?

On findings: Each finding identifies a specific ticket and document. The orchestrator routes the fix based on finding type:

  • Design or plan findings: Dispatch a per-document quality gate (`design` or `plan` artifact type) on the identified document, with the integration finding included as review context.
  • Contract findings (mismatched signatures, missing integration points, contradictory surface declarations): Re-run the contract generation pipeline for the affected ticket -- re-execute Step 4 (contract portion) and Step 5 (validation). Contracts are re-derived from the source of truth rather than patched by a fix agent.

Verification re-pass: After all integration-triggered fixes complete, re-run the cross-ticket integration check exactly once as a verification pass. If the verification pass finds new issues, do NOT enter another fix cycle -- escalate to the user: "Integration verification found [N] new issue(s) after fix pass. These require manual review: [list findings]." Include unresolved findings in the summary report. This bounds the integration check to exactly two passes (initial + verification).

Dependency Analysis

The skill reads all tickets upfront and infers the dependency graph from content analysis:

  • Reads every ticket's title, body, labels, and any linked issues
  • Identifies explicit references ("after #123 is done", "depends on the interface from #456")
  • Identifies implicit dependencies (ticket A defines an interface, ticket B consumes it)
  • Builds a DAG, detects cycles

Cycle Detection

On cycle detection:

  1. Present the cycle concretely: Display the cycle as a list of edges: "#A depends on #B depends on #C depends on #A" with the specific dependency reason for each edge.
  2. Suggest breaking strategies: For each edge in the cycle, assess which dependency is weakest and suggest: (a) merge the cyclic tickets if tightly coupled, (b) defer the weakest dependency edge -- process downstream without upstream contract, mark for re-validation, or (c) remove a dependency edge if the user determines it is not a true blocker.
  3. User resolves: The user selects a breaking strategy or removes a specific dependency edge. The orchestrator updates the dependency graph, persists the modified `dependency-graph.json` with an annotation in `decisions.md`, and re-runs wave construction.
  4. If user unavailable (re-invocation scenario): Apply the weakest-edge deferral strategy automatically. Remove the weakest edge, log the decision as a medium-confidence autonomous decision with a terminal alert, and continue. The deferred ticket is marked for re-validation after its upstream completes.
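A minimal sketch of the detection step itself (Python; depth-first search over the ticket graph — illustrative only):

```python
def find_cycle(deps: dict[str, set[str]]) -> list[str] | None:
    """Return one dependency cycle as a ticket list, or None if the
    graph is a DAG. Assumes every referenced ticket is a key in deps."""
    visiting: set[str] = set()
    done: set[str] = set()
    path: list[str] = []

    def dfs(ticket: str) -> list[str] | None:
        visiting.add(ticket)
        path.append(ticket)
        for dep in deps[ticket]:
            if dep in visiting:            # back edge: cycle found
                return path[path.index(dep):] + [dep]
            if dep not in done:
                if (cycle := dfs(dep)) is not None:
                    return cycle
        visiting.discard(ticket)
        done.add(ticket)
        path.pop()
        return None

    for ticket in deps:
        if ticket not in done:
            if (cycle := dfs(ticket)) is not None:
                return cycle
    return None

# find_cycle({"#A": {"#B"}, "#B": {"#C"}, "#C": {"#A"}})
# -> ["#A", "#B", "#C", "#A"]  ("#A depends on #B depends on #C depends on #A")
```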

The dependency graph is presented to the user before execution begins. The user can override (reorder, force sequential, etc.) or approve.

The dependency graph is a living document -- it may be updated during investigation (see Dependency Discovery in Step 2). The initial graph is the best guess from ticket content; investigation reveals ground truth from the codebase.

Skip Logic

  • Completed tickets (checked off in the epic): skipped silently.
  • Committed tickets (status `committed` in `ticket-status.json`): skipped silently on re-invocation.
  • Tickets with existing spec docs (matching the frontmatter `ticket` field in `docs/plans/*-design.md`): mentioned in output, then skipped. The user sees: "Skipping #123 -- spec doc already exists at `docs/plans/2026-03-15-auth-refactor-design.md`"
  • Needs-respec tickets (status `needs-respec` in `ticket-status.json`): re-processed on re-invocation with upstream contracts available. If the re-processed ticket generates a filename matching an existing file in `docs/plans/`, the new output overwrites the old file.

Branch Strategy

All tickets in an epic commit to a single branch: `spec/<epic-number>`. This avoids merge conflicts -- each ticket produces new files in `docs/plans/` with unique names, so parallel commits to the same branch never conflict.

  • Branch naming: `spec/<epic-number>` (e.g., `spec/123`)
  • No per-teammate worktrees: Teammates do not use git worktrees. Each teammate writes all outputs to its isolated scratch directory. Teammates perform no git operations.
  • Orchestrator handles git: After each wave completes, the orchestrator copies outputs from the per-ticket scratch directories to `docs/plans/`, then commits to the `spec/<epic-number>` branch. All git operations are serialized through the orchestrator. If the orchestrator needs a worktree for the epic branch (e.g., the user's working tree is on a different branch), it creates a single worktree for its own use.
  • Commit ordering: Within a wave, the orchestrator commits each ticket's outputs sequentially. Across waves, commits are naturally sequential.
  • Repository safety check (before PR creation): Before creating the PR, run `gh repo view --json isPrivate -q .isPrivate`. If the repo is public, scan all commit messages and the PR body for proprietary company information, internal names, or sensitive data. STOP and confirm with the user if anything looks sensitive. This is especially critical for /spec because it runs largely autonomously and files multiple issues in batch.
  • PR creation: A single PR is created (if the user opted in) from the `spec/<epic-number>` branch after all tickets complete.

Contract Format

Machine-readable contracts make inter-ticket interfaces structural and verifiable rather than relying on ambiguous prose. For the full schema (YAML), version rejection rule, invariant categories, cascading rules, and required-fields table, see `contract-schema.md`.

Red Flags

  • Skipping Compression State Block emission at checkpoint boundaries
  • Emitting a Compression State Block with stale or missing Key Decisions (decisions must be cumulative across all prior blocks)
  • Allowing the Goal field to drift across successive Compression State Blocks (must match original user request)
  • Exceeding 10 entries in the Key Decisions list without overflow-compressing the oldest
  • Skipping a per-document quality gate because the ticket seems "small", "simple", or "trivial"
  • Self-assessing that a quality gate is unnecessary based on perceived ticket complexity
  • Declaring a quality gate "done" after fixing findings without a clean verification round (fixing is not passing)
  • Skipping the integration check because "all per-document gates passed so it's fine"
  • Interpreting general user feedback as approval to skip a quality gate that has not yet run — once a gate has run and presented findings to the user, the user's decision to proceed is authoritative
  • Treating session index summary as authoritative over CSB state (session index is supplementary narrative, CSB is authoritative state)

Integration

Sub-skills used:

  • crucible:cartographer -- consult mode, once at the start of the run
  • crucible:forge -- feed-forward mode, once at the start of the run
  • crucible:design -- investigation prompts (parallel agents) reused for autonomous investigation. Templates in `design/investigation-prompts.md`.
  • crucible:recon -- dispatched per-ticket at investigation start with `modules: ["impact-analysis"]` and an epic-level `session_id` for Structure Scout cache reuse across tickets. Replaces Codebase Scout. Fallback: investigate from scratch.
  • crucible:assay -- dispatched per architectural dimension (Deep Dive) with `decision_type: "architecture"`. Confidence routing: high=accept, medium=alert, low=block-alert. Fallback: manual synthesis.
  • crucible:quality-gate -- per-document gates (artifact types `design` and `plan`) + cross-ticket integration check on contracts
  • crucible:worktree -- orchestrator-only, for the epic branch if the user's working tree is on a different branch

Prompt templates:

  • `spec/spec-writer-prompt.md` -- Per-ticket teammate prompt encoding the full 7-step spec writing flow
  • `spec/integration-check-prompt.md` -- Cross-ticket integration check prompt for the Phase 2 quality gate

Trigger words:

`/spec`, "spec out", "write specs for", "spec this epic"

Contract between /spec and /build:

`/spec` produces files in `docs/plans/` with the naming convention `YYYY-MM-DD-<topic>-{design,implementation-plan,contract}.{md,yaml}`, with YAML frontmatter containing `ticket`, `epic`, `title`, `date`, and `source` fields. `/build` locates these files by scanning `docs/plans/` for frontmatter with a matching `ticket` field (a sketch of such a scan follows). The contract YAML schema (version 1.0) is the interface format -- both skills must agree on it.
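For instance, a consumer could approximate that scan with plain grep (illustrative only — `/build` defines its own lookup, and this matches a `ticket:` line anywhere in the file rather than strictly inside frontmatter):

```bash
# List docs/plans/ files whose frontmatter declares ticket "#123"
# (quoted or unquoted).
grep -rl --include='*.md' -E '^ticket: "?#123"?' docs/plans/
```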

Key Principles

  • Autonomous execution -- No human input after initial invocation. Investigate, decide, document, flag uncertainty.
  • Wave-based parallelism -- Independent tickets run in parallel within waves. Dependent tickets are sequenced across waves. No runtime cancellation needed.
  • Contract-first output -- Machine-readable contracts are a first-class output, not an afterthought. Contracts make inter-ticket interfaces structural and verifiable.
  • Disk-persisted state -- All critical state lives on disk in the scratch directory. Context compaction cannot lose progress. The orchestrator never relies solely on context memory.
  • Graceful degradation -- Agent Teams fallback to sequential dispatch. Re-invocation resumes from scratch directory state. Failed tickets are isolated. Blocked tickets defer to the user.
  • Cascading context -- Every decision informs subsequent investigations. Upstream contracts flow to downstream tickets. The decisions log is the running memory of the epic.
  • Quality over velocity -- Per-ticket validation, per-document quality gates, cross-ticket integration checks. The pipeline produces correct, consistent output even at the cost of additional passes.