Agentops post-mortem
Wrap up completed work. Council validates the implementation, then extract and process learnings. Triggers: "post-mortem", "wrap up", "close epic", "what did we learn".
git clone https://github.com/boshu2/agentops
T=$(mktemp -d) && git clone --depth=1 https://github.com/boshu2/agentops "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills-codex/post-mortem" ~/.claude/skills/boshu2-agentops-post-mortem && rm -rf "$T"
skills-codex/post-mortem/SKILL.mdPost-Mortem Skill
Purpose: Wrap up completed work — validate it shipped correctly, extract learnings, process the knowledge backlog, activate high-value insights, and retire stale knowledge.
Runtime note: Hook-driven closeout is runtime-dependent. Claude/OpenCode can wire Phase 2-5 maintenance through lifecycle hooks. Codex does not expose that hook surface, so Codex sessions should finish closeout with
.ao codex ensure-stop
Six phases:
- Council — Did we implement it correctly?
- Extract — What did we learn?
- Process Backlog — Score, deduplicate, and flag stale learnings
- Activate — Promote high-value learnings to MEMORY.md and constraints
- Retire — Archive stale and superseded learnings
- Harvest — Surface next work for the flywheel
Quick Start
$post-mortem # wraps up recent work $post-mortem epic-123 # wraps up specific epic $post-mortem --quick "insight" # quick-capture single learning (no council) $post-mortem --process-only # skip council+extraction, run Phase 3-5 on backlog $post-mortem --skip-activate # extract + process but don't write MEMORY.md $post-mortem --deep recent # thorough council review $post-mortem --mixed epic-123 # cross-vendor (Claude + Codex) $post-mortem --skip-checkpoint-policy epic-123 # skip ratchet chain validation
Codex Closeout
In Codex hookless mode, run these after the post-mortem workflow writes learnings and next work:
ao codex ensure-stop --auto-extract ao codex status
ao codex ensure-stop is idempotent for the current Codex thread. It uses the latest transcript or history fallback to queue/persist learnings and run close-loop maintenance without runtime hooks.
Flags
| Flag | Default | Description |
|---|---|---|
| off | Quick-capture a single learning directly to without running a full post-mortem. Formerly handled by . |
| off | Skip council and extraction (Phase 1-2). Run Phase 3-5 on the existing backlog only. |
| off | Extract and process learnings but do not write to MEMORY.md (skip Phase 4 promotions). |
| off | 3 judges (default for post-mortem) |
| off | Cross-vendor (Claude + Codex) judges |
| off | Each judge spawns N explorers before judging |
| off | Two-round adversarial review |
| off | Skip ratchet chain validation |
| off | Skip pre-council deep audit sweep |
Quick Mode
Given
$post-mortem --quick "insight text":
Quick Step 1: Generate Slug
Create a slug from the content: first meaningful words, lowercase, hyphens, max 50 chars.
Quick Step 2: Write Learning Directly
Write to:
.agents/learnings/YYYY-MM-DD-quick-<slug>.md
--- type: learning source: post-mortem-quick date: YYYY-MM-DD maturity: provisional utility: 0.5 --- # Learning: <Short Title> **Category**: <auto-classify: debugging|architecture|process|testing|security> **Confidence**: medium ## What We Learned <user's insight text> ## Source Quick capture via `$post-mortem --quick`
This skips the full pipeline — writes directly to learnings, no council or backlog processing.
Quick Step 3: Confirm
Learned: <one-line summary> Saved to: .agents/learnings/YYYY-MM-DD-quick-<slug>.md For deeper reflection, use `$post-mortem` without --quick.
Done. Return immediately after confirmation.
Execution Steps
Pre-Flight Checks
Before proceeding, verify:
- Git repo exists:
— if not, error: "Not in a git repository"git rev-parse --git-dir 2>/dev/null - Work was done:
— if empty, error: "No commits found. Rungit log --oneline -1 2>/dev/null
first."$implement - Epic context: If epic ID provided, verify it has closed children. If 0 closed children, error: "No completed work to review."
If
: Skip Pre-Flight Checks through Step 3. Jump directly to Phase 3: Process Backlog.--process-only
Step 0.4: Load Reference Documents (MANDATORY)
Before Step 0.5 and Step 2.5, read the required reference docs into context:
REQUIRED_REFS=( "skills/post-mortem/references/checkpoint-policy.md" "skills/post-mortem/references/metadata-verification.md" "skills/post-mortem/references/closure-integrity-audit.md" "skills/post-mortem/references/four-surface-closure.md" )
For each reference file, read its content and hold it in context for use in later steps. Do NOT just test file existence with
[ -f ] -- actually read the content so it is available when Steps 0.5 and 2.5 need it.
If a reference file does not exist (Read returns an error), log a warning and add it as a checkpoint warning in the council context. Proceed only if the missing reference is intentionally deferred.
Step 0.5: Checkpoint-Policy Preflight (MANDATORY)
Read
references/checkpoint-policy.md for the full checkpoint-policy preflight procedure. It validates the ratchet chain, checks artifact availability, and runs idempotency checks. BLOCK on prior FAIL verdicts; WARN on everything else.
Step 1: Identify Completed Work and Record Timing
Record the post-mortem start time for cycle-time tracking:
PM_START=$(date +%s)
If epic/issue ID provided: Use it directly.
If no ID: Find recently completed work:
# Check for closed beads bd list --status closed --since "7 days ago" 2>/dev/null | head -5 # Or check recent git activity git log --oneline --since="7 days ago" | head -10
Step 1.5: RPI Session Metrics
Read
.agents/rpi/rpi-state.json and extract session ID, phase, verdicts, and streak data. If absent or unparseable, skip silently. Prepend a tweetable summary to the report: > RPI streak: N consecutive days | Sessions: N | Last verdict: PASS/WARN/FAIL. See references/streak-tracking.md for extraction logic and fallback behavior.
Step 2: Load the Original Plan/Spec
Before invoking council, load the original plan for comparison:
- If epic/issue ID provided:
to get the spec/descriptionbd show <id> - Search for plan doc:
ls .agents/plans/ | grep <target-keyword> - Check git log:
to find the relevant bead referencegit log --oneline | head -10
If a plan is found, include it in the council packet's
context.spec field:
{ "spec": { "source": "bead na-0042", "content": "<the original plan/spec text>" } }
Step 2.1: Load Compiled Prevention Context
Before council and retro synthesis, load compiled prevention outputs when they exist:
.agents/planning-rules/*.md.agents/pre-mortem-checks/*.md
Use these compiled artifacts first, then fall back to
.agents/findings/registry.jsonl only when compiled outputs are missing or incomplete. Carry matched finding IDs into the retro as Applied findings / Known risks applied context so post-mortem can judge whether the flywheel actually prevented rediscovery.
Step 2.2: Load Implementation Summary
Check for a crank-generated phase-2 summary:
PHASE2_SUMMARY=$(ls -t .agents/rpi/phase-2-summary-*-crank.md 2>/dev/null | head -1) if [ -n "$PHASE2_SUMMARY" ]; then echo "Phase-2 summary found: $PHASE2_SUMMARY" # Read the summary with the Read tool for implementation context fi
If available, use the phase-2 summary to understand what was implemented, how many waves ran, and which files were modified.
Step 2.3: Reconcile Plan vs Delivered Scope
Compare the original plan scope against what was actually delivered:
- Read the plan from
(most recent).agents/plans/ - Compare planned issues against closed issues (
)bd children <epic-id> - Note any scope additions, removals, or modifications
- Include scope delta in the post-mortem findings
Step 2.4: Closure Integrity Audit (MANDATORY)
Read
references/closure-integrity-audit.md for the full procedure. Mechanically verifies:
- Evidence precedence per child — every closed child resolves on the strongest available evidence in this order:
, thencommit
, thenstagedworktree - Phantom bead detection — flags children with generic titles ("task") or empty descriptions
- Orphaned children — beads in
but not linked to parent inbd listbd show - Multi-wave regression detection — for crank epics, checks if a later wave removed code added by an earlier wave
- Stretch goal audit — verifies deferred stretch goals have documented rationale
Include results in the council packet as
context.closure_integrity. WARN on 1-2 findings, FAIL on 3+.
If a closure is evidence-only, emit a proof artifact with
bash skills/post-mortem/scripts/write-evidence-only-closure.sh and cite at .agents/releases/evidence-only-closures/<target-id>.json. Record evidence_mode plus repo-state detail for replayability.
Step 2.5: Pre-Council Metadata Verification (MANDATORY)
Read
references/metadata-verification.md for the full verification procedure. Mechanically checks: plan vs actual files, file existence in commits, cross-references in docs, and ASCII diagram integrity. Failures are included in the council packet as context.metadata_failures.
Step 2.6: Pre-Council Deep Audit Sweep
Skip if
or --quick
.--skip-sweep
Before council runs, dispatch a deep audit sweep to systematically discover issues across all changed files. This uses the same protocol as
$vibe --deep — see the deep audit protocol in the vibe skill (skills/vibe/) for the full specification.
In summary:
- Identify all files in scope (from epic commits or recent changes)
- Chunk files into batches of 3-5 by line count (<=100 lines -> batch of 5, 101-300 -> batch of 3, >300 -> solo)
- Dispatch up to 8 Explore agents in parallel, each with a mandatory 8-category checklist per file (resource leaks, string safety, dead code, hardcoded values, edge cases, concurrency, error handling, HTTP/web security)
- Merge all explorer findings into a sweep manifest at
.agents/council/sweep-manifest.md - Include sweep manifest in council packet — judges shift to adjudication mode (confirm/reject/reclassify sweep findings + add cross-cutting findings)
Why: Post-mortem council judges exhibit satisfaction bias when reviewing monolithic file sets — they stop at ~10 findings regardless of actual issue count. Per-file explorers with category checklists find 3x more issues, and the sweep manifest gives judges structured input to adjudicate rather than discover from scratch.
Skip conditions:
flag -> skip (fast inline path)--quick
flag -> skip (old behavior: judges do pure discovery)--skip-sweep- No source files in scope -> skip (nothing to audit)
Step 3: Council Validates the Work
Council Verdict:
Run
$council with the retrospective preset and always 3 judges:
$council --deep --preset=retrospective validate <epic-or-recent>
Default (3 judges with retrospective perspectives):
: What was planned vs what was delivered? What's missing? What was added?plan-compliance
: What shortcuts were taken? What will bite us later? What needs cleanup?tech-debt
: What patterns emerged? What should be extracted as reusable knowledge?learnings
Post-mortem always uses 3 judges (
--deep) because completed work deserves thorough review.
Four-Surface Closure: Validate all four surfaces -- Code, Documentation, Examples, and Proof. A PASS verdict requires all four surfaces addressed, not just code correctness. Read
skills/post-mortem/references/four-surface-closure.md for the closure checklist and common gaps.
Timeout: Post-mortem inherits council timeout settings. If judges time out, the council report will note partial results. Post-mortem treats a partial council report the same as a full report — the verdict stands with available judges.
The plan/spec content is injected into the council packet context so the
plan-compliance judge can compare planned vs delivered.
With --quick (inline, no spawning):
$council --quick validate <epic-or-recent>
Single-agent structured review. Fast wrap-up without spawning.
With debate mode:
$post-mortem --debate epic-123
Enables adversarial two-round review for post-implementation validation. Use for high-stakes shipped work where missed findings have production consequences. See
$council docs for full --debate details.
Advanced options (passed through to council):
— Cross-vendor (Claude + Codex) with retrospective perspectives--mixed
— Override with different personas (e.g.,--preset=<name>
for production readiness)--preset=ops
— Each judge spawns N explorers to investigate the implementation deeply before judging--explorers=N
— Two-round adversarial review (judges critique each other's findings before final verdict)--debate
Step 3.5: Prediction Accuracy (Pre-Mortem Correlation)
When a pre-mortem report exists for the current epic (
ls -t .agents/council/*pre-mortem*.md), cross-reference its prediction IDs against actual vibe/implementation findings. Score each as HIT (prediction confirmed), MISS (did not materialize), or SURPRISE (unpredicted issue). Write a ## Prediction Accuracy table in the report. Skip silently if no pre-mortem exists. See references/prediction-tracking.md for the full table format and scoring rules.
Phase 2: Extract Learnings
Inline extraction of learnings from the completed work (formerly delegated to the retro skill).
Step EX.1: Gather Context
# Recent commits git log --oneline -20 --since="7 days ago" # Epic children (if epic ID provided) bd children <epic-id> 2>/dev/null | head -20 # Recent plans and research ls -lt .agents/plans/ .agents/research/ 2>/dev/null | head -10
Read relevant artifacts: research documents, plan documents, commit messages, and code changes. Use file reads and git commands to understand what was done.
If retrospecting an epic: Run the closure integrity quick-check from
references/context-gathering.md (Phantom Bead Detection + Multi-Wave Regression Scan). Include any warnings in findings.
Step EX.2: Classify Learnings
Ask these questions:
What went well?
- What approaches worked?
- What was faster than expected?
- What should we do again?
What went wrong?
- What failed?
- What took longer than expected?
- What would we do differently?
What did we discover?
- New patterns found
- Codebase quirks learned
- Tool tips discovered
- Debugging insights
- Test pyramid gaps found during implementation or review
For each learning, capture:
- ID: L1, L2, L3...
- Category: debugging, architecture, process, testing, security
- What: The specific insight
- Why it matters: Impact on future work
- Confidence: high, medium, low
Step EX.3: Write Learnings
Write to:
.agents/learnings/YYYY-MM-DD-<topic>.md
--- id: learning-YYYY-MM-DD-<slug> type: learning date: YYYY-MM-DD category: <category> confidence: <high|medium|low> maturity: provisional utility: 0.5 --- # Learning: <Short Title> ## What We Learned <1-2 sentences describing the insight> ## Why It Matters <1 sentence on impact/value> ## Source <What work this came from> --- # Learning: <Next Title> **ID**: L2 ...
Step EX.3.5: Test Pyramid Gap Analysis
Compare planned vs actual test levels per the test pyramid standard (
test-pyramid.md in the standards skill). For each closed issue: check planned test_levels metadata against actual test files. Write a ## Test Pyramid Assessment table (Issue | Planned | Actual | Gaps | Action). Gaps with severity >= moderate become next-work.jsonl items with type tech-debt.
Step EX.4: Classify Learning Scope
For each learning extracted in Step EX.3, classify:
Question: "Does this learning reference specific files, packages, or architecture in THIS repo? Or is it a transferable pattern that helps any project?"
- Repo-specific -> Write to
(existing behavior from Step EX.3). Use.agents/learnings/
to resolve repo root — never write relative to cwd.git rev-parse --show-toplevel - Cross-cutting/transferable -> Rewrite to remove repo-specific context (file paths, function names, package names), then:
- Write abstracted version to
(NOT local — one copy only)~/.agents/learnings/YYYY-MM-DD-<slug>.md - Run abstraction lint check:
If matches: WARN user with matched lines, ask to proceed or revise. Never block the write.file="<path-to-written-global-file>" grep -iEn '(internal/|cmd/|\.go:|/pkg/|/src/|AGENTS\.md|CLAUDE\.md)' "$file" 2>/dev/null grep -En '[A-Z][a-z]+[A-Z][a-z]+\.(go|py|ts|rs)' "$file" 2>/dev/null grep -En '\./[a-z]+/' "$file" 2>/dev/null
- Write abstracted version to
Note: Each learning goes to ONE location (local or global). No
promoted_to needed — there's no local copy to mark when writing directly to global.
Example abstraction:
- Local: "Compile's validate package needs O_CREATE|O_EXCL for atomic claims because Zeus spawns concurrent workers"
- Global: "Use O_CREATE|O_EXCL for atomic file creation when multiple processes may race on the same path"
Step EX.5: Write Structured Findings to Registry
Before backlog processing, normalize reusable council findings into
.agents/findings/registry.jsonl.
Use the tracked contract in
docs/contracts/finding-registry.md:
- persist only reusable findings that should change future planning or review behavior
- require
, provenance,dedup_key
,pattern
,detection_question
,checklist_item
, andapplicable_whenconfidence
must use the controlled vocabulary from the contractapplicable_when- append or merge by
dedup_key - use the contract's temp-file-plus-rename atomic write rule
This registry is the v1 advisory prevention surface. It complements learnings and next-work; it does not replace them.
Step EX.6: Refresh Compiled Prevention Outputs
After the registry mutation, refresh compiled outputs immediately so the same session can benefit from the updated prevention set.
If
hooks/finding-compiler.sh exists, run:
bash hooks/finding-compiler.sh --quiet 2>/dev/null || true
This promotes registry rows into
.agents/findings/*.md, refreshes .agents/planning-rules/*.md and .agents/pre-mortem-checks/*.md, and rewrites draft constraint metadata under .agents/constraints/. Active enforcement still depends on the constraint index lifecycle and runtime hook support, but compilation itself is no longer deferred.
Step ACT.3: Feed Next-Work
Actionable improvements identified during processing -> append one schema v1.3 batch entry to
.agents/rpi/next-work.jsonl using the tracked contract in
../../docs/contracts/next-work.schema.md
and the write procedure in
references/harvest-next-work.md.
Follow the claim/finalize lifecycle documented in references/harvest-next-work.md.
mkdir -p .agents/rpi # Build VALID_ITEMS via the schema-validation flow in references/harvest-next-work.md # Then append one entry per post-mortem / epic. # If a harvested item already maps to a known proof surface, preserve it on the # item as "proof_ref" instead of burying target IDs in free text. Example item: # [{"title":"Verify the parity gate after proof propagation lands","type":"task","severity":"medium","source":"council-finding","description":"Re-run the targeted validator after the follow-up lands.","target_repo":"agentops","proof_ref":{"kind":"execution_packet","run_id":"6f36a5640805","path":".agents/rpi/runs/6f36a5640805/execution-packet.json"}}] ENTRY_TIMESTAMP="$(date -Iseconds)" SOURCE_EPIC="${EPIC_ID:-recent}" VALID_ITEMS_JSON="${VALID_ITEMS_JSON:-[]}" printf '%s\n' "$(jq -cn \ --arg source_epic "$SOURCE_EPIC" \ --arg timestamp "$ENTRY_TIMESTAMP" \ --argjson items "$VALID_ITEMS_JSON" \ '{ source_epic: $source_epic, timestamp: $timestamp, items: $items, consumed: false, claim_status: "available", claimed_by: null, claimed_at: null, consumed_by: null, consumed_at: null }' )" >> .agents/rpi/next-work.jsonl
Step ACT.4: Update Marker
date -Iseconds > .agents/ao/last-processed
This must be the LAST action in Phase 4.
Phases 3-6 (Maintenance): Read
references/maintenance-phases.md for backlog processing, activation, retirement, and harvesting phases. Load when --process-only flag is set or when running full post-mortem.
Step 7: Report to User
Tell the user:
- Council verdict on implementation
- Key learnings
- Any follow-up items
- Location of post-mortem report
- Knowledge flywheel status
- Suggested next
command from the harvested$rpi
section (ALWAYS — this is how the flywheel spins itself)## Next Work - ALL proactive improvements, organized by priority (highlight one quick win)
- Knowledge lifecycle summary (Phase 3-5 stats)
The next
suggestion is MANDATORY, not opt-in. After every post-mortem, present the highest-severity harvested item as a ready-to-copy command:$rpi
## Flywheel: Next Cycle Based on this post-mortem, the highest-priority follow-up is: > **<title>** (<type>, <severity>) > <1-line description> Ready to run:
$rpi "<title>"
Or see all N harvested items in `.agents/rpi/next-work.jsonl`.
If no items were harvested, write: "Flywheel stable — no follow-up items identified."
Integration with Workflow
$plan epic-123 | v $pre-mortem (council on plan) | v $implement | v $vibe (council on code) | v Ship it | v $post-mortem <-- You are here | |-- Phase 1: Council validates implementation |-- Phase 2: Extract learnings (inline) |-- Phase 3: Process backlog (score, dedup, flag stale) |-- Phase 4: Activate (promote to MEMORY.md, compile constraints) |-- Phase 5: Retire stale learnings |-- Phase 6: Harvest next work |-- Suggest next $rpi --------------------+ | +----------------------------------------+ | (flywheel: learnings become next work) v $rpi "<highest-priority enhancement>"
Examples
Wrap Up Recent Work
User says:
$post-mortem
What happens:
- Agent scans recent commits.
- Runs
.$council --deep validate recent - Extracts learnings, processes backlog, and promotes items.
- Harvests next-work to
..agents/rpi/next-work.jsonl
Result: Report with learnings, stats, and a suggested
$rpi command.
Other Modes
- Epic-specific:
— review against the target plan$post-mortem ag-5k2 - Quick capture:
— write a learning without council$post-mortem --quick "insight" - Process-only:
— run backlog processing only$post-mortem --process-only - Cross-vendor:
— broaden judgment coverage$post-mortem --mixed ag-3b7
Troubleshooting
| Problem | Cause | Solution |
|---|---|---|
| Council times out | Epic too large or too many files changed | Split post-mortem into smaller reviews or increase timeout |
| No next-work items harvested | Council found no tech debt or improvements | Flywheel stable — write entry with empty items array to next-work.jsonl |
| Checkpoint-policy preflight blocks | Prior FAIL verdict in ratchet chain without fix | Resolve prior failure (fix + re-vibe) or skip checkpoint-policy via |
| Metadata verification fails | Plan vs actual files mismatch or missing cross-references | Include failures in council packet as — judges assess severity |
See Also
— Multi-model validation councilskills/council/SKILL.md
— Council validates code (skills/vibe/SKILL.md
after coding)$vibe
— Council validates plans (before implementation)skills/pre-mortem/SKILL.md
Reference Documents
- references/harvest-next-work.md
- references/learning-templates.md
- references/plan-compliance-checklist.md
- references/closure-integrity-audit.md
- references/security-patterns.md
- references/checkpoint-policy.md
- references/metadata-verification.md
- references/context-gathering.md
- references/output-templates.md
- references/backlog-processing.md
- references/activation-policy.md
- references/prediction-tracking.md
- references/retro-history.md
- references/streak-tracking.md
- references/maintenance-phases.md
- references/four-surface-closure.md