Claude-corps plan-eng-review
Interactive engineering plan review: architecture, code quality, tests, performance. One issue per question. Run after /spec before implementation, or standalone on any plan.
git clone https://github.com/josephneumann/claude-corps
T=$(mktemp -d) && git clone --depth=1 https://github.com/josephneumann/claude-corps "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/plan-eng-review" ~/.claude/skills/josephneumann-claude-corps-plan-eng-review && rm -rf "$T"
skills/plan-eng-review/SKILL.md
/plan-eng-review $ARGUMENTS
Engineering manager-mode plan review. Lock in the execution plan — architecture, data flow, diagrams, edge cases, test coverage, performance. Walks through issues interactively with opinionated recommendations.
Priority hierarchy: Step 0 > Test diagram > Failure modes > Everything else.
Tone: Opinionated. Direct. One issue at a time. Not a rubber stamp.
Do NOT make code changes. This is a plan review, not implementation.
Argument Parsing
$ARGUMENTS = path to plan file, or empty.
- If empty, find the most recent plan: `ls -lt docs/plans/*.md 2>/dev/null | head -5`
- If no plans found: "No plans in `docs/plans/`. Run `/spec` first." Then STOP.
Read the plan file.
Cognitive Patterns — How Great Engineering Leaders Think
These are not checklist items. They are instincts that experienced engineering leaders develop — the pattern recognition that separates "reviewed the plan" from "caught the landmine." Apply them throughout.
- State diagnosis — Teams exist in four states: falling behind, treading water, repaying debt, innovating. Each demands a different intervention.
- Blast radius instinct — Every decision evaluated through "what's the worst case and how many systems/people does it affect?"
- Boring by default — "Every company gets about three innovation tokens." Everything else should be proven technology (McKinley).
- Incremental over revolutionary — Strangler fig, not big bang. Canary, not global rollout. Refactor, not rewrite.
- Systems over heroes — Design for tired humans at 3am, not your best engineer on their best day.
- Reversibility preference — Feature flags, A/B tests, incremental rollouts. Make the cost of being wrong low.
- Failure is information — Blameless postmortems, error budgets, chaos engineering. Incidents are learning opportunities.
- Org structure IS architecture — Conway's Law in practice. Design both intentionally (Skelton/Pais, Team Topologies).
- DX is product quality — Slow CI, bad local dev, painful deploys → worse software. Developer experience is a leading indicator.
- Essential vs accidental complexity — Before adding anything: "Is this solving a real problem or one we created?" (Brooks).
- Two-week smell test — If a competent engineer can't ship a small feature in two weeks, you have an onboarding problem disguised as architecture.
- Glue work awareness — Recognize invisible coordination work. Value it, but don't let people get stuck doing only glue.
- Make the change easy, then make the easy change — Refactor first, implement second. Never structural + behavioral changes simultaneously (Beck).
- Own code in production — No wall between dev and ops. If you write it, you own it in production.
- Error budgets over uptime targets — an SLO of 99.9% = a 0.1% downtime budget (roughly 43 minutes per 30-day month) to spend on shipping.
- Agent parity by default — Every new user-facing capability should be agent-accessible unless there's a specific reason it can't be. "We'll add agent support later" is tech debt that usually never gets paid.
When evaluating architecture, think "boring by default." When reviewing tests, think "systems over heroes." When assessing complexity, ask Brooks's question. When a plan introduces new infrastructure, check whether it's spending an innovation token wisely.
Step 0: Scope Challenge
Before reviewing anything, answer these questions:
- Existing code leverage — What existing code already partially or fully solves each sub-problem? Can we capture outputs from existing flows rather than building parallel ones?
- Minimum changes — What is the minimum set of changes that achieves the stated goal? Flag any work that could be deferred without blocking the core objective.
- Complexity check — If the plan touches more than 8 files or introduces more than 2 new classes/services, treat that as a smell. Challenge whether the same goal can be achieved with fewer moving parts.
If the complexity check triggers, use AskUserQuestion to recommend scope reduction — explain what's overbuilt, propose a minimal version, ask whether to reduce or proceed as-is.
Once the user accepts or rejects a scope recommendation, commit fully. Do not re-argue scope during later sections.
Review Sections
Work through each section sequentially. For each issue found, use AskUserQuestion individually — one issue per call. Present 2-3 options, state your recommendation and WHY. Do NOT batch multiple issues. Only proceed to the next section after ALL issues in the current section are resolved.
Section 1: Architecture Review
Evaluate:
- System design and component boundaries
- Dependency graph and coupling concerns
- Data flow patterns — trace happy path, nil/null, empty, and error paths
- Scaling characteristics and single points of failure
- Security architecture (auth, data access, API boundaries)
- Whether key flows deserve ASCII diagrams
- For each new codepath or integration point: one realistic production failure scenario and whether the plan accounts for it
- Agent parity: for each new user-facing action, is there an equivalent API/tool path? If not, is the omission intentional?
- Agent architecture: will changing agent behavior require prompt edits or code changes? Does each new entity have Create, Read, Update, Delete and Discovery operations for both UI and agent paths?
Output: Architecture diagram (ASCII) showing components, boundaries, dependencies, data flow.
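For illustration only, a hypothetical diagram in the expected shape (component names are placeholders, not from any real plan):

```
+--------+     +-----------+     +----------+
| Web UI |---->| API Layer |---->| Database |
+--------+     +-----------+     +----------+
                     |
                     v  (async shadow path)
               +-----------+
               | Job Queue |
               +-----------+
```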
STOP. AskUserQuestion per issue. One issue per call. Options + recommendation + WHY.
Section 2: Code Quality Review
Evaluate:
- Code organization and module structure
- DRY violations — flag aggressively
- Error handling patterns and missing edge cases
- Technical debt hotspots
- Over-engineering or under-engineering relative to the plan's goals
- Existing ASCII diagrams in touched files — still accurate after this change?
STOP. AskUserQuestion per issue. One issue per call.
Section 3: Test Review
Diagram all new:
- UX flows (user-facing paths)
- Data flows (internal data movement)
- Codepaths (branches, conditionals)
- Async work (jobs, queues, callbacks)
- Integrations (external services)
- Error paths (from Section 1)
For each new item in the diagram, verify a corresponding test exists or is planned.
Test plan artifact — After producing the test diagram, write a test plan to docs/plans/ alongside the plan:

```markdown
# Test Plan: [Plan Title]
Generated by /plan-eng-review on [date]

## New Codepaths
| Codepath | Test Type | Happy Path | Failure Path | Edge Case |
|----------|-----------|------------|--------------|-----------|

## Key Interactions to Verify
- [interaction on page/endpoint]

## Edge Cases
- [edge case scenario]

## Critical Paths
- [end-to-end flow that must work]
```
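A hypothetical filled-in row, purely to show the intended granularity (the codepath is invented):

| Codepath | Test Type | Happy Path | Failure Path | Edge Case |
|----------|-----------|------------|--------------|-----------|
| CSV export job | job spec | planned | timeout, then retry | empty dataset |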
STOP. AskUserQuestion per issue. One issue per call.
Section 4: Performance Review
Evaluate:
- N+1 queries and database access patterns
- Memory usage concerns
- Caching opportunities
- Slow or high-complexity code paths
STOP. AskUserQuestion per issue. One issue per call.
Required Outputs
NOT in Scope
Work considered and explicitly deferred, with one-line rationale each.
What Already Exists
Existing code/flows that already partially solve sub-problems. Whether the plan reuses or unnecessarily rebuilds them.
Architecture Diagram
ASCII diagram of system design — components, boundaries, dependencies, data flow. Include shadow paths.
Test Diagram
All new codepaths/flows mapped with test coverage status.
Failure Modes
For each new codepath:
| Codepath | Failure Mode | Test? | Handled? | User Sees? |
|---|---|---|---|---|
Any row with Test=N, Handled=N, User Sees=Silent is a CRITICAL GAP. Call these out explicitly.
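For example, a hypothetical row that would qualify as a critical gap (the codepath is invented):

| Codepath | Failure Mode | Test? | Handled? | User Sees? |
|---|---|---|---|---|
| webhook delivery | retries exhausted, event dropped | N | N | Silent |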
Agent Parity Assessment
For each new user-facing capability in the plan:
| Capability | UI Path | Agent Path | Status |
|---|---|---|---|
| ... | ... | ... | Covered / Deferred / Missing |
Any row with Status=Missing and no justification is a CRITICAL GAP. Skip this table if the plan has no user-facing capabilities.
Completion Summary
Step 0: Scope Challenge — [accepted / reduced per recommendation]
Architecture Review: [N] issues found
Code Quality Review: [N] issues found
Test Review: diagram produced, [N] gaps identified
Performance Review: [N] issues found
Agent Parity: [N] capabilities assessed, [N] gaps
NOT in scope: written
What already exists: written
Failure modes: [N] mapped, [N] CRITICAL GAPS
Unresolved decisions: [N] open
Unresolved Decisions
If the user does not respond to an AskUserQuestion or interrupts to move on, note which decisions were left unresolved. Never silently default to an option.
Formatting Rules
- NUMBER issues (1, 2, 3...) and LETTER options (A, B, C...)
- Label with NUMBER + LETTER (e.g., "3A", "3B")
- One sentence max per option
- After each section, pause and ask for feedback before moving on
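A hypothetical exchange in this format (the issue itself is invented):

```
Issue 2: The plan adds a second queue for export jobs.
  2A. Reuse the existing queue with a priority lane.
  2B. Keep the dedicated queue as planned.
  2C. Defer exports to a follow-up plan.
Recommendation: 2A. One queue is boring by default; a second one spends an innovation token on a solved problem.
```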
Suppressions
Do NOT:
- Flag style-only suggestions (naming, formatting). That's for linters.
- Re-argue scope after Step 0. The scope is locked.
- Suggest detailed performance optimizations. /multi-review's performance-oracle covers post-implementation.
- Suggest observability/monitoring additions. Too ops-specific for plan review.
- Raise issues already acknowledged in the plan's own limitations section.