EasyPlatform debug
[Fix & Debug] Systematic debugging with root cause investigation. Use when bugfix workflow reaches debug step.
git clone https://github.com/duc01226/EasyPlatform
T=$(mktemp -d) && git clone --depth=1 https://github.com/duc01226/EasyPlatform "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/debug-investigate" ~/.claude/skills/duc01226-easyplatform-debug && rm -rf "$T"
.claude/skills/debug-investigate/SKILL.md
<!-- SYNC:critical-thinking-mindset -->
[IMPORTANT] Use `TaskCreate` to break ALL work into small tasks BEFORE starting — including tasks for each file read. This prevents context loss from long files. For simple tasks, AI MUST ATTENTION ask user whether to skip.
<!-- /SYNC:critical-thinking-mindset --> <!-- SYNC:ai-mistake-prevention -->Critical Thinking Mindset — Apply critical thinking and sequential thinking. Every claim needs traced proof; act only at confidence >80%. Anti-hallucination: never present a guess as fact. Cite sources for every claim, admit uncertainty freely, self-check output for errors, cross-reference independently, and stay skeptical of your own confidence: certainty without evidence is the root of all hallucination.
<!-- /SYNC:ai-mistake-prevention --> <!-- SYNC:understand-code-first -->AI Mistake Prevention — Failure modes to avoid on every task:
- Check downstream references before deleting. Deleting components causes documentation and code staleness cascades. Map all referencing files before removal.
- Verify AI-generated content against actual code. AI hallucinates APIs, class names, and method signatures. Always grep to confirm existence before documenting or referencing.
- Trace full dependency chain after edits. Changing a definition misses downstream variables and consumers derived from it. Always trace the full chain.
- Trace ALL code paths when verifying correctness. Confirming code exists is not confirming it executes. Always trace early exits, error branches, and conditional skips — not just happy path.
- When debugging, ask "whose responsibility?" before fixing. Trace whether bug is in caller (wrong data) or callee (wrong handling). Fix at responsible layer — never patch symptom site.
- Assume existing values are intentional — ask WHY before changing. Before changing any constant, limit, flag, or pattern: read comments, check git blame, examine surrounding code.
- Verify ALL affected outputs, not just the first. Changes touching multiple stacks require verifying EVERY output. One green check is not all green checks.
- Holistic-first debugging — resist nearest-attention trap. When investigating any failure, list EVERY precondition first (config, env vars, DB names, endpoints, DI registrations, data preconditions), then verify each against evidence before forming any code-layer hypothesis.
- Surgical changes — apply the diff test. Bug fix: every changed line must trace directly to the bug. Don't restyle or improve adjacent code. Enhancement task: implement improvements AND announce them explicitly.
- Surface ambiguity before coding — don't pick silently. If request has multiple interpretations, present each with effort estimate and ask. Never assume all-records, file-based, or more complex path.
<!-- /SYNC:understand-code-first --> <!-- SYNC:evidence-based-reasoning -->Understand Code First — HARD-GATE: Do NOT write, plan, or fix until you READ existing code.
- Search 3+ similar patterns (`glob`/grep) — cite `file:line` evidence
- Read existing files in target area — understand structure, base classes, conventions
- Run `python .claude/scripts/code_graph trace <file> --direction both --json` when `.code-graph/graph.db` exists
- Map dependencies via `connections` or `callers_of` — know what depends on your target
- Write investigation to `.ai/workspace/analysis/` for non-trivial tasks (3+ files)
- Re-read analysis file before implementing — never work from memory alone
- NEVER invent new patterns when existing ones work — match exactly or document deviation
BLOCKED until:
- [ ] Read target files
- [ ] Grep 3+ patterns
- [ ] Graph trace (if graph.db exists)
- [ ] Assumptions verified with evidence
<!-- /SYNC:evidence-based-reasoning -->Evidence-Based Reasoning — Speculation is FORBIDDEN. Every claim needs proof.
- Cite `file:line`, grep results, or framework docs for EVERY claim
- Declare confidence: >80% act freely, 60-80% verify first, <60% DO NOT recommend
- Cross-service validation required for architectural changes
- "I don't have enough evidence" is valid and expected output
BLOCKED until:
- [ ] Evidence file path (`file:line`)
- [ ] Grep search performed
- [ ] 3+ similar patterns found
- [ ] Confidence level stated
Forbidden without proof: "obviously", "I think", "should be", "probably", "this is because"
If incomplete → output:
"Insufficient evidence. Verified: [...]. Not verified: [...]."
`docs/project-reference/domain-entities-reference.md` — Domain entity catalog, relationships, cross-service sync (read when task involves business entities/models) (content auto-injected by hook — check for [Injected: ...] header before reading)
<!-- /SYNC:estimation-framework --> <!-- SYNC:red-flag-stop-conditions -->Estimation — Modified Fibonacci: 1(trivial) → 2(small) → 3(medium) → 5(large) → 8(very large) → 13(epic, SHOULD split) → 21(MUST ATTENTION split). Output `story_points` and `complexity` in plan frontmatter. Complexity auto-derived: 1-2=Low, 3-5=Medium, 8=High, 13+=Critical.
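The estimation rule above can be sketched as a small lookup. This is an illustrative reading of the scale, not code from the skill: the `Estimate` type and `estimate` function are hypothetical names, and only the point values and complexity bands come from the text.

```python
from dataclasses import dataclass

# Modified Fibonacci scale quoted in the Estimation rule.
ALLOWED_POINTS = (1, 2, 3, 5, 8, 13, 21)


@dataclass(frozen=True)
class Estimate:
    """Hypothetical container for the two frontmatter fields plus split advice."""
    story_points: int
    complexity: str
    split: str  # "no", "should", or "must"


def estimate(story_points: int) -> Estimate:
    """Derive complexity and split advice from a story-point value."""
    if story_points not in ALLOWED_POINTS:
        raise ValueError(f"{story_points} is not on the scale {ALLOWED_POINTS}")
    if story_points <= 2:
        complexity = "Low"
    elif story_points <= 5:
        complexity = "Medium"
    elif story_points == 8:
        complexity = "High"
    else:
        complexity = "Critical"
    # 13 SHOULD be split, 21 MUST be split; everything else stands alone.
    split = {13: "should", 21: "must"}.get(story_points, "no")
    return Estimate(story_points, complexity, split)
```

Rejecting off-scale values (for example 4 or 10) keeps estimates comparable across plans; whether the real tooling enforces that is not stated in the source.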
<!-- /SYNC:red-flag-stop-conditions --> <!-- SYNC:fix-layer-accountability -->Red Flag Stop Conditions — STOP and escalate to user via AskUserQuestion when:
- Confidence drops below 60% on any critical decision
- Changes would affect >20 files (blast radius too large)
- Cross-service boundary is being crossed
- Security-sensitive code (auth, crypto, PII handling)
- Breaking change detected (interface, API contract, DB schema)
- Test coverage would decrease after changes
- Approach requires technology/pattern not in the project
NEVER proceed past a red flag without explicit user approval.
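As a rough sketch, the stop conditions above could be checked mechanically before proceeding. The `ChangeAssessment` fields and the `red_flags` function are hypothetical names chosen for illustration; only the thresholds (60% confidence, 20 files) and the list of conditions come from the text.

```python
from dataclasses import dataclass


@dataclass
class ChangeAssessment:
    """Hypothetical summary of a planned change."""
    confidence: int                      # 0-100, for the critical decision
    files_affected: int
    crosses_service_boundary: bool = False
    security_sensitive: bool = False     # auth, crypto, PII handling
    breaking_change: bool = False        # interface, API contract, DB schema
    coverage_decreases: bool = False
    needs_new_technology: bool = False


def red_flags(a: ChangeAssessment) -> list[str]:
    """Return every triggered stop condition, in the order the text lists them."""
    checks = [
        (a.confidence < 60, "confidence below 60%"),
        (a.files_affected > 20, "more than 20 files affected"),
        (a.crosses_service_boundary, "cross-service boundary crossed"),
        (a.security_sensitive, "security-sensitive code"),
        (a.breaking_change, "breaking change detected"),
        (a.coverage_decreases, "test coverage would decrease"),
        (a.needs_new_technology, "technology/pattern not in the project"),
    ]
    return [label for triggered, label in checks if triggered]
```

A non-empty result means stop and ask the user; the function only reports, it never decides to continue.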
<!-- /SYNC:fix-layer-accountability -->Fix-Layer Accountability — NEVER fix at the crash site. Trace the full flow, fix at the owning layer.
AI default behavior: see error at Place A → fix Place A. This is WRONG. The crash site is a SYMPTOM, not the cause.
MANDATORY before ANY fix:
- Trace full data flow — Map the complete path from data origin to crash site across ALL layers (storage → backend → API → frontend → UI). Identify where the bad state ENTERS, not where it CRASHES.
- Identify the invariant owner — Which layer's contract guarantees this value is valid? That layer is responsible. Fix at the LOWEST layer that owns the invariant — not the highest layer that consumes it.
- One fix, maximum protection — Ask: "If I fix here, does it protect ALL downstream consumers with ONE change?" If fix requires touching 3+ files with defensive checks, you are at the wrong layer — go lower.
- Verify no bypass paths — Confirm all data flows through the fix point. Check for: direct construction skipping factories, clone/spread without re-validation, raw data not wrapped in domain models, mutations outside the model layer.
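The four steps above can be illustrated with a minimal sketch in which one domain model owns an invariant, so consumers need no defensive checks. `Email`, `send_welcome`, and `domain_of` are hypothetical names invented for this example; they are not part of EasyPlatform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    """Hypothetical domain model that owns the 'email is valid and normalised' invariant."""
    value: str

    def __post_init__(self) -> None:
        cleaned = self.value.strip().lower()
        if "@" not in cleaned:
            raise ValueError(f"invalid email: {self.value!r}")
        # The dataclass is frozen, so write the normalised value once at construction.
        object.__setattr__(self, "value", cleaned)


def send_welcome(email: Email) -> str:
    # Consumer 1: no guard needed, an Email instance is always valid.
    return f"Welcome sent to {email.value}"


def domain_of(email: Email) -> str:
    # Consumer 2: relies on the same single fix point.
    return email.value.split("@", 1)[1]
```

Because every construction path runs `__post_init__`, the check sits at one authoritative layer. Scattering `if "@" in ...` guards across `send_welcome` and `domain_of` would be the "defensive checks at every consumer" pattern the text rejects.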
BLOCKED until:
- [ ] Full data flow traced (origin → crash)
- [ ] Invariant owner identified with `file:line` evidence
- [ ] All access sites audited (grep count)
- [ ] Fix layer justified (lowest layer that protects most consumers)
Anti-patterns (REJECT these):
- "Fix it where it crashes" — Crash site ≠ cause site. Trace upstream.
- "Add defensive checks at every consumer" — Scattered defense = wrong layer. One authoritative fix > many scattered guards.
- "Fixing both is safer" — Pick ONE authoritative layer. Redundant checks across layers send mixed signals about who owns the invariant.
Quick Summary
Goal: Investigate and identify root cause of a bug with evidence.
Workflow:
- Reproduce — Understand expected vs actual behavior
- Hypothesize — Form theories about root cause
- Trace — Follow code paths with file:line evidence
- Confirm — Verify root cause with grep/read evidence
- Report — Output root cause with confidence level
Key Rules:
- Debug Mindset: every claim needs file:line proof
- [ROOT-CAUSE-FIX] Never patch symptoms. Trace full call chain to find WHO is responsible. Fix at correct layer.
- Never assume first hypothesis is correct
- Output: confirmed root cause OR "hypothesis, not confirmed" with evidence gaps
- This is investigation-only — hand off to /fix for implementation
<!-- /SYNC:root-cause-debugging --> <!-- SYNC:incremental-persistence -->Root Cause Debugging — Systematic approach, never guess-and-check.
- Reproduce — Confirm the issue exists with evidence (error message, stack trace, screenshot)
- Isolate — Narrow to specific file/function/line using binary search + graph trace
- Trace — Follow data flow from input to failure point. Read actual code, don't infer.
- Hypothesize — Form theory with confidence %. State what evidence supports/contradicts it
- Verify — Test hypothesis with targeted grep/read. One variable at a time.
- Fix — Address root cause, not symptoms. Verify fix doesn't break callers via graph `connections`
NEVER: Guess without evidence. Fix symptoms instead of cause. Skip reproduction step.
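The "Isolate" step mentions binary search. A minimal sketch of that idea follows, assuming an ordered list of candidates (for example commits) where every item after the first failing one also fails. `first_bad` and the `is_bad` predicate are hypothetical names for illustration.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(items: Sequence[T], is_bad: Callable[[T], bool]) -> T:
    """Return the earliest item that reproduces the failure.

    Assumes items are ordered so that once one item is bad, all later ones are too,
    and that the last item is known to be bad.
    """
    if not items or not is_bad(items[-1]):
        raise ValueError("the last item must reproduce the failure")
    lo, hi = 0, len(items) - 1  # invariant: items[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(items[mid]):
            hi = mid          # the first bad item is at mid or earlier
        else:
            lo = mid + 1      # the first bad item is after mid
    return items[lo]
```

Each probe changes one variable (the candidate under test), which matches the "one variable at a time" rule in the Verify step.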
<!-- /SYNC:incremental-persistence --> <!-- SYNC:subagent-return-contract -->Incremental Result Persistence — MANDATORY for all sub-agents or heavy inline steps processing >3 files.
- Before starting: Create report file `plans/reports/{skill}-{date}-{slug}.md`
- After each file/section reviewed: Append findings to report immediately — never hold in memory
- Return to main agent: Summary only (per SYNC:subagent-return-contract) with `Full report:` path
- Main agent: Reads report file only when resolving specific blockers
Why: Context cutoff mid-execution loses ALL in-memory findings. Each disk write survives compaction. Partial results are better than no results.
Report naming:
plans/reports/{skill-name}-{YYMMDD}-{HHmm}-{slug}.md
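A minimal sketch of the persistence rule, assuming the report lives under a repository root and that each finding is written as one Markdown bullet. The helper names (`slugify`, `report_path`, `append_finding`) are hypothetical; only the naming pattern comes from the text.

```python
import re
from datetime import datetime
from pathlib import Path


def slugify(text: str) -> str:
    """Lowercase the text and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def report_path(root: Path, skill: str, title: str, now: datetime) -> Path:
    """Build plans/reports/{skill-name}-{YYMMDD}-{HHmm}-{slug}.md under root."""
    stamp = now.strftime("%y%m%d-%H%M")
    return root / "plans" / "reports" / f"{skill}-{stamp}-{slugify(title)}.md"


def append_finding(path: Path, finding: str) -> None:
    """Append one finding to the report; the file is closed (and flushed) on each call."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"- {finding}\n")
```

Opening in append mode for every finding is deliberately simple: a context cutoff between two calls loses nothing that was already written.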
<!-- /SYNC:subagent-return-contract -->Sub-Agent Return Contract — When this skill spawns a sub-agent, the sub-agent MUST return ONLY this structure. Main agent reads only this summary — NEVER requests full sub-agent output inline.
## Sub-Agent Result: [skill-name]
Status: ✅ PASS | ⚠️ PARTIAL | ❌ FAIL
Confidence: [0-100]%
### Findings (Critical/High only — max 10 bullets)
- [severity] [file:line] [finding]
### Actions Taken
- [file changed] [what changed]
### Blockers (if any)
- [blocker description]
Full report: plans/reports/[skill-name]-[date]-[slug].md
Main agent reads `Full report` file ONLY when: (a) resolving a specific blocker, or (b) building a fix plan. Sub-agent writes full report incrementally (per SYNC:incremental-persistence) — not held in memory.
Debug Mindset (NON-NEGOTIABLE)
Be skeptical. Apply critical thinking and sequential thinking. Every claim needs traced proof and a confidence percentage (ideally above 80%).
- Do NOT assume the first hypothesis is correct — verify with actual code traces
- Every root cause claim must include `file:line` evidence
- If you cannot prove a root cause with a code trace, state "hypothesis, not confirmed"
- Question assumptions: "Is this really the cause?" → trace the actual execution path
- Challenge completeness: "Are there other contributing factors?" → check related code paths
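The rules above lend themselves to a simple self-check on a written claim. The sketch below is an assumption-laden illustration: `classify_claim` is a hypothetical helper, the regular expression is only one plausible shape for a `file:line` citation, and the phrase list is the "forbidden without proof" list from the Evidence-Based Reasoning rules.

```python
import re

# One plausible citation shape: a path with an extension, a colon, a line number,
# e.g. "src/orders/service.py:42". Real citations may look different.
CITATION = re.compile(r"[\w./\\-]+\.\w+:\d+")

# Phrases the Evidence-Based Reasoning rules forbid without proof.
FORBIDDEN = ("obviously", "i think", "should be", "probably", "this is because")


def classify_claim(claim: str) -> str:
    """Label a claim as evidenced, speculation, or an unconfirmed hypothesis."""
    if CITATION.search(claim):
        return "evidenced"
    lowered = claim.lower()
    if any(phrase in lowered for phrase in FORBIDDEN):
        return "speculation"
    return "hypothesis, not confirmed"
```

A citation being present does not make the claim true; it only shows there is something concrete to verify.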
Confidence & Evidence Gate
MANDATORY IMPORTANT MUST ATTENTION declare `Confidence: X%` with evidence list + `file:line` proof for EVERY claim.
| Confidence | Meaning | Action |
|---|---|---|
| 95-100% | Full trace verified | Report as confirmed root cause |
| 80-94% | Main path verified, edge cases uncertain | Report with caveats |
| 60-79% | Partial trace | Report as hypothesis |
| <60% | Insufficient evidence | DO NOT report — gather more evidence |
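The table above maps directly onto a threshold function. This is a sketch of that mapping only; `report_action` is a hypothetical name, and the returned strings paraphrase the Action column.

```python
def report_action(confidence: int) -> str:
    """Map a confidence percentage to the reporting action from the table."""
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if confidence >= 95:
        return "Report as confirmed root cause"
    if confidence >= 80:
        return "Report with caveats"
    if confidence >= 60:
        return "Report as hypothesis"
    return "DO NOT report: gather more evidence"
```

For instance, `report_action(79)` falls in the 60-79% band, so the finding is labelled a hypothesis even if the main path looks convincing.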
Workflow Details
Step 1: Reproduce
- Clarify expected vs actual behavior
- Identify trigger conditions (user action, data state, timing)
Step 2: Hypothesize
- Form 2-3 theories about root cause
- Rank by likelihood based on symptoms
Step 3: Trace
- For each hypothesis, trace the code path:
  - Find entry point (API, UI, job, event)
  - Follow through handlers/services
  - Check data transformations and state changes
  - Verify error handling paths
- Use grep/read to collect `file:line` evidence
Step 4: Confirm
- Match evidence to a single root cause
- Verify the root cause explains ALL symptoms
- Check for secondary contributing factors
Dependency Tracing (MANDATORY — DO NOT SKIP when graph.db exists)
If `.code-graph/graph.db` exists, you MUST ATTENTION use structural queries to trace dependencies:
Graph reveals ALL callers and consumers of buggy code — grep alone misses structural relationships.
- Who calls the buggy function: `python .claude/scripts/code_graph query callers_of <function> --json`
- Who imports the buggy module: `python .claude/scripts/code_graph query importers_of <file> --json`
- What tests exist: `python .claude/scripts/code_graph query tests_for <function> --json`
- What does this function call: `python .claude/scripts/code_graph query callees_of <function> --json`
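The queries above could be wrapped in a small helper that skips the graph step when `graph.db` is absent. This is a sketch under assumptions: the command line is copied from the list above, but the helper names are hypothetical and the idea that stdout is a single JSON document is inferred from the `--json` flag, not confirmed by the source.

```python
import json
import subprocess
from pathlib import Path

# Query kinds named in the Dependency Tracing list.
QUERIES = ("callers_of", "importers_of", "tests_for", "callees_of")


def graph_available(repo_root: Path) -> bool:
    """True when .code-graph/graph.db exists under the repository root."""
    return (repo_root / ".code-graph" / "graph.db").is_file()


def build_query(kind: str, target: str) -> list[str]:
    """Build the argument list for one structural query."""
    if kind not in QUERIES:
        raise ValueError(f"unknown query kind: {kind}")
    return ["python", ".claude/scripts/code_graph", "query", kind, target, "--json"]


def run_query(repo_root: Path, kind: str, target: str):
    """Run a query from the repo root; returns None when graph.db is absent.

    Assumption: the script prints one JSON document to stdout.
    """
    if not graph_available(repo_root):
        return None
    done = subprocess.run(
        build_query(kind, target),
        cwd=repo_root, capture_output=True, text=True, check=True,
    )
    return json.loads(done.stdout)
```

Passing the command as a list avoids shell quoting issues when a function or file name contains spaces or special characters.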
Graph-Assisted Debugging
After identifying suspect files, use graph trace to understand the full context:
- `python .claude/scripts/code_graph trace <suspect-file> --direction both --json` — see what calls this code AND what it triggers downstream
- `python .claude/scripts/code_graph trace <suspect-file> --direction upstream --json` — find all callers that could trigger the bug
- This reveals implicit connections (MESSAGE_BUS, event handlers) that may propagate the issue across services
Step 5: Report
- Output: confirmed root cause with evidence chain
- Include: affected files, data flow, fix recommendation
- Hand off to `/fix` for implementation
⚠️ MANDATORY: Post-Fix Verification
After `/fix` applies changes, `/prove-fix` MUST ATTENTION be run. It builds code proof traces per change with confidence scores. This is non-negotiable in all fix workflows.
Red Flags — STOP (Debugging-Specific)
If you're thinking:
- "I see the problem, let me fix it" — Seeing symptoms is not understanding root cause. Investigate first.
- "Quick fix for now, investigate later" — Quick fixes mask bugs and create debt. Find root cause.
- "Just try changing X and see" — One hypothesis at a time. Scientific method, not trial and error.
- "Already tried 2+ fixes, one more" — 3+ failed fixes = STOP. Question the architecture, not the fix.
- "The error message is misleading" — Read it again carefully. Error messages are usually right.
- "It works on my machine" — Reproduce in the failing environment. Your environment hides bugs.
- "This can't be the cause" — Verify with evidence, not intuition. Unlikely causes are still causes.
- "It's OOM, must be a large object" — For memory exhaustion, check row count BEFORE row size. An unbounded query loading thousands of records is the more common cause. Triage: (1) Is there a missing DB-level filter for the triggering condition? (2) Is each row excessively large?
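The memory-exhaustion triage in the last bullet (row count before row size) can be sketched with an in-memory SQLite table. Everything here is illustrative: the `orders` schema, the `MAX_ROWS` threshold, and the function names are hypothetical and not taken from EasyPlatform.

```python
import sqlite3

MAX_ROWS = 1000  # illustrative threshold, not taken from the source


def load_orders(conn: sqlite3.Connection, customer_id: int) -> list[tuple]:
    """Load one customer's orders, counting matching rows before fetching them."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    if count > MAX_ROWS:
        # Triage question 1: is a DB-level filter missing or too broad?
        raise RuntimeError(f"{count} rows match; narrow the filter or paginate")
    return conn.execute(
        "SELECT id, total FROM orders WHERE customer_id = ?", (customer_id,)
    ).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Create a tiny in-memory table for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(1, 9.5), (1, 20.0), (2, 5.0)],
    )
    return conn
```

The filter lives in the `WHERE` clause, so the database never hands back rows the caller would discard. Loading the whole table and filtering in application code is the unbounded-query pattern the bullet warns about.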
Workflow Recommendation
MANDATORY IMPORTANT MUST ATTENTION — NO EXCEPTIONS: If you are NOT already in a workflow, you MUST ATTENTION use `AskUserQuestion` to ask the user. Do NOT judge task complexity or decide this is "simple enough to skip" — the user decides whether to use a workflow, not you:
- Activate `bugfix` workflow (Recommended) — scout → investigate → debug → plan → fix → prove-fix → review → test
- Execute `/debug` directly — run this skill standalone
Next Steps (Standalone: MUST ATTENTION ask user via `AskUserQuestion`. Skip if inside workflow.)
MANDATORY IMPORTANT MUST ATTENTION — NO EXCEPTIONS after completing this skill, you MUST ATTENTION use `AskUserQuestion` to present these options. Do NOT skip because the task seems "simple" or "obvious" — the user decides:
- "Proceed with full workflow (Recommended)" — I'll detect the best workflow to continue from here (debug complete, root cause identified). This ensures fix, verification, review, and testing steps aren't skipped.
- "/fix" — Apply fix based on debug findings
- "/plan" — If fix requires planning
- "Skip, continue manually" — user decides
Standalone Review Gate (Non-Workflow Only)
MANDATORY IMPORTANT MUST ATTENTION: If this skill is called outside a workflow (standalone `/debug`), you MUST ATTENTION create a `TaskCreate` todo task for `/review-changes` as the last task in your task list. This ensures all changes are reviewed before commit even without a workflow enforcing it.
If already running inside a workflow (e.g., `bugfix`), skip this — the workflow sequence handles `/review-changes` at the appropriate step.
Closing Reminders
MANDATORY IMPORTANT MUST ATTENTION break work into small todo tasks using `TaskCreate` BEFORE starting.
MANDATORY IMPORTANT MUST ATTENTION validate decisions with user via AskUserQuestion — never auto-decide.
MANDATORY IMPORTANT MUST ATTENTION add a final review todo task to verify work quality.
MANDATORY IMPORTANT MUST ATTENTION READ the following files before starting:
<!-- SYNC:understand-code-first:reminder -->
- MANDATORY IMPORTANT MUST ATTENTION search 3+ existing patterns and read code BEFORE any modification. Run graph trace when graph.db exists.
<!-- /SYNC:understand-code-first:reminder -->
<!-- SYNC:evidence-based-reasoning:reminder -->
- MANDATORY IMPORTANT MUST ATTENTION cite `file:line` evidence for every claim. Confidence >80% to act, <60% = do NOT recommend.
<!-- /SYNC:evidence-based-reasoning:reminder -->
<!-- SYNC:estimation-framework:reminder -->
- MANDATORY IMPORTANT MUST ATTENTION include `story_points` and `complexity` in plan frontmatter. SP > 8 = split.
<!-- /SYNC:estimation-framework:reminder -->
<!-- SYNC:red-flag-stop-conditions:reminder -->
- MANDATORY IMPORTANT MUST ATTENTION STOP after 3 failed fix attempts. Report all attempts, ask user before continuing.
<!-- /SYNC:red-flag-stop-conditions:reminder -->
<!-- SYNC:fix-layer-accountability:reminder -->
- MANDATORY IMPORTANT MUST ATTENTION trace full data flow and fix at the owning layer, not the crash site. Audit all access sites before adding `?.`.
<!-- /SYNC:fix-layer-accountability:reminder -->
<!-- SYNC:critical-thinking-mindset:reminder -->
- MUST ATTENTION apply critical thinking — every claim needs traced proof, confidence >80% to act. Anti-hallucination: never present guess as fact.
<!-- /SYNC:critical-thinking-mindset:reminder -->
<!-- SYNC:ai-mistake-prevention:reminder -->
- MUST ATTENTION apply AI mistake prevention — holistic-first debugging, fix at responsible layer, surface ambiguity before coding, re-read files after compaction.
<!-- /SYNC:ai-mistake-prevention:reminder -->