# Systematic Debugging

From the [openclaw/skills](https://github.com/openclaw/skills) repository, at `skills/ashish797/founderclaw/investigate/SKILL.md`. To install:

```shell
# Claude Code
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/ashish797/founderclaw/investigate" ~/.claude/skills/openclaw-skills-investigate && rm -rf "$T"

# OpenClaw
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/ashish797/founderclaw/investigate" ~/.openclaw/skills/openclaw-skills-investigate && rm -rf "$T"
```

Or browse the whole repository: `git clone https://github.com/openclaw/skills`
## Iron Law

**NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST.**
Fixing symptoms creates whack-a-mole debugging. Every fix that doesn't address root cause makes the next bug harder to find. Find the root cause, then fix it.
## Phase 1: Root Cause Investigation

Gather context before forming any hypothesis.

1. **Collect symptoms:** Read the error messages, stack traces, and reproduction steps. If the user hasn't provided enough context, ask one question at a time.

2. **Read the code:** Trace the code path from the symptom back to potential causes. Use `grep` to find all references; read to understand the logic.

3. **Check recent changes:**

   ```shell
   git log --oneline -20 -- <affected-files>
   ```

   Was this working before? What changed? A regression means the root cause is in the diff.

4. **Reproduce:** Can you trigger the bug deterministically? If not, gather more evidence before proceeding.
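The reproduce step can be sketched as a quick determinism check. Here `repro` is a hypothetical stand-in for the real reproduction command (it always fails, i.e. a fully deterministic bug); swap in your own:

```shell
# Stand-in for the real reproduction command (hypothetical).
# Exit status 0 = bug did not trigger, nonzero = bug triggered.
repro() { return 1; }

fails=0; runs=20
for i in $(seq 1 "$runs"); do
  repro || fails=$((fails + 1))
done
echo "failed $fails/$runs runs"
# 20/20 -> deterministic; anything in between -> suspect timing or shared state
```

A run count of 20 is arbitrary; the point is that an intermittent failure rate is itself evidence, and it points toward the race-condition row in the pattern table.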
**Output:** "Root cause hypothesis: ...", a specific, testable claim about what is wrong and why.
## Phase 2: Pattern Analysis
Check if this bug matches a known pattern:
| Pattern | Signature | Where to look |
|---|---|---|
| Race condition | Intermittent, timing-dependent | Concurrent access to shared state |
| Nil/null propagation | NoMethodError, TypeError | Missing guards on optional values |
| State corruption | Inconsistent data, partial updates | Transactions, callbacks, hooks |
| Integration failure | Timeout, unexpected response | External API calls, service boundaries |
| Configuration drift | Works locally, fails in staging/prod | Env vars, feature flags, DB state |
| Stale cache | Shows old data, fixes on cache clear | Redis, CDN, browser cache |
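The race-condition row can be seen in miniature in plain shell: two unsynchronized writers doing read-modify-write on a shared file lose updates. This is a self-contained demo, not part of the skill's workflow:

```shell
# Two writers each increment a shared counter 200 times with no locking.
f=$(mktemp); echo 0 > "$f"
inc() {
  for i in $(seq 1 200); do
    n=$(cat "$f")        # read
    echo $((n + 1)) > "$f"  # modify-write: another writer may interleave here
  done
}
inc & inc & wait
final=$(cat "$f"); rm -f "$f"
echo "final count: $final (expected 400; anything less is lost updates)"
```

The intermittent, timing-dependent shortfall is exactly the signature listed in the table.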
Also check:
- Project TODOs for related known issues
- `git log` for prior fixes in the same area; recurring bugs in the same files are an architectural smell, not a coincidence
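Checking history for prior fixes in the same area can be sketched as below. The file name and commit messages are hypothetical, and a throwaway repo is built so the snippet runs end to end; real usage is just the `git log` line:

```shell
# Throwaway repo with two prior "fix:" commits touching the same file.
R=$(mktemp -d); cd "$R"; git init -q
g() { git -c user.email=dev@example.com -c user.name=dev "$@"; }
echo v1 > invoice.sh; g add invoice.sh; g commit -q -m "fix: invoice rounding error"
echo v2 > invoice.sh; g add invoice.sh; g commit -q -m "fix: invoice rounding, again"

# The actual check: prior fixes scoped to the affected file.
prior=$(git log --oneline --grep='fix:' -- invoice.sh)
echo "$prior"
cd /; rm -rf "$R"
```

Two or more fixes landing in the same file is the "architectural smell" the list item warns about.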
External pattern search: If the bug doesn't match a known pattern, search for:
- "{framework} {generic error type}" — sanitize first: strip hostnames, IPs, file paths, SQL, customer data. Search the error category, not the raw message.
- "{library} {component} known issues"
If a documented solution or known dependency bug surfaces, present it as a candidate hypothesis in Phase 3.
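Sanitizing an error before searching for it externally can be sketched with `sed`. The error message and the patterns are illustrative only; real sanitization must cover whatever identifiers your logs actually contain:

```shell
err='PG::ConnectionBad: could not connect to server at 10.2.3.4 (db-prod-1.internal), /app/config/database.yml'
clean=$(echo "$err" \
  | sed -E 's/[0-9]{1,3}(\.[0-9]{1,3}){3}/<ip>/g' \
  | sed -E 's/[A-Za-z0-9.-]+\.internal/<host>/g' \
  | sed -E 's|(/[A-Za-z0-9._-]+)+|<path>|g')
echo "$clean"
# The stripped form keeps the error category, which is what you search for.
```

What remains ("PG::ConnectionBad", connection refused to a server) is the generic error type plus framework context, with hosts, IPs, and paths gone.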
## Phase 3: Hypothesis Testing

Before writing ANY fix, verify your hypothesis.

1. **Confirm the hypothesis:** Add a temporary log statement, assertion, or debug output at the suspected root cause. Run the reproduction. Does the evidence match?

2. **If the hypothesis is wrong:** Before forming the next hypothesis, search for the error. Sanitize first: strip hostnames, IPs, file paths, SQL fragments, customer identifiers. Search only the generic error type and framework context. Then return to Phase 1. Gather more evidence. Do not guess.

3. **3-strike rule:** If 3 hypotheses fail, STOP. Tell the user:

   > 3 hypotheses tested, none match. This may be an architectural issue rather than a simple bug. Options:
   > A) Continue investigating: I have a new hypothesis: [describe]
   > B) Escalate for human review: this needs someone who knows the system
   > C) Add logging and wait: instrument the area and catch it next time
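Confirming a hypothesis with temporary instrumentation (the first step above) might look like this in miniature. `calc_total` and the negative-total hypothesis are hypothetical:

```shell
# Hypothesis (illustrative): total goes negative when a discount exceeds the subtotal.
calc_total() {
  subtotal=$1; discount=$2
  total=$((subtotal - discount))
  # Temporary debug output at the suspected root cause; remove once confirmed.
  if [ "$total" -lt 0 ]; then
    echo "DEBUG: negative total (subtotal=$subtotal, discount=$discount)" >&2
  fi
  echo "$total"
}

# Run the reproduction: the DEBUG line appearing is evidence for the hypothesis.
calc_total 100 150
```

If the DEBUG line never fires during the reproduction, the hypothesis is wrong; return to Phase 1 rather than patching on a guess.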
Red flags — if you see any of these, slow down:
- "Quick fix for now" — there is no "for now." Fix it right or escalate.
- Proposing a fix before tracing data flow — you're guessing.
- Each fix reveals a new problem elsewhere — wrong layer, not wrong code.
## Phase 4: Implementation

Once root cause is confirmed:

1. **Fix the root cause, not the symptom.** Make the smallest change that eliminates the actual problem.

2. **Minimal diff:** Fewest files touched, fewest lines changed. Resist the urge to refactor adjacent code.

3. **Write a regression test** that:
   - Fails without the fix (proves the test is meaningful)
   - Passes with the fix (proves the fix works)

4. **Run the full test suite.** Paste the output. No regressions allowed.

5. **If the fix touches >5 files:** Flag the blast radius to the user:

   > This fix touches N files. That's a large blast radius for a bug fix.
   > A) Proceed: the root cause genuinely spans these files
   > B) Split: fix the critical path now, defer the rest
   > C) Rethink: maybe there's a more targeted approach
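The regression-test requirement, fails without the fix and passes with it, can be sketched in shell. Function names and the bug are hypothetical; the same test is run against the old and the fixed behavior:

```shell
buggy_total() { echo $(($1 - $2)); }                                 # old: can go negative
fixed_total() { t=$(($1 - $2)); [ "$t" -lt 0 ] && t=0; echo "$t"; }  # fix: clamp at zero

# Regression test: totals must never be negative. $1 is the implementation under test.
test_total_never_negative() {
  result=$("$1" 100 150)
  if [ "$result" -ge 0 ]; then echo "PASS $1"; else echo "FAIL $1 (got $result)"; fi
}

test_total_never_negative buggy_total   # FAIL buggy_total (got -50)
test_total_never_negative fixed_total   # PASS fixed_total
```

Running the test against both versions is what proves the test is meaningful: a test that also passes on the buggy code proves nothing.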
## Phase 5: Verification & Report

**Fresh verification:** Reproduce the original bug scenario and confirm it's fixed. This is not optional.
Run the test suite and paste the output.
Output a structured debug report:
```
DEBUG REPORT
════════════════════════════════════════
Symptom: [what the user observed]
Root cause: [what was actually wrong]
Fix: [what was changed, with file:line references]
Evidence: [test output, reproduction attempt showing fix works]
Regression test: [file:line of the new test]
Related: [prior bugs in same area, architectural notes]
Status: DONE | DONE_WITH_CONCERNS | BLOCKED
════════════════════════════════════════
```
## Important Rules
- 3+ failed fix attempts → STOP and question the architecture. At that point the problem is the architecture, not a single failed hypothesis.
- Never apply a fix you cannot verify. If you can't reproduce and confirm, don't ship it.
- Never say "this should fix it." Verify and prove it. Run the tests.
- If a fix touches >5 files → ask the user about blast radius before proceeding.
- Completion status:
  - `DONE`: root cause found, fix applied, regression test written, all tests pass
  - `DONE_WITH_CONCERNS`: fix works but an architectural concern remains
  - `BLOCKED`: cannot determine root cause; needs human expertise
  - `NEEDS_CONTEXT`: missing reproduction steps or access