```shell
git clone https://github.com/pjt222/agent-almanac
T=$(mktemp -d) && git clone --depth=1 https://github.com/pjt222/agent-almanac "$T" && mkdir -p ~/.claude/skills && cp -r "$T/i18n/caveman-ultra/skills/investigate-capa-root-cause" ~/.claude/skills/pjt222-agent-almanac-investigate-capa-root-cause-0ddb3d && rm -rf "$T"
```
`i18n/caveman-ultra/skills/investigate-capa-root-cause/SKILL.md`

# Investigate CAPA Root Cause
Conduct a structured root cause investigation and develop effective corrective and preventive actions for compliance deviations.
## When to Use
- An audit finding requires a CAPA
- A deviation or incident occurred in a validated system
- A regulatory inspection observation needs a formal response
- A data integrity anomaly requires investigation
- Recurring issues suggest a systemic root cause
## Inputs
- Required: Description of the deviation, finding, or incident
- Required: Severity classification (critical, major, minor)
- Required: Evidence collected during the audit or investigation
- Optional: Previous related CAPAs or investigations
- Optional: Relevant SOPs, validation documents, and system logs
- Optional: Interview notes from involved personnel
## Procedure

### Step 1: Initiate the Investigation
```markdown
# Root Cause Investigation

## Document ID: RCA-[CAPA-ID]
## CAPA Reference: CAPA-[YYYY]-[NNN]

### 1. Trigger

| Field | Value |
|-------|-------|
| Source | [Audit finding / Deviation / Inspection observation / Monitoring alert] |
| Reference | [Finding ID, deviation ID, or observation number] |
| System | [Affected system name and version] |
| Date discovered | [YYYY-MM-DD] |
| Severity | [Critical / Major / Minor] |
| Investigator | [Name, Title] |
| Investigation deadline | [Date — per severity: Critical 15 days, Major 30 days, Minor 60 days] |

### 2. Problem Statement

[Objective, factual description of what happened, what should have happened, and the gap between the two. No blame, no assumptions.]

### 3. Immediate Containment (if required)

| Action | Owner | Completed |
|--------|-------|-----------|
| [e.g., Restrict system access pending investigation] | [Name] | [Date] |
| [e.g., Quarantine affected batch records] | [Name] | [Date] |
| [e.g., Implement manual workaround] | [Name] | [Date] |
```
**Expected:** Investigation initiated with clear problem statement and containment actions within 24 hours for critical findings.

**On failure:** If containment cannot be implemented immediately, escalate to the QA Director and document the risk of delayed containment.
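The severity-based deadlines in the trigger table can be sketched as a small helper. This is a minimal illustration, not part of any CAPA system; the function name and severity keys are assumptions, while the day counts come from the table above (Critical 15, Major 30, Minor 60).

```python
from datetime import date, timedelta

# Day counts per severity, taken from the trigger-table guidance above.
DEADLINE_DAYS = {"critical": 15, "major": 30, "minor": 60}

def investigation_deadline(discovered: date, severity: str) -> date:
    """Return the investigation deadline for a finding of the given severity."""
    try:
        days = DEADLINE_DAYS[severity.lower()]
    except KeyError:
        raise ValueError(f"unknown severity: {severity!r}")
    return discovered + timedelta(days=days)

# A critical finding discovered on 2024-03-01 is due 15 days later.
print(investigation_deadline(date(2024, 3, 1), "Critical"))  # → 2024-03-16
```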
### Step 2: Select Investigation Method
Choose the method based on problem complexity:
```markdown
### Investigation Method Selection

| Method | Best For | Complexity | Output |
|--------|----------|------------|--------|
| **5-Why Analysis** | Single-cause problems, straightforward failures | Low | Linear cause chain |
| **Fishbone (Ishikawa)** | Multi-factor problems, process failures | Medium | Cause-and-effect diagram |
| **Fault Tree Analysis** | System failures, safety-critical events | High | Boolean logic tree |

**Selected method:** [5-Why / Fishbone / Fault Tree / Combination]
**Rationale:** [Why this method is appropriate for this problem]
```
**Expected:** Method selected matches the problem complexity — don't use a fault tree for a simple procedural error, and don't use 5-Why for a complex systemic failure.

**On failure:** If the first method does not reach a convincing root cause, apply a second method. Convergence across methods strengthens the conclusion.
### Step 3: Conduct Root Cause Analysis

#### Option A: 5-Why Analysis
```markdown
### 5-Why Analysis

| Level | Question | Answer | Evidence |
|-------|----------|--------|----------|
| Why 1 | Why did [the problem] occur? | [Immediate cause] | [Evidence reference] |
| Why 2 | Why did [immediate cause] occur? | [Contributing factor] | [Evidence reference] |
| Why 3 | Why did [contributing factor] occur? | [Deeper cause] | [Evidence reference] |
| Why 4 | Why did [deeper cause] occur? | [Systemic cause] | [Evidence reference] |
| Why 5 | Why did [systemic cause] occur? | [Root cause] | [Evidence reference] |

**Root cause:** [Clear statement of the fundamental cause]
```
#### Option B: Fishbone (Ishikawa) Diagram
```markdown
### Fishbone Analysis

Analyse causes across six standard categories:

| Category | Potential Causes | Confirmed? | Evidence |
|----------|-----------------|------------|----------|
| **People** | Inadequate training, unfamiliarity with SOP, staffing shortage | [Y/N] | [Ref] |
| **Process** | SOP unclear, missing step, wrong sequence | [Y/N] | [Ref] |
| **Technology** | System misconfiguration, software bug, interface failure | [Y/N] | [Ref] |
| **Materials** | Incorrect input data, wrong version of reference document | [Y/N] | [Ref] |
| **Measurement** | Wrong metric, inadequate monitoring, missed threshold | [Y/N] | [Ref] |
| **Environment** | Organisational change, regulatory change, resource constraints | [Y/N] | [Ref] |

**Contributing causes:** [List confirmed causes]
**Root cause(s):** [The fundamental cause(s) — may be more than one]
```
#### Option C: Fault Tree Analysis
```markdown
### Fault Tree Analysis

**Top event:** [The undesired event]

Level 1 (OR gate — any of these could cause the top event):
├── [Cause A]
│     Level 2 (AND gate — both needed):
│     ├── [Sub-cause A1]
│     └── [Sub-cause A2]
├── [Cause B]
│     Level 2 (OR gate):
│     ├── [Sub-cause B1]
│     └── [Sub-cause B2]
└── [Cause C]

**Minimal cut sets:** [Smallest combinations of events that cause the top event]
**Root cause(s):** [Fundamental failures identified in the tree]
```
**Expected:** Root cause analysis reaches the fundamental cause (not just the symptom) with supporting evidence for each step.

**On failure:** If the analysis produces only symptoms ("user made an error"), push deeper. Ask: "Why was the user able to make that error? What control should have prevented it?"
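For Option C, minimal cut sets can be derived mechanically from the gate structure. The sketch below expands the example tree (an OR top event, with Cause A as an AND of two sub-causes); the node encoding and function names are illustrative assumptions, not a standard fault-tree API.

```python
from itertools import product

# A tree node is either a leaf (a basic-event name) or a (gate, children) pair,
# where gate is "AND" or "OR". This encoding is an assumption for the sketch.

def cut_sets(node):
    """Return the cut sets (frozensets of basic events) that cause this node."""
    if isinstance(node, str):
        return [frozenset([node])]
    gate, children = node
    child_sets = [cut_sets(c) for c in children]
    if gate == "OR":   # any one child's cut set suffices
        return [s for sets in child_sets for s in sets]
    # AND: every child must occur — combine one cut set from each child
    return [frozenset().union(*combo) for combo in product(*child_sets)]

def minimal(sets):
    """Keep only minimal cut sets (drop proper supersets of another set)."""
    return [s for s in sets if not any(t < s for t in sets)]

# Top event = (A1 AND A2) OR (B1 OR B2) OR C, mirroring the template above.
tree = ("OR", [("AND", ["A1", "A2"]), ("OR", ["B1", "B2"]), "C"])
for cs in minimal(cut_sets(tree)):
    print(sorted(cs))  # four minimal cut sets: [A1, A2], [B1], [B2], [C]
```

Each minimal cut set is a candidate combination of fundamental failures; single-event cut sets (like C here) deserve the closest scrutiny.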
### Step 4: Design Corrective and Preventive Actions
Distinguish clearly between correction, corrective action, and preventive action:
```markdown
### CAPA Plan

| Category | Definition | Action | Owner | Deadline |
|----------|------------|--------|-------|----------|
| **Correction** | Fix the immediate problem | [e.g., Re-enable audit trail for batch module] | [Name] | [Date] |
| **Corrective Action** | Eliminate the root cause | [e.g., Remove admin ability to disable audit trail; require change control for all audit trail configuration changes] | [Name] | [Date] |
| **Preventive Action** | Prevent recurrence in other areas | [e.g., Audit all systems for audit trail disable capability; add monitoring alert for audit trail configuration changes] | [Name] | [Date] |

### CAPA Details

**CAPA-[YYYY]-[NNN]-CA1: [Corrective Action Title]**
- **Root cause addressed:** [Specific root cause from Step 3]
- **Action description:** [Detailed description of what will be done]
- **Success criteria:** [Measurable outcome that proves the action worked]
- **Verification method:** [How effectiveness will be checked]
- **Verification date:** [When effectiveness will be verified — typically 3-6 months after implementation]

**CAPA-[YYYY]-[NNN]-PA1: [Preventive Action Title]**
- **Risk addressed:** [What recurrence or spread this prevents]
- **Action description:** [Detailed description]
- **Success criteria:** [Measurable outcome]
- **Verification method:** [How effectiveness will be checked]
- **Verification date:** [Date]
```
**Expected:** Every CAPA action traces to a specific root cause, has measurable success criteria, and includes an effectiveness verification plan.

**On failure:** If success criteria are vague ("improve compliance"), rewrite them to be specific and measurable ("zero audit trail configuration changes outside change control for 6 consecutive months").
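The traceability requirement above can be sketched as a simple completeness check before an action is approved. The `CapaAction` record type and its field names are illustrative assumptions, not a standard CAPA schema.

```python
from dataclasses import dataclass, fields

@dataclass
class CapaAction:
    # Field names are assumptions mirroring the CAPA Details template above.
    title: str
    root_cause: str          # specific root cause from Step 3
    description: str
    success_criteria: str    # must be measurable, not "improve compliance"
    verification_method: str
    verification_date: str

def missing_fields(action: CapaAction) -> list[str]:
    """Return the names of empty fields — an action with gaps here cannot
    be verified for effectiveness and should not be approved."""
    return [f.name for f in fields(action) if not getattr(action, f.name).strip()]

draft = CapaAction(
    "Restrict audit trail configuration",
    "Admins could disable the audit trail without change control",
    "Require change control for all audit trail configuration changes",
    "",            # success criteria not yet written
    "Audit",
    "2025-06-01",
)
print(missing_fields(draft))  # → ['success_criteria']
```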
### Step 5: Verify Effectiveness
After CAPA implementation, verify that the actions actually worked:
```markdown
### Effectiveness Verification

**CAPA-[YYYY]-[NNN] — Verification Record**

| CAPA Action | Verification Date | Method | Evidence | Result |
|-------------|-------------------|--------|----------|--------|
| CA1: [Action] | [Date] | [Method: audit, sampling, metric review] | [Evidence reference] | [Effective / Not Effective] |
| PA1: [Action] | [Date] | [Method] | [Evidence reference] | [Effective / Not Effective] |

### Effectiveness Criteria Check

- [ ] The original problem has not recurred since CAPA implementation
- [ ] The corrective action eliminated the root cause (evidence: [reference])
- [ ] The preventive action has been applied to similar systems/processes
- [ ] No new issues were introduced by the CAPA actions

### CAPA Closure

| Field | Value |
|-------|-------|
| Closure decision | [Closed — Effective / Closed — Not Effective / Extended] |
| Closed by | [Name, Title] |
| Closure date | [YYYY-MM-DD] |
| Next review | [If recurring, when to re-check] |
```
**Expected:** Effectiveness verification demonstrates that the root cause was actually eliminated, not just that the action was completed.

**On failure:** If verification shows the CAPA was not effective, reopen the investigation and develop revised actions. Do not close an ineffective CAPA.
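The "has not recurred" criterion can be made concrete as a recurrence-window check. The sketch below is illustrative; the 180-day default is an assumption approximating the 3-6 month verification horizon mentioned in Step 4.

```python
from datetime import date, timedelta

def is_effective(implemented: date, recurrences: list[date],
                 window_days: int = 180) -> bool:
    """True iff the problem did not recur during the verification window.
    window_days=180 is an assumed default (~3-6 month horizon)."""
    window_end = implemented + timedelta(days=window_days)
    return not any(implemented <= r <= window_end for r in recurrences)

implemented = date(2025, 1, 15)
print(is_effective(implemented, []))                  # → True: close as effective
print(is_effective(implemented, [date(2025, 3, 2)]))  # → False: recurred, reopen
```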
### Step 6: Analyse CAPA Trends
```markdown
### CAPA Trend Analysis

| Period | Total CAPAs | By Source | Top 3 Root Cause Categories | Recurring? |
|--------|-------------|-----------|-----------------------------|------------|
| Q1 20XX | [N] | Audit: [n], Deviation: [n], Monitoring: [n] | [Cat1], [Cat2], [Cat3] | [Y/N] |
| Q2 20XX | [N] | Audit: [n], Deviation: [n], Monitoring: [n] | [Cat1], [Cat2], [Cat3] | [Y/N] |

### Systemic Issues

| Issue | Frequency | Systems Affected | Recommended Action |
|-------|-----------|------------------|--------------------|
| [e.g., Training gaps] | [N occurrences in 12 months] | [Systems] | [Systemic programme improvement] |
```
**Expected:** Trend analysis identifies systemic issues that individual CAPAs miss.

**On failure:** If trending reveals recurring root causes despite CAPAs, the CAPAs are treating symptoms. Escalate to management review for systemic intervention.
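The trend table can be backed by a simple count over closed CAPA records. The record shape and the three-occurrence threshold below are illustrative assumptions, not a regulatory requirement.

```python
from collections import Counter

# Illustrative CAPA records: (quarter, source, root_cause_category)
capas = [
    ("Q1", "Audit", "Training"), ("Q1", "Deviation", "Training"),
    ("Q1", "Monitoring", "System design"), ("Q2", "Audit", "Training"),
    ("Q2", "Deviation", "SOP clarity"),
]

def recurring_categories(records, threshold=3):
    """Root-cause categories appearing at least `threshold` times — candidates
    for systemic intervention rather than another one-off CAPA."""
    counts = Counter(category for _, _, category in records)
    return {cat: n for cat, n in counts.items() if n >= threshold}

print(recurring_categories(capas))  # → {'Training': 3}
```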
## Validation
- Investigation initiated within required timeline (24h for critical, 72h for major)
- Problem statement is factual and does not assign blame
- Investigation method is appropriate for problem complexity
- Root cause analysis reaches the fundamental cause (not just symptoms)
- Every root cause step is supported by evidence
- CAPAs distinguish correction, corrective action, and preventive action
- Each CAPA has measurable success criteria and a verification plan
- Effectiveness verified with evidence before CAPA closure
- Trend analysis reviewed at least quarterly
## Common Pitfalls
- **Stopping at the symptom:** "The user made an error" is not a root cause. The root cause is why the system or process allowed the error.
- **CAPA = retraining:** Retraining addresses only one possible root cause (knowledge). If the real root cause is a system design flaw or unclear SOP, retraining will not prevent recurrence.
- **Closing without verification:** Completing the action is not the same as verifying its effectiveness. A CAPA closed without effectiveness verification is a regulatory citation waiting to happen.
- **Blame-oriented investigation:** Investigations that focus on who made the error rather than what allowed the error undermine the quality culture and discourage reporting.
- **No trending:** Individual CAPAs may seem unrelated, but trending often reveals systemic issues (e.g., "training" root causes across multiple systems may indicate a broken training programme).
## Related Skills
- `conduct-gxp-audit` — audits generate findings that require CAPAs
- `monitor-data-integrity` — monitoring detects anomalies that trigger investigations
- `manage-change-control` — CAPA-driven changes go through change control
- `prepare-inspection-readiness` — open and overdue CAPAs are top inspection targets
- `design-training-program` — when root cause is training-related, improve the training programme