Awesome-omni-skill Incident Response
Guide structured incident response following severity-based protocols
Install
Source · Clone the upstream repo
git clone https://github.com/diegosouzapw/awesome-omni-skill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/tools/incident-response" ~/.claude/skills/diegosouzapw-awesome-omni-skill-incident-response-fdc399 && rm -rf "$T"
Manifest:
skills/tools/incident-response/SKILL.md
Incident Response Skill
Guide structured incident response following severity-based protocols.
Trigger Conditions
- An incident is detected or reported
- An alert fires with severity S1 or S2
- The user invokes the skill with "incident response" or "start incident"
Input Contract
- Required: Incident description or alert payload
- Required: Severity level (S1-S4)
- Optional: Affected services, customer impact estimate
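A minimal example of the required input, assuming the alert arrives as a JSON payload; every field name and value below is illustrative, not part of the skill's contract:
{
  "severity": "S1",
  "description": "Payment processing returning 500 errors for 30% of requests",
  "affected_services": ["payment-service"],
  "customer_impact": "checkout failures for roughly 30% of traffic"
}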
Output Contract
- Incident timeline with structured updates
- Mitigation steps and status
- Communication templates for stakeholders
- Evidence preservation checklist
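The status page write could be scripted; a hedged sketch using curl against a hypothetical status API (the endpoint, token variable, and payload shape are placeholders, not something the skill defines):
# Post a structured update to a hypothetical status API.
curl -sS -X POST "https://status.example.com/api/v1/incidents" \
  -H "Authorization: Bearer $STATUS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"status": "investigating", "message": "Elevated error rates on payments"}'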
Tool Permissions
- Read: Logs, metrics, alerts, runbooks, deployment history
- Write: Incident reports, status page updates
- Execute: Diagnostic commands, rollback procedures
Execution Steps
- Classify severity using defined criteria
- Identify incident commander and assign roles
- Establish communication channels (war room)
- Preserve evidence (logs, metrics, config state, recent deploys), as sketched after this list
- Focus on mitigation first, root cause second
- Provide structured updates at the cadence defined by severity
- Assess regulatory notification requirements
- Document resolution and schedule postmortem
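A minimal sketch of the evidence-preservation step, assuming the affected service runs on Kubernetes; the deployment name and time window are illustrative:
# Snapshot logs, live config, and rollout history into a timestamped
# directory before any mitigation changes state (names are placeholders).
E="evidence/$(date -u +%Y%m%dT%H%M%SZ)" && mkdir -p "$E"
kubectl logs deploy/payment-service --since=1h > "$E/payment-service.log"
kubectl get deploy payment-service -o yaml > "$E/deploy-state.yaml"
kubectl rollout history deploy/payment-service > "$E/rollout-history.txt"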
Success Criteria
- Severity correctly classified within 5 minutes
- Communication cadence maintained per severity SLA
- Service restored within MTTR target
- Evidence preserved before mitigation changes state
Escalation Rules
- Auto-escalate if MTTR exceeds the per-severity threshold, as sketched after these rules
- Escalate if regulatory notification may be required
- Escalate if customer impact exceeds threshold
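A hedged sketch of the per-severity cadence and the auto-escalation check; every threshold value here is illustrative, since the skill does not define them:
# Map severity to update cadence and MTTR escalation threshold (minutes).
case "$SEVERITY" in
  S1) CADENCE=15  MTTR_LIMIT=60   ;;
  S2) CADENCE=30  MTTR_LIMIT=240  ;;
  S3) CADENCE=60  MTTR_LIMIT=480  ;;
  S4) CADENCE=240 MTTR_LIMIT=1440 ;;
esac
# INCIDENT_START_EPOCH is assumed to hold the declaration time in epoch seconds.
ELAPSED=$(( ($(date +%s) - INCIDENT_START_EPOCH) / 60 ))
if [ "$ELAPSED" -gt "$MTTR_LIMIT" ]; then
  echo "MTTR threshold exceeded (${ELAPSED}m > ${MTTR_LIMIT}m): escalating"
fi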
Example Invocations
Input: "Payment processing is returning 500 errors for 30% of requests"
Output: S1 incident declared. Mitigation: roll back the last deploy (payment-service v2.3.1 → v2.3.0). Comms: internal Slack update every 15 minutes, status page updated, customer notification drafted. Evidence: deploy log, error-rate graph, and config diff preserved.
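A hedged sketch of the mitigation from this example, assuming payment-service is a Kubernetes deployment; in practice the rollback procedure would come from the service's runbook:
# Roll back one revision after evidence has been preserved, then wait
# for the rollout to settle (deployment name is a placeholder).
kubectl rollout undo deploy/payment-service
kubectl rollout status deploy/payment-service --timeout=120s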