Claude-code-plugins salesforce-incident-runbook
install
source · Clone the upstream repo
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/saas-packs/salesforce-pack/skills/salesforce-incident-runbook" ~/.claude/skills/jeremylongshore-claude-code-plugins-salesforce-incident-runbook && rm -rf "$T"
manifest: plugins/saas-packs/salesforce-pack/skills/salesforce-incident-runbook/SKILL.md
Salesforce Incident Runbook
Overview
Rapid incident response procedures for Salesforce integration failures, covering Salesforce-side outages, API limit exhaustion, authentication failures, and data sync issues.
Prerequisites
- Salesforce CLI authenticated (`sf org login`)
- Access to Salesforce Status API
- Monitoring dashboards configured (see `salesforce-observability`)
- Communication channels (Slack, PagerDuty)
Quick Triage (Do This First)
```bash
# 1. Is Salesforce itself down?
curl -s https://api.status.salesforce.com/v1/incidents/active | jq '.[0:3]'
# If incidents returned → Salesforce-side issue, enable fallback mode

# 2. Check your org's instance status
# Find your instance at: https://status.salesforce.com
curl -s "https://api.status.salesforce.com/v1/instances/NA45/status" | jq '.status'

# 3. Check API limits: are we out of calls?
sf limits api display --target-org my-org --json | jq '.result[] | select(.name == "DailyApiRequests")'
# If remaining = 0 → API_LIMIT_EXCEEDED, see mitigation below

# 4. Check authentication
sf org display --target-org my-org --json | jq '.result.connectedStatus'
# If "RefreshTokenError" → re-authenticate

# 5. Check recent errors in your logs
sf apex log list --target-org my-org --json | jq '.result[0:5]'
```
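The ordering of the triage checks can also be written down as a small pure function, which is handy if the outputs of the commands above are collected by a script. This is only a sketch: `TriageFacts`, `Diagnosis`, and `triage` are hypothetical names, and treating any `connectedStatus` other than `"Connected"` as an authentication failure is an assumption.

```typescript
// Hypothetical container for the results of triage steps 1, 3, and 4.
interface TriageFacts {
  activeIncidents: number;   // number of entries returned by /v1/incidents/active
  dailyApiRemaining: number; // "remaining" value for DailyApiRequests
  connectedStatus: string;   // value of .result.connectedStatus from `sf org display`
}

type Diagnosis =
  | "salesforce-outage"
  | "api-limit-exhausted"
  | "auth-failure"
  | "check-logs";

// Applies the checks in the same order as the runbook: outage first,
// then API limits, then authentication, and finally the logs.
function triage(facts: TriageFacts): Diagnosis {
  if (facts.activeIncidents > 0) return "salesforce-outage";
  if (facts.dailyApiRemaining <= 0) return "api-limit-exhausted";
  // Assumption: a healthy org reports "Connected"; anything else needs re-auth.
  if (facts.connectedStatus !== "Connected") return "auth-failure";
  return "check-logs";
}
```

Because an active incident is checked first, an outage is reported even when limits or authentication also look unhealthy, which matches the "do this first" ordering of the list.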
Decision Tree
```
Integration returning errors?
├── YES: Is status.salesforce.com showing incident?
│   ├── YES → Salesforce outage. Enable fallback mode. Monitor status page.
│   └── NO → Check error type below:
│       ├── INVALID_SESSION_ID (401) → Token expired. Re-authenticate.
│       ├── REQUEST_LIMIT_EXCEEDED (403) → API limit hit. Reduce calls.
│       ├── UNABLE_TO_LOCK_ROW (409) → Record contention. Retry with backoff.
│       ├── MALFORMED_QUERY / INVALID_FIELD → Code bug. Check SOQL.
│       └── 500/503 → Salesforce-side. Wait and retry.
└── NO: Is data syncing correctly?
    ├── YES → Likely resolved or intermittent. Monitor.
    └── NO → Check CDC subscription, query timestamps, bulk job status.
```
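The error-type branch of the tree maps naturally onto a lookup in code. The sketch below is one possible encoding, not part of the skill itself: `classifyError`, `Action`, and `backoffDelayMs` are hypothetical names, and the 200 ms base delay and 10 s cap are illustrative values.

```typescript
// Hypothetical action labels; only the error codes come from the decision tree.
type Action =
  | "re-authenticate"
  | "reduce-calls"
  | "retry-with-backoff"
  | "fix-soql"
  | "wait-and-retry"
  | "investigate";

interface SalesforceError {
  errorCode?: string;  // e.g. "INVALID_SESSION_ID"
  httpStatus?: number; // e.g. 503
}

// Maps an error to the action named in the decision tree.
function classifyError(err: SalesforceError): Action {
  switch (err.errorCode) {
    case "INVALID_SESSION_ID":
      return "re-authenticate";
    case "REQUEST_LIMIT_EXCEEDED":
      return "reduce-calls";
    case "UNABLE_TO_LOCK_ROW":
      return "retry-with-backoff";
    case "MALFORMED_QUERY":
    case "INVALID_FIELD":
      return "fix-soql";
  }
  if (err.httpStatus === 500 || err.httpStatus === 503) {
    return "wait-and-retry";
  }
  // Anything the tree does not cover is left for manual investigation.
  return "investigate";
}

// Capped exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... up to maxMs.
function backoffDelayMs(attempt: number, baseMs = 200, maxMs = 10000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}
```

Matching on the error code before the HTTP status keeps the mapping stable even if a given code arrives with a different status than the tree lists.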
Immediate Actions by Error Type
REQUEST_LIMIT_EXCEEDED — API Limit Exhausted
```typescript
// This is a P1: your integration is completely blocked.
// Assumes `conn` is an authenticated jsforce Connection.

// 1. Check what's consuming API calls
const limits = await conn.request('/services/data/v59.0/limits/');
console.log('API calls:', limits.DailyApiRequests);
console.log('Bulk API:', limits.DailyBulkV2QueryJobs);
// Limits reset on a 24-hour rolling basis

// 2. Identify top consumers (Enterprise+ orgs with EventLogFile)
// EventLogFile has no per-user columns to group by in SOQL. Each record wraps
// a CSV (the LogFile field) whose USER_ID column must be aggregated after download.
const logFiles = await conn.query(`
  SELECT Id, EventType, LogDate, LogFileLength
  FROM EventLogFile
  WHERE EventType = 'API' AND LogDate = TODAY
  ORDER BY LogDate DESC
`);
// Download /services/data/v59.0/sobjects/EventLogFile/<Id>/LogFile for each
// record and count rows per USER_ID to find the top consumers.

// 3. Immediate mitigation: pause non-critical integrations
// Set env var: SF_CRITICAL_ONLY=true
// Only allow essential operations (auth, health check, critical writes)
```
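The mitigation step and the 80% budget warning from the postmortem action items can be sketched as two small pure functions. Everything named here is hypothetical (`apiBudgetStatus`, `isAllowed`, the `Operation` tags); only the `Max`/`Remaining` fields of a limits entry and the idea of an `SF_CRITICAL_ONLY` switch come from the runbook.

```typescript
// One entry of the REST /limits response, e.g. limits.DailyApiRequests.
interface LimitEntry {
  Max: number;
  Remaining: number;
}

type BudgetStatus = "ok" | "warning" | "exhausted";

// warnAt is the fraction of the allocation consumed before warning (0.8 = 80%).
function apiBudgetStatus(limit: LimitEntry, warnAt = 0.8): BudgetStatus {
  if (limit.Remaining <= 0) return "exhausted";
  const used = (limit.Max - limit.Remaining) / limit.Max;
  return used >= warnAt ? "warning" : "ok";
}

// Hypothetical operation tags; the runbook names auth, health check,
// and critical writes as the essential ones.
type Operation = "auth" | "health-check" | "critical-write" | "bulk-sync" | "report";

const CRITICAL: Record<Operation, boolean> = {
  "auth": true,
  "health-check": true,
  "critical-write": true,
  "bulk-sync": false,
  "report": false,
};

// With criticalOnly set, only essential operations may call Salesforce.
function isAllowed(op: Operation, criticalOnly: boolean): boolean {
  return !criticalOnly || CRITICAL[op];
}
```

In a Node.js service the second argument would typically be `process.env.SF_CRITICAL_ONLY === 'true'`, so that flipping the environment variable pauses non-critical traffic without a deploy.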
INVALID_SESSION_ID — Authentication Failure
```bash
# Token expired or revoked: re-authenticate
sf org login web --alias my-org --instance-url https://login.salesforce.com

# For CI/automated: re-auth with JWT
sf org login jwt \
  --client-id $SF_CLIENT_ID \
  --jwt-key-file server.key \
  --username $SF_USERNAME \
  --alias my-org

# Verify connection is restored
sf org display --target-org my-org
```
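Inside application code, the same recovery can be automated by retrying once after refreshing credentials. The following is a minimal sketch under stated assumptions: `withReauth` and `ErrorWithCode` are hypothetical names, the thrown error is assumed to carry an `errorCode` property, and `reauthenticate` stands in for whatever refresh flow the integration uses.

```typescript
// Assumption: API errors expose the Salesforce error code as `errorCode`.
interface ErrorWithCode {
  errorCode?: string;
}

// Runs `call`; on INVALID_SESSION_ID, refreshes credentials once and retries.
async function withReauth<T>(
  call: () => Promise<T>,
  reauthenticate: () => Promise<void>
): Promise<T> {
  try {
    return await call();
  } catch (err) {
    const code = (err as ErrorWithCode | undefined)?.errorCode;
    // Unrelated errors are rethrown untouched.
    if (code !== "INVALID_SESSION_ID") throw err;
    await reauthenticate();
    // A second failure propagates to the caller instead of looping.
    return call();
  }
}
```

Limiting the retry to a single attempt avoids a loop when the credentials themselves have been revoked, in which case the manual steps above still apply.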
Salesforce System Outage
```typescript
// Enable graceful degradation: serve stale data from cache
const FALLBACK_MODE = process.env.SF_FALLBACK_MODE === 'true';

async function queryWithFallback<T>(soql: string, cacheKey: string): Promise<T[]> {
  if (FALLBACK_MODE) {
    const cached = await redis.get(cacheKey);
    if (cached) {
      console.warn('SF FALLBACK: serving cached data');
      return JSON.parse(cached);
    }
    throw new Error('Salesforce unavailable and no cached data');
  }

  const conn = await getConnection();
  const result = await conn.query<T>(soql);
  // Always update cache for fallback
  await redis.set(cacheKey, JSON.stringify(result.records), 'EX', 3600);
  return result.records;
}
```
Communication Templates
Internal (Slack)
```
P1 INCIDENT: Salesforce Integration
Status: INVESTIGATING
Error: [REQUEST_LIMIT_EXCEEDED / INVALID_SESSION_ID / SF outage]
Impact: [Data sync paused / API calls failing / user-facing errors]
Current action: [Checking limits / re-authenticating / enabling fallback]
Next update: [time]
```
Postmortem Template
```markdown
## Incident: Salesforce [Error Type]
**Date:** YYYY-MM-DD | **Duration:** X hours | **Severity:** P[1-4]

### Summary
[One sentence, e.g., "API limit exhausted due to unoptimized batch job"]

### Root Cause
[e.g., "New sync job ran SELECT * on Contact (3M records) using individual queries instead of Bulk API"]

### Impact
- API calls blocked for [duration]
- [N] users affected / [N] records not synced

### Timeline
- HH:MM - Alerts fired: REQUEST_LIMIT_EXCEEDED
- HH:MM - Triage: identified bulk sync as consumer
- HH:MM - Mitigated: paused sync job
- HH:MM - Resolved: API limit rolled over

### Action Items
- [ ] Migrate sync to Bulk API 2.0 - @owner - due date
- [ ] Add API budget guard (80% warning) - @owner - due date
- [ ] Set up EventLogFile monitoring for top consumers - @owner - due date
```
Error Handling
| Issue | Cause | Solution |
|---|---|---|
| Can't reach status API | Network issue | Try https://status.salesforce.com manually |
| `sf` CLI auth expired | Token revoked | Re-authenticate with `sf org login web` (or `sf org login jwt` in CI) |
| Limits API returns 403 | Limit already exceeded | Wait for rolling 24hr reset |
| Bulk job stuck | Processing timeout | Abort the job and retry |
Resources
Next Steps
For data handling, see `salesforce-data-handling`.