Skillshub anth-prod-checklist
install
source · Clone the upstream repo
git clone https://github.com/ComeOnOliver/skillshub
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/jeremylongshore/claude-code-plugins-plus-skills/anth-prod-checklist" ~/.claude/skills/comeonoliver-skillshub-anth-prod-checklist && rm -rf "$T"
manifest: skills/jeremylongshore/claude-code-plugins-plus-skills/anth-prod-checklist/SKILL.md
source content
Anthropic Production Checklist
Overview
Complete checklist for deploying Claude API integrations to production with reliability, observability, and cost controls.
Pre-Launch Checklist
Authentication & Keys
- Production API key from dedicated Workspace
- Key stored in secret manager (not env files on servers)
- Key rotation procedure documented and tested
- Separate keys for each environment (dev/staging/prod)
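One way to satisfy the secret-manager and per-environment items is to resolve the key at startup through a lookup function. This is a minimal sketch, not a prescribed implementation: `fetch_secret` is a hypothetical callable standing in for whatever secret manager client you use, and the `anthropic/<env>/api-key` naming scheme is an assumption made for illustration.

```python
from typing import Callable

ALLOWED_ENVS = ("dev", "staging", "prod")


def secret_name_for(env: str) -> str:
    """Build the (hypothetical) secret name for one environment."""
    if env not in ALLOWED_ENVS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"anthropic/{env}/api-key"


def load_api_key(env: str, fetch_secret: Callable[[str], str]) -> str:
    """Fetch the key for `env` and fail fast if it is missing or blank."""
    key = fetch_secret(secret_name_for(env)).strip()
    if not key:
        raise RuntimeError(f"empty API key for environment {env!r}")
    return key
```

The returned value can then be passed to the SDK client explicitly (for example `anthropic.Anthropic(api_key=...)`) so the key never has to sit in an env file on the server.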
Error Handling
- All 5 error types handled: `authentication_error`, `invalid_request_error`, `rate_limit_error`, `api_error`, `overloaded_error`
- SDK `maxRetries` set (recommended: 3-5 for production)
- Custom error logging with `request-id` captured
- Circuit breaker for sustained API failures
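The circuit breaker item can be sketched with the standard library alone. The class below is an illustrative assumption, not part of the Anthropic SDK: `CircuitBreaker`, `CircuitOpenError`, and the default thresholds are all hypothetical names and values. It opens after a run of consecutive failures, rejects calls during a cool-down, then lets a single trial call through.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when calls are rejected without contacting the API."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock  # injectable so the cool-down can be tested
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; skipping API call")
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single further failure will reopen the circuit.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit fully
        return result
```

Counting every exception is a simplification. In practice you would likely count only server-side and overload failures and let `invalid_request_error` responses pass through without tripping the breaker, since those indicate a bug in the request rather than an unhealthy API.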
Rate Limits & Cost
- Usage tier verified at console.anthropic.com
- Application-level rate limiting implemented
- Cost alerts configured (monthly spend caps)
- Model selection optimized (Haiku for simple tasks, Sonnet for complex)
- `max_tokens` set to realistic values (not inflated)
- Prompt caching enabled for repeated system prompts
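Application-level rate limiting is often implemented as a token bucket placed in front of the API client. The sketch below is one such approach under stated assumptions: the `TokenBucket` class, its parameters, and the rates you would configure are hypothetical and should be chosen to stay under your verified usage tier.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow `rate_per_s` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate_per_s: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate_per_s = rate_per_s
        self.capacity = capacity
        self.clock = clock  # injectable for deterministic tests
        self.tokens = float(capacity)
        self.updated_at = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        now = self.clock()
        elapsed = now - self.updated_at
        self.updated_at = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_s)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A caller would check `try_acquire()` before each request and queue, delay, or reject the request when it returns `False`. This version is not thread-safe; a lock around `try_acquire` would be needed in a multi-threaded server.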
Reliability
- Timeout configured (`timeout` parameter, recommended 60-120s)
- Graceful degradation when API is unavailable
- Health check endpoint tests API connectivity
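```python
import anthropic

# The async client keeps the probe from blocking the event loop.
# It reads ANTHROPIC_API_KEY from the environment by default.
async_client = anthropic.AsyncAnthropic()


async def health_check():
    try:
        # Use token counting as a cheap health probe (no generation cost)
        count = await async_client.messages.count_tokens(
            # Confirm this model ID against the models available to your account.
            model="claude-3-5-haiku-20241022",
            messages=[{"role": "user", "content": "ping"}],
        )
        return {"status": "healthy", "tokens": count.input_tokens}
    except Exception as e:
        return {"status": "degraded", "error": str(e)}
```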
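Graceful degradation can be as simple as bounding each call with a deadline and returning a canned reply when the call fails. The following is a sketch under assumptions: `call_with_fallback`, `FALLBACK_MESSAGE`, and the `make_call` parameter are hypothetical names, and `make_call` stands in for whatever coroutine wraps your API request.

```python
import asyncio
from typing import Awaitable, Callable

FALLBACK_MESSAGE = "The assistant is temporarily unavailable. Please try again shortly."


async def call_with_fallback(make_call: Callable[[], Awaitable[str]],
                             timeout_s: float = 60.0,
                             fallback: str = FALLBACK_MESSAGE) -> tuple[str, bool]:
    """Return (text, degraded); `degraded` is True when the fallback was used."""
    try:
        text = await asyncio.wait_for(make_call(), timeout=timeout_s)
        return text, False
    except Exception:
        # Timeouts and API errors both land here. Real code should log the
        # exception before degrading so the failure stays visible.
        return fallback, True
```

Returning the `degraded` flag alongside the text lets the caller record a metric or adjust the UI instead of silently presenting the fallback as a model response.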
Observability
- Request/response logging (redact content, keep metadata)
- Latency tracking (p50, p95, p99)
- Token usage tracking (input + output per request)
- Cost tracking per feature/customer
- Error rate alerting (429s, 5xx, timeouts)
```python
import logging
import time

import anthropic

logger = logging.getLogger("anthropic")
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def tracked_create(**kwargs):
    """Call messages.create and log metadata only (no prompt or completion text)."""
    start = time.monotonic()
    try:
        response = client.messages.create(**kwargs)
        duration = time.monotonic() - start
        logger.info(
            "claude_request",
            extra={
                "request_id": response._request_id,
                "model": response.model,
                "input_tokens": response.usage.input_tokens,
                "output_tokens": response.usage.output_tokens,
                "duration_ms": int(duration * 1000),
                "stop_reason": response.stop_reason,
            },
        )
        return response
    except Exception as e:
        duration = time.monotonic() - start
        logger.error(
            "claude_error",
            extra={
                "error": str(e),
                # API status errors carry a request_id; other exceptions may not.
                "request_id": getattr(e, "request_id", None),
                "duration_ms": int(duration * 1000),
            },
        )
        raise
```
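The p50/p95/p99 figures can be derived from the logged `duration_ms` values. The helper below is a minimal sketch using the nearest-rank definition of a percentile; the function names are hypothetical, and a production system would more likely delegate this to its metrics backend.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of the data at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing to avoid float error on exact ranks.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def latency_summary(durations_ms: Sequence[float]) -> dict:
    """Summarise request latencies as p50, p95, and p99."""
    return {name: percentile(durations_ms, p)
            for name, p in (("p50", 50), ("p95", 95), ("p99", 99))}
```

With only a handful of samples the p99 value is simply the maximum, so tail percentiles become meaningful only once a window holds a few hundred requests.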
Content Safety
- System prompts reviewed for injection resistance
- User input validated and length-limited
- Output scanned for sensitive data leakage
- Content moderation for user-facing responses
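Two of the items above, input length limits and output scanning, lend themselves to a short sketch. Everything here is an assumption made for illustration: the 4000-character limit, the function names, and especially the two regular expressions, which are deliberately simple and would miss many real-world cases.

```python
import re

MAX_INPUT_CHARS = 4000  # assumed limit; tune per feature

# Illustrative patterns only; a real deployment needs a vetted detector.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "api_key": re.compile(r"sk-[A-Za-z0-9_-]{16,}"),
}


def validate_user_input(text: str, max_chars: int = MAX_INPUT_CHARS) -> str:
    """Trim whitespace and reject empty or over-long input before it reaches the model."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("input is empty")
    if len(cleaned) > max_chars:
        raise ValueError(f"input exceeds {max_chars} characters")
    return cleaned


def redact_output(text: str) -> tuple[str, list[str]]:
    """Replace matches with placeholders and report which kinds were found."""
    found = []
    for kind, pattern in SENSITIVE_PATTERNS.items():
        text, n = pattern.subn(f"[REDACTED {kind}]", text)
        if n:
            found.append(kind)
    return text, found
```

The list of detected kinds can feed an alert or audit log, which is usually more useful than silent redaction because it shows how often the model output contains sensitive material.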
Infrastructure
- Deployment uses canary/rolling strategy
- Rollback procedure documented and tested
- Runbook created (see `anth-incident-runbook`)
- On-call escalation path defined
Alerting Thresholds
| Metric | Warning | Critical |
|---|---|---|
| Error rate (5xx) | > 1% | > 5% |
| p99 latency | > 10s | > 30s |
| 429 rate | > 5/min | > 20/min |
| Daily cost | > 80% budget | > 100% budget |
| Auth failures (401/403) | > 0 | > 0 (immediate) |
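The thresholds in the table can be encoded as data so that alert rules and documentation do not drift apart. This is a sketch with hypothetical names (`Threshold`, `THRESHOLDS`, `classify`, and the metric keys); the numeric values are copied from the table, with units chosen per metric as noted in the keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    warning: float
    critical: float


# Values copied from the alerting table; the unit is part of each key.
THRESHOLDS = {
    "error_rate_5xx_pct": Threshold(warning=1.0, critical=5.0),
    "p99_latency_s": Threshold(warning=10.0, critical=30.0),
    "rate_429_per_min": Threshold(warning=5.0, critical=20.0),
    "daily_cost_pct_of_budget": Threshold(warning=80.0, critical=100.0),
    # Any auth failure is critical immediately, so both levels are 0.
    "auth_failures": Threshold(warning=0.0, critical=0.0),
}


def classify(metric: str, value: float) -> str:
    """Return 'ok', 'warning', or 'critical' using strict '>' comparisons, as in the table."""
    t = THRESHOLDS[metric]
    if value > t.critical:
        return "critical"
    if value > t.warning:
        return "warning"
    return "ok"
```

For example, `classify("p99_latency_s", 12)` returns `"warning"`, while a single auth failure, `classify("auth_failures", 1)`, returns `"critical"`.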
Resources
Next Steps
For version upgrades, see `anth-upgrade-migration`.