claude-code-plugins-plus-skills · deepgram-prod-checklist

## Install

**Source** · Clone the upstream repo:

```sh
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
```

**Claude Code** · Install into `~/.claude/skills/`:

```sh
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/plugins/saas-packs/deepgram-pack/skills/deepgram-prod-checklist" \
       ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-skills-deepgram-prod-checklist \
  && rm -rf "$T"
```

Manifest: `plugins/saas-packs/deepgram-pack/skills/deepgram-prod-checklist/SKILL.md`
# Deepgram Production Checklist

## Overview

Comprehensive go-live checklist for Deepgram integrations. Covers a singleton client, health checks, Prometheus metrics, alert rules, error handling, and a phased go-live timeline.
## Production Readiness Matrix
| Category | Item | Status |
|---|---|---|
| Auth | Production API key with scoped permissions | [ ] |
| Auth | Key stored in secret manager (not env file) | [ ] |
| Auth | Key rotation schedule (90-day) configured | [ ] |
| Auth | Fallback key provisioned and tested | [ ] |
| Resilience | Retry with exponential backoff on 429/5xx | [ ] |
| Resilience | Circuit breaker for cascade failure prevention | [ ] |
| Resilience | Request timeout set (30s pre-recorded, 10s TTS) | [ ] |
| Resilience | Graceful degradation when API unavailable | [ ] |
| Performance | Singleton client (not creating per-request) | [ ] |
| Performance | Concurrency limited (50-80% of plan limit) | [ ] |
| Performance | Audio preprocessed (16kHz mono for best results) | [ ] |
| Performance | Large files use callback URL (async) | [ ] |
| Monitoring | Health check endpoint testing Deepgram API | [ ] |
| Monitoring | Prometheus metrics: latency, error rate, usage | [ ] |
| Monitoring | Alerts: error rate >5%, latency >10s, circuit open | [ ] |
| Security | PII redaction enabled if handling sensitive audio | [ ] |
| Security | Audio URLs validated (HTTPS, no private IPs) | [ ] |
| Security | Audit logging on all operations | [ ] |
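The resilience rows above call for retry with exponential backoff on 429/5xx, which none of the steps below implement. A minimal sketch, assuming the thrown error exposes an HTTP `status` field (an assumption; adapt the check to your SDK's actual error shape):

```typescript
// Retry helper with exponential backoff + full jitter for 429/5xx.
// `withRetry` and its defaults are illustrative, not Deepgram SDK API.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 500,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      lastErr = err;
      const status = err?.status ?? err?.response?.status;
      const retryable = status === 429 || (status >= 500 && status < 600);
      if (!retryable || attempt === maxAttempts - 1) throw err;
      // Full jitter: wait a random delay in [0, base * 2^attempt)
      const delay = Math.random() * baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastErr;
}
```

Wrap individual SDK calls (e.g. `withRetry(() => deepgram.listen.prerecorded.transcribeUrl(...))`) rather than whole request handlers, so non-retryable 4xx errors surface immediately.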
## Instructions

### Step 1: Production Singleton Client

```typescript
import { createClient, DeepgramClient } from '@deepgram/sdk';

class ProductionDeepgram {
  private static client: DeepgramClient | null = null;

  static getClient(): DeepgramClient {
    if (!this.client) {
      const key = process.env.DEEPGRAM_API_KEY;
      if (!key) throw new Error('DEEPGRAM_API_KEY required for production');
      this.client = createClient(key);
    }
    return this.client;
  }

  // Force re-init (for key rotation)
  static reset() {
    this.client = null;
  }
}
```
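The matrix's 90-day rotation item pairs naturally with `reset()`. One way to wire the two together — `scheduleKeyRotation` and the interval are illustrative assumptions, not part of this skill:

```typescript
// Periodically reset the client so a rotated DEEPGRAM_API_KEY is picked up
// without a restart. `resetClient` stands in for ProductionDeepgram.reset()
// from Step 1; the next getClient() re-reads process.env.DEEPGRAM_API_KEY.
function scheduleKeyRotation(
  resetClient: () => void,
  intervalMs = 24 * 60 * 60 * 1000,
): NodeJS.Timeout {
  const timer = setInterval(() => resetClient(), intervalMs);
  timer.unref?.(); // don't keep the process alive just for rotation
  return timer;
}
```

This assumes the secret manager injects the rotated key into the environment; if your secrets are fetched at runtime, re-fetch inside `getClient()` instead.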
### Step 2: Health Check Endpoint

```typescript
import express from 'express';
import { createClient } from '@deepgram/sdk';

const app = express();
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);

app.get('/health', async (req, res) => {
  const start = Date.now();
  try {
    // Test API connectivity by listing projects
    const { error } = await deepgram.manage.getProjects();
    const latency = Date.now() - start;
    if (error) {
      return res.status(503).json({
        status: 'unhealthy',
        deepgram: 'error',
        error: error.message,
        latency_ms: latency,
      });
    }
    res.json({
      status: 'healthy',
      deepgram: 'connected',
      latency_ms: latency,
      timestamp: new Date().toISOString(),
    });
  } catch (err: any) {
    res.status(503).json({
      status: 'unhealthy',
      deepgram: 'unreachable',
      error: err.message,
      latency_ms: Date.now() - start,
    });
  }
});
```
### Step 3: Prometheus Metrics

```typescript
import { Counter, Histogram, Gauge, Registry } from 'prom-client';
// `app` and `deepgram` are the Express app and client from Step 2.

const registry = new Registry();

const transcriptionRequests = new Counter({
  name: 'deepgram_requests_total',
  help: 'Total Deepgram API requests',
  labelNames: ['method', 'model', 'status'],
  registers: [registry],
});

const transcriptionLatency = new Histogram({
  name: 'deepgram_latency_seconds',
  help: 'Deepgram API request latency',
  labelNames: ['method', 'model'],
  buckets: [0.5, 1, 2, 5, 10, 30],
  registers: [registry],
});

const audioProcessed = new Counter({
  name: 'deepgram_audio_seconds_total',
  help: 'Total audio seconds processed',
  labelNames: ['model'],
  registers: [registry],
});

const activeConnections = new Gauge({
  name: 'deepgram_active_connections',
  help: 'Active WebSocket connections',
  registers: [registry],
});

// Instrumented transcription
async function instrumentedTranscribe(url: string, model = 'nova-3') {
  const timer = transcriptionLatency.startTimer({ method: 'prerecorded', model });
  try {
    const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
      { url },
      { model, smart_format: true }
    );
    timer();
    transcriptionRequests.inc({ method: 'prerecorded', model, status: error ? 'error' : 'ok' });
    if (result?.metadata?.duration) {
      audioProcessed.inc({ model }, result.metadata.duration);
    }
    if (error) throw error;
    return result;
  } catch (err) {
    timer();
    transcriptionRequests.inc({ method: 'prerecorded', model, status: 'error' });
    throw err;
  }
}

// Expose metrics endpoint
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', registry.contentType);
  res.send(await registry.metrics());
});
```
### Step 4: Alert Rules (Prometheus/AlertManager)

```yaml
groups:
  - name: deepgram
    rules:
      - alert: DeepgramHighErrorRate
        expr: rate(deepgram_requests_total{status="error"}[5m]) / rate(deepgram_requests_total[5m]) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Deepgram error rate > 5%"
      - alert: DeepgramHighLatency
        expr: histogram_quantile(0.95, rate(deepgram_latency_seconds_bucket[5m])) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Deepgram P95 latency > 10s"
      - alert: DeepgramHealthCheckFailed
        expr: up{job="deepgram-service"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Deepgram health check failed for 2+ minutes"
```
### Step 5: Error Handling Wrapper

```typescript
// Promise.race enforces the timeout; note the underlying SDK call is
// abandoned, not cancelled.
async function safeTranscribe(url: string, options: Record<string, any> = {}) {
  const timeout = options.timeout ?? 30000;
  let timeoutId: NodeJS.Timeout | undefined;

  try {
    const result = await Promise.race([
      instrumentedTranscribe(url, options.model ?? 'nova-3'),
      new Promise((_, reject) => {
        timeoutId = setTimeout(() => reject(new Error('Transcription timeout')), timeout);
      }),
    ]);
    clearTimeout(timeoutId);
    return result;
  } catch (err: any) {
    clearTimeout(timeoutId);
    // Log structured error
    console.error(JSON.stringify({
      level: 'error',
      service: 'deepgram',
      message: err.message,
      url: url.substring(0, 100),
      timestamp: new Date().toISOString(),
    }));
    throw err;
  }
}
```
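Step 5 covers timeouts; the matrix's circuit-breaker row is the other half of cascade-failure prevention. A minimal consecutive-failure breaker sketch (an illustration, not the Deepgram SDK or any specific library):

```typescript
// Opens after `threshold` consecutive failures, rejects fast while open,
// and half-opens after `cooldownMs` to let a single probe request through.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;

  constructor(private threshold = 5, private cooldownMs = 30_000) {}

  async exec<T>(fn: () => Promise<T>): Promise<T> {
    if (this.isOpen()) throw new Error('circuit open: Deepgram calls suspended');
    try {
      const result = await fn();
      this.failures = 0; // any success closes the circuit
      return result;
    } catch (err) {
      this.failures++;
      if (this.failures >= this.threshold) this.openedAt = Date.now();
      throw err;
    }
  }

  isOpen(): boolean {
    if (this.failures < this.threshold) return false;
    if (Date.now() - this.openedAt >= this.cooldownMs) {
      this.failures = this.threshold - 1; // half-open: allow one probe
      return false;
    }
    return true;
  }
}
```

Usage would wrap the Step 5 helper, e.g. `breaker.exec(() => safeTranscribe(url))`, so repeated Deepgram failures stop generating load while the circuit is open.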
### Step 6: Go-Live Timeline
| Phase | When | Actions |
|---|---|---|
| D-7 | 1 week before | Load test at 2x expected volume, security review |
| D-3 | 3 days before | Smoke test with production key, verify all alerts fire |
| D-1 | Day before | Confirm on-call rotation, validate dashboards |
| D-0 | Launch | Shadow mode (10% traffic), monitoring open |
| D+1 | Day after | Review error rate, latency, verify no anomalies |
| D+7 | 1 week after | Full traffic, tune alert thresholds based on baselines |
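The D-0 row's shadow mode needs a stable way to route 10% of traffic through the new path. One sketch — the function name and hashing scheme are assumptions, not prescribed by this checklist:

```typescript
import { createHash } from 'node:crypto';

// Deterministically assign a request (keyed by caller or request ID) to the
// shadow cohort. Hashing keeps a given caller on the same side across
// requests, which makes D+1 error-rate comparisons meaningful.
function inShadowCohort(requestId: string, percent = 10): boolean {
  const digest = createHash('sha256').update(requestId).digest();
  const bucket = digest.readUInt16BE(0) % 100; // stable bucket in 0..99
  return bucket < percent;
}
```

Raising `percent` from 10 toward 100 over the D-0 to D+7 window gives the gradual ramp the timeline describes.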
## Output
- Singleton client with reset capability
- Health check endpoint with latency reporting
- Prometheus metrics (requests, latency, audio, connections)
- AlertManager rules for error rate, latency, availability
- Timeout-safe transcription wrapper
- Phased go-live timeline
## Error Handling
| Issue | Cause | Solution |
|---|---|---|
| Health check 503 | API key expired | Rotate key, check secret manager |
| Metrics not scraped | Wrong port/path | Verify Prometheus target config |
| Alert storms | Thresholds too tight | Add duration, tune values |
| Timeout on large files | Sync mode too slow | Switch to async callback URL pattern |
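One matrix item not covered by the steps above is "Audio URLs validated (HTTPS, no private IPs)". A minimal sketch — `isSafeAudioUrl` is illustrative, and a production check should also resolve DNS and re-check the resulting IP to close SSRF redirect gaps:

```typescript
// Reject non-HTTPS URLs and obvious private/loopback hosts before passing
// a caller-supplied audio URL to Deepgram.
function isSafeAudioUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable URL
  }
  if (url.protocol !== 'https:') return false;
  const host = url.hostname;
  const privatePatterns = [
    /^localhost$/i,
    /^127\./,                      // IPv4 loopback
    /^10\./,                       // RFC 1918
    /^192\.168\./,                 // RFC 1918
    /^172\.(1[6-9]|2\d|3[01])\./,  // RFC 1918 172.16.0.0/12
    /^169\.254\./,                 // link-local
    /^::1$/,                       // IPv6 loopback
  ];
  return !privatePatterns.some((re) => re.test(host));
}
```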