Deepgram Cost Tuning
Overview
Optimize Deepgram API costs through smart model selection, audio preprocessing to reduce billable minutes, usage monitoring via the Deepgram API, budget guardrails, and feature-aware cost estimation. Deepgram bills per audio minute processed.
Deepgram Pricing (2026)
| Product | Model | Price/Minute | Notes |
|---|---|---|---|
| STT (Batch) | Nova-3 | $0.0043 | Best accuracy |
| STT (Batch) | Nova-2 | $0.0043 | Proven stable |
| STT (Streaming) | Nova-3 | $0.0059 | Real-time |
| STT (Streaming) | Nova-2 | $0.0059 | Real-time |
| STT (Batch) | Base | $0.0048 | Fastest |
| STT (Batch) | Whisper | $0.0048 | Multilingual |
| TTS | Aura-2 | Pay-per-character | See TTS pricing |
| Intelligence | Summarize/Topics/Sentiment | Included with STT | No extra cost |
Add-on costs:
- Diarization: +$0.0044/min
- Multichannel: billed per channel
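To make the table concrete, the following is a minimal sketch that turns it into a lookup and computes an effective per-minute rate. `PRICES`, `DIARIZATION_ADDON`, and `effectiveRate` are hypothetical names introduced for illustration. The figures are copied from the table above rather than fetched from Deepgram, and applying the diarization add-on once per channel is an assumption.

```typescript
// Prices copied from the pricing table above (USD per audio minute).
type Mode = 'batch' | 'streaming';

const PRICES: Record<string, Partial<Record<Mode, number>>> = {
  'nova-3': { batch: 0.0043, streaming: 0.0059 },
  'nova-2': { batch: 0.0043, streaming: 0.0059 },
  base: { batch: 0.0048 },
  whisper: { batch: 0.0048 },
};

const DIARIZATION_ADDON = 0.0044;

// Hypothetical helper: effective USD per minute of audio for one request.
function effectiveRate(
  model: string,
  mode: Mode,
  opts: { diarize?: boolean; channels?: number } = {}
): number {
  const base = PRICES[model]?.[mode];
  if (base === undefined) {
    throw new Error(`No ${mode} price listed for ${model}`);
  }
  // Assumption: the diarization add-on applies to each billed channel.
  const perChannel = base + (opts.diarize ? DIARIZATION_ADDON : 0);
  return perChannel * (opts.channels ?? 1);
}

console.log(effectiveRate('nova-3', 'batch', { diarize: true }).toFixed(4)); // 0.0087
```

With these numbers, enabling diarization roughly doubles the per-minute rate of a batch Nova-3 request.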
Instructions
Step 1: Budget-Aware Transcription Service
```typescript
import { createClient } from '@deepgram/sdk';

interface BudgetConfig {
  monthlyLimitUsd: number;
  warningThreshold: number; // 0.0-1.0 (e.g., 0.8 = warn at 80%)
  costPerMinute: number; // Base STT cost
}

class BudgetAwareTranscriber {
  private client: ReturnType<typeof createClient>;
  private config: BudgetConfig;
  private monthlySpendUsd = 0;
  private monthlyMinutes = 0;

  constructor(apiKey: string, config: BudgetConfig) {
    this.client = createClient(apiKey);
    this.config = config;
  }

  async transcribe(source: any, options: any) {
    // Estimate cost before transcription
    const estimatedCost = this.estimateCost(options);
    const projected = this.monthlySpendUsd + estimatedCost;

    if (projected > this.config.monthlyLimitUsd) {
      throw new Error(
        `Budget exceeded: $${this.monthlySpendUsd.toFixed(2)} spent, ` +
        `$${this.config.monthlyLimitUsd} limit`
      );
    }

    if (projected > this.config.monthlyLimitUsd * this.config.warningThreshold) {
      console.warn(
        `Budget warning: ${((projected / this.config.monthlyLimitUsd) * 100).toFixed(0)}% ` +
        `of $${this.config.monthlyLimitUsd} limit`
      );
    }

    const { result, error } = await this.client.listen.prerecorded.transcribeUrl(
      source, options
    );
    if (error) throw error;

    // Track actual usage
    const duration = result.metadata.duration / 60; // Convert to minutes
    const actualCost = this.calculateCost(duration, options);
    this.monthlyMinutes += duration;
    this.monthlySpendUsd += actualCost;

    return result;
  }

  private estimateCost(options: any): number {
    // Conservative estimate: assume 5 minutes per file
    return this.calculateCost(5, options);
  }

  private calculateCost(minutes: number, options: any): number {
    let cost = minutes * this.config.costPerMinute;
    if (options.diarize) cost += minutes * 0.0044; // Diarization add-on
    return cost;
  }

  getUsageSummary() {
    return {
      minutesUsed: this.monthlyMinutes.toFixed(1),
      spentUsd: this.monthlySpendUsd.toFixed(4),
      remainingUsd: (this.config.monthlyLimitUsd - this.monthlySpendUsd).toFixed(4),
      utilizationPercent: ((this.monthlySpendUsd / this.config.monthlyLimitUsd) * 100).toFixed(1),
    };
  }
}

// Usage:
const transcriber = new BudgetAwareTranscriber(process.env.DEEPGRAM_API_KEY!, {
  monthlyLimitUsd: 100,
  warningThreshold: 0.8,
  costPerMinute: 0.0043,
});
```
Step 2: Reduce Billable Minutes with Audio Preprocessing
```bash
# Remove silence: can save 10-40% of billable minutes
ffmpeg -i input.wav \
  -af "silenceremove=stop_periods=-1:stop_duration=0.5:stop_threshold=-30dB" \
  -ar 16000 -ac 1 -acodec pcm_s16le \
  trimmed.wav

# Speed up audio (1.25x): saves 20% of billable minutes
# Deepgram handles slightly sped-up audio well
ffmpeg -i input.wav \
  -filter:a "atempo=1.25" \
  -ar 16000 -ac 1 -acodec pcm_s16le \
  faster.wav
```
```typescript
import { execFileSync } from 'child_process';

// Read a file's duration in seconds via ffprobe
function probeDuration(path: string): number {
  return parseFloat(
    execFileSync('ffprobe', [
      '-v', 'quiet', '-show_entries', 'format=duration', '-of', 'csv=p=0', path,
    ]).toString().trim()
  );
}

function measureSavings(inputPath: string) {
  // Get original duration
  const origDuration = probeDuration(inputPath);

  // Remove silence. Passing arguments as an array avoids shell quoting
  // problems with unusual file names.
  execFileSync('ffmpeg', [
    '-y', '-i', inputPath,
    '-af', 'silenceremove=stop_periods=-1:stop_duration=0.5:stop_threshold=-30dB',
    '-ar', '16000', '-ac', '1', '-acodec', 'pcm_s16le',
    '/tmp/trimmed.wav',
  ], { stdio: 'ignore' });

  const trimmedDuration = probeDuration('/tmp/trimmed.wav');

  const savings = ((1 - trimmedDuration / origDuration) * 100).toFixed(1);
  const costSaved = ((origDuration - trimmedDuration) / 60 * 0.0043).toFixed(4);

  console.log(`Original: ${origDuration.toFixed(1)}s`);
  console.log(`Trimmed: ${trimmedDuration.toFixed(1)}s`);
  console.log(`Savings: ${savings}% ($${costSaved}/file at $0.0043/min)`);
}
```
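The 20% figure for a 1.25x speed-up follows from duration scaling by 1/speed. Here is a minimal sketch of that arithmetic, including how it combines with silence removal. `speedupSavings` and `combinedSavings` are hypothetical helper names, and the 20% silence fraction in the example is an illustrative input, not a measurement.

```typescript
// Fraction of billable minutes saved when audio is sped up by `speed`.
// New duration = old duration / speed, so savings = 1 - 1/speed.
function speedupSavings(speed: number): number {
  if (speed <= 0) throw new RangeError('speed must be positive');
  return 1 - 1 / speed;
}

// Silence removal followed by a speed-up: the remaining durations multiply,
// so the two savings do not simply add.
function combinedSavings(silenceFraction: number, speed: number): number {
  if (silenceFraction < 0 || silenceFraction >= 1) {
    throw new RangeError('silenceFraction must be in [0, 1)');
  }
  return 1 - (1 - silenceFraction) * (1 - speedupSavings(speed));
}

console.log(`${(speedupSavings(1.25) * 100).toFixed(0)}%`); // 20%
console.log(`${(combinedSavings(0.2, 1.25) * 100).toFixed(0)}%`); // 36%
```

Trimming 20% silence and then speeding up by 1.25x saves 36% rather than 40%, because the speed-up only applies to the audio that remains.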
Step 3: Query Deepgram Usage API
```typescript
import { createClient } from '@deepgram/sdk';

async function getUsageDashboard(projectId: string) {
  const client = createClient(process.env.DEEPGRAM_API_KEY!);

  // Get usage for current month
  const now = new Date();
  const monthStart = new Date(now.getFullYear(), now.getMonth(), 1);

  // In SDK v3 the usage summary call is manage.getProjectUsageSummary
  const { result, error } = await client.manage.getProjectUsageSummary(projectId, {
    start: monthStart.toISOString(),
    end: now.toISOString(),
  });
  if (error) throw error;

  // Aggregate by model. Fields are read defensively: entries without a
  // `model` field are grouped under 'unknown'.
  const byModel: Record<string, { minutes: number; cost: number }> = {};
  for (const entry of (result as any).results ?? []) {
    const model = entry.model ?? 'unknown';
    if (!byModel[model]) byModel[model] = { minutes: 0, cost: 0 };
    byModel[model].minutes += (entry.hours ?? 0) * 60 + (entry.minutes ?? 0);
  }

  console.log('=== Monthly Usage ===');
  for (const [model, data] of Object.entries(byModel)) {
    // Flat batch rate; streaming and add-ons cost more
    const cost = data.minutes * 0.0043;
    console.log(`${model}: ${data.minutes.toFixed(1)} min ($${cost.toFixed(2)})`);
  }

  // Monthly projection: scale usage so far to the full month
  const dayOfMonth = now.getDate();
  const daysInMonth = new Date(now.getFullYear(), now.getMonth() + 1, 0).getDate();
  const totalMinutes = Object.values(byModel).reduce((s, d) => s + d.minutes, 0);
  const projectedMinutes = (totalMinutes / dayOfMonth) * daysInMonth;
  const projectedCost = projectedMinutes * 0.0043;

  console.log(`\nProjected monthly: ${projectedMinutes.toFixed(0)} min ($${projectedCost.toFixed(2)})`);
}
```
Step 4: Cost-Optimized Model Selection
```typescript
function recommendModel(params: {
  qualityNeeded: 'high' | 'medium' | 'low';
  isRealtime: boolean;
  languages: string[];
  budgetPerMinute?: number;
}): { model: string; pricePerMin: number; reason: string } {
  const { qualityNeeded, isRealtime, languages, budgetPerMinute } = params;

  // Multilingual -> Whisper
  if (languages.length > 1 || !['en', 'es', 'fr', 'de'].includes(languages[0])) {
    return { model: 'whisper-large', pricePerMin: 0.0048, reason: 'Multilingual support' };
  }

  // Budget constraint (batch only: the $0.0043 rate does not apply to streaming)
  if (!isRealtime && budgetPerMinute !== undefined && budgetPerMinute < 0.005) {
    return { model: 'nova-2', pricePerMin: 0.0043, reason: 'Best price per quality' };
  }

  // Real-time -> Nova-3 (streaming price $0.0059/min)
  if (isRealtime) {
    return { model: 'nova-3', pricePerMin: 0.0059, reason: 'Best real-time accuracy' };
  }

  // Quality based
  switch (qualityNeeded) {
    case 'high':
      return { model: 'nova-3', pricePerMin: 0.0043, reason: 'Highest accuracy' };
    case 'medium':
      return { model: 'nova-2', pricePerMin: 0.0043, reason: 'Good accuracy, proven' };
    case 'low':
      return { model: 'base', pricePerMin: 0.0048, reason: 'Fastest processing' };
  }
}
```
Step 5: Feature Cost Awareness
```typescript
// Feature cost breakdown per minute of audio
const featureCosts: Record<string, { cost: number; description: string }> = {
  // Free features (included with STT)
  smart_format: { cost: 0, description: 'Punctuation + paragraphs + numerals' },
  punctuate: { cost: 0, description: 'Punctuation only' },
  paragraphs: { cost: 0, description: 'Paragraph formatting' },
  summarize: { cost: 0, description: 'AI summary (included with STT)' },
  detect_topics: { cost: 0, description: 'Topic detection (included)' },
  sentiment: { cost: 0, description: 'Sentiment analysis (included)' },
  intents: { cost: 0, description: 'Intent recognition (included)' },
  redact: { cost: 0, description: 'PII redaction (included)' },

  // Paid add-ons
  diarize: { cost: 0.0044, description: 'Speaker identification (+$0.0044/min)' },
  multichannel: { cost: 0.0043, description: 'Per-channel billing (1x STT cost per channel)' },
};

function estimateJobCost(params: {
  durationMinutes: number;
  model: string;
  features: string[];
  channels?: number;
}): number {
  const baseCost = params.durationMinutes * 0.0043;

  let addOnCost = 0;
  for (const feature of params.features) {
    // Multichannel is charged through the channel multiplier below,
    // so skip it here to avoid counting it twice.
    if (feature === 'multichannel') continue;
    addOnCost += (featureCosts[feature]?.cost ?? 0) * params.durationMinutes;
  }

  // Multichannel: billed per channel
  const channelMultiplier = params.channels ?? 1;

  return (baseCost + addOnCost) * channelMultiplier;
}

// Example: 60 min meeting with diarization
// estimateJobCost({ durationMinutes: 60, model: 'nova-3', features: ['diarize'] })
// = (60 * 0.0043) + (60 * 0.0044) = $0.258 + $0.264 = $0.522
```
Output
- Budget-aware transcription with auto-blocking
- Audio preprocessing to reduce billable minutes
- Usage dashboard via Deepgram API
- Cost-optimized model recommendation
- Feature cost breakdown with estimation
Cost Optimization Quick Wins
| Strategy | Savings | Effort |
|---|---|---|
| Remove silence from audio | 10-40% | Low (ffmpeg one-liner) |
| Disable diarization when not needed | ~50% | Low (remove option) |
| Use callback for long files | Indirect (no timeouts) | Low |
| Cache repeated transcriptions | 20-60% | Medium (Redis) |
| Speed up audio 1.25x | 20% | Low (ffmpeg) |
| Use Nova-2 instead of Nova-3 | 0% (same price) | None |
| Batch pre-recorded vs streaming | 27% ($0.0043 vs $0.0059) | Medium |
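The caching row can be sketched as a wrapper that keys results on a hash of the audio bytes plus the request options, so identical requests are billed only once. This is a minimal sketch under stated assumptions: `withCache` and `TranscribeFn` are hypothetical names, and an in-memory `Map` stands in for Redis to keep the example self-contained.

```typescript
import { createHash } from 'node:crypto';

// Any function that turns audio bytes plus options into a result.
type TranscribeFn<T> = (audio: Buffer, options: Record<string, unknown>) => Promise<T>;

// Wrap a transcribe function so identical audio + options hit the API once.
function withCache<T>(transcribe: TranscribeFn<T>): TranscribeFn<T> {
  const cache = new Map<string, Promise<T>>();

  return (audio, options) => {
    // Sort option entries so key order does not change the cache key.
    const sortedOptions = Object.entries(options).sort(([a], [b]) => a.localeCompare(b));
    const key = createHash('sha256')
      .update(audio)
      .update(JSON.stringify(sortedOptions))
      .digest('hex');

    const existing = cache.get(key);
    if (existing) return existing;

    // Cache the promise itself so concurrent duplicates share one request.
    const pending = transcribe(audio, options);
    cache.set(key, pending);
    // Do not keep failed requests in the cache.
    pending.catch(() => cache.delete(key));
    return pending;
  };
}
```

A production version would likely store serialized results with a TTL in a shared cache instead of holding promises in process memory.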
Error Handling
| Issue | Cause | Solution |
|---|---|---|
| Budget exceeded | No controls | Enable budget check before transcription |
| Unexpected charges | Diarization always on | Make diarization opt-in |
| Usage API empty | Wrong project ID | Get ID from |
| Cost spike | Batch job without limits | Set concurrency limits + budget cap |
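For the cost-spike row, a concurrency limit can be sketched as a small worker pool that never runs more than a fixed number of jobs at once. `runWithLimit` is a hypothetical helper, not part of the Deepgram SDK. Each job is assumed to be a function that performs one transcription, for example through a budget-checking wrapper, so a budget error stops the batch.

```typescript
// Run async jobs with at most `limit` in flight at any time.
// Results are returned in the same order as the input jobs.
async function runWithLimit<T>(
  jobs: Array<() => Promise<T>>,
  limit: number
): Promise<T[]> {
  if (limit < 1) throw new RangeError('limit must be at least 1');

  const results: T[] = new Array(jobs.length);
  let next = 0;

  // Each worker pulls the next unclaimed job until none remain.
  async function worker(): Promise<void> {
    while (next < jobs.length) {
      const index = next++;
      results[index] = await jobs[index]();
    }
  }

  const workerCount = Math.min(limit, jobs.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Because a rejected job rejects the whole `Promise.all`, a thrown budget error surfaces to the caller. Jobs already in flight still finish, and the other workers stop only if they also hit an error, so a hard stop needs a shared cancellation flag checked in the loop.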