# claude-code-plugins-plus-skills: assemblyai-performance-tuning

## Install

Clone the upstream repo:

```bash
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
```

Or, for Claude Code, install into `~/.claude/skills/`:

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/saas-packs/assemblyai-pack/skills/assemblyai-performance-tuning" ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-skills-assemblyai-performance-tuning && rm -rf "$T"
```

Manifest: `plugins/saas-packs/assemblyai-pack/skills/assemblyai-performance-tuning/SKILL.md`
# AssemblyAI Performance Tuning

## Overview
Optimize AssemblyAI transcription performance through model selection, parallel processing, caching, and webhook-based architectures.
## Prerequisites

- `assemblyai` package installed
- Understanding of async patterns
- Redis or in-memory cache available (optional)
## Latency Benchmarks (Actual)

### Async Transcription
| Audio Duration | Approx. Processing Time | Notes |
|---|---|---|
| 30 seconds | ~10-15 seconds | Includes queue time |
| 5 minutes | ~30-60 seconds | Scales sub-linearly |
| 1 hour | ~3-5 minutes | Depends on queue load |
| 10 hours | ~15-30 minutes | Max async duration |
### Streaming
| Metric | Value |
|---|---|
| First partial transcript | ~300ms (P50) |
| Final transcript latency | ~500ms (P50) |
| End-of-turn detection | Automatic with endpointing |
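You can check these numbers against your own audio. Below is a minimal sketch, assuming the SDK's realtime streaming interface (`client.realtime.transcriber`) and a 16 kHz, 16-bit PCM mono source; `micStream` is a hypothetical placeholder for your audio input:

```typescript
import { AssemblyAI } from 'assemblyai';

const client = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY! });

// Placeholder: any async iterable of 16 kHz, 16-bit PCM mono audio chunks
declare const micStream: AsyncIterable<Buffer>;

const transcriber = client.realtime.transcriber({ sampleRate: 16_000 });

let firstAudioAt: number | null = null;
let firstPartialAt: number | null = null;

transcriber.on('transcript.partial', () => {
  // Time from first audio chunk sent to first partial transcript received
  if (firstAudioAt !== null && firstPartialAt === null) {
    firstPartialAt = Date.now();
    console.log(`First partial after ${firstPartialAt - firstAudioAt}ms`);
  }
});

transcriber.on('transcript.final', (t) => {
  console.log(`Final: "${t.text}"`);
});

await transcriber.connect();

for await (const chunk of micStream) {
  if (firstAudioAt === null) firstAudioAt = Date.now();
  transcriber.sendAudio(chunk);
}

await transcriber.close();
```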
### Model Speed vs. Accuracy

| Model | Speed | Accuracy | Price/hr |
|---|---|---|---|
| `nano` | Fastest | Good | $0.12 |
| `best` (Universal-3) | Standard | Highest | $0.37 |
| `nano` (streaming) | Real-time | High | $0.47 |
| `best` (streaming) | Real-time | Highest | $0.47 |
## Instructions

### Step 1: Choose the Right Model

```typescript
import { AssemblyAI } from 'assemblyai';

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY!,
});

// For highest accuracy (default)
const accurate = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'best',
});

// For fastest processing and lowest cost
const fast = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'nano',
});
```
### Step 2: Parallel Batch Processing

```typescript
import PQueue from 'p-queue';

const queue = new PQueue({ concurrency: 10 });

async function batchTranscribe(audioUrls: string[]) {
  const results = await Promise.all(
    audioUrls.map(url =>
      queue.add(() =>
        client.transcripts.transcribe({ audio: url, speech_model: 'nano' })
      )
    )
  );
  return results.filter(t => t.status === 'completed');
}

// Process 100 files with 10 concurrent jobs
const urls = Array.from(
  { length: 100 },
  (_, i) => `https://storage.example.com/audio-${i}.mp3`
);
const transcripts = await batchTranscribe(urls);
console.log(`Completed: ${transcripts.length}/${urls.length}`);
```
### Step 3: Use Webhooks Instead of Polling

```typescript
// SLOW: transcribe() polls every 3 seconds until done
const slow = await client.transcripts.transcribe({ audio: audioUrl });

// FAST: submit() returns immediately; a webhook notifies on completion
const fast = await client.transcripts.submit({
  audio: audioUrl,
  webhook_url: 'https://your-app.com/webhooks/assemblyai',
});

// Your webhook handler processes the result, with no polling overhead
```
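The receiving side is not shown above, so here is a minimal sketch assuming an Express app (the route path matches the `webhook_url` in the example; AssemblyAI's webhook delivery POSTs a JSON body containing `transcript_id` and `status`):

```typescript
import express from 'express';
import { AssemblyAI } from 'assemblyai';

const client = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY! });
const app = express();
app.use(express.json());

// AssemblyAI POSTs { transcript_id, status } once processing finishes
app.post('/webhooks/assemblyai', async (req, res) => {
  res.sendStatus(200); // acknowledge quickly so the delivery is not retried

  const { transcript_id, status } = req.body;
  if (status !== 'completed') {
    console.error(`Transcript ${transcript_id} did not complete:`, status);
    return;
  }

  // Fetch the full result only now that it is ready
  const transcript = await client.transcripts.get(transcript_id);
  console.log(`Received ${transcript.words?.length ?? 0} words`);
});

app.listen(3000);
```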
### Step 4: Cache Transcript Results

```typescript
import { LRUCache } from 'lru-cache';
import type { Transcript } from 'assemblyai';

const transcriptCache = new LRUCache<string, Transcript>({
  max: 500,
  ttl: 60 * 60 * 1000, // 1 hour
});

async function getCachedTranscript(transcriptId: string): Promise<Transcript> {
  const cached = transcriptCache.get(transcriptId);
  if (cached) return cached;

  const transcript = await client.transcripts.get(transcriptId);
  if (transcript.status === 'completed') {
    transcriptCache.set(transcriptId, transcript);
  }
  return transcript;
}
```
### Step 5: Redis Cache for Distributed Systems

```typescript
import Redis from 'ioredis';
import type { Transcript } from 'assemblyai';

const redis = new Redis(process.env.REDIS_URL!);

async function getCachedTranscriptRedis(transcriptId: string): Promise<Transcript> {
  const cached = await redis.get(`transcript:${transcriptId}`);
  if (cached) return JSON.parse(cached);

  const transcript = await client.transcripts.get(transcriptId);
  if (transcript.status === 'completed') {
    await redis.setex(
      `transcript:${transcriptId}`,
      3600, // 1 hour TTL
      JSON.stringify(transcript)
    );
  }
  return transcript;
}
```
### Step 6: Minimize Feature Overhead

```typescript
// Only enable features you actually need; each adds processing time

// Minimal (fastest)
const minimal = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'nano',
  punctuate: true,
  format_text: true,
});

// Full intelligence (slower, more expensive)
const full = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'best',
  speaker_labels: true,
  sentiment_analysis: true,
  entity_detection: true,
  auto_highlights: true,
  content_safety: true,
  iab_categories: true,
  summarization: true,
  summary_type: 'bullets',
});
```
### Step 7: Performance Monitoring

```typescript
async function timedTranscribe(audioUrl: string, options: Record<string, any> = {}) {
  const start = Date.now();
  const transcript = await client.transcripts.transcribe({
    audio: audioUrl,
    ...options,
  });
  const durationMs = Date.now() - start;

  const stats = {
    transcriptId: transcript.id,
    status: transcript.status,
    audioDuration: transcript.audio_duration,
    processingTimeMs: durationMs,
    ratio: transcript.audio_duration
      ? (durationMs / 1000 / transcript.audio_duration).toFixed(2)
      : 'N/A',
    wordCount: transcript.words?.length ?? 0,
    model: options.speech_model ?? 'best',
  };

  console.log('Transcription stats:', stats);
  return { transcript, stats };
}
```
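A quick way to use it is to compare models on the same file (the URL is illustrative):

```typescript
const file = 'https://storage.example.com/sample.mp3';

const { stats: nanoStats } = await timedTranscribe(file, { speech_model: 'nano' });
const { stats: bestStats } = await timedTranscribe(file, { speech_model: 'best' });

// ratio = processing seconds per second of audio (lower is faster)
console.log(`nano: ${nanoStats.ratio}, best: ${bestStats.ratio}`);
```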
## Output
- Optimal model selection based on speed/accuracy/cost trade-offs
- Parallel batch processing with concurrency control
- Webhook-based architecture (eliminates polling overhead)
- In-memory and Redis caching for transcript retrieval
- Performance monitoring with processing time ratios
## Error Handling

| Issue | Cause | Solution |
|---|---|---|
| Slow transcription | Large file + `best` model | Use the `nano` model or split the audio |
| Queue backlog | Too many concurrent submissions | Limit concurrency with `p-queue` |
| Stale cache data | Transcript re-processed | Set an appropriate TTL; invalidate on webhook (see sketch below) |
| Polling overhead | Using `transcribe()` for many files | Switch to `submit()` + webhooks |
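For the stale-cache row, one approach is a small helper that drops the Redis entry, called from the webhook handler in Step 3 (this builds on the `redis` client from Step 5; the helper name is illustrative):

```typescript
// Drop any cached copy so the next read fetches the re-processed transcript
async function invalidateTranscriptCache(transcriptId: string) {
  await redis.del(`transcript:${transcriptId}`); // key format from Step 5
}
```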
## Next Steps

For cost optimization, see the `assemblyai-cost-tuning` skill.