Skillshub assemblyai-performance-tuning

install
source · Clone the upstream repo
git clone https://github.com/ComeOnOliver/skillshub
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/jeremylongshore/claude-code-plugins-plus-skills/assemblyai-performance-tuning" ~/.claude/skills/comeonoliver-skillshub-assemblyai-performance-tuning && rm -rf "$T"
manifest: skills/jeremylongshore/claude-code-plugins-plus-skills/assemblyai-performance-tuning/SKILL.md

AssemblyAI Performance Tuning

Overview

Optimize AssemblyAI transcription performance through model selection, parallel processing, caching, and webhook-based architectures.

Prerequisites

  • assemblyai package installed
  • Understanding of async patterns
  • Redis or in-memory cache available (optional)

Latency Benchmarks (Actual)

Async Transcription

| Audio Duration | Approx. Processing Time | Notes |
|---|---|---|
| 30 seconds | ~10-15 seconds | Includes queue time |
| 5 minutes | ~30-60 seconds | Scales sub-linearly |
| 1 hour | ~3-5 minutes | Depends on queue load |
| 10 hours | ~15-30 minutes | Max async duration |

Streaming

| Metric | Value |
|---|---|
| First partial transcript | ~300ms (P50) |
| Final transcript latency | ~500ms (P50) |
| End-of-turn detection | Automatic with endpointing |

Model Speed vs. Accuracy

| Model | Speed | Accuracy | Price/hr |
|---|---|---|---|
| nano | Fastest | Good | $0.12 |
| best (Universal-3) | Standard | Highest | $0.37 |
| nova-3 (streaming) | Real-time | High | $0.47 |
| nova-3-pro (streaming) | Real-time | Highest | $0.47 |
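
The per-hour prices above translate directly into a batch cost estimate. The following is a minimal sketch, assuming the listed prices are current and that billing scales linearly with audio duration. `estimateCostUsd` is a hypothetical helper, not part of the SDK.

```typescript
// Per-hour prices copied from the table above; treat them as assumptions.
const PRICE_PER_HOUR_USD = {
  nano: 0.12,
  best: 0.37,
} as const;

type AsyncModel = keyof typeof PRICE_PER_HOUR_USD;

// Hypothetical helper: assumes cost is linear in audio duration.
function estimateCostUsd(model: AsyncModel, audioSeconds: number): number {
  const hours = audioSeconds / 3600;
  return PRICE_PER_HOUR_USD[model] * hours;
}

// Compare both async models for 100 hours of audio
const hundredHours = 100 * 3600;
console.log(`nano: $${estimateCostUsd('nano', hundredHours).toFixed(2)}`);
console.log(`best: $${estimateCostUsd('best', hundredHours).toFixed(2)}`);
```

At these prices the gap between the two models grows in proportion to volume, so the choice matters most for large archives.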

Instructions

Step 1: Choose the Right Model

import { AssemblyAI } from 'assemblyai';

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY!,
});

// For highest accuracy (default)
const accurate = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'best',
});

// For fastest processing and lowest cost
const fast = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'nano',
});

Step 2: Parallel Batch Processing

import PQueue from 'p-queue';

const queue = new PQueue({ concurrency: 10 });

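async function batchTranscribe(audioUrls: string[]) {
  // allSettled keeps one thrown error (e.g. a network failure) from rejecting the whole batch
  const settled = await Promise.allSettled(
    audioUrls.map(url =>
      queue.add(() =>
        client.transcripts.transcribe({ audio: url, speech_model: 'nano' })
      )
    )
  );

  // Keep only the transcripts that finished successfully
  return settled.flatMap(result =>
    result.status === 'fulfilled' && result.value && result.value.status === 'completed'
      ? [result.value]
      : []
  );
}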

// Process 100 files with 10 concurrent jobs
const urls = Array.from({ length: 100 }, (_, i) => `https://storage.example.com/audio-${i}.mp3`);
const transcripts = await batchTranscribe(urls);
console.log(`Completed: ${transcripts.length}/${urls.length}`);
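
To make the effect of the concurrency limit concrete, here is a dependency-free sketch of a bounded worker pool. `mapWithConcurrency` is a hypothetical helper written for illustration; it is not p-queue's API. It records each task's outcome so that one failure does not reject the whole batch.

```typescript
// Hypothetical helper: run `fn` over `items` with at most `limit` calls in flight.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<PromiseSettledResult<R>[]> {
  const results: PromiseSettledResult<R>[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++; // no race: nothing else runs between the check and the increment
      try {
        results[index] = { status: 'fulfilled', value: await fn(items[index]) };
      } catch (reason) {
        results[index] = { status: 'rejected', reason };
      }
    }
  }

  // Start up to `limit` workers; each pulls items until none are left.
  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With the client from Step 1, a call might look like `mapWithConcurrency(urls, 10, url => client.transcripts.transcribe({ audio: url }))`. Results come back in input order regardless of which job finishes first.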

Step 3: Use Webhooks Instead of Polling

// SLOW: transcribe() polls every 3 seconds until done
const slow = await client.transcripts.transcribe({ audio: audioUrl });

// FAST: submit() returns immediately, webhook notifies on completion
const fast = await client.transcripts.submit({
  audio: audioUrl,
  webhook_url: 'https://your-app.com/webhooks/assemblyai',
});
// Your webhook handler processes the result — no polling overhead
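
The receiving side of the webhook can be sketched as below. This is a framework-agnostic illustration, assuming the webhook body is JSON carrying `transcript_id` and `status` fields. `parseWebhookPayload`, `handleWebhook`, and the `onCompleted` callback are hypothetical names; wiring them into Express, Fastify, or a serverless function is left to your app.

```typescript
// Assumed shape of the JSON body posted when a transcript finishes.
interface TranscriptWebhookPayload {
  transcript_id: string;
  status: 'completed' | 'error';
}

// Validate the raw body; returns null for anything that is not a transcript notification.
function parseWebhookPayload(rawBody: string): TranscriptWebhookPayload | null {
  try {
    const body = JSON.parse(rawBody);
    if (
      typeof body?.transcript_id === 'string' &&
      (body.status === 'completed' || body.status === 'error')
    ) {
      return { transcript_id: body.transcript_id, status: body.status };
    }
  } catch {
    // Malformed JSON falls through to the null return below.
  }
  return null;
}

// Hypothetical handler: returns the HTTP status code your route should send back.
async function handleWebhook(
  rawBody: string,
  onCompleted: (transcriptId: string) => Promise<void>,
): Promise<number> {
  const payload = parseWebhookPayload(rawBody);
  if (!payload) return 400; // reject bodies we do not recognize
  if (payload.status === 'completed') {
    await onCompleted(payload.transcript_id);
  }
  return 200; // acknowledge both completed and error notifications
}
```

In a real app, `onCompleted` would typically fetch the finished transcript with `client.transcripts.get(id)` and hand it to downstream processing.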

Step 4: Cache Transcript Results

import { LRUCache } from 'lru-cache';
import type { Transcript } from 'assemblyai';

const transcriptCache = new LRUCache<string, Transcript>({
  max: 500,
  ttl: 60 * 60 * 1000, // 1 hour
});

async function getCachedTranscript(transcriptId: string): Promise<Transcript> {
  const cached = transcriptCache.get(transcriptId);
  if (cached) return cached;

  const transcript = await client.transcripts.get(transcriptId);
  if (transcript.status === 'completed') {
    transcriptCache.set(transcriptId, transcript);
  }
  return transcript;
}

Step 5: Redis Cache for Distributed Systems

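import Redis from 'ioredis';
import type { Transcript } from 'assemblyai';

const redis = new Redis(process.env.REDIS_URL!);

async function getCachedTranscriptRedis(transcriptId: string): Promise<Transcript> {
  const cached = await redis.get(`transcript:${transcriptId}`);
  // JSON.parse returns `any`; the cast documents what was stored under this key
  if (cached) return JSON.parse(cached) as Transcript;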

  const transcript = await client.transcripts.get(transcriptId);
  if (transcript.status === 'completed') {
    await redis.setex(
      `transcript:${transcriptId}`,
      3600, // 1 hour TTL
      JSON.stringify(transcript)
    );
  }
  return transcript;
}
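
The Error Handling table suggests invalidating cached entries when a webhook arrives. A minimal sketch of that step follows, assuming the same `transcript:<id>` key scheme as above. `CacheClient` and `invalidateTranscript` are hypothetical names; the interface is deliberately narrow so that any client exposing a promise-returning `del(key)`, such as ioredis, can be passed in.

```typescript
// Minimal slice of a cache client; anything with a promise-returning `del` fits.
interface CacheClient {
  del(key: string): Promise<unknown>;
}

// Same key scheme as the caching function above.
const transcriptKey = (transcriptId: string): string => `transcript:${transcriptId}`;

// Hypothetical helper: drop the cached copy so the next read refetches from the API.
async function invalidateTranscript(cache: CacheClient, transcriptId: string): Promise<void> {
  await cache.del(transcriptKey(transcriptId));
}
```

Calling this from the webhook handler before any read keeps a re-processed transcript from being served out of a stale cache entry.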

Step 6: Minimize Feature Overhead

// Only enable features you actually need — each adds processing time

// Minimal (fastest)
const minimal = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'nano',
  punctuate: true,
  format_text: true,
});

// Full intelligence (slower, more expensive)
const full = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'best',
  speaker_labels: true,
  sentiment_analysis: true,
  entity_detection: true,
  auto_highlights: true,
  content_safety: true,
  iab_categories: true,
  summarization: true,
  summary_type: 'bullets',
});

Step 7: Performance Monitoring

async function timedTranscribe(audioUrl: string, options: Record<string, any> = {}) {
  const start = Date.now();
  const transcript = await client.transcripts.transcribe({
    audio: audioUrl,
    ...options,
  });
  const durationMs = Date.now() - start;

  const stats = {
    transcriptId: transcript.id,
    status: transcript.status,
    audioDuration: transcript.audio_duration,
    processingTimeMs: durationMs,
    ratio: transcript.audio_duration
      ? (durationMs / 1000 / transcript.audio_duration).toFixed(2)
      : 'N/A',
    wordCount: transcript.words?.length ?? 0,
    model: options.speech_model ?? 'best',
  };

  console.log('Transcription stats:', stats);
  return { transcript, stats };
}
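
Individual timings are noisy, so it usually helps to aggregate the processing-time ratios over many runs. The sketch below computes nearest-rank percentiles. `percentile` and `summarizeRatios` are hypothetical helpers, and the input is assumed to be numeric ratios (convert the string from `timedTranscribe` with `Number(stats.ratio)` and skip `'N/A'` entries).

```typescript
// Nearest-rank percentile over an unsorted sample; returns NaN for an empty sample.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
}

// Hypothetical helper: summarize ratios of processing seconds to audio seconds.
function summarizeRatios(ratios: number[]) {
  return {
    count: ratios.length,
    p50: percentile(ratios, 50),
    p95: percentile(ratios, 95),
  };
}

// Example with made-up ratios from five runs
console.log(summarizeRatios([0.1, 0.2, 0.3, 0.4, 0.5]));
```

Tracking P95 alongside P50 surfaces queue-load spikes that an average would hide.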

Output

  • Optimal model selection based on speed/accuracy/cost trade-offs
  • Parallel batch processing with concurrency control
  • Webhook-based architecture (eliminates polling overhead)
  • In-memory and Redis caching for transcript retrieval
  • Performance monitoring with processing time ratios

Error Handling

| Issue | Cause | Solution |
|---|---|---|
| Slow transcription | Large file + best model | Use nano model or split audio |
| Queue backlog | Too many concurrent submissions | Limit concurrency with p-queue |
| Cache stale data | Transcript re-processed | Set appropriate TTL, invalidate on webhook |
| Polling overhead | Using transcribe() for many files | Switch to submit() + webhooks |

Next Steps

For cost optimization, see assemblyai-cost-tuning.