Claude-code-plugins-plus-skills groq-migration-deep-dive
install
source · Clone the upstream repo
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/saas-packs/groq-pack/skills/groq-migration-deep-dive" ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-skills-groq-migration-deep-dive && rm -rf "$T"
manifest:
plugins/saas-packs/groq-pack/skills/groq-migration-deep-dive/SKILL.md
Groq Migration Deep Dive
Current State
!npm list groq-sdk openai @anthropic-ai/sdk 2>/dev/null | grep -E "groq|openai|anthropic" || echo 'No LLM SDKs found'
Overview
Migrate to Groq from OpenAI, Anthropic, or other LLM providers. Groq's OpenAI-compatible API makes migration straightforward -- the primary changes are a different SDK import, different model IDs, and different response metadata. The reward is 10-50x faster inference.
Migration Complexity
| Source | Complexity | Key Changes |
|---|---|---|
| OpenAI | Low | Import, model IDs, base URL -- API shape is identical |
| Anthropic | Medium | Different API shape, message format, streaming protocol (see sketch below the table) |
| Local LLMs | Medium | Remove infra, add API calls |
| Other cloud (Bedrock, Vertex) | Medium | Remove cloud SDK, add groq-sdk |
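The steps below walk through the OpenAI path. For Anthropic, the request and response shapes change as well: the system prompt moves from a top-level parameter into the messages array, and the response text moves from a content-block array to the OpenAI chat-completions shape. A minimal before/after sketch (model IDs and prompt are illustrative):

```typescript
// BEFORE: Anthropic -- system prompt is a top-level param,
// response content is an array of blocks
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const msg = await anthropic.messages.create({
  model: "claude-3-5-sonnet-latest",
  max_tokens: 1024,
  system: "You are a helpful assistant.",
  messages: [{ role: "user", content: "Hello" }],
});
// text lives at msg.content[0].text (when the block's type is "text")

// AFTER: Groq -- system prompt becomes a message,
// response follows the OpenAI chat-completions shape
import Groq from "groq-sdk";

const groq = new Groq({ apiKey: process.env.GROQ_API_KEY });
const result = await groq.chat.completions.create({
  model: "llama-3.3-70b-versatile",
  max_tokens: 1024,
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Hello" },
  ],
});
// text lives at result.choices[0].message.content
```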
Instructions
Step 1: OpenAI to Groq Migration
```typescript
// BEFORE: OpenAI
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const result = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello" }],
});

// AFTER: Groq (minimal changes)
import Groq from "groq-sdk";

const groq = new Groq({ apiKey: process.env.GROQ_API_KEY });
const result = await groq.chat.completions.create({
  model: "llama-3.3-70b-versatile", // or "llama-3.1-8b-instant"
  messages: [{ role: "user", content: "Hello" }],
});

// Same response shape: result.choices[0].message.content
```
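Streaming carries over unchanged as well, since Groq speaks the same SSE chat-completions protocol (see the table in Step 7). A minimal sketch, reusing the `groq` client above:

```typescript
// Streaming works identically to openai-node: pass stream: true
// and iterate the chunks as they arrive.
const stream = await groq.chat.completions.create({
  model: "llama-3.1-8b-instant",
  messages: [{ role: "user", content: "Hello" }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
```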
Step 2: Model ID Mapping
```typescript
// OpenAI → Groq model equivalents
const MODEL_MAP: Record<string, string> = {
  // OpenAI → Groq (quality equivalent)
  "gpt-4o": "llama-3.3-70b-versatile",
  "gpt-4o-mini": "llama-3.1-8b-instant",
  "gpt-4-turbo": "llama-3.3-70b-versatile",
  "gpt-3.5-turbo": "llama-3.1-8b-instant",
  // Anthropic → Groq (approximate)
  "claude-3-5-sonnet": "llama-3.3-70b-versatile",
  "claude-3-haiku": "llama-3.1-8b-instant",
};

function migrateModelId(model: string): string {
  return MODEL_MAP[model] || "llama-3.3-70b-versatile";
}
```
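To apply the map without touching every call site, one option is a thin shim that rewrites the model ID per request -- a sketch, reusing the `groq` client from Step 1:

```typescript
// Drop-in shim: call sites keep passing OpenAI model IDs,
// the wrapper translates them to Groq equivalents.
async function chatCompletion(model: string, messages: any[]) {
  return groq.chat.completions.create({
    model: migrateModelId(model), // "gpt-4o-mini" → "llama-3.1-8b-instant"
    messages,
  });
}

// Existing code continues to work unchanged:
const res = await chatCompletion("gpt-4o-mini", [
  { role: "user", content: "Hello" },
]);
```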
Step 3: Provider Abstraction Layer
```typescript
import Groq from "groq-sdk";
import OpenAI from "openai";

// Build a provider-agnostic layer for zero-downtime migration
interface LLMProvider {
  name: string;
  complete(messages: any[], model: string, maxTokens: number): Promise<{
    content: string;
    model: string;
    tokens: { prompt: number; completion: number; total: number };
  }>;
}

class GroqProvider implements LLMProvider {
  name = "groq";
  private client: Groq;

  constructor() {
    this.client = new Groq();
  }

  async complete(messages: any[], model: string, maxTokens: number) {
    const result = await this.client.chat.completions.create({
      model,
      messages,
      max_tokens: maxTokens,
    });
    return {
      content: result.choices[0].message.content || "",
      model: result.model,
      tokens: {
        prompt: result.usage!.prompt_tokens,
        completion: result.usage!.completion_tokens,
        total: result.usage!.total_tokens,
      },
    };
  }
}

class OpenAIProvider implements LLMProvider {
  name = "openai";
  private client: OpenAI;

  constructor() {
    this.client = new OpenAI();
  }

  async complete(messages: any[], model: string, maxTokens: number) {
    const result = await this.client.chat.completions.create({
      model,
      messages,
      max_tokens: maxTokens,
    });
    return {
      content: result.choices[0].message.content || "",
      model: result.model,
      tokens: {
        prompt: result.usage!.prompt_tokens,
        completion: result.usage!.completion_tokens,
        total: result.usage!.total_tokens,
      },
    };
  }
}
```
Step 4: Feature Flag Traffic Shifting
```typescript
// Gradually shift traffic from OpenAI to Groq
function getProvider(): LLMProvider {
  const groqPercentage = getFeatureFlag("groq_migration_pct"); // 0-100
  if (Math.random() * 100 < groqPercentage) {
    return new GroqProvider();
  }
  return new OpenAIProvider();
}

// Migration schedule:
// Week 1: groq_migration_pct = 10  (canary)
// Week 2: groq_migration_pct = 50  (validate quality)
// Week 3: groq_migration_pct = 90  (near-complete)
// Week 4: groq_migration_pct = 100 (done, remove OpenAI)
```
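`getFeatureFlag` here stands in for whatever flag system is already in place (LaunchDarkly, Unleash, a config table). A minimal env-var-backed stand-in for local testing:

```typescript
// Minimal stand-in: read the rollout percentage from an env var,
// e.g. GROQ_MIGRATION_PCT=10. Replace with your real flag provider.
function getFeatureFlag(name: string): number {
  const raw = process.env[name.toUpperCase()] ?? "0";
  const pct = Number(raw);
  return Number.isFinite(pct) ? Math.min(100, Math.max(0, pct)) : 0;
}
```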
Step 5: Automated Migration Scanner
set -euo pipefail echo "=== Migration Assessment ===" echo "" echo "--- OpenAI references ---" grep -rn "from ['\"]openai['\"]" src/ --include="*.ts" --include="*.js" 2>/dev/null | wc -l grep -rn "openai\." src/ --include="*.ts" --include="*.js" 2>/dev/null | head -5 echo "" echo "--- Model IDs to migrate ---" grep -roh "model.*['\"]gpt-[^'\"]*['\"]" src/ --include="*.ts" --include="*.js" 2>/dev/null | sort -u echo "" echo "--- OpenAI-specific features used ---" grep -rn "\.images\.\|\.audio\.\|\.embeddings\.\|\.moderations\.\|\.files\.\|\.fine_tuning\." \ src/ --include="*.ts" --include="*.js" 2>/dev/null || echo "None (chat.completions only -- easy migration)" echo "" echo "--- API keys to update ---" grep -rn "OPENAI_API_KEY" src/ .env* --include="*.ts" --include="*.js" --include=".env*" 2>/dev/null | wc -l
Step 6: Comparison Benchmark
```typescript
// Run the same prompts through both providers to compare quality + speed
async function migrationBenchmark(prompts: string[]) {
  const groq = new GroqProvider();
  const openai = new OpenAIProvider();

  for (const prompt of prompts) {
    const messages = [{ role: "user" as const, content: prompt }];

    const startGroq = performance.now();
    const groqResult = await groq.complete(messages, "llama-3.3-70b-versatile", 256);
    const groqMs = performance.now() - startGroq;

    const startOAI = performance.now();
    const oaiResult = await openai.complete(messages, "gpt-4o-mini", 256);
    const oaiMs = performance.now() - startOAI;

    console.log(`Prompt: "${prompt.slice(0, 50)}..."`);
    console.log(`  Groq:   ${groqMs.toFixed(0)}ms | ${groqResult.tokens.total} tokens`);
    console.log(`  OpenAI: ${oaiMs.toFixed(0)}ms | ${oaiResult.tokens.total} tokens`);
    console.log(`  Speedup: ${(oaiMs / groqMs).toFixed(1)}x faster with Groq`);
    console.log();
  }
}
```
Step 7: Key Differences to Handle
| Feature | OpenAI | Groq |
|---|---|---|
| SDK import | `openai` | `groq-sdk` |
| Env var | `OPENAI_API_KEY` | `GROQ_API_KEY` |
| Models | `gpt-4o`, `gpt-4o-mini` | `llama-3.3-70b-versatile`, `llama-3.1-8b-instant` |
| Embeddings | Supported | Not available (use OpenAI or local; hybrid sketch below) |
| Fine-tuning | Supported | Not available |
| Image generation | Supported (DALL·E) | Not available |
| Audio (STT) | Whisper | `whisper-large-v3` (faster) |
| Structured outputs | Supported | Supported (same format) |
| Tool calling | Supported | Supported (same format) |
| JSON mode | `response_format: {"type": "json_object"}` | Same |
| Vision | `gpt-4o` with images | Llama 4 Scout/Maverick |
| Streaming | Supported | Supported (same SSE format) |
| Response usage | Standard fields | Adds `queue_time`, `prompt_time`, `completion_time` |
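Since Groq does not offer embeddings, a common interim pattern is a hybrid setup: all chat traffic on Groq, OpenAI retained only for embeddings. A sketch, assuming `text-embedding-3-small` as the embedding model:

```typescript
import Groq from "groq-sdk";
import OpenAI from "openai";

const groq = new Groq();     // GROQ_API_KEY -- all chat traffic
const openai = new OpenAI(); // OPENAI_API_KEY -- embeddings only

async function embed(text: string): Promise<number[]> {
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });
  return res.data[0].embedding;
}

async function chat(prompt: string): Promise<string> {
  const res = await groq.chat.completions.create({
    model: "llama-3.3-70b-versatile",
    messages: [{ role: "user", content: prompt }],
  });
  return res.choices[0].message.content || "";
}
```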
Rollback Plan
```bash
set -euo pipefail

# Immediate rollback: flip feature flag
#   groq_migration_pct = 0

# Verify:
# - All requests routing to OpenAI
# - Error rates returned to baseline
# - No Groq API calls in logs
```
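Beyond the flag flip, an in-process safety net can catch transient Groq failures without a deploy. A sketch using the Step 3 providers (model choices are illustrative):

```typescript
// Automatic fallback: try Groq first, fall back to OpenAI on failure.
// Pairs with the feature flag -- the flag stops Groq traffic entirely,
// this absorbs transient Groq errors without paging anyone.
async function completeWithFallback(
  messages: any[],
  maxTokens: number,
): Promise<{ content: string; provider: string }> {
  const groq = new GroqProvider();
  try {
    const r = await groq.complete(messages, "llama-3.3-70b-versatile", maxTokens);
    return { content: r.content, provider: "groq" };
  } catch (err) {
    console.error("Groq call failed, falling back to OpenAI:", err);
    const openai = new OpenAIProvider();
    const r = await openai.complete(messages, "gpt-4o-mini", maxTokens);
    return { content: r.content, provider: "openai" };
  }
}
```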
Error Handling
| Issue | Cause | Solution |
|---|---|---|
| Quality regression | Different model strengths | Tune system prompts for Llama models |
| Missing features | Groq doesn't have embeddings/images | Keep OpenAI for those features |
| Rate limits | Different limits than OpenAI | Configure per-model rate limits |
| Cost increase | Different pricing structure | Route simple tasks to 8B model (routing sketch below) |
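The cost mitigation in the last row -- routing simple tasks to the 8B model -- can start as a plain length heuristic. A sketch; the thresholds are assumptions to tune against real traffic:

```typescript
// Naive complexity routing: short, single-turn prompts go to the fast
// 8B model; long or multi-turn conversations go to the 70B model.
// The 500-character threshold is an illustrative starting point.
function pickModel(messages: { role: string; content: string }[]): string {
  const totalChars = messages.reduce((n, m) => n + m.content.length, 0);
  const isSimple = messages.length <= 2 && totalChars < 500;
  return isSimple ? "llama-3.1-8b-instant" : "llama-3.3-70b-versatile";
}
```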
Resources
Next Steps
For ongoing SDK version upgrades, see groq-upgrade-migration.