Groq Common Errors
Overview
Reference for common Groq API errors, their root causes, and fixes. Groq returns standard HTTP status codes with a structured JSON error body, plus rate-limit headers.
Error Response Format
    {
      "error": {
        "message": "Rate limit reached for model `llama-3.3-70b-versatile`...",
        "type": "tokens",
        "code": "rate_limit_exceeded"
      }
    }
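
The error body above can be unpacked in a few lines. This is a minimal sketch that assumes the body follows the `error.message` / `error.type` / `error.code` shape shown; `parse_groq_error` and `GroqError` are hypothetical names for illustration, not part of any SDK.

```python
import json
from typing import NamedTuple, Optional


class GroqError(NamedTuple):
    """Fields of the structured error body (hypothetical container)."""
    message: str
    type: Optional[str]
    code: Optional[str]


def parse_groq_error(body: str) -> Optional[GroqError]:
    """Return the structured error from a response body, or None if absent."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None  # Not JSON, e.g. an HTML error page from a proxy
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    return GroqError(
        message=str(error.get("message", "")),
        type=error.get("type"),
        code=error.get("code"),
    )


sample = '{"error": {"message": "Rate limit reached", "type": "tokens", "code": "rate_limit_exceeded"}}'
err = parse_groq_error(sample)
print(err.code if err else "no structured error")
```

Branching on `code` rather than on the message text keeps the handling stable if the wording of the message changes.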
Quick Diagnostic
    set -euo pipefail

    # 1. Verify API key is valid
    curl -s https://api.groq.com/openai/v1/models \
      -H "Authorization: Bearer $GROQ_API_KEY" | jq '.data | length'

    # 2. Check specific model availability
    curl -s https://api.groq.com/openai/v1/models \
      -H "Authorization: Bearer $GROQ_API_KEY" | jq '.data[].id' | sort

    # 3. Test a minimal completion
    curl -s https://api.groq.com/openai/v1/chat/completions \
      -H "Authorization: Bearer $GROQ_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"model":"llama-3.1-8b-instant","messages":[{"role":"user","content":"ping"}],"max_tokens":5}' | jq .
Error Reference
401 — Authentication Error
Authentication error: Invalid API key provided
Causes: Key missing, revoked, or malformed.
Fix:

    # Verify key is set and starts with gsk_
    echo "${GROQ_API_KEY:0:4}"  # Should print "gsk_"

    # Test key directly
    curl -s -o /dev/null -w "%{http_code}" \
      https://api.groq.com/openai/v1/models \
      -H "Authorization: Bearer $GROQ_API_KEY"  # Should print 200
429 — Rate Limit Exceeded
Rate limit reached for model `llama-3.3-70b-versatile` in organization `org_xxx` on tokens per minute (TPM): Limit 6000, Used 5800, Requested 500.
Causes: RPM (requests/min), TPM (tokens/min), or RPD (requests/day) limit hit.
Rate limit headers returned:
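| Header | Description |
|---|---|
| `retry-after` | Seconds to wait before retrying |
| `x-ratelimit-limit-requests` | Max requests per window |
| `x-ratelimit-limit-tokens` | Max tokens per window |
| `x-ratelimit-remaining-requests` | Requests remaining |
| `x-ratelimit-remaining-tokens` | Tokens remaining |
| `x-ratelimit-reset-requests` | When request limit resets |
| `x-ratelimit-reset-tokens` | When token limit resets |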
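
To turn these headers into a wait time, something like the following may help. It is a sketch under two assumptions that should be checked against a live response: that the headers are named `retry-after`, `x-ratelimit-reset-requests`, and `x-ratelimit-reset-tokens`, and that reset values arrive as duration strings such as `2m59.56s` or `7.66s`. The helper names `parse_duration` and `seconds_to_wait` are hypothetical.

```python
import re
from typing import Mapping

# ASSUMPTION: reset headers use duration strings like "2m59.56s" or "7.66s".
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}


def parse_duration(value: str) -> float:
    """Convert '2m59.56s', '7.66s', '450ms', or a bare number of seconds to seconds."""
    value = value.strip()
    try:
        return float(value)  # Bare number: treat as seconds
    except ValueError:
        pass
    parts = _DURATION_PART.findall(value)
    # Reject strings with anything left over after the recognised parts.
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"unrecognised duration: {value!r}")
    return sum(float(n) * _UNIT_SECONDS[u] for n, u in parts)


def seconds_to_wait(headers: Mapping[str, str], default: float = 10.0) -> float:
    """Prefer retry-after; otherwise wait for the longer reset window (conservative)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    if "retry-after" in lowered:
        try:
            return parse_duration(lowered["retry-after"])
        except ValueError:
            pass  # Fall through to the reset headers
    waits = []
    for name in ("x-ratelimit-reset-requests", "x-ratelimit-reset-tokens"):
        if name in lowered:
            try:
                waits.append(parse_duration(lowered[name]))
            except ValueError:
                continue
    return max(waits) if waits else default


print(seconds_to_wait({"retry-after": "2"}))
```

Taking the longer of the two reset windows is a deliberately cautious choice; a caller that knows which limit was hit could read only the matching header.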
Fix:
    import Groq from "groq-sdk";

    async function handleRateLimit<T>(fn: () => Promise<T>): Promise<T> {
      try {
        return await fn();
      } catch (err) {
        if (err instanceof Groq.APIError && err.status === 429) {
          // Fall back to 10s if the header is missing or not a number
          const parsed = Number.parseInt(err.headers?.["retry-after"] ?? "", 10);
          const retryAfter = Number.isNaN(parsed) ? 10 : parsed;
          console.warn(`Rate limited. Waiting ${retryAfter}s...`);
          await new Promise((r) => setTimeout(r, retryAfter * 1000));
          return fn(); // Single retry; a second 429 propagates to the caller
        }
        throw err;
      }
    }
400 — Bad Request
Invalid parameter: model 'mixtral-8x7b-32768' is not available
Causes: Deprecated model ID, invalid parameters, or schema violation.
Common deprecated model IDs:
| Deprecated | Replacement |
|---|---|
| or |
| |
| |
Fix: Check current models at console.groq.com/docs/models or call `GET /openai/v1/models`.
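
A configured model ID can be checked against the models listing before any requests are sent. The sketch below assumes the listing has the `{"data": [{"id": ...}]}` shape implied by the `jq '.data[].id'` diagnostic earlier; `model_ids` and `unavailable` are hypothetical helper names, and the payload here is a hand-written sample rather than a real API response.

```python
from typing import Iterable, List, Set


def model_ids(models_payload: dict) -> Set[str]:
    """Extract model IDs from a models listing ({"data": [{"id": ...}, ...]})."""
    return {m["id"] for m in models_payload.get("data", []) if "id" in m}


def unavailable(requested: Iterable[str], models_payload: dict) -> List[str]:
    """Return the requested model IDs that the listing does not contain."""
    return sorted(set(requested) - model_ids(models_payload))


# Hand-written sample standing in for the JSON returned by the models endpoint.
sample_payload = {
    "data": [
        {"id": "llama-3.1-8b-instant"},
        {"id": "llama-3.3-70b-versatile"},
    ]
}

missing = unavailable(["mixtral-8x7b-32768", "llama-3.1-8b-instant"], sample_payload)
print(missing)
```

Running a check like this at startup surfaces a deprecated ID once, instead of as a 400 on every request.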
413 — Request Too Large
Maximum context length is 131072 tokens. However, your messages resulted in 140000 tokens.
Fix: Reduce the prompt size or split the input across smaller requests. All current Llama models have a 128K (131,072-token) context window.
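
One way to split an oversized input is to chunk it on paragraph boundaries against a token budget. The sketch below is illustrative only: it assumes roughly four characters per token, which is a crude heuristic and not the model's tokenizer, so leave generous headroom below the real limit. `estimate_tokens` and `split_to_budget` are hypothetical names.

```python
from typing import List

CONTEXT_LIMIT_TOKENS = 131_072  # Limit quoted in the error message above
CHARS_PER_TOKEN = 4  # ASSUMPTION: rough heuristic, not the model's tokenizer


def estimate_tokens(text: str) -> int:
    """Crude token estimate; use a real tokenizer when accuracy matters."""
    return -(-len(text) // CHARS_PER_TOKEN)  # Ceiling division


def split_to_budget(text: str, max_tokens: int) -> List[str]:
    """Split text on blank lines so each chunk fits the estimated budget."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    max_chars = max_tokens * CHARS_PER_TOKEN
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        # Hard-split any single paragraph that is too long on its own.
        pieces = [
            paragraph[i:i + max_chars] for i in range(0, len(paragraph), max_chars)
        ] or [""]
        for piece in pieces:
            candidate = piece if not current else current + "\n\n" + piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


text = "a" * 10 + "\n\n" + "b" * 10 + "\n\n" + "c" * 30
chunks = split_to_budget(text, max_tokens=5)
print([len(c) for c in chunks])
```

In practice the budget passed to `split_to_budget` would sit well under `CONTEXT_LIMIT_TOKENS`, since the system prompt and the requested completion also count toward the context window.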
500 / 503 — Server Errors
Internal server error / Service temporarily unavailable
Causes: Groq infrastructure issue or an overloaded model.
Fix: Retry with backoff, fall back to a different model, and check status.groq.com.
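
The retry-then-fall-back advice can be sketched as below. This is one possible shape, not a prescribed implementation: `TransientError` is a hypothetical stand-in for whatever exception your client raises on a 500 or 503, `call_with_backoff` is a hypothetical helper, and the delays (1s, 2s, 4s with up to 25% jitter) are arbitrary starting values.

```python
import random
import time
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a 500/503 raised by the client in use."""


def call_with_backoff(
    call: Callable[[str], T],
    models: Sequence[str],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try each model in order, retrying transient failures with exponential backoff."""
    last_error: Optional[Exception] = None
    for model in models:
        for attempt in range(max_attempts):
            try:
                return call(model)
            except TransientError as exc:
                last_error = exc
                if attempt < max_attempts - 1:
                    # Delay doubles each attempt; jitter avoids synchronized retries.
                    delay = base_delay * (2 ** attempt)
                    sleep(delay + random.uniform(0, delay * 0.25))
        # Every attempt for this model failed; move on to the next model.
    if last_error is None:
        raise ValueError("no attempts were made (empty models or max_attempts < 1)")
    raise last_error


# Usage with a fake call: the first model always fails, the second succeeds.
calls: List[str] = []
delays: List[float] = []


def fake_call(model: str) -> str:
    calls.append(model)
    if model == "llama-3.3-70b-versatile":
        raise TransientError("503")
    return "ok from " + model


result = call_with_backoff(
    fake_call,
    ["llama-3.3-70b-versatile", "llama-3.1-8b-instant"],
    max_attempts=2,
    sleep=delays.append,  # Record delays instead of actually sleeping
)
print(result)
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; production code would leave the default in place.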
SDK-Specific Errors
TypeScript:
    import Groq from "groq-sdk";

    const groq = new Groq(); // Reads GROQ_API_KEY from the environment

    try {
      await groq.chat.completions.create({ /* ... */ });
    } catch (err) {
      // Check subclasses first: they all extend Groq.APIError, so a leading
      // APIError check would make the more specific branches unreachable.
      if (err instanceof Groq.RateLimitError) {
        console.error("Rate limited:", err.message);
      } else if (err instanceof Groq.AuthenticationError) {
        console.error("Auth failed:", err.message);
      } else if (err instanceof Groq.APIConnectionError) {
        console.error("Network error:", err.message);
      } else if (err instanceof Groq.APIError) {
        console.error(`Status: ${err.status}, Message: ${err.message}`);
      } else {
        throw err; // Not an SDK error
      }
    }
Python:
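    from groq import (
        Groq,
        APIConnectionError,
        APIStatusError,
        AuthenticationError,
        RateLimitError,
    )

    client = Groq()  # Reads GROQ_API_KEY from the environment

    try:
        client.chat.completions.create(
            model="llama-3.1-8b-instant",
            messages=[{"role": "user", "content": "ping"}],
        )
    except RateLimitError as e:
        print(f"Rate limited: {e.message}")
    except AuthenticationError as e:
        print(f"Auth error: {e.message}")
    except APIStatusError as e:  # Any other non-2xx response
        print(f"API error {e.status_code}: {e.message}")
    except APIConnectionError as e:  # Network failure, so no status code
        print(f"Network error: {e}")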
Escalation Path
- Check status.groq.com for ongoing incidents
- Collect the request ID from the error response (`x-request-id` header)
- Run the `groq-debug-bundle` skill to gather diagnostics
- Contact Groq support with the request ID and debug bundle
Next Steps
For comprehensive debugging, see `groq-debug-bundle`.