Agent-skills api-rate-limiting-helper

install
source · Clone the upstream repo
git clone https://github.com/LambdaTest/agent-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/LambdaTest/agent-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/api/API-Ratelimit-Helper" ~/.claude/skills/lambdatest-agent-skills-api-rate-limiting-helper && rm -rf "$T"
manifest: api/API-Ratelimit-Helper/SKILL.md

API Rate Limiting Skill

Design complete rate limiting, quota, and retry systems for any API.


Rate Limiting Algorithms

| Algorithm | Best For | Trade-offs |
|---|---|---|
| Token bucket | Bursty traffic with sustained avg | Allows bursts; slightly complex |
| Leaky bucket | Strict rate enforcement | Smooths bursts; can feel slow |
| Fixed window | Simple counting | Boundary spike problem |
| Sliding window log | Precise limiting | Memory-intensive |
| Sliding window counter | Balance of precision/memory | Best for most APIs |

Recommendation: Use sliding window counter for API endpoints, token bucket for streaming/upload endpoints.
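
The two recommended algorithms can be sketched in a few lines each. This is a minimal single-process sketch, not a production limiter: the class names, the `allow()` method, and the injectable `clock` parameter are assumptions made for illustration, and a real deployment would typically keep these counters in a shared store.

```python
import time


class SlidingWindowCounter:
    """Approximate sliding window: weight the previous fixed window's count
    by the fraction of it that still overlaps the sliding window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock            # injectable so the limiter can be tested
        self.current_index = None     # index of the current fixed window
        self.current_count = 0
        self.previous_count = 0

    def allow(self):
        now = self.clock()
        index = int(now // self.window)
        if self.current_index is None:
            self.current_index = index
        elif index != self.current_index:
            # Roll over; if more than one window passed, the previous one is empty.
            adjacent = index == self.current_index + 1
            self.previous_count = self.current_count if adjacent else 0
            self.current_count = 0
            self.current_index = index
        elapsed_fraction = (now % self.window) / self.window
        estimated = self.previous_count * (1 - elapsed_fraction) + self.current_count
        if estimated >= self.limit:
            return False
        self.current_count += 1
        return True


class TokenBucket:
    """Refill at `rate` tokens per second up to `capacity`; a request costs one token."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.updated = clock()

    def allow(self):
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The sliding window counter stores only two integers per client, which is why it sits between the fixed window (cheap, but spiky at boundaries) and the sliding window log (exact, but one entry per request).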


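Response Headers (de facto convention)

The X-RateLimit-* names are a common convention rather than an RFC standard; of the headers below, only Retry-After is defined by an RFC (RFC 9110).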

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1700000060
X-RateLimit-Policy: 100;w=60;comment="per minute"
Retry-After: 18

429 Response Body

{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. You have exceeded 100 requests per minute.",
  "retry_after_seconds": 18,
  "limit": 100,
  "window": "60s",
  "reset_at": "2024-01-01T00:01:00Z"
}
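
A server-side helper that emits the headers and body shown above might look like the following. It is a sketch under stated assumptions: `build_429` and its parameters are hypothetical names, and the function returns a plain `(status, headers, body)` tuple rather than targeting any particular web framework.

```python
import json
import math
from datetime import datetime, timezone


def build_429(limit, window_seconds, reset_epoch, now_epoch):
    """Return (status, headers, body) for a rate-limited request.

    reset_epoch and now_epoch are Unix timestamps in seconds.
    """
    retry_after = max(0, math.ceil(reset_epoch - now_epoch))
    reset_at = datetime.fromtimestamp(reset_epoch, tz=timezone.utc)
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": "0",
        "X-RateLimit-Reset": str(int(reset_epoch)),
        "Retry-After": str(retry_after),
        "Content-Type": "application/json",
    }
    body = {
        "error": "rate_limit_exceeded",
        "message": (
            f"Too many requests. You have exceeded {limit} requests "
            f"per {window_seconds} seconds."
        ),
        "retry_after_seconds": retry_after,
        "limit": limit,
        "window": f"{window_seconds}s",
        "reset_at": reset_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    return 429, headers, json.dumps(body)
```

Deriving `Retry-After`, `retry_after_seconds`, and `reset_at` from the same two timestamps keeps the header and the body from disagreeing with each other.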

Tiered Quota Design

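| Tier | Requests/min | Requests/day | Burst | Concurrent |
|---|---|---|---|---|
| Free | 10 | 1,000 | 20 | 2 |
| Starter | 100 | 50,000 | 200 | 10 |
| Pro | 1,000 | 500,000 | 2,000 | 50 |
| Enterprise | Custom | Unlimited | Custom | Custom |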
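
One way to carry this table into code is a small lookup keyed by plan name. The sketch below is illustrative only: `Tier`, `TIERS`, and `within_daily_quota` are hypothetical names, and `None` is used as a stand-in for both "Custom" and "Unlimited", which a real system would model separately.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    requests_per_minute: Optional[int]  # None = custom, negotiated per customer
    requests_per_day: Optional[int]     # None = unlimited
    burst: Optional[int]                # None = custom
    concurrent: Optional[int]           # None = custom


# Values copied from the tier table above.
TIERS = {
    "free": Tier(10, 1_000, 20, 2),
    "starter": Tier(100, 50_000, 200, 10),
    "pro": Tier(1_000, 500_000, 2_000, 50),
    "enterprise": Tier(None, None, None, None),
}


def within_daily_quota(plan, requests_today):
    """Return True if one more request fits in the plan's daily quota."""
    limit = TIERS[plan].requests_per_day
    return limit is None or requests_today < limit
```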

Quota Endpoints

GET  /api/v1/account/quota         — current usage vs limits
GET  /api/v1/account/quota/history — usage over time

Response:

{
  "plan": "pro",
  "period": "2024-01",
  "limits": { "requests_per_minute": 1000, "requests_per_day": 500000 },
  "usage": { "requests_today": 12345, "requests_this_minute": 234 },
  "resets_at": "2024-02-01T00:00:00Z"
}

Retry Logic (client-side)

Exponential backoff with jitter

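import random
import time


class RateLimitError(Exception):
    """Assumed exception type: the caller's HTTP layer raises it on a 429."""

    def __init__(self, message="rate limited", retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after  # seconds from Retry-After, or None


def retry_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=60.0):
    """Call fn(), retrying on RateLimitError for up to max_retries attempts."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError as e:
            if attempt == max_retries - 1:
                raise
            # Use the Retry-After value if present, else exponential backoff
            if e.retry_after is not None:
                delay = e.retry_after
            else:
                delay = base_delay * (2 ** attempt)
            delay = min(delay, max_delay)
            # Add up to 10% jitter to prevent a thundering herd
            delay += random.uniform(0, delay * 0.1)
            time.sleep(delay)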
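
The backoff function above expects `retry_after` as a number of seconds, but per RFC 9110 a Retry-After header may carry either delay seconds or an HTTP date. A small parser can normalize both forms; `parse_retry_after` and its `now` parameter are hypothetical names introduced for this sketch.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None):
    """Return the delay in seconds encoded in a Retry-After value, or None."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        # delay-seconds form, e.g. "18"
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable: let the caller fall back to backoff
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

Returning `None` for a missing or malformed header lets the caller fall through to exponential backoff instead of failing.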

Retryable vs Non-retryable status codes

| Status | Retry? | Strategy |
|---|---|---|
| 429 | Yes | Respect Retry-After header |
| 500 | Yes | Exponential backoff |
| 502/503 | Yes | Exponential backoff |
| 504 | Yes | Exponential backoff |
| 400 | No | Fix request |
| 401 | Once | Refresh token, then retry once |
| 403 | No | Fix permissions |
| 404 | No | Fix URL |
| 422 | No | Fix payload |
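
The table maps directly onto a small classification function that a client can consult before retrying. The names `RETRYABLE`, `RETRY_ONCE_AFTER_REFRESH`, and `retry_decision`, and the three string results, are assumptions made for this sketch.

```python
# Status codes taken from the table above.
RETRYABLE = {429, 500, 502, 503, 504}
RETRY_ONCE_AFTER_REFRESH = {401}


def retry_decision(status):
    """Map an HTTP status code to 'retry', 'refresh_then_retry_once', or 'fail'."""
    if status in RETRYABLE:
        return "retry"
    if status in RETRY_ONCE_AFTER_REFRESH:
        return "refresh_then_retry_once"
    # 400, 403, 404, 422 and anything unlisted: retrying will not help.
    return "fail"
```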

Circuit Breaker Pattern

States: CLOSED → OPEN → HALF-OPEN → CLOSED

CLOSED: normal operation
  - Track failure rate in rolling window
  - If failure rate > threshold (e.g. 50% in 10s): → OPEN

OPEN: reject all requests immediately (fail-fast)
  - Return 503 without calling downstream
  - After cooldown period (e.g. 30s): → HALF-OPEN

HALF-OPEN: allow limited traffic through
  - If first N requests succeed: → CLOSED
  - If any fail: → OPEN again
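
The state machine above can be sketched as a small wrapper around a downstream call. This is a single-threaded illustration, not a hardened implementation: `CircuitBreaker`, `CircuitOpenError`, and all parameter names are hypothetical, and `min_calls` is an added assumption so that one early failure cannot trip the breaker. Raising `CircuitOpenError` stands in for returning a 503.

```python
import time
from collections import deque


class CircuitOpenError(Exception):
    """Raised instead of calling downstream while the breaker is OPEN."""


class CircuitBreaker:
    def __init__(self, failure_threshold=0.5, window_seconds=10.0, min_calls=5,
                 cooldown_seconds=30.0, half_open_successes=3, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.window = window_seconds
        self.min_calls = min_calls
        self.cooldown = cooldown_seconds
        self.half_open_successes = half_open_successes
        self.clock = clock
        self.state = "CLOSED"
        self.results = deque()  # (timestamp, succeeded) pairs in the rolling window
        self.opened_at = None
        self.trial_successes = 0

    def call(self, fn):
        now = self.clock()
        if self.state == "OPEN":
            if now - self.opened_at < self.cooldown:
                raise CircuitOpenError("circuit open; failing fast")
            # Cooldown elapsed: let trial traffic through.
            self.state = "HALF-OPEN"
            self.trial_successes = 0
        try:
            result = fn()
        except Exception:
            self._record(now, False)
            raise
        self._record(now, True)
        return result

    def _open(self, now):
        self.state = "OPEN"
        self.opened_at = now
        self.results.clear()

    def _record(self, now, succeeded):
        if self.state == "HALF-OPEN":
            if not succeeded:
                self._open(now)  # any failure during the trial reopens
                return
            self.trial_successes += 1
            if self.trial_successes >= self.half_open_successes:
                self.state = "CLOSED"
            return
        self.results.append((now, succeeded))
        # Drop results that have aged out of the rolling window.
        while self.results and self.results[0][0] <= now - self.window:
            self.results.popleft()
        failures = sum(1 for _, ok in self.results if not ok)
        enough = len(self.results) >= self.min_calls
        if enough and failures / len(self.results) > self.failure_threshold:
            self._open(now)
```

A concurrent version would also need to cap how many requests are in flight during HALF-OPEN; this sketch leaves that out.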

Idempotency Keys

For state-changing requests that may be retried:

POST /api/v1/payments
Idempotency-Key: uuid-v4-client-generated

Response includes:
Idempotency-Key: uuid-v4-client-generated
X-Idempotent-Replayed: true  (if this is a duplicate)

Store: idempotency key → response, expire after 24h. Return cached response for duplicate keys.
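
The storage rule above can be sketched with an in-memory map and a TTL. `IdempotencyStore` and `execute` are hypothetical names, and the sketch assumes a single process; a shared store with an atomic set-if-absent operation would be needed to handle concurrent duplicates safely.

```python
import time


class IdempotencyStore:
    """In-memory idempotency key -> response cache with a TTL (24h by default)."""

    def __init__(self, ttl_seconds=24 * 60 * 60, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, response)

    def execute(self, key, handler):
        """Run handler() at most once per live key; return (response, replayed)."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Duplicate key within the TTL: replay the cached response.
            return entry[1], True
        response = handler()
        self._entries[key] = (now + self.ttl, response)
        return response, False
```

The `replayed` flag is what a server would surface as the `X-Idempotent-Replayed: true` header.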


After Completing the API Ratelimit Output

Once the API ratelimit output is delivered, ask the user:

"Would you like me to generate API documentation for this design? (yes/no)"

If the user says yes:

  • Check if the API Documentation skill is available in the installed skills list
  • If the skill is available:
    • Read and follow the instructions in the API Documentation skill
    • Use the API rate limiting output above as the input
  • If the skill is NOT available:
    • Inform the user: "It looks like the API Documentation skill isn't installed. You can install it and re-run."

If the user says no:

  • End the task here