api-load-tester

Install

Clone the upstream repo:

git clone https://github.com/TerminalSkills/skills

Or, for Claude Code, install directly into ~/.claude/skills/:

T=$(mktemp -d) && git clone --depth=1 https://github.com/TerminalSkills/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/api-load-tester" ~/.claude/skills/terminalskills-skills-api-load-tester && rm -rf "$T"

Manifest: skills/api-load-tester/SKILL.md

API Load Tester

Overview

This skill generates realistic load test scripts from API definitions and executes them with proper ramp-up patterns, authentication flows, and assertions. It produces clear reports identifying breaking points, bottlenecks, and latency percentiles at each traffic level.

Instructions

Step 1: Choose Tool and Gather API Info

Prefer k6 for complex scenarios (multi-step flows, thresholds, custom metrics). Use wrk for quick single-endpoint benchmarks. Use autocannon if only Node.js is available.
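
For the quick single-endpoint case, a wrk one-liner is usually enough. A minimal sketch, assuming wrk is installed and using a placeholder URL:

wrk -t4 -c100 -d30s --latency 'https://api.example.com/api/search?q=laptop'

Here -t sets worker threads, -c open connections, -d the duration, and --latency adds a latency distribution to the summary.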

Gather endpoint information from:

  • OpenAPI/Swagger spec files
  • Route definitions (Express, FastAPI, etc.)
  • User-described endpoints

Step 2: Generate Realistic Payloads

Read request/response types from the codebase (TypeScript interfaces, Python dataclasses, Go structs) and generate payloads with the following properties (a sketch follows the list):

  • Realistic field values (not "test123" or "foo")
  • Proper data distributions (varied product IDs, realistic quantities)
  • Edge cases mixed in (long strings, special characters at ~5% rate)
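
A minimal sketch of what this can look like in a k6 script (the field names and value pools are illustrative, not taken from any real schema):

// Pools of realistic values instead of "test123" or "foo"
const PRODUCT_IDS = ['SKU-4411', 'SKU-8702', 'SKU-1389', 'SKU-5526'];
const EDGE_STRINGS = ['a'.repeat(2048), 'Ünïcødé ✓ テスト', "O'Brien & Sons <script>"];

function buildOrderPayload() {
  return JSON.stringify({
    // Varied product IDs rather than one hot key
    productId: PRODUCT_IDS[Math.floor(Math.random() * PRODUCT_IDS.length)],
    // Skewed quantities: mostly 1-2, occasionally a bulk order
    quantity: Math.random() < 0.8 ? 1 + Math.floor(Math.random() * 2)
                                  : 3 + Math.floor(Math.random() * 8),
    // ~5% edge cases: long strings, special characters
    note: Math.random() < 0.05
      ? EDGE_STRINGS[Math.floor(Math.random() * EDGE_STRINGS.length)]
      : 'Please deliver after 5pm',
  });
}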

Step 3: Design Test Scenarios

Create scenarios appropriate for the goal:

Ramp-up test (finding breaking point):

stages: [
  { duration: '2m', target: 50 },    // warm-up
  { duration: '5m', target: 200 },   // ramp
  { duration: '3m', target: 500 },   // push
  { duration: '2m', target: 500 },   // sustain
  { duration: '2m', target: 0 },     // cool-down
]

Soak test (finding memory leaks, connection exhaustion):

stages: [
  { duration: '5m', target: 100 },   // ramp
  { duration: '60m', target: 100 },  // sustain
  { duration: '5m', target: 0 },     // cool-down
]

Spike test (sudden traffic burst):

stages: [
  { duration: '2m', target: 50 },    // normal
  { duration: '30s', target: 500 },  // spike
  { duration: '5m', target: 500 },   // sustain spike
  { duration: '30s', target: 50 },   // drop back
]
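
Any of these stage arrays drops into the script's options export. One convenient pattern (a sketch; the PROFILE variable name is made up) is to pick the profile at run time with k6's -e flag:

const PROFILES = {
  ramp: [
    { duration: '2m', target: 50 },
    { duration: '5m', target: 200 },
    { duration: '3m', target: 500 },
    { duration: '2m', target: 500 },
    { duration: '2m', target: 0 },
  ],
  soak: [
    { duration: '5m', target: 100 },
    { duration: '60m', target: 100 },
    { duration: '5m', target: 0 },
  ],
  spike: [
    { duration: '2m', target: 50 },
    { duration: '30s', target: 500 },
    { duration: '5m', target: 500 },
    { duration: '30s', target: 50 },
  ],
};

export const options = {
  // Select with: k6 run -e PROFILE=spike script.js (defaults to ramp)
  stages: PROFILES[__ENV.PROFILE || 'ramp'],
};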

Step 4: Include Proper Assertions

Always add thresholds:

thresholds: {
  http_req_duration: ['p(95)<800', 'p(99)<2000'],
  http_req_failed: ['rate<0.05'],
  // Custom per-endpoint if needed
}
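
The "custom per-endpoint" variant uses k6's tag-based sub-metrics: tag each request with a name, then set a threshold on that tag. A sketch with an assumed search endpoint:

import http from 'k6/http';

export const options = {
  thresholds: {
    http_req_duration: ['p(95)<800', 'p(99)<2000'],
    http_req_failed: ['rate<0.05'],
    // Tighter latency budget for the search endpoint only
    'http_req_duration{name:search}': ['p(95)<500'],
  },
};

export default function () {
  // The name tag groups this request under the sub-metric above
  http.get('https://api.example.com/api/search?q=laptop', { tags: { name: 'search' } });
}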

Step 5: Execute and Report

Run the test and parse the output into a structured report (see the sketch after this list) showing:

  • Latency percentiles (p50, p95, p99) per stage
  • Error rate per stage
  • Requests per second achieved
  • Breaking point (where thresholds are first violated)
  • Bottleneck hypothesis based on error patterns
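
To get machine-readable numbers for that report, one option (a sketch using k6's handleSummary hook) is to dump the end-of-test summary as JSON and post-process it:

// Added to the test script; k6 calls this once after the run
export function handleSummary(data) {
  return {
    // Full metrics object: counts, rates, and percentiles per metric
    'summary.json': JSON.stringify(data, null, 2),
  };
}

Note that defining handleSummary replaces k6's default terminal summary, and it yields whole-test aggregates; for per-stage percentiles, stream time-series data with k6 run --out json=metrics.json and bucket it by timestamp.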

Examples

Example 1: Quick Endpoint Benchmark

User prompt:

Benchmark our /api/search endpoint handling 100 concurrent connections

Agent output:

# Quick benchmark with k6
k6 run --vus 100 --duration 60s - <<'EOF'
import http from 'k6/http';
import { check } from 'k6';

export default function () {
  const res = http.get('https://api.example.com/api/search?q=laptop&page=1');
  check(res, { 'status 200': (r) => r.status === 200 });
}
EOF
Results — /api/search @ 100 VUs for 60s
  Requests:    8,423 total (140.4 RPS)
  Latency:     p50=89ms  p95=340ms  p99=890ms
  Errors:      12 (0.14%)
  Data:        24.3 MB received

Verdict: Endpoint handles 140 RPS at 100 VUs. p99 approaching 1s suggests
database query optimization needed for sustained higher load.

Example 2: Multi-Step Flow

User prompt:

Create a load test for our signup → login → create-project flow

Agent generates a k6 script (sketched after this list) with:

  • Step 1: POST /api/auth/signup with randomized email/name
  • Step 2: POST /api/auth/login to get JWT
  • Step 3: POST /api/projects with auth header and realistic project data
  • Custom metrics tracking each step's latency separately
  • Sleep between steps to simulate real user behavior
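
A condensed sketch of such a script. The endpoint paths, field names, and the token field in the login response are assumptions about the target API, not known values:

import http from 'k6/http';
import { check, sleep } from 'k6';
import { Trend } from 'k6/metrics';

// Custom per-step latency metrics
const signupLatency = new Trend('signup_duration');
const loginLatency = new Trend('login_duration');
const projectLatency = new Trend('create_project_duration');

const BASE = 'https://api.example.com'; // placeholder host
const HEADERS = { 'Content-Type': 'application/json' };

export default function () {
  // Unique-enough identity per VU iteration
  const email = `user-${__VU}-${__ITER}@loadtest.example`;

  const signup = http.post(`${BASE}/api/auth/signup`,
    JSON.stringify({ email, name: 'Dana Whitfield', password: 'S3nsible-Passw0rd' }),
    { headers: HEADERS });
  signupLatency.add(signup.timings.duration);
  check(signup, { 'signup ok': (r) => r.status === 201 });
  sleep(1 + Math.random() * 2); // think time

  const login = http.post(`${BASE}/api/auth/login`,
    JSON.stringify({ email, password: 'S3nsible-Passw0rd' }),
    { headers: HEADERS });
  loginLatency.add(login.timings.duration);
  const token = login.json('token'); // response field is an assumption
  check(login, { 'login ok': (r) => r.status === 200 && !!token });
  sleep(1 + Math.random() * 2);

  const project = http.post(`${BASE}/api/projects`,
    JSON.stringify({ name: `Website redesign ${__ITER}`, visibility: 'private' }),
    { headers: Object.assign({ Authorization: `Bearer ${token}` }, HEADERS) });
  projectLatency.add(project.timings.duration);
  check(project, { 'project created': (r) => r.status === 201 });
  sleep(1 + Math.random() * 2);
}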

Guidelines

  • Never load test production without explicit confirmation — always clarify the target environment
  • Start low, ramp gradually — sudden jumps make it hard to identify the exact breaking point
  • Realistic think time — add a 1–3 s sleep between requests (in k6, sleep(1 + Math.random() * 2)) to simulate real users; without it, you're testing throughput, not user concurrency
  • Authentication matters — many bottlenecks only appear with real auth flows (token validation, session lookups)
  • Watch for connection reuse — k6 reuses connections by default, which is realistic for browsers but not for serverless/mobile clients (see the option sketched after this list)
  • Rate limit awareness — if the API has rate limiting, note it in the report; it's not a performance bottleneck, it's intentional
  • Report infrastructure context — always note the server specs, pod count, and database size alongside results
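
On the connection-reuse point: to model clients that open a fresh connection per request (serverless, many mobile stacks), k6 has a built-in option:

export const options = {
  // Force a new TCP connection (and TLS handshake) for every request
  noConnectionReuse: true,
};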