Claude-skill-registry load-test
Performance and load testing for API endpoints, payment pipeline, and concurrent user scenarios
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/load-test" ~/.claude/skills/majiayu000-claude-skill-registry-load-test && rm -rf "$T"
skills/data/load-test/SKILL.md

User Input
$ARGUMENTS
Test type options:
payment-pipeline, api-endpoints, concurrent-users, full-system, all (default: api-endpoints)
Task
Run performance and load tests to validate system behavior under realistic and peak load conditions.
Performance Targets (from IMPLEMENTATION-GUIDE.md)
- API Endpoints: <500ms (p95)
- AI Generation: <20s (p95)
- PDF Generation: <20s (p95)
- Full Pipeline: <90s (p95) (payment → AI → PDF → email)
- Concurrent Users: Support 50+ simultaneous quiz submissions
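The p50/p95/p99 checks used repeatedly in this skill can be factored into a small helper. A minimal sketch — the function name and parameters are illustrative, not part of the skill itself:

```python
import statistics

def latency_summary(latencies_ms, p95_target_ms):
    """Compute p50/p95/p99 and compare p95 against a target.

    `latencies_ms` is a list of request latencies in milliseconds.
    statistics.quantiles with n=100 returns 99 cut points, so index 94
    is the 95th percentile and index 98 the 99th.
    """
    cuts = statistics.quantiles(latencies_ms, n=100)
    return {
        'p50': statistics.median(latencies_ms),
        'p95': cuts[94],
        'p99': cuts[98],
        'pass': cuts[94] < p95_target_ms,
    }
```

The same helper works for any of the endpoint targets above by swapping the `p95_target_ms` argument.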
Steps

1. Install Load Testing Tools (if not installed):

   pip install locust httpx pytest-benchmark

2. Parse Arguments:

   - If $ARGUMENTS is empty or "api-endpoints": Test API latency
   - If $ARGUMENTS is "payment-pipeline": Test the full payment → AI → PDF → email flow
   - If $ARGUMENTS is "concurrent-users": Simulate 50 concurrent quiz submissions
   - If $ARGUMENTS is "full-system": Test all endpoints under load
   - If $ARGUMENTS is "all": Run all load test scenarios
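The argument dispatch described above can be sketched as a small lookup; the function name and the ValueError behavior are illustrative assumptions, not part of the skill:

```python
# Valid scenario names, as listed in the skill's argument options.
SCENARIOS = ['api-endpoints', 'payment-pipeline', 'concurrent-users', 'full-system']

def resolve_test_types(arguments: str) -> list[str]:
    """Map $ARGUMENTS to the scenarios to run (default: api-endpoints)."""
    arg = arguments.strip()
    if not arg:
        return ['api-endpoints']      # default scenario
    if arg == 'all':
        return list(SCENARIOS)        # run every scenario in order
    if arg in SCENARIOS:
        return [arg]
    raise ValueError(f'Unknown test type: {arg}')
```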
3. Run Tests Based on Type:
For API Endpoints:
```bash
cd backend

# Test quiz submission endpoint
python -c "
import httpx
import time
import statistics

url = 'http://localhost:8000/v1/quiz/submit'
latencies = []

print('🔄 Testing POST /quiz/submit (100 requests)...')
for i in range(100):
    start = time.time()
    response = httpx.post(url, json={'test': 'data'})
    latency = (time.time() - start) * 1000  # Convert to ms
    latencies.append(latency)

p50 = statistics.median(latencies)
p95 = statistics.quantiles(latencies, n=100)[94]
p99 = statistics.quantiles(latencies, n=100)[98]

print('📊 Latency Results:')
print(f'  p50: {p50:.2f}ms')
print(f'  p95: {p95:.2f}ms')
print(f'  p99: {p99:.2f}ms')
print('  Target: <500ms (p95)')

if p95 < 500:
    print('✅ PASS: API latency within target')
else:
    print(f'❌ FAIL: API latency {p95:.2f}ms exceeds 500ms target')
"

# Test other endpoints
echo "Testing GET /meal-plans/{id}..."
echo "Testing POST /auth/magic-link/request..."
echo "Testing POST /quiz/verify-email..."
```

For Payment Pipeline:
```bash
cd backend

# Test full pipeline: payment → AI → PDF → email
python -c "
import httpx
import time

print('🔄 Testing Full Payment Pipeline...')
print('Steps: Payment Webhook → AI Generation → PDF Generation → Email Delivery')

start_time = time.time()

# 1. Simulate payment webhook
webhook_data = {
    'payment_id': 'test_payment_123',
    'customer_email': 'test@example.com',
    'amount': 997,
    'currency': 'USD'
}
response = httpx.post(
    'http://localhost:8000/webhooks/paddle',
    json=webhook_data,
    headers={'X-Paddle-Signature': 'test_signature'}
)

# 2. Wait for pipeline completion (poll meal plan status)
max_wait = 120  # 2 minutes max
elapsed = 0
completed = False
total_time = 0.0

while elapsed < max_wait:
    time.sleep(5)
    elapsed += 5
    # Check meal plan status
    status_response = httpx.get(
        'http://localhost:8000/v1/meal-plans/test_payment_123'
    )
    if status_response.status_code == 200:
        data = status_response.json()
        if data.get('pdf_url'):
            completed = True
            total_time = time.time() - start_time
            break

if completed:
    print(f'✅ Pipeline completed in {total_time:.2f}s')
    print('  AI Generation: ~20s')
    print('  PDF Generation: ~20s')
    print('  Email Delivery: ~10s')
    print(f'  Total: {total_time:.2f}s')
    if total_time < 90:
        print('✅ PASS: Pipeline within 90s target')
    else:
        print(f'❌ FAIL: Pipeline {total_time:.2f}s exceeds 90s target')
else:
    print(f'❌ FAIL: Pipeline did not complete within {max_wait}s')
"
```

For Concurrent Users:
```bash
cd backend

# Create locustfile for concurrent user simulation
cat > /tmp/locustfile.py << 'EOF'
import random

from locust import HttpUser, task, between


class QuizUser(HttpUser):
    wait_time = between(1, 3)

    @task(3)
    def submit_quiz(self):
        """Simulate quiz submission"""
        quiz_data = {
            'gender': random.choice(['male', 'female']),
            'activity_level': random.choice(['sedentary', 'lightly_active', 'moderately_active']),
            'goal': random.choice(['weight_loss', 'muscle_gain', 'maintenance']),
            'age': random.randint(18, 65),
            'weight_kg': random.randint(50, 120),
            'height_cm': random.randint(150, 200)
        }
        self.client.post('/v1/quiz/submit', json=quiz_data)

    @task(1)
    def verify_email(self):
        """Simulate email verification"""
        self.client.post('/v1/quiz/verify-email', json={
            'email': f'test{random.randint(1, 1000)}@example.com',
            'code': '123456'
        })

    @task(2)
    def get_meal_plan(self):
        """Simulate meal plan retrieval"""
        payment_id = f'pay_{random.randint(1, 1000)}'
        self.client.get(f'/v1/meal-plans/{payment_id}')
EOF

# Run locust in headless mode
echo "🔄 Running concurrent user test (50 users, spawn rate 5/s, 5min run)..."
locust -f /tmp/locustfile.py \
    --host http://localhost:8000 \
    --users 50 \
    --spawn-rate 5 \
    --run-time 5m \
    --headless \
    --html /tmp/locust_report.html

echo "📊 Results saved to /tmp/locust_report.html"
echo ""
echo "Expected metrics:"
echo "  - 95% success rate"
echo "  - <500ms response time (p95)"
echo "  - <5% error rate"
```
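Besides the HTML report, Locust can emit machine-readable stats via its `--csv <prefix>` flag, which makes the error-rate check scriptable. A hedged sketch — the column names (`Name`, `Request Count`, `Failure Count`) match recent Locust versions but may differ in older releases, so verify against your local output:

```python
import csv

def failure_rate(stats_csv_path: str) -> float:
    """Read a Locust --csv stats file and return the aggregated
    failure rate (failures / requests) as a fraction."""
    with open(stats_csv_path, newline='') as f:
        for row in csv.DictReader(f):
            if row['Name'] == 'Aggregated':
                requests = int(row['Request Count'])
                failures = int(row['Failure Count'])
                return failures / requests if requests else 0.0
    raise ValueError('No Aggregated row found in stats file')
```

A `failure_rate(...) < 0.05` check then maps directly onto the "<5% error rate" target above.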
**For Full System**:

```bash
# Combine all tests above
echo "Running comprehensive load test suite..."

# Test sequence:
# 1. API Endpoints (5 min)
# 2. Payment Pipeline (3 iterations)
# 3. Concurrent Users (50 users, 5 min)
# 4. Database query performance
# 5. Redis performance
```
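The status-polling loop used in the payment-pipeline test can be generalized into a reusable helper shared across scenarios. A sketch under assumed names — `poll_until` and its signature are illustrative, and the fixed interval mirrors the 5-second sleep in the inline script:

```python
import time

def poll_until(check, timeout_s=120, interval_s=5):
    """Poll `check` (a zero-argument callable returning a truthy value
    once the pipeline has finished) until it succeeds or `timeout_s`
    elapses. Returns (completed, elapsed_seconds)."""
    start = time.time()
    while time.time() - start < timeout_s:
        if check():
            return True, time.time() - start
        time.sleep(interval_s)
    return False, time.time() - start
```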
4. Analyze Results (the numbers below are an illustrative example of the summary format, not live measurements):

```bash
python -c "
print('')
print('📊 Load Test Summary')
print('━' * 60)
print('')
print('API Endpoints:')
print('  POST /quiz/submit:     p95=245ms ✅ (<500ms target)')
print('  GET /meal-plans/{id}:  p95=178ms ✅ (<300ms target)')
print('  POST /webhooks/paddle: p95=1.2s ✅ (<2s target)')
print('')
print('Pipeline Performance:')
print('  Full Pipeline (avg):   78s ✅ (<90s target)')
print('  AI Generation (p95):   18s ✅ (<20s target)')
print('  PDF Generation (p95):  16s ✅ (<20s target)')
print('')
print('Concurrent Users (50 users):')
print('  Success Rate: 98.5% ✅ (>95% target)')
print('  Error Rate: 1.5% ✅ (<5% target)')
print('  Avg Response Time: 312ms ✅')
print('  p95 Response Time: 487ms ✅ (<500ms target)')
print('')
print('Bottlenecks Identified:')
print('  ⚠️ AI Generation: Occasional spikes >25s (optimize prompts)')
print('  ⚠️ Database: Connection pool saturation at >40 concurrent users')
print('')
print('Recommendations:')
print('  1. Increase DB connection pool size (10 → 20)')
print('  2. Add caching layer for meal plan retrieval')
print('  3. Optimize AI prompt length (reduce tokens)')
print('  4. Add Redis-based request queuing for >50 concurrent users')
print('')
"
```
5. Generate Report:

```bash
# Save results to file; the heredoc delimiter is unquoted so $(date) expands
cat > /tmp/load_test_report.txt << EOF
Load Test Report
Generated: $(date)

[Results from step 4]

Next Steps:
- Review bottlenecks with backend-engineer
- Implement recommendations
- Re-test to verify improvements
EOF

echo "✅ Load test complete. Report saved to /tmp/load_test_report.txt"
```
Example Usage
```bash
# Test API endpoint latency
/load-test api-endpoints

# Test full payment pipeline
/load-test payment-pipeline

# Simulate 50 concurrent users
/load-test concurrent-users

# Run all load tests
/load-test all
```
Exit Criteria
- All specified load tests executed
- Performance metrics collected and analyzed
- Results compared against targets
- Bottlenecks identified
- Recommendations generated
- Report saved for review
Performance Targets Reference
From IMPLEMENTATION-GUIDE.md Phase 10:
API Endpoints (p95):
- POST /quiz/submit: <500ms
- POST /quiz/verify-email: <1s
- POST /webhooks/paddle: <2s
- GET /meal-plans/{id}: <300ms

Pipeline Components (p95):
- AI generation: <20s
- PDF generation: <20s
- Email delivery: <10s
- Total pipeline: <90s

Concurrency:
- Support 50+ concurrent quiz submissions
- 95%+ success rate
- <5% error rate
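The endpoint targets above can be encoded as data so automated runs compare measurements against them instead of hard-coding thresholds in each script. A sketch — the dict layout and `check_targets` helper are illustrative, though the numbers come from the reference list:

```python
# Per-endpoint p95 targets in milliseconds, from the reference above.
ENDPOINT_TARGETS_MS = {
    'POST /quiz/submit': 500,
    'POST /quiz/verify-email': 1000,
    'POST /webhooks/paddle': 2000,
    'GET /meal-plans/{id}': 300,
}

def check_targets(measured_p95_ms: dict) -> list[str]:
    """Return the endpoints whose measured p95 meets or exceeds its target."""
    return [
        endpoint
        for endpoint, target in ENDPOINT_TARGETS_MS.items()
        if measured_p95_ms.get(endpoint, 0) >= target
    ]
```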