Claude-skill-registry load-test
Performance and load testing for API endpoints, payment pipeline, and concurrent user scenarios
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/load-test" ~/.claude/skills/majiayu000-claude-skill-registry-load-test && rm -rf "$T"
skills/data/load-test/SKILL.md

User Input
$ARGUMENTS
Test type options:
payment-pipeline, api-endpoints, concurrent-users, full-system, all (default: api-endpoints)
Task
Run performance and load tests to validate system behavior under realistic and peak load conditions.
Performance Targets (from IMPLEMENTATION-GUIDE.md)
- API Endpoints: <500ms (p95)
- AI Generation: <20s (p95)
- PDF Generation: <20s (p95)
- Full Pipeline: <90s (p95) (payment → AI → PDF → email)
- Concurrent Users: Support 50+ simultaneous quiz submissions
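The p50/p95/p99 checks used repeatedly in this skill can be factored into a small helper. A minimal sketch — the function name and parameters are illustrative, not part of the skill itself:

```python
import statistics

def latency_summary(latencies_ms, p95_target_ms):
    """Compute p50/p95/p99 and compare p95 against a target.

    `latencies_ms` is a list of request latencies in milliseconds.
    statistics.quantiles with n=100 returns 99 cut points, so index 94
    is the 95th percentile and index 98 the 99th.
    """
    cuts = statistics.quantiles(latencies_ms, n=100)
    return {
        'p50': statistics.median(latencies_ms),
        'p95': cuts[94],
        'p99': cuts[98],
        'pass': cuts[94] < p95_target_ms,
    }
```

The same helper works for any of the endpoint targets above by swapping the `p95_target_ms` argument.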
Steps

1. Install Load Testing Tools (if not installed):

   pip install locust httpx pytest-benchmark

2. Parse Arguments:

   - If $ARGUMENTS is empty or "api-endpoints": Test API latency
   - If $ARGUMENTS is "payment-pipeline": Test the full payment → AI → PDF → email flow
   - If $ARGUMENTS is "concurrent-users": Simulate 50 concurrent quiz submissions
   - If $ARGUMENTS is "full-system": Test all endpoints under load
   - If $ARGUMENTS is "all": Run all load test scenarios
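The argument dispatch described above can be sketched as a small lookup; the function name and the ValueError behavior are illustrative assumptions, not part of the skill:

```python
# Valid scenario names, as listed in the skill's argument options.
SCENARIOS = ['api-endpoints', 'payment-pipeline', 'concurrent-users', 'full-system']

def resolve_test_types(arguments: str) -> list[str]:
    """Map $ARGUMENTS to the scenarios to run (default: api-endpoints)."""
    arg = arguments.strip()
    if not arg:
        return ['api-endpoints']      # default scenario
    if arg == 'all':
        return list(SCENARIOS)        # run every scenario in order
    if arg in SCENARIOS:
        return [arg]
    raise ValueError(f'Unknown test type: {arg}')
```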
3. Run Tests Based on Type:
For API Endpoints:
```bash
cd backend

# Test quiz submission endpoint
python -c "
import httpx
import time
import statistics

url = 'http://localhost:8000/v1/quiz/submit'
latencies = []

print('🔄 Testing POST /quiz/submit (100 requests)...')
for i in range(100):
    start = time.time()
    response = httpx.post(url, json={'test': 'data'})
    latency = (time.time() - start) * 1000  # Convert to ms
    latencies.append(latency)

p50 = statistics.median(latencies)
p95 = statistics.quantiles(latencies, n=100)[94]
p99 = statistics.quantiles(latencies, n=100)[98]

print('📊 Latency Results:')
print(f'  p50: {p50:.2f}ms')
print(f'  p95: {p95:.2f}ms')
print(f'  p99: {p99:.2f}ms')
print('  Target: <500ms (p95)')

if p95 < 500:
    print('✅ PASS: API latency within target')
else:
    print(f'❌ FAIL: API latency {p95:.2f}ms exceeds 500ms target')
"

# Test other endpoints
echo "Testing GET /meal-plans/{id}..."
echo "Testing POST /auth/magic-link/request..."
echo "Testing POST /quiz/verify-email..."
```

For Payment Pipeline:
```bash
cd backend

# Test full pipeline: payment → AI → PDF → email
python -c "
import httpx
import time

print('🔄 Testing Full Payment Pipeline...')
print('Steps: Payment Webhook → AI Generation → PDF Generation → Email Delivery')

start_time = time.time()

# 1. Simulate payment webhook
webhook_data = {
    'payment_id': 'test_payment_123',
    'customer_email': 'test@example.com',
    'amount': 997,
    'currency': 'USD'
}
response = httpx.post(
    'http://localhost:8000/webhooks/paddle',
    json=webhook_data,
    headers={'X-Paddle-Signature': 'test_signature'}
)

# 2. Wait for pipeline completion (poll meal plan status)
max_wait = 120  # 2 minutes max
elapsed = 0
completed = False
total_time = 0.0

while elapsed < max_wait:
    time.sleep(5)
    elapsed += 5
    # Check meal plan status
    status_response = httpx.get(
        'http://localhost:8000/v1/meal-plans/test_payment_123'
    )
    if status_response.status_code == 200:
        data = status_response.json()
        if data.get('pdf_url'):
            completed = True
            total_time = time.time() - start_time
            break

if completed:
    print(f'✅ Pipeline completed in {total_time:.2f}s')
    print('  AI Generation: ~20s')
    print('  PDF Generation: ~20s')
    print('  Email Delivery: ~10s')
    print(f'  Total: {total_time:.2f}s')
    if total_time < 90:
        print('✅ PASS: Pipeline within 90s target')
    else:
        print(f'❌ FAIL: Pipeline {total_time:.2f}s exceeds 90s target')
else:
    print(f'❌ FAIL: Pipeline did not complete within {max_wait}s')
"
```

For Concurrent Users:
```bash
cd backend

# Create locustfile for concurrent user simulation
cat > /tmp/locustfile.py << 'EOF'
import random

from locust import HttpUser, task, between


class QuizUser(HttpUser):
    wait_time = between(1, 3)

    @task(3)
    def submit_quiz(self):
        """Simulate quiz submission"""
        quiz_data = {
            'gender': random.choice(['male', 'female']),
            'activity_level': random.choice(['sedentary', 'lightly_active', 'moderately_active']),
            'goal': random.choice(['weight_loss', 'muscle_gain', 'maintenance']),
            'age': random.randint(18, 65),
            'weight_kg': random.randint(50, 120),
            'height_cm': random.randint(150, 200)
        }
        self.client.post('/v1/quiz/submit', json=quiz_data)

    @task(1)
    def verify_email(self):
        """Simulate email verification"""
        self.client.post('/v1/quiz/verify-email', json={
            'email': f'test{random.randint(1, 1000)}@example.com',
            'code': '123456'
        })

    @task(2)
    def get_meal_plan(self):
        """Simulate meal plan retrieval"""
        payment_id = f'pay_{random.randint(1, 1000)}'
        self.client.get(f'/v1/meal-plans/{payment_id}')
EOF

# Run locust in headless mode
echo "🔄 Running concurrent user test (50 users, spawn rate 5/s, 5min run)..."
locust -f /tmp/locustfile.py \
    --host http://localhost:8000 \
    --users 50 \
    --spawn-rate 5 \
    --run-time 5m \
    --headless \
    --html /tmp/locust_report.html

echo "📊 Results saved to /tmp/locust_report.html"
echo ""
echo "Expected metrics:"
echo "  - 95% success rate"
echo "  - <500ms response time (p95)"
echo "  - <5% error rate"
```
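Besides the HTML report, Locust can emit machine-readable stats via its `--csv <prefix>` flag, which makes the error-rate check scriptable. A hedged sketch — the column names (`Name`, `Request Count`, `Failure Count`) match recent Locust versions but may differ in older releases, so verify against your local output:

```python
import csv

def failure_rate(stats_csv_path: str) -> float:
    """Read a Locust --csv stats file and return the aggregated
    failure rate (failures / requests) as a fraction."""
    with open(stats_csv_path, newline='') as f:
        for row in csv.DictReader(f):
            if row['Name'] == 'Aggregated':
                requests = int(row['Request Count'])
                failures = int(row['Failure Count'])
                return failures / requests if requests else 0.0
    raise ValueError('No Aggregated row found in stats file')
```

A `failure_rate(...) < 0.05` check then maps directly onto the "<5% error rate" target above.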
**For Full System**:

```bash
# Combine all tests above
echo "Running comprehensive load test suite..."

# Test sequence:
# 1. API Endpoints (5 min)
# 2. Payment Pipeline (3 iterations)
# 3. Concurrent Users (50 users, 5 min)
# 4. Database query performance
# 5. Redis performance
```
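The status-polling loop used in the payment-pipeline test can be generalized into a reusable helper shared across scenarios. A sketch under assumed names — `poll_until` and its signature are illustrative, and the fixed interval mirrors the 5-second sleep in the inline script:

```python
import time

def poll_until(check, timeout_s=120, interval_s=5):
    """Poll `check` (a zero-argument callable returning a truthy value
    once the pipeline has finished) until it succeeds or `timeout_s`
    elapses. Returns (completed, elapsed_seconds)."""
    start = time.time()
    while time.time() - start < timeout_s:
        if check():
            return True, time.time() - start
        time.sleep(interval_s)
    return False, time.time() - start
```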
4. Analyze Results (the numbers below are an illustrative example of the summary format, not live measurements):

```bash
python -c "
print('')
print('📊 Load Test Summary')
print('━' * 60)
print('')
print('API Endpoints:')
print('  POST /quiz/submit:     p95=245ms ✅ (<500ms target)')
print('  GET /meal-plans/{id}:  p95=178ms ✅ (<300ms target)')
print('  POST /webhooks/paddle: p95=1.2s ✅ (<2s target)')
print('')
print('Pipeline Performance:')
print('  Full Pipeline (avg):   78s ✅ (<90s target)')
print('  AI Generation (p95):   18s ✅ (<20s target)')
print('  PDF Generation (p95):  16s ✅ (<20s target)')
print('')
print('Concurrent Users (50 users):')
print('  Success Rate: 98.5% ✅ (>95% target)')
print('  Error Rate: 1.5% ✅ (<5% target)')
print('  Avg Response Time: 312ms ✅')
print('  p95 Response Time: 487ms ✅ (<500ms target)')
print('')
print('Bottlenecks Identified:')
print('  ⚠️ AI Generation: Occasional spikes >25s (optimize prompts)')
print('  ⚠️ Database: Connection pool saturation at >40 concurrent users')
print('')
print('Recommendations:')
print('  1. Increase DB connection pool size (10 → 20)')
print('  2. Add caching layer for meal plan retrieval')
print('  3. Optimize AI prompt length (reduce tokens)')
print('  4. Add Redis-based request queuing for >50 concurrent users')
print('')
"
```
5. Generate Report:

```bash
# Save results to file; the heredoc delimiter is unquoted so $(date) expands
cat > /tmp/load_test_report.txt << EOF
Load Test Report
Generated: $(date)

[Results from step 4]

Next Steps:
- Review bottlenecks with backend-engineer
- Implement recommendations
- Re-test to verify improvements
EOF

echo "✅ Load test complete. Report saved to /tmp/load_test_report.txt"
```
Example Usage
```bash
# Test API endpoint latency
/load-test api-endpoints

# Test full payment pipeline
/load-test payment-pipeline

# Simulate 50 concurrent users
/load-test concurrent-users

# Run all load tests
/load-test all
```
Exit Criteria
- All specified load tests executed
- Performance metrics collected and analyzed
- Results compared against targets
- Bottlenecks identified
- Recommendations generated
- Report saved for review
Performance Targets Reference
From IMPLEMENTATION-GUIDE.md Phase 10:
API Endpoints (p95):
- POST /quiz/submit: <500ms
- POST /quiz/verify-email: <1s
- POST /webhooks/paddle: <2s
- GET /meal-plans/{id}: <300ms

Pipeline Components (p95):
- AI generation: <20s
- PDF generation: <20s
- Email delivery: <10s
- Total pipeline: <90s

Concurrency:
- Support 50+ concurrent quiz submissions
- 95%+ success rate
- <5% error rate
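The endpoint targets above can be encoded as data so automated runs compare measurements against them instead of hard-coding thresholds in each script. A sketch — the dict layout and `check_targets` helper are illustrative, though the numbers come from the reference list:

```python
# Per-endpoint p95 targets in milliseconds, from the reference above.
ENDPOINT_TARGETS_MS = {
    'POST /quiz/submit': 500,
    'POST /quiz/verify-email': 1000,
    'POST /webhooks/paddle': 2000,
    'GET /meal-plans/{id}': 300,
}

def check_targets(measured_p95_ms: dict) -> list[str]:
    """Return the endpoints whose measured p95 meets or exceeds its target."""
    return [
        endpoint
        for endpoint, target in ENDPOINT_TARGETS_MS.items()
        if measured_p95_ms.get(endpoint, 0) >= target
    ]
```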