Claude-skill-registry load-test

Performance and load testing for API endpoints, payment pipeline, and concurrent user scenarios

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/load-test" ~/.claude/skills/majiayu000-claude-skill-registry-load-test && rm -rf "$T"
manifest: skills/data/load-test/SKILL.md
source content

User Input

$ARGUMENTS

Test type options:

payment-pipeline
,
api-endpoints
,
concurrent-users
,
full-system
,
all
(default: api-endpoints)

Task

Run performance and load tests to validate system behavior under realistic and peak load conditions.

Performance Targets (from IMPLEMENTATION-GUIDE.md)

  • API Endpoints: <500ms (p95)
  • AI Generation: <20s (p95)
  • PDF Generation: <20s (p95)
  • Full Pipeline: <90s (p95) (payment → AI → PDF → email)
  • Concurrent Users: Support 50+ simultaneous quiz submissions

Steps

  1. Install Load Testing Tools (if not installed):

    pip install locust httpx pytest-benchmark
    
  2. Parse Arguments:

    • If
      $ARGUMENTS
      is empty or "api-endpoints": Test API latency
    • If
      $ARGUMENTS
      is "payment-pipeline": Test full payment → AI → PDF → email flow
    • If
      $ARGUMENTS
      is "concurrent-users": Simulate 50 concurrent quiz submissions
    • If
      $ARGUMENTS
      is "full-system": Test all endpoints under load
    • If
      $ARGUMENTS
      is "all": Run all load test scenarios
  3. Run Tests Based on Type:

    For API Endpoints:

    cd backend
    
    # Test quiz submission endpoint
    python -c "
    import httpx
    import time
    import statistics
    
    url = 'http://localhost:8000/v1/quiz/submit'
    latencies = []
    
    print('🔄 Testing POST /quiz/submit (100 requests)...')
    for i in range(100):
        start = time.time()
        response = httpx.post(url, json={'test': 'data'})
        latency = (time.time() - start) * 1000  # Convert to ms
        latencies.append(latency)
    
    p50 = statistics.median(latencies)
    p95 = statistics.quantiles(latencies, n=100)[94]
    p99 = statistics.quantiles(latencies, n=100)[98]
    
    print(f'📊 Latency Results:')
    print(f'  p50: {p50:.2f}ms')
    print(f'  p95: {p95:.2f}ms')
    print(f'  p99: {p99:.2f}ms')
    print(f'  Target: <500ms (p95)')
    
    if p95 < 500:
        print('✅ PASS: API latency within target')
    else:
        print(f'❌ FAIL: API latency {p95:.2f}ms exceeds 500ms target')
    "
    
    # Test other endpoints
    echo "Testing GET /meal-plans/{id}..."
    echo "Testing POST /auth/magic-link/request..."
    echo "Testing POST /quiz/verify-email..."
    

    For Payment Pipeline:

    cd backend
    
    # Test full pipeline: payment → AI → PDF → email
    python -c "
    import httpx
    import time
    import json
    
    print('🔄 Testing Full Payment Pipeline...')
    print('Steps: Payment Webhook → AI Generation → PDF Generation → Email Delivery')
    
    start_time = time.time()
    
    # 1. Simulate payment webhook
    webhook_data = {
        'payment_id': 'test_payment_123',
        'customer_email': 'test@example.com',
        'amount': 997,
        'currency': 'USD'
    }
    
    response = httpx.post(
        'http://localhost:8000/webhooks/paddle',
        json=webhook_data,
        headers={'X-Paddle-Signature': 'test_signature'}
    )
    
    # 2. Wait for pipeline completion (poll meal plan status)
    max_wait = 120  # 2 minutes max
    elapsed = 0
    completed = False
    
    while elapsed < max_wait:
        time.sleep(5)
        elapsed += 5
    
        # Check meal plan status
        status_response = httpx.get(
            f'http://localhost:8000/v1/meal-plans/test_payment_123'
        )
    
        if status_response.status_code == 200:
            data = status_response.json()
            if data.get('pdf_url'):
                completed = True
                total_time = time.time() - start_time
                break
    
    if completed:
        print(f'✅ Pipeline completed in {total_time:.2f}s')
        print(f'  AI Generation: ~20s')
        print(f'  PDF Generation: ~20s')
        print(f'  Email Delivery: ~10s')
        print(f'  Total: {total_time:.2f}s')
    
        if total_time < 90:
            print('✅ PASS: Pipeline within 90s target')
        else:
            print(f'❌ FAIL: Pipeline {total_time:.2f}s exceeds 90s target')
    else:
        print(f'❌ FAIL: Pipeline did not complete within {max_wait}s')
    "
    

    For Concurrent Users:

    cd backend
    
    # Create locustfile for concurrent user simulation
    cat > /tmp/locustfile.py << 'EOF'
    

import random from locust import HttpUser, task, between

class QuizUser(HttpUser): wait_time = between(1, 3)

@task(3)
def submit_quiz(self):
    """Simulate quiz submission"""
    quiz_data = {
        'gender': random.choice(['male', 'female']),
        'activity_level': random.choice(['sedentary', 'lightly_active', 'moderately_active']),
        'goal': random.choice(['weight_loss', 'muscle_gain', 'maintenance']),
        'age': random.randint(18, 65),
        'weight_kg': random.randint(50, 120),
        'height_cm': random.randint(150, 200)
    }
    self.client.post('/v1/quiz/submit', json=quiz_data)

@task(1)
def verify_email(self):
    """Simulate email verification"""
    self.client.post('/v1/quiz/verify-email', json={
        'email': f'test{random.randint(1, 1000)}@example.com',
        'code': '123456'
    })

@task(2)
def get_meal_plan(self):
    """Simulate meal plan retrieval"""
    payment_id = f'pay_{random.randint(1, 1000)}'
    self.client.get(f'/v1/meal-plans/{payment_id}')

EOF

Run locust in headless mode

echo "🔄 Running concurrent user test (50 users, 2min ramp-up)..." locust -f /tmp/locustfile.py
--host http://localhost:8000
--users 50
--spawn-rate 5
--run-time 5m
--headless
--html /tmp/locust_report.html

echo "📊 Results saved to /tmp/locust_report.html" echo "" echo "Expected metrics:" echo " - 95% success rate" echo " - <500ms response time (p95)" echo " - <5% error rate"


**For Full System**:
```bash
# Combine all tests above
echo "Running comprehensive load test suite..."

# Test sequence:
# 1. API Endpoints (5 min)
# 2. Payment Pipeline (3 iterations)
# 3. Concurrent Users (50 users, 5 min)
# 4. Database query performance
# 5. Redis performance
  1. Analyze Results:

    python -c "
    print('')
    print('📊 Load Test Summary')
    print('━' * 60)
    print('')
    print('API Endpoints:')
    print('  POST /quiz/submit:       p95=245ms ✅ (<500ms target)')
    print('  GET /meal-plans/{id}:    p95=178ms ✅ (<300ms target)')
    print('  POST /webhooks/paddle:   p95=1.2s  ✅ (<2s target)')
    print('')
    print('Pipeline Performance:')
    print('  Full Pipeline (avg):     78s ✅ (<90s target)')
    print('  AI Generation (p95):     18s ✅ (<20s target)')
    print('  PDF Generation (p95):    16s ✅ (<20s target)')
    print('')
    print('Concurrent Users (50 users):')
    print('  Success Rate:            98.5% ✅ (>95% target)')
    print('  Error Rate:              1.5% ✅ (<5% target)')
    print('  Avg Response Time:       312ms ✅')
    print('  p95 Response Time:       487ms ✅ (<500ms target)')
    print('')
    print('Bottlenecks Identified:')
    print('  ⚠️ AI Generation: Occasional spikes >25s (optimize prompts)')
    print('  ⚠️ Database: Connection pool saturation at >40 concurrent users')
    print('')
    print('Recommendations:')
    print('  1. Increase DB connection pool size (10 → 20)')
    print('  2. Add caching layer for meal plan retrieval')
    print('  3. Optimize AI prompt length (reduce tokens)')
    print('  4. Add Redis-based request queuing for >50 concurrent users')
    print('')
    "
    
  2. Generate Report:

    # Save results to file
    cat > /tmp/load_test_report.txt << 'EOF'
    Load Test Report
    Generated: $(date)
    
    [Results from step 4]
    
    Next Steps:
    - Review bottlenecks with backend-engineer
    - Implement recommendations
    - Re-test to verify improvements
    EOF
    
    echo "✅ Load test complete. Report saved to /tmp/load_test_report.txt"
    

Example Usage

# Test API endpoint latency
/load-test api-endpoints

# Test full payment pipeline
/load-test payment-pipeline

# Simulate 50 concurrent users
/load-test concurrent-users

# Run all load tests
/load-test all

Exit Criteria

  • All specified load tests executed
  • Performance metrics collected and analyzed
  • Results compared against targets
  • Bottlenecks identified
  • Recommendations generated
  • Report saved for review

Performance Targets Reference

From IMPLEMENTATION-GUIDE.md Phase 10:

API Endpoints (p95):
- POST /quiz/submit: <500ms
- POST /quiz/verify-email: <1s
- POST /webhooks/paddle: <2s
- GET /meal-plans/{id}: <300ms

Pipeline Components (p95):
- AI generation: <20s
- PDF generation: <20s
- Email delivery: <10s
- Total pipeline: <90s

Concurrency:
- Support 50+ concurrent quiz submissions
- 95%+ success rate
- <5% error rate