Claude-skill-registry Load Balancing Strategies
Comprehensive guide to load balancing algorithms, health checks, and high-availability patterns.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/load-balancing" ~/.claude/skills/majiayu000-claude-skill-registry-load-balancing-strategies && rm -rf "$T"
skills/data/load-balancing/SKILL.md
Load Balancing Strategies
Overview
Load balancing distributes traffic across multiple targets to improve availability, performance, and resilience. This guide covers algorithms, health checks, and operational best practices.
Table of Contents
- Fundamentals
- Layer 4 vs Layer 7
- Algorithms
- Health Checks
- Session Persistence
- TLS Termination
- Connection Draining
- Global Load Balancing
- Cloud Load Balancers
- Software Load Balancers
- Service Mesh Load Balancing
- Autoscaling Integration
- Monitoring
- Troubleshooting
Fundamentals
Core goals:
- Spread traffic evenly
- Avoid single points of failure
- Improve latency by routing to healthy targets
Layer 4 vs Layer 7
- Layer 4 (TCP/UDP): Fast and protocol-agnostic; routes on IP address and port with no HTTP awareness.
- Layer 7 (HTTP/HTTPS): Routes by path, host, or headers; supports TLS termination.
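To make the Layer 7 side concrete, here is a minimal sketch of host and path routing using longest-prefix matching. The hostnames, pool names, and the `route_l7` helper are hypothetical and chosen only for illustration; a Layer 4 balancer could not make this decision because it never parses the HTTP request.

```python
# Hypothetical routing table: (host, path prefix, backend pool).
ROUTES = [
    ("api.example.com", "/", "api-pool"),
    ("www.example.com", "/static/", "static-pool"),
    ("www.example.com", "/", "web-pool"),
]


def route_l7(host, path, routes=ROUTES, default="default-pool"):
    """Pick a backend pool by exact host match and longest matching path prefix."""
    best = None  # (prefix, pool) of the most specific rule seen so far
    for rule_host, prefix, pool in routes:
        if host == rule_host and path.startswith(prefix):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, pool)
    return best[1] if best else default
```

Longest-prefix matching keeps the result independent of rule order, which is one common design choice; some proxies use first-match ordering instead.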
Algorithms
Common strategies:
- Round Robin: Simple rotation.
- Weighted Round Robin: Bias to stronger nodes.
- Least Connections: Route to least busy.
- Weighted Least Connections: Combine weight + load.
- IP Hash: Sticky routing by client IP.
- Random: Low overhead.
- Least Response Time: Prefer lowest latency target.
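Several of the strategies above fit in a few lines each. The sketch below is illustrative only: the function names are hypothetical, the weighted variant shown is the "smooth" weighted round robin (one well-known way to interleave picks rather than sending bursts to the heaviest node), and the IP hash uses a plain modulo, which remaps most clients whenever the target list changes.

```python
import hashlib
import itertools


def round_robin(targets):
    """Yield targets in a fixed rotation, forever."""
    return itertools.cycle(targets)


def smooth_weighted_round_robin(weights):
    """Yield targets in proportion to their weights, interleaved rather than bursty.

    `weights` maps target name -> positive integer weight.
    """
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    while True:
        # Every target earns credit equal to its weight on each round.
        for name, weight in weights.items():
            current[name] += weight
        # The target with the most credit is picked and pays back the total.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        yield chosen


def least_connections(active):
    """Pick the target with the fewest in-flight connections.

    `active` maps target name -> current connection count.
    """
    return min(active, key=active.get)


def ip_hash(client_ip, targets):
    """Map a client IP to the same target on every call for a fixed target list."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return targets[int.from_bytes(digest[:8], "big") % len(targets)]
```

For example, with weights `{"a": 5, "b": 1, "c": 1}` the smooth variant sends five of every seven requests to `a` while spacing out the picks for `b` and `c`.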
Health Checks
Types:
- Active: Probes at intervals.
- Passive: Detect failures from live traffic.
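Use both: active probes catch failures on idle targets and confirm recovery, while passive checks react to errors seen in live traffic between probes.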
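A minimal sketch of how probe results and live-traffic outcomes might feed one health state follows. The `fall` and `rise` thresholds, their default values, and the `http_probe` helper are assumptions for illustration; requiring several consecutive results before flipping state is a common way to avoid flapping on a single slow response.

```python
import urllib.request
from dataclasses import dataclass


def http_probe(url, timeout_s=2.0):
    """Active check: return True if the URL answers with a 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 300
    except OSError:  # covers connection errors, timeouts, and HTTP error statuses
        return False


@dataclass
class TargetHealth:
    """Tracks one target's state from probe results and live request outcomes."""
    fall: int = 3  # consecutive failures before marking the target down (assumed)
    rise: int = 2  # consecutive successes before marking it up again (assumed)
    healthy: bool = True
    _failures: int = 0
    _successes: int = 0

    def record(self, ok: bool) -> bool:
        """Record one active probe or passive observation; return the current state."""
        if ok:
            self._successes += 1
            self._failures = 0
            if not self.healthy and self._successes >= self.rise:
                self.healthy = True
        else:
            self._failures += 1
            self._successes = 0
            if self.healthy and self._failures >= self.fall:
                self.healthy = False
        return self.healthy
```

Calling `record(http_probe(url))` on a timer gives the active path, and calling `record(ok)` after each proxied request gives the passive path.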
Session Persistence
Sticky sessions route a client to the same target:
- Cookie-based affinity
- IP hash
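Use sticky sessions only when session state cannot be moved to a shared store such as a cache or database, because pinned clients make traffic harder to balance.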
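Cookie-based affinity can be sketched as follows. The cookie name `lb_target` and the `route_with_affinity` helper are hypothetical; a real deployment would normally use an opaque or signed cookie value rather than the raw target name.

```python
import random

COOKIE_NAME = "lb_target"  # hypothetical cookie name


def route_with_affinity(cookies, healthy_targets, rng=random):
    """Return (target, cookie_to_set).

    `cookies` is a dict of request cookies and `healthy_targets` is a list.
    cookie_to_set is None when the client's existing pin is still valid.
    """
    pinned = cookies.get(COOKIE_NAME)
    if pinned in healthy_targets:
        return pinned, None
    # No pin, or the pinned target is gone: choose a new one and re-pin.
    target = rng.choice(healthy_targets)
    return target, (COOKIE_NAME, target)
```

Falling back to a fresh choice when the pinned target is unhealthy means the client loses its session on that target, which is one reason externalized state is preferable.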
TLS Termination
Terminate TLS at the load balancer for:
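- Centralized certificate management
- Less CPU load on backends, since TLS handshakes are offloaded to the load balancer
- Easier observability, since the load balancer sees decrypted requests
Optionally re-encrypt traffic to the backends for end-to-end encryption.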
Connection Draining
Allow in-flight requests to finish during scale-down or deploy:
- Stop sending new connections to the target
- Set a drain timeout at least as long as the slowest expected request
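The drain sequence can be sketched as below. `DrainableTarget` is a hypothetical single-threaded model; a real implementation would need locking around the counters and would usually close lingering connections once the timeout expires.

```python
import time


class DrainableTarget:
    """Models a backend that can refuse new requests while old ones finish."""

    def __init__(self):
        self.accepting = True
        self.in_flight = 0

    def start_request(self):
        """Admit a request if the target is still accepting; return whether it was admitted."""
        if not self.accepting:
            return False
        self.in_flight += 1
        return True

    def finish_request(self):
        self.in_flight -= 1

    def drain(self, timeout_s, poll_s=0.05):
        """Stop new requests, then wait up to timeout_s for in-flight ones.

        Returns True if the target drained cleanly, False if the timeout hit first.
        """
        self.accepting = False
        deadline = time.monotonic() + timeout_s
        while self.in_flight > 0 and time.monotonic() < deadline:
            time.sleep(poll_s)
        return self.in_flight == 0
```

A `False` return is the signal that the timeout was too short for the workload and some requests would be cut off.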
Global Load Balancing
GSLB routes across regions:
- Geo-based routing
- Latency-based routing
- Failover routing
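As a rough illustration, latency-based routing with a failover list might combine like this. The region names in the usage note, the `pick_region` helper, and its inputs are hypothetical; real GSLB is typically implemented through DNS or anycast rather than a function call.

```python
def pick_region(latency_ms, healthy, failover_order):
    """Choose a region for a client.

    latency_ms:     region -> measured latency from this client (milliseconds)
    healthy:        set of regions currently passing health checks
    failover_order: regions to try, in order, when no measured region is healthy
    """
    candidates = {r: ms for r, ms in latency_ms.items() if r in healthy}
    if candidates:
        return min(candidates, key=candidates.get)
    for region in failover_order:
        if region in healthy:
            return region
    return None  # nothing healthy anywhere
```

For instance, `pick_region({"eu": 20, "us": 90}, {"us"}, ["eu", "us"])` skips the closer but unhealthy `eu` region and returns `"us"`.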
Cloud Load Balancers
- AWS: ALB (L7), NLB (L4)
- GCP: HTTP(S) Load Balancer (L7), TCP/UDP Load Balancer (L4)
- Azure: Application Gateway (L7), Azure Load Balancer (L4)
Software Load Balancers
- NGINX: Popular L7 proxy with health checks.
- HAProxy: High performance L4/L7.
- Envoy: Modern proxy with rich telemetry.
Service Mesh Load Balancing
Service meshes (Istio, Linkerd) provide client-side load balancing with retry policies, circuit breaking, and telemetry.
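To show what circuit breaking means in practice, here is a generic sketch of the pattern. It is not the Istio or Linkerd API; the `CircuitBreaker` class, its thresholds, and the injectable clock are assumptions for illustration.

```python
import time


class CircuitBreaker:
    """Stops sending requests to a target after repeated failures, then retries later."""

    def __init__(self, max_failures=5, reset_after_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed (traffic flows)

    def allow(self):
        """Return True if a request may be sent to the target right now."""
        if self.opened_at is None:
            return True
        # After the cool-down, let trial requests through (half-open state).
        return self.clock() - self.opened_at >= self.reset_after_s

    def record(self, ok):
        """Record the outcome of a request that was allowed through."""
        if ok:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
```

A caller checks `allow()` before each request and reports the result with `record(ok)`; a failed trial request re-opens the circuit for another cool-down period.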
Autoscaling Integration
Combine with autoscaling:
- Scale on CPU, latency, or queue depth
- Pre-warm nodes to reduce cold starts
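One common way to turn a metric into a replica count is proportional target tracking: scale the current count by the ratio of the observed metric to its target, then clamp. The sketch below assumes that approach; the `desired_replicas` name and the default bounds are hypothetical.

```python
import math


def desired_replicas(current, metric, target, min_replicas=1, max_replicas=20):
    """Scale the replica count in proportion to metric / target, within bounds.

    `metric` and `target` share a unit, e.g. average CPU percent, p95 latency
    in milliseconds, or queue depth per replica.
    """
    raw = math.ceil(current * metric / target)
    return max(min_replicas, min(max_replicas, raw))
```

With 4 replicas at 90% CPU against a 60% target, this asks for `ceil(4 * 90 / 60) = 6` replicas.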
Monitoring
Track:
- Request rate and latency
- Backend error rates
- Health check failures
- Uneven traffic distribution
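Two of these signals reduce to simple ratios. The helpers below are a hypothetical sketch: `error_rate` is errors over total requests, and `imbalance_ratio` compares the busiest target to the average, so 1.0 means a perfectly even spread.

```python
from statistics import mean


def error_rate(total_requests, error_responses):
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    return error_responses / total_requests if total_requests else 0.0


def imbalance_ratio(requests_per_target):
    """Busiest target's request count divided by the mean across targets.

    `requests_per_target` maps target name -> request count over some window.
    """
    counts = list(requests_per_target.values())
    if not counts:
        return 0.0
    avg = mean(counts)
    return max(counts) / avg if avg else 0.0
```

A ratio well above 1.0 that persists across windows is a hint to look at sticky sessions, weights, or a hashing scheme that favors one target.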
Troubleshooting
Common issues:
- Misconfigured health checks (healthy targets marked down, or failing targets left in rotation)
- Sticky sessions causing hot spots
- TLS mismatch or SNI routing errors
- Draining too short for long requests
Related Skills
- 09-microservices/api-gateway
- 09-microservices/service-mesh
- 15-devops-infrastructure/kubernetes-helm