Service mesh implementation with Istio for microservices traffic management, security, and observability. Use when implementing service mesh, mTLS, traffic routing, load balancing, circuit breakers, retries, timeouts, canary deployments, A/B testing, or service-to-service communication. Triggers: istio, service mesh, envoy, sidecar, virtualservice, destinationrule, gateway, mtls, peerauthentication, authorizationpolicy, serviceentry, traffic management, traffic splitting, canary, blue-green, circuit breaker, retry, timeout, load balancing, ingress, egress, observability, tracing, telemetry.
Istio Service Mesh
Overview
Istio is an open-source service mesh that provides traffic management, security, and observability for microservices architectures. It uses a sidecar proxy pattern with Envoy proxies to intercept and control all network communication between services.
Core Capabilities
Traffic Management: Load balancing, traffic splitting, canary deployments, blue-green deployments, A/B testing, retries, timeouts, circuit breakers, fault injection.
Security: mTLS encryption, certificate management, authentication, authorization policies, RBAC, JWT validation, service-to-service security.
Observability: Distributed tracing, metrics collection, access logging, service topology visualization, golden signals monitoring.
Quick Reference: Common Tasks
| Task | Resources | Section |
|---|---|---|
| Enable mTLS between services | PeerAuthentication | mTLS PeerAuthentication |
| Route traffic to new version | VirtualService + DestinationRule | Traffic Splitting for Canary |
| Add circuit breaker | DestinationRule (outlierDetection) | Circuit Breaker and Retry |
| Configure retries/timeouts | VirtualService (retries, timeout) | Circuit Breaker and Retry |
| Expose service to internet | Gateway + VirtualService | Gateway and VirtualService |
| Control egress traffic | Sidecar + ServiceEntry | Sidecar Resource for Egress |
| Add authorization rules | AuthorizationPolicy | AuthorizationPolicy for RBAC |
| Configure load balancing | DestinationRule (loadBalancer) | DestinationRule with Traffic Policies |
| Test resilience | VirtualService (fault injection) | Fault Injection for Testing |
Architecture Components
Control Plane (istiod)
- Service discovery and configuration distribution
- Certificate authority for mTLS
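- Traffic management (Pilot, merged into istiod since Istio 1.5)
- Configuration validation (Galley, merged into istiod since Istio 1.5)
- Security and identity (Citadel, merged into istiod since Istio 1.5)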
Data Plane
- Envoy proxies deployed as sidecars
- Intercept all inbound and outbound traffic
- Enforce policies and collect telemetry
- Handle traffic routing, load balancing, and retries
Key Resources
- Gateway: Configures load balancers for HTTP/TCP traffic entering the mesh
- VirtualService: Defines traffic routing rules
- DestinationRule: Configures policies after routing (load balancing, connection pools, circuit breakers)
- ServiceEntry: Adds external services to the mesh
- PeerAuthentication: Configures mTLS between services
- AuthorizationPolicy: Defines access control policies
- Sidecar: Controls sidecar proxy configuration and egress traffic
Installation and Configuration
Install Istio with istioctl
```bash
# Download and install istioctl
curl -L https://istio.io/downloadIstio | sh -
cd istio-*
export PATH=$PWD/bin:$PATH

# Install Istio with the default profile
# (Istio ships no built-in "production" profile)
istioctl install --set profile=default -y

# Verify installation
kubectl get pods -n istio-system
istioctl verify-install

# Enable automatic sidecar injection for namespace
kubectl label namespace default istio-injection=enabled
```
Configuration Profiles
```bash
# Minimal: Control plane only, no ingress/egress
istioctl install --set profile=minimal

# Default: Recommended for production
istioctl install --set profile=default

# High availability: there is no built-in "production" profile;
# start from default and raise the minimum istiod replica count
istioctl install --set profile=default \
  --set values.pilot.autoscaleMin=2

# Custom configuration
istioctl install --set profile=default \
  --set meshConfig.accessLogFile=/dev/stdout \
  --set meshConfig.enableTracing=true \
  --set meshConfig.defaultConfig.proxyMetadata.ISTIO_META_DNS_CAPTURE=true
```
Verify Sidecar Injection
```bash
# Check if namespace has injection enabled
kubectl get namespace -L istio-injection

# Verify pod has sidecar
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[*].name}'
# Should show: app-container istio-proxy

# View sidecar configuration
istioctl proxy-config all <pod-name>.<namespace>
```
Best Practices
Gateway Configuration
Guidelines
- Use dedicated Gateway resources per domain or protocol
- Configure HTTPS with proper TLS certificates
- Implement health checks and timeouts
- Use wildcard domains sparingly for security
- Place gateways in dedicated namespaces (istio-system or istio-ingress)
Anti-patterns
- Avoid multiple Gateways binding to the same port/host combination
- Don't expose internal services directly without authentication
- Never hardcode credentials in Gateway specs
Traffic Management Patterns
Progressive Delivery
- Use weighted routing for canary deployments
- Implement blue-green deployments with instant traffic switching
- Apply header-based routing for testing new versions
- Monitor metrics before promoting canaries
Resilience
- Configure retries with exponential backoff
- Implement circuit breakers to prevent cascade failures
- Set connection pool limits to protect services
- Use outlier detection to remove unhealthy instances
Routing Strategy
- Route based on headers, URI paths, or query parameters
- Use subset-based routing for version management
- Implement fault injection for chaos testing
- Apply timeouts at every service boundary
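Header and URI matches appear in the examples later in this document, but query-parameter matching does not. The following is a minimal sketch of it, with a timeout on each route. The `search` service, the `beta` parameter, and the resource name are hypothetical, and the `v1`/`v2` subsets assume a matching DestinationRule exists.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: query-param-routing              # hypothetical name
  namespace: default
spec:
  hosts:
    - search.default.svc.cluster.local   # hypothetical service
  http:
    # Requests carrying ?beta=true are routed to v2
    - match:
        - queryParams:
            beta:
              exact: "true"
      route:
        - destination:
            host: search.default.svc.cluster.local
            subset: v2
      timeout: 5s
    # Everything else stays on v1
    - route:
        - destination:
            host: search.default.svc.cluster.local
            subset: v1
      timeout: 5s
```

Rules are evaluated top to bottom, so the unconditional route must come last.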
Security Policies
mTLS Configuration
- Enable STRICT mode in production for all services
- Use PERMISSIVE mode only during migration
- Scope PeerAuthentication to specific namespaces or workloads
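- Verify mTLS status with istioctl experimental describe pod (the older istioctl authn tls-check command has been removed from current Istio releases)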
Authorization
- Default deny all traffic, then explicitly allow
- Use namespace-level policies for broad rules
- Apply workload-specific policies for fine-grained control
- Leverage JWT authentication for end-user identity
- Audit authorization policies regularly
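JWT-based rules in an AuthorizationPolicy only take effect when a RequestAuthentication resource tells the proxy how to validate tokens. The following is a minimal sketch of one. The issuer and JWKS URLs and the resource name are placeholders, not values from this document.

```yaml
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: jwt-auth                # hypothetical name
  namespace: default
spec:
  selector:
    matchLabels:
      app: api
  jwtRules:
    - issuer: "https://auth.example.com"                        # placeholder issuer
      jwksUri: "https://auth.example.com/.well-known/jwks.json" # placeholder JWKS endpoint
      # Pass the validated token on to the application
      forwardOriginalToken: true
```

On its own this rejects requests with invalid tokens but still admits requests with no token at all. Pairing it with an AuthorizationPolicy that requires `requestPrincipals` closes that gap.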
Certificate Management
- Rotate certificates automatically (Istio-issued workload certificates default to a 24-hour lifetime)
- Use external CA for production (cert-manager, Vault)
- Monitor certificate expiration
- Test certificate renewal procedures
Observability Integration
Metrics
- Deploy Prometheus for metrics collection
- Use Grafana dashboards for visualization
- Monitor golden signals: latency, traffic, errors, saturation
- Set up alerts for SLO violations
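One way to express SLO alerts on Istio's standard metrics is sketched below. It assumes the Prometheus Operator is installed (for the `PrometheusRule` CRD). The `monitoring` namespace, the thresholds (1% errors, 500 ms p99), and the alert names are illustrative and should be tuned.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: istio-slo-alerts        # hypothetical name
  namespace: monitoring         # assumed namespace
spec:
  groups:
    - name: istio-golden-signals
      rules:
        # Errors: share of 5xx responses per destination service
        - alert: IstioHighErrorRate
          expr: |
            sum(rate(istio_requests_total{reporter="destination", response_code=~"5.."}[5m])) by (destination_service)
              /
            sum(rate(istio_requests_total{reporter="destination"}[5m])) by (destination_service)
              > 0.01
          for: 5m
          labels:
            severity: page
          annotations:
            summary: "5xx rate above 1% for {{ $labels.destination_service }}"
        # Latency: p99 request duration per destination service
        - alert: IstioHighP99Latency
          expr: |
            histogram_quantile(0.99,
              sum(rate(istio_request_duration_milliseconds_bucket{reporter="destination"}[5m])) by (le, destination_service)
            ) > 500
          for: 10m
          labels:
            severity: warn
          annotations:
            summary: "p99 latency above 500ms for {{ $labels.destination_service }}"
```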
Tracing
- Integrate with Jaeger, Zipkin, or Datadog
- Propagate trace headers in application code
- Sample traces intelligently (not 100% in production)
- Use tracing for debugging latency issues
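Sampling rates can be set through Istio's Telemetry API. The sketch below sets a mesh-wide 1% rate. That rate is only an illustrative starting point, and the sketch assumes a tracing provider is already configured in the mesh config.

```yaml
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default
  namespace: istio-system       # root namespace makes this apply mesh-wide
spec:
  tracing:
    # Sample 1% of requests instead of every request
    - randomSamplingPercentage: 1.0
```

A Telemetry resource in a workload's own namespace can override this, for example to raise sampling temporarily while debugging one service.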
Logging
- Enable access logs selectively (performance impact)
- Structure logs in JSON format
- Send logs to centralized logging (ELK, Splunk)
- Include trace IDs in application logs
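Selective access logging can be sketched with the Telemetry API. The example below logs only error responses in a single namespace. It uses the built-in `envoy` provider, and the filter threshold and resource name are illustrative choices.

```yaml
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: access-log-errors       # hypothetical name
  namespace: production
spec:
  accessLogging:
    - providers:
        - name: envoy
      # Log only 4xx/5xx responses to limit volume and overhead
      filter:
        expression: "response.code >= 400"
```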
Production-Ready Examples
Gateway and VirtualService
```yaml
# gateway.yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: public-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
    # HTTPS configuration
    - port:
        number: 443
        name: https
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: tls-cert-secret # Secret in istio-system namespace
      hosts:
        - "api.example.com"
        - "app.example.com"
    # HTTP to HTTPS redirect
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "api.example.com"
        - "app.example.com"
      tls:
        httpsRedirect: true
---
# virtualservice.yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-routes
  namespace: default
spec:
  hosts:
    - "api.example.com"
  gateways:
    - istio-system/public-gateway
  http:
    # Route /v2 to new service
    - match:
        - uri:
            prefix: "/v2/"
      rewrite:
        uri: "/"
      route:
        - destination:
            host: api-v2.default.svc.cluster.local
            port:
              number: 8080
      timeout: 30s
      retries:
        attempts: 3
        perTryTimeout: 10s
        retryOn: 5xx,reset,connect-failure,refused-stream
    # Route /v1 to legacy service
    - match:
        - uri:
            prefix: "/v1/"
      route:
        - destination:
            host: api-v1.default.svc.cluster.local
            port:
              number: 8080
      timeout: 60s
    # Default route
    - route:
        - destination:
            host: api-v2.default.svc.cluster.local
            port:
              number: 8080
```
DestinationRule with Traffic Policies
```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: api-destination
  namespace: default
spec:
  host: api.default.svc.cluster.local
  trafficPolicy:
    # Load balancing
    loadBalancer:
      consistentHash:
        httpHeaderName: x-user-id # Session affinity
    # Connection pool settings
    connectionPool:
      tcp:
        maxConnections: 100
        connectTimeout: 30ms
        tcpKeepalive:
          time: 7200s
          interval: 75s
      http:
        http1MaxPendingRequests: 50
        http2MaxRequests: 100
        maxRequestsPerConnection: 2
        maxRetries: 3
    # Outlier detection (circuit breaker)
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
      minHealthPercent: 40
    # TLS settings for upstream
    tls:
      mode: ISTIO_MUTUAL # Use Istio mTLS
  # Define subsets for version-based routing
  subsets:
    - name: v1
      labels:
        version: v1
      trafficPolicy:
        loadBalancer:
          simple: ROUND_ROBIN
    - name: v2
      labels:
        version: v2
      trafficPolicy:
        loadBalancer:
          simple: LEAST_REQUEST
```
Traffic Splitting for Canary Deployment
```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: canary-rollout
  namespace: default
spec:
  hosts:
    - reviews.default.svc.cluster.local
  http:
    # Requests with "x-canary: true" always go to the canary
    - match:
        - headers:
            x-canary:
              exact: "true"
      route:
        - destination:
            host: reviews.default.svc.cluster.local
            subset: v2
    # All other traffic: send 10% to the canary
    - route:
        - destination:
            host: reviews.default.svc.cluster.local
            subset: v1
          weight: 90
        - destination:
            host: reviews.default.svc.cluster.local
            subset: v2
          weight: 10
---
# Blue-Green Deployment (instant switch)
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: blue-green
  namespace: default
spec:
  hosts:
    - orders.default.svc.cluster.local
  http:
    - route:
        # Switch to green by setting blue to 0 and green to 100
        - destination:
            host: orders.default.svc.cluster.local
            subset: blue
          weight: 100
        - destination:
            host: orders.default.svc.cluster.local
            subset: green
          weight: 0
```
Circuit Breaker and Retry Configuration
```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: circuit-breaker
  namespace: default
spec:
  host: backend.default.svc.cluster.local
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 10
      http:
        http1MaxPendingRequests: 1
        http2MaxRequests: 10
        maxRequestsPerConnection: 1
    outlierDetection:
      # Remove instance after 5 consecutive errors
      consecutive5xxErrors: 5
      consecutiveGatewayErrors: 5
      # Check every 1 second
      interval: 1s
      # Keep instance ejected for 30 seconds
      baseEjectionTime: 30s
      # Maximum 100% of instances can be ejected
      maxEjectionPercent: 100
      # Minimum 0% must be healthy (allows full ejection for testing)
      minHealthPercent: 0
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: retry-policy
  namespace: default
spec:
  hosts:
    - payment.default.svc.cluster.local
  http:
    - route:
        - destination:
            host: payment.default.svc.cluster.local
      timeout: 10s
      retries:
        attempts: 3
        perTryTimeout: 3s
        # Retry on these conditions
        retryOn: 5xx,reset,connect-failure,refused-stream,retriable-4xx
        # Allow retries to endpoints in other localities.
        # This does not restrict retries to idempotent methods.
        retryRemoteLocalities: true
```
mTLS PeerAuthentication
```yaml
# Namespace-wide STRICT mTLS
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default-mtls
  namespace: production
spec:
  mtls:
    mode: STRICT
---
# Mesh-wide mTLS (apply to istio-system)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: mesh-mtls
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
---
# Workload-specific PERMISSIVE (migration)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: legacy-service
  namespace: default
spec:
  selector:
    matchLabels:
      app: legacy-app
  mtls:
    mode: PERMISSIVE # Accept both mTLS and plaintext
  # Port-level override
  portLevelMtls:
    8080:
      mode: DISABLE # Health check port

# Verify mTLS status (istioctl authn tls-check is no longer available in current releases)
# istioctl experimental describe pod <pod-name> -n <namespace>
```
AuthorizationPolicy for RBAC
```yaml
# Default deny all
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: production
spec: {} # Empty spec denies all requests
---
# Allow specific service-to-service communication
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  selector:
    matchLabels:
      app: backend
  action: ALLOW
  rules:
    # Allow from frontend service
    - from:
        - source:
            principals:
              - "cluster.local/ns/production/sa/frontend"
      to:
        - operation:
            methods: ["GET", "POST"]
            paths: ["/api/*"]
---
# JWT authentication and authorization
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: require-jwt
  namespace: default
spec:
  selector:
    matchLabels:
      app: api
  action: ALLOW
  rules:
    - from:
        - source:
            requestPrincipals: ["*"] # Valid JWT required
      when:
        - key: request.auth.claims[role]
          values: ["admin", "user"]
---
# IP-based allow list
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-internal-ips
  namespace: default
spec:
  selector:
    matchLabels:
      app: admin-panel
  action: ALLOW
  rules:
    - from:
        - source:
            ipBlocks: ["10.0.0.0/8", "172.16.0.0/12"]
---
# Method and path-based restrictions
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: read-only-access
  namespace: default
spec:
  selector:
    matchLabels:
      app: database-api
  action: ALLOW
  rules:
    - to:
        - operation:
            methods: ["GET", "HEAD"]
---
# DENY takes precedence over ALLOW
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-delete
  namespace: default
spec:
  selector:
    matchLabels:
      app: database-api
  action: DENY
  rules:
    - to:
        - operation:
            methods: ["DELETE"]
```
Sidecar Resource for Egress Control
```yaml
# Default sidecar for namespace (restrict egress)
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default-sidecar
  namespace: production
spec:
  # Apply to all workloads in namespace
  egress:
    # Allow access to services in same namespace
    - hosts:
        - "./*"
    # Allow access to istio-system
    - hosts:
        - "istio-system/*"
    # Allow specific external services
    - hosts:
        - "*/external-api.external.svc.cluster.local"
---
# Workload-specific sidecar
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: frontend-sidecar
  namespace: default
spec:
  workloadSelector:
    labels:
      app: frontend
  ingress:
    - port:
        number: 8080
        protocol: HTTP
        name: http
      defaultEndpoint: 127.0.0.1:8080
  egress:
    # Only allow access to backend service
    - hosts:
        - "./backend.default.svc.cluster.local"
    # Allow access to external API
    - hosts:
        - "*/api.external.com"
---
# Optimize sidecar for external service access
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-api
  namespace: default
spec:
  hosts:
    - api.external.com
  ports:
    - number: 443
      name: https
      protocol: HTTPS
  location: MESH_EXTERNAL
  resolution: DNS
---
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: external-egress
  namespace: default
spec:
  workloadSelector:
    labels:
      app: worker
  outboundTrafficPolicy:
    mode: REGISTRY_ONLY # Only allow registered ServiceEntry
  egress:
    - hosts:
        - "*/api.external.com"
```
Advanced Patterns
Fault Injection for Testing
```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: fault-injection
  namespace: default
spec:
  hosts:
    - ratings.default.svc.cluster.local
  http:
    - match:
        - headers:
            x-test:
              exact: "chaos"
      fault:
        # Inject 5 second delay for 50% of requests
        delay:
          percentage:
            value: 50.0
          fixedDelay: 5s
        # Abort 10% of requests with HTTP 500
        abort:
          percentage:
            value: 10.0
          httpStatus: 500
      route:
        - destination:
            host: ratings.default.svc.cluster.local
    - route:
        - destination:
            host: ratings.default.svc.cluster.local
```
Multi-Cluster Service Mesh
```yaml
# Primary cluster configuration
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: primary-cluster
spec:
  values:
    global:
      meshID: mesh1
      multiCluster:
        clusterName: primary
      network: network1
---
# Remote cluster configuration
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: remote-cluster
spec:
  values:
    global:
      meshID: mesh1
      multiCluster:
        clusterName: remote
      network: network2
      # Must resolve to the primary cluster's istiod as seen from the remote
      # cluster (typically the address of the primary's east-west gateway)
      remotePilotAddress: istiod.istio-system.svc.cluster.local
```
Locality-Based Load Balancing
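```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: locality-lb
  namespace: default
spec:
  host: service.default.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        # Prefer same region/zone
        distribute:
          - from: "us-west/zone1/*"
            to:
              "us-west/zone1/*": 80
              "us-west/zone2/*": 20
        # Failover configuration: distribute and failover are mutually
        # exclusive, so use this in place of the distribute block above
        # failover:
        #   - from: us-west
        #     to: us-east
    # Outlier detection is required for locality failover to take effect
    outlierDetection:
      consecutive5xxErrors: 5 # replaces the deprecated consecutiveErrors field
      interval: 30s
      baseEjectionTime: 30s
```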
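Istio accepts only one of `distribute` and `failover` in a single `localityLbSetting`, so a failover-only rule has to be written separately. The sketch below shows one for the same host. The resource name is hypothetical, and it should be applied in place of a distribute-based rule for the host, not alongside it.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: locality-failover       # hypothetical name
  namespace: default
spec:
  host: service.default.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        # Send traffic to us-east only when us-west endpoints are unhealthy
        failover:
          - from: us-west
            to: us-east
    # Failover relies on outlier detection to mark endpoints unhealthy
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
```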
Troubleshooting Commands
```bash
# Check Istio installation
istioctl verify-install

# Analyze configuration issues
istioctl analyze --all-namespaces

# Inspect proxy configuration
istioctl proxy-config cluster <pod-name>.<namespace>
istioctl proxy-config route <pod-name>.<namespace>
istioctl proxy-config listener <pod-name>.<namespace>
istioctl proxy-config endpoint <pod-name>.<namespace>

# Check mTLS status and debug routing
# (istioctl authn tls-check is no longer available in current releases)
istioctl experimental describe pod <pod-name> -n <namespace>

# View proxy logs
kubectl logs <pod-name> -c istio-proxy -n <namespace>

# Check certificate expiration
istioctl proxy-config secret <pod-name>.<namespace> -o json \
  | jq '.dynamicActiveSecrets[0].secret.tlsCertificate.certificateChain.inlineBytes' -r \
  | base64 -d \
  | openssl x509 -text -noout

# Test traffic routing
kubectl exec <pod-name> -c istio-proxy -- curl -v http://service:port/path

# Export proxy configuration for debugging
istioctl proxy-config all <pod-name>.<namespace> -o json > proxy-config.json
```
Performance Tuning
Resource Requests and Limits
```yaml
# Sidecar proxy resources
# These annotations are read from the pod template, not from the Namespace,
# so set them per workload.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  namespace: production
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/proxyCPU: "100m"
        sidecar.istio.io/proxyCPULimit: "2000m"
        sidecar.istio.io/proxyMemory: "128Mi"
        sidecar.istio.io/proxyMemoryLimit: "1024Mi"
```
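Mesh-wide defaults for every injected sidecar can be set at install time. The sketch below assumes the IstioOperator-based install used elsewhere in this document and reuses the request and limit values from the example above.

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    global:
      proxy:
        # Default resources applied to every injected sidecar
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 2000m
            memory: 1024Mi
```

Per-pod `sidecar.istio.io/proxy*` annotations still override these defaults for individual workloads.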
Control Plane Tuning
```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      # Delay application containers until the sidecar is ready
      holdApplicationUntilProxyStarts: true
      proxyMetadata:
        ISTIO_META_DNS_CAPTURE: "true"
        ISTIO_META_DNS_AUTO_ALLOCATE: "true"
  components:
    pilot:
      k8s:
        resources:
          requests:
            cpu: 500m
            memory: 2Gi
          limits:
            cpu: 2000m
            memory: 4Gi
        env:
          # Limit the number of concurrent config pushes
          - name: PILOT_PUSH_THROTTLE
            value: "100"
          - name: PILOT_ENABLE_WORKLOAD_ENTRY_HEALTHCHECKS
            value: "true"
```
Security Hardening
Disable Privileged Containers
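```yaml
# meshConfig.defaultConfig has no runAsUser or securityContext fields.
# The injected sidecar already runs as non-root UID/GID 1337. The remaining
# elevated privilege is the istio-init container (NET_ADMIN, NET_RAW), which
# the Istio CNI plugin makes unnecessary.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    cni:
      enabled: true
```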
Egress Traffic Control
```yaml
# Block all egress by default
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    outboundTrafficPolicy:
      mode: REGISTRY_ONLY # Only allow registered ServiceEntry
```
Migration Strategy
Phase 1: Install Istio (No Injection)
```bash
istioctl install --set profile=default
# Don't enable automatic injection yet
```
Phase 2: Enable Injection Per Workload
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "true"
```
Phase 3: Enable PERMISSIVE mTLS
```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: PERMISSIVE
```
Phase 4: Verify All Services Use mTLS
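```bash
# Check each pod (istioctl authn tls-check is no longer available in current releases)
for pod in $(kubectl get pods -n production -o jsonpath='{.items[*].metadata.name}'); do
  istioctl experimental describe pod "$pod" -n production
done
```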
Phase 5: Enable STRICT mTLS
```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
```