Claude-skill-registry ecs-deployment
ECS deployment strategies including rolling updates, blue-green with CodeDeploy, canary releases, and GitOps workflows. Covers deployment circuit breakers, rollback strategies, and production deployment patterns. Use when deploying ECS services, implementing blue-green deployments, setting up CI/CD pipelines, or managing production releases.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/ecs-deployment" ~/.claude/skills/majiayu000-claude-skill-registry-ecs-deployment && rm -rf "$T"
manifest: skills/data/ecs-deployment/SKILL.md
ECS Deployment Strategies
Complete guide to deploying ECS services safely and efficiently, from rolling updates to blue-green deployments.
Quick Reference
| Strategy | Downtime | Rollback Speed | Complexity | Best For |
|---|---|---|---|---|
| Rolling Update | Zero | Medium | Low | Most workloads |
| Blue-Green | Zero | Instant | High | Critical services |
| Canary | Zero | Fast | High | Risk mitigation |
Rolling Updates (Default)
Configuration
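resource "aws_ecs_service" "app" {
  # Other required arguments (name, cluster, task_definition, ...) omitted for brevity.
  # In the Terraform AWS provider these two limits are top-level arguments on the service.
  deployment_maximum_percent         = 200 # Allow up to 2x the desired count during deployment
  deployment_minimum_healthy_percent = 100 # Keep 100% of the desired count healthy

  deployment_circuit_breaker {
    enable   = true # Auto-detect failures
    rollback = true # Auto-rollback on failure
  }
}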
Behavior
- New task definition registered
- New tasks launched (up to maximum_percent)
- Health checks pass on new tasks
- Old tasks drained and stopped
- Continues until all tasks updated
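To make the limits concrete, the sketch below computes how many tasks may be running while a rolling update is in progress. It assumes the rounding described in the ECS documentation (the healthy minimum is rounded up, the maximum is rounded down); the function name `rolling_update_bounds` is hypothetical and not part of any AWS SDK.

```python
def rolling_update_bounds(desired_count: int,
                          minimum_healthy_percent: int = 100,
                          maximum_percent: int = 200) -> tuple[int, int]:
    """Return (lower, upper) running-task bounds during a rolling update.

    Assumption: the minimum is rounded up and the maximum rounded down,
    as described in the ECS documentation.
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    lower = -(-desired_count * minimum_healthy_percent // 100)
    upper = desired_count * maximum_percent // 100
    return lower, upper


if __name__ == "__main__":
    # 3 desired tasks with the 100/200 settings shown above
    print(rolling_update_bounds(3))
    # 3 desired tasks with a tighter 50/150 configuration
    print(rolling_update_bounds(3, 50, 150))
```

With 100/200 the scheduler can start a full second set of tasks before stopping any old ones, which is why this configuration needs spare cluster capacity.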
Boto3 Deployment
import boto3

ecs = boto3.client('ecs')


def deploy_rolling_update(cluster: str, service: str, new_image: str, container_name: str):
    """Deploy new image via rolling update"""
    # 1. Get current task definition
    svc = ecs.describe_services(cluster=cluster, services=[service])
    current_task_def = svc['services'][0]['taskDefinition']

    # 2. Create new task definition revision
    task_def = ecs.describe_task_definition(taskDefinition=current_task_def)
    new_task_def = task_def['taskDefinition'].copy()

    # Remove response-only fields that register_task_definition does not accept
    for field in ['taskDefinitionArn', 'revision', 'status', 'requiresAttributes',
                  'compatibilities', 'registeredAt', 'registeredBy', 'deregisteredAt']:
        new_task_def.pop(field, None)

    # Update image; fail loudly if the container name does not match anything
    updated = False
    for container in new_task_def['containerDefinitions']:
        if container['name'] == container_name:
            container['image'] = new_image
            updated = True
    if not updated:
        raise ValueError(f"No container named {container_name!r} in {current_task_def}")

    response = ecs.register_task_definition(**new_task_def)
    new_task_def_arn = response['taskDefinition']['taskDefinitionArn']

    # 3. Update service
    ecs.update_service(
        cluster=cluster,
        service=service,
        taskDefinition=new_task_def_arn,
        forceNewDeployment=True
    )
    print(f"Deploying {new_task_def_arn}")
    return new_task_def_arn


# Usage (the account ID is a placeholder)
deploy_rolling_update(
    cluster='production',
    service='api',
    new_image='123456789012.dkr.ecr.us-east-1.amazonaws.com/api:v2.0',
    container_name='api'
)
Monitor Deployment
import time


def wait_for_deployment(cluster: str, service: str, timeout: int = 600):
    """Wait for deployment to complete (uses the module-level `ecs` client)"""
    start = time.time()
    while time.time() - start < timeout:
        response = ecs.describe_services(cluster=cluster, services=[service])
        svc = response['services'][0]

        for deployment in svc['deployments']:
            # rolloutState is only reported for the ECS (rolling update) deployment controller
            rollout_state = deployment.get('rolloutState')
            print(f"Deployment {deployment['id'][:8]}: "
                  f"{rollout_state} "
                  f"({deployment['runningCount']}/{deployment['desiredCount']})")

            if deployment['status'] == 'PRIMARY':
                if rollout_state == 'COMPLETED':
                    print("Deployment successful!")
                    return True
                elif rollout_state == 'FAILED':
                    print(f"Deployment failed: {deployment.get('rolloutStateReason')}")
                    return False

        time.sleep(15)

    print("Deployment timed out")
    return False
Blue-Green Deployments
Architecture
                ┌─────────────┐
                │     ALB     │
                └──────┬──────┘
                       │
       ┌───────────────┴───────────────┐
       │                               │
┌──────▼──────┐                 ┌──────▼──────┐
│ Target Group│                 │ Target Group│
│   (Blue)    │                 │   (Green)   │
└──────┬──────┘                 └──────┬──────┘
       │                               │
┌──────▼──────┐                 ┌──────▼──────┐
│ ECS Service │                 │ ECS Service │
│   (Blue)    │                 │   (Green)   │
└─────────────┘                 └─────────────┘
Terraform with CodeDeploy
# Two target groups
resource "aws_lb_target_group" "blue" {
  name        = "app-blue"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = module.vpc.vpc_id
  target_type = "ip"

  health_check {
    path = "/health"
  }
}

resource "aws_lb_target_group" "green" {
  name        = "app-green"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = module.vpc.vpc_id
  target_type = "ip"

  health_check {
    path = "/health"
  }
}

# ALB with two listeners
resource "aws_lb_listener" "prod" {
  load_balancer_arn = aws_lb.app.arn
  port              = 443
  protocol          = "HTTPS"
  # HTTPS listeners need a certificate. Assumption: the ACM certificate ARN
  # is supplied through a variable named certificate_arn.
  certificate_arn   = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.blue.arn
  }

  lifecycle {
    ignore_changes = [default_action] # Managed by CodeDeploy
  }
}

resource "aws_lb_listener" "test" {
  load_balancer_arn = aws_lb.app.arn
  port              = 8443
  protocol          = "HTTPS"
  certificate_arn   = var.certificate_arn # Same assumed variable as above

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.green.arn
  }

  lifecycle {
    ignore_changes = [default_action]
  }
}

# ECS Service with CodeDeploy
resource "aws_ecs_service" "app" {
  name            = "app"
  cluster         = module.ecs.cluster_id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = 3

  deployment_controller {
    type = "CODE_DEPLOY"
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.blue.arn
    container_name   = "app"
    container_port   = 8080
  }

  lifecycle {
    ignore_changes = [task_definition, load_balancer] # Changed by CodeDeploy on each deployment
  }
}

# CodeDeploy Application
resource "aws_codedeploy_app" "app" {
  compute_platform = "ECS"
  name             = "app-deploy"
}

# CodeDeploy Deployment Group
resource "aws_codedeploy_deployment_group" "app" {
  app_name               = aws_codedeploy_app.app.name
  deployment_group_name  = "app-dg"
  deployment_config_name = "CodeDeployDefault.ECSAllAtOnce"
  service_role_arn       = aws_iam_role.codedeploy.arn

  auto_rollback_configuration {
    enabled = true
    events  = ["DEPLOYMENT_FAILURE", "DEPLOYMENT_STOP_ON_REQUEST"]
  }

  blue_green_deployment_config {
    deployment_ready_option {
      action_on_timeout = "CONTINUE_DEPLOYMENT"
    }

    terminate_blue_instances_on_deployment_success {
      action                           = "TERMINATE"
      termination_wait_time_in_minutes = 5
    }
  }

  deployment_style {
    deployment_option = "WITH_TRAFFIC_CONTROL"
    deployment_type   = "BLUE_GREEN"
  }

  ecs_service {
    cluster_name = module.ecs.cluster_name
    service_name = aws_ecs_service.app.name
  }

  load_balancer_info {
    target_group_pair_info {
      prod_traffic_route {
        listener_arns = [aws_lb_listener.prod.arn]
      }

      test_traffic_route {
        listener_arns = [aws_lb_listener.test.arn]
      }

      target_group {
        name = aws_lb_target_group.blue.name
      }

      target_group {
        name = aws_lb_target_group.green.name
      }
    }
  }
}
Trigger Blue-Green Deployment
import boto3
import json

codedeploy = boto3.client('codedeploy')


def deploy_blue_green(app_name: str, deployment_group: str,
                      task_definition_arn: str, container_name: str,
                      container_port: int):
    """Trigger blue-green deployment via CodeDeploy"""
    # AppSpec content telling CodeDeploy which task definition to roll out
    # and which container/port receives load balancer traffic
    app_spec = {
        "version": "0.0",
        "Resources": [{
            "TargetService": {
                "Type": "AWS::ECS::Service",
                "Properties": {
                    "TaskDefinition": task_definition_arn,
                    "LoadBalancerInfo": {
                        "ContainerName": container_name,
                        "ContainerPort": container_port
                    }
                }
            }
        }]
    }

    response = codedeploy.create_deployment(
        applicationName=app_name,
        deploymentGroupName=deployment_group,
        revision={
            'revisionType': 'AppSpecContent',
            'appSpecContent': {
                'content': json.dumps(app_spec)
            }
        }
    )

    deployment_id = response['deploymentId']
    print(f"Started deployment: {deployment_id}")
    return deployment_id


# Usage (the account ID is a placeholder)
deploy_blue_green(
    app_name='app-deploy',
    deployment_group='app-dg',
    task_definition_arn='arn:aws:ecs:us-east-1:123456789012:task-definition/app:5',
    container_name='app',
    container_port=8080
)
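Once a deployment has been triggered, a pipeline usually needs to block until CodeDeploy reports a final state. The following is a minimal polling sketch, not a definitive implementation. It relies on the `get_deployment` call of the boto3 CodeDeploy client and its `deploymentInfo.status` field; the function name `wait_for_codedeploy`, the `"TimedOut"` return value, and the injected `sleep` parameter are illustrative assumptions.

```python
import time

# Statuses after which CodeDeploy will not change the deployment any further
TERMINAL_STATES = {"Succeeded", "Failed", "Stopped"}


def wait_for_codedeploy(client, deployment_id: str, timeout: int = 1800,
                        poll_interval: int = 15, sleep=time.sleep) -> str:
    """Poll a CodeDeploy deployment until it reaches a terminal status.

    `client` is expected to behave like boto3.client('codedeploy').
    Returns the terminal status, or "TimedOut" (a label invented for this
    sketch) if none is reached within roughly `timeout` seconds.
    """
    waited = 0  # counts sleep time only, which keeps the loop easy to test
    while waited <= timeout:
        info = client.get_deployment(deploymentId=deployment_id)["deploymentInfo"]
        status = info["status"]
        print(f"Deployment {deployment_id}: {status}")
        if status in TERMINAL_STATES:
            return status
        sleep(poll_interval)
        waited += poll_interval
    return "TimedOut"
```

Passing the client and the sleep function as arguments keeps the helper testable without AWS access. In real use it could be called as `wait_for_codedeploy(boto3.client('codedeploy'), deployment_id)`.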
Canary Releases
ALB Weighted Routing
resource "aws_lb_listener_rule" "canary" {
  listener_arn = aws_lb_listener.prod.arn
  priority     = 100

  action {
    type = "forward"

    forward {
      target_group {
        arn    = aws_lb_target_group.stable.arn
        weight = 90
      }

      target_group {
        arn    = aws_lb_target_group.canary.arn
        weight = 10
      }
    }
  }

  condition {
    path_pattern {
      values = ["/*"]
    }
  }
}
Gradual Traffic Shift
import boto3


def shift_traffic(listener_rule_arn: str, stable_tg_arn: str,
                  canary_tg_arn: str, canary_weight: int):
    """Shift traffic percentage to canary"""
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")

    elb = boto3.client('elbv2')
    stable_weight = 100 - canary_weight

    elb.modify_rule(
        RuleArn=listener_rule_arn,
        Actions=[{
            'Type': 'forward',
            'ForwardConfig': {
                'TargetGroups': [
                    {
                        'TargetGroupArn': stable_tg_arn,
                        'Weight': stable_weight
                    },
                    {
                        'TargetGroupArn': canary_tg_arn,
                        'Weight': canary_weight
                    }
                ]
            }
        }]
    )
    print(f"Traffic: {stable_weight}% stable, {canary_weight}% canary")


# Progressive rollout. rule_arn, stable_tg_arn, and canary_tg_arn must hold
# the ARNs of your listener rule and target groups.
shift_traffic(rule_arn, stable_tg_arn, canary_tg_arn, 10)   # 10% to canary
# Monitor metrics...
shift_traffic(rule_arn, stable_tg_arn, canary_tg_arn, 25)   # 25% to canary
# Monitor metrics...
shift_traffic(rule_arn, stable_tg_arn, canary_tg_arn, 50)   # 50% to canary
# Monitor metrics...
shift_traffic(rule_arn, stable_tg_arn, canary_tg_arn, 100)  # 100% to canary (promote)
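The "monitor metrics" pauses between steps can be turned into an explicit gate. The sketch below is one possible shape, under stated assumptions: `shift` is any callable that sets the canary weight, and `is_healthy` is a hypothetical callable that observes the canary (for example by checking a CloudWatch alarm) and returns a boolean. Neither is an AWS API.

```python
from typing import Callable, Iterable


def progressive_rollout(shift: Callable[[int], None],
                        is_healthy: Callable[[], bool],
                        steps: Iterable[int] = (10, 25, 50, 100)) -> bool:
    """Increase the canary weight step by step, aborting on the first bad check.

    Returns True if every step passed and the canary was promoted, or False
    if a check failed and all traffic was sent back to the stable target group.
    """
    for weight in steps:
        shift(weight)
        if not is_healthy():
            shift(0)  # send 100% of traffic back to stable
            return False
    return True


if __name__ == "__main__":
    history = []
    promoted = progressive_rollout(history.append, lambda: True)
    print(promoted, history)
```

Keeping the traffic shift and the health check as injected callables means the same loop works whether weights are changed through `modify_rule` or through another mechanism.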
Deployment Circuit Breaker
How It Works
- ECS monitors deployment health
- Detects repeated task failures
- Automatically stops deployment
- Optional: Rolls back to previous version
Configuration
resource "aws_ecs_service" "app" {
  deployment_circuit_breaker {
    enable   = true
    rollback = true # Auto-rollback on failure
  }
}
Failure Detection
Circuit breaker triggers when:
- Tasks fail to reach RUNNING state
- Health checks fail repeatedly
- Tasks crash shortly after starting
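How many failures count as "repeated" depends on the service size. As a rough illustration, the sketch below reproduces the threshold formula given in the ECS documentation at the time of writing: half the desired count, rounded up, clamped between 3 and 200. Treat the exact numbers as an assumption to verify against current AWS documentation; the function name is hypothetical.

```python
def circuit_breaker_threshold(desired_count: int) -> int:
    """Estimate how many failed tasks trip the deployment circuit breaker.

    Assumption: threshold = ceil(0.5 * desired_count), clamped to [3, 200],
    as described in the ECS documentation at the time of writing.
    """
    half_rounded_up = (desired_count + 1) // 2
    return min(200, max(3, half_rounded_up))


if __name__ == "__main__":
    for count in (1, 25, 400, 800):
        print(count, circuit_breaker_threshold(count))
```

Small services therefore still tolerate three failed tasks before the deployment is marked as failed, so a rollback is not instantaneous.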
GitOps Workflow
GitHub Actions Example
name: Deploy to ECS

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build and push image
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/myapp:$IMAGE_TAG .
          docker push $ECR_REGISTRY/myapp:$IMAGE_TAG

      - name: Update task definition
        id: task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: myapp
          image: ${{ steps.login-ecr.outputs.registry }}/myapp:${{ github.sha }}

      - name: Deploy to ECS
        uses: aws-actions/amazon-ecs-deploy-task-definition@v2
        with:
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          service: myapp-service
          cluster: production
          wait-for-service-stability: true
Rollback Strategies
Manual Rollback
def rollback_to_previous(cluster: str, service: str):
    """Rollback to previous task definition (uses the module-level `ecs` client)"""
    # Get current task definition
    svc = ecs.describe_services(cluster=cluster, services=[service])
    current_td = svc['services'][0]['taskDefinition']

    # Parse family and revision
    # arn:aws:ecs:region:account:task-definition/family:revision
    family, revision = current_td.split('/')[-1].rsplit(':', 1)
    current_revision = int(revision)
    if current_revision <= 1:
        raise ValueError(f"{family}:{current_revision} has no previous revision")

    # Go back to previous revision (assumes it has not been deregistered)
    previous_td = f"{family}:{current_revision - 1}"

    # Update service
    ecs.update_service(
        cluster=cluster,
        service=service,
        taskDefinition=previous_td
    )
    print(f"Rolling back to {previous_td}")
    return previous_td


# Usage
rollback_to_previous('production', 'api')
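Subtracting one from the revision number fails if that revision has been deregistered. A more defensive variant, sketched below, asks ECS for the ACTIVE revisions of the family and picks the newest one older than the current revision. It uses the `list_task_definitions` call of the boto3 ECS client; the function name and the injected `client` parameter are assumptions made for illustration, and pagination beyond the first page is deliberately ignored.

```python
from typing import Optional


def _family_and_revision(arn: str) -> tuple[str, int]:
    """Split '.../task-definition/family:revision' into (family, revision)."""
    family, revision = arn.split("/")[-1].rsplit(":", 1)
    return family, int(revision)


def previous_active_revision(client, current_arn: str) -> Optional[str]:
    """Return the ARN of the newest ACTIVE revision older than current_arn.

    `client` is expected to behave like boto3.client('ecs'). Only the first
    page of results is inspected in this sketch.
    """
    family, current_revision = _family_and_revision(current_arn)
    response = client.list_task_definitions(
        familyPrefix=family, status="ACTIVE", sort="DESC"
    )
    candidates = []
    for arn in response["taskDefinitionArns"]:
        other_family, revision = _family_and_revision(arn)
        # familyPrefix is a prefix match, so filter for the exact family name
        if other_family == family and revision < current_revision:
            candidates.append((revision, arn))
    return max(candidates)[1] if candidates else None
```

The returned ARN can be passed to `update_service` as the `taskDefinition`; a `None` result means there is nothing to roll back to.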
Automatic Rollback (Circuit Breaker)
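Enabled via deployment_circuit_breaker.rollback = true. When the circuit breaker marks a deployment as failed, ECS rolls the service back to the last deployment that completed successfully.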
Best Practices
- Always enable circuit breaker with rollback for production
- Use blue-green for critical services requiring instant rollback
- Implement health checks at container, task, and ALB levels
- Pin image digests instead of tags for reproducibility
- Use immutable image tags in ECR
- Monitor deployments with CloudWatch alarms
- Test rollback procedures regularly
- Keep previous task definitions for quick rollback
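The digest-pinning practice can be enforced with a small pre-deployment check. The sketch below is only an illustration: it flags containers whose image reference does not end in an `@sha256:<64 hex characters>` digest. The helper names are hypothetical, and the input is assumed to have the same shape as the `containerDefinitions` list of a task definition.

```python
import re

# An image pinned by digest ends with '@sha256:' followed by 64 hex characters
_DIGEST_SUFFIX = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned_by_digest(image: str) -> bool:
    """Return True if the image reference is pinned to a sha256 digest."""
    return bool(_DIGEST_SUFFIX.search(image))


def unpinned_containers(container_definitions: list[dict]) -> list[str]:
    """Return the names of containers whose image is referenced by tag only."""
    return [
        container["name"]
        for container in container_definitions
        if not is_pinned_by_digest(container["image"])
    ]


if __name__ == "__main__":
    example = [
        {"name": "api", "image": "registry.example.com/api:v2.0"},
        {"name": "sidecar", "image": "registry.example.com/sidecar@sha256:" + "a" * 64},
    ]
    print(unpinned_containers(example))
```

A CI job could fail the build when the returned list is not empty, before the task definition is ever registered.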
Progressive Disclosure
Quick Start (This File)
- Rolling updates
- Blue-green basics
- Canary releases
- Circuit breaker
Detailed References
- Blue-Green Setup: Complete CodeDeploy configuration
- CI/CD Pipelines: GitHub Actions, CodePipeline
- Monitoring: CloudWatch, alarms
Related Skills
- boto3-ecs: SDK patterns
- terraform-ecs: Infrastructure as Code
- ecs-troubleshooting: Debugging deployments