Claude-skill-registry deployment

Serverless deployment with zero-downtime, multi-environment strategies, and infrastructure validation. Use when deploying Lambda functions, managing environments, or troubleshooting deployment failures.

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/deployment-awannaphasch2016-jousef-landing" ~/.claude/skills/majiayu000-claude-skill-registry-deployment && rm -rf "$T"
manifest: skills/data/deployment-awannaphasch2016-jousef-landing/SKILL.md
source content

Deployment Skill

Tech Stack: AWS Lambda, Docker, Terraform, GitHub Actions, Doppler (secrets)

Source: Extracted from CLAUDE.md deployment principles and production deployment patterns.


When to Use This Skill

Use the deployment skill when:

  • ✓ Deploying Lambda functions to AWS
  • ✓ Managing multi-environment deployments (dev/staging/production)
  • ✓ Implementing zero-downtime deployments
  • ✓ Troubleshooting deployment failures
  • ✓ Validating infrastructure configuration
  • ✓ Setting up CI/CD pipelines
  • ✓ Pre-deployment validation (see lambda-deployment checklist)

DO NOT use this skill for:

  • ✗ Local development setup (use project README instead)
  • ✗ Running tests (use testing-workflow skill)
  • ✗ Code refactoring (use refactor skill)

Quick Links:


Quick Deployment Decision Tree

What are you deploying?
├─ Lambda function update?
│  ├─ Code change only? → Update function code, wait for update
│  ├─ Config change (env vars, memory)? → Update configuration, wait
│  ├─ Zero-downtime required? → Use versioning + alias pattern
│  └─ Rollback needed? → Point alias to previous version
│
├─ New environment?
│  ├─ Branch-based (dev/staging/prod)? → Follow multi-env guide
│  ├─ Secrets setup? → Configure Doppler + GitHub secrets
│  └─ Infrastructure? → Terraform apply
│
├─ Deployment failed?
│  ├─ Check CloudWatch logs → Filter ERROR level
│  ├─ Validate secrets → Run validation script
│  ├─ Check resource state → AWS CLI describe commands
│  └─ Verify permissions → IAM policy validation
│
└─ CI/CD pipeline setup?
   ├─ Define environments → dev/staging/prod
   ├─ Configure artifact promotion → Immutable images
   ├─ Add validation gates → Pre-deploy checks
   └─ Setup monitoring → CloudWatch + smoke tests

Core Deployment Patterns

Pattern 1: Zero-Downtime Lambda Deployment

Problem: Updating Lambda function causes brief unavailability during deployment.

Solution: Version + Alias pattern

$LATEST (mutable, testing)
   ↓ publish version
Version N (immutable snapshot)
   ↓ update alias
live (production pointer) → Version N

Benefits:

  • Zero downtime (alias swap is atomic)
  • Instant rollback (point alias to previous version)
  • Test before promotion ($LATEST → Version)

See ZERO_DOWNTIME.md for detailed patterns.

Pattern 2: Multi-Environment Strategy

Branch-Based Deployment:

dev branch → dev environment (~8 min)
    ↓ PR
main branch → staging environment (~10 min)
    ↓ Tag v*.*.*
production environment (~12 min)

Artifact Promotion:

  • Build once in dev
  • Same immutable Docker image promoted to staging/prod
  • What you test is what you deploy

See MULTI_ENV.md for environment separation patterns.

Note: Deployment verification applies Progressive Evidence Strengthening (CLAUDE.md Principle #2). We verify from weak evidence (exit codes) to strong evidence (actual traffic metrics).

Pattern 3: Deployment Validation

Multi-Layer Verification (Deployment Application):

  1. Status Code (weakest signal)

    aws lambda invoke --function-name worker --payload '{}' /tmp/response.json
    # Exit code 0 only means "invocation succeeded", not "function worked"
    
  2. Response Payload (stronger signal)

    if grep -q "errorMessage" /tmp/response.json; then
      echo "❌ Lambda returned error"
      exit 1
    fi
    
  3. CloudWatch Logs (strongest signal)

    ERROR_COUNT=$(aws logs filter-log-events \
      --log-group-name /aws/lambda/worker \
      --filter-pattern "ERROR" \
      --query 'length(events)' --output text)
    
    if [ "$ERROR_COUNT" -gt 0 ]; then
      echo "❌ Found errors in logs"
      exit 1
    fi
    

Principle: AWS services returning 200 OK doesn't guarantee error-free execution. Always validate logs.

See MONITORING.md for comprehensive validation patterns.

Pattern 4: Secret Management by Consumer

Doppler (Runtime Secrets)

  • Who needs it: Lambda functions during execution
  • Examples:
    AURORA_HOST
    ,
    OPENROUTER_API_KEY
  • Access: Doppler → Terraform → Lambda env vars

GitHub Secrets (Deployment Secrets)

  • Who needs it: CI/CD pipeline during deployment
  • Examples:
    CLOUDFRONT_DISTRIBUTION_ID
    ,
    AWS_ACCESS_KEY_ID
  • Access:
    ${{ secrets.SECRET_NAME }}
    in workflows

The Deciding Question: "Does the Lambda function running in production need this value?"

  • YES → Doppler (runtime secret)
  • NO → GitHub Secrets (deployment secret)

See MULTI_ENV.md#secret-management for detailed patterns.


Pre-Deployment Validation

Principle: Validate configuration BEFORE deployment, not during.

Validation Script

# Run before every deployment
scripts/validate_deployment_ready.sh

# Checks:
# 1. Doppler configuration exists
# 2. Required environment variables set
# 3. AWS resources exist (S3 buckets, DynamoDB tables)
# 4. Lambda function dependencies available

Why This Matters:

  • Distinguishes config failures from logic failures
  • Fails fast (30 seconds vs 8 minute deploy)
  • Prevents partial deployments

Infrastructure-Deployment Contract Validation

Pattern: Query AWS infrastructure, validate secrets match reality.

jobs:
  validate-deployment-config:
    runs-on: ubuntu-latest
    steps:
      - name: Validate CloudFront Distributions
        run: |
          # Query actual infrastructure
          ACTUAL=$(aws cloudfront list-distributions \
            --query 'DistributionList.Items[?Comment==`app-dev`].Id' \
            --output text)

          # Compare to GitHub secret
          if [ "$ACTUAL" != "${{ secrets.CLOUDFRONT_DISTRIBUTION_ID }}" ]; then
            echo "❌ Secret mismatch detected"
            exit 1
          fi

  build:
    needs: validate-deployment-config  # Won't run if validation fails

Benefits:

  • Catches configuration drift automatically
  • No manual checklist needed
  • Single source of truth (AWS infrastructure is reality)

See MONITORING.md#infrastructure-validation.


Common Deployment Scenarios

Scenario 1: Deploy Code Change to Dev

# 1. Commit to dev branch
git add .
git commit -m "feat: Add new feature"
git push origin dev

# 2. GitHub Actions automatically:
#    - Builds Docker image
#    - Pushes to ECR
#    - Updates Lambda function code
#    - Waits for function update (no sleep!)
#    - Runs smoke tests
#    - Validates CloudWatch logs

# 3. Manual verification (optional)
just test-dev-api  # Test deployed function

Time: ~8 minutes

Scenario 2: Promote Dev → Staging

# 1. Create PR from dev → main
gh pr create --base main --head dev --title "Release: v1.2.0"

# 2. Review and merge
gh pr merge --squash

# 3. GitHub Actions automatically:
#    - Uses SAME Docker image from dev (artifact promotion)
#    - Updates staging Lambda with promoted image
#    - Runs integration tests
#    - Validates staging environment

Time: ~10 minutes (faster because no rebuild)

Scenario 3: Deploy to Production

# 1. Tag release on main branch
git tag v1.2.0
git push origin v1.2.0

# 2. GitHub Actions automatically:
#    - Uses SAME Docker image from staging
#    - Publishes new Lambda version (immutable)
#    - Updates 'live' alias to new version (zero-downtime)
#    - Runs smoke tests
#    - Validates production logs

Time: ~12 minutes

Scenario 4: Rollback Production

# Find previous version
aws lambda list-versions-by-function \
  --function-name worker \
  --query 'Versions[-2].Version'  # Previous version

# Update alias to previous version (instant rollback)
aws lambda update-alias \
  --function-name worker \
  --name live \
  --function-version 42  # Previous working version

# Verify rollback
aws lambda get-alias --function-name worker --name live

Time: < 30 seconds (instant)


Deployment Principles

From CLAUDE.md global instructions:

"Deployment Philosophy: Serverless AWS Lambda with immutable container images, zero-downtime promotion via versioning."

Core Principles

  1. Immutability: Build once, promote same artifact through all environments
  2. Zero-Downtime: Version + Alias pattern for atomic updates
  3. Fail Fast: Validate before deployment, not during
  4. Multi-Layer Verification: Status code + Payload + Logs
  5. Artifact Promotion: Dev image → Staging image → Prod image (same hash)
  6. Secret Separation: Runtime secrets (Doppler) vs Deployment secrets (GitHub)

When NOT to Deploy

  • ❌ Tests failing (run
    just test-deploy
    first)
  • ❌ Secrets not configured (run validation script)
  • ❌ Infrastructure not created (run
    terraform apply
    first)
  • ❌ No PR review for staging/prod (require approval)

AWS CLI Waiter Pattern

DO:

aws lambda update-function-code --function-name worker --image-uri $IMAGE
aws lambda wait function-updated --function-name worker  # Blocks until ready

DON'T:

aws lambda update-function-code --function-name worker --image-uri $IMAGE
sleep 30  # ❌ Arbitrary delay, might be too short or too long

Why Waiters:

  • Precise timing (returns immediately when ready)
  • Handles variability (cold start vs warm container)
  • Prevents race conditions

File Organization

.claude/skills/deployment/
├── SKILL.md                       # This file (entry point)
├── ZERO_DOWNTIME.md               # Lambda versioning patterns
├── MULTI_ENV.md                   # Environment strategy
├── MONITORING.md                  # Validation and monitoring
└── scripts/
    └── validate_deployment_ready.sh  # Pre-deployment validation

Next Steps

  • For zero-downtime patterns: See ZERO_DOWNTIME.md
  • For multi-environment setup: See MULTI_ENV.md
  • For deployment monitoring: See MONITORING.md
  • For complete runbook: See
    docs/deployment/TELEGRAM_DEPLOYMENT_RUNBOOK.md

References