Claude-skill-registry delivery-and-infra
Use when designing cloud infrastructure, CI/CD pipelines, or deployment strategies
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/delivery-and-infra" ~/.claude/skills/majiayu000-claude-skill-registry-delivery-and-infra && rm -rf "$T"
manifest:
skills/data/delivery-and-infra/SKILL.mdsource content
Delivery & Infrastructure
Guidelines for cloud architecture, CI/CD, and deployment engineering.
When to Use
- Designing cloud infrastructure (AWS/Azure/GCP)
- Setting up CI/CD pipelines
- Container orchestration (Kubernetes)
- Infrastructure as Code (Terraform)
- Deployment strategy selection
Core Principles
- Automate Everything - No manual deployment steps
- Infrastructure as Code - Version-controlled, reproducible
- Build Once, Deploy Anywhere - Immutable artifacts
- Fast Feedback - Fail fast, fix fast
- Security by Design - Embedded throughout lifecycle
- GitOps as Truth - Git is single source of truth
- Zero-Downtime - All deployments without user impact
CI/CD Pipeline
Standard Stages
Lint → Test → Security Scan → Build → Deploy (staging) → Deploy (prod)
Pipeline Checklist
- Parallel execution where possible
- Caching for faster builds
- Environment-based deployment gates
- Automated rollback on failure
- Security scanning (SAST/DAST)
- Artifact versioning and signing
GitHub Actions Example
name: CI/CD on: push: branches: [main] pull_request: branches: [main] jobs: lint: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - run: npm run lint test: needs: lint runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - run: npm ci - run: npm test -- --coverage build: needs: test runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - uses: docker/build-push-action@v5 with: push: true tags: ${{ env.IMAGE }}:${{ github.sha }}
Container Best Practices
Dockerfile Checklist
- Multi-stage builds for minimal size
- Non-root user for runtime
- No secrets in image layers
- Minimal base image (alpine, distroless)
- Health checks defined
- Resource limits configured
Example Dockerfile
FROM node:20-alpine AS builder WORKDIR /app COPY package*.json ./ RUN npm ci --only=production COPY . . RUN npm run build FROM node:20-alpine RUN adduser -S app -u 1001 WORKDIR /app COPY --from=builder --chown=app /app/dist ./dist COPY --from=builder --chown=app /app/node_modules ./node_modules USER app EXPOSE 3000 HEALTHCHECK CMD node healthcheck.js CMD ["node", "dist/main.js"]
Deployment Strategies
| Strategy | Use Case | Rollback |
|---|---|---|
| Rolling | Default, gradual | |
| Blue-Green | Quick switch, validation period | Switch service back |
| Canary | Risk mitigation, A/B testing | Route 100% to stable |
Kubernetes Essentials
Production Deployment Checklist
- Resource requests and limits defined
- Liveness and readiness probes
- Security context (non-root, read-only)
- Pod anti-affinity for HA
- HPA for auto-scaling
- PDB for availability during updates
- Secrets from external manager
Key Resources
# Deployment with probes livenessProbe: httpGet: path: /health port: 3000 initialDelaySeconds: 30 readinessProbe: httpGet: path: /ready port: 3000 initialDelaySeconds: 10
Infrastructure as Code
Terraform Best Practices
- Use modules for reusable components
- Remote state with locking
- Input validation on variables
- Semantic versioning for modules
- Consistent naming conventions
- Document all outputs
Module Structure
modules/ vpc/ main.tf variables.tf outputs.tf
Cloud Architecture
Design Principles
- Design for Failure - Self-healing mechanisms
- Security by Design - Least privilege, encryption
- Cost-Conscious - Right-size, savings plans
- Managed Services - Reduce operational overhead
HA/DR Patterns
- Multi-AZ for availability
- Multi-region for disaster recovery
- Cross-region backup replication
- DNS-based failover
Cost Optimization
- Right-sizing based on usage
- Reserved instances/savings plans
- Lifecycle policies for storage
- Auto-scaling to match demand
- Regular unused resource cleanup
Security Checklist
- VPC with private subnets
- Security groups (least privilege)
- IAM roles (minimal permissions)
- Encryption at rest and in transit
- Secrets in dedicated manager
- Audit logging enabled
- DDoS protection configured
Observability
Metrics to Track
- RED: Rate, Errors, Duration
- USE: Utilization, Saturation, Errors
Key Dashboards
- Application performance (latency, errors)
- Infrastructure health (CPU, memory, disk)
- Business metrics (transactions, users)
Alerting Rules
- alert: HighErrorRate expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.05 for: 5m annotations: summary: "Error rate > 5%"
Runbook Template
## Deployment Runbook ### Prerequisites - kubectl access, valid kubeconfig - Registry credentials - Monitoring access ### Deploy 1. Merge to main → pipeline triggers 2. Monitor pipeline in CI 3. Verify in monitoring dashboard ### Rollback kubectl rollout undo deployment/app -n production ### Troubleshooting - Pods not starting: `kubectl logs -f deployment/app` - High errors: Check app logs and metrics