Claude-skill-registry cloud-infrastructure
Cloud platforms (AWS, Cloudflare, GCP, Azure), containerization (Docker), Kubernetes, Infrastructure as Code (Terraform), CI/CD, and observability.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/cloud-infrastructure" ~/.claude/skills/majiayu000-claude-skill-registry-cloud-infrastructure && rm -rf "$T"
manifest:
skills/data/cloud-infrastructure/SKILL.md
Cloud Infrastructure Skill
Quick Reference
| Platform | Market share (approx.) | Best for | Time to learn |
|---|---|---|---|
| AWS | 32% | General-purpose workloads | 3-6 months |
| Azure | 24% | Microsoft stack | 3-6 months |
| GCP | 11% | Data, ML | 3-6 months |
| Cloudflare | n/a (edge platform) | CDN, Workers | 2-4 weeks |
Learning Paths
AWS
    [1] IAM + VPC (1-2 wk)
     │  └─ Roles, policies, networking
     │
     ▼
    [2] Compute: EC2, Lambda (2-3 wk)
     │
     ▼
    [3] Storage: S3, EBS (1-2 wk)
     │
     ▼
    [4] Database: RDS, DynamoDB (2-3 wk)
     │
     ▼
    [5] Containers: ECS, EKS (3-4 wk)
     │
     ▼
    [6] Monitoring: CloudWatch (1-2 wk)
Docker & Containers
    [1] Docker Basics (1 wk)
     │  └─ Images, containers, Dockerfile
     │
     ▼
    [2] Multi-stage Builds (1 wk)
     │  └─ Optimization, layer caching
     │
     ▼
    [3] Docker Compose (1 wk)
     │  └─ Multi-container apps
     │
     ▼
    [4] Registry & Security (1 wk)
        └─ Push/pull, scanning, non-root
Kubernetes
    [1] Pods & Deployments (2 wk)
     │
     ▼
    [2] Services & Networking (1-2 wk)
     │
     ▼
    [3] ConfigMaps & Secrets (1 wk)
     │
     ▼
    [4] Helm Charts (2 wk)
     │
     ▼
    [5] Production Patterns (ongoing)
        └─ HPA, PDB, resource limits
Terraform (IaC)
    [1] Resources & State (1 wk)
     │
     ▼
    [2] Variables & Outputs (1 wk)
     │
     ▼
    [3] Modules (1-2 wk)
     │
     ▼
    [4] Remote State (1 wk)
     │
     ▼
    [5] Workspaces & Environments (1 wk)
Kubernetes Quick Reference
| Resource | Purpose | Example |
|---|---|---|
| Pod | Smallest unit | Single container |
| Deployment | Manage replicas | Web app |
| Service | Network access | ClusterIP, LoadBalancer |
| Ingress | HTTP routing | Path-based routing |
| ConfigMap | Configuration | Environment variables |
| Secret | Sensitive data | Credentials |
| StatefulSet | Stateful apps | Databases |
Terraform Structure
    project/
    ├── main.tf           # Resources
    ├── variables.tf      # Inputs
    ├── outputs.tf        # Outputs
    ├── providers.tf      # Provider config
    ├── versions.tf       # Version constraints
    ├── modules/
    │   ├── vpc/
    │   ├── eks/
    │   └── rds/
    └── environments/
        ├── dev.tfvars
        ├── staging.tfvars
        └── prod.tfvars
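One way to use this layout is to select an environment by passing its .tfvars file to Terraform. The sketch below only builds the command lines and does not run them. `plan_command` and `apply_command` are hypothetical helper names, and the plan file name `<env>.tfplan` is an assumption made for illustration.

```python
from pathlib import PurePosixPath

# Environment names taken from the environments/ directory in the tree above.
ENVIRONMENTS = ("dev", "staging", "prod")


def plan_command(env: str) -> list[str]:
    """Build a `terraform plan` command for one environment (hypothetical helper)."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    var_file = PurePosixPath("environments") / f"{env}.tfvars"
    # -var-file selects the environment's inputs; -out saves the plan so that
    # apply runs exactly what was reviewed.
    return ["terraform", "plan", f"-var-file={var_file}", f"-out={env}.tfplan"]


def apply_command(env: str) -> list[str]:
    """Build a `terraform apply` command that applies the saved plan file."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return ["terraform", "apply", f"{env}.tfplan"]


if __name__ == "__main__":
    print(" ".join(plan_command("dev")))
    print(" ".join(apply_command("dev")))
```

Applying a saved plan file keeps the reviewed plan and the applied change identical, which fits the "Plan before apply" practice listed under Best Practices.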
CI/CD Pipeline Template
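    # GitHub Actions
    name: CI/CD
    on:
      push:
        branches: [main]
    jobs:
      build-test-deploy:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - name: Build
            # Tag with the registry path so the Push step can find the image
            run: docker build -t registry/app:${{ github.sha }} .
          - name: Test
            run: docker run --rm registry/app:${{ github.sha }} pytest
          - name: Push
            # Assumes an earlier registry login step (not shown)
            run: docker push registry/app:${{ github.sha }}
          - name: Deploy
            if: github.ref == 'refs/heads/main'
            # Assumes kubectl is already configured for the target cluster
            run: kubectl set image deployment/app app=registry/app:${{ github.sha }}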
Monitoring Stack
    ┌─────────────────────────────────────────┐
    │           OBSERVABILITY STACK           │
    ├─────────────────────────────────────────┤
    │ Metrics:  Prometheus → Grafana          │
    │ Logs:     Loki / ELK                    │
    │ Traces:   Jaeger / Tempo                │
    │ Alerts:   Alertmanager → PagerDuty      │
    └─────────────────────────────────────────┘
Troubleshooting
    Container not starting?
    ├─► docker logs <container>
    ├─► Check port conflicts
    ├─► Check image name/tag
    └─► Check resource limits

    Pod in CrashLoopBackOff?
    ├─► kubectl describe pod <name>
    ├─► kubectl logs <pod>
    ├─► Check resource limits
    ├─► Check probes configuration
    └─► Check image pull secrets

    Terraform apply fails?
    ├─► terraform plan first
    ├─► Check state lock
    ├─► terraform import existing
    └─► Restore state from backup

    High cloud bill?
    ├─► Enable cost alerts
    ├─► Right-size instances
    ├─► Use spot instances
    ├─► Delete unused resources
    └─► Storage lifecycle policies
Common Failure Modes
| Symptom | Root cause | Recovery |
|---|---|---|
| Pod CrashLoopBackOff | App error or out-of-memory (OOM) kill | Check logs; raise memory limits if OOM-killed |
| ImagePullBackOff | Wrong image name/tag or registry auth failure | Verify image reference, check image pull secrets |
| Terraform drift | Manual changes made outside Terraform | Run terraform plan to see the drift, then import the resource or re-apply |
| Slow deploys | Large images | Multi-stage builds, layer caching |
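The two pod rows of this table can be turned into a small triage helper. This is a sketch, not a full diagnostic tool: `triage_pod` and the `HINTS` mapping are hypothetical names, and the input is assumed to be the parsed output of `kubectl get pod <name> -o json`.

```python
import json

# Recovery hints based on the table above, keyed by the container's waiting reason.
HINTS = {
    "CrashLoopBackOff": "check logs, review resource limits",
    "ImagePullBackOff": "verify image reference, check image pull secrets",
}


def triage_pod(pod: dict) -> list[str]:
    """Return one human-readable finding per problem found in a pod object."""
    findings = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        name = cs.get("name", "<unknown>")
        waiting = cs.get("state", {}).get("waiting") or {}
        reason = waiting.get("reason")
        if reason in HINTS:
            findings.append(f"{name}: {reason} -> {HINTS[reason]}")
        # The previous run's exit reason tells an OOM kill apart from an app error.
        last = cs.get("lastState", {}).get("terminated") or {}
        if last.get("reason") == "OOMKilled":
            findings.append(f"{name}: last run was OOMKilled -> raise the memory limit")
    return findings


if __name__ == "__main__":
    # Hand-written sample shaped like `kubectl get pod <name> -o json` output.
    sample = json.loads("""
    {"status": {"containerStatuses": [{
        "name": "app",
        "state": {"waiting": {"reason": "CrashLoopBackOff"}},
        "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
    }]}}
    """)
    for line in triage_pod(sample):
        print(line)
```

Checking `lastState.terminated.reason` matters because a pod in CrashLoopBackOff reports only the back-off in its current state. The cause of the previous crash is recorded on the last terminated run.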
Best Practices
Docker
- Use multi-stage builds
- Run as non-root user
- Use .dockerignore
- Pin base image versions
- Scan for vulnerabilities
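Three of these practices (multi-stage builds, non-root user, pinned base images) can be checked mechanically. The sketch below is a deliberately simple line-based check, not a real Dockerfile parser: `lint_dockerfile` is a hypothetical helper, it ignores line continuations and ARG substitution, and it cannot see a USER set inside the base image itself.

```python
def lint_dockerfile(text: str) -> list[str]:
    """Return warnings for a Dockerfile, covering three of the practices listed above."""
    images = []        # image reference of every FROM line, in order
    aliases = set()    # stage names introduced with "FROM ... AS <name>"
    final_user = None  # last USER instruction seen in the final stage

    for raw in text.splitlines():
        parts = raw.strip().split()
        if not parts or parts[0].startswith("#"):
            continue
        instruction = parts[0].upper()
        # Drop flags such as --platform=... so args[0] is the image reference.
        args = [p for p in parts[1:] if not p.startswith("--")]
        if instruction == "FROM" and args:
            images.append(args[0])
            if len(args) >= 3 and args[1].upper() == "AS":
                aliases.add(args[2])
            final_user = None  # USER does not carry over from a previous stage
        elif instruction == "USER" and args:
            final_user = args[0]

    warnings = []
    if len(images) < 2:
        warnings.append("single-stage build: consider a multi-stage build")
    for image in images:
        if image == "scratch" or image in aliases or "@" in image:
            continue  # empty base, earlier stage, or digest-pinned image
        name = image.rsplit("/", 1)[-1]  # ignore a registry host:port prefix
        tag = name.split(":", 1)[1] if ":" in name else ""
        if tag in ("", "latest"):
            warnings.append(f"unpinned base image: {image}")
    if final_user is None or final_user.split(":")[0] in ("root", "0"):
        warnings.append("final stage runs as root: add a non-root USER")
    return warnings
```

Calling `lint_dockerfile(open("Dockerfile").read())` returns an empty list when all three checks pass. Vulnerability scanning and .dockerignore are outside what this sketch covers.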
Kubernetes
- Set resource requests/limits
- Use readiness/liveness probes
- Store config in ConfigMaps
- Use namespaces for isolation
- Enable network policies
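As an illustration of the first two practices, the sketch below builds a Deployment manifest with resource requests/limits and both probes, then prints it as JSON, which kubectl accepts alongside YAML. `deployment_manifest` is a hypothetical helper. The `/healthz` path, port 8080, and the CPU/memory figures are placeholder assumptions to be replaced with values measured for the actual workload.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int = 2, port: int = 8080) -> dict:
    """Build an apps/v1 Deployment with resource requests/limits and both probes."""
    labels = {"app": name}
    container = {
        "name": name,
        "image": image,
        "ports": [{"containerPort": port}],
        # Requests drive scheduling; limits cap what the container may use.
        # The numbers here are placeholders, not recommendations.
        "resources": {
            "requests": {"cpu": "100m", "memory": "128Mi"},
            "limits": {"cpu": "500m", "memory": "256Mi"},
        },
        # Readiness gates traffic; liveness restarts a hung container.
        # The /healthz endpoint is assumed to exist in the application.
        "readinessProbe": {
            "httpGet": {"path": "/healthz", "port": port},
            "initialDelaySeconds": 5,
            "periodSeconds": 10,
        },
        "livenessProbe": {
            "httpGet": {"path": "/healthz", "port": port},
            "initialDelaySeconds": 15,
            "periodSeconds": 20,
        },
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    # Save the output to a file and apply it with: kubectl apply -f deployment.json
    print(json.dumps(deployment_manifest("web", "registry/app:1.0.0"), indent=2))
```

Giving the liveness probe a longer initial delay than the readiness probe is one common choice, so that a slow start delays traffic rather than triggering a restart.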
Terraform
- Use remote state (S3, GCS)
- Lock state file
- Use modules for reuse
- Plan before apply
- Tag all resources
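"Plan before apply" and "Tag all resources" can be combined into a check that runs between the two steps. The sketch below inspects the JSON form of a saved plan and reports resources that lack required tags. `untagged_resources` is a hypothetical helper, the tag keys `Environment` and `Owner` are example assumptions, and only resources exposing a `tags` attribute (as AWS resources typically do) are checked.

```python
def untagged_resources(
    plan: dict, required: tuple[str, ...] = ("Environment", "Owner")
) -> dict[str, list[str]]:
    """Map resource address -> missing tag keys for resources the plan creates or updates."""
    missing = {}
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        after = change.get("after")
        # Skip deletions and no-ops: only created or updated resources need checking.
        if after is None or not {"create", "update"} & set(change.get("actions", [])):
            continue
        # Assumption: taggable resources expose a "tags" map; resource types
        # without that attribute are skipped rather than reported.
        if "tags" not in after:
            continue
        tags = after["tags"] or {}
        absent = [key for key in required if key not in tags]
        if absent:
            missing[rc["address"]] = absent
    return missing
```

To produce the input, save a plan with `terraform plan -out=tfplan`, convert it with `terraform show -json tfplan > plan.json`, and pass the result of `json.load` to the function. A non-empty result can then fail the pipeline before `terraform apply` runs.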
Next Actions
Specify your cloud platform and focus area for detailed guidance.