# Marketplace · deploying-cloud-k8s

**Install**

Source · Clone the upstream repo:

```bash
git clone https://github.com/aiskillstore/marketplace
```

Claude Code · Install into `~/.claude/skills/`:

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/aiskillstore/marketplace "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/asmayaseen/deploying-cloud-k8s" ~/.claude/skills/aiskillstore-marketplace-deploying-cloud-k8s && rm -rf "$T"
```

Manifest: `skills/asmayaseen/deploying-cloud-k8s/SKILL.md` (source content follows)
# Deploying Cloud K8s
## Quick Start

- Check cluster architecture:
  ```bash
  kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.architecture}'
  ```
- Match the build platform to the cluster (arm64 vs amd64)
- Set up GitHub Actions with path filters
- Deploy with Helm, passing secrets via `--set`
## Critical: Build-Time vs Runtime Variables

### The Problem

Next.js `NEXT_PUBLIC_*` variables are embedded at build time, not runtime:

```dockerfile
# WRONG: Runtime ENV does nothing for NEXT_PUBLIC_*
ENV NEXT_PUBLIC_API_URL=https://api.example.com

# RIGHT: Must be a build ARG
ARG NEXT_PUBLIC_API_URL=https://api.example.com
ENV NEXT_PUBLIC_API_URL=$NEXT_PUBLIC_API_URL
```
### Build-Time (Next.js)

| Variable | Purpose |
|---|---|
| `NEXT_PUBLIC_SSO_URL` | SSO endpoint for browser OAuth |
| `NEXT_PUBLIC_API_URL` | API endpoint for browser fetch |
| `NEXT_PUBLIC_APP_URL` | App URL for redirects |
### Runtime (ConfigMaps/Secrets)

| Variable | Source |
|---|---|
| `DATABASE_URL` | Secret (Neon/managed DB) |
| Internal service URLs | ConfigMap (internal K8s service names) |
| `BETTER_AUTH_SECRET` | Secret |
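For contrast with the build-time table, a minimal sketch of how runtime values typically reach a container at pod start (the `app-config` and `app-secrets` object names are assumptions):

```yaml
# Deployment fragment: runtime config injected at pod start, not baked into the image
containers:
  - name: api
    envFrom:
      - configMapRef:
          name: app-config        # hypothetical ConfigMap name
    env:
      - name: DATABASE_URL
        valueFrom:
          secretKeyRef:
            name: app-secrets     # hypothetical Secret name
            key: databaseUrl
```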
## Architecture Matching

**Before any deployment**, check the architecture:

```bash
kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.architecture}'
# Output: arm64 arm64  OR  amd64 amd64
```
### Docker Build

```yaml
- uses: docker/build-push-action@v5
  with:
    platforms: linux/arm64   # MATCH YOUR CLUSTER!
    provenance: false        # Avoid manifest issues
    no-cache: true           # When debugging
```

Why `provenance: false`? Buildx attestation creates complex manifest lists that cause "no match for platform" errors.
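To see what a pushed tag's manifest actually contains, `docker buildx imagetools inspect` lists each platform entry (the image name below is a placeholder):

```bash
docker buildx imagetools inspect ghcr.io/org/app:latest
# A plain single-platform build shows one manifest for its platform;
# with provenance enabled you also see "unknown/unknown" attestation entries.
```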
## GitHub Actions CI/CD

### Selective Builds with Path Filters

```yaml
jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      api: ${{ steps.filter.outputs.api }}
      web: ${{ steps.filter.outputs.web }}
    steps:
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            api:
              - 'apps/api/**'
            web:
              - 'apps/web/**'

  build-api:
    needs: changes
    if: needs.changes.outputs.api == 'true'
```
### Next.js Build Args

```yaml
- name: Build and push (web)
  uses: docker/build-push-action@v5
  with:
    build-args: |
      NEXT_PUBLIC_SSO_URL=https://sso.${{ vars.DOMAIN }}
      NEXT_PUBLIC_API_URL=https://api.${{ vars.DOMAIN }}
```
### Helm Deployment

```yaml
- name: Deploy
  run: |
    helm upgrade --install myapp ./helm/myapp \
      --set global.imageTag=${{ github.sha }} \
      --set "secrets.databaseUrl=${{ secrets.DATABASE_URL }}" \
      --set "secrets.authSecret=${{ secrets.BETTER_AUTH_SECRET }}"
```
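On the chart side, those `--set` values typically render into a Secret template; a minimal sketch (the template path and key names are assumptions matching the keys above):

```yaml
# templates/secrets.yaml (hypothetical path)
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secrets
type: Opaque
data:
  databaseUrl: {{ .Values.secrets.databaseUrl | b64enc | quote }}
  authSecret: {{ .Values.secrets.authSecret | b64enc | quote }}
```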
## Troubleshooting Guide

### Quick Diagnosis Flow

```
Pod not running?
│
├─► ImagePullBackOff
│   ├─► "not found" ──► Wrong tag or registry
│   ├─► "unauthorized" ──► Auth/imagePullSecrets
│   └─► "no match for platform" ──► Architecture mismatch
│
├─► CrashLoopBackOff
│   ├─► "exec format error" ──► Wrong CPU architecture
│   ├─► Exit code 1 ──► App startup failure
│   └─► OOMKilled ──► Memory limits too low
│
└─► Pending
    ├─► Insufficient resources ──► Scale cluster
    └─► No matching node ──► Check nodeSelector
```
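For the nodeSelector branch, a minimal sketch using the well-known arch label (the arm64 value assumes an arm64 cluster):

```yaml
# Pod spec fragment: schedule only onto nodes with a matching CPU architecture
spec:
  nodeSelector:
    kubernetes.io/arch: arm64
```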
### Diagnostic Commands

```bash
kubectl get pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace> | grep -E "(Image:|Failed|Error)"
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20
kubectl logs <pod-name> -n <namespace> --tail=50
```
### Error: ImagePullBackOff "not found"

**Causes:**

- Tag doesn't exist (short vs full SHA)
- Wrong registry path
- Builds skipped by path filters

**Fix:** Verify the image was pushed with the exact tag used in the deployment.
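One hedged way to check, assuming a GHCR image path like the examples above (`docker manifest inspect` fails if the tag is absent):

```bash
# Does the exact tag exist in the registry?
docker manifest inspect ghcr.io/org/app:"$GITHUB_SHA" > /dev/null && echo "tag exists"

# Short vs full SHA is a frequent mismatch between build and deploy:
echo "full:  $GITHUB_SHA"
echo "short: ${GITHUB_SHA::7}"
```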
Error: "no match for platform in manifest"
Cause: Image built for wrong architecture OR buildx provenance issue
Fix:
platforms: linux/arm64 # Match cluster! provenance: false # Simple manifest no-cache: true # Force rebuild
Error: "exec format error"
Cause: Binary architecture doesn't match node
Fix: Rebuild with correct platform, use
no-cache: true
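To confirm the mismatch before rebuilding, compare both sides (the image name is a placeholder):

```bash
# What the nodes run
kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.architecture}'

# What the image was built for
docker buildx imagetools inspect ghcr.io/org/app:latest | grep -i platform
```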
### Error: Helm comma parsing

```
failed parsing --set data: key "com" has no value
```

**Cause:** Helm interprets commas in `--set` values as array separators.

**Fix:** Use a heredoc values file:

```yaml
- name: Deploy
  run: |
    cat > /tmp/overrides.yaml << EOF
    sso:
      env:
        ALLOWED_ORIGINS: "https://a.com,https://b.com"
    EOF
    helm upgrade --install app ./chart --values /tmp/overrides.yaml
```
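An alternative that stays on the command line: Helm also accepts backslash-escaped commas in `--set` (the values here are illustrative):

```bash
helm upgrade --install app ./chart \
  --set "sso.env.ALLOWED_ORIGINS=https://a.com\,https://b.com"
```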
### Error: Password authentication failed

**Cause:** Password contains special characters (base64 output can include `+/=`).

**Fix:** Use hex passwords:

```bash
# Wrong
openssl rand -base64 16   # Can contain +/=

# Right
openssl rand -hex 16      # Alphanumeric only
```
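A sketch of wiring that into the cluster, assuming a hypothetical `db-credentials` Secret name (hex output avoids characters that break connection strings):

```bash
kubectl create secret generic db-credentials \
  --from-literal=password="$(openssl rand -hex 16)" \
  -n myapp
```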
### Error: Logout redirects to 0.0.0.0

**Cause:** `request.url` returns the container bind address.

**Fix:**

```typescript
import { NextResponse } from "next/server";

// request.url may be http://0.0.0.0:3000/... inside the container,
// so redirect relative to the public app URL instead
const APP_URL = process.env.NEXT_PUBLIC_APP_URL || "http://localhost:3000";
const response = NextResponse.redirect(new URL("/", APP_URL));
```
## Pre-Deployment Checklist

### Architecture

- [ ] Checked cluster node architecture
- [ ] Build platform matches cluster

### Docker Build

- [ ] `provenance: false` set
- [ ] `platforms: linux/<arch>` matches cluster
- [ ] Image tags consistent between build and deploy

### CI/CD

- [ ] All `NEXT_PUBLIC_*` passed as build args
- [ ] Secrets passed via `--set` (not in values.yaml)
- [ ] Path filters configured

### Helm

- [ ] No commas in `--set` values
- [ ] Internal K8s service names for inter-service communication (see the sketch after this list)
- [ ] Password single source of truth in values.yaml
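For the inter-service item above, a minimal ConfigMap sketch (the service name `api`, namespace `myapp`, and port are assumptions):

```yaml
data:
  # WRONG for pod-to-pod calls: traffic leaves the cluster and re-enters
  # API_URL: https://api.example.com
  # RIGHT: in-cluster service DNS
  API_URL: http://api.myapp.svc.cluster.local:3000
```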
## Production Debugging

### Trace Request Path

```bash
# 1. Frontend logs
kubectl logs deploy/web -n myapp --tail=50

# 2. API logs
kubectl logs deploy/api -n myapp --tail=100 | grep -i error

# 3. Sidecar logs (Dapr, etc.)
kubectl logs deploy/api -n myapp -c daprd --tail=50
```
### Common Bug Patterns

| Error | Likely Cause |
|---|---|
| Validation errors | Model/schema mismatch |
| `404` on internal call | Wrong endpoint URL |
| Times off by hours | Timezone handling bug |
| `MissingGreenlet` | Async SQLAlchemy pattern |
## GitOps with ArgoCD

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  source:
    repoURL: https://github.com/org/repo.git
    path: k8s/overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp
  syncPolicy:
    automated:
      prune: true      # Delete resources not in Git
      selfHeal: true   # Fix drift automatically
```
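Registering the Application is itself a one-off apply (filename hypothetical); after that, ArgoCD syncs from Git on its own:

```bash
kubectl apply -f argocd/application.yaml
```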
## Observability

```yaml
# ServiceMonitor for Prometheus
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  endpoints:
    - port: metrics
      interval: 30s
```
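The ServiceMonitor only works if a Service with matching labels exposes a port named `metrics`; a minimal sketch (the port number is an assumption):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
  labels:
    app: myapp        # must match the ServiceMonitor's matchLabels
spec:
  selector:
    app: myapp
  ports:
    - name: metrics   # the port name the ServiceMonitor endpoint references
      port: 9090
      targetPort: 9090
```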
## Security

```yaml
# Pod Security Context
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]
```
## Resilience

```yaml
# HPA + PDB
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
spec:
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
---
apiVersion: policy/v1
kind: PodDisruptionBudget
spec:
  minAvailable: 1
```
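One caveat: a CPU `Utilization` target is computed against the container's requests, so the HPA does nothing useful without them. A minimal sketch (the numbers are placeholders):

```yaml
# Container resources fragment: requests are the denominator for HPA utilization
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    memory: 512Mi
```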
See references/production-patterns.md for full GitOps, observability, security, and resilience patterns.
## Verification

Run:

```bash
python scripts/verify.py
```
## Related Skills

- `containerizing-applications` - Docker and Helm charts
- `operating-k8s-local` - Local Kubernetes with Minikube
- `building-nextjs-apps` - Next.js patterns
## References
- references/production-patterns.md - GitOps, ArgoCD, Prometheus, RBAC, HPA, PDB