Claude-skill-registry cloud-deploy-blueprint
End-to-end cloud deployment skill for Kubernetes (AKS/GKE/DOKS) with CI/CD pipelines. Covers managed services integration (Neon, Upstash), ingress configuration, SSL certificates, GitHub Actions workflows with selective builds, and Next.js build-time vs runtime environment handling. Battle-tested from 9-hour deployment session.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/cloud-deploy-blueprint" ~/.claude/skills/majiayu000-claude-skill-registry-cloud-deploy-blueprint && rm -rf "$T"
skills/data/cloud-deploy-blueprint/SKILL.mdCloud Deploy Blueprint
Overview
This skill captures the complete knowledge for deploying a multi-service application to cloud Kubernetes, based on battle-tested learnings from deploying TaskFlow (5 microservices) to Azure AKS.
When to Use
- Deploying to AKS, GKE, or DOKS
- Setting up CI/CD with GitHub Actions
- Integrating managed services (Neon PostgreSQL, Upstash Redis)
- Configuring ingress with SSL certificates
- Handling Next.js
variables in Docker/K8sNEXT_PUBLIC_*
Architecture Pattern
INTERNET │ ▼ ┌─────────────────┐ │ Load Balancer │ (Single Public IP) └────────┬────────┘ │ ┌────────▼────────┐ │ Ingress (Traefik│ Routes by subdomain │ or nginx) │ └────────┬────────┘ │ ┌────────────────────┼────────────────────┐ │ │ │ ▼ ▼ ▼ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ Web │ │ SSO │ │ MCP │ │ (PUBLIC) │ │ (PUBLIC) │ │ (PUBLIC) │ └────┬─────┘ └────┬─────┘ └────┬─────┘ │ │ │ │ ┌────▼─────┐ │ └──────────────► API ◄─────────────┘ │(INTERNAL)│ └────┬─────┘ │ ┌────────────┴────────────┐ ▼ ▼ ┌─────────────┐ ┌─────────────┐ │ Neon │ │ Upstash │ │ (Postgres) │ │ (Redis) │ │ EXTERNAL │ │ EXTERNAL │ └─────────────┘ └─────────────┘
Critical Concept: Build-Time vs Runtime Variables
The Problem
Next.js
NEXT_PUBLIC_* variables are embedded at build time, not runtime. This means:
# WRONG: Setting NEXT_PUBLIC_* at runtime does NOTHING ENV NEXT_PUBLIC_API_URL=https://api.example.com # RIGHT: Must be set as build ARG ARG NEXT_PUBLIC_API_URL=https://api.example.com ENV NEXT_PUBLIC_API_URL=$NEXT_PUBLIC_API_URL
The Solution
- In Dockerfile: Use ARG for NEXT_PUBLIC_* variables
- In CI/CD: Pass --build-arg with domain-specific values
- In values.yaml: These are NOT runtime configurable
Build-Time Variables (Next.js)
| Service | Variable | Purpose |
|---|---|---|
| Web | | SSO endpoint for browser OAuth |
| Web | | API endpoint for browser fetch |
| Web | | App URL for redirects |
| SSO | | Better Auth URL for browser |
| SSO | | Redirect after email verify |
Runtime Variables (ConfigMaps/Secrets)
| Service | Variable | Source |
|---|---|---|
| SSO | | Secret (Neon) |
| SSO | | Secret |
| API | | ConfigMap (internal K8s URL) |
| MCP | | ConfigMap (internal K8s URL) |
Internal K8s Service Names
Services communicate via K8s service names, NOT public URLs:
# CORRECT - Internal communication SSO_URL: http://sso-platform:3001 API_URL: http://taskflow-api:8000 # WRONG - Don't use public URLs for internal traffic SSO_URL: https://sso.example.com
GitHub Actions CI/CD Pattern
Selective Builds with Path Filters
jobs: changes: runs-on: ubuntu-latest outputs: api: ${{ steps.filter.outputs.api }} web: ${{ steps.filter.outputs.web }} steps: - uses: dorny/paths-filter@v3 id: filter with: filters: | api: - 'apps/api/**' web: - 'apps/web/**' build-api: needs: changes if: needs.changes.outputs.api == 'true' || github.event_name == 'workflow_dispatch'
Next.js Build Args Pattern
- name: Build and push (web) uses: docker/build-push-action@v5 with: build-args: | NEXT_PUBLIC_SSO_URL=https://sso.${{ vars.DOMAIN }} NEXT_PUBLIC_API_URL=https://api.${{ vars.DOMAIN }} NEXT_PUBLIC_APP_URL=https://${{ vars.DOMAIN }}
GitHub Secrets & Variables
Secrets (Sensitive)
NEON_SSO_DATABASE_URL NEON_API_DATABASE_URL NEON_CHATKIT_DATABASE_URL NEON_NOTIFICATION_DATABASE_URL UPSTASH_REDIS_HOST UPSTASH_REDIS_PASSWORD REDIS_URL REDIS_TOKEN BETTER_AUTH_SECRET OPENAI_API_KEY SMTP_USER SMTP_PASSWORD AZURE_CREDENTIALS (or GCP_CREDENTIALS)
Variables (Non-sensitive)
DOMAIN=example.com CLOUD_PROVIDER=azure AZURE_RESOURCE_GROUP=myapp-rg AZURE_CLUSTER_NAME=myapp-cluster INGRESS_CLASS=traefik
Helm Values Pattern
values-cloud.yaml (Committed, Non-sensitive defaults)
global: domain: "" # Set via --set namespace: taskflow imagePullPolicy: Always managedServices: neon: enabled: true # Connection strings injected via --set from secrets upstash: enabled: true # Credentials injected via --set from secrets sso: enabled: true name: sso-platform postgresql: enabled: false # Using Neon env: NODE_ENV: production BETTER_AUTH_URL: "" # Set via --set
Helm --set Pattern
helm upgrade --install taskflow ./infrastructure/helm/taskflow \ --values values-cloud.yaml \ --set global.imageRegistry="ghcr.io/owner/repo" \ --set global.imageTag="${{ github.sha }}" \ --set "managedServices.neon.ssoDatabase=${{ secrets.NEON_SSO_DATABASE_URL }}" \ --set "sso.env.BETTER_AUTH_SECRET=${{ secrets.BETTER_AUTH_SECRET }}"
CRITICAL: CPU Architecture Check
BEFORE ANY DEPLOYMENT, check your cluster's node architecture:
kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.architecture}'
→ Useamd64platforms: linux/amd64
→ Usearm64platforms: linux/arm64
ARM64 is increasingly common (Azure, AWS Graviton, Apple Silicon dev). Don't assume amd64!
Docker Build for Correct Architecture
- uses: docker/build-push-action@v5 with: platforms: linux/arm64 # MATCH YOUR CLUSTER! provenance: false # Avoid manifest issues no-cache: true # When debugging
Why
?
Buildx attestation creates complex manifest lists that can cause "no match for platform" errors. Disable for simple, reliable images.provenance: false
Common Gotchas (Battle-Tested)
1. Logout Redirect to 0.0.0.0
Problem:
request.url in K8s returns container bind address
Solution: Use NEXT_PUBLIC_APP_URL env var for redirects
// WRONG const response = NextResponse.redirect(new URL("/", request.url)); // RIGHT const APP_URL = process.env.NEXT_PUBLIC_APP_URL || "http://localhost:3000"; const response = NextResponse.redirect(new URL("/", APP_URL));
2. Email Verification Redirect to localhost
Problem: Missing
NEXT_PUBLIC_CONTINUE_URL in SSO Dockerfile
Solution: Add to Dockerfile and CD pipeline:
ARG NEXT_PUBLIC_CONTINUE_URL=http://localhost:3000 ENV NEXT_PUBLIC_CONTINUE_URL=$NEXT_PUBLIC_CONTINUE_URL
3. Browser Making Requests to localhost
Problem:
NEXT_PUBLIC_* not passed as build arg
Solution: Check ALL NEXT_PUBLIC_* variables systematically:
grep -r "NEXT_PUBLIC_" apps/web/src --include="*.ts" --include="*.tsx" | \ grep -oE "NEXT_PUBLIC_[A-Z_]+" | sort -u
4. Hardcoded Sensitive Data
Problem: Email/passwords hardcoded in values files Solution: Use
--set from GitHub Secrets for ALL sensitive data
5. Missing Database Sections in values.yaml
Problem: Helm templates expect
database.host, postgresql.name etc.
Solution: Include empty/default sections even for managed services:
postgresql: enabled: false name: sso-platform-postgres database: host: "" port: "5432" name: taskflow_sso user: postgres
SSL Certificate Pattern (cert-manager)
apiVersion: cert-manager.io/v1 kind: ClusterIssuer metadata: name: letsencrypt-prod spec: acme: server: https://acme-v02.api.letsencrypt.org/directory email: your-email@example.com privateKeySecretRef: name: letsencrypt-prod solvers: - http01: ingress: class: traefik
Ingress Annotations for TLS
annotations: cert-manager.io/cluster-issuer: letsencrypt-prod traefik.ingress.kubernetes.io/router.tls: "true"
Pre-Deployment Checklist
Code Changes
- All
vars documented and in DockerfilesNEXT_PUBLIC_* - Redirect URLs use env vars, not
request.url - No hardcoded localhost in production code paths
Dockerfiles
- All
as ARG and ENVNEXT_PUBLIC_* - Multi-stage build for slim production image
- Health check endpoint configured
CI/CD Pipeline
- Build args for Next.js apps
- Path filters for selective builds
- All secrets listed and documented
- Helm --set for all sensitive values
Helm Chart
- values-cloud.yaml has all required sections
- No sensitive data in committed files
- Internal service names for inter-service communication
- Ingress configured with correct class
GitHub Setup
- All secrets created in repository settings
- All variables created in repository settings
- Azure/GCP credentials configured
Related Skills
- Debug ImagePullBackOff, CrashLoopBackOff, architecture issuesaks-deployment-troubleshooter
- Dockerization patternscontainerize-apps
- Helm chart structurehelm-charts
- K8s fundamentalskubernetes-essentials
- SSO integrationbetter-auth-sso
Related Agents
- Pre-containerization analysisimpact-analyzer-agent