Claude-code-plugins-plus-skills flyio-prod-checklist
install
source · Clone the upstream repo
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/saas-packs/flyio-pack/skills/flyio-prod-checklist" ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-skills-flyio-prod-checklist && rm -rf "$T"
manifest: plugins/saas-packs/flyio-pack/skills/flyio-prod-checklist/SKILL.md
Fly.io Production Checklist
Overview
Fly.io runs applications on edge infrastructure across 30+ regions with Machines, Volumes, and managed Postgres. A production deployment requires multi-region redundancy, proper secret management, health checks, and rollback procedures. Misconfigured auto-scaling leads to cold starts, and missing volume backups risk data loss. Work through this checklist to confirm your Fly.io app is production-hardened.
Authentication & Secrets
- `FLY_API_TOKEN` stored in CI secrets (never in fly.toml or source)
- All app secrets set via `fly secrets` (not `[env]` block)
- Deploy tokens scoped per app (not org-wide personal tokens)
- Key rotation scheduled (quarterly, or after team changes)
- No hardcoded secrets in Dockerfile or codebase
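The rule that secrets belong in `fly secrets` rather than the `[env]` block can be checked mechanically. The following is a minimal sketch, not an official tool: `findSuspiciousEnvKeys` and `SECRET_NAME_PATTERN` are hypothetical names, the name patterns are an assumption about what "looks like a secret", and the line-based scan only handles simple `KEY = "value"` entries rather than full TOML.

```typescript
// Hypothetical helper: flags keys in the [env] block of a fly.toml whose
// names look like secrets. The patterns are an assumption, not a Fly.io rule.
const SECRET_NAME_PATTERN = /(SECRET|TOKEN|PASSWORD|API_KEY|PRIVATE_KEY)/i;

function findSuspiciousEnvKeys(flyToml: string): string[] {
  const suspicious: string[] = [];
  let inEnvBlock = false;

  for (const rawLine of flyToml.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;

    // A table header such as [env] or [http_service] starts a new block.
    const header = line.match(/^\[+\s*([^\]]+?)\s*\]+$/);
    if (header) {
      inEnvBlock = header[1] === 'env';
      continue;
    }

    if (!inEnvBlock) continue;

    // Inside [env], simple entries have the form KEY = "value".
    const entry = line.match(/^([A-Za-z0-9_\-"']+)\s*=/);
    if (entry) {
      const key = entry[1].replace(/["']/g, '');
      if (SECRET_NAME_PATTERN.test(key)) suspicious.push(key);
    }
  }
  return suspicious;
}
```

Running this in CI against the contents of fly.toml and failing the build on a non-empty result is one way to use it. A clean result does not prove the file is free of secrets, since a secret can hide behind an innocuous key name.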
API Integration
- Production base URL: app deployed to `https://<app>.fly.dev`
- `force_https = true` in fly.toml http_service
- Custom domain with TLS certificate active and auto-renewing
- `min_machines_running = 1` to avoid cold starts
- Machines deployed in 2+ regions for redundancy
- Concurrency limits tuned (`soft_limit`/`hard_limit` per workload)
- Volumes backed up if using persistent storage
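Several of the items above can be expressed as assertions over the parsed `http_service` settings. This is a rough sketch under assumptions: `HttpServiceConfig` and `auditHttpService` are hypothetical names, the object is assumed to come from whatever TOML parser you already use, and the region list is assumed to be gathered separately (for example from your deployment inventory).

```typescript
// Fields mirror the fly.toml [http_service] keys named in the checklist.
// How the object is obtained (e.g. a TOML parser) is deliberately left out.
interface HttpServiceConfig {
  force_https?: boolean;
  min_machines_running?: number;
  concurrency?: { soft_limit?: number; hard_limit?: number };
}

// Returns a list of human-readable problems; an empty list means all
// checks in this sketch passed.
function auditHttpService(cfg: HttpServiceConfig, regions: string[]): string[] {
  const problems: string[] = [];

  if (cfg.force_https !== true) problems.push('force_https is not true');
  if ((cfg.min_machines_running ?? 0) < 1) problems.push('min_machines_running is below 1');

  // Count distinct regions so duplicates do not mask a single-region deploy.
  if (new Set(regions).size < 2) problems.push('Machines run in fewer than 2 regions');

  const soft = cfg.concurrency?.soft_limit;
  const hard = cfg.concurrency?.hard_limit;
  if (soft === undefined || hard === undefined) {
    problems.push('concurrency limits are not set');
  } else if (soft > hard) {
    problems.push('soft_limit exceeds hard_limit');
  }
  return problems;
}
```

The sketch only checks that limits exist and are ordered sensibly; choosing good values still depends on load testing the workload.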
Error Handling & Resilience
- Health check endpoint configured with appropriate grace period
- Graceful shutdown handles SIGTERM within 10s window
- Auto-stop/auto-start configured for cost optimization
- Postgres standby replica provisioned for database apps
- Rollback procedure tested: `fly releases rollback <N>`
- Dockerfile builds and runs identically local vs deployed
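The SIGTERM item is easier to reason about with a concrete shape in mind. Below is a minimal sketch, assuming a Node.js-style server whose `close` method takes a completion callback; `Closable` and `shutdownGracefully` are hypothetical names, and the 9-second budget in the usage comment is an assumption chosen to stay inside the 10s window mentioned above.

```typescript
// Anything with a callback-style close(), such as a Node.js http.Server.
type Closable = { close(cb: (err?: Error) => void): void };

// Asks the server to stop and waits for it to finish, but gives up
// after timeoutMs so the process can still exit on its own terms.
function shutdownGracefully(
  server: Closable,
  timeoutMs: number,
): Promise<'drained' | 'timed_out'> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => resolve('timed_out'), timeoutMs);
    server.close(() => {
      clearTimeout(timer);
      resolve('drained');
    });
  });
}

// Possible wiring in a Node.js app (not executed here):
//
// process.on('SIGTERM', async () => {
//   const outcome = await shutdownGracefully(server, 9_000);
//   process.exit(outcome === 'drained' ? 0 : 1);
// });
```

Exiting with a non-zero code on timeout makes slow drains visible in logs instead of being silently cut off when the window expires.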
Monitoring & Alerting
- `fly logs` streaming configured for centralized logging
- Machine health monitored via `fly machine status`
- Platform status checked: `https://status.flyio.net`
- Alert on health check failures across any region
- VM resource utilization tracked (`fly scale show`)
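Alerting on health check failures "across any region" amounts to grouping failing checks by region. The sketch below illustrates that grouping only: `MachineSummary` is a hypothetical, simplified shape, and both its field names and the `'passing'` status value are assumptions to verify against the output your tooling actually returns.

```typescript
// Hypothetical, simplified view of a Machine. Field names and the
// 'passing' status value are assumptions, not a documented contract.
interface MachineSummary {
  id: string;
  region: string;
  checks: { name: string; status: string }[];
}

// Groups every non-passing check by region so an alert can name
// the affected region and Machine.
function regionsWithFailingChecks(machines: MachineSummary[]): Map<string, string[]> {
  const failing = new Map<string, string[]>();
  for (const m of machines) {
    for (const check of m.checks) {
      if (check.status === 'passing') continue;
      const entries = failing.get(m.region) ?? [];
      entries.push(`${m.id}/${check.name}: ${check.status}`);
      failing.set(m.region, entries);
    }
  }
  return failing;
}
```

A non-empty map is the alert condition; the per-region entries give the on-call engineer a starting point for `fly machine status`.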
Validation Script
async function checkFlyioReadiness(): Promise<void> {
  const checks: { name: string; pass: boolean; detail: string }[] = [];

  // Fly.io API connectivity
  try {
    const res = await fetch('https://api.machines.dev/v1/apps', {
      headers: { Authorization: `Bearer ${process.env.FLY_API_TOKEN}` },
    });
    checks.push({ name: 'Fly API', pass: res.ok, detail: res.ok ? 'Connected' : `HTTP ${res.status}` });
  } catch (e: any) {
    checks.push({ name: 'Fly API', pass: false, detail: e.message });
  }

  // Token present
  checks.push({
    name: 'API Token Set',
    pass: !!process.env.FLY_API_TOKEN,
    detail: process.env.FLY_API_TOKEN ? 'Present' : 'MISSING',
  });

  // Platform status
  try {
    const res = await fetch('https://status.flyio.net/api/v2/status.json');
    const data = await res.json();
    const status = data?.status?.indicator || 'unknown';
    checks.push({ name: 'Platform Status', pass: status === 'none', detail: status === 'none' ? 'Operational' : status });
  } catch (e: any) {
    checks.push({ name: 'Platform Status', pass: false, detail: e.message });
  }

  // Print one PASS/FAIL line per check
  for (const c of checks) console.log(`[${c.pass ? 'PASS' : 'FAIL'}] ${c.name}: ${c.detail}`);
}

checkFlyioReadiness();
Error Handling
| Check | Risk if Skipped | Priority |
|---|---|---|
| Multi-region deployment | Single region outage = full downtime | P1 |
| Volume backups | Data loss on machine replacement | P1 |
| Health check config | Dead machines receive traffic | P2 |
| SIGTERM handling | Dropped requests during deploys | P2 |
| Rollback procedure | Stuck on broken release | P3 |
Next Steps
See `flyio-security-basics` for network policies and secret management.