Claude-skill-registry docker-health
Docker health checks and troubleshooting. Use when building Docker images, running containers, or debugging deployment issues. Validates backend API and worker services.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/docker-health" ~/.claude/skills/majiayu000-claude-skill-registry-docker-health && rm -rf "$T"
skills/data/docker-health/SKILL.md
Docker Health Check Workflow
This skill helps with Docker-related development, testing, and deployment.
When to use this skill
- Building Docker images for backend
- Running backend in containerized environment
- Debugging Docker deployment issues
- Validating Docker health before deployment
- Testing production-like environment locally
Quick Commands
make docker-build # Build backend Docker image
make docker-test # Run comprehensive Docker health checks
make db-start # Start PostgreSQL Docker container
make db-stop # Stop PostgreSQL (keeps data)
Development Database (Docker)
Start Database
make db-start
This starts PostgreSQL in Docker:
- Port: 5432 (main), 5433 (test)
- User: postgres
- Password: postgres
- Database: arive_dev, manageros_test
- Volume: pgdata (persisted)
CRITICAL: Database Persistence
The development database uses Docker volumes for data persistence:
- Volume name: pgdata
- NEVER delete this volume in development
- make db-stop stops container but PRESERVES data
- NEVER run docker compose down -v (destroys data)
- Database state persists across container restarts
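The rule against removing volumes can be enforced mechanically, for example in a wrapper script that inspects a command before running it. The following is a minimal sketch, not part of this project: the function name `is_destructive_compose_command` is hypothetical, and it only recognizes the `-v` and `--volumes` flags of `docker compose down`.

```python
from __future__ import annotations


def is_destructive_compose_command(argv: list[str]) -> bool:
    """Return True if argv looks like `docker compose down` with volume removal.

    Hypothetical helper for illustration; it only inspects the arguments
    that follow the `down` subcommand.
    """
    if "down" not in argv:
        return False
    after_down = argv[argv.index("down") + 1:]
    return any(arg in ("-v", "--volumes") for arg in after_down)


if __name__ == "__main__":
    command = "docker compose down -v".split()
    if is_destructive_compose_command(command):
        print("Refusing to run: this would delete the pgdata volume")
```

A wrapper like this could sit in front of `docker compose` in development so that the destructive form needs an explicit override.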
Database Management
make db-start # Start PostgreSQL container
make db-stop # Stop container (keeps data)
make db-upgrade # Apply migrations
make db-migrate # Create new migration

# Check database status
docker ps | grep postgres
docker volume ls | grep pgdata
Backend Docker Image
Build Image
make docker-build
This builds the production backend image:
- Base: Python 3.13 slim
- Package manager: uv
- Entry point: Litestar app
- Includes: Database migrations, compiled email templates
- Tag: arive-backend:latest
What's Included
- Python application code
- uv-managed dependencies
- Alembic migrations
- Compiled email templates (HTML)
- Litestar ASGI server
What's NOT Included
- Frontend (deployed separately to Vercel)
- Development tools
- Test files
- Source email templates (only compiled HTML)
Docker Health Check
Run Health Checks
make docker-test
This performs comprehensive validation:
- Build check: Verifies Docker image builds successfully
- Container start: Starts backend + database containers
- Health endpoint: Checks /health returns 200 OK
- Database connectivity: Verifies PostgreSQL connection
- Migration check: Ensures migrations can run
- API smoke test: Validates basic API functionality
- Worker check: Tests SAQ worker process
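The health endpoint step can be approximated by polling until the service answers. The sketch below is an illustration only and is not taken from the `make docker-test` implementation, which this document does not show. The function names, retry count, and delay are assumptions; the URL `http://localhost:8000/health` matches the failure example in this guide.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for url, or 0 if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (for example 503).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def wait_for_health(url, attempts=10, delay=1.0, fetch=fetch_status) -> bool:
    """Poll url until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    ok = wait_for_health("http://localhost:8000/health")
    print("Health endpoint responding" if ok else "Health endpoint not responding")
```

Polling with a bounded number of attempts gives the container time to start without hanging forever if it never does. The `fetch` parameter exists so the retry logic can be exercised without a running container.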
Health Check Output
Success:
✓ Docker image built successfully
✓ Containers started
✓ Health endpoint responding
✓ Database connected
✓ Migrations applied
✓ API responding to requests
✓ Worker process running
All health checks passed!
Failure Example:
✗ Health endpoint not responding
Error: Connection refused on http://localhost:8000/health
Troubleshooting steps:
1. Check container logs: docker logs <container-id>
2. Verify port mapping: docker ps
3. Check health endpoint code
Troubleshooting Docker Issues
Container Won't Start
Check logs:
docker compose logs backend
docker logs <container-id>
Common issues:
- Port 8000 already in use: Stop other backend processes
- Database not ready: Ensure PostgreSQL container is running
- Migration failures: Check Alembic version compatibility
- Missing environment variables: Verify .env or docker-compose.yml
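The first issue in the list, a port that is already taken, can be confirmed before starting the container. This is a minimal sketch using only the standard library; the helper name `port_in_use` is hypothetical, and it only detects listeners reachable on 127.0.0.1.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in (8000, 5432, 5433):
        state = "in use" if port_in_use(port) else "free"
        print(f"Port {port}: {state}")
```

The same check covers the PostgreSQL ports 5432 and 5433 mentioned elsewhere in this guide.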
Build Failures
Check build output:
docker build --progress=plain -t arive-backend -f backend/Dockerfile backend/
Common issues:
- Python dependency conflicts: Check pyproject.toml
- File not found: Ensure files exist in build context
- uv errors: Verify uv version compatibility
- Permission issues: Check file permissions
Database Connection Issues
Check database container:
docker ps | grep postgres # Is it running?
docker logs <postgres-container> # Check logs
Test connection:
docker exec -it <postgres-container> psql -U postgres -d arive_dev
Common issues:
- Container not running: make db-start
- Wrong credentials: Check DATABASE_URL
- Port conflict: Ensure 5432/5433 are available
- Network issues: Verify Docker network configuration
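When credentials or ports are suspected, it can help to split DATABASE_URL into its parts and compare them with the values listed under Start Database. The sketch below is an illustration with a hypothetical helper name, `describe_database_url`; it deliberately leaves the password out of its result so the output is safe to paste into logs.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a PostgreSQL connection URL into its parts, omitting the password."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    info = describe_database_url("postgresql://postgres:postgres@db:5432/arive_dev")
    for key, value in info.items():
        print(f"{key}: {value}")
```

Inside Docker Compose the host is the service name (db in the example configuration), while from the host machine it is localhost with the mapped port.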
Worker Not Processing Tasks
Check worker logs:
docker compose logs worker
Verify queue configuration:
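# Open an asyncio-aware Python REPL in the backend container
# (top-level await only works with python -m asyncio)
docker exec -it <backend-container> python -m asyncio

# Then, inside the REPL:
from app.queue.config import get_queue
queue = await get_queue()
stats = await queue.stats()
print(stats) # Check queued/processed counts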
Common issues:
- Worker not started: Check docker-compose.yml includes worker service
- Queue table missing: Run migrations to create saq_* tables
- Task not registered: Verify task imported in app/queue/config.py
- Database connection: Check worker can connect to PostgreSQL
Local Docker Compose Setup
docker-compose.yml Example
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: arive_dev
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
  backend:
    build:
      context: ./backend
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    environment:
      DATABASE_URL: postgresql://postgres:postgres@db:5432/arive_dev
      ENV: local
    depends_on:
      - db
  worker:
    build:
      context: ./backend
      dockerfile: Dockerfile
    command: litestar workers run
    environment:
      DATABASE_URL: postgresql://postgres:postgres@db:5432/arive_dev
      ENV: local
    depends_on:
      - db
volumes:
  pgdata:
Start Full Stack
docker compose up -d # Start all services
docker compose logs -f # Follow logs
docker compose down # Stop all (keeps volumes)
docker compose down -v # Stop and DELETE volumes (⚠️ DANGER)
Pre-Deployment Checklist
Before deploying to AWS:
- Health checks pass locally:
  make docker-test
- All tests pass:
  make test
  make check-all
- Migrations reviewed:
  - Check backend/alembic/versions/ for new migrations
  - Test both upgrade and downgrade paths
  - Verify no data loss operations
- Environment variables configured:
  - Check .env.production or AWS Secrets Manager
  - Verify DATABASE_URL, API keys, etc.
- Email templates compiled:
  make build-emails
  git status # Ensure compiled templates committed
- Image size reasonable:
  docker images arive-backend # Should be < 500MB ideally
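The image size item can be turned into a scripted check. The sketch below reads the size in bytes with `docker image inspect` and compares it to the 500MB target. The function names are hypothetical, and treating "500MB" as 500,000,000 bytes (decimal units) is an assumption.

```python
import subprocess

# Assumption: the 500MB target is interpreted in decimal megabytes.
LIMIT_BYTES = 500 * 1000 * 1000


def image_size_bytes(image: str) -> int:
    """Ask Docker for the image size in bytes (requires a local Docker daemon)."""
    result = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{.Size}}", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


def within_limit(size_bytes: int, limit_bytes: int = LIMIT_BYTES) -> bool:
    """Return True if the image is strictly smaller than the limit."""
    return size_bytes < limit_bytes


if __name__ == "__main__":
    size = image_size_bytes("arive-backend:latest")
    verdict = "OK" if within_limit(size) else "too large"
    print(f"arive-backend:latest is {size / 1_000_000:.0f} MB ({verdict})")
```

Keeping the comparison in its own function means the threshold logic can be tested without Docker installed.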
Production Deployment (AWS ECS)
The backend deploys to AWS ECS Fargate:
- API Service: Litestar app behind ALB
- Worker Service: SAQ background task processor
- Database: Aurora Serverless v2 PostgreSQL
- Images: Stored in ECR (Elastic Container Registry)
Deployment is handled by Terraform in the infra/ directory.
See infra/CLAUDE.md for a detailed infrastructure guide.