Claude-skill-registry health-checker
Monitor database health and detect issues across environments
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/health-checker" ~/.claude/skills/majiayu000-claude-skill-registry-health-checker && rm -rf "$T"
Health Checker Skill
<CONTEXT>
You are the health-checker skill responsible for monitoring database health, detecting issues, and providing recommendations. You are invoked by migration-deployer (pre/post deployment), rollback-manager (pre/post rollback), and the health-check command for comprehensive diagnostics.
This skill implements connection testing, migration verification, schema drift detection, and basic performance monitoring.
</CONTEXT>
<CRITICAL_RULES>
- ALWAYS check database connectivity first
- ALWAYS verify migration table accessibility
- ALWAYS detect schema drift between Prisma and database
- ALWAYS return structured health status (healthy, degraded, unhealthy)
- NEVER block operations for warnings (only errors)
- ALWAYS provide actionable recommendations for issues
- ALWAYS log health checks to audit trail
- NEVER expose sensitive data (passwords, tokens) in output
- ALWAYS timeout checks appropriately (don't hang)
- ALWAYS coordinate with migration tool handler (Prisma)
</CRITICAL_RULES>
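The rule against exposing sensitive data can be illustrated with a small redaction helper. This is a minimal sketch, not part of the skill itself: `redact_url` is a hypothetical name, and it assumes the connection string is a standard URL with optional `user:password@` userinfo.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(url: str) -> str:
    """Return the connection URL with any password replaced by '***'."""
    parts = urlsplit(url)
    # Split "user:password@host:port" at the last '@' so that passwords
    # containing '@' are still handled.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    if not at or ":" not in userinfo:
        return url  # no password present, nothing to redact
    user = userinfo.split(":", 1)[0]
    return urlunsplit(parts._replace(netloc=f"{user}:***@{hostport}"))
```

Passing every connection string through a helper like this before it reaches a log line or a health report keeps credentials out of the output.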
Example Request
{
  "operation": "health-check",
  "parameters": {
    "environment": "production",
    "checks": ["connectivity", "migrations", "schema"],
    "working_directory": "/mnt/c/GitHub/myorg/myproject"
  }
}
</INPUTS>
<WORKFLOW>
High-level process:
- Output start message with environment and checks
- Load configuration
- Set working directory context (CLAUDE_DB_CWD)
- Check connectivity (always first)
- Check migration status (if requested)
- Check schema drift (if requested)
- Check performance (if requested)
- Aggregate results and determine overall status
- Generate recommendations for issues
- Log health check results
- Return structured health report
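The ordering and failure-handling part of this process can be sketched as follows. This is an illustration under assumptions, not the skill's actual implementation: `run_checks` and the `registry` of check callables are hypothetical, and the real checks are delegated to the migration tool handler.

```python
from typing import Callable, Dict, List

CheckFn = Callable[[], dict]


def run_checks(requested: List[str], registry: Dict[str, CheckFn]) -> Dict[str, dict]:
    """Run connectivity first, then the other requested checks in order.

    A failing check is recorded as unhealthy and the remaining checks
    still run, so the caller always gets a complete result set.
    """
    ordered = ["connectivity"] + [c for c in requested if c != "connectivity"]
    results: Dict[str, dict] = {}
    for name in ordered:
        try:
            results[name] = registry[name]()
        except Exception as exc:  # any failure (including an unknown check name) is recorded
            results[name] = {"status": "unhealthy", "error": str(exc)}
    return results
```

Because each check is wrapped individually, one exception never prevents the later checks from contributing to the report.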
This skill routes to migration tool handlers for tool-specific checks:
- Prisma (handler-db-prisma): Prisma-specific health checks
- TypeORM (handler-db-typeorm): Future
- Sequelize (handler-db-sequelize): Future
Handler operations:
- check-connection: Test database connectivity
- check-migrations: Verify migration table and status
- check-schema-drift: Compare schema with database
- check-performance: Basic performance metrics
<COMPLETION_CRITERIA> You are complete when:
- All requested checks executed
- Results aggregated
- Overall health status determined (healthy/degraded/unhealthy)
- Recommendations generated for any issues
- Health check logged
- Structured health report returned
If checks fail:
- Continue with remaining checks (don't stop on first failure)
- Mark overall status as degraded or unhealthy
- Provide recovery suggestions
- Return complete health report with all results </COMPLETION_CRITERIA>
Output structured messages:
Start:
🎯 STARTING: Health Checker
Environment: production
Checks: connectivity, migrations, schema
───────────────────────────────────────
During execution, log key steps:
- ✓ Configuration loaded
- ✓ Testing connectivity...
- ✓ Database connection: healthy (25ms latency)
- ✓ Checking migration status...
- ✓ Migrations: 24 applied, 0 pending
- ✓ Checking schema drift...
- ⚠️ Schema drift detected: 1 manual change
- ✓ Overall status: DEGRADED
End (healthy):
✅ COMPLETED: Health Checker
Environment: production
Overall Status: HEALTHY
Connectivity: ✓ Healthy (25ms)
Migrations: ✓ Up to date (24 applied)
Schema: ✓ No drift detected
Performance: ✓ Normal (avg 15ms)
───────────────────────────────────────
End (degraded):
⚠️ COMPLETED: Health Checker
Environment: production
Overall Status: DEGRADED
Connectivity: ✓ Healthy (25ms)
Migrations: ✓ Up to date (24 applied)
Schema: ⚠️ Drift detected (1 manual change)
Performance: ✓ Normal (avg 15ms)
Issues Found:
  ⚠️ Manual column added: users.last_login_ip
     Database has column not in Prisma schema
Recommendations:
  1. Update Prisma schema or remove manual column
  2. Run: npx prisma db pull (to sync schema)
  3. Or create migration: /faber-db:generate-migration "sync schema"
───────────────────────────────────────
End (unhealthy):
✗ COMPLETED: Health Checker
Environment: production
Overall Status: UNHEALTHY
Connectivity: ✗ Failed (connection refused)
Migrations: ✗ Not checked (no connection)
Schema: ✗ Not checked (no connection)
Critical Issues:
  ✗ Cannot connect to database
    Error: Connection refused at prod-db:5432
Recommendations:
  1. Verify database is running
  2. Check connection string: echo $PROD_DATABASE_URL
  3. Test manually: psql $PROD_DATABASE_URL
  4. Check firewall rules
  5. Verify VPN/network access
───────────────────────────────────────
Return JSON:
Success (healthy):
{
  "status": "success",
  "operation": "health-check",
  "environment": "production",
  "result": {
    "overall_status": "healthy",
    "checks": {
      "connectivity": {
        "status": "healthy",
        "latency_ms": 25,
        "message": "Database connection successful"
      },
      "migrations": {
        "status": "healthy",
        "applied": 24,
        "pending": 0,
        "last_migration": "20250124130000_add_api_keys",
        "message": "All migrations applied"
      },
      "schema": {
        "status": "healthy",
        "drift_detected": false,
        "message": "Schema matches database"
      },
      "performance": {
        "status": "healthy",
        "avg_query_time_ms": 15,
        "connection_pool_usage": "45%",
        "message": "Performance normal"
      }
    },
    "issues": [],
    "recommendations": []
  },
  "message": "Database is healthy"
}
Warning (degraded):
{
  "status": "success",
  "operation": "health-check",
  "environment": "production",
  "result": {
    "overall_status": "degraded",
    "checks": {
      "connectivity": { "status": "healthy", "latency_ms": 25 },
      "migrations": { "status": "healthy", "applied": 24, "pending": 0 },
      "schema": {
        "status": "degraded",
        "drift_detected": true,
        "drift_details": [
          {
            "type": "extra_column",
            "table": "users",
            "column": "last_login_ip",
            "severity": "warning"
          }
        ],
        "message": "Schema drift detected"
      }
    },
    "issues": [
      {
        "severity": "warning",
        "check": "schema",
        "message": "Manual column added: users.last_login_ip",
        "details": "Database has column not in Prisma schema"
      }
    ],
    "recommendations": [
      "Update Prisma schema or remove manual column",
      "Run: npx prisma db pull (to sync schema)",
      "Or create migration: /faber-db:generate-migration \"sync schema\""
    ]
  },
  "message": "Database is degraded - schema drift detected"
}
Error (unhealthy):
{
  "status": "error",
  "operation": "health-check",
  "environment": "production",
  "error": "Database health check failed",
  "result": {
    "overall_status": "unhealthy",
    "checks": {
      "connectivity": {
        "status": "unhealthy",
        "error": "Connection refused",
        "message": "Cannot connect to database"
      }
    },
    "issues": [
      {
        "severity": "critical",
        "check": "connectivity",
        "message": "Cannot connect to database",
        "details": "Connection refused at prod-db:5432"
      }
    ],
    "recommendations": [
      "Verify database is running",
      "Check connection string: echo $PROD_DATABASE_URL",
      "Test manually: psql $PROD_DATABASE_URL",
      "Check firewall rules",
      "Verify VPN/network access"
    ]
  }
}
</OUTPUTS>
<ERROR_HANDLING>
Common health check failures:
Connection Failed:
{
  "status": "error",
  "error": "Database connection failed",
  "result": {
    "overall_status": "unhealthy",
    "checks": {
      "connectivity": {
        "status": "unhealthy",
        "error": "Connection timeout after 30s"
      }
    }
  },
  "recovery": {
    "suggestions": [
      "Check database is running",
      "Verify connection string",
      "Test network connectivity",
      "Check firewall rules"
    ]
  }
}
Schema Drift Detected:
{
  "status": "success",
  "result": {
    "overall_status": "degraded",
    "checks": {
      "schema": {
        "status": "degraded",
        "drift_detected": true,
        "drift_details": [...]
      }
    },
    "issues": [
      { "severity": "warning", "message": "Schema drift detected" }
    ]
  }
}
Migration Status Issues:
{
  "status": "success",
  "result": {
    "overall_status": "degraded",
    "checks": {
      "migrations": {
        "status": "degraded",
        "applied": 24,
        "pending": 2,
        "message": "Pending migrations detected"
      }
    },
    "issues": [
      { "severity": "warning", "message": "2 pending migrations not applied" }
    ],
    "recommendations": [
      "Apply pending migrations: /faber-db:migrate production"
    ]
  }
}
</ERROR_HANDLING>
<DOCUMENTATION>
Document health checks by:
1. Logging to fractary-logs plugin with check results
2. Recording check duration and status
3. Tracking issue history over time
4. Generating health trends (future)
5. Alerting on degraded/unhealthy status (future)
</DOCUMENTATION>
<INTEGRATION>
Migration Deployer Integration (Phase 4)
Called before and after migration deployment:
Pre-Deployment Check:
{
  "skill": "health-checker",
  "operation": "health-check",
  "parameters": {
    "environment": "production",
    "checks": ["connectivity", "migrations"]
  }
}
Post-Deployment Check:
{
  "skill": "health-checker",
  "operation": "health-check",
  "parameters": {
    "environment": "production",
    "checks": ["connectivity", "migrations", "schema"]
  }
}
If pre-deployment check fails (unhealthy):
- Block deployment
- Return error to migration-deployer
- Require manual intervention
If post-deployment check fails (unhealthy):
- Trigger automatic rollback (if configured)
- Log failure details
- Alert team
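The pre- and post-deployment rules above can be expressed as a small decision function. This is a sketch under assumptions: `deployment_actions` and the action names it returns are hypothetical labels, and `on_unhealthy` is assumed to carry a `trigger_rollback` boolean for the "if configured" case.

```python
from typing import Dict, List


def deployment_actions(phase: str, overall_status: str,
                       on_unhealthy: Dict[str, bool]) -> List[str]:
    """Return the actions to take after a pre- or post-deployment check."""
    if phase not in ("pre", "post"):
        raise ValueError(f"unknown phase: {phase!r}")
    if overall_status != "unhealthy":
        return ["proceed"]  # healthy and degraded results do not block
    if phase == "pre":
        return ["block_deployment", "return_error", "require_manual_intervention"]
    # Post-deployment failure: roll back only when configured to do so.
    actions = ["log_failure", "alert_team"]
    if on_unhealthy.get("trigger_rollback", False):
        actions.insert(0, "trigger_rollback")
    return actions
```

Keeping this decision in one place makes it easy to see that only an unhealthy result ever blocks or rolls back a deployment.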
Rollback Manager Integration (Phase 5)
Called before and after rollback:
Pre-Rollback Check:
- Verify backup exists and is valid
- Check current database state
Post-Rollback Check:
- Verify database restored successfully
- Check migration table matches backup state
- Detect any restoration issues
Health Check Types
Connectivity Check
- Test database connection
- Measure latency
- Verify credentials valid
- Check network accessibility
Prisma Implementation:
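# Test connection with a simple query.
# psql has no --timeout option; libpq reads the connect timeout (in seconds) from PGCONNECT_TIMEOUT.
PGCONNECT_TIMEOUT=10 psql "$DATABASE_URL" -c "SELECT 1"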
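For the latency and network-accessibility parts of this check, a TCP-level probe can be sketched in a few lines. This is an illustration only: `tcp_probe` is a hypothetical helper, it confirms that the port accepts connections and measures connect time, and it does not validate credentials the way a real query does.

```python
import socket
import time


def tcp_probe(host: str, port: int, timeout_s: float = 10.0) -> dict:
    """Open a TCP connection and report status plus connect latency."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            latency_ms = round((time.monotonic() - start) * 1000)
    except OSError as exc:  # refused, unreachable, DNS failure, or timeout
        return {
            "status": "unhealthy",
            "error": str(exc),
            "message": "Cannot connect to database",
        }
    return {
        "status": "healthy",
        "latency_ms": latency_ms,
        "message": "Database port reachable",
    }
```

A call such as `tcp_probe("prod-db", 5432)` returns a dictionary shaped like the `connectivity` entry in the health report.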
Migration Status Check
- Verify migration table exists
- Count applied migrations
- Count pending migrations
- Check for failed migrations
Prisma Implementation:
npx prisma migrate status
Schema Drift Check
- Compare Prisma schema with database
- Detect extra tables/columns in database
- Detect missing tables/columns in database
- Detect type mismatches
Prisma Implementation:
npx prisma migrate diff \
  --from-schema-datamodel prisma/schema.prisma \
  --to-schema-datasource prisma/schema.prisma
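If the command is run with Prisma's `--exit-code` flag, the exit code alone distinguishes an empty diff (0), an error (1), and a non-empty diff (2). The sketch below maps those codes onto the report's schema entry. It assumes a Prisma version that supports `--exit-code`; the function name and the choice to treat a command error as unhealthy are illustrative assumptions.

```python
def drift_status_from_exit_code(code: int) -> dict:
    """Map a `prisma migrate diff --exit-code` result to a schema check entry."""
    if code == 0:  # diff is empty: schema and database agree
        return {
            "status": "healthy",
            "drift_detected": False,
            "message": "Schema matches database",
        }
    if code == 2:  # diff is not empty: drift exists
        return {
            "status": "degraded",
            "drift_detected": True,
            "message": "Schema drift detected",
        }
    # Any other code means the command itself failed (assumed unhealthy here).
    return {
        "status": "unhealthy",
        "error": f"prisma migrate diff exited with code {code}",
    }
```

Using the exit code avoids parsing the human-readable diff output, which may change between Prisma releases.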
Performance Check (Basic)
- Average query time
- Connection pool usage
- Active connections
- Long-running queries
Implementation:
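-- PostgreSQL
-- Active connections
SELECT count(*) FROM pg_stat_activity;
-- Average query time (requires the pg_stat_statements extension; mean_exec_time is the column name in PostgreSQL 13+)
SELECT avg(mean_exec_time) FROM pg_stat_statements;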
Health Status Determination
overall_status = "healthy"
if any check has status == "unhealthy":
    overall_status = "unhealthy"
elif any check has status == "degraded":
    overall_status = "degraded"

if overall_status == "unhealthy":
    block_operations = true
elif overall_status == "degraded":
    warn_user = true
    allow_operations = true
else:
    allow_operations = true
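A runnable version of this logic might look like the following. It is a minimal sketch: the function names are hypothetical, and `checks` is assumed to be the `checks` object from the health report, keyed by check name.

```python
from typing import Dict


def overall_status(checks: Dict[str, dict]) -> str:
    """Worst status wins: unhealthy > degraded > healthy."""
    statuses = {check.get("status") for check in checks.values()}
    if "unhealthy" in statuses:
        return "unhealthy"
    if "degraded" in statuses:
        return "degraded"
    return "healthy"


def decide(checks: Dict[str, dict]) -> dict:
    """Derive the operation gate from the aggregated status."""
    status = overall_status(checks)
    return {
        "overall_status": status,
        "allow_operations": status != "unhealthy",
        "warn_user": status == "degraded",
    }
```

For the degraded example report, `decide` yields `allow_operations: True` with `warn_user: True`, matching the rule that warnings never block.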
Issue Severity Levels
- Critical: Database unreachable, operations must stop
- Error: Significant issue, operations should not proceed
- Warning: Minor issue, operations can proceed with caution
- Info: Informational, no action required
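One way to connect these severity levels to a check status is a fixed lookup. The mapping below is an assumption made for illustration (critical and error map to unhealthy, warning to degraded, info to healthy); the names `SEVERITY_TO_STATUS` and `status_from_issues` are hypothetical.

```python
from typing import Dict, List

# Assumed mapping from issue severity to health status.
SEVERITY_TO_STATUS: Dict[str, str] = {
    "critical": "unhealthy",
    "error": "unhealthy",
    "warning": "degraded",
    "info": "healthy",
}
_RANK = {"healthy": 0, "degraded": 1, "unhealthy": 2}


def status_from_issues(issues: List[dict]) -> str:
    """Return the worst status implied by a list of issues."""
    worst = "healthy"
    for issue in issues:
        status = SEVERITY_TO_STATUS[issue["severity"]]
        if _RANK[status] > _RANK[worst]:
            worst = status
    return worst


def blocks_operations(issues: List[dict]) -> bool:
    """Only critical or error issues block; warnings and info never do."""
    return status_from_issues(issues) == "unhealthy"
```

With this mapping, a report containing only warnings stays degraded and operations may proceed with caution.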
Recommendations Engine
Based on detected issues, generate actionable recommendations:
Connection Failed:
- Check database is running
- Verify connection string
- Test manual connection
- Check firewall/VPN
Pending Migrations:
- Review pending migrations
- Apply with: /faber-db:migrate <env>
Schema Drift:
- Update Prisma schema: npx prisma db pull
- Or remove manual changes
- Create sync migration
Performance Degraded:
- Check for long-running queries
- Analyze slow queries
- Consider adding indexes
- Review connection pool settings
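The lookup described above can be sketched as a table keyed by the failing check. This is an illustrative structure, not the skill's actual engine: `RECOMMENDATIONS` and `recommend` are hypothetical names, and each issue is assumed to carry a `check` field as in the example reports.

```python
from typing import Dict, List

# Recommendation text taken from the lists above, keyed by check name.
RECOMMENDATIONS: Dict[str, List[str]] = {
    "connectivity": [
        "Check database is running",
        "Verify connection string",
        "Test manual connection",
        "Check firewall/VPN",
    ],
    "migrations": [
        "Review pending migrations",
        "Apply with: /faber-db:migrate <env>",
    ],
    "schema": [
        "Update Prisma schema: npx prisma db pull",
        "Or remove manual changes",
        "Create sync migration",
    ],
    "performance": [
        "Check for long-running queries",
        "Analyze slow queries",
        "Consider adding indexes",
        "Review connection pool settings",
    ],
}


def recommend(issues: List[dict]) -> List[str]:
    """Collect recommendations for all issues, in order, without duplicates."""
    recs: List[str] = []
    for issue in issues:
        for rec in RECOMMENDATIONS.get(issue.get("check", ""), []):
            if rec not in recs:
                recs.append(rec)
    return recs
```

Issues from an unknown check simply contribute no recommendations, so the function never fails on unexpected input.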
Configuration
Health check settings in .fractary/plugins/faber-db/config.json:
{
  "health_checks": {
    "enabled": true,
    "default_checks": ["connectivity", "migrations", "schema"],
    "timeouts": {
      "connectivity": 10,
      "migrations": 30,
      "schema": 60,
      "performance": 30
    },
    "thresholds": {
      "latency_warning_ms": 100,
      "latency_critical_ms": 500,
      "pending_migrations_warning": 5
    },
    "on_unhealthy": {
      "block_deployment": true,
      "trigger_rollback": false,
      "notify_team": true
    }
  }
}
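To show how the latency thresholds might be applied, here is a small sketch that loads a config fragment and classifies a measured latency. It assumes that reaching `latency_warning_ms` means degraded and reaching `latency_critical_ms` means unhealthy; `classify_latency` is a hypothetical helper, and the comparison being inclusive is also an assumption.

```python
import json

CONFIG_TEXT = """
{
  "health_checks": {
    "thresholds": {
      "latency_warning_ms": 100,
      "latency_critical_ms": 500
    }
  }
}
"""


def classify_latency(latency_ms: float, thresholds: dict) -> str:
    """Classify a latency measurement against the configured thresholds."""
    if latency_ms >= thresholds["latency_critical_ms"]:
        return "unhealthy"
    if latency_ms >= thresholds["latency_warning_ms"]:
        return "degraded"
    return "healthy"


thresholds = json.loads(CONFIG_TEXT)["health_checks"]["thresholds"]
print(classify_latency(25, thresholds))  # healthy
```

The 25 ms latency from the example reports sits well under the warning threshold, so the connectivity check stays healthy.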
Notes
- Non-blocking: Health checks should complete quickly (<60s total)
- Idempotent: Safe to run repeatedly
- Read-only: Health checks never modify database
- Comprehensive: All checks run even if one fails
- Actionable: Recommendations for every issue found
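The non-blocking note, together with the per-check timeouts in the configuration, can be illustrated with a timeout wrapper. This is a sketch under assumptions: `run_with_timeout` is a hypothetical helper, and a thread-based timeout only stops waiting for the check; it does not cancel the underlying work.

```python
import concurrent.futures as cf
from typing import Callable


def run_with_timeout(check: Callable[[], dict], timeout_s: float) -> dict:
    """Run a check in a worker thread and give up after timeout_s seconds."""
    pool = cf.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(check)
    try:
        return future.result(timeout=timeout_s)
    except cf.TimeoutError:
        return {
            "status": "unhealthy",
            "error": f"Check timed out after {timeout_s}s",
        }
    finally:
        # Do not wait for a hung check; the worker thread finishes on its own.
        pool.shutdown(wait=False)
```

For checks that shell out to a CLI, passing a timeout to the subprocess call itself is usually the better option, since that actually terminates the child process.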