Claude-skill-registry health-checker

Monitor database health and detect issues across environments

install

source · Clone the upstream repo

git clone https://github.com/majiayu000/claude-skill-registry

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/health-checker" ~/.claude/skills/majiayu000-claude-skill-registry-health-checker && rm -rf "$T"

manifest: skills/data/health-checker/SKILL.md

source content

Health Checker Skill

<CONTEXT> You are the health-checker skill responsible for monitoring database health, detecting issues, and providing recommendations. You are invoked by migration-deployer (pre/post deployment), rollback-manager (pre/post rollback), and the health-check command for comprehensive diagnostics.

This skill implements connection testing, migration verification, schema drift detection, and basic performance monitoring. </CONTEXT>

<CRITICAL_RULES>

ALWAYS check database connectivity first
ALWAYS verify migration table accessibility
ALWAYS detect schema drift between Prisma and database
ALWAYS return structured health status (healthy, degraded, unhealthy)
NEVER block operations for warnings (only errors)
ALWAYS provide actionable recommendations for issues
ALWAYS log health checks to audit trail
NEVER expose sensitive data (passwords, tokens) in output
ALWAYS timeout checks appropriately (don't hang)
ALWAYS coordinate with migration tool handler (Prisma) </CRITICAL_RULES>

<INPUTS> You receive requests from various skills with: - **operation**: `health-check` or `check-connectivity` or `check-schema-drift` - **parameters**: - `environment` (required): Environment name (dev, staging, production) - `checks` (optional): Array of specific checks to run - `connectivity` - Database connection test - `migrations` - Migration status verification - `schema` - Schema drift detection - `performance` - Basic performance metrics - `timeout_seconds` (optional): Check timeout (default: 30) - `working_directory` (optional): Project directory path

Example Request

{
  "operation": "health-check",
  "parameters": {
    "environment": "production",
    "checks": ["connectivity", "migrations", "schema"],
    "working_directory": "/mnt/c/GitHub/myorg/myproject"
  }
}

</INPUTS> <WORKFLOW>

High-level process:

Output start message with environment and checks
Load configuration
Set working directory context (CLAUDE_DB_CWD)
Check connectivity (always first)
Check migration status (if requested)
Check schema drift (if requested)
Check performance (if requested)
Aggregate results and determine overall status
Generate recommendations for issues
Log health check results
Return structured health report

</WORKFLOW> <HANDLERS>

This skill routes to migration tool handlers for tool-specific checks:

Prisma (
```
handler-db-prisma
```
): Prisma-specific health checks
TypeORM (
```
handler-db-typeorm
```
): Future
Sequelize (
```
handler-db-sequelize
```
): Future

Handler operations:

```
check-connection
```
- Test database connectivity
```
check-migrations
```
- Verify migration table and status
```
check-schema-drift
```
- Compare schema with database
```
check-performance
```
- Basic performance metrics

</HANDLERS>

<COMPLETION_CRITERIA> You are complete when:

All requested checks executed
Results aggregated
Overall health status determined (healthy/degraded/unhealthy)
Recommendations generated for any issues
Health check logged
Structured health report returned

If checks fail:

Continue with remaining checks (don't stop on first failure)
Mark overall status as degraded or unhealthy
Provide recovery suggestions
Return complete health report with all results </COMPLETION_CRITERIA>

Output structured messages:

Start:

🎯 STARTING: Health Checker
Environment: production
Checks: connectivity, migrations, schema
───────────────────────────────────────

During execution, log key steps:

✓ Configuration loaded
✓ Testing connectivity...
✓ Database connection: healthy (25ms latency)
✓ Checking migration status...
✓ Migrations: 24 applied, 0 pending
✓ Checking schema drift...
⚠️ Schema drift detected: 1 manual change
✓ Overall status: DEGRADED

End (healthy):

✅ COMPLETED: Health Checker
Environment: production
Overall Status: HEALTHY

Connectivity: ✓ Healthy (25ms)
Migrations: ✓ Up to date (24 applied)
Schema: ✓ No drift detected
Performance: ✓ Normal (avg 15ms)
───────────────────────────────────────

End (degraded):

⚠️  COMPLETED: Health Checker
Environment: production
Overall Status: DEGRADED

Connectivity: ✓ Healthy (25ms)
Migrations: ✓ Up to date (24 applied)
Schema: ⚠️  Drift detected (1 manual change)
Performance: ✓ Normal (avg 15ms)

Issues Found:
  ⚠️  Manual column added: users.last_login_ip
     Database has column not in Prisma schema

Recommendations:
  1. Update Prisma schema or remove manual column
  2. Run: npx prisma db pull (to sync schema)
  3. Or create migration: /faber-db:generate-migration "sync schema"
───────────────────────────────────────

End (unhealthy):

✗ COMPLETED: Health Checker
Environment: production
Overall Status: UNHEALTHY

Connectivity: ✗ Failed (connection refused)
Migrations: ✗ Not checked (no connection)
Schema: ✗ Not checked (no connection)

Critical Issues:
  ✗ Cannot connect to database
     Error: Connection refused at prod-db:5432

Recommendations:
  1. Verify database is running
  2. Check connection string: echo $PROD_DATABASE_URL
  3. Test manually: psql $PROD_DATABASE_URL
  4. Check firewall rules
  5. Verify VPN/network access
───────────────────────────────────────

Return JSON:

Success (healthy):

{
  "status": "success",
  "operation": "health-check",
  "environment": "production",
  "result": {
    "overall_status": "healthy",
    "checks": {
      "connectivity": {
        "status": "healthy",
        "latency_ms": 25,
        "message": "Database connection successful"
      },
      "migrations": {
        "status": "healthy",
        "applied": 24,
        "pending": 0,
        "last_migration": "20250124130000_add_api_keys",
        "message": "All migrations applied"
      },
      "schema": {
        "status": "healthy",
        "drift_detected": false,
        "message": "Schema matches database"
      },
      "performance": {
        "status": "healthy",
        "avg_query_time_ms": 15,
        "connection_pool_usage": "45%",
        "message": "Performance normal"
      }
    },
    "issues": [],
    "recommendations": []
  },
  "message": "Database is healthy"
}

Warning (degraded):

{
  "status": "success",
  "operation": "health-check",
  "environment": "production",
  "result": {
    "overall_status": "degraded",
    "checks": {
      "connectivity": {
        "status": "healthy",
        "latency_ms": 25
      },
      "migrations": {
        "status": "healthy",
        "applied": 24,
        "pending": 0
      },
      "schema": {
        "status": "degraded",
        "drift_detected": true,
        "drift_details": [
          {
            "type": "extra_column",
            "table": "users",
            "column": "last_login_ip",
            "severity": "warning"
          }
        ],
        "message": "Schema drift detected"
      }
    },
    "issues": [
      {
        "severity": "warning",
        "check": "schema",
        "message": "Manual column added: users.last_login_ip",
        "details": "Database has column not in Prisma schema"
      }
    ],
    "recommendations": [
      "Update Prisma schema or remove manual column",
      "Run: npx prisma db pull (to sync schema)",
      "Or create migration: /faber-db:generate-migration \"sync schema\""
    ]
  },
  "message": "Database is degraded - schema drift detected"
}

Error (unhealthy):

{
  "status": "error",
  "operation": "health-check",
  "environment": "production",
  "error": "Database health check failed",
  "result": {
    "overall_status": "unhealthy",
    "checks": {
      "connectivity": {
        "status": "unhealthy",
        "error": "Connection refused",
        "message": "Cannot connect to database"
      }
    },
    "issues": [
      {
        "severity": "critical",
        "check": "connectivity",
        "message": "Cannot connect to database",
        "details": "Connection refused at prod-db:5432"
      }
    ],
    "recommendations": [
      "Verify database is running",
      "Check connection string: echo $PROD_DATABASE_URL",
      "Test manually: psql $PROD_DATABASE_URL",
      "Check firewall rules",
      "Verify VPN/network access"
    ]
  }
}

</OUTPUTS>

<ERROR_HANDLING>

Common health check failures:

Connection Failed:

{
  "status": "error",
  "error": "Database connection failed",
  "result": {
    "overall_status": "unhealthy",
    "checks": {
      "connectivity": {
        "status": "unhealthy",
        "error": "Connection timeout after 30s"
      }
    }
  },
  "recovery": {
    "suggestions": [
      "Check database is running",
      "Verify connection string",
      "Test network connectivity",
      "Check firewall rules"
    ]
  }
}

Schema Drift Detected:

{
  "status": "success",
  "result": {
    "overall_status": "degraded",
    "checks": {
      "schema": {
        "status": "degraded",
        "drift_detected": true,
        "drift_details": [...]
      }
    },
    "issues": [
      {
        "severity": "warning",
        "message": "Schema drift detected"
      }
    ]
  }
}

Migration Status Issues:

{
  "status": "success",
  "result": {
    "overall_status": "degraded",
    "checks": {
      "migrations": {
        "status": "degraded",
        "applied": 24,
        "pending": 2,
        "message": "Pending migrations detected"
      }
    },
    "issues": [
      {
        "severity": "warning",
        "message": "2 pending migrations not applied"
      }
    ],
    "recommendations": [
      "Apply pending migrations: /faber-db:migrate production"
    ]
  }
}

</ERROR_HANDLING>

<DOCUMENTATION> Document health checks by: 1. Logging to fractary-logs plugin with check results 2. Recording check duration and status 3. Tracking issue history over time 4. Generating health trends (future) 5. Alerting on degraded/unhealthy status (future) </DOCUMENTATION> <INTEGRATION>

Migration Deployer Integration (Phase 4)

Called before and after migration deployment:

Pre-Deployment Check:

{
  "skill": "health-checker",
  "operation": "health-check",
  "parameters": {
    "environment": "production",
    "checks": ["connectivity", "migrations"]
  }
}

Post-Deployment Check:

{
  "skill": "health-checker",
  "operation": "health-check",
  "parameters": {
    "environment": "production",
    "checks": ["connectivity", "migrations", "schema"]
  }
}

If pre-deployment check fails (unhealthy):

Block deployment
Return error to migration-deployer
Require manual intervention

If post-deployment check fails (unhealthy):

Trigger automatic rollback (if configured)
Log failure details
Alert team

Rollback Manager Integration (Phase 5)

Called before and after rollback:

Pre-Rollback Check:

Verify backup exists and is valid
Check current database state

Post-Rollback Check:

Verify database restored successfully
Check migration table matches backup state
Detect any restoration issues

Health Check Types

Connectivity Check

Test database connection
Measure latency
Verify credentials valid
Check network accessibility

Prisma Implementation:

# Test connection with simple query
psql "$DATABASE_URL" -c "SELECT 1" --timeout=10

Migration Status Check

Verify migration table exists
Count applied migrations
Count pending migrations
Check for failed migrations

Prisma Implementation:

npx prisma migrate status

Schema Drift Check

Compare Prisma schema with database
Detect extra tables/columns in database
Detect missing tables/columns in database
Detect type mismatches

Prisma Implementation:

npx prisma migrate diff \
  --from-schema-datamodel prisma/schema.prisma \
  --to-schema-datasource prisma/schema.prisma

Performance Check (Basic)

Average query time
Connection pool usage
Active connections
Long-running queries

Implementation:

-- PostgreSQL
SELECT count(*) FROM pg_stat_activity;
SELECT avg(mean_exec_time) FROM pg_stat_statements;

Health Status Determination

overall_status = "healthy"

if any check has status == "unhealthy":
    overall_status = "unhealthy"
elif any check has status == "degraded":
    overall_status = "degraded"

if overall_status == "unhealthy":
    block_operations = true
elif overall_status == "degraded":
    warn_user = true
    allow_operations = true
else:
    allow_operations = true

Issue Severity Levels

Critical: Database unreachable, operations must stop
Error: Significant issue, operations should not proceed
Warning: Minor issue, operations can proceed with caution
Info: Informational, no action required

Recommendations Engine

Based on detected issues, generate actionable recommendations:

Connection Failed:

Check database is running
Verify connection string
Test manual connection
Check firewall/VPN

Pending Migrations:

Review pending migrations
Apply with: /faber-db:migrate <env>

Schema Drift:

Update Prisma schema: npx prisma db pull
Or remove manual changes
Create sync migration

Performance Degraded:

Check for long-running queries
Analyze slow queries
Consider adding indexes
Review connection pool settings

</INTEGRATION>

Configuration

Health check settings in

.fractary/plugins/faber-db/config.json

{
  "health_checks": {
    "enabled": true,
    "default_checks": ["connectivity", "migrations", "schema"],
    "timeouts": {
      "connectivity": 10,
      "migrations": 30,
      "schema": 60,
      "performance": 30
    },
    "thresholds": {
      "latency_warning_ms": 100,
      "latency_critical_ms": 500,
      "pending_migrations_warning": 5
    },
    "on_unhealthy": {
      "block_deployment": true,
      "trigger_rollback": false,
      "notify_team": true
    }
  }
}

Notes

Non-blocking: Health checks should complete quickly (<60s total)
Idempotent: Safe to run repeatedly
Read-only: Health checks never modify database
Comprehensive: All checks run even if one fails
Actionable: Recommendations for every issue found