Claude-skill-registry check-alerts
Check currently firing Grafana alerts, analyze alert status, and investigate alert issues in the Kagenti platform
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/check-alerts" ~/.claude/skills/majiayu000-claude-skill-registry-check-alerts && rm -rf "$T"
manifest:
skills/data/check-alerts/SKILL.md · source content
Check Alerts Skill
This skill helps you check and analyze Grafana alerts in the Kagenti platform.
When to Use
- User asks "what alerts are firing?"
- User wants to check alert status
- After platform changes or deployments
- During incident investigation
- When troubleshooting platform issues
What This Skill Does
- List Firing Alerts: Show all currently active alerts
- Alert Details: Display alert severity, component, and description
- Alert History: Check recent alert state changes
- Query Alert Rules: Verify alert configuration
- Test Alert Queries: Validate PromQL queries
Examples
Check Firing Alerts
```bash
# Get all currently firing alerts from Grafana
kubectl exec -n observability deployment/grafana -- \
  curl -s 'http://localhost:3000/api/alertmanager/grafana/api/v2/alerts' \
  -u admin:admin123 | python3 -c "
import sys, json

alerts = json.load(sys.stdin)
firing = [a for a in alerts if a.get('status', {}).get('state') == 'active']
print(f'Firing alerts: {len(firing)}')
for alert in firing:
    labels = alert.get('labels', {})
    annotations = alert.get('annotations', {})
    print(f\"\\n• {labels.get('alertname')} ({labels.get('severity')})\")
    print(f\"  Component: {labels.get('component')}\")
    print(f\"  Description: {annotations.get('description', 'N/A')[:100]}...\")
"
```
List All Alert Rules
```bash
# Get all configured alert rules
kubectl exec -n observability deployment/grafana -- \
  curl -s 'http://localhost:3000/api/v1/provisioning/alert-rules' \
  -u admin:admin123 | python3 -c "
import sys, json

rules = json.load(sys.stdin)
print(f'Total alert rules: {len(rules)}')
for rule in rules:
    print(f\"  • {rule.get('title')} ({rule.get('labels', {}).get('severity')})\")
"
```
Check Specific Alert Configuration
```bash
# Get configuration for a specific alert
kubectl exec -n observability deployment/grafana -- \
  curl -s 'http://localhost:3000/api/v1/provisioning/alert-rules' \
  -u admin:admin123 | python3 -c "
import sys, json

rules = json.load(sys.stdin)
alert_uid = 'prometheus-down'  # Change this to the alert UID
rule = next((r for r in rules if r.get('uid') == alert_uid), None)
if rule:
    print(f\"Alert: {rule.get('title')}\")
    print(f\"Query: {rule.get('data', [{}])[0].get('model', {}).get('expr')}\")
    print(f\"noDataState: {rule.get('noDataState')}\")
    print(f\"execErrState: {rule.get('execErrState')}\")
"
```
Test Alert Query Against Prometheus
```bash
# Test an alert's PromQL query
QUERY='up{job="kubernetes-pods",app="prometheus"} == 0'
kubectl exec -n observability deployment/grafana -- \
  curl -s -G 'http://prometheus.observability.svc:9090/api/v1/query' \
  --data-urlencode "query=${QUERY}" | python3 -m json.tool
```
Check Alert Evaluation State
```bash
# Check why an alert is firing or not firing
kubectl exec -n observability deployment/grafana -- \
  curl -s 'http://localhost:3000/api/v1/eval/rules' \
  -u admin:admin123 | python3 -m json.tool
```
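Check Alert History
The skill also covers alert history. A minimal sketch, assuming this Grafana records alert state changes as annotations (its default behavior); the `newState` field appears on alert annotations in recent Grafana versions, so treat it as an assumption:
```bash
# Sketch: recent alert state changes via the Grafana annotations API
# (assumes state transitions are recorded as alert-type annotations).
kubectl exec -n observability deployment/grafana -- \
  curl -s 'http://localhost:3000/api/annotations?type=alert&limit=20' \
  -u admin:admin123 | python3 -c "
import sys, json
from datetime import datetime

for e in json.load(sys.stdin):
    ts = datetime.fromtimestamp(e.get('time', 0) / 1000)
    print(f\"{ts:%Y-%m-%d %H:%M}  {e.get('newState', '?'):10}  {e.get('text', '')[:80]}\")
"
```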
Alert Locations in Grafana UI
Access Grafana: https://grafana.localtest.me:9443
Credentials: admin / admin123
Navigation:
- Alerting → Alert rules - View all configured alerts
- Alerting → Alert list - See firing/pending alerts
- Alerting → Silences - Manage alert silences
- Alerting → Contact points - Check notification settings
- Alerting → Notification policies - View routing rules
Common Alert Issues
False Positives
- Check `noDataState` configuration (should be `OK` for most alerts); the sketch after this list flags rules where it isn't
- Verify the query matches the actual resource type (Deployment vs StatefulSet)
- Test query returns correct results
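A minimal sketch for that first check, reusing the provisioning API from the examples above; it simply lists rules whose `noDataState` isn't `OK`:
```bash
# Sketch: flag rules whose noDataState could cause false positives.
kubectl exec -n observability deployment/grafana -- \
  curl -s 'http://localhost:3000/api/v1/provisioning/alert-rules' \
  -u admin:admin123 | python3 -c "
import sys, json

for rule in json.load(sys.stdin):
    if rule.get('noDataState') != 'OK':
        print(f\"{rule.get('uid')}: noDataState={rule.get('noDataState')}\")
"
```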
Alert Not Firing When It Should
- Verify the metric exists in Prometheus (see the sketch after this list)
- Check alert threshold is appropriate
- Verify the `for` duration isn't too long
- Check `noDataState` isn't masking the issue
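To verify the metric exists, one option is Prometheus's standard series endpoint (a sketch; the selector is just an example, substitute the one from the alert's query):
```bash
# Sketch: confirm the selector an alert queries actually matches series.
SELECTOR='up{job="kubernetes-pods",app="prometheus"}'
kubectl exec -n observability deployment/grafana -- \
  curl -s -G 'http://prometheus.observability.svc:9090/api/v1/series' \
  --data-urlencode "match[]=${SELECTOR}" | python3 -c "
import sys, json

series = json.load(sys.stdin).get('data', [])
print(f'Matching series: {len(series)}')
for s in series[:5]:
    print(f'  {s}')
"
```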
Alert Configuration Not Loading
- Restart Grafana: `kubectl rollout restart deployment/grafana -n observability`
- Check the ConfigMap applied: `kubectl get configmap grafana-alerting -n observability`
- Verify no YAML syntax errors (see the sketch after this list)
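For the YAML check, a minimal sketch that parses every file in the ConfigMap (assumes PyYAML is installed locally, e.g. `pip install pyyaml`):
```bash
# Sketch: catch YAML syntax errors in the grafana-alerting ConfigMap.
kubectl get configmap grafana-alerting -n observability -o json | python3 -c "
import sys, json
import yaml  # requires PyYAML locally

for name, body in json.load(sys.stdin).get('data', {}).items():
    try:
        yaml.safe_load(body)
        print(f'OK     {name}')
    except yaml.YAMLError as err:
        print(f'ERROR  {name}: {err}')
"
```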
Related Documentation
- Alert Runbooks
- Alert Testing Guide
- CLAUDE.md Alert Monitoring
- TODO_INCIDENTS.md - Current incident tracking
Runbooks by Alert
When an alert fires, consult its runbook:
docs/runbooks/alerts/<alert-uid>.md
Example: If "Prometheus Down" alert fires →
docs/runbooks/alerts/prometheus-down.md
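To enumerate the expected runbook path for every configured rule, a sketch reusing the provisioning API above (it assumes the `<alert-uid>` naming convention holds for all rules):
```bash
# Sketch: map each alert rule to its expected runbook path.
kubectl exec -n observability deployment/grafana -- \
  curl -s 'http://localhost:3000/api/v1/provisioning/alert-rules' \
  -u admin:admin123 | python3 -c "
import sys, json

for rule in json.load(sys.stdin):
    print(f\"{rule.get('title')}: docs/runbooks/alerts/{rule.get('uid')}.md\")
"
```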
🤖 Generated with Claude Code