openclaw-self-healing
4-tier autonomous self-healing system for OpenClaw Gateway with persistent learning, reasoning logs, and multi-channel alerts. Features Claude Code as Level 3 emergency doctor for AI-powered diagnosis and repair.
install
source · Clone the upstream repo
git clone https://github.com/Ramsbaby/openclaw-self-healing
Claude Code · Install into ~/.claude/skills/
git clone --depth=1 https://github.com/Ramsbaby/openclaw-self-healing ~/.claude/skills/ramsbaby-openclaw-self-healing-openclaw-self-healing
OpenClaw · Install into ~/.openclaw/skills/
git clone --depth=1 https://github.com/Ramsbaby/openclaw-self-healing ~/.openclaw/skills/ramsbaby-openclaw-self-healing-openclaw-self-healing
OpenClaw Self-Healing System
"The system that heals itself — or calls for help when it can't."
A 4-tier autonomous self-healing system for OpenClaw Gateway.
Architecture
Level 1: Watchdog (180s) → Process monitoring (OpenClaw built-in)
Level 2: Health Check (300s) → HTTP 200 + 3 retries
Level 3: Claude Recovery → 30min AI-powered diagnosis 🧠
Level 4: Discord Alert → Human escalation
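The hand-off from Level 2 to Level 3 can be pictured with a small shell sketch. This is an illustration under stated assumptions, not the project's `gateway-healthcheck.sh`: the function names `probe`, `check_with_retries`, and `next_action` are hypothetical, and the URL, retry count, and delay are passed in by the caller rather than read from the real configuration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the Level 2 -> Level 3 hand-off; not the project's script.

# probe URL: succeeds only if the endpoint answers with HTTP 200.
probe() {
    [ "$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1")" = "200" ]
}

# check_with_retries URL RETRIES DELAY_SECONDS
# Returns 0 as soon as one probe succeeds, 1 after RETRIES failed probes.
check_with_retries() {
    local url="$1" retries="$2" delay="$3" attempt=1
    while [ "$attempt" -le "$retries" ]; do
        if probe "$url"; then
            return 0
        fi
        # Pause between attempts, but not after the last one.
        if [ "$attempt" -lt "$retries" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# next_action URL RETRIES DELAY_SECONDS
# Prints "healthy" when a probe succeeds, "escalate" when all retries fail.
next_action() {
    if check_with_retries "$@"; then
        echo "healthy"
    else
        echo "escalate"
    fi
}
```

A caller might run `next_action "http://localhost:8080/health" 3 10` (a placeholder URL) and start the Level 3 recovery only when the result is `escalate`.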
What's Special (v2.0)
- Claude Code as Level 3 emergency doctor (described by the author as a world first)
- Persistent Learning - Automatic recovery documentation (symptom → cause → solution → prevention)
- Reasoning Logs - Explainable AI decision-making process
- Multi-Channel Alerts - Discord + Telegram support
- Metrics Dashboard - Success rate, recovery time, trending analysis
- Production-tested (verified recovery Feb 5-6, 2026)
- macOS LaunchAgent integration
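To make the Persistent Learning item more concrete, here is a minimal sketch of how a recovery record with the four fields (symptom, cause, solution, prevention) could be appended to a notes file. The function name `record_recovery`, the file location, and the entry layout are all assumptions for illustration; the project's actual format may differ.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: append one recovery record to a markdown notes file.

# record_recovery FILE SYMPTOM CAUSE SOLUTION PREVENTION
record_recovery() {
    local file="$1"
    # Create the parent directory on first use.
    mkdir -p "$(dirname "$file")"
    {
        printf '## Recovery %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
        printf '%s\n' "- Symptom: $2"
        printf '%s\n' "- Cause: $3"
        printf '%s\n' "- Solution: $4"
        printf '%s\n' "- Prevention: $5"
        printf '\n'
    } >> "$file"
}
```

Each call appends a timestamped entry, so the file grows into a chronological history that a later recovery session could read before attempting a fix.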
Quick Setup
1. Install Dependencies
brew install tmux
npm install -g @anthropic-ai/claude-code
2. Configure Environment
# Copy template to OpenClaw config directory
cp .env.example ~/.openclaw/.env
# Edit and add your Discord webhook (optional)
nano ~/.openclaw/.env
3. Install Scripts
# Copy scripts
cp scripts/*.sh ~/openclaw/scripts/
chmod +x ~/openclaw/scripts/*.sh
# Install LaunchAgent
cp launchagent/com.openclaw.healthcheck.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.openclaw.healthcheck.plist
4. Verify
# Check Health Check is running
launchctl list | grep openclaw.healthcheck
# View logs
tail -f ~/openclaw/memory/healthcheck-$(date +%Y-%m-%d).log
Scripts
| Script | Level | Description |
|---|---|---|
| `gateway-healthcheck.sh` | 2 | HTTP 200 check + 3 retries + escalation |
| | 3 | Claude Code PTY session for AI diagnosis (v1) |
| | 3 | Enhanced with learning + reasoning logs (v2) ⭐ |
| | 4 | Discord/Telegram notification on failure |
| | - | Visualize recovery statistics (NEW) |
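As a rough illustration of the Level 4 notification, the sketch below posts a message to a Discord webhook, which accepts a JSON body with a `content` field. The function names `json_escape`, `discord_payload`, and `send_discord_alert` are hypothetical and not taken from the project's scripts. The escaping only covers backslashes, double quotes, and newlines; a tool such as `jq` would be a more robust way to build the payload.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a Level 4 Discord alert; not the project's script.

# json_escape TEXT: escape backslashes and double quotes, flatten newlines.
json_escape() {
    printf '%s' "$1" | tr '\n' ' ' | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# discord_payload MESSAGE: print the JSON body for a webhook call.
discord_payload() {
    printf '{"content":"%s"}' "$(json_escape "$1")"
}

# send_discord_alert WEBHOOK_URL MESSAGE
# The webhook is optional, so an empty URL is treated as "alerts disabled".
send_discord_alert() {
    local webhook="$1"
    if [ -z "$webhook" ]; then
        return 0
    fi
    curl -fsS -X POST \
        -H 'Content-Type: application/json' \
        -d "$(discord_payload "$2")" \
        "$webhook" > /dev/null
}
```

Treating an empty webhook as a no-op matches the setup step, where the Discord webhook is marked optional.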
Configuration
All settings are configured through environment variables in ~/.openclaw/.env:
| Variable | Default | Description |
|---|---|---|
| | (none) | Discord webhook for alerts |
| | | Gateway health check URL |
| | | Restart attempts before escalation |
| | | Claude recovery timeout (30 min) |
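A script could pick up these settings along the following lines. This is only a sketch: the variable names prefixed with `EXAMPLE_` are placeholders rather than the project's real names, the fallback URL is invented, and the fallback values of 3 attempts and 1800 seconds are assumptions based on the "3 retries" and "30 min" figures mentioned in this document.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: load ~/.openclaw/.env and apply fallback values.

# load_config [ENV_FILE]
load_config() {
    local env_file
    env_file="${1:-$HOME/.openclaw/.env}"
    if [ -f "$env_file" ]; then
        set -a            # export every variable assigned while sourcing
        . "$env_file"
        set +a
    fi
    # Placeholder names and defaults; values from the file take precedence.
    : "${EXAMPLE_HEALTH_URL:=http://localhost:8080/health}"
    : "${EXAMPLE_MAX_RETRIES:=3}"
    : "${EXAMPLE_RECOVERY_TIMEOUT:=1800}"
}
```

Because the defaults use `:=`, anything already set in the environment or in the `.env` file is left untouched and only missing values are filled in.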
Testing
Test Level 2 (Health Check)
# Run manually
bash ~/openclaw/scripts/gateway-healthcheck.sh
# Expected output:
# ✅ Gateway healthy
Test Level 3 (Claude Recovery)
# Inject a config error (backup first!)
cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.bak
# Wait for Health Check to detect and escalate (~8 min)
tail -f ~/openclaw/memory/emergency-recovery-*.log
Links
- GitHub: https://github.com/Ramsbaby/openclaw-self-healing
- Docs: https://github.com/Ramsbaby/openclaw-self-healing/tree/main/docs
License
MIT License - do whatever you want with it.
Built by @ramsbaby + Jarvis 🦞