Skills boot-resume
git clone https://github.com/openclaw/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/belugary/boot-resume" ~/.claude/skills/openclaw-skills-boot-resume && rm -rf "$T"
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/belugary/boot-resume" ~/.openclaw/skills/openclaw-skills-boot-resume && rm -rf "$T"
skills/belugary/boot-resume/SKILL.mdBoot Resume
Zero-cooperation session recovery after gateway restart. No checkpoints, no hooks, no agent involvement — just reads the evidence and picks up where it left off.
Problem
When the gateway restarts, any in-flight agent turn dies mid-execution. Session history is preserved on disk, but the agent doesn't know it needs to continue. Users must manually tell each interrupted session to resume.
Checkpoint-based approaches require the agent to save state before dying. Unexpected kills (SIGKILL, OOM, power loss) bypass this entirely.
Solution
A deterministic shell script runs on every gateway start via systemd
ExecStartPost. No LLM in the detection loop.
┌─────────┐ ┌──────────┐ ┌──────────┐ │ Scan │ ──▶ │ Detect │ ──▶ │ Resume │ │sessions │ │ JSONL │ │ cron add │ │ .json │ │ tail │ │--sys-evt │ └─────────┘ └──────────┘ └──────────┘
- Scan — finds sessions updated within the last 20 minutes
- Detect — reads the last 5 JSONL lines to classify session state
- Resume — schedules a one-shot
to inject a continuation promptopenclaw cron add --system-event --wake now
Key insight: the JSONL session files already contain all the evidence needed to detect an interruption — no pre-save required.
Detection Rules
| Last JSONL Entry | Status | Meaning |
|---|---|---|
| | Tool returned, agent never processed it |
(empty text) | | Tool call dispatched, killed before response |
(non-trivial) | | Message received, never processed |
(with text) | | Session ended normally — skip |
(trivial: "ok", emoji) | | No meaningful request pending — skip |
Install
One command
bash {baseDir}/install.sh
Deploys three components:
→boot-resume-check.sh~/.openclaw/workspace/scripts/
→ systemd drop-in (triggers script on every gateway start)boot-resume.conf
→ systemd user service (triggers script on system wake from sleep/suspend)boot-resume-wake.service
Manual
cp {baseDir}/scripts/boot-resume-check.sh ~/.openclaw/workspace/scripts/ chmod +x ~/.openclaw/workspace/scripts/boot-resume-check.sh mkdir -p ~/.config/systemd/user/openclaw-gateway.service.d cp {baseDir}/templates/boot-resume.conf ~/.config/systemd/user/openclaw-gateway.service.d/ cp {baseDir}/templates/boot-resume-wake.service ~/.config/systemd/user/ systemctl --user daemon-reload systemctl --user enable boot-resume-wake.service
Verify
systemctl --user restart openclaw-gateway sleep 20 cat /tmp/openclaw/boot-resume.log
Expected output:
[boot-resume] now=... cut=... (20min window) [boot-resume] scanning agent: main [boot-resume] candidates: 0 (agent=main) [boot-resume] done
Test
- Send a message that triggers a multi-step task (web search, code analysis, etc.)
- Wait for the agent to start processing (tool calls in flight)
systemctl --user restart openclaw-gateway- Agent resumes automatically within ~35 seconds
Slash Command
When invoked as
/boot-resume, run the script with --no-wait to skip the startup delay:
bash {baseDir}/scripts/boot-resume-check.sh --no-wait
Report results to the user: which sessions were resumed, or that none were found.
Configuration
| Variable | Default | Description |
|---|---|---|
| | How far back to scan for interrupted sessions |
| | Delay before injecting the resume event |
Edit at the top of
scripts/boot-resume-check.sh.
Features
- Dual trigger — covers both gateway restart (ExecStartPost) and system sleep/wake (systemd sleep.target)
- Multi-agent support — scans all agents under
, not just~/.openclaw/agents/main - Smart filtering — skips system, heartbeat, cron, and subagent sessions automatically
- Deduplication — respects
to avoid double-resuming planned restartsrestart-resume.json - Log rotation — auto-truncates log at 1000 lines
- Error visibility — Python and cron errors are logged, not swallowed
- Unique job names — timestamp-based to prevent conflicts on rapid restarts
Comparison
| Approach | Pre-save required | Survives SIGKILL | LLM-free |
|---|---|---|---|
| Checkpoint / snapshot files | Yes | No | No |
| Pre-restart state dump | Yes | No | No |
| Session history replay | Yes | Partial | No |
| Post-hoc JSONL detection (this skill) | No | Yes | Yes |
Logs
Output:
/tmp/openclaw/boot-resume.log
Each run logs: timestamp, scan window, candidate count, per-session status, and whether a resume job was armed.
Limitations
- 20-minute scan window (configurable) — sessions idle longer than this are not resumed
- Resume prompt is generic — the agent relies on session context for continuity
- Telegram/Discord message queues already handle unprocessed incoming messages — this skill targets mid-execution interruptions
- Requires systemd (Linux); macOS users need manual launchd setup
Uninstall
rm ~/.config/systemd/user/openclaw-gateway.service.d/boot-resume.conf systemctl --user disable boot-resume-wake.service 2>/dev/null rm ~/.config/systemd/user/boot-resume-wake.service systemctl --user daemon-reload rm ~/.openclaw/workspace/scripts/boot-resume-check.sh rm -rf ~/.openclaw/workspace/skills/boot-resume