Software_development_department resume-from

Restores cognitive state from an atomic checkpoint in .tasks/checkpoints/[task_id].md, enabling instant recovery at the exact point of failure without re-running the full task.

install

source · Clone the upstream repo

git clone https://github.com/tranhieutt/software_development_department

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/tranhieutt/software_development_department "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/resume-from" ~/.claude/skills/tranhieutt-software-development-department-resume-from && rm -rf "$T"

manifest: .claude/skills/resume-from/SKILL.md

source content

Resume From Checkpoint

Restore the working context of a specific task from its atomic checkpoint at

.tasks/checkpoints/[task_id].md

, then continue execution from the exact next step — without re-running completed work.

Steps

1. Validate argument

$ARGUMENTS

must contain a

task_id

. If missing or empty:

❌ Usage: /resume-from <task_id>
   Example: /resume-from 042
   Example: /resume-from auth-api

Available checkpoints:
[Run: ls .tasks/checkpoints/ and list files excluding .gitkeep]

Stop here if no

task_id

is provided.

2. Load checkpoint

Read

.tasks/checkpoints/[task_id].md

If the file does not exist:

❌ No checkpoint found for task: [task_id]
   Expected: .tasks/checkpoints/[task_id].md

Available checkpoints:
[List files in .tasks/checkpoints/ excluding .gitkeep]

Tip: Run /save-state [task_id] first to create a checkpoint.

Stop here if file is missing.

3. Parse and surface checkpoint

Extract the following fields from the checkpoint and display them clearly:

🔁 Resuming task: [task_id]
   Agent:         [agent_id]
   Saved at:      [saved_at]
   Retry count:   [retry_count]

📄 Output Snapshot (last known state):
   [output_snapshot content]

✅ Completed Steps:
   [completed steps list]

⏭️  Next Step:
   [next_step content]

❓ Open Questions:
   [open_questions content — or "None" if empty]

📁 Files Modified So Far:
   [files_modified list]

4. Apply exponential backoff if retrying

Check

retry_count

in the checkpoint frontmatter:

```
retry_count = 0
```
→ proceed immediately, no wait
```
retry_count = 1
```
→ wait 2s before continuing
```
retry_count = 2
```
→ wait 4s before continuing
```
retry_count = 3
```
→ wait 8s before continuing
```
retry_count >= 4
```
→ surface a warning:

⚠️  This task has failed [retry_count] times.
    Continuing, but consider escalating to a senior agent or the user
    if the same error recurs.

Then increment

retry_count

and update

backoff_next_s

(double the previous value, max 64s) in the checkpoint file before proceeding.

5. Resume execution

Hand off context to the appropriate agent (

agent_id

from checkpoint) with the following instruction:

"You are resuming task
[task_id]
. The completed steps and output snapshot above are already done — do NOT repeat them. Your only job is to execute the Next Step listed above and continue from there."

6. Update checkpoint on success

When the task completes successfully, update

.tasks/checkpoints/[task_id].md

Set
```
status: completed
```
Set
```
completed_at: [ISO timestamp]
```
Append to
```
## Completed Steps
```

Print:

✅ Task [task_id] completed successfully.
   Checkpoint updated → .tasks/checkpoints/[task_id].md (status: completed)

Checkpoint lifecycle

/save-state [task_id]   → creates  .tasks/checkpoints/[task_id].md (status: in_progress)
/resume-from [task_id]  → reads checkpoint, increments retry_count, resumes
                        → on success: sets status: completed

Completed checkpoints are kept for audit — they are never auto-deleted. To list all checkpoints:

ls .tasks/checkpoints/