```shell
# Clone the full repo
git clone https://github.com/ai-analyst-lab/ai-analyst

# Or copy just this skill into ~/.claude/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/ai-analyst-lab/ai-analyst "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/resume-pipeline" ~/.claude/skills/ai-analyst-lab-ai-analyst-resume-pipeline && rm -rf "$T"
```
`.claude/skills/resume-pipeline/skill.md`

# Skill: Resume Pipeline
## Purpose

Resume an interrupted analysis pipeline by reading `working/pipeline_state.json`, determining which agents completed, and continuing from the next READY agents using the DAG walker.
## When to Use

Invoke as `/resume-pipeline` when:
- A previous analysis session was interrupted (context limit, user break, connection issue)
- The user wants to continue an analysis started in a prior conversation
- Pipeline state file exists from a partially completed run
- A pipeline failed and the underlying issue has been fixed
## Instructions

### Step 1: Locate pipeline state (per-run directory aware)
Search for the most recent pipeline state in this order:
- Per-run directory (preferred): Check `working/latest/pipeline_state.json` (symlink to latest run). If found, set `RUN_DIR` from the symlink target and proceed to Step 2.
- Specific run: If the user passed a run ID (e.g., `/resume-pipeline 2026-02-23_acme-analytics_why-revenue-dropped`), look in `working/runs/{id}/pipeline_state.json`. Set `RUN_DIR` accordingly.
- Legacy location: Check `working/pipeline_state.json` (pre-run-directory pipelines). If found, read it and proceed to Step 2 without a `RUN_DIR`.
- No state found: Fall back to artifact scanning (Step 1b).
Pipeline state fields to extract (V2):
- `run_id`: identifies this run
- `run_dir`: per-run directory path (may be absent for legacy runs)
- `dataset`: active dataset
- `question`: the business question
- `status`: `running`, `paused`, or `failed`
- `agents`: map of agent name to agent state (status, output_file, timestamps)
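For illustration, a V2 state file might look like this (the field names come from the list above; the specific values, question text, and output path are hypothetical):

```json
{
  "schema_version": 2,
  "run_id": "2026-02-23_acme-analytics_why-revenue-dropped",
  "run_dir": "working/runs/2026-02-23_acme-analytics_why-revenue-dropped",
  "dataset": "acme-analytics",
  "question": "Why did revenue drop?",
  "status": "paused",
  "started_at": "2026-02-23T14:02:11Z",
  "updated_at": "2026-02-23T14:45:03Z",
  "agents": {
    "question-framing": {
      "status": "complete",
      "output_file": "working/question_brief.md",
      "started_at": "2026-02-23T14:02:30Z",
      "completed_at": "2026-02-23T14:05:10Z"
    },
    "hypothesis": { "status": "pending" }
  }
}
```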
### Step 1a: V1-to-V2 state migration
After loading the state file and before any processing, check whether the state uses the V1 (step-number keyed) format and migrate it to V2 if needed.
```python
from helpers.pipeline_state import detect_schema_version, migrate_v1_to_v2

if detect_schema_version(state) < 2:
    # Resolve dataset from active.yaml or fall back to "unknown"
    dataset = state.get("dataset") or resolve_active_dataset() or "unknown"
    state = migrate_v1_to_v2(state, dataset=dataset)
    # Write migrated state back to disk (same location it was read from)
    write_pipeline_state(state_path, state)
    print("Migrated pipeline state from V1 -> V2 format")
```
Migration details (handled by `helpers/pipeline_state.py`):

- `pipeline_id` (ISO timestamp) -> `started_at`; generate `run_id` from date + dataset + question slug
- `steps.{n}.agent` keys -> `agents.{agent_name}` keys
- `steps.{n}.output_files[0]` -> `agents.{name}.output_file` (take first)
- Status values are preserved as-is (compatible between V1 and V2)
- Adds `schema_version: 2` and `updated_at` set to current time
- If any V1 step had `status: running`, the pipeline-level status becomes `paused` (the run was interrupted)
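A rough sketch of what such a migration could look like (the real implementation lives in `helpers/pipeline_state.py`; the slug logic and exact field handling here are assumptions based on the bullets above):

```python
import datetime

def migrate_v1_to_v2(state: dict, dataset: str) -> dict:
    """Sketch of the V1 (step-number keyed) -> V2 (agent keyed) migration."""
    now = datetime.datetime.now(datetime.timezone.utc).isoformat()
    started = state.get("pipeline_id")  # V1 stored an ISO timestamp here
    question_slug = state.get("question", "question").lower().replace(" ", "-")
    pipeline_status = state.get("status", "paused")
    agents = {}
    for step in state.get("steps", {}).values():
        entry = {"status": step["status"]}  # status values carry over as-is
        if step.get("output_files"):
            entry["output_file"] = step["output_files"][0]  # take first
        agents[step["agent"]] = entry
        if step["status"] == "running":
            pipeline_status = "paused"  # the run was interrupted mid-step
    return {
        "schema_version": 2,
        "run_id": f"{(started or now)[:10]}_{dataset}_{question_slug}",
        "dataset": dataset,
        "question": state.get("question"),
        "status": pipeline_status,
        "started_at": started,
        "updated_at": now,
        "agents": agents,
    }
```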
After migration, continue with the V2 fields listed above.
### Step 1b: Artifact-based fallback (no pipeline_state.json)
If no state file exists, scan `working/` and `outputs/` for artifacts:
| Agent | Expected Artifact | Directory |
|---|---|---|
| question-framing | | |
| hypothesis | | |
| data-explorer | | |
| source-tieout | | |
| descriptive-analytics | | |
| root-cause-investigator | | |
| validation | | |
| opportunity-sizer | | |
| story-architect | | |
| narrative-coherence-reviewer | | |
| chart-maker | | |
| visual-design-critic | | |
| storytelling | | |
| deck-creator | | |
Walk the list top to bottom. If an artifact exists and looks complete (not empty, no "NEEDS REVISION" markers), mark that agent as completed. Reconstruct a pipeline_state.json from this scan.
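The scan-and-reconstruct step might be sketched like this (the `EXPECTED_ARTIFACTS` map is a placeholder; the actual agent-to-artifact filenames come from the table above):

```python
from pathlib import Path

# Placeholder: populate from the agent/artifact table above.
EXPECTED_ARTIFACTS: dict[str, Path] = {
    # "question-framing": Path("working/<artifact>"),
}

def reconstruct_state(expected: dict[str, Path] = EXPECTED_ARTIFACTS) -> dict:
    """Mark an agent complete only if its artifact exists and looks complete."""
    agents = {}
    for agent, path in expected.items():
        done = (
            path.exists()
            and path.stat().st_size > 0                   # not empty
            and "NEEDS REVISION" not in path.read_text()  # no revision marker
        )
        agents[agent] = {"status": "complete" if done else "pending"}
    return {"schema_version": 2, "status": "paused", "agents": agents}
```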
### Step 2: Compute READY set from DAG
- Read `agents/registry.yaml` to build the dependency graph
- For each agent in the registry, check `state["agents"][agent_name]["status"]`:
  - If status is `complete`, `skipped`, or `degraded` → leave it
  - If status is `failed` → reset to `pending` (will be retried)
  - If status is `in_progress` or `running` → reset to `pending` (was interrupted)
- Compute READY agents: those with `status: pending` whose every dependency is `complete`
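The reset-and-ready computation above can be sketched as (a minimal sketch; the dependency map would be built from `agents/registry.yaml`, here it is passed in as a plain dict):

```python
def compute_ready(agents: dict, deps: dict) -> list[str]:
    """Reset interrupted/failed agents, then return those whose deps are complete."""
    for a in agents.values():
        if a["status"] in ("failed", "in_progress", "running"):
            a["status"] = "pending"  # failed: retry; in_progress/running: interrupted
    return [
        name
        for name, a in agents.items()
        if a["status"] == "pending"
        and all(agents[d]["status"] == "complete" for d in deps.get(name, []))
    ]
```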
### Step 3: Build context summary
Read each completed agent's output files and extract a brief summary:
- From question brief: the framed question and decision context
- From analysis report: key findings (top 3)
- From storyboard: narrative beats and visual plan
- From validation: confidence grade
Compile into a context block for the resumed session.
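Compiling the block might look like this (a trivial sketch; `build_context_block` and the example summary strings are illustrative, not part of the skill):

```python
def build_context_block(summaries: dict[str, str]) -> str:
    """Compile per-agent one-line summaries into a single context block."""
    lines = ["## Resumed-session context"]
    for agent, summary in summaries.items():
        lines.append(f"- {agent}: {summary}")
    return "\n".join(lines)
```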
### Step 4: Present resume plan
Display:
```
Resuming pipeline {run_id}

Completed agents: {count}
- {agent_name}: {one-line summary from outputs}
- ...

Failed/interrupted agents (will retry): {count}
- {agent_name}: {error or "interrupted"}

Next READY agents: {list}

Resume execution?
```
### Step 5: Resume via DAG walker
On confirmation:
- Update pipeline_state.json: set `status: running`, reset failed/running agents to `pending`
- Hand off to the DAG walker in the run-pipeline skill (Phase 2)
- The walker picks up from the READY set and continues tier-by-tier
- All existing completed outputs are preserved; only pending agents execute
## Special Cases
- Storyboard with "NEEDS ADDITIONS": mark story-architect as `pending`, not completed
- Partial chart generation: count generated charts vs. storyboard beats; if incomplete, mark chart-maker as `pending`
- Source tie-out FAIL: mark as `failed`. The user must investigate before resuming
- Stale data (>24h gap): warn that underlying data may have changed since the original run
## Limitations
- Context gap: Resuming restores artifacts but not conversational reasoning. The resumed analysis may be slightly less coherent than a single-session run.
- No partial step recovery: If an agent was interrupted mid-execution, the entire agent must re-run.
- Pipeline state is authoritative: If pipeline_state.json and artifacts disagree, trust pipeline_state.json.