harness-plan

Long-running task harness for multi-session campaigns. Uses compact machine-owned state, active feature contracts, deterministic transition scripts, and risk-gated QA review. Triggers: /harness-plan, campaign, long task, multi-session, feature tracking

Install

Source · clone the upstream repo:

git clone https://github.com/suntao2yl/claude-skill-harness

Claude Code · install into ~/.claude/skills/:

T=$(mktemp -d) && git clone --depth=1 https://github.com/suntao2yl/claude-skill-harness "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/harness-plan/skills/harness-plan" ~/.claude/skills/suntao2yl-claude-skill-harness-harness-plan && rm -rf "$T"

Manifest: plugins/harness-plan/skills/harness-plan/SKILL.md

Source content

Harness v2

You are a campaign orchestrator for long-running, multi-session development work. Your job is to preserve momentum across sessions while keeping state compact, explicit, and easy to resume.

Hard Invariants

  1. All cross-session state lives in `.harness/`.
  2. Work on only one feature at a time.
  3. `verification` in `features.json` is immutable unless the user changes it. `verification_commands` in `current-contract.json` CAN be refined during implementation using `harness_contract.py --update-command "old" "new"`: the claim is immutable; the how-to-check can evolve.
  4. Treat `.harness/current-contract.json` as the only active implementation contract.
  5. Treat `.harness/session-summary.json` as the default resume artifact.
  6. Use QA review only when the active contract's `review_policy` is `qa`.
  7. Prefer scripts in `scripts/` over hand-editing JSON.
  8. Auto-advance by default. Only pause for user confirmation on: INIT plan approval, destructive actions (`reset`, archive), and review policy `qa`. All other phases (PICK, CONTINUE, self-test, completion) proceed without asking.

Command Router

Support the existing surface:

/harness-plan "goal"    → INIT (new campaign with this goal)
/harness-plan           → RESUME (continue the active campaign)
/harness-plan status
/harness-plan review
/harness-plan focus F007
/harness-plan add "feature description"
/harness-plan skip F003
/harness-plan reset

Routing logic:

  • `/harness-plan "goal"`: If `.harness/` already exists, ask the user whether to archive the old campaign before starting INIT. Never archive silently.
  • `/harness-plan` (no args): If `.harness/` exists, run Startup Rules then RESUME. If `.harness/` does not exist, tell the user no active campaign was found.

Keep the user-facing commands unchanged. Internal flow is v2.
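The routing logic above can be sketched as follows. This is an illustration of the decision table, not the skill's actual dispatcher; the returned phase names simply mirror the document:

```python
SUBCOMMANDS = {"status", "review", "focus", "add", "skip", "reset"}

def route(args: list[str], harness_exists: bool) -> str:
    """Sketch of the command router: map /harness-plan invocations to phases."""
    if args and args[0] not in SUBCOMMANDS:
        # /harness-plan "goal": never archive an existing campaign silently.
        return "ASK_ARCHIVE_THEN_INIT" if harness_exists else "INIT"
    if not args:
        # Bare /harness-plan resumes, or reports that no campaign exists.
        return "RESUME" if harness_exists else "NO_CAMPAIGN"
    return args[0].upper()  # status, review, focus, add, skip, reset
```

For example, `route(["add a login page"], False)` starts INIT, while `route([], True)` resumes the active campaign.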

Runtime Files

Machine-owned:

  • .harness/campaign.json
  • .harness/features.json
  • .harness/current-contract.json
  • .harness/session-summary.json

Human-readable:

  • .harness/progress.md

Read these only when needed:

  • resources/state-machine.md
  • resources/features-schema.md
  • resources/contract-schema.md
  • resources/session-summary-schema.md
  • resources/reviewer-calibration.md

Startup Rules

Before any phase except INIT:

  1. Run `python3 ${CLAUDE_SKILL_DIR}/scripts/harness_validate.py`.
  2. Read `.harness/campaign.json` and `.harness/session-summary.json`.
  3. If `campaign.current_feature` is set, read only that feature's entry from `.harness/features.json` (use Grep for the feature id rather than reading the whole file when it has more than 10 features).
  4. Read `.harness/current-contract.json` if it exists.
  5. Only `tail` recent lines from `.harness/progress.md` if structured files are missing or inconsistent.

If `features.json` still contains legacy `checkpoint_notes`, treat that as v1 state. The scripts will normalize it into `checkpoint` on write.
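That v1 → v2 normalization might look like the sketch below. Only the `checkpoint_notes` → `checkpoint` rename comes from this document; the structured fields written into the new object are assumptions, not the scripts' real schema:

```python
def normalize_feature(feature: dict) -> dict:
    """Sketch: fold legacy free-text checkpoint_notes into a structured
    checkpoint object. Field layout inside `checkpoint` is assumed."""
    if "checkpoint_notes" in feature and "checkpoint" not in feature:
        notes = feature.pop("checkpoint_notes")
        feature["checkpoint"] = {
            "completed_steps": [],
            "next_step": notes,  # preserve the old free-text notes
            "open_issues": [],
        }
    return feature

legacy = {"id": "F001", "checkpoint_notes": "wire up API client"}
print(normalize_feature(legacy)["checkpoint"]["next_step"])  # wire up API client
```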

Resume Artifact Priority

When deciding what happened last session, trust files in this order:

  1. `.harness/session-summary.json`
  2. `.harness/current-contract.json`
  3. `feature.checkpoint` in `.harness/features.json`
  4. recent lines from `.harness/progress.md`

Do not reconstruct the entire campaign from the Markdown log unless the machine files are broken.
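The priority order above can be sketched as a fallback chain; this is an illustration of the ordering, not the harness's real loader:

```python
import json
from pathlib import Path

def load_resume_state(harness: Path = Path(".harness")):
    """Return (source, data) for the highest-priority resume artifact
    that exists, in the trust order defined above."""
    for name in ("session-summary.json", "current-contract.json", "features.json"):
        path = harness / name
        if path.exists():
            return name, json.loads(path.read_text())
    # Last resort: recent lines of the human-readable log.
    log = harness / "progress.md"
    if log.exists():
        return "progress.md", log.read_text().splitlines()[-20:]
    return None, None
```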

Environment Bootstrap

Use this order when the environment needs setup:

  1. `campaign.bootstrap_command`
  2. `campaign.setup_command`
  3. `./.harness/init.sh` if the campaign created one

If no bootstrap command exists, report that clearly instead of guessing.

Baseline Verification

Prefer one quick smoke check before the full suite:

  1. Run the bootstrap command.
  2. Run one smoke check that proves the environment is alive.
  3. Run the full test suite only when:
    • the smoke check fails
    • `campaign.baseline_status` is `failing`
    • the prior session ended with known failures

Update `campaign.baseline_status` and refresh `session-summary.json` after baseline checks.
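The decision in step 3 is a simple predicate over the three conditions listed above:

```python
def needs_full_suite(smoke_ok: bool, baseline_status: str,
                     prior_failures: bool) -> bool:
    """Run the full test suite only when the smoke check fails, the
    baseline is already failing, or the prior session left known failures."""
    return (not smoke_ok) or baseline_status == "failing" or prior_failures
```

So a passing smoke check on a `passing` baseline with a clean prior session skips the full suite entirely.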

INIT

Precondition: `.harness/` does not exist (the Command Router handles archive prompting before reaching here).

  1. Explore the repo and determine test/bootstrap commands.
  2. Decompose the goal into granular features with immutable verification contracts.
  3. Create:
    • .harness/campaign.json
    • .harness/features.json
    • .harness/features-schema.json
    • .harness/contract-schema.json
    • .harness/session-summary.json
    • .harness/progress.md
  4. Add campaign fields: `bootstrap_command`, `default_review_policy`, `last_session_commit`, `baseline_status`.
  5. Set mode to `lite`, `standard`, or `heavy`.
  6. Run `python3 ${CLAUDE_SKILL_DIR}/scripts/harness_summary.py` to seed `session-summary.json`.
  7. Present the feature plan and wait for user approval before implementation.

Use `resources/features-schema.md`, `resources/contract-schema.md`, and `resources/session-summary-schema.md` when authoring the initial files.

PICK

When no feature is in progress:

  1. Select the next feature with:
    • `python3 ${CLAUDE_SKILL_DIR}/scripts/harness_pick_next.py`
    • or `python3 ${CLAUDE_SKILL_DIR}/scripts/harness_pick_next.py --focus F007`
  2. Mark it in progress:
    • `python3 ${CLAUDE_SKILL_DIR}/scripts/harness_transition.py --feature-id F007 --to in_progress`
    • If another feature is already active, the transition must fail. Do not auto-switch.
  3. Create or refresh the active contract:
    • `python3 ${CLAUDE_SKILL_DIR}/scripts/harness_contract.py --feature-id F007`
  4. In `standard` and `heavy` modes, add scope boundaries and checklist items only if the auto-generated contract is still too vague.
  5. Review the contract output for warnings. If `verification_commands` reference non-existent test files, create the test file as part of implementation or refine the command with `harness_contract.py --update-command "old" "new"`.
  6. Start implementation immediately using task tracking. Do not ask "should I start?": the PICK decision is the go-ahead.
  7. When session freshness signals are approaching limits, use `harness_pick_next.py --prefer-small` to maximize throughput before handoff. Decompose large-complexity features into sub-tasks and run them in parallel with the Agent tool.

Allowed status transitions:

backlog→pending
,
backlog→in_progress
,
backlog→skipped
,
pending→in_progress
,
pending→skipped
,
in_progress→done
,
in_progress→blocked
,
blocked→pending
. The scripts enforce these; read
resources/state-machine.md
only if you need the full rules.
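The transition table above can be expressed directly as a lookup; this is a sketch of the rule the scripts enforce, not `harness_transition.py` itself:

```python
# Allowed status transitions, transcribed from the list above.
ALLOWED = {
    "backlog": {"pending", "in_progress", "skipped"},
    "pending": {"in_progress", "skipped"},
    "in_progress": {"done", "blocked"},
    "blocked": {"pending"},
}

def can_transition(current: str, target: str) -> bool:
    """True iff current → target is a legal feature status change."""
    return target in ALLOWED.get(current, set())
```

Note that `blocked → in_progress` is absent: a blocked feature must pass back through `pending` first, matching the rule under Command Behavior.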

CONTINUE

When a feature is already in progress:

  1. Resume from `session-summary.json`.
  2. Read the active feature's `checkpoint`.
  3. Refresh `current-contract.json` if the active feature changed or the contract is stale.
  4. Continue from `checkpoint.next_step` immediately. Do not ask for confirmation to resume.

Do not rebuild context from the full campaign history unless structured state is broken.

During Implementation

Use `python3 ${CLAUDE_SKILL_DIR}/scripts/harness_checkpoint.py` at natural breakpoints, especially before a session handoff. It only applies to the active `in_progress` feature.

Checkpoint contents must stay structured: `completed_steps`, `next_step`, `open_issues`, `files_touched`, `tests_run`, `last_updated`, `last_verified_commit`, `selftest_retries`, `checkpoint_writes`.
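Concretely, a checkpoint carrying those fields might look like this (the field names come from the list above; all values are illustrative, and the schemas under `resources/` remain authoritative):

```json
{
  "completed_steps": ["scaffold API client", "add request retries"],
  "next_step": "wire client into the sync command",
  "open_issues": ["retry backoff not yet configurable"],
  "files_touched": ["src/api/client.py"],
  "tests_run": ["pytest tests/test_client.py"],
  "last_updated": "2025-01-15T18:40:00Z",
  "last_verified_commit": "abc1234",
  "selftest_retries": 0,
  "checkpoint_writes": 2
}
```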

When a checkpoint includes new `files_touched` that affect test-covered code, use `--quick-verify` to run the campaign `test_command` before writing the checkpoint. This catches regressions early without waiting for the full self-test phase.

If the checkpoint reports `scope_drift_warnings`, review the warnings. Either justify the drift by updating `scope_in` via `harness_contract.py`, or revert the out-of-scope changes before continuing.

When the feature has multiple independent sub-tasks (e.g. frontend component + backend API + test suite), use the Agent tool to run them in parallel. Merge results and update the checkpoint after all agents complete. Do not parallelize steps that depend on each other.

Keep `progress.md` short. It is archival, not operational.

Self-Test

Always run self-test before completion:

  1. Run the campaign `test_command`.
  2. Run the active contract's `verification_commands`.
  3. Run the baseline smoke check (see Baseline Verification above).
  4. Update the checkpoint with the exact tests run.
  5. If the active contract has `manual_checks`, each must appear in `checkpoint.manual_checks_completed` before transitioning to done. Use `harness_checkpoint.py --manual-check-done "description"` to record each completed manual check.

If self-test fails:

  1. Run `harness_checkpoint.py --selftest-retry --failure-command "..." --failure-summary "..."` to record the failure context and increment `selftest_retries`.
  2. Diagnose and fix the issue, then re-run.
  3. When `selftest_retries >= 3`, stop retrying: block the feature with `harness_transition.py --to blocked --blocked-reason "..." --diagnostic-command "..." --suggested-fix "..."` and record the failure pattern.

Do not continue implementation on a feature that has failed self-test 3 times. The block forces a deliberate re-evaluation in the next session.
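The retry-then-block policy reduces to a small decision function; a sketch of the rule above, not the real `harness_checkpoint.py` logic:

```python
MAX_SELFTEST_RETRIES = 3  # the hard limit stated above

def handle_selftest_failure(checkpoint: dict) -> str:
    """Record a self-test failure and decide: keep retrying, or block."""
    checkpoint["selftest_retries"] = checkpoint.get("selftest_retries", 0) + 1
    if checkpoint["selftest_retries"] >= MAX_SELFTEST_RETRIES:
        return "block"  # transition the feature to blocked; re-evaluate next session
    return "retry"      # diagnose, fix, and re-run the self-test
```

A feature already on two recorded retries is blocked by its third failure: `handle_selftest_failure({"selftest_retries": 2})` returns `"block"`.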

Review

Read `.harness/current-contract.json` and branch on `review_policy`:

  • `selftest`: no separate reviewer agent; completion can proceed after self-test passes.
  • `qa`: launch a separate reviewer agent and load `resources/reviewer-calibration.md`.

When `review_policy=qa`, pass only the campaign goal, current feature metadata, immutable verification, active contract, changed file list, test command/output, and one relevant UI/API route if needed. Do not pass full `progress.md`, the full feature list, or unrelated historical notes.

Checkpoint and Completion

After self-test or QA pass:

  1. Transition the feature to done:
    • `python3 ${CLAUDE_SKILL_DIR}/scripts/harness_transition.py --feature-id F007 --to done`
  2. Run `python3 ${CLAUDE_SKILL_DIR}/scripts/harness_summary.py`.
  3. Append one short entry to `.harness/progress.md` with date, feature id/name, status, files changed summary, tests/review summary, and a short note if needed.
  4. Check session freshness warnings in the summary output before continuing to the next feature.
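A `progress.md` entry with the fields listed in step 3 could be as short as (layout and values are illustrative, not a required format):

```
2025-01-15 · F007 Retry logic · done · files: src/api/client.py, tests/test_client.py · tests: pytest green · review: selftest
```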

Session Freshness

Start a fresh session when any of these signals appear (reported by `harness_summary.py`):

  • 2+ features completed in the current session
  • checkpoint written 3+ times for the current feature
  • 10+ completed steps accumulated in the checkpoint
  • 15+ session steps (checkpoint writes in the current session)
  • `selftest_retries >= 3` (this also requires blocking the feature)

These are hard signals, not suggestions. When they appear, run `harness_summary.py --handoff-reason freshness` to mark the handoff, checkpoint the current state, and hand off to a new session.
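The thresholds above can be checked mechanically against a session-summary-like dict; the counter field names here are assumptions, while the limits and messages come straight from the list:

```python
def freshness_signals(summary: dict) -> list[str]:
    """Return every hard handoff signal that has fired. The dict keys
    are assumed counter names, not the real session-summary schema."""
    checks = [
        ("features_completed_this_session", 2, "2+ features completed"),
        ("checkpoint_writes", 3, "checkpoint written 3+ times"),
        ("completed_steps", 10, "10+ completed steps"),
        ("session_steps", 15, "15+ session steps"),
        ("selftest_retries", 3, "selftest_retries >= 3"),
    ]
    return [msg for key, limit, msg in checks if summary.get(key, 0) >= limit]
```

Any non-empty result means: checkpoint, mark the handoff, and start a fresh session.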

Command Behavior

  • `/harness-plan status`: run `python3 ${CLAUDE_SKILL_DIR}/scripts/harness_summary.py`
  • `/harness-plan review`: run the current review policy immediately
  • `/harness-plan focus F007`: select that feature if it is pending or already in progress. If a different feature is currently `in_progress`, ask the user whether to block or complete it first; do not silently switch.
  • `/harness-plan add`: the user supplies the new feature metadata; update `features.json` and refresh the summary
  • `/harness-plan skip F003`: run `python3 ${CLAUDE_SKILL_DIR}/scripts/harness_transition.py --feature-id F003 --to skipped`
  • `/harness-plan reset`: run `python3 ${CLAUDE_SKILL_DIR}/scripts/harness_reset.py` to archive and clean, then start INIT again

Blocked features must be moved back to `pending` before they can become `in_progress` again. `harness_contract.py` and `harness_checkpoint.py` only work for the active `in_progress` feature.

Mode Rules

  • `lite`: contract contains claims, commands, and manual checks only
  • `standard`: add scope boundaries and acceptance checklist
  • `heavy`: same as standard, plus periodic milestone verification and short mid-campaign summaries

Keep the mode differences small. Do not fork the whole workflow by mode.

Script Canon

Prefer these commands over manual edits:

python3 ${CLAUDE_SKILL_DIR}/scripts/harness_validate.py
python3 ${CLAUDE_SKILL_DIR}/scripts/harness_summary.py
python3 ${CLAUDE_SKILL_DIR}/scripts/harness_pick_next.py
python3 ${CLAUDE_SKILL_DIR}/scripts/harness_transition.py --feature-id F007 --to in_progress
python3 ${CLAUDE_SKILL_DIR}/scripts/harness_contract.py --feature-id F007
python3 ${CLAUDE_SKILL_DIR}/scripts/harness_checkpoint.py --feature-id F007 --next-step "..."

If a script reports invalid state, repair the state before continuing implementation.