# Session-orchestrator evolve

Clone the repository:

```bash
git clone https://github.com/Kanevry/session-orchestrator
```

Or install the skill directly into Claude Code:

```bash
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/Kanevry/session-orchestrator "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/skills/evolve" ~/.claude/skills/kanevry-session-orchestrator-evolve \
  && rm -rf "$T"
```

Source: `skills/evolve/SKILL.md`

**Platform Note:** State files use the platform's native directory: `.claude/` (Claude Code), `.codex/` (Codex CLI), or `.cursor/` (Cursor IDE). Shared metrics live in `.orchestrator/metrics/` (v2) with fallback to `<state-dir>/metrics/` for pre-v2.0 legacy data. See `skills/_shared/platform-tools.md`.
# Evolve Skill

## Phase 0: Bootstrap Gate

Read `skills/_shared/bootstrap-gate.md` and execute the gate check. If the gate is CLOSED, invoke `skills/bootstrap/SKILL.md` and wait for completion before proceeding. If the gate is OPEN, continue to Phase 1.
<HARD-GATE>
Do NOT proceed past Phase 0 if GATE_CLOSED. There is no bypass. Refer to `skills/_shared/bootstrap-gate.md` for the full HARD-GATE constraints.
</HARD-GATE>
## Phase 1: Config & Data Loading

### 1.1 Read Session Config

Read and parse Session Config per `skills/_shared/config-reading.md`. Store the result as `$CONFIG`.

### 1.2 Check Persistence

Extract `persistence` from `$CONFIG`. If `persistence` is `false`, abort with the message: "Learnings require persistence to be enabled in Session Config. Add `persistence: true` to your Session Config block (CLAUDE.md for Claude Code, AGENTS.md for Codex CLI)."
### 1.3 Determine Mode

Read the mode from `$ARGUMENTS`:

- If empty or not provided, default to `analyze`
- Valid modes: `analyze`, `review`, `list`
- If an invalid mode is provided, report an error and list the valid modes
1.4 Load Data
Lazy-create defensive (#185): If
.orchestrator/metrics/learnings.jsonl does not exist (pre-#185 repo or bootstrap skipped), create an empty file and emit an info log — do NOT hard-fail:
LEARNINGS_FILE=".orchestrator/metrics/learnings.jsonl" if [[ ! -f "$LEARNINGS_FILE" ]]; then mkdir -p "$(dirname "$LEARNINGS_FILE")" : > "$LEARNINGS_FILE" echo "info(#185): auto-created $LEARNINGS_FILE (was missing)" >&2 fi
This defensive step is idempotent and cheap — it ensures
/evolve analyze|review|list never fails because of a missing artifact file.
- Read `.orchestrator/metrics/sessions.jsonl` (session history). If it does not exist, check `<state-dir>/metrics/sessions.jsonl` as a legacy fallback (where `<state-dir>` is `.claude/`, `.codex/`, or `.cursor/` per platform). If neither exists, warn: "No session history found. Run at least one session first."
- Read `.orchestrator/metrics/learnings.jsonl` if it exists. If not found, check `<state-dir>/metrics/learnings.jsonl` as a legacy fallback.
- Count existing learnings, noting any where `expires_at` < current date (expired)
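The counting step above can be sketched with jq. This is an illustrative sketch, not the skill's actual implementation — it assumes `jq` is on PATH, and the sample records and temp path stand in for the real metrics file:

```shell
# Illustrative sketch: count total and expired learnings with jq.
T=$(mktemp -d)
F="$T/learnings.jsonl"
printf '%s\n' \
  '{"id":"a","type":"fragile-file","subject":"src/a.ts","confidence":0.5,"expires_at":"2020-01-01"}' \
  '{"id":"b","type":"scope-guidance","subject":"optimal-scope","confidence":0.8,"expires_at":"2999-01-01"}' \
  > "$F"
TODAY=$(date -u +%Y-%m-%d)
TOTAL=$(wc -l < "$F" | tr -d ' ')
# ISO 8601 dates compare correctly as strings, so lexicographic < works here.
EXPIRED=$(jq -r --arg today "$TODAY" 'select(.expires_at < $today) | .id' "$F" | wc -l | tr -d ' ')
echo "total=$TOTAL expired=$EXPIRED"
rm -rf "$T"
```

Because each JSONL line is a standalone JSON document, plain `jq` (without `-s`) already iterates over entries one by one.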
## Phase 2: Mode Dispatch

Route based on mode:

- `analyze` → Phase 3
- `review` → Phase 4
- `list` → Phase 5
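The dispatch table can be sketched as a shell `case`; `MODE` is hardcoded here for illustration, whereas the skill derives it from `$ARGUMENTS`:

```shell
# Hypothetical dispatch sketch; MODE would come from $ARGUMENTS in the skill.
MODE="analyze"
case "$MODE" in
  analyze) TARGET="Phase 3" ;;
  review)  TARGET="Phase 4" ;;
  list)    TARGET="Phase 5" ;;
  *)       TARGET="invalid" ;;
esac
echo "dispatch: $MODE -> $TARGET"
```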
## Phase 3: Analyze Mode (default)

Extract learnings from session history.

**Vault Integration:** If `vault-integration.enabled` is `true` in Session Config, confirmed learnings are mirrored to the configured Obsidian vault after the atomic write (Step 3.5, step 9). See `docs/session-config-reference.md` for the `vault-integration` config block.
### Step 3.1: Read Session Data

- Read all entries from `.orchestrator/metrics/sessions.jsonl` (or `<state-dir>/metrics/sessions.jsonl` if the v2 path does not exist — see the Phase 1.4 fallback)
- Parse each JSONL line as JSON
- Sort by `completed_at` descending (most recent first)
- If no sessions are found, abort: "No session data available. Complete at least one session before running evolve."
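The read-and-sort step can be sketched with jq's `sort_by` (a sketch with illustrative sample records, assuming `jq` is available):

```shell
T=$(mktemp -d)
F="$T/sessions.jsonl"
printf '%s\n' \
  '{"session_id":"s1","completed_at":"2026-01-01T00:00:00Z"}' \
  '{"session_id":"s2","completed_at":"2026-02-01T00:00:00Z"}' \
  > "$F"
# Slurp (-s) the JSONL into an array, sort by completed_at, newest first.
LATEST=$(jq -r -s 'sort_by(.completed_at) | reverse | .[0].session_id' "$F")
echo "latest=$LATEST"
rm -rf "$T"
```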
### Step 3.2: Pattern Extraction

For each of the 6 learning types, apply these heuristics:

**1. fragile-file (`type: fragile-file`)**

- Look at wave data: if the same file appears in 3+ waves' `files_changed` within a session, it is fragile
- Cross-session: if a file appears in 3+ different sessions' `files_changed`, flag it
- Subject = file path (relative to project root)
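The cross-session heuristic can be sketched with jq's `group_by` — a sketch under the assumption that `jq` is available and that `files_changed` records look roughly like the samples below:

```shell
T=$(mktemp -d)
F="$T/sessions.jsonl"
printf '%s\n' \
  '{"session_id":"s1","files_changed":["src/lib/auth.ts","src/app.ts"]}' \
  '{"session_id":"s2","files_changed":["src/lib/auth.ts"]}' \
  '{"session_id":"s3","files_changed":["src/lib/auth.ts","README.md"]}' \
  > "$F"
# Dedupe per session, flatten, group identical paths, keep those seen in 3+ sessions.
FRAGILE=$(jq -r -s '[.[] | .files_changed | unique | .[]] | group_by(.) | map(select(length >= 3) | .[0]) | .[]' "$F")
echo "fragile: $FRAGILE"
rm -rf "$T"
```

The per-session `unique` matters: without it, a file changed repeatedly inside one session would be miscounted as multi-session fragility.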
**2. effective-sizing (`type: effective-sizing`)**

- Compare `total_agents` and `total_waves` across session types
- Calculate the average agents per wave for each session type
- Subject = canonical identifier like `deep-session-sizing` or `feature-session-sizing`
- Insight = "Deep sessions average X agents across Y waves" or "Feature sessions work well with X agents/wave"
**3. recurring-issue (`type: recurring-issue`)**

- Look at `agent_summary` — if `failed` or `partial` > 0 across multiple sessions, flag it
- Check wave `quality` fields — repeated failures indicate recurring issues
- Subject = issue pattern identifier (e.g., "test-failures-in-wave-execution", "lint-regressions")
**4. scope-guidance (`type: scope-guidance`)**

- Cross-reference `effectiveness.planned_issues` vs `effectiveness.completion_rate`
- Skip sessions that lack the `effectiveness` field (early sessions may not have it)
- If completion_rate is consistently 1.0 with N issues, note "N issues per session works well"
- If completion_rate < 0.7, note "scope was too large"
- Subject = `optimal-scope-per-session-type`
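The skip-then-average logic can be sketched in jq — note how `select(.effectiveness)` drops the early sessions that lack the field, exactly as the bullet above requires (sample records are illustrative):

```shell
T=$(mktemp -d)
F="$T/sessions.jsonl"
printf '%s\n' \
  '{"effectiveness":{"planned_issues":3,"completion_rate":1.0}}' \
  '{"effectiveness":{"planned_issues":5,"completion_rate":0.6}}' \
  '{"session_id":"early-session-without-effectiveness"}' \
  > "$F"
# Skip sessions without the effectiveness field, then average completion_rate.
AVG=$(jq -s '[.[] | select(.effectiveness) | .effectiveness.completion_rate] | add / length' "$F")
echo "avg completion_rate=$AVG"
rm -rf "$T"
```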
**5. deviation-pattern (`type: deviation-pattern`)**

Ownership Reference: See `skills/_shared/state-ownership.md`. evolve has read-only access to STATE.md.

- Read `<state-dir>/STATE.md` if it exists and check the `## Deviations` section
- Cross-reference with session duration vs planned waves
- Subject = pattern name (e.g., "scope-creep-in-feature-sessions", "underestimated-complexity")
**6. stagnation-class-frequency (`type: stagnation-class-frequency`)**

- Read `stagnation_events` from the most recent 5 sessions in `sessions.jsonl` (skip sessions lacking the field — they predate #84).
- For each `(file, error_class)` pair appearing in ≥2 sessions, extract a candidate:
  - Subject = `<file>:<error_class>` (e.g., `skills/wave-executor/wave-loop.md:edit-format-friction`)
  - Insight = "File <X> has <error_class> stagnation in <N> recent sessions — candidate for pre-edit grounding (#85)."
  - Evidence = "<N> sessions with stagnation_events for this file/class"
- These learnings feed #85 (pre-edit grounding injection) when it ships — high-frequency pairs trigger grounding.
### Step 3.2b: Zero Patterns Check
If no patterns were extracted across all 6 types, report: "No patterns found in session history. This can happen with very few sessions or sessions that lack detailed wave/agent data." and skip to end (do not proceed to AskUserQuestion).
Step 3.3: Deduplicate Against Existing Learnings
For each extracted pattern, check if a learning with same
type + subject already exists in learnings.jsonl:
- If exists: propose confidence update (+0.15 if confirmed by new evidence, -0.2 if contradicted)
- If new: propose as new learning with confidence 0.5
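The two confidence transforms can be sketched as pure jq expressions (a sketch — the inline objects are illustrative, and the cap anticipates the 1.0 ceiling enforced in Step 3.5):

```shell
# Confirmed learning: +0.15, capped at 1.0 (0.9 + 0.15 caps to 1).
BOOSTED=$(jq -n '{confidence: 0.9} | .confidence = ([.confidence + 0.15, 1.0] | min) | .confidence')
# Contradicted learning: -0.2, no cap needed on the way down.
REDUCED=$(jq -n '{confidence: 0.5} | .confidence = (.confidence - 0.2) | .confidence')
echo "boosted=$BOOSTED reduced=$REDUCED"
```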
### Step 3.4: Present Findings via AskUserQuestion

Present extracted patterns to the user for confirmation. Use AskUserQuestion with `multiSelect: true`. On Codex CLI, where AskUserQuestion is unavailable, present them as a numbered Markdown list instead.

```javascript
AskUserQuestion({
  questions: [{
    question: "Which learnings should be saved?\n\nExtracted patterns from session history:",
    header: "Evolve — Confirm Learnings",
    options: [
      { label: "[type] subject", description: "insight | evidence: ... | confidence: 0.5 (new) or +0.15 (update)" },
      ...
      { label: "Skip all", description: "Do not save any learnings this time" }
    ],
    multiSelect: true
  }]
})
```
If user selects "Skip all" or selects nothing, abort gracefully: "No learnings saved."
### Step 3.5: Write Confirmed Learnings

For confirmed learnings, use the atomic rewrite strategy:

1. Read ALL existing lines from `.orchestrator/metrics/learnings.jsonl` (if it exists) into memory. If not found, check `<state-dir>/metrics/learnings.jsonl` as a legacy fallback. If legacy data is found, it will be migrated to the v2 path on write (step 8).
2. Apply confidence updates for confirmed existing learnings:
   - Increment confidence by +0.15
   - Cap at 1.0
   - Reset `expires_at` to current date + `learning-expiry-days` (default: 30)
3. Apply confidence decrements for contradicted learnings (-0.2) — do NOT reset `expires_at` for contradicted learnings (let them decay naturally)
4. Append new learnings with:
   - `id`: generate a uuid-v4 (use `uuidgen` or equivalent)
   - `type`: one of `fragile-file`, `effective-sizing`, `recurring-issue`, `scope-guidance`, `deviation-pattern`, `stagnation-class-frequency`
   - `subject`: the pattern subject
   - `insight`: human-readable description of the pattern
   - `evidence`: specific data points that support the pattern
   - `confidence`: 0.5 for new learnings
   - `source_session`: session ID from which the pattern was extracted
   - `created_at`: current ISO 8601 date
   - `expires_at`: current date + `learning-expiry-days` (default: 30), ISO 8601
5. Verify the write: read back the first line of the written file to confirm valid JSON. If the read-back fails or is not valid JSON, report an error to the user.
6. Prune: remove entries where `expires_at` < current date OR `confidence` <= 0.0
7. Consolidate duplicates: if the same `type` + `subject` appears more than once, keep the entry with the highest confidence
8. Write the entire result back to `.orchestrator/metrics/learnings.jsonl` with `>` (atomic rewrite, NOT append `>>`)
9. Vault mirror (conditional): Check `$CONFIG."vault-integration".enabled` via jq. If the field is missing or `false`, skip this step entirely — skill behavior is unchanged.

   If `enabled` is `true`:

   a. Check `$CONFIG."vault-integration".mode`. If `mode` is `off`, skip the mirror invocation (treat as disabled). If `mode` is absent, default to `warn`.

   b. Resolve the vault directory: use `$CONFIG."vault-integration"."vault-dir"` if non-null, otherwise fall back to the `$VAULT_DIR` environment variable. If neither is set, emit a warning and skip.

   c. Invoke the mirror script:

      ```bash
      node "$PLUGIN_ROOT/scripts/vault-mirror.mjs" \
        --vault-dir "<vault-dir>" \
        --source .orchestrator/metrics/learnings.jsonl \
        --kind learning
      ```

   d. Handle the exit code according to `mode`:
      - `warn` (default): on non-zero exit, surface a warning in the evolve output (e.g. "Warning: vault mirror failed — learnings saved locally but not mirrored.") but do NOT fail the skill.
      - `strict`: on non-zero exit, fail the skill immediately and report the error to the user.

   e. On success (exit 0), report: "Mirrored N learnings to `<vault-dir>/40-learnings/`."

Report: "Saved N new learnings, updated M existing. Total active: K."
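Steps 6–8 (prune, consolidate, rewrite with `>`) can be sketched end to end with jq. A sketch on illustrative sample data, assuming `jq` is available; the `.tmp`-then-`mv` keeps the rewrite atomic at the filesystem level:

```shell
T=$(mktemp -d)
F="$T/learnings.jsonl"
printf '%s\n' \
  '{"type":"fragile-file","subject":"src/a.ts","confidence":0.5,"expires_at":"2999-01-01"}' \
  '{"type":"fragile-file","subject":"src/a.ts","confidence":0.8,"expires_at":"2999-01-01"}' \
  '{"type":"scope-guidance","subject":"x","confidence":0.0,"expires_at":"2999-01-01"}' \
  '{"type":"recurring-issue","subject":"y","confidence":0.6,"expires_at":"2020-01-01"}' \
  > "$F"
TODAY=$(date -u +%Y-%m-%d)
# Prune expired / zero-confidence entries, keep the best duplicate per
# type+subject, then rewrite the whole file with > (never >>).
jq -c -s --arg today "$TODAY" '
  map(select(.expires_at >= $today and .confidence > 0))
  | group_by(.type + "|" + .subject)
  | map(max_by(.confidence))
  | .[]' "$F" > "$F.tmp" && mv "$F.tmp" "$F"
COUNT=$(wc -l < "$F" | tr -d ' ')
BEST=$(jq -r 'select(.subject == "src/a.ts") | .confidence' "$F")
echo "count=$COUNT best=$BEST"
rm -rf "$T"
```

Of the four sample entries, only the higher-confidence `src/a.ts` duplicate survives: the zero-confidence and expired entries are pruned, and consolidation keeps `max_by(.confidence)`.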
## Phase 4: Review Mode

Interactive management of existing learnings.

### Step 4.1: Load Learnings

- Read `.orchestrator/metrics/learnings.jsonl`. If not found, check `<state-dir>/metrics/learnings.jsonl` as a legacy fallback.
- If neither exists or both are empty: "No learnings found. Run `/evolve analyze` first."
- Parse each line as JSON
### Step 4.2: Display Learnings

Present a formatted table grouped by type:

```markdown
## Active Learnings

| # | Type | Subject | Confidence | Expires | Insight |
|---|------|---------|------------|---------|---------|
| 1 | fragile-file | src/lib/auth.ts | 0.80 | 2026-07-05 | Changed in 4 of last 5 sessions |
| 2 | effective-sizing | feature-session-sizing | 0.65 | 2026-06-20 | Feature sessions work well with 3 agents/wave |
| ... | ... | ... | ... | ... | ... |

Summary: N active learnings (M high confidence, K expiring soon)
```
### Step 4.3: Interactive Management

Use AskUserQuestion with these options. On Codex CLI, where AskUserQuestion is unavailable, present them as a numbered Markdown list instead.

```javascript
AskUserQuestion({
  questions: [{
    question: "What would you like to do with your learnings?",
    header: "Evolve — Review",
    options: [
      { label: "Boost confidence", description: "Select learnings to boost (+0.15)" },
      { label: "Reduce confidence", description: "Select learnings to reduce (-0.2)" },
      { label: "Delete specific learnings", description: "Select learnings to remove" },
      { label: "Extend expiry", description: "Reset expires_at by learning-expiry-days from now" },
      { label: "Done — no changes", description: "Exit without changes" }
    ]
  }]
})
```

If the user selects "Boost confidence", "Reduce confidence", "Delete specific learnings", or "Extend expiry", present a follow-up AskUserQuestion with `multiSelect: true` listing all learnings by `# | type | subject` so the user can select which ones to modify. (On Codex CLI, present this follow-up as a numbered Markdown list as well.)
### Step 4.4: Apply Changes

Use the same atomic rewrite strategy as Phase 3, Step 3.5:

- Read all lines from `learnings.jsonl`
- Apply the selected operation to the selected learnings:
  - Boost: +0.15 confidence (cap 1.0), reset expires_at to current date + `learning-expiry-days`
  - Reduce: -0.2 confidence
  - Delete: remove the selected entries
  - Extend: reset expires_at to current date + `learning-expiry-days`
- Prune entries where `expires_at` < current date OR `confidence` <= 0.0
- Consolidate duplicates (same type + subject): keep the highest confidence
- Write the entire result back with `>` (atomic rewrite)
Report: "Updated N learnings. Total active: K."
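Computing the new `expires_at` for the Extend operation is the only portability-sensitive part — GNU and BSD `date` take different flags for relative dates. A sketch assuming `learning-expiry-days` = 30:

```shell
DAYS=30
# GNU date (Linux) supports -d with relative expressions; BSD/macOS uses -v.
if date -u -d "+${DAYS} days" +%Y-%m-%d >/dev/null 2>&1; then
  NEW_EXPIRY=$(date -u -d "+${DAYS} days" +%Y-%m-%d)   # GNU date
else
  NEW_EXPIRY=$(date -u -v "+${DAYS}d" +%Y-%m-%d)       # BSD/macOS date
fi
echo "expires_at=$NEW_EXPIRY"
```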
## Phase 5: List Mode

Simple read-only display.

### Step 5.1: Load and Display

- Read `.orchestrator/metrics/learnings.jsonl`. If not found, check `<state-dir>/metrics/learnings.jsonl` as a legacy fallback.
- If neither exists: "No learnings yet. Run `/evolve analyze` to extract patterns from session history."
- Parse each line as JSON
### Step 5.2: Formatted Output

Display a formatted table grouped by type:

```markdown
## Active Learnings

### fragile-file

| Subject | Confidence | Expires | Insight |
|---------|------------|---------|---------|
| ... | ... | ... | ... |

### effective-sizing

| Subject | Confidence | Expires | Insight |
|---------|------------|---------|---------|
| ... | ... | ... | ... |

(repeat for each type that has entries)
```
### Step 5.3: Summary

Display a summary line:

> N active learnings (M high confidence, K expiring soon)

- High confidence = confidence > 0.7
- Expiring soon = expires_at within 14 days of the current date
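The summary counts can be sketched with jq on illustrative sample data (the expiring-soon count would additionally need a cutoff date computed as today + 14 days, which is omitted here):

```shell
T=$(mktemp -d)
F="$T/learnings.jsonl"
printf '%s\n' \
  '{"confidence":0.9,"expires_at":"2999-01-01"}' \
  '{"confidence":0.5,"expires_at":"2999-01-01"}' \
  '{"confidence":0.8,"expires_at":"2999-01-01"}' \
  > "$F"
ACTIVE=$(wc -l < "$F" | tr -d ' ')
# High confidence means confidence strictly greater than 0.7.
HIGH=$(jq -s '[.[] | select(.confidence > 0.7)] | length' "$F")
echo "$ACTIVE active learnings ($HIGH high confidence)"
rm -rf "$T"
```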
## Critical Rules

- NEVER modify `learnings.jsonl` without reading it first — race condition prevention
- NEVER skip the deduplication check — duplicates degrade the intelligence system
- NEVER write learnings without user confirmation — always present via AskUserQuestion first (on Codex CLI, where AskUserQuestion is unavailable, present a numbered Markdown list)
- ALWAYS use uuid-v4 for new learning IDs (generate via `uuidgen` or an equivalent bash command)
- ALWAYS set `expires_at` to current date + `learning-expiry-days` from config (default: 30) for new learnings
- ALWAYS present findings to the user before writing — no silent writes
- ALWAYS use atomic rewrite (read all, modify, write all with `>`) — never append with `>>`
- ALWAYS cap confidence at 1.0 — never exceed it
## Anti-Patterns

- DO NOT write learnings without user confirmation — always present via AskUserQuestion first (on Codex CLI, where AskUserQuestion is unavailable, present a numbered Markdown list)
- DO NOT append to `learnings.jsonl` — always use atomic rewrite (read all, modify, write all)
- DO NOT create duplicate learnings — always check for a type + subject match first
- DO NOT set confidence above 1.0 or forget to cap it
- DO NOT fabricate patterns — only extract from actual session data with verifiable evidence
- DO NOT skip the pruning step — expired and zero-confidence entries must be removed on every write