Session-orchestrator evolve

install
source · Clone the upstream repo
git clone https://github.com/Kanevry/session-orchestrator
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Kanevry/session-orchestrator "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/evolve" ~/.claude/skills/kanevry-session-orchestrator-evolve && rm -rf "$T"
manifest: skills/evolve/SKILL.md
source content

Platform Note: State files use the platform's native directory:

.claude/
(Claude Code),
.codex/
(Codex CLI), or
.cursor/
(Cursor IDE). Shared metrics live in
.orchestrator/metrics/
(v2) with fallback to
<state-dir>/metrics/
for pre-v2.0 legacy data. See
skills/_shared/platform-tools.md
.

Evolve Skill

Phase 0: Bootstrap Gate

Read

skills/_shared/bootstrap-gate.md
and execute the gate check. If the gate is CLOSED, invoke
skills/bootstrap/SKILL.md
and wait for completion before proceeding. If the gate is OPEN, continue to Phase 1.

<HARD-GATE> Do NOT proceed past Phase 0 if GATE_CLOSED. There is no bypass. Refer to `skills/_shared/bootstrap-gate.md` for the full HARD-GATE constraints. </HARD-GATE>

Phase 1: Config & Data Loading

1.1 Read Session Config

Read and parse Session Config per

skills/_shared/config-reading.md
. Store result as
$CONFIG
.

1.2 Check Persistence

Extract

persistence
from
$CONFIG
. If
persistence
is
false
, abort with message:

"Learnings require persistence to be enabled in Session Config. Add

persistence: true
to your Session Config block (CLAUDE.md for Claude Code, AGENTS.md for Codex CLI)."

1.3 Determine Mode

Read mode from

$ARGUMENTS
:

  • If empty or not provided, default to
    analyze
  • Valid modes:
    analyze
    ,
    review
    ,
    list
  • If invalid mode provided, report error and list valid modes

1.4 Load Data

Lazy-create defensive (#185): If

.orchestrator/metrics/learnings.jsonl
does not exist (pre-#185 repo or bootstrap skipped), create an empty file and emit an info log — do NOT hard-fail:

LEARNINGS_FILE=".orchestrator/metrics/learnings.jsonl"
if [[ ! -f "$LEARNINGS_FILE" ]]; then
  mkdir -p "$(dirname "$LEARNINGS_FILE")"
  : > "$LEARNINGS_FILE"
  echo "info(#185): auto-created $LEARNINGS_FILE (was missing)" >&2
fi

This defensive step is idempotent and cheap — it ensures

/evolve analyze|review|list
never fails because of a missing artifact file.

  1. Read
    .orchestrator/metrics/sessions.jsonl
    (session history). If it does not exist, check
    <state-dir>/metrics/sessions.jsonl
    as a legacy fallback (where
    <state-dir>
    is
    .claude/
    ,
    .codex/
    , or
    .cursor/
    per platform). If neither exists, warn: "No session history found. Run at least one session first."
  2. Read
    .orchestrator/metrics/learnings.jsonl
    if it exists. If not found, check
    <state-dir>/metrics/learnings.jsonl
    as a legacy fallback.
  3. Count existing learnings, note any where
    expires_at
    < current date (expired)

Phase 2: Mode Dispatch

Route based on mode:

  • analyze
    → Phase 3
  • review
    → Phase 4
  • list
    → Phase 5

Phase 3: Analyze Mode (default)

Extract learnings from session history.

Vault Integration: If

vault-integration.enabled
is
true
in Session Config, confirmed learnings are mirrored to the configured Obsidian vault after the atomic write (Step 3.5, step 9). See
docs/session-config-reference.md
for the
vault-integration
config block.

Step 3.1: Read Session Data

  • Read all entries from
    .orchestrator/metrics/sessions.jsonl
    (or
    <state-dir>/metrics/sessions.jsonl
    if the v2 path does not exist — see Phase 1.4 fallback)
  • Parse each JSONL line as JSON
  • Sort by
    completed_at
    descending (most recent first)
  • If no sessions found, abort: "No session data available. Complete at least one session before running evolve."

Step 3.2: Pattern Extraction

For each of the 6 learning types, apply these heuristics:

1. fragile-file (type:
fragile-file
)

  • Look at wave data: if the same file appears in 3+ waves'
    files_changed
    within a session, it is fragile
  • Cross-session: if a file appears in 3+ different sessions'
    files_changed
    , flag it
  • Subject = file path (relative to project root)

2. effective-sizing (type:
effective-sizing
)

  • Compare
    total_agents
    and
    total_waves
    across session types
  • Calculate average agents per wave for each session type
  • Subject = canonical identifier like
    deep-session-sizing
    or
    feature-session-sizing
  • Insight = "Deep sessions average X agents across Y waves" or "Feature sessions work well with X agents/wave"

3. recurring-issue (type:
recurring-issue
)

  • Look at
    agent_summary
    — if
    failed
    or
    partial
    > 0 across multiple sessions, flag
  • Check wave
    quality
    fields — repeated failures indicate recurring issues
  • Subject = issue pattern identifier (e.g., "test-failures-in-wave-execution", "lint-regressions")

4. scope-guidance (type:
scope-guidance
)

  • Cross-reference
    effectiveness.planned_issues
    vs
    effectiveness.completion_rate
  • Skip sessions that lack the
    effectiveness
    field
    (early sessions may not have it)
  • If completion_rate is consistently 1.0 with N issues, note "N issues per session works well"
  • If completion_rate < 0.7, note "scope was too large"
  • Subject =
    optimal-scope-per-session-type

5. deviation-pattern (type:
deviation-pattern
)

Ownership Reference: See

skills/_shared/state-ownership.md
. evolve has read-only access to STATE.md.

  • Read
    <state-dir>/STATE.md
    if it exists and check
    ## Deviations
    section
  • Cross-reference with session duration vs planned waves
  • Subject = pattern name (e.g., "scope-creep-in-feature-sessions", "underestimated-complexity")

6. stagnation-class-frequency (type:
stagnation-class-frequency
)

  • Read
    stagnation_events
    from the most recent 5 sessions in
    sessions.jsonl
    (skip sessions lacking the field — they predate #84).
  • For each
    (file, error_class)
    pair appearing in ≥2 sessions, extract a candidate:
    • Subject =
      <file>:<error_class>
      (e.g.,
      skills/wave-executor/wave-loop.md:edit-format-friction
      )
    • Insight = "File <X> has <error_class> stagnation in <N> recent sessions — candidate for pre-edit grounding (#85)."
    • Evidence = "<N> sessions with stagnation_events for this file/class"
  • These learnings feed #85 (pre-edit grounding injection) when it ships — high-frequency pairs trigger grounding.

Step 3.2b: Zero Patterns Check

If no patterns were extracted across all 6 types, report: "No patterns found in session history. This can happen with very few sessions or sessions that lack detailed wave/agent data." and skip to end (do not proceed to AskUserQuestion).

Step 3.3: Deduplicate Against Existing Learnings

For each extracted pattern, check if a learning with same

type
+
subject
already exists in
learnings.jsonl
:

  • If exists: propose confidence update (+0.15 if confirmed by new evidence, -0.2 if contradicted)
  • If new: propose as new learning with confidence 0.5

Step 3.4: Present Findings via AskUserQuestion

Present extracted patterns to the user for confirmation. Use AskUserQuestion with

multiSelect: true
:

On Codex CLI where AskUserQuestion is unavailable, present as a numbered Markdown list.

AskUserQuestion({
  questions: [{
    question: "Which learnings should be saved?\n\nExtracted patterns from session history:",
    header: "Evolve — Confirm Learnings",
    options: [
      {
        label: "[type] subject",
        description: "insight | evidence: ... | confidence: 0.5 (new) or +0.15 (update)"
      },
      ...
      {
        label: "Skip all",
        description: "Do not save any learnings this time"
      }
    ],
    multiSelect: true
  }]
})

If user selects "Skip all" or selects nothing, abort gracefully: "No learnings saved."

Step 3.5: Write Confirmed Learnings

For confirmed learnings, use atomic rewrite strategy:

  1. Read ALL existing lines from

    .orchestrator/metrics/learnings.jsonl
    (if exists) into memory. If not found, check
    <state-dir>/metrics/learnings.jsonl
    as a legacy fallback. If legacy data is found, it will be migrated to the v2 path on write (step 8).

  2. Apply confidence updates for confirmed existing learnings:

    • Increment confidence by +0.15
    • Cap at 1.0
    • Reset
      expires_at
      to current date +
      learning-expiry-days
      (default: 30)
  3. Apply confidence decrements for contradicted learnings (-0.2) — do NOT reset

    expires_at
    for contradicted learnings (let them decay naturally)

  4. Append new learnings with:

    • id
      : generate a uuid-v4 (use
      uuidgen
      or equivalent)
    • type
      : one of
      fragile-file
      ,
      effective-sizing
      ,
      recurring-issue
      ,
      scope-guidance
      ,
      deviation-pattern
      ,
      stagnation-class-frequency
    • subject
      : the pattern subject
    • insight
      : human-readable description of the pattern
    • evidence
      : specific data points that support the pattern
    • confidence
      : 0.5 for new learnings
    • source_session
      : session ID from which the pattern was extracted
    • created_at
      : current ISO 8601 date
    • expires_at
      : current date +
      learning-expiry-days
      (default: 30) (ISO 8601)
  5. Verify write: Read back the first line of the written file to confirm valid JSON. If read-back fails or is not valid JSON, report error to user.

  6. Prune: remove entries where

    expires_at
    < current date OR
    confidence
    <= 0.0

  7. Consolidate duplicates: if same

    type
    +
    subject
    appears more than once, keep the entry with highest confidence

  8. Write entire result back to

    .orchestrator/metrics/learnings.jsonl
    with
    >
    (atomic rewrite, NOT append
    >>
    )

  9. Vault mirror (conditional): Check

    $CONFIG."vault-integration".enabled
    via jq. If the field is missing or
    false
    , skip this step entirely — skill behavior is unchanged.

    If

    enabled
    is
    true
    :

    a. Check

    $CONFIG."vault-integration".mode
    . If
    mode
    is
    off
    , skip the mirror invocation (treat as disabled). If
    mode
    is absent, default to
    warn
    .

    b. Resolve the vault directory: use

    $CONFIG."vault-integration"."vault-dir"
    if non-null, otherwise fall back to the
    $VAULT_DIR
    environment variable. If neither is set, emit a warning and skip.

    c. Invoke the mirror script:

    node "$PLUGIN_ROOT/scripts/vault-mirror.mjs" \
      --vault-dir "<vault-dir>" \
      --source .orchestrator/metrics/learnings.jsonl \
      --kind learning
    

    d. Handle the exit code according to

    mode
    :

    • warn
      (default): on non-zero exit, surface a warning in evolve output (e.g. "Warning: vault mirror failed — learnings saved locally but not mirrored.") but do NOT fail the skill.
    • strict
      : on non-zero exit, fail the skill immediately and report the error to the user.

    e. On success (exit 0), report: "Mirrored N learnings to

    <vault-dir>/40-learnings/
    ."

Report: "Saved N new learnings, updated M existing. Total active: K."


Phase 4: Review Mode

Interactive management of existing learnings.

Step 4.1: Load Learnings

  • Read
    .orchestrator/metrics/learnings.jsonl
    . If not found, check
    <state-dir>/metrics/learnings.jsonl
    as a legacy fallback.
  • If neither exists or both are empty: "No learnings found. Run
    /evolve analyze
    first."
  • Parse each line as JSON

Step 4.2: Display Learnings

Present a formatted table grouped by type:

## Active Learnings

| # | Type | Subject | Confidence | Expires | Insight |
|---|------|---------|------------|---------|---------|
| 1 | fragile-file | src/lib/auth.ts | 0.80 | 2026-07-05 | Changed in 4 of last 5 sessions |
| 2 | effective-sizing | feature-session-sizing | 0.65 | 2026-06-20 | Feature sessions work well with 3 agents/wave |
| ... | ... | ... | ... | ... | ... |

Summary: N active learnings (M high confidence, K expiring soon)

Step 4.3: Interactive Management

Use AskUserQuestion with options:

On Codex CLI where AskUserQuestion is unavailable, present as a numbered Markdown list.

AskUserQuestion({
  questions: [{
    question: "What would you like to do with your learnings?",
    header: "Evolve — Review",
    options: [
      { label: "Boost confidence", description: "Select learnings to boost (+0.15)" },
      { label: "Reduce confidence", description: "Select learnings to reduce (-0.2)" },
      { label: "Delete specific learnings", description: "Select learnings to remove" },
      { label: "Extend expiry", description: "Reset expires_at by learning-expiry-days from now" },
      { label: "Done — no changes", description: "Exit without changes" }
    ]
  }]
})

If user selects "Boost confidence", "Reduce confidence", "Delete specific learnings", or "Extend expiry", present a follow-up AskUserQuestion with

multiSelect: true
listing all learnings by
# | type | subject
so the user can select which ones to modify.

On Codex CLI where AskUserQuestion is unavailable, present as a numbered Markdown list.

Step 4.4: Apply Changes

Use the same atomic rewrite strategy as Phase 3, Step 3.5:

  1. Read all lines from
    learnings.jsonl
  2. Apply the selected operation to selected learnings:
    • Boost: +0.15 confidence (cap 1.0), reset expires_at to +
      learning-expiry-days
    • Reduce: -0.2 confidence
    • Delete: remove selected entries
    • Extend: reset expires_at to current date +
      learning-expiry-days
  3. Prune entries where
    expires_at
    < current date OR
    confidence
    <= 0.0
  4. Consolidate duplicates (same type + subject): keep highest confidence
  5. Write entire result back with
    >
    (atomic rewrite)

Report: "Updated N learnings. Total active: K."


Phase 5: List Mode

Simple read-only display.

Step 5.1: Load and Display

  • Read
    .orchestrator/metrics/learnings.jsonl
    . If not found, check
    <state-dir>/metrics/learnings.jsonl
    as a legacy fallback.
  • If neither exists: "No learnings yet. Run
    /evolve analyze
    to extract patterns from session history."
  • Parse each line as JSON

Step 5.2: Formatted Output

Display a formatted table grouped by type:

## Active Learnings

### fragile-file
| Subject | Confidence | Expires | Insight |
|---------|------------|---------|---------|
| ... | ... | ... | ... |

### effective-sizing
| Subject | Confidence | Expires | Insight |
|---------|------------|---------|---------|
| ... | ... | ... | ... |

(repeat for each type that has entries)

Step 5.3: Summary

Display summary line:

N active learnings (M high confidence, K expiring soon)
  • High confidence = confidence > 0.7
  • Expiring soon = expires_at within 14 days of current date

Critical Rules

  • NEVER modify
    learnings.jsonl
    without reading it first — race condition prevention
  • NEVER skip the deduplication check — duplicates degrade the intelligence system
  • NEVER write learnings without user confirmation — always present via AskUserQuestion first (on Codex CLI where AskUserQuestion is unavailable, present as a numbered Markdown list)
  • ALWAYS use uuid-v4 for new learning IDs (generate via
    uuidgen
    or equivalent bash command)
  • ALWAYS set
    expires_at
    to current date +
    learning-expiry-days
    from config (default: 30) for new learnings
  • ALWAYS present findings to user before writing — no silent writes
  • ALWAYS use atomic rewrite (read all, modify, write all with
    >
    ) — never append with
    >>
  • ALWAYS cap confidence at 1.0 — never exceed

Anti-Patterns

  • DO NOT write learnings without user confirmation — always present via AskUserQuestion first (on Codex CLI where AskUserQuestion is unavailable, present as a numbered Markdown list)
  • DO NOT append to
    learnings.jsonl
    — always use atomic rewrite (read all, modify, write all)
  • DO NOT create duplicate learnings — always check type + subject match first
  • DO NOT set confidence above 1.0 or forget to cap it
  • DO NOT fabricate patterns — only extract from actual session data with verifiable evidence
  • DO NOT skip the pruning step — expired and zero-confidence entries must be removed on every write