Agentic-context-engine kayba-pipeline

End-to-end agent evaluation and improvement pipeline. Takes a traces folder and optional HITL flag, then orchestrates sub-agents through 7 stages — each stage is its own skill invoked by a dedicated sub-agent. Trigger when the user says "run the pipeline", "kayba pipeline", "evaluate and fix", "full eval", "analyze traces and fix", or provides a traces folder with intent to improve their agent.

install

source · Clone the upstream repo

git clone https://github.com/kayba-ai/agentic-context-engine

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/kayba-ai/agentic-context-engine "$T" && mkdir -p ~/.claude/skills && cp -r "$T/ace/cli/skills/kayba-pipeline" ~/.claude/skills/kayba-ai-agentic-context-engine-kayba-pipeline-1f97e4 && rm -rf "$T"

manifest: ace/cli/skills/kayba-pipeline/SKILL.md

source content

kayba-pipeline

End-to-end pipeline: analyze traces → define metrics → build rubric → plan fixes → implement fixes.

Each stage is a separate skill file that can be run independently or as part of this pipeline.

Inputs

The user provides two things:

TRACES_FOLDER
— path to a directory containing trace JSON files
HITL
—
```
true
```
or
```
false
```
— whether to pause for human review before implementing fixes

If the user doesn't specify HITL, default to

true

(safe default).

Pipeline overview

┌─────────────────────────────────────────────────────────────────────┐
│  Stage 1: Kayba API Analysis        → skill: kayba-pipeline:stage-1-api-analysis   │
│  Stage 2: Domain Context Gathering  → skill: kayba-pipeline:stage-2-domain-context │
│  ─── stages 1 & 2 run in parallel ───                                              │
│  Stage 3: Metrics & Analysis        → skill: kayba-pipeline:stage-3-metrics        │
│  Stage 4: Rubric Definition         → skill: kayba-pipeline:stage-4-rubric         │
│  Stage 5: Action Plan               → skill: kayba-pipeline:stage-5-action-plan    │
│  Stage 6: HITL Gate                 → skill: kayba-pipeline:stage-6-hitl           │
│  Stage 7: Fix Implementation        → skill: kayba-pipeline:stage-7-fixer          │
└─────────────────────────────────────────────────────────────────────┘

Orchestration instructions

You are the orchestrator. Your job is to:

Create the
```
eval/
```
directory and
```
eval/pipeline_log.md
```
Spawn sub-agents that invoke stage skills via the Skill tool
Coordinate stage ordering and handle the HITL gate

Setup

Create

eval/

directory and initialize

eval/pipeline_log.md

# Pipeline Log

| Stage | Name | Status | Started | Completed | Notes |
|-------|------|--------|---------|-----------|-------|
| 1 | Kayba API Analysis | pending | | | |
| 2 | Domain Context | pending | | | |
| 3 | Metrics & Analysis | pending | | | |
| 4 | Rubric Definition | pending | | | |
| 5 | Action Plan | pending | | | |
| 6 | HITL Gate | pending | | | |
| 7 | Fix Implementation | pending | | | |

Stages 1 & 2 — run in parallel

Spawn two sub-agents in parallel using the Agent tool:

Agent 1:

Name:
```
api-analyst
```
Type:
```
general-purpose
```

Prompt:

Invoke the skill "kayba-pipeline:stage-1-api-analysis" using the Skill tool. The traces folder is: {TRACES_FOLDER}. Follow the skill instructions completely.

Agent 2:

Name:
```
domain-scout
```
Type:
```
general-purpose
```

Prompt:

Invoke the skill "kayba-pipeline:stage-2-domain-context" using the Skill tool. The traces folder is: {TRACES_FOLDER}. Follow the skill instructions completely.

Wait for both to complete before proceeding.

Stage 3 — sequential

Spawn one sub-agent after stages 1 & 2 complete:

Name:
```
metric-engineer
```
Type:
```
general-purpose
```

Prompt:

Invoke the skill "kayba-pipeline:stage-3-metrics" using the Skill tool. The traces folder is: {TRACES_FOLDER}. Follow the skill instructions completely — this includes iterating on the metrics until you're satisfied.

Stage 4 — sequential

Spawn one sub-agent after stage 3 completes:

Name:
```
rubric-builder
```
Type:
```
general-purpose
```

Prompt:

Invoke the skill "kayba-pipeline:stage-4-rubric" using the Skill tool. Follow the skill instructions completely.

Stage 5 — sequential

Spawn one sub-agent after stage 4 completes:

Name:
```
action-planner
```
Type:
```
general-purpose
```

Prompt:

Invoke the skill "kayba-pipeline:stage-5-action-plan" using the Skill tool. Follow the skill instructions completely.

Stage 6 — HITL Gate

HITL

is
true
:

Spawn one sub-agent after stage 5 completes:

Name:
```
hitl-reviewer
```
Type:
```
general-purpose
```

Prompt:

Invoke the skill "kayba-pipeline:stage-6-hitl" using the Skill tool. Follow the skill instructions completely. Present the full review to the user and collect their decision before proceeding.

Wait for the sub-agent to complete. Check

eval/stage6_decision.md

for the outcome:

If decision is "Approve all" or "Approve with modifications" — proceed to Stage 7
If decision is "Reject" — re-run Stage 5 with the user feedback recorded in
```
eval/stage6_decision.md
```
, then re-run Stage 6
Only proceed to Stage 7 after a clear approval is recorded

HITL

is
false
:

Skip to Stage 7
Log "HITL skipped" in
```
eval/pipeline_log.md
```

Stage 7 — sequential

Spawn one sub-agent after stage 6 completes (or is skipped):

Name:
```
fixer
```
Type:
```
general-purpose
```

Prompt:

Invoke the skill "kayba-pipeline:stage-7-fixer" using the Skill tool. Follow the skill instructions completely.

Error handling

If any stage fails, log the failure in
```
eval/pipeline_log.md
```
with the stage number and error
Do not proceed to dependent stages if a prerequisite failed
If Stage 1 fails (kayba CLI issues), ask the user whether to proceed without API insights — if yes, skip Stage 1 and have Stage 3 work from domain context + raw traces only

After completion

Update

eval/pipeline_log.md

with final status for all stages. Report to the user:

How many stages completed successfully
Summary of metrics (from rubric)
Summary of fixes applied (from changes log)