Joelclaw langfuse
Instrument joelclaw LLM calls with Langfuse tracing. Covers the @langfuse/tracing SDK, observation hierarchy (spans, generations, tools, agents), propagateAttributes for userId/sessionId/tags, the pi-session extension (langfuse-cost), and the system-bus OTEL integration. Use when adding Langfuse traces, debugging missing/broken traces, checking cost data, or improving observability on any LLM surface.
git clone https://github.com/joelhooks/joelclaw
T=$(mktemp -d) && git clone --depth=1 https://github.com/joelhooks/joelclaw "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/langfuse" ~/.claude/skills/joelhooks-joelclaw-langfuse && rm -rf "$T"
skills/langfuse/SKILL.md
Langfuse Observability
Langfuse is the LLM observability layer for joelclaw. Every LLM call produces a Langfuse trace with nested hierarchy, I/O, usage, cost, and attribution.
Architecture
joelclaw has two Langfuse integration points:
1. Pi-session extension (`langfuse-cost`)
- Source: `pi/extensions/langfuse-cost/index.ts` (canonical, git-tracked in this repo)
- Runtime: loaded as a pi extension from the same source tree
- What it traces: Every gateway + interactive pi session LLM call
- How: Hooks into pi session events (`session_start`, `message_start`, `message_end`, `tool_call`, `tool_result`, `session_shutdown`)
- Dedup: `globalThis.__langfuse_cost_loaded__` guard prevents duplicate extension instances
- Optional dependency behavior: `langfuse` is lazily loaded (no top-level hard import). A missing module must disable telemetry, not crash extension import. Regression test: `pi/extensions/langfuse-cost/index.test.ts`
- Runtime dependency location: because the extension is loaded from `pi/extensions/` at repo root instead of a workspace package, the `langfuse` npm package must be available from the repo root `package.json`. If root install drift drops it, gateway/session telemetry silently degrades to the optional-dependency warning again.
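The dedup guard and the lazy optional dependency can be sketched roughly as below. This is an illustration only, not the extension's actual code: `claimExtensionSlot` and `loadOptional` are hypothetical helper names, and the only identifier taken from this document is the guard key `__langfuse_cost_loaded__`.

```typescript
// Hypothetical helpers; only the guard key name comes from this document.
const LANGFUSE_GUARD_KEY = "__langfuse_cost_loaded__";

// Returns true for the first caller and false for every later one, so a
// second copy of the extension (e.g. loaded via a symlinked path) can bail out.
function claimExtensionSlot(scope: Record<string, unknown>): boolean {
  if (scope[LANGFUSE_GUARD_KEY]) return false;
  scope[LANGFUSE_GUARD_KEY] = true;
  return true;
}

// Runs a dynamic import and converts a missing module into `null` plus a
// warning, instead of letting the failure crash extension import.
async function loadOptional<T>(
  loader: () => Promise<T>,
  warn: (message: string) => void,
): Promise<T | null> {
  try {
    return await loader();
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    warn(`langfuse unavailable, telemetry disabled: ${reason}`);
    return null;
  }
}
```

In an extension entry point this might be called as `claimExtensionSlot(globalThis as unknown as Record<string, unknown>)` followed by `loadOptional(() => import("langfuse"), console.warn)`; a `null` result would mean "skip all tracing for this session".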
2. System-bus OTEL bridge (`langfuse.ts`)
- Source: `packages/system-bus/src/lib/langfuse.ts`
- What it traces: All Inngest function LLM calls (reflect, triage, email cleanup, docs ingest)
- How: `@langfuse/otel` `LangfuseSpanProcessor` + `@langfuse/tracing` `startObservation()`
- Produces: `joelclaw.inference` traces with generation children
Current Trace Hierarchy (pi-session)
The `langfuse-cost` extension produces a 4-level nested span hierarchy:

    joelclaw.session (trace)
    └── session (span): entire session lifetime
        ├── turn-1 (span): user message → final assistant response
        │   ├── tool:bash (span): individual tool execution
        │   ├── tool:read (span)
        │   └── llm.call (generation): the LLM API call with usage/cost
        └── turn-2 (span)
            ├── tool:edit (span)
            ├── tool:bash (span)
            └── llm.call (generation)
What each level captures
| Level | Created on | Ended on | Contains |
|---|---|---|---|
| `joelclaw.session` trace | | | userId, sessionId, tags, turn count |
| `session` span | | | Channel, session type, turn count |
| `turn-N` span | | with text output | User input (clean), sourceChannel metadata |
| `tool:name` span | event | event | Tool input, output (truncated to 500 chars) |
| `llm.call` generation | | immediate | Model, usage, cache tokens, cost, I/O |
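The tool-span row can be sketched as a small tracker that opens a span on `tool_call` and closes it on `tool_result`, truncating output to 500 characters. The event shapes here are assumptions: `toolCallId`, `toolName`, and the `ToolSpanTracker` class are hypothetical names, since the real pi event fields are not shown in this document.

```typescript
// Hypothetical event shapes; the real pi event fields are not documented here.
interface ToolCallEvent { toolCallId: string; toolName: string; input: unknown }
interface ToolResultEvent { toolCallId: string; output: string }
interface ToolSpanRecord { name: string; input: unknown; output?: string; ended: boolean }

const MAX_TOOL_OUTPUT_CHARS = 500;

class ToolSpanTracker {
  // Spans that have been opened but not yet closed, keyed by call id.
  private openSpans = new Map<string, ToolSpanRecord>();

  // tool_call: open a span named tool:<name>.
  onToolCall(event: ToolCallEvent): ToolSpanRecord {
    const span: ToolSpanRecord = { name: `tool:${event.toolName}`, input: event.input, ended: false };
    this.openSpans.set(event.toolCallId, span);
    return span;
  }

  // tool_result: close the matching span. Returns undefined when no span
  // matches, which is how a renamed id field would show up as unclosed spans.
  onToolResult(event: ToolResultEvent): ToolSpanRecord | undefined {
    const span = this.openSpans.get(event.toolCallId);
    if (!span) return undefined;
    span.output = event.output.slice(0, MAX_TOOL_OUTPUT_CHARS);
    span.ended = true;
    this.openSpans.delete(event.toolCallId);
    return span;
  }

  openCount(): number {
    return this.openSpans.size;
  }
}
```

A non-zero `openCount()` at the end of a turn would be a cheap signal that call/result matching has broken.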
Channel header stripping
User messages from Telegram arrive with a `---\nChannel:...\n---` header. The extension:
- Strips the header from trace `input` (clean user text only)
- Parses known keys (`channel`, `date`, `platform_capabilities`) into `sourceChannel` metadata
- Skips multi-line values (e.g. `formatting_guide`)
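A minimal sketch of the header handling described above. The exact header grammar is an assumption: this version expects `Key: value` lines between two `---` delimiters at the start of the message and treats any line that is not a `Key: value` pair as part of a multi-line value. `stripChannelHeader` is a hypothetical name, not the extension's function.

```typescript
// Keys the document says are parsed into sourceChannel metadata.
const KNOWN_HEADER_KEYS = new Set(["channel", "date", "platform_capabilities"]);

interface StrippedMessage {
  input: string;                          // clean user text for the trace input
  sourceChannel: Record<string, string>;  // parsed known keys
}

function stripChannelHeader(raw: string): StrippedMessage {
  // Assumed layout: "---\n<header lines>\n---\n<user text>"
  const match = /^---\n([\s\S]*?)\n---\n?/.exec(raw);
  if (!match) return { input: raw, sourceChannel: {} };

  const sourceChannel: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    const pair = /^([A-Za-z_]+):\s*(.*)$/.exec(line);
    if (!pair) continue; // continuation line of a multi-line value: skip
    const key = pair[1].toLowerCase();
    // Unknown keys (e.g. formatting_guide) and empty values are ignored.
    if (KNOWN_HEADER_KEYS.has(key) && pair[2] !== "") sourceChannel[key] = pair[2];
  }
  return { input: raw.slice(match[0].length).trimStart(), sourceChannel };
}
```

Messages without a header pass through unchanged, so the same function can run on every user message regardless of channel.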
Credentials
Langfuse creds in `agent-secrets`:
- `langfuse_public_key`: `pk-lf-cb8b...`
- `langfuse_secret_key`: `sk-lf-c86f...`
- `langfuse_base_url`: `https://us.cloud.langfuse.com`
Gateway gets them via `gateway-start.sh` env exports. System-bus resolves via env → `secrets lease` fallback.
Gotcha: `secrets lease` prints a JSON error envelope to stdout and still exits 0 when the daemon is unavailable. Any Langfuse loader that shells to `secrets` must either use `--json` and read `result.value`, or explicitly reject `ok:false` JSON payloads. Never trust raw stdout as a base URL or credential.
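One way to follow the gotcha above is to validate the `--json` output before using it. The sketch assumes the envelope looks like `{ ok, result: { value } }`, which is inferred from the wording here rather than from the CLI's documentation; `parseSecretsLeaseJson` is a hypothetical helper name.

```typescript
// Assumed envelope shape: { ok: boolean, result?: { value?: string } }.
function parseSecretsLeaseJson(stdout: string): string {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stdout);
  } catch {
    throw new Error("secrets lease did not return JSON");
  }
  if (parsed === null || typeof parsed !== "object") {
    throw new Error("secrets lease returned an unexpected payload");
  }
  const envelope = parsed as { ok?: unknown; result?: { value?: unknown } };
  // Exit code 0 is not enough: the daemon-unavailable case arrives as ok:false.
  if (envelope.ok === false) {
    throw new Error("secrets lease reported ok:false");
  }
  const value = envelope.result?.value;
  if (typeof value !== "string" || value.length === 0) {
    throw new Error("secrets lease returned no result.value");
  }
  return value;
}
```

Throwing here keeps an error envelope from ever being used as a base URL or key; a caller can catch the error and disable tracing instead.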
Trace Conventions
Naming
- Pi-session: `joelclaw.session` (trace) → `session` → `turn-N` → `tool:name` → `llm.call`
- System-bus: `joelclaw.inference` (trace) → generation children
Required Attributes
Every trace MUST have:
- `userId: "joel"`
- `sessionId`: pi session ID for grouping
- `tags`: minimum `["joelclaw", "pi-session"]`
- Dynamic tags: `provider:anthropic`, `model:anthropic/claude-opus-4-6`, `channel:central`, `session:central`
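As a rough illustration of the tag convention above, the minimum tags can be treated as a fixed base and the dynamic tags appended only when their value is known. `buildTraceTags` and `TagContext` are hypothetical names used for this sketch.

```typescript
interface TagContext {
  provider?: string;
  model?: string;
  channel?: string;
  session?: string;
}

// Minimum tags required on every pi-session trace.
const BASE_TRACE_TAGS = ["joelclaw", "pi-session"];

function buildTraceTags(ctx: TagContext): string[] {
  const tags = [...BASE_TRACE_TAGS];
  // Dynamic tags use a key:value form so they stay filterable in Langfuse.
  if (ctx.provider) tags.push(`provider:${ctx.provider}`);
  if (ctx.model) tags.push(`model:${ctx.model}`);
  if (ctx.channel) tags.push(`channel:${ctx.channel}`);
  if (ctx.session) tags.push(`session:${ctx.session}`);
  return tags;
}
```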
Metadata Shape (flat, filterable)
    {
      channel: "central",          // GATEWAY_ROLE env
      sessionType: "central",      // "gateway" | "interactive" | "codex" | "central"
      component: "pi-session",
      model: "anthropic/claude-opus-4-6",
      provider: "anthropic",
      stopReason: "toolUse",       // or "endTurn"
      turnCount: 5,                // Updated on each turn
      sourceChannel: {             // Only on first user message per turn
        channel: "telegram",
        date: "...",
        platform_capabilities: "..."
      },
      tools: ["bash", "read"],     // Tool names used this turn
    }
Generation usageDetails
    {
      input: 1,                        // Non-cached input tokens
      output: 97,                      // Output tokens
      total: 68195,                    // Total tokens
      cache_read_input_tokens: 67877,  // 90% discount
      cache_write_input_tokens: 220,   // 25% premium (NOT priced by Langfuse; known gap)
    }
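To make the discount and premium concrete, here is a local cost estimate over the usage shape above. The per-million-token prices are placeholders passed in by the caller, not real rates, and `estimateCostUsd` is a hypothetical helper. Because this sketch prices cache writes and Langfuse does not, its result would not match the cost Langfuse reports.

```typescript
interface UsageDetails {
  input: number;
  output: number;
  total: number;
  cache_read_input_tokens: number;
  cache_write_input_tokens: number;
}

// Placeholder prices in USD per 1M tokens; supply real rates yourself.
interface PricePerMillion { input: number; output: number }

const CACHE_READ_MULTIPLIER = 0.1;   // 90% discount on the input rate
const CACHE_WRITE_MULTIPLIER = 1.25; // 25% premium on the input rate

function estimateCostUsd(usage: UsageDetails, price: PricePerMillion): number {
  const inputRate = price.input / 1_000_000;
  const outputRate = price.output / 1_000_000;
  return (
    usage.input * inputRate +
    usage.output * outputRate +
    usage.cache_read_input_tokens * inputRate * CACHE_READ_MULTIPLIER +
    usage.cache_write_input_tokens * inputRate * CACHE_WRITE_MULTIPLIER
  );
}

// Sanity check: total should equal the sum of the four token buckets.
function usageIsConsistent(usage: UsageDetails): boolean {
  const sum =
    usage.input + usage.output + usage.cache_read_input_tokens + usage.cache_write_input_tokens;
  return sum === usage.total;
}
```

For the sample above, 1 + 97 + 67877 + 220 = 68195, so the buckets add up to `total`.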
Pi session guardrails (alert-only)
Long-running pi sessions can dominate Langfuse spend. The extension now tracks per-session totals and emits warnings only on first threshold breach per guardrail type:
- `JOELCLAW_LANGFUSE_ALERT_MAX_LLM_CALLS` (default: `120`)
- `JOELCLAW_LANGFUSE_ALERT_MAX_TOTAL_TOKENS` (default: `1200000`)
- `JOELCLAW_LANGFUSE_ALERT_MAX_COST_USD` (default: `20`)
Behavior:
- no automatic model switch
- no forced compaction
- no stop/interruption
- emits `console.warn(...)` with session ID + current counters
- records breach flags and first breach turn index in trace metadata (`guardrails`)
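The alert-only behavior can be sketched as a per-session counter that warns once per guardrail type and remembers the first breaching turn. The env var names and defaults come from the list above; everything else (`readGuardrailLimits`, `SessionGuardrails`, treating a breach as strictly exceeding the limit, the warning text) is an assumption for illustration.

```typescript
type GuardrailKind = "llmCalls" | "totalTokens" | "costUsd";
type GuardrailLimits = Record<GuardrailKind, number>;

// Reads limits from an env-like object, falling back to the documented defaults.
function readGuardrailLimits(env: Record<string, string | undefined>): GuardrailLimits {
  const num = (name: string, fallback: number): number => {
    const parsed = Number(env[name]);
    return env[name] !== undefined && Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
  };
  return {
    llmCalls: num("JOELCLAW_LANGFUSE_ALERT_MAX_LLM_CALLS", 120),
    totalTokens: num("JOELCLAW_LANGFUSE_ALERT_MAX_TOTAL_TOKENS", 1_200_000),
    costUsd: num("JOELCLAW_LANGFUSE_ALERT_MAX_COST_USD", 20),
  };
}

class SessionGuardrails {
  private totals: Record<GuardrailKind, number> = { llmCalls: 0, totalTokens: 0, costUsd: 0 };
  private limits: GuardrailLimits;
  private sessionId: string;
  private warn: (message: string) => void;
  // Turn index of the first breach per guardrail (absent means never breached).
  readonly firstBreachTurn: Partial<Record<GuardrailKind, number>> = {};

  constructor(limits: GuardrailLimits, sessionId: string, warn: (message: string) => void) {
    this.limits = limits;
    this.sessionId = sessionId;
    this.warn = warn;
  }

  // Alert-only: updates counters and warns, never stops or alters the session.
  recordCall(turnIndex: number, tokens: number, costUsd: number): void {
    this.totals.llmCalls += 1;
    this.totals.totalTokens += tokens;
    this.totals.costUsd += costUsd;
    for (const kind of Object.keys(this.limits) as GuardrailKind[]) {
      if (this.totals[kind] > this.limits[kind] && this.firstBreachTurn[kind] === undefined) {
        this.firstBreachTurn[kind] = turnIndex;
        this.warn(`[langfuse-cost] ${kind} guardrail exceeded, session=${this.sessionId} totals=${JSON.stringify(this.totals)}`);
      }
    }
  }
}
```

`firstBreachTurn` is the kind of object that could be written into the `guardrails` trace metadata on each turn.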
Model/provider normalization
Both the pi-session extension and system-bus Langfuse bridge normalize provider/model before writing tags, trace metadata, and generation model fields. This keeps `provider:*` + `model:*` tags aligned with metadata after model switches and for provider-prefixed IDs such as:
- `anthropic/claude-opus-4-6`
- `openai-codex/gpt-5.4`
Normalization is fail-open: tracing continues even if normalization cannot resolve a value.
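A possible shape for fail-open normalization is shown below. The rules are assumptions, not the repo's actual logic: a provider prefix in the model ID wins over a stale provider value, a bare model ID gets the known provider prepended, and any unexpected error returns the input untouched. `normalizeModelRef` is a hypothetical name.

```typescript
interface ModelRef {
  provider?: string;
  model?: string;
}

function normalizeModelRef(raw: ModelRef): ModelRef {
  try {
    const provider = raw.provider?.trim().toLowerCase() || undefined;
    const model = raw.model?.trim() || undefined;
    if (!model) return { provider, model };

    const slash = model.indexOf("/");
    if (slash > 0) {
      // Provider-prefixed ID: trust the prefix so tags follow a model switch.
      return { provider: model.slice(0, slash).toLowerCase(), model };
    }
    // Bare model ID: prefix it with the provider when one is known.
    return provider ? { provider, model: `${provider}/${model}` } : { provider, model };
  } catch {
    // Fail-open: tracing continues with whatever values were supplied.
    return raw;
  }
}
```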
Output-contract + usage-coverage signals (2026-03-02)
System-bus inference now emits explicit coverage/output-contract metadata so low-yield calls are queryable:
- `usageCoverage: "present"|"missing"`
- `usageCaptured: boolean`
- `jsonRequested`, `jsonParsed`, `outputChars`
- warning OTEL event: `model_router.usage_missing`
For strict machine-readable paths, callers can require output contracts:
- `requireJson: true`: parse failure becomes inference failure
- `requireTextOutput: true`: empty text becomes inference failure
Recall rewrite traces now include `rewriteReason` in addition to `strategy` (`disabled|skipped|haiku|openai|fallback`) to separate deliberate skips from failure fallbacks.
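The two output contracts can be pictured as a check that runs on the model's text before it is returned to the caller. This is a sketch under assumptions: `checkOutputContract` and `InferenceContractError` are hypothetical names, and here `jsonRequested` simply mirrors `requireJson`, which may not be how system-bus derives it.

```typescript
interface OutputContract {
  requireJson?: boolean;
  requireTextOutput?: boolean;
}

interface OutputSignals {
  jsonRequested: boolean;
  jsonParsed: boolean;
  outputChars: number;
}

class InferenceContractError extends Error {}

function checkOutputContract(text: string, contract: OutputContract): OutputSignals {
  // requireTextOutput: whitespace-only output counts as empty.
  if (contract.requireTextOutput && text.trim() === "") {
    throw new InferenceContractError("empty text output");
  }
  let jsonParsed = false;
  if (contract.requireJson) {
    try {
      JSON.parse(text);
      jsonParsed = true;
    } catch {
      // requireJson: a parse failure is surfaced as an inference failure.
      throw new InferenceContractError("output is not valid JSON");
    }
  }
  return { jsonRequested: Boolean(contract.requireJson), jsonParsed, outputChars: text.length };
}
```

The returned signals are the kind of values that could be attached to trace metadata so low-yield calls stay queryable even when no contract is enforced.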
Known Gaps
| Issue | Severity | Notes |
|---|---|---|
| `cache_write_input_tokens` not priced | Medium | Langfuse platform limitation: no cache write rate in their pricing table |
| No on first turn | Low | not set before first |
| matching | Low | Relies on a specific event field; if pi changes the field name, spans won't close |
Debugging
Check recent traces
    # Lease short-lived credentials, then list the five most recent traces
    LF_PK=$(secrets lease langfuse_public_key --ttl 5m)
    LF_SK=$(secrets lease langfuse_secret_key --ttl 5m)
    curl -s -u "$LF_PK:$LF_SK" "https://us.cloud.langfuse.com/api/public/traces?limit=5" \
      | jq '[.data[] | {name, ts: .timestamp[:19], obs: (.observations | length), output: (.output // "" | tostring | .[0:60])}]'
Check nested observations on a trace
    # Reuses LF_PK and LF_SK from the previous command
    TRACE_ID="<id>"
    curl -s -u "$LF_PK:$LF_SK" "https://us.cloud.langfuse.com/api/public/observations?traceId=$TRACE_ID" \
      | jq '[.data[] | {name, type, model, startTime: .startTime[:19], endTime: .endTime[:19]}]'
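If the same check is needed from a script or test rather than a shell, the observations request can be built as below. The endpoint path and Basic auth scheme mirror the curl call; `buildObservationsRequest` is a hypothetical helper and the sketch only builds the request, leaving the `fetch` call to the caller.

```typescript
interface LangfuseRequest {
  url: string;
  headers: Record<string, string>;
}

function buildObservationsRequest(
  baseUrl: string,
  publicKey: string,
  secretKey: string,
  traceId: string,
): LangfuseRequest {
  const url = new URL("/api/public/observations", baseUrl);
  url.searchParams.set("traceId", traceId);
  // Same credentials as `curl -u "$LF_PK:$LF_SK"`: HTTP Basic auth.
  const token = btoa(`${publicKey}:${secretKey}`);
  return { url: url.toString(), headers: { Authorization: `Basic ${token}` } };
}
```

A caller would then do something like `fetch(req.url, { headers: req.headers })` and inspect the `data` array in the JSON response.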
Common Issues
| Symptom | Cause | Fix |
|---|---|---|
| Double traces | Extension loaded twice via symlink/realpath split | globalThis dedup guard (already fixed) |
| output instead of tool names | events not firing | Check pi version, verify field on event |
| No traces at all | Langfuse creds missing | Check / env |
| on gateway | not set | Must be in |
| Stale extension code | Gateway/interactive session not reloaded after change | Restart gateway and start a fresh interactive session |
| OTEL emit errors in gateway | system-bus-worker port-forward down | |
Key Files
- Pi extension: `pi/extensions/langfuse-cost/index.ts`
- Pi extension tests: `pi/extensions/langfuse-cost/index.test.ts`
- System-bus bridge: `packages/system-bus/src/lib/langfuse.ts`
- Gateway ops notes: `docs/gateway.md`
Deployment Workflow
After editing the pi extension:
- Commit changes in this repo (source of truth).
- Restart gateway so the updated extension is loaded.
- Start a new interactive pi session (or reload) so per-session tracing uses the new code.
ADRs
- ADR-0146: Inference Cost Monitoring and Control (shipped)
- ADR-0147: Named Agent Profiles (trace attribution by role)