Joelclaw langfuse

Instrument joelclaw LLM calls with Langfuse tracing. Covers the @langfuse/tracing SDK, observation hierarchy (spans, generations, tools, agents), propagateAttributes for userId/sessionId/tags, the pi-session extension (langfuse-cost), and the system-bus OTEL integration. Use when adding Langfuse traces, debugging missing/broken traces, checking cost data, or improving observability on any LLM surface.

install
source · Clone the upstream repo
git clone https://github.com/joelhooks/joelclaw
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/joelhooks/joelclaw "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/langfuse" ~/.claude/skills/joelhooks-joelclaw-langfuse && rm -rf "$T"
manifest: skills/langfuse/SKILL.md
source content

Langfuse Observability

Langfuse is the LLM observability layer for joelclaw. Every LLM call produces a Langfuse trace with nested hierarchy, I/O, usage, cost, and attribution.

Architecture

joelclaw has two Langfuse integration points:

1. Pi-session extension (
langfuse-cost
)

  • Source:
    pi/extensions/langfuse-cost/index.ts
    (canonical, git-tracked in this repo)
  • Runtime: loaded as a pi extension from the same source tree
  • What it traces: Every gateway + interactive pi session LLM call
  • How: Hooks into pi session events (
    session_start
    ,
    message_start
    ,
    message_end
    ,
    tool_call
    ,
    tool_result
    ,
    session_shutdown
    )
  • Dedup:
    globalThis.__langfuse_cost_loaded__
    guard prevents duplicate extension instances
  • Optional dependency behavior:
    langfuse
    is lazily loaded (no top-level hard import). Missing module must disable telemetry, not crash extension import. Regression test:
    pi/extensions/langfuse-cost/index.test.ts
  • Runtime dependency location: because the extension is loaded from
    pi/extensions/
    at repo root instead of a workspace package, the
    langfuse
    npm package must be available from the repo root
    package.json
    . If root install drift drops it, gateway/session telemetry silently degrades to the optional-dependency warning again.

2. System-bus OTEL bridge (
langfuse.ts
)

  • Source:
    packages/system-bus/src/lib/langfuse.ts
  • What it traces: All Inngest function LLM calls (reflect, triage, email cleanup, docs ingest)
  • How:
    @langfuse/otel
    LangfuseSpanProcessor
    +
    @langfuse/tracing
    startObservation()
  • Produces:
    joelclaw.inference
    traces with generation children

Current Trace Hierarchy (pi-session)

The

langfuse-cost
extension produces a 4-level nested span hierarchy:

joelclaw.session (trace)
  └── session (span) — entire session lifetime
        └── turn-1 (span) — user message → final assistant response
        │     ├── tool:bash (span) — individual tool execution
        │     ├── tool:read (span)
        │     └── llm.call (generation) — the LLM API call with usage/cost
        └── turn-2 (span)
              ├── tool:edit (span)
              ├── tool:bash (span)
              └── llm.call (generation)

What each level captures

LevelCreated onEnded onContains
joelclaw.session
trace
session_start
session_shutdown
userId, sessionId, tags, turn count
session
span
session_start
session_shutdown
Channel, session type, turn count
turn-N
span
message_start[user]
message_end[assistant]
with text output
User input (clean), sourceChannel metadata
tool:name
span
tool_call
event
tool_result
event
Tool input, output (truncated 500 chars)
llm.call
generation
message_end[assistant]
immediateModel, usage, cache tokens, cost, I/O

Channel header stripping

User messages from Telegram arrive with a

---\nChannel:...\n---
header. The extension:

  1. Strips the header from trace
    input
    (clean user text only)
  2. Parses known keys (
    channel
    ,
    date
    ,
    platform_capabilities
    ) into
    sourceChannel
    metadata
  3. Skips multi-line values (e.g.
    formatting_guide
    )

Credentials

Langfuse creds in

agent-secrets
:

  • langfuse_public_key
    pk-lf-cb8b...
  • langfuse_secret_key
    sk-lf-c86f...
  • langfuse_base_url
    https://us.cloud.langfuse.com

Gateway gets them via

gateway-start.sh
env exports. System-bus resolves via env →
secrets lease
fallback.

Gotcha:

secrets lease
prints a JSON error envelope to stdout and still exits
0
when the daemon is unavailable. Any Langfuse loader that shells to
secrets
must either use
--json
and read
result.value
, or explicitly reject
ok:false
JSON payloads. Never trust raw stdout as a base URL or credential.

Trace Conventions

Naming

  • Pi-session:
    joelclaw.session
    (trace) →
    session
    turn-N
    tool:name
    llm.call
  • System-bus:
    joelclaw.inference
    (trace) → generation children

Required Attributes

Every trace MUST have:

  • userId: "joel"
  • sessionId
    — pi session ID for grouping
  • tags
    — minimum:
    ["joelclaw", "pi-session"]
  • Dynamic tags:
    provider:anthropic
    ,
    model:anthropic/claude-opus-4-6
    ,
    channel:central
    ,
    session:central

Metadata Shape (flat, filterable)

{
  channel: "central",           // GATEWAY_ROLE env
  sessionType: "central",       // "gateway" | "interactive" | "codex" | "central"
  component: "pi-session",
  model: "anthropic/claude-opus-4-6",
  provider: "anthropic",
  stopReason: "toolUse",        // or "endTurn"
  turnCount: 5,                 // Updated on each turn
  sourceChannel: {              // Only on first user message per turn
    channel: "telegram",
    date: "...",
    platform_capabilities: "..."
  },
  tools: ["bash", "read"],      // Tool names used this turn
}

Generation usageDetails

{
  input: 1,                      // Non-cached input tokens
  output: 97,                    // Output tokens
  total: 68195,                  // Total tokens
  cache_read_input_tokens: 67877, // 90% discount
  cache_write_input_tokens: 220,  // 25% premium (NOT priced by Langfuse — known gap)
}

Pi session guardrails (alert-only)

Long-running pi sessions can dominate Langfuse spend. The extension now tracks per-session totals and emits warnings only on first threshold breach per guardrail type:

  • JOELCLAW_LANGFUSE_ALERT_MAX_LLM_CALLS
    (default:
    120
    )
  • JOELCLAW_LANGFUSE_ALERT_MAX_TOTAL_TOKENS
    (default:
    1200000
    )
  • JOELCLAW_LANGFUSE_ALERT_MAX_COST_USD
    (default:
    20
    )

Behavior:

  • no automatic model switch
  • no forced compaction
  • no stop/interruption
  • emits
    console.warn(...)
    with session ID + current counters
  • records breach flags and first breach turn index in trace metadata (
    guardrails
    )

Model/provider normalization

Both the pi-session extension and system-bus Langfuse bridge normalize provider/model before writing tags, trace metadata, and generation model fields. This keeps

provider:*
+
model:*
tags aligned with metadata after model switches and for provider-prefixed IDs such as:

  • anthropic/claude-opus-4-6
  • openai-codex/gpt-5.4

Normalization is fail-open: tracing continues even if normalization cannot resolve a value.

Output-contract + usage-coverage signals (2026-03-02)

System-bus inference now emits explicit coverage/output-contract metadata so low-yield calls are queryable:

  • usageCoverage: "present"|"missing"
  • usageCaptured: boolean
  • jsonRequested
    ,
    jsonParsed
    ,
    outputChars
  • warning OTEL event:
    model_router.usage_missing

For strict machine-readable paths, callers can require output contracts:

  • requireJson: true
    — parse failure becomes inference failure
  • requireTextOutput: true
    — empty text becomes inference failure

Recall rewrite traces now include

rewriteReason
in addition to strategy (
disabled|skipped|haiku|openai|fallback
) to separate deliberate skips from failure fallbacks.

Known Gaps

IssueSeverityNotes
cache_write_input_tokens
not priced
MediumLangfuse platform limitation — no cache write rate in their pricing table
No
completionStartTime
on first turn
Low
lastAssistantStartTime
not set before first
message_start[assistant]
tool_result
matching
LowRelies on
toolCallId
— if pi changes the field name, spans won't close

Debugging

Check recent traces

LF_PK=$(secrets lease langfuse_public_key --ttl 5m)
LF_SK=$(secrets lease langfuse_secret_key --ttl 5m)
curl -s -u "$LF_PK:$LF_SK" "https://us.cloud.langfuse.com/api/public/traces?limit=5" \
  | jq '[.data[] | {name, ts: .timestamp[:19], obs: (.observations | length), output: (.output // "" | tostring | .[0:60])}]'

Check nested observations on a trace

TRACE_ID="<id>"
curl -s -u "$LF_PK:$LF_SK" "https://us.cloud.langfuse.com/api/public/observations?traceId=$TRACE_ID" \
  | jq '[.data[] | {name, type, model, startTime: .startTime[:19], endTime: .endTime[:19]}]'

Common Issues

SymptomCauseFix
Double tracesExtension loaded twice via symlink/realpath splitglobalThis dedup guard (already fixed)
[toolUse]
output instead of tool names
tool_call
events not firing
Check pi version, verify
toolName
field on event
No traces at allLangfuse creds missingCheck
LANGFUSE_PUBLIC_KEY
/
LANGFUSE_SECRET_KEY
env
channel:interactive
on gateway
GATEWAY_ROLE
not set
Must be in
gateway-start.sh
Stale extension codeGateway/interactive session not reloaded after changeRestart gateway and start a fresh interactive session
OTEL emit errors in gatewaysystem-bus-worker port-forward down
kubectl port-forward -n joelclaw svc/system-bus-worker 3111:3111

Key Files

  • Pi extension:
    pi/extensions/langfuse-cost/index.ts
  • Pi extension tests:
    pi/extensions/langfuse-cost/index.test.ts
  • System-bus bridge:
    packages/system-bus/src/lib/langfuse.ts
  • Gateway ops notes:
    docs/gateway.md

Deployment Workflow

After editing the pi extension:

  1. Commit changes in this repo (source of truth).
  2. Restart gateway so the updated extension is loaded.
  3. Start a new interactive pi session (or reload) so per-session tracing uses the new code.

ADRs

  • ADR-0146: Inference Cost Monitoring and Control —
    shipped
  • ADR-0147: Named Agent Profiles (trace attribution by role)