Claude-skill-registry langfuse-advanced-filters
Precisely filter and query Langfuse traces/observations using advanced filter operators for debugging and optimization workflows
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/langfuse-advanced-filters" ~/.claude/skills/majiayu000-claude-skill-registry-langfuse-advanced-filters && rm -rf "$T"
skills/data/langfuse-advanced-filters/SKILL.md
Langfuse Advanced Filters Skill
Leverage Langfuse's new advanced filtering API to perform surgical, precise queries on traces and observations. Perfect for debugging specific issues, analyzing performance patterns, and optimizing workflows with exact filter criteria.
When to Use This Skill
- "Find all traces with latency > 5 seconds for case 0001"
- "Show me observations where the edit node failed validation"
- "Get traces with metadata case_id=0001 AND profile_name='The Prep'"
- "Find all ERROR level observations from the last 24 hours"
- "Query traces where custom_metric > threshold"
- "Analyze traces with specific tag combinations"
- "Debug why tool selection is failing for financial topics"
What Makes This Different
- Existing langfuse-optimization skill: Great for config analysis, but uses simple tag/name filters.
- This skill: Surgical precision with advanced operators (`>`, `<`, `=`, `contains`, etc.) and complex filter combinations.
Advanced Filter Capabilities
Filter Operators
Based on Langfuse Launch Week 4 (Oct 2025), the API now supports:
- `=`: Exact match
- `>`: Greater than (numeric, datetime)
- `<`: Less than (numeric, datetime)
- `>=`: Greater than or equal
- `<=`: Less than or equal
- `contains`: String contains (case-sensitive)
- `not_contains`: String does not contain
- `in`: Value in list
- `not_in`: Value not in list
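To make the operator semantics concrete, here is a minimal local sketch in Python that mirrors each operator against an in-memory value. It is an illustration only: the real comparison happens server-side in Langfuse, and `OPERATORS` and `apply_operator` are hypothetical names that do not come from the skill's helpers.

```python
from typing import Any, Callable, Dict

# Hypothetical local mirror of the operators listed above.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "=": lambda actual, expected: actual == expected,
    ">": lambda actual, expected: actual > expected,
    "<": lambda actual, expected: actual < expected,
    ">=": lambda actual, expected: actual >= expected,
    "<=": lambda actual, expected: actual <= expected,
    # Substring checks are case-sensitive, matching the list above.
    "contains": lambda actual, expected: expected in actual,
    "not_contains": lambda actual, expected: expected not in actual,
    "in": lambda actual, expected: actual in expected,
    "not_in": lambda actual, expected: actual not in expected,
}


def apply_operator(operator: str, actual: Any, expected: Any) -> bool:
    """Evaluate one operator against an in-memory value."""
    if operator not in OPERATORS:
        raise ValueError(f"Unsupported operator: {operator}")
    return OPERATORS[operator](actual, expected)
```

For example, `apply_operator("contains", "Workflow run", "workflow")` is false because the check is case-sensitive.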
Filter Structure
    {
      "column": "string",    // Column to filter on (e.g., "name", "level", "metadata")
      "operator": "string",  // Operator (=, >, <, contains, etc.)
      "value": "any",        // Value to compare against
      "type": "string",      // Data type: "string", "number", "stringObject", "datetime"
      "key": "string"        // Required for metadata filters (e.g., "case_id")
    }
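A filter with this shape can be sanity-checked before it is sent anywhere. The sketch below is one possible validator, assuming only the fields, operators, and types listed in this document; `validate_filter` is a hypothetical helper and is not claimed to match what `build_filters.py --validate` does.

```python
from typing import Any, Dict, List

# Allowed values are copied from the lists in this document (an assumption,
# not an authoritative copy of the Langfuse API schema).
ALLOWED_OPERATORS = {"=", ">", "<", ">=", "<=", "contains", "not_contains", "in", "not_in"}
ALLOWED_TYPES = {"string", "number", "stringObject", "datetime"}


def validate_filter(flt: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the filter looks well-formed."""
    problems: List[str] = []
    for field in ("column", "operator", "value", "type"):
        if field not in flt:
            problems.append(f"missing field: {field}")
    if "operator" in flt and flt["operator"] not in ALLOWED_OPERATORS:
        problems.append(f"unsupported operator: {flt['operator']}")
    if "type" in flt and flt["type"] not in ALLOWED_TYPES:
        problems.append(f"unsupported type: {flt['type']}")
    # Metadata filters need a key naming the metadata field.
    if flt.get("column") == "metadata" and not flt.get("key"):
        problems.append("metadata filters require a key")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a hand-written filter at once.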
Filterable Fields
Traces:
- `name`: Trace name
- `user_id`: User identifier
- `session_id`: Session identifier
- `tags`: Tags array
- `metadata`: Custom metadata (use the `key` parameter)
- `timestamp`: Trace timestamp
- `input`: Input data
- `output`: Output data
Observations:
- `name`: Observation name
- `type`: Observation type (SPAN, GENERATION, EVENT)
- `level`: Log level (DEBUG, DEFAULT, WARNING, ERROR)
- `trace_id`: Parent trace ID
- `parent_observation_id`: Parent observation ID
- `start_time`: Observation start time
- `end_time`: Observation end time
- `metadata`: Custom metadata (use the `key` parameter)
- `latency`: Computed latency (ms)
Required Environment Variables
- `LANGFUSE_PUBLIC_KEY`: Your Langfuse public API key
- `LANGFUSE_SECRET_KEY`: Your Langfuse secret API key
- `LANGFUSE_HOST`: Langfuse host URL (default: https://cloud.langfuse.com)
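A helper script would typically read these variables once at startup and fail early if a key is missing. The following is a minimal sketch of that pattern; `load_langfuse_config` is a hypothetical function, and the optional `env` argument exists only so the logic can be exercised without touching the real environment.

```python
import os
from typing import Dict, Mapping, Optional

DEFAULT_HOST = "https://cloud.langfuse.com"


def load_langfuse_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read the three variables above, applying the documented default host."""
    env = os.environ if env is None else env
    required = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "public_key": env["LANGFUSE_PUBLIC_KEY"],
        "secret_key": env["LANGFUSE_SECRET_KEY"],
        "host": env.get("LANGFUSE_HOST", DEFAULT_HOST),
    }
```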
Workflow
Step 1: Understand User's Query Intent
Ask clarifying questions to build precise filters:
For debugging:
- What specific issue are you investigating?
- What time range? (last hour, last 24h, specific date range)
- Any known trace IDs or patterns?
- Which component/node is problematic?
For optimization:
- What metric are you optimizing? (latency, error rate, check failures)
- What threshold defines "problematic"? (> 5s, < 0.7 score)
- Which case or workflow?
- Do you need aggregated metrics or individual traces?
For analysis:
- What pattern are you looking for?
- Do you need comparison across time periods?
- Should results be grouped by metadata field?
Step 2: Build Filter Query
Use the advanced filter helpers to construct precise queries:
Helper A: Query Traces with Advanced Filters
    cd /home/user/writing_ecosystem/.claude/skills/langfuse-advanced-filters/helpers

    # Example 1: Find slow traces (latency > 5000ms) for a specific case
    python3 query_with_filters.py \
      --view traces \
      --filters '[
        {"column": "metadata", "operator": "=", "key": "case_id", "value": "0001", "type": "stringObject"},
        {"column": "latency", "operator": ">", "value": 5000, "type": "number"}
      ]' \
      --from-date "2025-11-01" \
      --to-date "2025-11-04" \
      --limit 50 \
      --output /tmp/langfuse_queries/slow_traces.json

    # Example 2: Find ERROR level observations in edit node
    python3 query_with_filters.py \
      --view observations \
      --filters '[
        {"column": "name", "operator": "=", "value": "edit_node", "type": "string"},
        {"column": "level", "operator": "=", "value": "ERROR", "type": "string"}
      ]' \
      --from-date "2025-11-03" \
      --limit 100 \
      --output /tmp/langfuse_queries/edit_errors.json

    # Example 3: Complex AND filter - case AND profile AND time range
    python3 query_with_filters.py \
      --view traces \
      --filters '[
        {"column": "metadata", "operator": "=", "key": "case_id", "value": "0001", "type": "stringObject"},
        {"column": "metadata", "operator": "=", "key": "profile_name", "value": "The Prep", "type": "stringObject"},
        {"column": "timestamp", "operator": ">=", "value": "2025-11-03T00:00:00Z", "type": "datetime"}
      ]' \
      --limit 25 \
      --output /tmp/langfuse_queries/prep_case_traces.json

    # Example 4: Find traces with specific tags AND name pattern
    python3 query_with_filters.py \
      --view traces \
      --filters '[
        {"column": "name", "operator": "contains", "value": "workflow", "type": "string"},
        {"column": "tags", "operator": "contains", "value": "production", "type": "string"}
      ]' \
      --from-date "2025-11-01" \
      --output /tmp/langfuse_queries/prod_workflows.json
Output: JSON file with filtered traces/observations matching ALL criteria (AND logic)
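Hand-quoting JSON inside a shell argument is easy to get wrong. One way around that, sketched below, is to build the argument list in Python and let `json.dumps` produce the `--filters` value. Only flags that appear in the examples above are used; `build_query_command` is a hypothetical wrapper, not part of the skill.

```python
import json
from typing import Any, Dict, List


def build_query_command(view: str, filters: List[Dict[str, Any]], limit: int) -> List[str]:
    """Build an argv list for query_with_filters.py using flags shown above."""
    return [
        "python3", "query_with_filters.py",
        "--view", view,
        "--filters", json.dumps(filters),  # serialized once, no shell quoting needed
        "--limit", str(limit),
    ]


filters = [
    {"column": "metadata", "operator": "=", "key": "case_id", "value": "0001", "type": "stringObject"},
    {"column": "latency", "operator": ">", "value": 5000, "type": "number"},
]
command = build_query_command("traces", filters, 50)
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the helpers directory, which avoids shell interpretation of the JSON entirely.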
Helper B: Query Metrics with Filters (Aggregated Analysis)
For aggregated insights, use the Metrics API with filters:
    cd /home/user/writing_ecosystem/.claude/skills/langfuse-advanced-filters/helpers

    # Example 1: Average latency by case_id, filtered by time range
    python3 query_metrics.py \
      --view traces \
      --metrics '[{"measure": "latency", "aggregation": "avg"}]' \
      --dimensions '[{"field": "metadata.case_id"}]' \
      --filters '[
        {"column": "metadata", "operator": "!=", "key": "case_id", "value": null, "type": "stringObject"}
      ]' \
      --from-date "2025-11-01" \
      --to-date "2025-11-04" \
      --output /tmp/langfuse_queries/latency_by_case.json

    # Example 2: Error count by observation name (which nodes fail most?)
    python3 query_metrics.py \
      --view observations \
      --metrics '[{"measure": "count", "aggregation": "count"}]' \
      --dimensions '[{"field": "name"}]' \
      --filters '[
        {"column": "level", "operator": "=", "value": "ERROR", "type": "string"}
      ]' \
      --from-date "2025-11-01" \
      --output /tmp/langfuse_queries/error_counts.json

    # Example 3: P95 latency histogram for specific case (performance distribution)
    python3 query_metrics.py \
      --view traces \
      --metrics '[{"measure": "latency", "aggregation": "p95"}]' \
      --filters '[
        {"column": "metadata", "operator": "=", "key": "case_id", "value": "0001", "type": "stringObject"}
      ]' \
      --from-date "2025-10-01" \
      --to-date "2025-11-04" \
      --time-granularity "day" \
      --output /tmp/langfuse_queries/latency_p95_trend.json

    # Example 4: Count traces by user, filtered by metadata
    python3 query_metrics.py \
      --view traces \
      --metrics '[{"measure": "count", "aggregation": "count"}]' \
      --dimensions '[{"field": "userId"}]' \
      --filters '[
        {"column": "metadata", "operator": "=", "key": "workflow_version", "value": "2", "type": "stringObject"}
      ]' \
      --from-date "2025-11-01" \
      --output /tmp/langfuse_queries/user_trace_counts.json
Output: Aggregated metrics grouped by dimensions
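To clarify what "average latency grouped by `metadata.case_id`" means, here is the same aggregation computed locally over a handful of trace records. The record shape (a `latency` number and a `metadata` dict) is an assumption made for illustration; the Metrics API performs this server-side, and its response format is not shown here.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, List


def average_latency_by_case(traces: List[Dict[str, Any]]) -> Dict[str, float]:
    """Group traces by metadata.case_id and average their latency (ms)."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for trace in traces:
        case_id = trace.get("metadata", {}).get("case_id")
        if case_id is not None:  # skip traces without a case_id
            grouped[case_id].append(trace["latency"])
    return {case_id: mean(values) for case_id, values in grouped.items()}


sample_traces = [
    {"latency": 1000, "metadata": {"case_id": "0001"}},
    {"latency": 3000, "metadata": {"case_id": "0001"}},
    {"latency": 500, "metadata": {"case_id": "0002"}},
    {"latency": 900, "metadata": {}},
]
averages = average_latency_by_case(sample_traces)
```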
Helper C: Build Filter JSON (Interactive Builder)
For complex queries, use the interactive builder:
    cd /home/user/writing_ecosystem/.claude/skills/langfuse-advanced-filters/helpers

    # Interactive mode
    python3 build_filters.py --interactive

    # Or programmatic mode
    python3 build_filters.py \
      --add-filter '{"column": "metadata", "operator": "=", "key": "case_id", "value": "0001", "type": "stringObject"}' \
      --add-filter '{"column": "latency", "operator": ">", "value": 3000, "type": "number"}' \
      --validate \
      --output /tmp/langfuse_queries/my_filters.json
Step 3: Analyze Results
Once you have filtered data, extract insights:
    cd /home/user/writing_ecosystem/.claude/skills/langfuse-advanced-filters/helpers

    # Analyze filtered traces for patterns
    python3 analyze_filtered_results.py \
      --input /tmp/langfuse_queries/slow_traces.json \
      --analysis-type latency-breakdown \
      --output /tmp/langfuse_queries/analysis_report.json

    # Compare two filter result sets
    python3 analyze_filtered_results.py \
      --input /tmp/langfuse_queries/before_fix.json \
      --compare /tmp/langfuse_queries/after_fix.json \
      --analysis-type comparison \
      --output /tmp/langfuse_queries/comparison_report.json
Step 4: Generate Insights Report
Synthesize findings into actionable recommendations:
Filter Query Results - [Query Description]
Query: [Natural language description]
Filters Applied: [Show filter JSON]
Results:
- Total matches: X traces / Y observations
- Time range: [start] to [end]
- Key patterns:
- [Pattern 1 with counts]
- [Pattern 2 with counts]
Findings
Issue #1: [Title]
Severity: High/Medium/Low
Frequency: X occurrences (Y% of filtered set)
Pattern: [Description]
Evidence:
- Trace IDs: [list top 3-5]
- Common metadata: [shared values]
- Time distribution: [pattern]
Root Cause Hypothesis: [Analysis based on filtered data]
Recommended Fix: [Specific action item]
Next Steps
- [Action item 1]
- [Action item 2]
- [Follow-up query to validate fix]
Common Use Cases
Use Case 1: Debug Slow Workflows
Query: "Why are traces for case 0001 suddenly taking >10s?"
Filters:
    [
      {"column": "metadata", "operator": "=", "key": "case_id", "value": "0001", "type": "stringObject"},
      {"column": "timestamp", "operator": ">=", "value": "2025-11-03T00:00:00Z", "type": "datetime"},
      {"column": "latency", "operator": ">", "value": 10000, "type": "number"}
    ]
Analysis:
- Retrieve matching traces
- Extract observations for each trace
- Compare node latencies
- Identify bottleneck node (research, write, edit)
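The comparison step above can be sketched as a small aggregation: sum latency per node name across a trace's observations and pick the largest. The observation shape (`name` and `latency` in ms) follows the filterable fields listed earlier, but `find_bottleneck` itself is a hypothetical helper.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def find_bottleneck(observations: List[Dict[str, Any]]) -> Tuple[str, float]:
    """Return the node name with the highest total latency, and that total (ms)."""
    totals: Dict[str, float] = defaultdict(float)
    for obs in observations:
        totals[obs["name"]] += obs["latency"]
    if not totals:
        raise ValueError("no observations to analyze")
    name = max(totals, key=totals.get)
    return name, totals[name]


sample_observations = [
    {"name": "research", "latency": 2000},
    {"name": "write", "latency": 7000},
    {"name": "edit", "latency": 1500},
    {"name": "write", "latency": 1000},
]
bottleneck = find_bottleneck(sample_observations)
```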
Use Case 2: Find Failing Validation Checks
Query: "Which traces have failing style checks for tone_consistency?"
Approach:
- Filter traces by case
- Get edit node observations
- Extract validation_report from output
- Count tone_consistency failures
Filters:
    [
      {"column": "metadata", "operator": "=", "key": "case_id", "value": "0001", "type": "stringObject"},
      {"column": "name", "operator": "contains", "value": "edit", "type": "string"}
    ]
Use Case 3: Analyze Tool Selection Patterns
Query: "Is the research node selecting finnhub for financial topics?"
Filters:
    [
      {"column": "name", "operator": "contains", "value": "research", "type": "string"},
      {"column": "metadata", "operator": "contains", "key": "topic", "value": "stock", "type": "stringObject"}
    ]
Analysis:
- Extract tool selection from research node output
- Count finnhub vs other tools
- Identify cases where finnhub was NOT selected but should have been
Use Case 4: Performance Regression Detection
Query: "Did latency increase after deploying new workflow version?"
Strategy:
- Query metrics for workflow_version=1 (before)
- Query metrics for workflow_version=2 (after)
- Compare p95 latency
Filters (Before):
    [
      {"column": "metadata", "operator": "=", "key": "workflow_version", "value": "1", "type": "stringObject"},
      {"column": "timestamp", "operator": ">=", "value": "2025-10-01", "type": "datetime"},
      {"column": "timestamp", "operator": "<", "value": "2025-10-15", "type": "datetime"}
    ]
Filters (After):
    [
      {"column": "metadata", "operator": "=", "key": "workflow_version", "value": "2", "type": "stringObject"},
      {"column": "timestamp", "operator": ">=", "value": "2025-10-15", "type": "datetime"}
    ]
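Once both result sets are available, the p95 comparison is simple arithmetic. The sketch below uses the nearest-rank definition of a percentile, which is one common convention and may differ slightly from how the Metrics API computes `p95`; `percentile` and `p95_change` are hypothetical names.

```python
from typing import Sequence


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not values:
        raise ValueError("values must not be empty")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(values)
    # Integer form of ceil(pct * n / 100), avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def p95_change(before: Sequence[float], after: Sequence[float]) -> float:
    """Relative change in p95 latency (0.5 means 50% slower after the deploy)."""
    p_before = percentile(before, 95)
    p_after = percentile(after, 95)
    return (p_after - p_before) / p_before
```

A positive result from `p95_change` points to a regression in the newer version; what counts as significant depends on the thresholds agreed in Step 1.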
Use Case 5: Error Spike Investigation
Query: "Why did ERROR observations spike in the last 6 hours?"
Filters:
    [
      {"column": "level", "operator": "=", "value": "ERROR", "type": "string"},
      {"column": "start_time", "operator": ">=", "value": "2025-11-04T12:00:00Z", "type": "datetime"}
    ]
Analysis:
- Group errors by observation name (which node?)
- Group by trace_id to find affected workflows
- Extract error messages from status_message
- Identify common patterns
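The grouping steps above map naturally onto `collections.Counter`. This is a minimal sketch that assumes each observation is a dict with `name`, `trace_id`, `level`, and an optional `status_message`; `summarize_errors` is a hypothetical helper, not part of `analyze_filtered_results.py`.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize_errors(observations: List[Dict[str, Any]]) -> Dict[str, Counter]:
    """Count ERROR observations by node name, by trace, and by message."""
    errors = [obs for obs in observations if obs.get("level") == "ERROR"]
    return {
        "by_name": Counter(obs["name"] for obs in errors),
        "by_trace": Counter(obs["trace_id"] for obs in errors),
        "messages": Counter(obs.get("status_message", "") for obs in errors),
    }


sample = [
    {"name": "edit_node", "trace_id": "t1", "level": "ERROR", "status_message": "validation failed"},
    {"name": "edit_node", "trace_id": "t2", "level": "ERROR", "status_message": "validation failed"},
    {"name": "research", "trace_id": "t2", "level": "ERROR", "status_message": "timeout"},
    {"name": "write", "trace_id": "t3", "level": "DEFAULT"},
]
summary = summarize_errors(sample)
```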
Filter Syntax Reference
Metadata Filters
Metadata filters require special syntax:
    {
      "column": "metadata",
      "operator": "=",             // or ">", "<", "contains", etc.
      "key": "your_metadata_key",  // REQUIRED for metadata
      "value": "expected_value",
      "type": "stringObject"       // Always "stringObject" for metadata
    }
Example metadata keys (from writing ecosystem):
- `case_id`: Case identifier (0001, 0002, etc.)
- `profile_name`: Profile name ("The Prep", "Stock Deep Dive")
- `workflow_version`: Workflow version number
- `langgraph_node`: Node name
- `langgraph_step`: Step number
- `topic`: Topic string
- `style_id`: Style identifier
Combining Multiple Filters
All filters in the array are combined with AND logic:
    [
      {"column": "metadata", "operator": "=", "key": "case_id", "value": "0001", "type": "stringObject"},
      {"column": "level", "operator": "=", "value": "ERROR", "type": "string"},
      {"column": "latency", "operator": ">", "value": 5000, "type": "number"}
    ]
→ Matches: case_id=0001 AND level=ERROR AND latency>5000
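The AND semantics can be reproduced locally with `all()`: a record matches only if every filter in the array matches. The sketch below handles just the comparison operators and assumes records are plain dicts with an optional `metadata` dict; `lookup` and `matches_all` are hypothetical names used for illustration.

```python
import operator
from typing import Any, Dict, List

# Only comparison operators are covered in this sketch.
COMPARATORS = {
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def lookup(record: Dict[str, Any], flt: Dict[str, Any]) -> Any:
    """Resolve the value a filter refers to, using `key` for metadata columns."""
    if flt["column"] == "metadata":
        return record.get("metadata", {}).get(flt["key"])
    return record.get(flt["column"])


def matches_all(record: Dict[str, Any], filters: List[Dict[str, Any]]) -> bool:
    """True only if the record satisfies every filter (AND logic)."""
    return all(
        COMPARATORS[flt["operator"]](lookup(record, flt), flt["value"])
        for flt in filters
    )


filters = [
    {"column": "metadata", "operator": "=", "key": "case_id", "value": "0001", "type": "stringObject"},
    {"column": "level", "operator": "=", "value": "ERROR", "type": "string"},
    {"column": "latency", "operator": ">", "value": 5000, "type": "number"},
]
slow_error = {"metadata": {"case_id": "0001"}, "level": "ERROR", "latency": 6000}
fast_error = {"metadata": {"case_id": "0001"}, "level": "ERROR", "latency": 4000}
```

Here `slow_error` satisfies all three filters, while `fast_error` is rejected by the latency filter alone.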
Time Range Filters
Two approaches:
Approach 1: Query-level parameters (recommended):
--from-date "2025-11-01T00:00:00Z" --to-date "2025-11-04T23:59:59Z"
Approach 2: Filter-level (for precise control):
    [
      {"column": "timestamp", "operator": ">=", "value": "2025-11-01T00:00:00Z", "type": "datetime"},
      {"column": "timestamp", "operator": "<=", "value": "2025-11-04T23:59:59Z", "type": "datetime"}
    ]
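Relative ranges such as "last 6 hours" have to be turned into absolute timestamps before they can go into a filter. A minimal sketch of that conversion follows, producing the filter-level form shown above; `last_hours_filters` is a hypothetical helper, and the `now` argument is expected to be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Optional

ISO_UTC = "%Y-%m-%dT%H:%M:%SZ"


def last_hours_filters(
    hours: int,
    now: Optional[datetime] = None,
    column: str = "timestamp",
) -> List[Dict[str, Any]]:
    """Build a >= / <= datetime filter pair covering the last `hours` hours."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    start = now - timedelta(hours=hours)
    return [
        {"column": column, "operator": ">=", "value": start.strftime(ISO_UTC), "type": "datetime"},
        {"column": column, "operator": "<=", "value": now.strftime(ISO_UTC), "type": "datetime"},
    ]


fixed_now = datetime(2025, 11, 4, 18, 0, 0, tzinfo=timezone.utc)
window = last_hours_filters(6, now=fixed_now, column="start_time")
```

Passing `column="start_time"` adapts the same helper to observation queries.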
Numeric Comparisons
    // Latency greater than 5 seconds
    {"column": "latency", "operator": ">", "value": 5000, "type": "number"}

    // Token count less than 1000
    {"column": "usage_total", "operator": "<", "value": 1000, "type": "number"}

    // Score exactly 8.5
    {"column": "value", "operator": "=", "value": 8.5, "type": "number"}
String Operations
    // Name contains "workflow"
    {"column": "name", "operator": "contains", "value": "workflow", "type": "string"}

    // Name exactly matches
    {"column": "name", "operator": "=", "value": "write_node", "type": "string"}

    // User ID in list
    {"column": "user_id", "operator": "in", "value": ["user1", "user2", "user3"], "type": "string"}
Tips & Best Practices
1. Start Broad, Then Narrow
    # Step 1: Find all traces for case
    python3 query_with_filters.py --view traces \
      --filters '[{"column": "metadata", "operator": "=", "key": "case_id", "value": "0001", "type": "stringObject"}]' \
      --limit 100

    # Step 2: Narrow to slow traces
    python3 query_with_filters.py --view traces \
      --filters '[
        {"column": "metadata", "operator": "=", "key": "case_id", "value": "0001", "type": "stringObject"},
        {"column": "latency", "operator": ">", "value": 5000, "type": "number"}
      ]' \
      --limit 50
2. Use Metrics API for Aggregation
Don't retrieve 1000 traces just to count them - use metrics:
    # WRONG: Retrieve all traces and count them client-side
    python3 query_with_filters.py --view traces --limit 1000 | wc -l

    # RIGHT: Use metrics API
    python3 query_metrics.py --view traces \
      --metrics '[{"measure": "count", "aggregation": "count"}]' \
      --dimensions '[{"field": "metadata.case_id"}]'
3. Validate Filters Before Large Queries
    # Test with small limit first
    python3 query_with_filters.py --filters '[...]' --limit 5

    # Once validated, increase limit
    python3 query_with_filters.py --filters '[...]' --limit 500
4. Save Filter Definitions
    # Save complex filters for reuse (create the target directory first)
    mkdir -p /tmp/my_filters
    cat > /tmp/my_filters/slow_case_0001.json <<EOF
    [
      {"column": "metadata", "operator": "=", "key": "case_id", "value": "0001", "type": "stringObject"},
      {"column": "latency", "operator": ">", "value": 5000, "type": "number"}
    ]
    EOF

    # Reuse saved filters
    python3 query_with_filters.py --filters-file /tmp/my_filters/slow_case_0001.json
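The same save-and-reuse pattern can be done from Python when filters are generated programmatically. This is a small sketch using only the standard library; `save_filters` and `load_filters` are hypothetical helpers, and the file they write is a plain JSON array like the one above.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Union


def save_filters(filters: List[Dict[str, Any]], path: Union[str, Path]) -> Path:
    """Write a filter array to disk, creating parent directories as needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(filters, indent=2), encoding="utf-8")
    return path


def load_filters(path: Union[str, Path]) -> List[Dict[str, Any]]:
    """Read a filter array back, rejecting files that are not a JSON array."""
    filters = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(filters, list):
        raise ValueError("filter file must contain a JSON array")
    return filters
```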
5. Combine with Existing Skills
    # Step 1: Use advanced filters to find problematic traces (THIS SKILL)
    python3 query_with_filters.py --filters '[...]' --output /tmp/filtered_traces.json

    # Step 2: Analyze those traces with langfuse-optimization skill
    # (Switch to langfuse-optimization skill with filtered trace IDs)
Troubleshooting
"No results returned":
- Verify filters are correct (check column names, types)
- Try broader time range
- Remove filters one by one to isolate issue
- Check if data exists with basic query first
"Invalid filter syntax":
- Ensure JSON is valid (use `python3 build_filters.py --validate`)
- Check `type` matches data type (string, number, datetime, stringObject)
- For metadata, ensure `key` is specified
"Query timeout":
- Reduce time range
- Add more specific filters
- Use pagination (multiple queries with smaller limits)
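One way to apply the "reduce time range" and "multiple smaller queries" advice together is to split a long range into fixed-size windows and run one query per window. The sketch below only computes the windows; `split_time_range` is a hypothetical helper, and each resulting pair would be passed as `--from-date` and `--to-date`.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def split_time_range(
    start: datetime, end: datetime, window: timedelta
) -> List[Tuple[datetime, datetime]]:
    """Split [start, end) into consecutive windows of at most `window` length."""
    if window <= timedelta(0):
        raise ValueError("window must be positive")
    windows: List[Tuple[datetime, datetime]] = []
    cursor = start
    while cursor < end:
        upper = min(cursor + window, end)  # last window may be shorter
        windows.append((cursor, upper))
        cursor = upper
    return windows


daily = split_time_range(
    datetime(2025, 11, 1), datetime(2025, 11, 4), timedelta(days=1)
)
```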
"Metadata filter not working":
- Ensure `type` is `"stringObject"` (not `"string"`)
- Verify `key` matches exact metadata field name
- Check metadata exists in traces (inspect raw trace first)
Success Criteria
Good queries should:
- ✅ Use precise filters (not retrieving 10x more data than needed)
- ✅ Combine multiple filters for surgical precision
- ✅ Validate results match expectation
- ✅ Lead to actionable insights (not just data dumps)
- ✅ Be reproducible (save filter definitions)
Remember: This skill is about precision filtering, not general trace analysis. Use it when you need:
- Exact matching criteria
- Numeric thresholds
- Complex AND conditions
- Metadata-based filtering
- Aggregated metrics
For general config optimization, use the langfuse-optimization skill instead.