Babysitter phoenix-arize-setup
Arize Phoenix observability platform setup for LLM debugging and evaluation
install
source · Clone the upstream repo
git clone https://github.com/a5c-ai/babysitter
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/a5c-ai/babysitter "$T" && mkdir -p ~/.claude/skills && cp -r "$T/library/specializations/ai-agents-conversational/skills/phoenix-arize-setup" ~/.claude/skills/a5c-ai-babysitter-phoenix-arize-setup && rm -rf "$T"
manifest: library/specializations/ai-agents-conversational/skills/phoenix-arize-setup/SKILL.md
# Phoenix Arize Setup Skill
## Capabilities
- Set up Phoenix local server
- Configure tracing instrumentation
- Design evaluation experiments
- Implement embedding visualizations
- Set up retrieval analysis
- Create custom evaluations with LLM-as-judge
## Target Processes
- llm-observability-monitoring
- agent-evaluation-framework
## Implementation Details
### Core Features
- Tracing: OpenTelemetry-based LLM traces
- Evals: LLM-as-judge evaluations
- Embeddings: Visualization and drift detection
- Retrieval: RAG quality analysis
- Datasets: Experiment management
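The LLM-as-judge evals above are driven by prompt templates plus "rails" (the allowed output labels). A hedged sketch of a custom template in the style consumed by `phoenix.evals.llm_classify`; the template wording and variable names here are illustrative assumptions, not the library's built-in templates:

```python
# Illustrative custom LLM-as-judge template; {input} and {output} are
# placeholders filled per row from the dataset under evaluation.
RELEVANCE_TEMPLATE = """You are comparing a model response to a user question.
Question: {input}
Response: {output}
Is the response relevant to the question? Answer only "relevant" or "irrelevant"."""

# Rails constrain the judge model's output to a fixed label set.
RELEVANCE_RAILS = ["relevant", "irrelevant"]

# These would be passed to phoenix.evals.llm_classify together with a
# dataframe and a judge model, e.g. (sketch, not run here):
#   llm_classify(dataframe=df, template=RELEVANCE_TEMPLATE,
#                model=OpenAIModel(model="gpt-4o-mini"),
#                rails=RELEVANCE_RAILS)
prompt = RELEVANCE_TEMPLATE.format(
    input="What is Phoenix?",
    output="Phoenix is an open-source LLM observability tool.",
)
print(prompt)
```

Constraining outputs with rails is what makes judge results aggregatable into clean metrics rather than free-form text.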
### Instrumentation
- OpenAI auto-instrumentation
- LangChain instrumentation
- LlamaIndex instrumentation
- Custom span creation
### Configuration Options
- Phoenix server setup
- Trace sampling
- Evaluation metrics
- Embedding models
- Export settings
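Several of these options are controlled through environment variables read by the Phoenix server and client; a sketch of common ones (the values shown are illustrative defaults):

```shell
# Port the Phoenix UI/collector listens on (server side).
export PHOENIX_PORT=6006
# Endpoint instrumented apps send traces to (client side).
export PHOENIX_COLLECTOR_ENDPOINT="http://localhost:6006"
# Working directory for Phoenix data and export files.
export PHOENIX_WORKING_DIR="$HOME/.phoenix"
```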
## Best Practices
- Comprehensive instrumentation
- Regular evaluation runs
- Monitor embedding drift
- Analyze retrieval quality
## Dependencies
- arize-phoenix
- openinference-instrumentation-openai
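The listed dependencies can be installed together from PyPI (no versions are pinned here; pin as appropriate for your project):

```shell
pip install arize-phoenix openinference-instrumentation-openai
```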