Babysitter A/B Test Statistical Analyzer

Performs statistical analysis for A/B testing experiments

install
source · Clone the upstream repo
git clone https://github.com/a5c-ai/babysitter
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/a5c-ai/babysitter "$T" && mkdir -p ~/.claude/skills && cp -r "$T/library/specializations/data-engineering-analytics/skills/ab-test-statistical-analyzer" ~/.claude/skills/a5c-ai-babysitter-a-b-test-statistical-analyzer && rm -rf "$T"
manifest: library/specializations/data-engineering-analytics/skills/ab-test-statistical-analyzer/SKILL.md
source content

A/B Test Statistical Analyzer

Overview

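This skill performs statistical analysis for A/B testing experiments. It applies frequentist, Bayesian, or sequential methods to check that an experiment is valid (for example, free of sample ratio mismatch) and to determine whether the observed differences between control and variants are statistically significant.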

Capabilities

  • Sample size calculation
  • Statistical significance testing
  • Bayesian analysis
  • Sequential testing
  • Multi-armed bandit analysis
  • Segment analysis
  • Novelty/primacy effect detection
  • SRM (Sample Ratio Mismatch) detection
  • Confidence interval calculation
  • Power analysis
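
As an illustration of the sample size and power capabilities listed above, here is a minimal sketch of a per-arm sample size calculation for a conversion metric. It assumes a two-sided two-proportion z-test with equal allocation; the function name `sample_size_per_arm` and its parameters are hypothetical and are not taken from the skill itself.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_arm(p_control, min_detectable_effect, alpha=0.05, power=0.8):
    """Approximate per-arm sample size for a two-sided two-proportion z-test.

    Hypothetical helper: assumes equal traffic allocation and an absolute
    minimum detectable effect (e.g. 0.02 for 10% -> 12%).
    """
    p_variant = p_control + min_detectable_effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p_control + p_variant) / 2            # average rate under the null

    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_variant * (1 - p_variant))
    ) ** 2
    return ceil(numerator / min_detectable_effect ** 2)


if __name__ == "__main__":
    # Baseline 10% conversion, detect an absolute lift of 2 percentage points.
    print(sample_size_per_arm(0.10, 0.02))
```

Smaller detectable effects or higher power targets increase the required sample size, which is why this calculation is usually done before the experiment starts.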

Input Schema

{
  "experimentData": {
    "control": "object",
    "variants": ["object"]
  },
  "metrics": [{
    "name": "string",
    "type": "conversion|continuous|ratio"
  }],
  "analysisType": "frequentist|bayesian|sequential"
}
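
The schema only states that `control` and each variant are objects, so the concrete fields are not specified. The sketch below shows one possible request together with a small validator for the enumerated values. The fields `name`, `visitors`, and `conversions`, the function `validate_request`, and the rule that `variants` must be non-empty are assumptions made for illustration.

```python
ALLOWED_METRIC_TYPES = {"conversion", "continuous", "ratio"}
ALLOWED_ANALYSIS_TYPES = {"frequentist", "bayesian", "sequential"}

# Hypothetical request: the per-arm fields are illustrative, not part of the schema.
example_request = {
    "experimentData": {
        "control": {"name": "control", "visitors": 10000, "conversions": 1000},
        "variants": [{"name": "variant_a", "visitors": 10000, "conversions": 1200}],
    },
    "metrics": [{"name": "checkout_conversion", "type": "conversion"}],
    "analysisType": "frequentist",
}


def validate_request(request):
    """Return a list of problems found in a request (empty list means valid)."""
    errors = []
    data = request.get("experimentData", {})
    if not isinstance(data.get("control"), dict):
        errors.append("experimentData.control must be an object")

    variants = data.get("variants")
    if (
        not isinstance(variants, list)
        or not variants  # assumption: at least one variant is required
        or not all(isinstance(v, dict) for v in variants)
    ):
        errors.append("experimentData.variants must be a non-empty list of objects")

    for i, metric in enumerate(request.get("metrics", [])):
        if metric.get("type") not in ALLOWED_METRIC_TYPES:
            errors.append(f"metrics[{i}].type must be one of {sorted(ALLOWED_METRIC_TYPES)}")

    if request.get("analysisType") not in ALLOWED_ANALYSIS_TYPES:
        errors.append(f"analysisType must be one of {sorted(ALLOWED_ANALYSIS_TYPES)}")
    return errors


if __name__ == "__main__":
    print(validate_request(example_request))
```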

Output Schema

{
  "results": [{
    "metric": "string",
    "controlValue": "number",
    "variantValues": ["number"],
    "pValue": "number",
    "confidenceInterval": "object",
    "significant": "boolean"
  }],
  "srmCheck": "object",
  "recommendation": "string"
}
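
To show how one entry of `results` might be produced for a conversion metric, here is a rough sketch using a two-sided two-proportion z-test. This is one common frequentist approach, not necessarily the method the skill uses. The function `analyze_conversion`, the input fields `visitors` and `conversions`, and the `lower`/`upper`/`level` keys inside `confidenceInterval` are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def analyze_conversion(metric, control, variant, alpha=0.05):
    """Compare one variant against control with a two-proportion z-test.

    Hypothetical helper: expects dicts with 'visitors' and 'conversions'
    and assumes at least one conversion and one non-conversion overall.
    """
    n_c, n_v = control["visitors"], variant["visitors"]
    p_c = control["conversions"] / n_c
    p_v = variant["conversions"] / n_v

    # Pooled standard error for the hypothesis test (null: equal rates).
    pooled = (control["conversions"] + variant["conversions"]) / (n_c + n_v)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_v))
    z = (p_v - p_c) / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the interval on the difference in rates.
    se_diff = sqrt(p_c * (1 - p_c) / n_c + p_v * (1 - p_v) / n_v)
    margin = NormalDist().inv_cdf(1 - alpha / 2) * se_diff

    return {
        "metric": metric,
        "controlValue": p_c,
        "variantValues": [p_v],
        "pValue": p_value,
        "confidenceInterval": {  # assumed shape: interval on (variant - control)
            "lower": (p_v - p_c) - margin,
            "upper": (p_v - p_c) + margin,
            "level": 1 - alpha,
        },
        "significant": p_value < alpha,
    }


if __name__ == "__main__":
    control = {"visitors": 10000, "conversions": 1000}
    variant = {"visitors": 10000, "conversions": 1200}
    print(analyze_conversion("checkout_conversion", control, variant))
```

Continuous and ratio metrics would need different tests (for example a t-test or a delta-method variance), so this sketch covers only the `conversion` metric type.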

Target Processes

  • A/B Testing Pipeline
  • Feature Store Setup

Usage Guidelines

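  1. Provide complete experiment data for the control and for every variant
  2. Define each metric with the type that matches it (conversion, continuous, or ratio)
  3. Select the analysis methodology (frequentist, bayesian, or sequential) based on the experiment's requirements
  4. Review the SRM check before interpreting results; a mismatch points to an assignment or logging problem that can invalidate the metric comparisons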
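
The SRM review in the last step can be sketched as a chi-square goodness-of-fit test on the observed arm sizes. The version below is a minimal illustration for a control and a single variant (one degree of freedom), which lets it use only the standard library. The function name `srm_check`, the returned keys, and the 0.001 threshold (a commonly used convention) are assumptions rather than the skill's documented behavior.

```python
from math import erfc, sqrt


def srm_check(control_n, variant_n, expected_control_share=0.5, threshold=0.001):
    """Chi-square goodness-of-fit test for sample ratio mismatch (two arms).

    Hypothetical helper: compares observed arm sizes with the planned split.
    """
    total = control_n + variant_n
    expected_control = total * expected_control_share
    expected_variant = total - expected_control

    chi2 = (
        (control_n - expected_control) ** 2 / expected_control
        + (variant_n - expected_variant) ** 2 / expected_variant
    )
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))

    return {"chiSquare": chi2, "pValue": p_value, "mismatch": p_value < threshold}


if __name__ == "__main__":
    print(srm_check(10000, 9000))  # planned 50/50 split, large imbalance
    print(srm_check(5000, 5100))   # small imbalance, plausible by chance
```

With more than two arms the test has more degrees of freedom, and a general chi-square distribution function (for example from a statistics library) would be needed instead of the `erfc` shortcut.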

Best Practices

  • Always check for sample ratio mismatch before analyzing metrics
  • Use a statistical test that matches each metric's type (conversion, continuous, or ratio)
  • Consider practical significance (is the effect large enough to matter?) alongside statistical significance
  • Apply a multiple-comparison correction when testing several metrics or variants
  • Document assumptions and limitations
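
As one way to handle the multiple-comparison point above, the following sketch applies the Holm-Bonferroni step-down procedure to a list of p-values. The choice of Holm-Bonferroni and the function name `holm_bonferroni` are assumptions for illustration; the skill does not state which correction it applies.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Holm-Bonferroni step-down procedure: compare the smallest p-value with
    alpha/m, the next with alpha/(m-1), and so on, stopping at the first
    comparison that fails.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    rejected = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            break  # every larger p-value is also not rejected
    return rejected


if __name__ == "__main__":
    # Three metrics tested in the same experiment.
    print(holm_bonferroni([0.01, 0.04, 0.03]))
```

In this example only the first metric survives the correction, even though all three p-values are below 0.05 on their own.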