Vibe-Skills evaluating-machine-learning-models

install
source · Clone the upstream repo
git clone https://github.com/foryourhealth111-pixel/Vibe-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/foryourhealth111-pixel/Vibe-Skills "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/bundled/skills/evaluating-machine-learning-models" ~/.claude/skills/foryourhealth111-pixel-vibe-skills-evaluating-machine-learning-models \
  && rm -rf "$T"
manifest: bundled/skills/evaluating-machine-learning-models/SKILL.md
source content

Model Evaluation Suite

Use this skill when a trained model already exists and the question is whether it performs well enough.

Overview

This skill focuses on choosing and interpreting the right evaluation metrics for the problem, then comparing candidate models or thresholds.
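As a rough illustration of what such a metric suite looks like, here is a minimal pure-Python sketch (the names `confusion_counts` and `metric_suite` are illustrative, not part of the skill) computing precision, recall, and F1 for binary labels:

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def metric_suite(y_true, y_pred):
    """Return precision, recall, and F1, guarding against zero divisions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Example: 3 true positives, 1 false positive, 1 false negative
print(metric_suite([1, 1, 1, 0, 0, 1, 0, 0], [1, 0, 1, 0, 1, 1, 0, 0]))
# → {'precision': 0.75, 'recall': 0.75, 'f1': 0.75}
```

In practice a library implementation (e.g. scikit-learn's `precision_recall_fscore_support`) would replace the hand-rolled counts; the sketch only shows what the skill's metric recommendations are grounded in.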

When to Use This Skill

  • Comparing candidate models with consistent metrics
  • Reviewing precision/recall/F1/AUC, regression error, calibration, or ranking quality
  • Stress-testing validation strategy before deployment or publication
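The validation-strategy point can be made concrete with a small, assumed sketch: a hand-rolled k-fold index generator (`kfold_indices` is hypothetical; a real project would normally use a library splitter) showing what disjoint, fully-covering folds look like:

```python
def kfold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Earlier folds absorb the remainder when n is not divisible by k,
    so every sample lands in exactly one test fold.
    """
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

# 10 samples, 3 folds: test-fold sizes are 4, 3, 3
for train, test in kfold_indices(10, 3):
    print(len(train), test)
```

Stress-testing a validation strategy means checking exactly these invariants: no index appears in both halves of a split, and the test folds jointly cover the data once.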

Not For / Boundaries

  • Building the training pipeline itself: use
    training-machine-learning-models
  • Engineering features: use
    engineering-features-for-machine-learning
  • Checking train/test contamination: use
    ml-data-leakage-guard

Typical Outputs

  • Metric suite recommendations
  • Model comparison tables
  • Notes on threshold tradeoffs, calibration, and validation weaknesses
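A threshold-tradeoff note is typically backed by a sweep like the following minimal sketch (the function name and the toy scores are hypothetical), which reports precision and recall at each candidate decision threshold:

```python
def threshold_sweep(y_true, scores, thresholds):
    """Return (threshold, precision, recall) rows for each threshold.

    A sample is predicted positive when its score is >= the threshold,
    so raising the threshold trades recall for precision.
    """
    rows = []
    for t in thresholds:
        preds = [1 if s >= t else 0 for s in scores]
        tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
        fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
        fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        rows.append((t, precision, recall))
    return rows

# Toy example: two thresholds over four scored samples
for row in threshold_sweep([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8], [0.3, 0.5]):
    print(row)
```

At the lower threshold the model catches every positive but admits a false positive; at the higher one precision is perfect but half the positives are missed. That pairing is the tradeoff the skill's notes would surface.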

Related Skills

  • confusion-matrix-generator
    for class-level error breakdowns
  • scientific-reporting
    when the evaluation must become a deliverable