Vibe-Skills evaluating-machine-learning-models
Install
Source: clone the upstream repo
git clone https://github.com/foryourhealth111-pixel/Vibe-Skills
Claude Code: install into ~/.claude/skills/
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/foryourhealth111-pixel/Vibe-Skills "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/bundled/skills/evaluating-machine-learning-models" \
       ~/.claude/skills/foryourhealth111-pixel-vibe-skills-evaluating-machine-learning-models \
  && rm -rf "$T"
manifest: bundled/skills/evaluating-machine-learning-models/SKILL.md
Model Evaluation Suite
Use this skill when the model exists and the question is whether it is good enough.
Overview
This skill focuses on choosing and interpreting the right evaluation metrics for the problem, then comparing candidate models or thresholds.
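To make this concrete, here is a minimal metric-suite sketch for a binary classifier, assuming scikit-learn; `y_true` and `y_prob` are invented placeholders for real held-out labels and model scores.

```python
# Minimal metric suite for a binary classifier (illustrative data only).
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])                   # ground-truth labels
y_prob = np.array([0.1, 0.4, 0.8, 0.7, 0.3, 0.2, 0.9, 0.6])  # model scores
y_pred = (y_prob >= 0.5).astype(int)                           # thresholded predictions

print(f"precision: {precision_score(y_true, y_pred):.3f}")
print(f"recall:    {recall_score(y_true, y_pred):.3f}")
print(f"F1:        {f1_score(y_true, y_pred):.3f}")
print(f"ROC AUC:   {roc_auc_score(y_true, y_prob):.3f}")  # uses scores, not thresholded labels
```

Computing ROC AUC from raw scores while precision/recall/F1 use thresholded labels keeps ranking quality separate from the choice of decision threshold.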
When to Use This Skill
- Comparing candidate models with consistent metrics
- Reviewing precision/recall/F1/AUC, regression error, calibration, or ranking quality (a calibration check is sketched after this list)
- Stress-testing validation strategy before deployment or publication
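As referenced above, a calibration review can start with a quick reliability check. A hedged sketch, assuming scikit-learn's `calibration_curve` and Brier score; the synthetic arrays stand in for real held-out predictions.

```python
# Quick calibration check on synthetic held-out predictions.
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss

rng = np.random.default_rng(0)
y_prob = rng.uniform(0, 1, size=1000)  # predicted probabilities (synthetic)
y_true = rng.binomial(1, y_prob)       # labels drawn so the model is calibrated

# Bin the predictions and compare the mean predicted probability per bin
# with the observed positive rate; a calibrated model tracks the diagonal.
frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=10)
for pred, obs in zip(mean_pred, frac_pos):
    print(f"mean predicted {pred:.2f} -> observed {obs:.2f}")

print(f"Brier score: {brier_score_loss(y_true, y_prob):.4f}")  # lower is better
```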
Not For / Boundaries
- Building the training pipeline itself: use training-machine-learning-models
- Engineering features: use engineering-features-for-machine-learning
- Checking train/test contamination: use ml-data-leakage-guard
Typical Outputs
- Metric suite recommendations
- Model comparison tables
- Notes on threshold tradeoffs, calibration, and validation weaknesses (a threshold sweep is sketched below)
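For example, a threshold-tradeoff note can be backed by a small sweep like the sketch below, assuming scikit-learn; the labels and scores here are synthetic stand-ins for real held-out data.

```python
# Precision/recall tradeoff across candidate decision thresholds.
import numpy as np
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(1)
y_prob = rng.uniform(0, 1, size=500)  # hypothetical held-out scores
y_true = rng.binomial(1, y_prob)      # labels consistent with those scores

print("threshold  precision  recall")
for t in (0.3, 0.5, 0.7):
    y_pred = (y_prob >= t).astype(int)  # higher t: fewer positives, higher precision
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    print(f"{t:9.2f}  {p:9.3f}  {r:6.3f}")
```

Raising the threshold trades recall for precision; which point on that curve counts as good enough is a product decision, not a modeling one.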
Related Skills
- confusion-matrix-generator for class-level error breakdowns
- scientific-reporting when the evaluation must become a deliverable