Babysitter solution-comparator
Compare multiple solutions for correctness and performance
install
source · Clone the upstream repo
git clone https://github.com/a5c-ai/babysitter
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/a5c-ai/babysitter "$T" && mkdir -p ~/.claude/skills && cp -r "$T/library/specializations/algorithms-optimization/skills/solution-comparator" ~/.claude/skills/a5c-ai-babysitter-solution-comparator && rm -rf "$T"
manifest:
library/specializations/algorithms-optimization/skills/solution-comparator/SKILL.md
Solution Comparator Skill
Purpose
Compare multiple algorithm solutions against the same test cases to verify correctness and benchmark performance.
Capabilities
- Run solutions against same test cases
- Performance benchmarking and comparison
- Output diff analysis
- Find minimal failing test case
- Memory usage comparison
- Time complexity validation
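As an illustration of the output diff capability, the sketch below compares the text output of two solutions using Python's standard `difflib` module. The function name `diff_outputs` and the labels `oracle` and `candidate` are hypothetical; the skill's actual diffing approach is not described in this document.

```python
import difflib


def diff_outputs(expected: str, actual: str,
                 expected_name: str = "oracle",
                 actual_name: str = "candidate") -> list[str]:
    """Return unified-diff lines between two outputs (empty list if identical).

    Hypothetical helper: the names and labels are assumptions for illustration.
    """
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile=expected_name,
            tofile=actual_name,
            lineterm="",  # inputs were split into lines, so no trailing newlines
        )
    )


if __name__ == "__main__":
    for line in diff_outputs("1\n2\n", "1\n3\n"):
        print(line)
```

An empty result means the two outputs match line for line; any non-empty result can be recorded as a discrepancy for that test case.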
Target Processes
- correctness-proof-testing
- complexity-optimization
- upsolving
- algorithm-implementation
Comparison Modes
- Correctness: Compare outputs against a known-correct solution
- Performance: Benchmark execution time across solutions
- Stress Testing: Run with random large inputs to find discrepancies
- Minimal Counter-example: Binary search for the smallest failing case
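The stress and minimal modes can be sketched roughly as follows. This is a minimal sketch, not the skill's implementation: it assumes solutions are plain Python callables rather than source strings, and the names `stress_test`, `smallest_failing_size`, `make_input`, and `fails_at` are hypothetical. The binary search additionally assumes failure is monotone in input size (if size n fails, every larger size fails too), which does not hold for every bug.

```python
import random
from typing import Callable, Optional

Case = list[int]


def stress_test(oracle: Callable[[Case], int],
                candidate: Callable[[Case], int],
                make_input: Callable[[random.Random], Case],
                rounds: int = 200,
                seed: int = 0) -> Optional[Case]:
    """Return the first random input where candidate disagrees with oracle."""
    rng = random.Random(seed)  # fixed seed keeps a failing run reproducible
    for _ in range(rounds):
        case = make_input(rng)
        if oracle(case) != candidate(case):
            return case
    return None


def smallest_failing_size(fails_at: Callable[[int], bool], lo: int, hi: int) -> int:
    """Binary search for the smallest n in [lo, hi] with fails_at(n) true.

    Assumes fails_at(hi) is true and that failure is monotone in n.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if fails_at(mid):
            hi = mid       # mid fails, so the answer is mid or smaller
        else:
            lo = mid + 1   # mid passes, so the answer is larger
    return lo


def oracle(xs: Case) -> int:
    return max(xs)


def buggy(xs: Case) -> int:
    return max(xs[:3])  # deliberate bug: ignores everything after index 2


if __name__ == "__main__":
    gen = lambda rng: [rng.randint(0, 100) for _ in range(rng.randint(1, 10))]
    print(stress_test(oracle, buggy, gen))
    fails = lambda n: oracle(list(range(n))) != buggy(list(range(n)))
    print(smallest_failing_size(fails, 1, 64))
```

When failure is not monotone in size, a linear scan from the smallest size, or shrinking a concrete failing input element by element, is the safer choice.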
Input Schema
{
  "type": "object",
  "properties": {
    "solutions": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "code": { "type": "string" },
          "language": { "type": "string" }
        }
      }
    },
    "testCases": { "type": "array" },
    "mode": {
      "type": "string",
      "enum": ["correctness", "performance", "stress", "minimal"]
    },
    "oracleSolution": { "type": "string" },
    "timeout": { "type": "integer", "default": 5000 }
  },
  "required": ["solutions", "mode"]
}
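To make the input schema concrete, here is a hedged sketch of a request and a hand-written check of the constraints the schema states (required fields, the `mode` enum, integer `timeout`). The helper `check_request` is hypothetical and is not part of the skill. The schema does not specify the shape of `testCases` items, so the strings used below are an assumption; `oracleSolution` is omitted because the schema does not say whether it holds a solution name or source code.

```python
ALLOWED_MODES = {"correctness", "performance", "stress", "minimal"}
DEFAULT_TIMEOUT = 5000  # default taken from the schema; its unit is not stated there


def check_request(request: dict) -> list[str]:
    """Return a list of problems found in a request (empty list if none).

    Hypothetical helper covering only what the schema states explicitly.
    """
    errors = []
    for key in ("solutions", "mode"):
        if key not in request:
            errors.append(f"missing required field: {key}")
    if "solutions" in request and not isinstance(request["solutions"], list):
        errors.append("solutions must be an array")
    if "mode" in request and request["mode"] not in ALLOWED_MODES:
        errors.append(f"unknown mode: {request['mode']!r}")
    timeout = request.get("timeout", DEFAULT_TIMEOUT)
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(timeout, bool) or not isinstance(timeout, int):
        errors.append("timeout must be an integer")
    return errors


example_request = {
    "solutions": [
        {"name": "brute", "code": "print(sum(map(int, input().split())))", "language": "python"},
        {"name": "fast", "code": "print(sum(int(t) for t in input().split()))", "language": "python"},
    ],
    "testCases": ["1 2 3", "10 20"],  # item shape is an assumption
    "mode": "correctness",
    "timeout": 2000,
}

if __name__ == "__main__":
    print(check_request(example_request))
    print(check_request({"mode": "fastest"}))
```

A full JSON Schema validator would cover the same constraints; the hand-written check is used here only to keep the example dependency-free.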
Output Schema
{
  "type": "object",
  "properties": {
    "success": { "type": "boolean" },
    "results": { "type": "array" },
    "discrepancies": { "type": "array" },
    "performance": { "type": "object" },
    "minimalFailingCase": { "type": "object" }
  },
  "required": ["success"]
}