AutoResearchClaw experimental-design

Name: experimental-design
Author: aiming-lab

Best practices for designing reproducible ML experiments. Use when planning ablations, baselines, or controlled experiments.

install

source · Clone the upstream repo

git clone https://github.com/aiming-lab/AutoResearchClaw

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/aiming-lab/AutoResearchClaw "$T" && mkdir -p ~/.claude/skills && cp -r "$T/researchclaw/skills/builtin/experiment/experimental-design" ~/.claude/skills/aiming-lab-autoresearchclaw-experimental-design && rm -rf "$T"

manifest: researchclaw/skills/builtin/experiment/experimental-design/SKILL.md

source content

Experimental Design Best Practices

ALWAYS include meaningful baselines (not just random):
- At least one classical method baseline
- At least one recent SOTA method baseline
- A simple-but-strong baseline (e.g., linear probe, k-NN)
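As an illustration of the simple-but-strong baseline mentioned above, here is a minimal k-NN classifier using only the Python standard library. The function names (`knn_predict`, `accuracy`) and the toy data are assumptions made for this sketch and are not part of the skill; a real experiment would more likely use an established library implementation on the actual features.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=1):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def accuracy(train_x, train_y, test_x, test_y, k=1):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y))
    return correct / len(test_y)

# Toy data, purely illustrative (hypothetical values).
train_x = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
train_y = ["a", "a", "b", "b"]
test_x = [(0.2, 0.1), (1.1, 0.9)]
test_y = ["a", "b"]

baseline_acc = accuracy(train_x, train_y, test_x, test_y, k=1)
print(f"1-NN baseline accuracy: {baseline_acc:.2f}")
```

Reporting a number like this next to the proposed method shows how much of the gain comes from the method itself rather than from the features or the data.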
Use MULTIPLE random seeds (minimum 3, ideally 5)
Report mean ± standard deviation across seeds
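The seed guidance above can be sketched as follows. `run_experiment` is a hypothetical stand-in for a full training-and-evaluation run, and the value it returns is a placeholder, not a real result. The sketch assumes the sample standard deviation (n - 1 denominator) is the intended "std".

```python
import random
import statistics

def run_experiment(seed: int) -> float:
    """Hypothetical stand-in for one training run; returns a test metric."""
    rng = random.Random(seed)  # local RNG, so each run depends only on its seed
    return 0.80 + rng.uniform(-0.02, 0.02)  # placeholder for a real accuracy

def summarize(seeds):
    """Run once per seed and return (mean, sample std, per-seed scores)."""
    scores = [run_experiment(s) for s in seeds]
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)  # sample standard deviation (n - 1)
    return mean, std, scores

SEEDS = [0, 1, 2, 3, 4]  # five seeds, the "ideal" count named above
mean, std, scores = summarize(SEEDS)
print(f"accuracy: {mean:.4f} +/- {std:.4f} (n={len(SEEDS)} seeds)")
```

Keeping the per-seed scores alongside the summary makes it possible to spot a single outlier run later.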
Design ablations that isolate EACH key component:
- Remove one component at a time
- Each ablation must differ meaningfully from the un-ablated configuration; a variant that behaves identically isolates nothing
Control variables: change only ONE thing per comparison
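One way to enforce "remove one component at a time" and "change only ONE thing per comparison" is to generate the ablation configs from the full config and check the difference mechanically. The config keys (`augmentation`, `attention`, `aux_loss`) and helper names below are hypothetical, chosen only for this sketch.

```python
# Hypothetical full configuration: every component enabled.
FULL_CONFIG = {
    "augmentation": True,
    "attention": True,
    "aux_loss": True,
    "learning_rate": 1e-3,
}
ABLATABLE = ["augmentation", "attention", "aux_loss"]

def make_ablations(full, components):
    """Return {name: config}; each ablation disables exactly one component."""
    variants = {"full": dict(full)}
    for name in components:
        cfg = dict(full)   # copy so the full config is never mutated
        cfg[name] = False  # switch off a single component
        variants[f"no_{name}"] = cfg
    return variants

def changed_keys(a, b):
    """Keys whose values differ between two configs with the same keys."""
    return sorted(k for k in a if a[k] != b[k])

variants = make_ablations(FULL_CONFIG, ABLATABLE)
for name, cfg in variants.items():
    diff = changed_keys(FULL_CONFIG, cfg)
    # Guard against ablations that change nothing or more than one thing.
    assert name == "full" or len(diff) == 1, (name, diff)
    print(name, diff)
```

A config-level check only shows that the settings differ. It is still worth confirming that the disabled flag actually changes the computation, for example by comparing outputs on a fixed batch.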
Use standard splits (train/val/test): tune on validation, report on test, and never test on training data
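When a benchmark ships official splits, those are the ones to use. For a dataset without them, a fixed-seed split plus an explicit leakage check is one possible approach. The function name `split_indices` and the 80/10/10 proportions are assumptions made for this sketch.

```python
import random

def split_indices(n, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle range(n) once with a fixed seed and cut it into three disjoint parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed: the same split on every run
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test

train, val, test = split_indices(1000)

# Leakage check: no example may appear in more than one split.
assert not set(train) & set(test)
assert not set(train) & set(val)
assert not set(val) & set(test)
print(len(train), len(val), len(test))
```

The same disjointness check applies to official splits, where duplicated examples across train and test are a common source of leakage.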
Report wall-clock time and memory usage alongside accuracy
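A minimal sketch of measuring wall-clock time and peak memory around a run, using the standard-library `time.perf_counter` and `tracemalloc`. The helper `measure` and the `toy_workload` function are hypothetical. Note that `tracemalloc` only sees allocations made through Python's memory allocator, so GPU memory and most native-library buffers would need framework-specific tooling instead.

```python
import time
import tracemalloc

def measure(fn, *args, **kwargs):
    """Run fn once; return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
    finally:
        # Record cost even if fn raises, then stop tracing.
        seconds = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, seconds, peak

def toy_workload(n):
    """Hypothetical stand-in for a training run: builds a list, sums squares."""
    return sum(x * x for x in list(range(n)))

result, seconds, peak = measure(toy_workload, 100_000)
print(f"result={result} time={seconds:.3f}s peak_mem={peak / 1e6:.1f} MB")
```

Logging these numbers in the same table as accuracy makes cost trade-offs between methods visible at a glance.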