Claude-skills setup
Set up a new autoresearch experiment interactively. Collects domain, target file, eval command, metric, direction, and evaluator.
install
source · Clone the upstream repo
git clone https://github.com/alirezarezvani/claude-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/alirezarezvani/claude-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.gemini/skills/setup" ~/.claude/skills/alirezarezvani-claude-skills-setup && rm -rf "$T"
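To confirm the install landed where the command above copies it, you can list the destination directory:

```bash
# Verify the skill was copied into the expected location
ls ~/.claude/skills/alirezarezvani-claude-skills-setup
```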
manifest: .gemini/skills/setup/SKILL.md
/ar:setup — Create New Experiment
Set up a new autoresearch experiment with all required configuration.
Usage
```
/ar:setup                      # Interactive mode
/ar:setup engineering api-speed src/api.py "pytest bench.py" p50_ms lower
/ar:setup --list               # Show existing experiments
/ar:setup --list-evaluators    # Show available evaluators
```
What It Does
If arguments provided
Pass them directly to the setup script:
```
python {skill_path}/scripts/setup_experiment.py \
  --domain {domain} --name {name} \
  --target {target} --eval "{eval_cmd}" \
  --metric {metric} --direction {direction} \
  [--evaluator {evaluator}] [--scope {scope}]
```
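For illustration, the api-speed example from the Usage section would expand to roughly the following call (the values are simply those from that example, with the optional flags omitted):

```bash
# Hypothetical expansion of the engineering/api-speed example above
python {skill_path}/scripts/setup_experiment.py \
  --domain engineering --name api-speed \
  --target src/api.py --eval "pytest bench.py" \
  --metric p50_ms --direction lower
```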
If no arguments (interactive mode)
Collect each parameter one at a time:
- Domain — Ask: "What domain? (engineering, marketing, content, prompts, custom)"
- Name — Ask: "Experiment name? (e.g., api-speed, blog-titles)"
- Target file — Ask: "Which file to optimize?" Verify it exists.
- Eval command — Ask: "How to measure it? (e.g., pytest bench.py, python evaluate.py)"
- Metric — Ask: "What metric does the eval output? (e.g., p50_ms, ctr_score)"
- Direction — Ask: "Is lower or higher better?"
- Evaluator (optional) — Show built-in evaluators. Ask: "Use a built-in evaluator, or your own?"
- Scope — Ask: "Store in project (.autoresearch/) or user (~/.autoresearch/)?"
Then run setup_experiment.py with the collected parameters.
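As a sketch, if the interactive answers described a content experiment named blog-titles measured by ctr_score, the assembled command might look like the following (the target file and the user scope value are hypothetical; the evaluator flag is omitted because evaluator names depend on your install):

```bash
# Hypothetical command assembled from interactive answers:
# domain=content, name=blog-titles, target=titles.md (assumed),
# eval="python evaluate.py", metric=ctr_score, direction=higher, scope=user (assumed)
python {skill_path}/scripts/setup_experiment.py \
  --domain content --name blog-titles \
  --target titles.md --eval "python evaluate.py" \
  --metric ctr_score --direction higher \
  --scope user
```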
Listing
```
# Show existing experiments
python {skill_path}/scripts/setup_experiment.py --list

# Show available evaluators
python {skill_path}/scripts/setup_experiment.py --list-evaluators
```
Built-in Evaluators
| Name | Metric | Use Case |
|---|---|---|
| | (lower) | Function/API execution time |
| | (lower) | File, bundle, Docker image size |
| | (higher) | Test suite pass percentage |
| | (lower) | Build/compile/Docker build time |
| | (lower) | Peak memory during execution |
| | (higher) | Headlines, titles, descriptions |
| | (higher) | System prompts, agent instructions |
| | (higher) | Social posts, ad copy, emails |
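If none of the built-ins fit, the eval command can be any script that prints the chosen metric. A minimal sketch of a custom latency evaluator follows; both the target invocation and the output format are assumptions to adapt to your project and to whatever output your setup actually expects:

```bash
#!/usr/bin/env bash
# Hypothetical custom evaluator for a latency metric.
# Assumptions: the target is exercised via "python src/api.py --selftest", and
# the harness reads "p50_ms: <value>" from stdout; adjust both as needed.
samples=()
for _ in 1 2 3 4 5; do
  start=$(date +%s%N)                          # GNU date, nanosecond resolution
  python src/api.py --selftest >/dev/null
  end=$(date +%s%N)
  samples+=( $(( (end - start) / 1000000 )) )  # nanoseconds -> milliseconds
done
# Sort the five samples and report the middle one as the p50
p50=$(printf '%s\n' "${samples[@]}" | sort -n | sed -n '3p')
echo "p50_ms: ${p50}"
```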
After Setup
Report to the user:
- Experiment path and branch name
- Whether the eval command worked and the baseline metric
- Suggest: "Run
to start iterating, or/ar:run {domain}/{name}
for autonomous mode."/ar:loop {domain}/{name}