Agentsys perf-benchmarker

Use when running performance benchmarks, establishing baselines, or validating regressions with sequential runs. Enforces 60s minimum runs (30s only for binary search) and no parallel benchmarks.

install
source · Clone the upstream repo
git clone https://github.com/agent-sh/agentsys
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/agent-sh/agentsys "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.kiro/skills/perf-benchmarker" ~/.claude/skills/agent-sh-agentsys-perf-benchmarker && rm -rf "$T"
manifest: .kiro/skills/perf-benchmarker/SKILL.md
source content

perf-benchmarker

Run sequential benchmarks with strict duration rules.

Follow docs/perf-requirements.md as the canonical contract.

Parse Arguments

const args = '$ARGUMENTS'.split(' ').filter(Boolean); // tokenize the injected arguments
const command = args.find(a => !a.match(/^\d+$/)) || ''; // first non-numeric token
const duration = parseInt(args.find(a => a.match(/^\d+$/)) || '60', 10); // first numeric token; default 60s
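As a sketch of how the parsing above behaves, assuming `$ARGUMENTS` expands to a simple `"<command> <duration>"` string (the `wrk 120` value here is a hypothetical example, not part of the skill):

```javascript
// Emulate the argument parsing with a sample string standing in for $ARGUMENTS.
const ARGUMENTS = 'wrk 120';
const args = ARGUMENTS.split(' ').filter(Boolean);
const command = args.find(a => !a.match(/^\d+$/)) || ''; // first non-numeric token
const duration = parseInt(args.find(a => a.match(/^\d+$/)) || '60', 10); // first numeric token

console.log(command, duration); // → wrk 120
```

With no numeric token present, `duration` falls back to the 60s minimum.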

Required Rules

  • Benchmarks MUST run sequentially (never parallel).
  • Minimum duration: 60s per run (30s only for binary search).
  • Warmup: 10s minimum before measurement.
  • Re-run any anomalous results before reporting.
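The rules above can be sketched as a small guard plus a sequential runner. This is a minimal illustration, not part of the skill; `validateRun`, `runAll`, and the benchmark object shape are hypothetical names:

```javascript
// Reject runs that violate the duration and warmup floors.
function validateRun({ durationSec, warmupSec, binarySearch = false }) {
  const minDuration = binarySearch ? 30 : 60; // 30s floor applies only to binary search
  if (durationSec < minDuration) {
    throw new Error(`duration ${durationSec}s is below the ${minDuration}s minimum`);
  }
  if (warmupSec < 10) {
    throw new Error(`warmup ${warmupSec}s is below the 10s minimum`);
  }
}

// Run benchmarks strictly one after another: await each run before
// starting the next, and never use Promise.all (no parallel runs).
async function runAll(benchmarks) {
  const results = [];
  for (const b of benchmarks) {
    validateRun(b);
    results.push(await b.run());
  }
  return results;
}
```

The `for…await` loop is the point: it serializes the runs so no two benchmarks compete for the same CPU, memory bandwidth, or I/O.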

Output Format

command: <benchmark command>
duration: <seconds>
warmup: <seconds>
results: <metrics summary>
notes: <anomalies or reruns>

Output Contract

Benchmarks MUST emit a JSON metrics block between markers:

PERF_METRICS_START
{"scenarios":{"low":{"latency_ms":120},"high":{"latency_ms":450}}}
PERF_METRICS_END
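A consumer of this contract might extract the JSON between the markers like so (a sketch; `parseMetrics` is a hypothetical helper, not provided by the skill):

```javascript
// Pull the JSON metrics block out of raw benchmark output using the
// PERF_METRICS_START / PERF_METRICS_END markers.
function parseMetrics(output) {
  const m = output.match(/PERF_METRICS_START\s*([\s\S]*?)\s*PERF_METRICS_END/);
  if (!m) throw new Error('no PERF_METRICS block found in output');
  return JSON.parse(m[1]);
}

const sample = [
  'running benchmark...',
  'PERF_METRICS_START',
  '{"scenarios":{"low":{"latency_ms":120},"high":{"latency_ms":450}}}',
  'PERF_METRICS_END',
].join('\n');

const metrics = parseMetrics(sample);
console.log(metrics.scenarios.low.latency_ms); // → 120
```

The non-greedy `[\s\S]*?` match keeps the capture to the first marker pair, so stray log lines before or after the block are ignored.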

Constraints

  • No runs shorter than 60s except during the binary-search phase (30s floor).
  • Do not change code while benchmarking.