# Install

**Source** · Clone the upstream repo:

```bash
git clone https://github.com/nwiizo/ccswarm
```

**Claude Code** · Install into `~/.claude/skills/`:

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/nwiizo/ccswarm "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/benchmark-runner" ~/.claude/skills/nwiizo-ccswarm-benchmark-runner && rm -rf "$T"
```

Manifest: `.claude/skills/benchmark-runner/SKILL.md`
# Benchmark Runner Workflow

Multi-step workflow for running performance benchmarks on ccswarm.

## Overview

This skill guides you through running, analyzing, and comparing performance benchmarks for the ccswarm system.

## Benchmark Types

### 1. Microbenchmarks

Fine-grained performance measurements for specific functions.

### 2. Integration Benchmarks

End-to-end performance for complete workflows.

### 3. Load Testing

System behavior under sustained load.
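A load test of the third kind can be sketched as a fixed-window driver that hammers an operation and reports sustained throughput. This is a minimal illustration only: `handle_request` is a placeholder workload, not a real ccswarm API.

```rust
use std::time::{Duration, Instant};

// Placeholder workload standing in for a real ccswarm request handler.
fn handle_request() -> u64 {
    (0..100u64).sum()
}

fn main() {
    // Drive the workload for a fixed window and count completed ops.
    let window = Duration::from_millis(200);
    let start = Instant::now();
    let mut ops: u64 = 0;
    while start.elapsed() < window {
        // black_box keeps the compiler from optimizing the call away.
        std::hint::black_box(handle_request());
        ops += 1;
    }
    let secs = start.elapsed().as_secs_f64();
    println!("{:.0} ops/sec sustained", ops as f64 / secs);
}
```

In a real load test you would also track latency percentiles over the window, not just the aggregate rate.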
## Running Benchmarks

### 1. Setup

```bash
# Ensure criterion is available
cargo install cargo-criterion

# Build in release mode first
cargo build --release --workspace
```
### 2. Run All Benchmarks

```bash
# Run with criterion
cargo criterion --workspace

# Or standard bench
cargo bench --workspace
```
### 3. Run Specific Benchmarks

```bash
# By crate
cargo bench -p ccswarm

# By benchmark name
cargo bench --bench session_benchmark

# By filter
cargo bench -- "orchestrator"
```
## Benchmark Configuration

### Criterion Config

```rust
// benches/bench.rs
use criterion::{criterion_group, criterion_main, Criterion};

fn benchmark_task_delegation(c: &mut Criterion) {
    c.bench_function("delegate_task", |b| {
        b.iter(|| {
            // Benchmark code
        })
    });
}

criterion_group!(benches, benchmark_task_delegation);
criterion_main!(benches);
```
### Cargo.toml

```toml
[[bench]]
name = "orchestrator_bench"
harness = false
```
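With `harness = false`, the bench also needs criterion declared as a dev-dependency to compile. A typical entry looks like the following (the version shown is an assumption; check crates.io for the current release):

```toml
[dev-dependencies]
criterion = "0.5"
```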
## Analysis

### 1. View Results

```bash
# Results are in target/criterion/
open target/criterion/report/index.html   # macOS; use xdg-open on Linux
```
### 2. Compare with Baseline

```bash
# Save baseline
cargo bench -- --save-baseline before

# Make changes, then compare
cargo bench -- --baseline before
```
### 3. Profiling

```bash
# With flamegraph (cargo install flamegraph)
cargo flamegraph --bench orchestrator_bench

# With perf (Linux)
perf record -- cargo bench --bench orchestrator_bench
perf report
```
## Key Metrics

### Session Management

- Session creation time
- Context compression ratio
- Message throughput

### Orchestration

- Task delegation latency
- Agent assignment time
- Consensus round time

### Memory

- Peak memory usage
- Memory per agent
- Context size
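Before committing a metric like session creation time to a Criterion bench, a quick sanity check with `std::time::Instant` is often useful. In this sketch, `create_session` is a hypothetical stand-in for ccswarm's real session constructor:

```rust
use std::time::Instant;

// Hypothetical stand-in for session creation; swap in the real
// ccswarm session API when moving this into benches/.
fn create_session() -> Vec<u8> {
    vec![0u8; 1024]
}

fn main() {
    let iterations: u32 = 1_000;
    let start = Instant::now();
    for _ in 0..iterations {
        // black_box keeps the compiler from eliding the work.
        std::hint::black_box(create_session());
    }
    // Duration divides by u32, giving the per-iteration average.
    let avg = start.elapsed() / iterations;
    println!("avg session creation: {:?}", avg);
}
```

Criterion does the same kind of averaging, plus warm-up and statistical analysis, so this is only a first-pass estimate.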
## Benchmark Checklist

- [ ] Clean build before benchmarking
- [ ] Disable CPU frequency scaling
- [ ] Close other applications
- [ ] Run multiple iterations
- [ ] Compare with baseline
- [ ] Document results
## CI Integration

### GitHub Actions

```yaml
benchmark:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: dtolnay/rust-toolchain@stable
    - run: cargo bench -- --save-baseline ci
    - uses: actions/upload-artifact@v4
      with:
        name: benchmark-results
        path: target/criterion/
```
## Interpreting Results

### Good Performance

- < 1ms for task delegation
- < 100ms for full workflow
- < 10MB memory per agent

### Warning Signs

- High variance (> 20%)
- Memory growth over time
- Degradation vs baseline
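The variance warning sign can be checked mechanically. The sketch below computes the coefficient of variation (standard deviation over mean) for a set of sample latencies; the numbers are illustrative, not real ccswarm measurements:

```rust
// Flag high run-to-run variance (> 20% of the mean), one of the
// warning signs listed above.
fn coefficient_of_variation(samples: &[f64]) -> f64 {
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    // Population variance: mean squared deviation from the mean.
    let variance = samples.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / n;
    variance.sqrt() / mean
}

fn main() {
    // Illustrative delegation latencies in milliseconds.
    let latencies_ms = [0.90, 1.00, 1.00, 1.80, 0.95];
    let cv = coefficient_of_variation(&latencies_ms);
    println!("cv = {:.1}%", cv * 100.0);
    if cv > 0.20 {
        println!("warning: high variance, results may not be reliable");
    }
}
```

High variance usually means environmental noise (frequency scaling, background load) rather than a real regression, so re-run under the checklist conditions before drawing conclusions.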
## Optimization Workflow

1. Run baseline benchmark
2. Profile to identify hotspots
3. Implement optimization
4. Run comparison benchmark
5. Document improvements