VibesOS autoresearch
install
source · Clone the upstream repo
git clone https://github.com/popmechanic/VibesOS
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/popmechanic/VibesOS "$T" && mkdir -p ~/.claude/skills && cp -r "$T/vibes-desktop/build/stable-macos-arm64/VibesOS.app/Contents/Resources/vibes-plugin/skills/autoresearch" ~/.claude/skills/popmechanic-vibesos-autoresearch && rm -rf "$T"
manifest:
vibes-desktop/build/stable-macos-arm64/VibesOS.app/Contents/Resources/vibes-plugin/skills/autoresearch/SKILL.mdsource content
Parallel Autoresearch Engine
Plan mode: This skill is ONE plan step: "Invoke /vibes:autoresearch". Do not decompose.
Prerequisites Check
VIBES_ROOT="${CLAUDE_PLUGIN_ROOT:-$(dirname "$(dirname "${CLAUDE_SKILL_DIR}")")}" echo "Checking autoresearch prerequisites..." test -f "$VIBES_ROOT/scripts/eval-ssr-check.ts" && echo "✓ Tier 1.5 SSR check" || echo "✗ Missing eval-ssr-check.ts" test -f "$VIBES_ROOT/scripts/eval-harness.ts" && echo "✓ Tier 2 harness" || echo "✗ Missing eval-harness.ts" test -f "$VIBES_ROOT/scripts/eval-parallel.ts" && echo "✓ Orchestrator" || echo "✗ Missing eval-parallel.ts" test -f "$VIBES_ROOT/scripts/eval-scoring.ts" && echo "✓ Scoring" || echo "✗ Missing eval-scoring.ts" test -f "$VIBES_ROOT/eval/config.md" && echo "✓ Config" || echo "✗ Missing eval/config.md" test -f "$VIBES_ROOT/eval/napkin.md" && echo "✓ Napkin" || echo "✗ Missing eval/napkin.md" ls "$VIBES_ROOT/eval/specs/"*.md 2>/dev/null | wc -l | xargs -I{} echo "✓ {} eval specs found" cd "$VIBES_ROOT/scripts" && bun -e "import React from 'react'; console.log('✓ React available')" 2>/dev/null || echo "✗ React not installed"
If any prerequisite is missing, stop and inform the user.
Running
Option 1: Full Autonomous Run (Recommended)
Dispatch the autoresearch orchestrator agent (
.claude/agents/autoresearch-orchestrator.md) with context from eval/config.md, eval/napkin.md, and eval/scoreboard.md.
Pass any CLI arguments from the user (e.g.,
--variants=5 --generations=10).
Option 2: Single Eval Pipeline Test
VIBES_ROOT="${CLAUDE_PLUGIN_ROOT:-$(dirname "$(dirname "${CLAUDE_SKILL_DIR}")")}" bun "$VIBES_ROOT/scripts/eval-parallel.ts" --mode=eval-only <app.jsx> <spec.md>
Option 3: Score Existing Generation
VIBES_ROOT="${CLAUDE_PLUGIN_ROOT:-$(dirname "$(dirname "${CLAUDE_SKILL_DIR}")")}" bun "$VIBES_ROOT/scripts/eval-scoring.ts" "$VIBES_ROOT/eval/results/gen-N/"
What It Does
Each generation:
- Mutate: N independent SKILL.md variants (fix-targeted, structural, adversarial, etc.)
- Generate: Each variant × each prompt × 3 runs
- Evaluate: Tier 1 (static) → Tier 1.5 (SSR) → Tier 2 (data model) — all programmatic
- Score: Triple-run averaging with consistency penalty; fitness = mean - 0.5×stddev
- Select: Best variant replaces current SKILL.md; git commit on improvement
- Repeat: Until plateau (3 gens), max generations, or score oscillation
Monitoring
— per-generation resultseval/results/gen-N/summary.json
— cumulative historyeval/results/summaries.json
— human-readable scoreboardeval/scoreboard.md
— failure log (grows monotonically)eval/napkin.md
Final Report
VIBES_ROOT="${CLAUDE_PLUGIN_ROOT:-$(dirname "$(dirname "${CLAUDE_SKILL_DIR}")")}" bun "$VIBES_ROOT/scripts/eval-report.ts" "$VIBES_ROOT/eval/results/summaries.json"