Autosearch autosearch:model-routing

Advisory skill — three-tier (Fast / Standard / Best) model routing catalog for autosearch skills. Tells the runtime AI which tier each leaf skill needs, and how to escalate or de-escalate. Autosearch does not switch models itself; the runtime AI is the decision-maker.

install
source · Clone the upstream repo
git clone https://github.com/0xmariowu/Autosearch
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/0xmariowu/Autosearch "$T" && mkdir -p ~/.claude/skills && cp -r "$T/autosearch/skills/meta/model-routing" ~/.claude/skills/0xmariowu-autosearch-autosearch-model-routing && rm -rf "$T"
manifest: autosearch/skills/meta/model-routing/SKILL.md
source content

Model Tier Routing — Advisory

Routing principle: most steps use the runtime's cheapest model; only the critical 1–2 steps use the best model.

Autosearch stamps every leaf skill with a model_tier suggestion. This skill tells the runtime AI what the three tiers mean, which skills default to which tier, and when to escalate or de-escalate.

Three Tiers

| Tier | Typical runtime pick | When used | Share of skills |
| --- | --- | --- | --- |
| Fast | Claude Haiku / GPT-5-mini / Gemini 2.5 Flash / Qwen local | Retrieval, normalization, schema checks, URL reading, metadata | ~60% |
| Standard | Claude Sonnet / GPT-5.4 / Gemini 2.5 Pro | Semantic ranking, evidence extraction, mid-complexity planning | ~25% |
| Best | Claude Opus / GPT-5 / Gemini 2.5 Ultra | Clarify, decompose, synthesize, evaluate delivery, skill evolution (the 1-2 steps that shape everything) | ~15% |

Tier Assignments

Each autosearch skill carries model_tier: Fast|Standard|Best in its frontmatter. The runtime AI reads that field before choosing which provider/model to call.
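As a rough illustration of reading that field, here is a minimal Python sketch. It assumes the frontmatter is a leading block delimited by `---` lines with simple `key: value` pairs, and it assumes Fast as the fallback when the field is missing or unrecognized. The function name `read_model_tier` is hypothetical and not part of autosearch.

```python
import re

VALID_TIERS = ("Fast", "Standard", "Best")


def read_model_tier(skill_md: str, default: str = "Fast") -> str:
    """Return the model_tier declared in a SKILL.md frontmatter block.

    Assumption: frontmatter is a leading block fenced by '---' lines.
    Assumption: missing or unknown values fall back to `default`.
    """
    match = re.match(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", skill_md, re.DOTALL)
    if match is None:
        return default
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model_tier":
            tier = value.strip().strip("'\"")
            return tier if tier in VALID_TIERS else default
    return default


example = "---\nname: clarify\nmodel_tier: Best\n---\nBody text\n"
print(read_model_tier(example))
```

A real runtime would more likely use a full YAML parser; the line-based scan here only keeps the sketch dependency-free.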

Best (~13 skills — the critical 1-2 steps per session)

  • clarify - disambiguate intent (wrong clarification cascades)
  • systematic-recall - global recall planning (missed angles compound)
  • decompose-task - breaking a multi-part problem
  • synthesize-knowledge - produce frameworks, not link lists
  • evaluate-delivery - quality gate on final output
  • knowledge-map - cross-evidence relation graph
  • check-rubrics / generate-rubrics - rubric-driven evaluation
  • auto-evolve / create-skill - anything that changes future behavior
  • goal-loop - multi-round goal convergence
  • graph-search-plan (when present) - research plan as graph
  • perspective-questioning (when present) - multi-persona question generation
  • reflective-search-loop (when present) - explicit gaps / visited / bad-URLs loop

Standard (~20 skills — semantic judgment, structurable)

  • select-channels - pick 5-10 channels from 41
  • gene-query - combinatorial query generation
  • consult-reference - prior art lookup
  • rerank-evidence - semantic ranking of results
  • llm-evaluate - per-item relevance score
  • anti-cheat - spam / score-gaming detection
  • assemble-context - token-budgeted context assembly
  • extract-knowledge - structured extraction from text
  • fetch-crawl4ai / fetch-playwright / fetch-firecrawl / follow-links
  • experience-compact - rule promotion
  • observe-user - user preference inference
  • research-mode - speed vs. deep choice
  • delegate-subtask (when present) / trace-harvest / citation-index / recent-signal-fusion
  • interact-user / pipeline-flow / outcome-tracker

Fast (~60 skills — bulk of every session)

  • All 41 channel skills (search-*) - retrieval, not reasoning
  • fetch-jina, fetch-webpage - URL → Markdown
  • yt-dlp + three video-to-text transcription skills - mechanical extraction
  • mcporter routing skill
  • discover-environment, provider-health - env probing
  • normalize-results, extract-dates - schema + dedupe
  • autosearch:router - routing decision, no deep reasoning
  • experience-capture - append-only event write, often no LLM
  • context-retention-policy (when present) - keep-last-k rules

Escalation Rules

Start a step at its default tier. Escalate only when:

  • Conflicting evidence from multiple sources needs semantic reconciliation → escalate from Fast to Standard.
  • Final synthesis or skill-evolution output depends on this step → escalate from Standard to Best.
  • User explicitly asks for deeper analysis / higher quality on the topic.

De-escalate when:

  • A Standard step is running on highly structured / deterministic input (e.g. rerank-evidence on 3 items with clear metadata) → drop to Fast.
  • In an exploratory / draft iteration loop, first passes can be Fast and the final pass Best.
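The escalation and de-escalation rules above can be sketched as a small decision function. This is only an illustration: the `StepSignals` fields and the `adjust_tier` name are hypothetical, and the order in which conflicting signals are resolved (escalation before de-escalation) is an assumption. The rules do not say how far a user request for deeper analysis should escalate, so the sketch assumes one tier up.

```python
from dataclasses import dataclass

TIERS = ["Fast", "Standard", "Best"]


@dataclass
class StepSignals:
    # Hypothetical flags a runtime might derive for a step.
    conflicting_evidence: bool = False   # sources disagree, need reconciliation
    feeds_final_output: bool = False     # synthesis / skill evolution depends on it
    user_wants_depth: bool = False       # user asked for deeper analysis
    structured_input: bool = False       # highly structured / deterministic input
    draft_pass: bool = False             # early pass in an exploratory loop


def adjust_tier(default_tier: str, signals: StepSignals) -> str:
    """Apply the advisory rules to a step's default tier."""
    idx = TIERS.index(default_tier)
    # Escalation rules (assumption: they take precedence over de-escalation).
    if signals.conflicting_evidence and default_tier == "Fast":
        return "Standard"
    if signals.feeds_final_output and default_tier == "Standard":
        return "Best"
    if signals.user_wants_depth:
        # Assumption: move up one tier, capped at Best.
        return TIERS[min(idx + 1, len(TIERS) - 1)]
    # De-escalation rules.
    if signals.structured_input and default_tier == "Standard":
        return "Fast"
    if signals.draft_pass:
        return "Fast"
    return default_tier


print(adjust_tier("Fast", StepSignals(conflicting_evidence=True)))
print(adjust_tier("Standard", StepSignals(structured_input=True)))
```

Keeping the signals in one dataclass makes it easy for a runtime to log why a step ran at a tier other than its default.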

Runtime Advisory (non-binding)

Autosearch cannot force the runtime to change models. The model_tier field is a suggestion to help the runtime AI choose a provider/model route that matches the quality bar of the step. The runtime AI may ignore the advisory if it has better information (e.g. the user specified a fixed model).
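One way a runtime might honor the non-binding nature of the advisory is sketched below. The `resolve_model` function, the `routes` mapping, and the placeholder model labels are all hypothetical. The fallback to the Fast route when no usable tier is advised is also an assumption.

```python
from typing import Optional


def resolve_model(
    advised_tier: Optional[str],
    routes: dict[str, str],
    user_fixed_model: Optional[str] = None,
) -> str:
    """Pick a model route, treating model_tier as a suggestion only."""
    # Better information wins: a model pinned by the user overrides the advisory.
    if user_fixed_model:
        return user_fixed_model
    # Otherwise follow the advisory when the runtime has a route for it.
    if advised_tier in routes:
        return routes[advised_tier]
    # Assumption: with no usable advisory, use the cheapest route.
    return routes["Fast"]


# Placeholder labels, not real provider identifiers.
routes = {"Fast": "fast-model", "Standard": "standard-model", "Best": "best-model"}
print(resolve_model("Best", routes))
print(resolve_model("Best", routes, user_fixed_model="pinned-model"))
```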

Boss Rules

  • Cost: run the cheapest model that clears the quality bar; reserve the best model for the 1-2 steps that shape the whole session outcome.
  • Judge: keep LLM-as-judge (pairwise A/B preference, open_deep_research pattern). Do not invent N-dim × 0/3/5 rubrics.
  • AVO: any Best-tier skill that modifies SKILL.md (like auto-evolve) must commit via scripts/committer and be reversible via git revert.
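The cost rule above could be read as a cheapest-first loop: try a step at the lowest tier and move up only if the output fails the quality bar. The sketch below shows that reading under stated assumptions. `run_step` and `clears_bar` are hypothetical callables supplied by the runtime, and nothing here reflects how autosearch itself is implemented.

```python
from typing import Callable

TIERS_CHEAPEST_FIRST = ["Fast", "Standard", "Best"]


def cheapest_passing(
    run_step: Callable[[str], str],
    clears_bar: Callable[[str], bool],
) -> tuple[str, str]:
    """Return (tier, output) for the cheapest tier whose output clears the bar.

    If no tier clears the bar, the Best-tier output is returned as-is.
    """
    output = ""
    for tier in TIERS_CHEAPEST_FIRST:
        output = run_step(tier)
        if clears_bar(output):
            return tier, output
    return TIERS_CHEAPEST_FIRST[-1], output


# Toy usage: pretend only Standard-or-better output is long enough.
outputs = {"Fast": "ok", "Standard": "a fuller answer", "Best": "the fullest answer"}
tier, text = cheapest_passing(lambda t: outputs[t], lambda o: len(o) > 5)
print(tier, "->", text)
```

Retrying at a higher tier costs an extra call, so this pattern only pays off when the Fast tier passes most of the time. For steps already known to be critical, starting at the default tier directly is likely cheaper.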

Quality Bar

  • Evidence items have non-empty title and url.
  • No crash on empty or malformed API response.
  • Source channel field matches the channel name.
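The three quality-bar checks can be expressed as a defensive filter. In this sketch the response shape (an `items` list) and the `source_channel` key are assumptions made for illustration. Only `title` and `url` are named in the list above, and `search-example` is a made-up channel name following the `search-*` pattern.

```python
from typing import Any


def check_evidence(response: Any, channel: str) -> list[dict]:
    """Return the evidence items that clear the quality bar; never raise."""
    # Empty or malformed API responses yield no evidence instead of a crash.
    if not isinstance(response, dict):
        return []
    items = response.get("items")  # hypothetical key
    if not isinstance(items, list):
        return []
    passed = []
    for item in items:
        if not isinstance(item, dict):
            continue
        title, url = item.get("title"), item.get("url")
        # Title and url must be non-empty strings.
        if not (isinstance(title, str) and title.strip()):
            continue
        if not (isinstance(url, str) and url.strip()):
            continue
        # The source channel field (hypothetical key) must match the channel name.
        if item.get("source_channel") != channel:
            continue
        passed.append(item)
    return passed


good = {"title": "A result", "url": "https://example.com", "source_channel": "search-example"}
bad = {"title": "", "url": "https://example.com", "source_channel": "search-example"}
print(len(check_evidence({"items": [good, bad, None]}, "search-example")))
```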