Autosearch autosearch:reflective-search-loop

Explicit loop state for multi-round research — maintains gaps, visited URLs, bad URLs, failed-evaluator feedback, and decides whether to fire another round of queries or stop. Inspired by WebThinker's in-band search protocol, node-deepresearch's agent.ts state, and Scira's extreme-search.

install

source · Clone the upstream repo

git clone https://github.com/0xmariowu/Autosearch

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/0xmariowu/Autosearch "$T" && mkdir -p ~/.claude/skills && cp -r "$T/autosearch/skills/meta/reflective-search-loop" ~/.claude/skills/0xmariowu-autosearch-autosearch-reflective-search-loop && rm -rf "$T"

manifest: autosearch/skills/meta/reflective-search-loop/SKILL.md

source content

Reflective Search Loop — Explicit Loop State

Most autosearch leaf skills are one-shot: call, get evidence, done. This meta skill codifies a multi-round loop that the runtime AI drives when the task is deeper than one query round can settle.

Loop State

loop:
  round: int                         # 0, 1, 2, ...
  budget_remaining:
    rounds: int                      # hard cap, e.g. 5
    cost_usd: float
    latency_seconds: int
  context:
    query: str                       # original user query
    rubrics: list[str]               # from run_clarify
  progress:
    gaps: list[str]                  # outstanding sub-questions, grows/shrinks
    answered_gaps: list[str]         # resolved this round
    all_questions: list[str]         # every question asked across all rounds
    visited_urls: list[str]          # URLs already fetched
    bad_urls: list[str]              # URLs that 404'd / were blocked / returned junk
    evidence: list[dict]             # accumulated Evidence items
    evaluator_failures: list[str]    # rubric/quality failures from previous round
  decision:
    next_action: "search_more" | "fetch_more" | "finalize" | "escalate_budget" | "abort"
    next_queries: list[str] | null
    next_urls_to_fetch: list[str] | null
    stop_reason: str | null

Round Structure

Each loop iteration:

Reflect (Best tier LLM): given current
```
loop
```
state, decide
```
next_action
```
.

Act: call

run_channel(...)

fetch-jina

fetch-crawl4ai

for the chosen next step.

Integrate: update
```
progress
```
(visited URLs, evidence list, new gaps discovered).

Evaluate (if enabled): run

check-rubrics

evaluate-delivery

; record failures in

evaluator_failures

Loop or stop: check budget + stop conditions.

Stop Conditions

ANY of:

```
round >= budget_remaining.rounds
```
(hard cap).

len(gaps) == 0

AND

len(evaluator_failures) == 0

(task complete).

```
len(new_evidence_this_round) == 0
```
for 2 consecutive rounds (stalled).
```
cost_usd
```
> allocated budget.
User explicitly aborts.

What This Skill Is NOT

Not a replacement for the deleted-wave-3 m3 iteration. That was an implicit loop inside the autosearch pipeline. This is an explicit protocol the runtime AI follows — state is visible to the user/runtime, not buried inside
```
autosearch research()
```
.
Not a sub-agent orchestrator — that's
```
delegate-subtask
```
. This is single-agent, single-session multi-round.

Interactions

Starts from →
```
run_clarify
```
result (rubrics, initial gaps from query_type).
Fans out to →
```
run_channel
```
per round.
Uses →
```
fetch-jina
```
/
```
fetch-crawl4ai
```
for URL reads in each round.
Finalizes via →
```
synthesize-knowledge
```
+
```
evaluate-delivery
```
.

Cost Discipline

Reflective loops are expensive (Best tier × N rounds). Runtime AI should:

Default to 2-3 rounds for most research tasks.
Escalate to 5 rounds only when
```
rubrics_failed > 30%
```
after round 3.
Never exceed 8 rounds without user confirmation (escalate_budget).

MCP Tool Usage

Full reflective loop using MCP tools:

# Initialize loop
result = loop_init()
state_id = result["state_id"]

# Round 1
evidence = run_channel(channel_name="arxiv", query="RAG survey 2026")["evidence"]
loop_update(state_id=state_id, evidence=evidence, query="RAG survey 2026")

# Check gaps after round 1
gaps = loop_get_gaps(state_id=state_id)["gaps"]
# gaps might be: ["no benchmarks found", "missing 2026 papers"]

# Manually add a gap if you spot one
loop_add_gap(state_id=state_id, gap="no industry deployment case studies")

# Round 2: search specifically for gaps
for gap in gaps:
    evidence2 = run_channel(channel_name="semantic-scholar", query=gap)["evidence"]
    loop_update(state_id=state_id, evidence=evidence2, query=gap)

Loop state is in-memory per server process. State is lost on server restart (use within one session).

Prior Art Anchors

WebThinker: in-band
```
<|begin_search_query|>
```
/
```
<|begin_click_link|>
```
protocol — we do not reproduce the tokens, but preserve the explicit state tracking of executed queries / clicked URLs.

node-deepresearch:

gaps / allQuestions / allKnowledge / visitedURLs / badURLs / weightedURLs / evaluationMetrics

state — adopted verbatim.

Scira extreme-search: forced
```
thinking -> tool
```
alternation and mandatory date injection for time-sensitive queries — adopt the date-anchor rule.

Quality Bar

Evidence items have non-empty title and url.
No crash on empty or malformed API response.
Source channel field matches the channel name.