Vibe-Skills webthinker-deep-research

Deep web research for VCO: multi-hop search+browse+extract with an auditable action trace and a structured report (WebThinker-style).

install
source · Clone the upstream repo
git clone https://github.com/foryourhealth111-pixel/Vibe-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/foryourhealth111-pixel/Vibe-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/bundled/skills/webthinker-deep-research" ~/.claude/skills/foryourhealth111-pixel-vibe-skills-webthinker-deep-research && rm -rf "$T"
manifest: bundled/skills/webthinker-deep-research/SKILL.md
source content

WebThinker Deep Research (VCO)

When to use

Use this skill when the task requires deep web research (not just one-shot search), for example:

  • Multi-hop questions (“find → open → follow links → verify”)
  • “Deep research report” / “research report” (调研报告) / “competitive analysis” (竞品调研) / “technical survey” (技术调研)
  • Need an auditable trace of web actions and sources
  • Need to merge findings into a structured deliverable (report / brief / spec)

Non-goals (avoid redundancy)

  • For quick citations or “give me 3 sources”, prefer research-lookup.
  • For interactive UI flows (login / forms / downloads), prefer playwright or turix-cua overlays.
  • For codebase structure / call chains, prefer GitNexus overlays (not web research).

Output contract (must)

Produce a folder with:

  • report.md: structured report (problem → findings → implications → next steps)
  • sources.json: all sources (URL/title/access time/snippet)
  • trace.jsonl: append-only action trace (search/open/extract/decision)
  • notes.md: working notes with per-source anchors

Use scripts/init_webthinker_run.py to scaffold the folder.
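
For orientation, the four-file contract above can be sketched in a few lines of Python. This is not the bundled scripts/init_webthinker_run.py, whose internals are not shown here; the function name scaffold_run, the starter section headings, and the choice of an empty JSON list for sources.json are all assumptions made for illustration.

```python
import json
from pathlib import Path


def scaffold_run(out_dir: str, topic: str) -> Path:
    """Create the four contract files in out_dir if they do not exist yet.

    Hypothetical helper: file contents below are illustrative defaults,
    not what the bundled init script necessarily writes.
    """
    run = Path(out_dir)
    run.mkdir(parents=True, exist_ok=True)
    defaults = {
        "report.md": (
            f"# {topic}\n\n## Problem\n\n## Findings\n\n"
            "## Implications\n\n## Next steps\n"
        ),
        "sources.json": json.dumps([], indent=2) + "\n",  # assumed: a JSON list
        "trace.jsonl": "",  # append-only, starts empty
        "notes.md": f"# Notes: {topic}\n",
    }
    for name, body in defaults.items():
        path = run / name
        if not path.exists():  # never overwrite an existing run
            path.write_text(body, encoding="utf-8")
    return run
```

Calling scaffold_run("outputs/webthinker", "some topic") twice is safe under this sketch, because existing files are left untouched.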

Runtime (Upstream vendoring)

This VCO skill supports a stable Lite mode by default, and keeps the upstream WebThinker repo vendored for optional advanced use.

  • Vendored upstream paths:
    • C:\Users\羽裳\.codex\_external\ruc-nlpir\WebThinker\
  • Runtime config (no secrets stored):
    • C:\Users\羽裳\.codex\skills\vibe\config\ruc-nlpir-runtime.json
  • Preflight / install (no secrets echoed):
    • pwsh C:\Users\羽裳\.codex\skills\vibe\scripts\ruc-nlpir\preflight.ps1
    • Manually create an isolated venv for the vendored runtime and install only the minimal packages you need. The old install-upstreams.ps1 auto-install path has been removed on purpose.

LLM endpoint conventions (recommended):

  • Base URL: OPENAI_BASE_URL (or runtime default)
  • API key: OPENAI_API_KEY (env var only; never write into files or CLI args)
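
A minimal sketch of how a helper script might honor these conventions: both values come from the process environment, and the key is never accepted as an argument or written anywhere. The function name llm_endpoint and its default_base_url parameter are hypothetical; how the runtime default is actually resolved is not specified here.

```python
import os
from typing import Optional, Tuple


def llm_endpoint(default_base_url: Optional[str] = None) -> Tuple[Optional[str], str]:
    """Return (base_url, api_key), both taken from the environment.

    Hypothetical helper. default_base_url stands in for the
    "runtime default" and is used only when OPENAI_BASE_URL is unset.
    """
    base_url = os.environ.get("OPENAI_BASE_URL") or default_base_url
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        # Fail early rather than prompting for, or logging, a secret.
        raise RuntimeError("OPENAI_API_KEY is not set in the environment")
    return base_url, api_key
```

Because the key is deliberately not a function parameter, it cannot end up in shell history or in a process listing through this code path.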

Modes

Mode A (Recommended): Lite — tool-orchestrated deep research

Use existing tools (no heavy model hosting):

  1. Scaffold outputs:
    • python C:\Users\羽裳\.codex\skills\webthinker-deep-research\scripts\init_webthinker_run.py --topic "…" --out outputs/webthinker
  2. Search (broad → narrow):
    • Use web.run search queries or mcp__tavily__tavily_search if available.
  3. Browse/extract:
    • Use web.run open/click/find for structured pages
    • Use playwright when pages require dynamic rendering / interactions
  4. Draft + iterate:
    • Update notes.md and sources.json continuously
    • Write report.md as you go (think-search-and-draft), not only at the end
  5. Verification:
    • Triangulate key claims across ≥2 sources when possible
    • Flag uncertainties explicitly
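
The verification step can be made mechanical with a small check like the one below. It is a sketch under stated assumptions: the claims mapping (claim text to the URLs supporting it) is a hypothetical working structure, and counting distinct hosts rather than distinct URLs is a design choice of this example, so that two pages on the same site do not count as independent confirmation.

```python
from urllib.parse import urlparse


def triangulate(claims, min_sources=2):
    """Split claims into (verified, uncertain).

    claims: dict mapping a claim string to a list of supporting URLs
    (hypothetical structure). A claim is "verified" when its URLs span
    at least min_sources distinct hosts.
    """
    verified, uncertain = [], []
    for claim, urls in claims.items():
        hosts = {urlparse(u).netloc.lower() for u in urls}
        hosts.discard("")  # ignore strings that did not parse as URLs
        if len(hosts) >= min_sources:
            verified.append(claim)
        else:
            uncertain.append(claim)
    return verified, uncertain
```

Claims that land in the uncertain list are the ones to flag explicitly in report.md.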

Mode B (Optional): Full WebThinker stack

Choose this only if you want to run the upstream system end-to-end and have the environment for it:

  • Requires heavy deps (torch, transformers, vllm) + a served reasoning model
  • Requires a search API (Serper recommended by upstream)
  • Optional: Crawl4AI parser client for JS-heavy pages

This mode is for high-throughput deep research runs; for most VCO tasks, Lite mode is enough and cheaper.

Action trace format (trace.jsonl)

Each line is one JSON object, e.g.:

  • {"ts":"…","type":"search","query":"…","provider":"web.run"}
  • {"ts":"…","type":"open","url":"…"}
  • {"ts":"…","type":"extract","url":"…","highlights":["…","…"]}
  • {"ts":"…","type":"decision","reason":"why this source matters","next":"…"}
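
Records of this shape can be appended with a small helper such as the following sketch. The function name append_trace is hypothetical, and the timestamp format (ISO 8601 in UTC) is an assumption, since the examples above leave "ts" unspecified.

```python
import json
from datetime import datetime, timezone

# The four action types named in the output contract.
ALLOWED_TYPES = {"search", "open", "extract", "decision"}


def append_trace(path, event_type, **fields):
    """Append one JSON object as a single line to trace.jsonl."""
    if event_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown trace type: {event_type!r}")
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),  # assumed format
        "type": event_type,
        **fields,
    }
    # Mode "a" only ever adds lines, which keeps the trace append-only.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

For example, append_trace("trace.jsonl", "search", query="…", provider="web.run") writes one line matching the first record above.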

Quality gates

  • Every major claim in report.md links back to at least one entry in sources.json.
  • sources.json contains the exact URLs you used (no “I saw somewhere…”).
  • Keep the report actionable: add “Next steps” with concrete verification tasks.
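
The first two gates can be partly automated. The sketch below assumes that report.md cites sources by URL and that sources.json is a list of objects with a "url" key; neither detail is fixed by this skill, so treat both as assumptions, along with the function name unlisted_urls.

```python
import re

# Rough URL matcher: stops at whitespace and common closing delimiters.
URL_RE = re.compile(r'https?://[^\s)\]>"]+')


def unlisted_urls(report_text, sources):
    """Return URLs cited in the report that are missing from sources.

    sources: list of dicts with a "url" key (assumed schema).
    """
    known = {s["url"] for s in sources}
    # Strip sentence punctuation that tends to trail a pasted URL.
    found = {u.rstrip(".,;:") for u in URL_RE.findall(report_text)}
    return sorted(found - known)
```

An empty result means every URL in the report is backed by an entry in sources.json. It does not prove that every major claim carries a citation, which still needs a manual read.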