Autosearch fetch-crawl4ai
Deep URL fetch using crawl4ai (Playwright-powered) for JS-rendered pages, anti-bot sites, and dynamic content. Slower than fetch-jina but handles sites that block simple fetchers. Requires user-installed crawl4ai package.
install
source · Clone the upstream repo
git clone https://github.com/0xmariowu/Autosearch
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/0xmariowu/Autosearch "$T" && mkdir -p ~/.claude/skills && cp -r "$T/autosearch/skills/tools/fetch-crawl4ai" ~/.claude/skills/0xmariowu-autosearch-fetch-crawl4ai && rm -rf "$T"
manifest: autosearch/skills/tools/fetch-crawl4ai/SKILL.md
source content
Deep URL fetch through crawl4ai, backed by Playwright and Chromium. Use this as the fallback when fetch-jina fails, refuses a URL, returns an empty page, or cannot see JavaScript-rendered content.
Input Fit
- JavaScript-heavy SPAs where server HTML is mostly empty.
- Anti-bot or dynamic sites that block simple HTTP fetchers.
- Pages that need a CSS selector wait before extraction.
- Public pages where a local Chromium browser can render the content.
Invocation
Call fetch.py's sync fetch(url: str, wait_for: str | None = None, timeout_seconds: float = 30.0) function:
result = fetch("https://example.com/app", wait_for=".loaded")
Successful calls return:
{
    "ok": True,
    "markdown": "...",
    "title": "Rendered page title",
    "url": "https://example.com/final-url",
    "meta": {
        "status_code": 200,
        "backend": "crawl4ai",
        "browser": "chromium",
        "elapsed_sec": 2.4,
    },
    "source": "https://example.com/app",
}
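As a rough illustration of consuming this return value, the sketch below summarizes a result dict without launching a browser. The helper name `summarize_result` is hypothetical, and the failure shape (`"ok": False` plus a `"reason"` string) is an assumption inferred from the failure modes listed in this document, not something the success example confirms.

```python
from typing import Any


def summarize_result(result: dict[str, Any]) -> str:
    """Return a one-line summary of a fetch() result dict (hypothetical helper)."""
    if result.get("ok"):
        meta = result.get("meta", {})
        return (
            f"{result.get('title', '')} "
            f"({result.get('url', '')}, status {meta.get('status_code')}, "
            f"{len(result.get('markdown', ''))} chars of Markdown)"
        )
    # Assumption: failures carry "ok": False and a "reason" string.
    return f"fetch failed: {result.get('reason', 'unknown')}"


# Sample dict mirroring the documented success shape; nothing is fetched here.
sample = {
    "ok": True,
    "markdown": "# Hello",
    "title": "Rendered page title",
    "url": "https://example.com/final-url",
    "meta": {"status_code": 200, "backend": "crawl4ai", "browser": "chromium", "elapsed_sec": 2.4},
    "source": "https://example.com/app",
}
print(summarize_result(sample))
```

Reading every key with `.get()` keeps the helper from raising on a partial or malformed result.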
Failure Modes
- Missing crawl4ai package returns reason: crawl4ai_unavailable. Install with pip install crawl4ai plus playwright install chromium, or fall back to fetch-jina.
- Browser or crawl4ai runtime failures return reason: crawl4ai_runtime_error with the error message or stderr tail.
- DNS, connection, and transport failures return reason: network_error.
- Slow pages that exceed timeout_seconds return reason: timeout.
- HTTP 403 or detectable challenge pages return reason: anti_bot_blocked; degrade to fetch-playwright or fetch-firecrawl paid fallback.
- Successful crawls with empty or too-short Markdown return reason: empty_content.
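The fallbacks named in the list above could be encoded as a small lookup, sketched below. It assumes a failed call returns a dict with `"ok": False` and a `"reason"` key; the names `FALLBACKS` and `fallback_tools` are hypothetical. Only the two reasons with a fallback stated here are mapped, and every other reason yields an empty list.

```python
# Fallback tools named in the failure-mode list, keyed by reason.
FALLBACKS: dict[str, list[str]] = {
    "crawl4ai_unavailable": ["fetch-jina"],
    "anti_bot_blocked": ["fetch-playwright", "fetch-firecrawl"],
}


def fallback_tools(result: dict) -> list[str]:
    """Return the tools to try next for a failed result, in order."""
    if result.get("ok"):
        return []  # Success: nothing to fall back to.
    # Assumption: failed results expose the reason under a "reason" key.
    return FALLBACKS.get(result.get("reason", ""), [])


print(fallback_tools({"ok": False, "reason": "anti_bot_blocked"}))
```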
Limits
- Requires user opt-in installation: pip install crawl4ai.
- Requires a Chromium browser installed for Playwright: playwright install chromium.
- This tool is slower and heavier than fetch-jina; use it only when a simple fetch cannot retrieve the useful content.
- It does not solve sites that require authentication, strong bot mitigation, paid proxies, or long interactive flows.
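A caller might check the two installation requirements before invoking the tool. The sketch below is one possible preflight using only the standard library; the function name `preflight` is hypothetical. It only checks that the crawl4ai package is importable and that a `playwright` executable is on PATH. It does not verify that Chromium itself has been downloaded.

```python
import importlib.util
import shutil


def preflight() -> dict[str, bool]:
    """Report whether the opt-in dependencies appear to be present."""
    return {
        # True when Python can locate the crawl4ai package.
        "crawl4ai_package": importlib.util.find_spec("crawl4ai") is not None,
        # True when a `playwright` executable is on PATH (not proof Chromium is installed).
        "playwright_cli": shutil.which("playwright") is not None,
    }


status = preflight()
print(status)
```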
Quality Bar
- Evidence items have non-empty title and url.
- No crash on empty or malformed API response.
- Source channel field matches the channel name.