Autosearch fetch-crawl4ai

Deep URL fetch using crawl4ai (Playwright-powered) for JS-rendered pages, anti-bot sites, and dynamic content. Slower than fetch-jina but handles sites that block simple fetchers. Requires user-installed crawl4ai package.

install

source · Clone the upstream repo

git clone https://github.com/0xmariowu/Autosearch

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/0xmariowu/Autosearch "$T" && mkdir -p ~/.claude/skills && cp -r "$T/autosearch/skills/tools/fetch-crawl4ai" ~/.claude/skills/0xmariowu-autosearch-fetch-crawl4ai && rm -rf "$T"

manifest: autosearch/skills/tools/fetch-crawl4ai/SKILL.md

source content

Deep URL fetch through

crawl4ai

, backed by Playwright and Chromium. Use this as the fallback when

fetch-jina

fails, refuses a URL, returns an empty page, or cannot see JavaScript-rendered content.

Input Fit

JavaScript-heavy SPAs where server HTML is mostly empty.
Anti-bot or dynamic sites that block simple HTTP fetchers.
Pages that need a CSS selector wait before extraction.
Public pages where a local Chromium browser can render the content.

Invocation

Call

fetch.py

's sync

fetch(url: str, wait_for: str | None = None, timeout_seconds: float = 30.0)

function:

result = fetch("https://example.com/app", wait_for=".loaded")

Successful calls return:

{
    "ok": True,
    "markdown": "...",
    "title": "Rendered page title",
    "url": "https://example.com/final-url",
    "meta": {
        "status_code": 200,
        "backend": "crawl4ai",
        "browser": "chromium",
        "elapsed_sec": 2.4,
    },
    "source": "https://example.com/app",
}

Failure Modes

Missing

crawl4ai

package returns

reason: crawl4ai_unavailable

. Install with

pip install crawl4ai

plus

playwright install chromium

, or fall back to

fetch-jina

Browser or crawl4ai runtime failures return
```
reason: crawl4ai_runtime_error
```
with the error message or stderr tail.
DNS, connection, and transport failures return
```
reason: network_error
```
.
Slow pages that exceed
```
timeout_seconds
```
return
```
reason: timeout
```
.
HTTP 403 or detectable challenge pages return
```
reason: anti_bot_blocked
```
; degrade to
```
fetch-playwright
```
or
```
fetch-firecrawl
```
paid fallback.
Successful crawls with empty or too-short Markdown return
```
reason: empty_content
```
.

Limits

Requires user opt-in installation:
```
pip install crawl4ai
```
.
Requires a Chromium browser installed for Playwright:
```
playwright install chromium
```
.
This tool is slower and heavier than
```
fetch-jina
```
; use it only when a simple fetch cannot retrieve the useful content.
It does not solve sites that require authentication, strong bot mitigation, paid proxies, or long interactive flows.

Quality Bar

Evidence items have non-empty title and url.
No crash on empty or malformed API response.
Source channel field matches the channel name.