Skills crawl-for-ai
Web scraping using local Crawl4AI instance. Use for fetching full page content with JavaScript rendering. Better than Tavily for complex pages. Unlimited usage.
install
source · Clone the upstream repo
git clone https://github.com/openclaw/skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/angusthefuzz/crawl-for-ai" ~/.claude/skills/openclaw-skills-crawl-for-ai && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/angusthefuzz/crawl-for-ai" ~/.openclaw/skills/openclaw-skills-crawl-for-ai && rm -rf "$T"
manifest:
skills/angusthefuzz/crawl-for-ai/SKILL.md
Crawl4AI Web Scraper
Local Crawl4AI instance for full web page extraction with JavaScript rendering.
Endpoints
Proxy (port 11234) — Clean output, OpenWebUI-compatible
- Returns: [{page_content, metadata}]
- Use for: Simple content extraction
Direct (port 11235) — Full output with all data
- Returns: {results: [{markdown, html, links, media, ...}]}
- Use for: When you need links, media, or other metadata
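As a sketch of what a raw call to the Direct endpoint might look like: the `/crawl` path and `{"urls": [...]}` payload shape below are assumptions based on Crawl4AI's Docker API, so verify them against your instance before relying on them.

```shell
#!/usr/bin/env bash
# Build a request for the Direct endpoint (port 11235). The /crawl path
# and payload shape are assumed; adjust to match your Crawl4AI version.
BASE="${CRAWL4AI_URL:-http://localhost:11235}"
URL_TO_FETCH="https://example.com"
PAYLOAD=$(printf '{"urls": ["%s"]}' "$URL_TO_FETCH")
echo "POST $BASE/crawl"
echo "$PAYLOAD"
# To actually send it (requires a running instance):
# curl -s -X POST "$BASE/crawl" -H 'Content-Type: application/json' -d "$PAYLOAD"
```

The Proxy endpoint on port 11234 trades this richer response for the flat `[{page_content, metadata}]` shape shown above.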
Usage
# Via script
node {baseDir}/scripts/crawl4ai.js "url"
node {baseDir}/scripts/crawl4ai.js "url" --json
Script options:
--json — Full JSON response
Output: Clean markdown from the page.
Configuration
Required environment variable:
CRAWL4AI_URL — Your Crawl4AI instance URL (e.g., http://localhost:11235)
Optional:
CRAWL4AI_KEY — API key if your instance requires authentication
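A shell-profile entry covering both variables might look like this; the values are placeholders, not defaults shipped by the skill:

```shell
# Placeholder values — point CRAWL4AI_URL at your own instance.
export CRAWL4AI_URL="http://localhost:11235"
export CRAWL4AI_KEY="your-api-key"   # optional; omit if your instance has no auth
```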
Features
- JavaScript rendering — Handles dynamic content
- Unlimited usage — Local instance, no API limits
- Full content — HTML, markdown, links, media, tables
- Better than Tavily for complex pages with JS
API
Uses your local Crawl4AI instance's REST API. The auth header is only sent if
CRAWL4AI_KEY is set.
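A minimal sketch of that conditional-header logic in shell, assuming a Bearer-style Authorization header (the header name and format your instance expects may differ):

```shell
# Emit curl header arguments only when CRAWL4AI_KEY is non-empty
# (the "Authorization: Bearer" format is an assumption).
auth_args() {
  if [ -n "${CRAWL4AI_KEY:-}" ]; then
    printf -- '-H\nAuthorization: Bearer %s\n' "$CRAWL4AI_KEY"
  fi
}

CRAWL4AI_KEY=""
echo "without key: [$(auth_args)]"
CRAWL4AI_KEY="secret-token"
echo "with key: [$(auth_args | tr '\n' ' ')]"
```

With no key set, `auth_args` emits nothing, so an unauthenticated instance never receives a spurious Authorization header.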