Name: firecrawl
Author: openclaw

Web scraping and content extraction using the Firecrawl API. Use when users need to crawl websites, extract structured data, convert web pages to markdown, scrape multiple URLs, or build knowledge bases from web content. Supports single-page extraction, site-wide crawling, batch processing, and structured data extraction with CSS selectors.

install

source · Clone the upstream repo

git clone https://github.com/openclaw/skills

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/antonia-sz/web-scraper-firecrawl" ~/.claude/skills/openclaw-skills-firecrawl && rm -rf "$T"

OpenClaw · Install into ~/.openclaw/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/antonia-sz/web-scraper-firecrawl" ~/.openclaw/skills/openclaw-skills-firecrawl && rm -rf "$T"

manifest: skills/antonia-sz/web-scraper-firecrawl/SKILL.md

Firecrawl Skill

Web scraping powered by Firecrawl: turn websites into LLM-ready markdown.

Overview

Firecrawl provides APIs for:

Scrape - Single page extraction to markdown
Crawl - Entire site crawling with depth control
Map - URL discovery from a starting point
Batch - Multiple URL processing
Extract - Structured data extraction with schemas

Prerequisites

Firecrawl API key - a free tier is available at https://firecrawl.dev
Install the Python dependency:
```
pip install requests
```

Configuration

Set the environment variable:

export FIRECRAWL_API_KEY="fc-your-api-key"
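
From Python, the same key can be read from the environment and turned into request headers. This is a minimal sketch: the helper name `auth_headers` is hypothetical, and the Bearer-token header is an assumption about how the Firecrawl REST API authenticates, not something this skill documents.

```python
import os


def auth_headers(env=os.environ):
    """Build HTTP headers from FIRECRAWL_API_KEY (hypothetical helper)."""
    key = env.get("FIRECRAWL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("FIRECRAWL_API_KEY is not set")
    # Assumption: the API expects a Bearer token in the Authorization header.
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Failing early on a missing key gives a clearer error than an authentication failure returned by the first request.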

Usage

Single Page Scraping

# Basic scrape
firecrawl scrape https://example.com

# With specific options
firecrawl scrape https://example.com --formats markdown,html --only-main-content

# Wait for JS rendering
firecrawl scrape https://spa-app.com --wait-for 2000
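
To illustrate how these flags might translate into an API request, the sketch below builds a JSON body for a single-page scrape. The function name `build_scrape_payload` is hypothetical, and the field names `formats`, `onlyMainContent`, and `waitFor` are assumptions modeled on Firecrawl's public REST API; scripts/firecrawl.py may do this differently.

```python
import json


def build_scrape_payload(url, formats=("markdown",), only_main_content=False, wait_for_ms=None):
    """Translate CLI-style options into a request body (field names are assumptions)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"not an absolute HTTP(S) URL: {url!r}")
    payload = {"url": url, "formats": list(formats)}
    if only_main_content:
        payload["onlyMainContent"] = True
    if wait_for_ms is not None:
        # Milliseconds to wait so client-side JavaScript can render the page.
        payload["waitFor"] = int(wait_for_ms)
    return payload


if __name__ == "__main__":
    print(json.dumps(build_scrape_payload("https://spa-app.com", wait_for_ms=2000), indent=2))
```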

Site Crawling

# Crawl entire site (up to limit)
firecrawl crawl https://docs.example.com --limit 50

# With depth control
firecrawl crawl https://blog.example.com --max-depth 2 --limit 100

# Include/exclude patterns
firecrawl crawl https://site.com --include "/blog/*" --exclude "/admin/*"

# Custom formats
firecrawl crawl https://docs.example.com --formats markdown,links
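
The include/exclude patterns and the page limit can be pictured as a filter over discovered URLs. The sketch below is one possible interpretation, assuming the patterns are shell-style globs matched against the URL path; the actual matching rules used by the CLI are not documented here, and `select_urls` is a hypothetical name.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def select_urls(urls, include=(), exclude=(), limit=None):
    """Keep URLs whose path matches an include glob and no exclude glob."""
    kept = []
    for url in urls:
        if limit is not None and len(kept) >= limit:
            break  # stop once the page budget is used up
        path = urlparse(url).path or "/"
        if include and not any(fnmatchcase(path, pat) for pat in include):
            continue
        if any(fnmatchcase(path, pat) for pat in exclude):
            continue
        kept.append(url)
    return kept
```

An empty `include` means "everything is allowed", so `exclude` alone is enough to carve out a section such as `/admin/*`.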

URL Mapping

# Discover all URLs from a site
firecrawl map https://example.com

# With search term
firecrawl map https://docs.python.org --search "tutorial"

Batch Processing

# Scrape multiple URLs
firecrawl batch urls.txt --output ./scraped/

# From JSON list
firecrawl batch urls.json --formats markdown --concurrency 5
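
A batch run boils down to loading a URL list and processing it with bounded concurrency. The following sketch shows one way to do that with the standard library. The file conventions (one URL per line, `#` comments, or a JSON array) and the names `load_urls` and `scrape_all` are assumptions; `scrape_one` stands in for whatever function performs a single scrape.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_urls(path):
    """Read URLs from a .json array or a text file with one URL per line."""
    text = Path(path).read_text(encoding="utf-8")
    if str(path).endswith(".json"):
        urls = json.loads(text)
        if not isinstance(urls, list):
            raise ValueError("expected a JSON array of URLs")
        return [str(u) for u in urls]
    # Assumed text format: blank lines and '#' comment lines are ignored.
    lines = (ln.strip() for ln in text.splitlines())
    return [ln for ln in lines if ln and not ln.startswith("#")]


def scrape_all(urls, scrape_one, concurrency=5):
    """Apply scrape_one to every URL using a bounded thread pool, preserving order."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(scrape_one, urls))
```

Threads suit this workload because each scrape spends most of its time waiting on the network.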

Structured Extraction

# Extract specific data using CSS selectors
firecrawl extract https://example.com/products \
  --schema '{"name": ".product-title", "price": ".price", "description": ".desc"}'

# Extract using a schema stored in a JSON file
firecrawl extract https://news.example.com/article --schema article-schema.json
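
The two commands above pass the schema either inline or as a file path. A loader that accepts both forms could look like the sketch below. It assumes both forms hold a flat field-to-selector mapping like the inline example; the format of article-schema.json is not shown in this document, and `load_schema` is a hypothetical name.

```python
import json
from pathlib import Path


def load_schema(arg):
    """Return a field -> CSS selector mapping from inline JSON or a JSON file path."""
    text = arg.strip()
    if not text.startswith("{"):
        # Anything that is not an inline JSON object is treated as a file path.
        text = Path(arg).read_text(encoding="utf-8")
    schema = json.loads(text)
    if not isinstance(schema, dict) or not schema:
        raise ValueError("schema must be a non-empty JSON object")
    for field, selector in schema.items():
        if not isinstance(selector, str) or not selector.strip():
            raise ValueError(f"selector for {field!r} must be a non-empty string")
    return schema
```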

Output Formats

Markdown

Clean, LLM-ready markdown with:

Headings preserved
Links converted to markdown format
Images with alt text
Tables formatted as markdown tables

HTML

Raw or cleaned HTML

Links

Extracted link lists for further crawling

Screenshot

Page screenshot (if requested)

Use Cases

Knowledge Base Building

# Crawl documentation site
firecrawl crawl https://docs.framework.com --limit 200 -o ./kb/

# Merge into single file for RAG
cat ./kb/*.md > knowledge-base.md
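
Plain `cat` discards file boundaries, which can make it harder to trace a retrieved chunk back to its page. As an alternative sketch, the Python function below merges the files in name order and marks each section with its source file name. The HTML-comment marker format and the name `merge_markdown` are illustrative choices, not part of this skill.

```python
from pathlib import Path


def merge_markdown(src_dir, out_file):
    """Concatenate *.md files in name order, prefixing each with its file name."""
    out_path = Path(out_file).resolve()
    parts = []
    for md in sorted(Path(src_dir).glob("*.md")):
        if md.resolve() == out_path:
            continue  # never merge the output file into itself
        body = md.read_text(encoding="utf-8").strip()
        parts.append(f"<!-- source: {md.name} -->\n\n{body}\n")
    out_path.write_text("\n".join(parts), encoding="utf-8")
    return len(parts)
```

For example, `merge_markdown("./kb", "knowledge-base.md")` writes the combined file and returns the number of pages merged.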

Research & Analysis

# Scrape competitor pricing
firecrawl batch competitors.txt --extract pricing-schema.json

# Monitor blog updates
firecrawl map https://blog.company.com --since 2024-01-01

Content Migration

# Export old CMS content
firecrawl crawl https://old-site.com --formats markdown,html -o ./export/

Scripts

All functionality is provided by scripts/firecrawl.py:

API authentication
Automatic rate limiting
Retry logic for failures
Progress tracking for large crawls
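
The retry behavior listed above is not spelled out in this document. A common way to implement it is exponential backoff, sketched below under that assumption; the name `call_with_retry` and its default values are hypothetical and may not match what scripts/firecrawl.py does.

```python
import time


def call_with_retry(fn, attempts=4, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**n seconds and retry, up to `attempts` tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for n in range(attempts):
        try:
            return fn()
        except retry_on:
            if n == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** n))
```

Passing `sleep` as a parameter keeps the function easy to test, and narrowing `retry_on` to network or rate-limit errors avoids retrying failures that cannot succeed.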

Integration

Works well with:

markdown-sync-pro - Sync scraped content to Notion/GitHub
arxiv-paper - Combine with academic paper downloads
maybe-finance - Scrape financial data for analysis