Claude-skills firecrawl-scraper

Firecrawl v2.5 API for scraping and crawling websites into LLM-ready markdown. Use it for site extraction and dynamic content, or when you run into JavaScript rendering, bot detection, or content loading errors.

install
source · Clone the upstream repo
git clone https://github.com/secondsky/claude-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/secondsky/claude-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/firecrawl-scraper/skills/firecrawl-scraper" ~/.claude/skills/secondsky-claude-skills-firecrawl-scraper && rm -rf "$T"
manifest: plugins/firecrawl-scraper/skills/firecrawl-scraper/SKILL.md
source content

Firecrawl Web Scraper Skill

Status: Production Ready ✅ | Last Updated: 2025-11-21 | Official Docs: https://docs.firecrawl.dev | API Version: v2.5


What is Firecrawl?

Firecrawl is a Web Data API for AI that turns entire websites into LLM-ready markdown or structured data. It handles:

  • JavaScript rendering - Executes client-side JavaScript to capture dynamic content
  • Anti-bot bypass - Gets past CAPTCHA and bot detection systems
  • Format conversion - Outputs as markdown, JSON, or structured data
  • Screenshot capture - Saves visual representations of pages
  • Browser automation - Full headless browser capabilities

API Endpoints

1. /v2/scrape - Single Page Scraping

Scrapes a single webpage and returns clean, structured content.

Use Cases:

  • Extract article content
  • Get product details
  • Scrape specific pages
  • Convert HTML to markdown

Key Options:

  • formats: ["markdown", "html", "screenshot"]
  • onlyMainContent: true/false (removes nav, footer, ads)
  • waitFor: milliseconds to wait before scraping
  • actions: browser automation actions (click, scroll, etc.)
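
The options above end up as fields in a JSON request body. The following is a minimal sketch, assuming the REST endpoint and Bearer header shown in the Cloudflare Workers section of this document. `build_scrape_request` is a hypothetical helper name, not part of any SDK, and the request is only constructed here, never sent.

```python
import json
import urllib.request

SCRAPE_URL = "https://api.firecrawl.dev/v2/scrape"  # endpoint as used elsewhere in this document


def build_scrape_request(api_key, url, formats=("markdown",), only_main_content=True,
                         wait_for_ms=None, actions=None):
    """Build (but do not send) a POST request for a single-page scrape.

    Hypothetical helper: the body keys mirror the Key Options list above.
    """
    body = {"url": url, "formats": list(formats), "onlyMainContent": only_main_content}
    if wait_for_ms is not None:
        body["waitFor"] = wait_for_ms  # milliseconds to wait before scraping
    if actions:
        body["actions"] = list(actions)  # e.g. [{"type": "click", "selector": "button"}]
    return urllib.request.Request(
        SCRAPE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST",
    )


request = build_scrape_request("fc-your-api-key-here", "https://example.com", wait_for_ms=2000)
print(json.loads(request.data))
```

With a real key in place, the request could be sent with `urllib.request.urlopen(request)`; optional fields are left out of the body unless they are set.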

2. /v2/crawl - Full Site Crawling

Crawls all accessible pages from a starting URL.

Use Cases:

  • Index entire documentation sites
  • Archive website content
  • Build knowledge bases
  • Scrape multi-page content

Key Options:

  • limit: max pages to crawl
  • maxDepth: how many links deep to follow
  • allowedDomains: restrict to specific domains
  • excludePaths: skip certain URL patterns
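
To make the effect of these limits concrete, here is a toy model of a bounded crawl over an in-memory link graph. It is a sketch of the idea only, not how Firecrawl itself crawls: `plan_crawl` and the sample `site` graph are hypothetical, `exclude_paths` is assumed to hold regular expressions, and domain restriction is left out for brevity.

```python
import re
from collections import deque


def plan_crawl(start, links, limit=10, max_depth=2, exclude_paths=()):
    """Breadth-first walk over an in-memory link graph, honouring crawl-style limits.

    Toy model only: `links` maps each URL to the URLs it links to, and
    `exclude_paths` is assumed to hold regex patterns matched against the URL.
    """
    excluded = [re.compile(p) for p in exclude_paths]
    seen, order = {start}, []
    queue = deque([(start, 0)])
    while queue and len(order) < limit:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for nxt in links.get(url, []):
            if nxt in seen or any(p.search(nxt) for p in excluded):
                continue
            seen.add(nxt)
            queue.append((nxt, depth + 1))
    return order


site = {
    "/": ["/docs", "/blog", "/admin"],
    "/docs": ["/docs/api", "/docs/guide"],
    "/docs/api": ["/docs/api/v2"],
}
print(plan_crawl("/", site, limit=4, max_depth=2, exclude_paths=[r"^/admin"]))
```

The page limit stops the walk after four pages, the exclude pattern keeps `/admin` out of the queue, and the depth limit prevents links on depth-2 pages from being followed.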

3. /v2/map - URL Discovery

Maps all URLs on a website without scraping content.

Use Cases:

  • Find sitemap
  • Discover all pages
  • Plan crawling strategy
  • Audit website structure
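
One way to use a URL list for planning is to count pages per top-level section before deciding on crawl limits. The sketch below assumes the discovered URLs are already available as a plain list of strings. The `discovered` list is made-up sample data and `summarize_sections` is a hypothetical helper.

```python
from collections import Counter
from urllib.parse import urlparse


def summarize_sections(urls):
    """Count discovered URLs by their first path segment ("/" for the site root)."""
    counts = Counter()
    for u in urls:
        segments = [s for s in urlparse(u).path.split("/") if s]
        counts["/" + segments[0] if segments else "/"] += 1
    return dict(counts)


# Hypothetical output of a URL-discovery call; real data would come from the map endpoint.
discovered = [
    "https://example.com/",
    "https://example.com/docs/intro",
    "https://example.com/docs/api",
    "https://example.com/blog/launch",
]
print(summarize_sections(discovered))
```

A summary like this makes it easier to pick a sensible page limit or to decide which sections to exclude from a crawl.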

4. /v2/extract - Structured Data Extraction

Uses AI to extract specific data fields from pages.

Use Cases:

  • Extract product prices and names
  • Parse contact information
  • Build structured datasets
  • Custom data schemas

Key Options:

  • schema: Zod or JSON schema defining desired structure
  • systemPrompt: guide AI extraction behavior
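
As an illustration of the schema option, a JSON schema for the "product prices and names" use case could look like the sketch below. The field names, the `sample` record, and the `missing_required` helper are all hypothetical. The helper only checks required keys locally and does not call the API.

```python
import json

# JSON Schema describing the fields to extract (illustrative field names).
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["name", "price"],
}


def missing_required(record, schema):
    """Return the required schema fields that are absent from an extracted record."""
    return [field for field in schema.get("required", []) if field not in record]


# Hypothetical extraction result, used only to exercise the check.
sample = {"name": "Widget"}
print(json.dumps(product_schema, indent=2))
print(missing_required(sample, product_schema))
```

Checking extracted records against the schema's required fields is a cheap way to catch pages where extraction came back incomplete.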

Authentication

Firecrawl requires an API key for all requests.

Get API Key

  1. Sign up at https://www.firecrawl.dev
  2. Go to dashboard → API Keys
  3. Copy your API key (starts with fc-)

Store Securely

NEVER hardcode API keys in code!

# .env file
FIRECRAWL_API_KEY=fc-your-api-key-here
# .env.local (for local development)
FIRECRAWL_API_KEY=fc-your-api-key-here
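
A small guard at startup can catch a missing or malformed key before any request is made. This is a sketch that assumes the variable has already been exported into the process environment; Python does not read `.env` files by itself, so a loader such as python-dotenv or the shell would need to do that. `load_api_key` is a hypothetical helper name.

```python
import os


def load_api_key(env=os.environ):
    """Read FIRECRAWL_API_KEY from the environment and sanity-check its prefix."""
    key = env.get("FIRECRAWL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("FIRECRAWL_API_KEY is not set")
    if not key.startswith("fc-"):
        raise RuntimeError("FIRECRAWL_API_KEY does not start with 'fc-'")
    return key


# Example with an explicit mapping instead of the real environment.
print(load_api_key({"FIRECRAWL_API_KEY": "fc-your-api-key-here"}))
```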

SDK Quick Start

Python

pip install firecrawl-py  # v4.5.0+

from firecrawl import FirecrawlApp
import os

app = FirecrawlApp(api_key=os.environ.get("FIRECRAWL_API_KEY"))
result = app.scrape_url("https://example.com", params={"formats": ["markdown"], "onlyMainContent": True})
print(result.get("markdown"))

TypeScript/Node.js

bun add @mendable/firecrawl-js  # v4.4.1+

import FirecrawlApp from '@mendable/firecrawl-js';

const app = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });
const result = await app.scrapeUrl('https://example.com', { formats: ['markdown'], onlyMainContent: true });
console.log(result.markdown);

See: templates/ for crawl, extract, and advanced examples.


Common Use Cases

| Use Case | Endpoint | Key Options |
|---|---|---|
| Documentation scraping | crawl_url() | limit: 500, allowedDomains |
| Product data extraction | extract() | Zod schema + systemPrompt |
| News article scraping | scrape_url() | onlyMainContent: true, removeBase64Images |
| URL discovery | map() | Find all pages before crawling |

See: references/common-patterns.md for complete examples.


Error Handling

# Python
try:
    result = app.scrape_url("https://example.com")
except FirecrawlException as e:
    print(f"Firecrawl error: {e}")

// TypeScript
try {
  const result = await app.scrapeUrl('https://example.com');
} catch (error) {
  console.error('Error:', error.message);
}
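
For transient failures such as timeouts, the try/except pattern above can be wrapped in a retry loop with exponential backoff. The sketch below is generic and does not depend on the SDK: `with_retries` and `flaky_scrape` are hypothetical names, and the flaky function stands in for a real scrape call. Catching every `Exception` is a simplification; real code would likely narrow this to the SDK's own error type.

```python
import time


def with_retries(call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run `call()`; on any exception wait base_delay * 2**n seconds and retry."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Stand-in for a scrape call that fails twice before succeeding.
outcomes = iter([TimeoutError("slow page"), TimeoutError("slow page"), {"markdown": "# Hello"}])


def flaky_scrape():
    item = next(outcomes)
    if isinstance(item, Exception):
        raise item
    return item


delays = []
result = with_retries(flaky_scrape, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```

Passing `sleep` as a parameter keeps the helper easy to test, since the delays can be recorded instead of actually waited out.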

Rate Limits & Best Practices

| Best Practice | Why |
|---|---|
| Use onlyMainContent: true | Reduces credits, cleaner output |
| Set reasonable limit | Avoid excessive costs |
| Use map endpoint first | Plan crawling strategy |
| Cache results | Avoid re-scraping |
| Batch extract calls | More efficient for multiple URLs |

Credits: Free tier = 500/month, paid tiers higher.
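
The "cache results" practice can be as simple as an in-memory map from URL to result with a time-to-live. The following is a minimal sketch under that assumption. `ScrapeCache` and `fake_scrape` are hypothetical names, and the fake function stands in for whatever call actually spends credits.

```python
import time


class ScrapeCache:
    """In-memory cache keyed by URL, with a time-to-live per entry."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch          # callable(url) -> result, e.g. a wrapper around a scrape call
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}           # url -> (stored_at, result)

    def get(self, url):
        now = self._clock()
        hit = self._entries.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]            # fresh entry: no new request
        result = self._fetch(url)
        self._entries[url] = (now, result)
        return result


calls = []


def fake_scrape(url):
    calls.append(url)
    return {"markdown": f"# {url}"}


cache = ScrapeCache(fake_scrape, ttl_seconds=60)
cache.get("https://example.com")
cache.get("https://example.com")
print(len(calls))
```

The second lookup is served from the cache, so only one underlying call is made. A persistent store would be needed for the cache to survive restarts.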


Cloudflare Workers Integration

⚠️ SDK cannot run in Workers (Node.js dependencies). Use direct REST API:

const response = await fetch('https://api.firecrawl.dev/v2/scrape', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${env.FIRECRAWL_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ url, formats: ['markdown'], onlyMainContent: true })
});

See: references/common-patterns.md for complete Workers example with caching.


When to Use This Skill

| ✅ Use Firecrawl | ❌ Don't Use |
|---|---|
| Modern JS-rendered sites | Simple static HTML (use cheerio) |
| Clean markdown for LLMs | Existing Puppeteer setup works |
| RAG/chatbot content | Direct API available |
| Structured data extraction | Budget constraints |
| Bot protection bypass | |

Common Issues

| Issue | Cause | Fix |
|---|---|---|
| "Invalid API Key" | Key not set | Check that $FIRECRAWL_API_KEY is set and starts with fc- |
| "Rate limit exceeded" | Monthly credits used | Check dashboard, upgrade plan |
| "Timeout error" | Page slow to load | Add waitFor: 10000 |
| "Content is empty" | JS loads late | Add actions: [{type: "wait", milliseconds: 3000}] |

Advanced Features

| Feature | Usage |
|---|---|
| Browser actions | actions: [{type: "click", selector: "button"}] |
| Custom headers | headers: {"User-Agent": "Custom Bot"} |
| Webhooks | webhook: "https://your-domain.com/webhook" |
| Screenshots | formats: ["screenshot"] |

See: references/endpoints.md for complete API reference.


When to Load References

| Reference | Load When... |
|---|---|
| endpoints.md | Need complete API endpoint documentation |
| common-patterns.md | Cloudflare Workers, caching, batch processing, error handling |

Package Versions

| Package | Version |
|---|---|
| firecrawl-py | 4.5.0+ |
| @mendable/firecrawl-js | 4.4.1+ |
| API | v2 |

Note: Node.js SDK requires Node.js >=22.0.0, cannot run in Workers.


Official Docs: https://docs.firecrawl.dev | GitHub: https://github.com/mendableai/firecrawl

Token Savings: ~60% | Production Ready: ✅