# claude-seo seo-firecrawl

Clone the full repository:

```sh
git clone https://github.com/AgriciDaniel/claude-seo
```

Or copy only this skill into `~/.claude/skills`:

```sh
T=$(mktemp -d) && git clone --depth=1 https://github.com/AgriciDaniel/claude-seo "$T" && mkdir -p ~/.claude/skills && cp -r "$T/extensions/firecrawl/skills/seo-firecrawl" ~/.claude/skills/agricidaniel-claude-seo-seo-firecrawl && rm -rf "$T"
```

## Firecrawl Extension for Claude SEO

Source: `extensions/firecrawl/skills/seo-firecrawl/SKILL.md`

This skill requires the Firecrawl extension to be installed:

```sh
./extensions/firecrawl/install.sh
```
Check availability: Before using any Firecrawl tool, verify that the MCP server is connected by checking whether `firecrawl_scrape` (or any other Firecrawl tool) is available. If the tools are not available, inform the user that the extension is not installed and provide the install instructions.
## Quick Reference

| Command | Purpose |
|---|---|
| `crawl` | Full-site crawl with content extraction |
| `map` | Discover site structure (URLs only, fast) |
| `scrape` | Single-page scrape with JS rendering |
| `search` | Search within a crawled site |
## Commands

### `crawl` -- Full-Site Crawl

Crawl an entire website starting from the given URL. Returns page content, metadata, and links for all discovered pages.

MCP Tool: `firecrawl_crawl`

Parameters:

- `url` (required): Starting URL to crawl
- `limit`: Max pages to crawl (default: 100, max: 500)
- `maxDepth`: Max link depth from start URL (default: 3)
- `includePaths`: Array of glob patterns to include (e.g., `["/blog/*"]`)
- `excludePaths`: Array of glob patterns to exclude (e.g., `["/admin/*", "/api/*"]`)
- `scrapeOptions.formats`: Output formats -- `["markdown", "html", "links"]`
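To preview which pages a crawl would touch before spending credits, the include/exclude glob filtering can be approximated locally. This is a sketch using Python's `fnmatch`; Firecrawl's actual matcher may differ (for example, in whether `*` crosses `/` boundaries), so treat it as an estimate, not the server's behavior.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

def filter_paths(urls, include=None, exclude=None):
    """Locally approximate includePaths/excludePaths filtering.

    Assumption: fnmatch-style globbing stands in for Firecrawl's
    matcher, which may handle '/' boundaries differently.
    """
    kept = []
    for url in urls:
        path = urlparse(url).path
        # includePaths: keep only URLs matching at least one include pattern
        if include and not any(fnmatch(path, pat) for pat in include):
            continue
        # excludePaths: drop URLs matching any exclude pattern
        if exclude and any(fnmatch(path, pat) for pat in exclude):
            continue
        kept.append(url)
    return kept

urls = [
    "https://example.com/blog/post-1",
    "https://example.com/admin/login",
    "https://example.com/products/widget",
]
print(filter_paths(urls, include=["/blog/*", "/products/*"], exclude=["/admin/*"]))
# -> ['https://example.com/blog/post-1', 'https://example.com/products/widget']
```

Running this against the output of a cheap `firecrawl_map` call gives a page count to quote to the user before the real crawl.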
SEO Usage Patterns:

- Comprehensive audit crawl: Crawl the full site, extract all pages for subagent analysis
- Section-focused crawl: Use `includePaths` to audit only `/blog/*` or `/products/*`
- Broken link detection: Crawl with the `["links"]` format, check all hrefs for 404s
- Content inventory: Extract all page titles, meta descriptions, and H1s at scale
- SPA/JS-rendered sites: Firecrawl renders JavaScript, solving the Issue #11 problem
Example orchestration for `/seo audit`:

1. `firecrawl_map(url)` -> get all URLs (fast, no content)
2. Filter to the top 50 most important pages (homepage, key sections)
3. `firecrawl_crawl(url, limit=50)` -> get full content
4. Feed content to the seo-technical, seo-content, and seo-schema agents
Cost awareness:
- Free tier: 500 credits/month
- 1 credit = 1 page crawled or scraped
- Map operations are cheaper (0.5 credits per URL discovered)
- Always inform user of estimated credit usage before large crawls
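The credit rates above can be turned into a quick pre-flight estimate to show the user before a large crawl. A minimal sketch, assuming the stated rates (1 credit per crawled/scraped page, 0.5 per mapped URL) are current; check firecrawl.dev for up-to-date pricing.

```python
def estimate_credits(pages_to_crawl=0, pages_to_scrape=0, urls_to_map=0):
    """Estimate Firecrawl credit usage from the rates stated above:
    1 credit per page crawled or scraped, 0.5 credits per URL mapped.
    (Rates are assumptions from this document; verify against firecrawl.dev.)"""
    return pages_to_crawl * 1 + pages_to_scrape * 1 + urls_to_map * 0.5

# Example: map a 342-URL site, then crawl the top 50 pages
cost = estimate_credits(pages_to_crawl=50, urls_to_map=342)
print(f"Estimated credits: {cost} of 500 free-tier credits")
# -> Estimated credits: 221.0 of 500 free-tier credits
```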
### `map` -- Site Structure Discovery

Discover all URLs on a website without fetching content. Fast and credit-efficient.

MCP Tool: `firecrawl_map`

Parameters:

- `url` (required): Website URL to map
- `limit`: Max URLs to discover (default: 5000)
- `search`: Optional search term to filter URLs
SEO Usage Patterns:
- Sitemap comparison: Map site, compare discovered URLs vs XML sitemap
- Orphan page detection: URLs in sitemap but not linked from any page
- Crawl budget analysis: Total indexable pages vs pages linked from homepage
- URL pattern analysis: Identify URL structure patterns, duplicates, parameter bloat
- Pre-audit discovery: Run map first, then targeted crawl on key sections
Output: Array of URLs. Present as:

```
Site: example.com
Pages discovered: 342

URL Pattern Breakdown:
  /blog/*        - 128 pages (37%)
  /products/*    -  89 pages (26%)
  /category/*    -  45 pages (13%)
  /pages/*       -  32 pages (9%)
  / (root pages) -  48 pages (14%)
```
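A breakdown like the one above can be produced directly from the mapped URL list. This sketch groups by first path segment only; a real audit might bucket deeper paths or query-parameter variants separately.

```python
from collections import Counter
from urllib.parse import urlparse

def pattern_breakdown(urls):
    """Group mapped URLs by their first path segment, as in the report above.
    Sketch: only the first segment is considered."""
    counts = Counter()
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        key = "/" + segments[0] + "/*" if segments else "/ (root pages)"
        counts[key] += 1
    total = len(urls)
    return [(pat, n, round(100 * n / total)) for pat, n in counts.most_common()]

urls = ["https://example.com/blog/a", "https://example.com/blog/b",
        "https://example.com/products/x", "https://example.com/"]
for pat, n, pct in pattern_breakdown(urls):
    print(f"{pat} - {n} pages ({pct}%)")
```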
### `scrape` -- Single-Page Deep Scrape

Scrape a single page with full JavaScript rendering. More thorough than `fetch_page.py` because it executes JS and waits for dynamic content.

MCP Tool: `firecrawl_scrape`

Parameters:

- `url` (required): Page URL to scrape
- `formats`: Output formats -- `["markdown", "html", "links", "screenshot"]`
- `onlyMainContent`: Strip nav/footer/sidebar (default: true)
- `waitFor`: CSS selector or milliseconds to wait for content
- `timeout`: Request timeout in ms (default: 30000)
- `actions`: Browser actions before scraping (click, scroll, wait)
SEO Usage Patterns:
- SPA content extraction: Scrape JS-rendered React/Vue/Angular pages
- Dynamic content audit: Pages with lazy-loaded content below the fold
- Paywall/login detection: Identify content behind authentication walls
- Main content extraction: Use `onlyMainContent` for clean E-E-A-T analysis
- Screenshot capture: Use the `screenshot` format for visual analysis
When to use `scrape` vs `fetch_page.py`:

| Scenario | Use |
|---|---|
| Static HTML page | `fetch_page.py` (no API cost) |
| JS-rendered SPA | `firecrawl_scrape` (renders JS) |
| Need response headers | `fetch_page.py` (returns headers) |
| Need clean markdown | `firecrawl_scrape` (better extraction) |
| Rate-limited/blocked | `firecrawl_scrape` (handles anti-bot) |
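The decision table above can be encoded as a small helper so the choice is made consistently. This is a sketch: the precedence between conflicting needs (e.g., a JS-rendered page whose headers are also needed) is a judgment call not specified by the table, and here headers win because only the local fetcher provides them.

```python
def choose_fetcher(js_rendered=False, need_headers=False,
                   need_markdown=False, blocked=False):
    """Encode the scrape-vs-fetch_page.py decision table (sketch).
    Assumed precedence: headers first, then Firecrawl strengths,
    then the free local fetcher as the default."""
    if need_headers:
        return "fetch_page.py"        # per the table, it returns headers
    if js_rendered or need_markdown or blocked:
        return "firecrawl_scrape"     # JS rendering, clean markdown, anti-bot
    return "fetch_page.py"            # static HTML: no API cost
```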
### `search` -- Site-Scoped Search

Search within a website for specific content. Useful for finding pages related to a topic without crawling everything.

MCP Tool: `firecrawl_search`

Parameters:

- `query` (required): Search query
- `url` (required): Website to search within
- `limit`: Max results (default: 10)
- `scrapeOptions.formats`: Output format for matched pages
SEO Usage Patterns:
- Content gap validation: Search for a keyword on the site to check if content exists
- Internal linking opportunities: Find pages mentioning a topic that could link to each other
- Duplicate content detection: Search for key phrases to find near-duplicates
- Competitor content research: Search competitor site for specific topics
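For the duplicate-content pattern, pages surfaced by a key-phrase search can be compared pairwise. A common approach (not something Firecrawl provides) is word n-gram shingle overlap; this is a minimal sketch, and the similarity threshold that counts as "near-duplicate" needs tuning per site.

```python
def jaccard_similarity(text_a, text_b, n=3):
    """Compare two pages via word n-gram (shingle) overlap.
    Scores near 1.0 suggest near-duplicate content; the cutoff
    is an assumption to tune, not a standard."""
    def shingles(text):
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    a, b = shingles(text_a), shingles(text_b)
    return len(a & b) / len(a | b) if a | b else 0.0
```

Feeding the markdown output of `firecrawl_search` matches into this function flags candidate duplicates for manual review.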
## Cross-Skill Integration

### With seo-audit (full audit)

When Firecrawl is available during `/seo audit`:

- Use `firecrawl_map` to discover all site URLs
- Compare with the XML sitemap (seo-sitemap) to find orphan/missing pages
- Select top pages for deep analysis
- Feed crawled content to all subagents (technical, content, schema, geo)
- Report total crawlable pages, URL patterns, and crawl depth
### With seo-technical
- Broken link detection: crawl all internal links, check for 404s
- Redirect chain mapping: follow all redirects, flag chains > 2 hops
- Mixed content detection: check HTTP resources on HTTPS pages
- Canonical verification: compare canonical URLs with actual URLs
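The mixed-content check can run on the HTML format returned for each crawled page. This regex-based sketch only inspects `src`/`href` attributes; a real check should parse the DOM and also catch CSS `url(...)` references.

```python
import re

def find_mixed_content(html, page_url):
    """Flag http:// resources referenced from an HTTPS page.
    Sketch: regex over src/href attributes only; a DOM parser
    would be more robust and could skip plain <a> hyperlinks."""
    if not page_url.startswith("https://"):
        return []  # mixed content only applies to HTTPS pages
    return re.findall(r'(?:src|href)=["\'](http://[^"\']+)["\']', html)

html = '<img src="http://cdn.example.com/a.png"><a href="https://example.com/x">x</a>'
print(find_mixed_content(html, "https://example.com/"))
# -> ['http://cdn.example.com/a.png']
```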
### With seo-sitemap
- Sitemap coverage: % of crawled pages present in sitemap
- Orphan pages: pages found by crawl but missing from sitemap
- Stale sitemap entries: URLs in sitemap that return 404/410
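All three seo-sitemap checks reduce to set arithmetic between the crawled URL list and the parsed XML sitemap. A minimal sketch; sitemap entries the crawl never reached are only candidates for the stale-entry check and still need a 404/410 recheck.

```python
def sitemap_diff(crawled_urls, sitemap_urls):
    """Set comparison behind the three sitemap checks (sketch):
    coverage %, pages crawled but unlisted, and listed pages the
    crawl never reached (recheck those for 404/410)."""
    crawled, listed = set(crawled_urls), set(sitemap_urls)
    coverage = 100 * len(crawled & listed) / len(crawled) if crawled else 0.0
    return {
        "coverage_pct": round(coverage, 1),
        "unlisted": sorted(crawled - listed),    # crawled, missing from sitemap
        "unreached": sorted(listed - crawled),   # in sitemap, not seen in crawl
    }
```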
### With seo-content
- Content extraction: feed clean markdown to E-E-A-T analysis
- Thin content detection: identify pages with < 300 words at scale
- Duplicate content: compare content across pages for near-duplicates
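The thin-content check is a word count over the extracted markdown. Note the count is approximate: code blocks, nav remnants, and alt text all skew it, and the 300-word cutoff is this document's heuristic rather than a ranking rule.

```python
def is_thin(markdown, threshold=300):
    """Flag thin content per the < 300-word heuristic above.
    Whitespace-split word counting on extracted markdown is approximate."""
    return len(markdown.split()) < threshold
```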
### With seo-schema
- Schema extraction: pull JSON-LD from all crawled pages
- Schema coverage: % of pages with structured data
- Schema validation: batch-validate extracted schemas
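Schema extraction means pulling `<script type="application/ld+json">` blocks out of each crawled page's HTML. This regex-based sketch works on well-formed markup; an HTML parser is more robust against attribute-order and whitespace variations.

```python
import json
import re

def extract_jsonld(html):
    """Pull JSON-LD blocks from crawled HTML (regex sketch)."""
    pattern = r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>'
    blocks = []
    for raw in re.findall(pattern, html, re.DOTALL | re.IGNORECASE):
        try:
            blocks.append(json.loads(raw))
        except json.JSONDecodeError:
            pass  # a real audit would count invalid blocks separately
    return blocks

html = '<script type="application/ld+json">{"@type": "Article"}</script>'
print(extract_jsonld(html))
# -> [{'@type': 'Article'}]
```

Schema coverage is then the fraction of crawled pages for which this returns a non-empty list.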
## Error Handling

| Cause | Resolution |
|---|---|
| MCP not configured | Run `./extensions/firecrawl/install.sh` |
| Credits exhausted | Check usage at firecrawl.dev/app, upgrade plan |
| Rate limited | Wait 60s, reduce crawl concurrency |
| Page too slow to render | Increase `timeout`, try without JS rendering |
| Site blocks crawling | Check robots.txt; may need to skip this site |
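The rate-limit resolution above can be wrapped in a retry helper. This is a sketch: `call` and the `RuntimeError` stand-in are placeholders, since the MCP client surfaces rate-limit errors in its own way.

```python
import time

def with_backoff(call, max_retries=3, base_delay=60):
    """Retry a rate-limited call with the 60s wait suggested above,
    doubling the delay each attempt. RuntimeError is a hypothetical
    stand-in for whatever error the MCP client actually raises."""
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the user
            time.sleep(base_delay * 2 ** attempt)
```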
Graceful fallback: If Firecrawl is unavailable, inform the user and suggest:

- Use `fetch_page.py` for single-page analysis (no API cost)
- Use the `WebFetch` tool for basic HTML retrieval
- Install Firecrawl: `./extensions/firecrawl/install.sh`