Personal_AI_Infrastructure scraping

Name: scraping
Author: danielmiessler

Web scraping via progressive escalation and social media platform scrapers. USE WHEN scraping, crawl, scrape URL, bot detection, CAPTCHA, spider.

install

source · Clone the upstream repo

git clone https://github.com/danielmiessler/Personal_AI_Infrastructure

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/danielmiessler/Personal_AI_Infrastructure "$T" && mkdir -p ~/.claude/skills && cp -r "$T/Releases/Pi/skills/scraping" ~/.claude/skills/danielmiessler-personal-ai-infrastructure-scraping-c60c22 && rm -rf "$T"

manifest: Releases/Pi/skills/scraping/SKILL.md

source content

Scraping Skill

Progressive Escalation

Direct fetch — curl/fetch with standard headers
Browser rendering — Headless browser for JS-heavy sites
Proxy rotation — For rate-limited or geo-restricted content
Specialized APIs — Platform-specific scrapers

Rules

Respect robots.txt
Rate limit requests
Handle pagination
Extract structured data
Store raw + processed versions