Personal_AI_Infrastructure scraping
Web scraping via progressive escalation and social media platform scrapers. USE WHEN scraping, crawl, scrape URL, bot detection, CAPTCHA, spider.
install
source · Clone the upstream repo
git clone https://github.com/danielmiessler/Personal_AI_Infrastructure
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/danielmiessler/Personal_AI_Infrastructure "$T" && mkdir -p ~/.claude/skills && cp -r "$T/Releases/Pi/skills/scraping" ~/.claude/skills/danielmiessler-personal-ai-infrastructure-scraping-c60c22 && rm -rf "$T"
manifest:
Releases/Pi/skills/scraping/SKILL.mdsource content
Scraping Skill
Progressive Escalation
- Direct fetch — curl/fetch with standard headers
- Browser rendering — Headless browser for JS-heavy sites
- Proxy rotation — For rate-limited or geo-restricted content
- Specialized APIs — Platform-specific scrapers
Rules
- Respect robots.txt
- Rate limit requests
- Handle pagination
- Extract structured data
- Store raw + processed versions