Personal_AI_Infrastructure scraping

Web scraping via progressive escalation and social media platform scrapers. USE WHEN scraping, crawl, scrape URL, bot detection, CAPTCHA, spider.

install
source · Clone the upstream repo
git clone https://github.com/danielmiessler/Personal_AI_Infrastructure
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/danielmiessler/Personal_AI_Infrastructure "$T" && mkdir -p ~/.claude/skills && cp -r "$T/Releases/Pi/skills/scraping" ~/.claude/skills/danielmiessler-personal-ai-infrastructure-scraping-c60c22 && rm -rf "$T"
manifest: Releases/Pi/skills/scraping/SKILL.md
source content

Scraping Skill

Progressive Escalation

  1. Direct fetch — curl/fetch with standard headers
  2. Browser rendering — Headless browser for JS-heavy sites
  3. Proxy rotation — For rate-limited or geo-restricted content
  4. Specialized APIs — Platform-specific scrapers

Rules

  • Respect robots.txt
  • Rate limit requests
  • Handle pagination
  • Extract structured data
  • Store raw + processed versions