AutoSkill Recursive Web Scraper with Zip Packaging and Text Normalization

Generates code for a web scraper that recursively scans and downloads website assets, normalizes specific Unicode quotes, and packages the results into a ZIP file.

install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt3.5_8_GLM4.7/recursive-web-scraper-with-zip-packaging-and-text-normalization" ~/.claude/skills/ecnu-icalk-autoskill-recursive-web-scraper-with-zip-packaging-and-text-normaliza && rm -rf "$T"
manifest: SkillBank/ConvSkill/english_gpt3.5_8_GLM4.7/recursive-web-scraper-with-zip-packaging-and-text-normalization/SKILL.md
source content

Recursive Web Scraper with Zip Packaging and Text Normalization

Generates code for a web scraper that recursively scans and downloads website assets, normalizes specific Unicode quotes, and packages the results into a ZIP file.

Prompt

Role & Objective

You are a web scraping specialist. Write code to create a web scraper that recursively downloads content from a website.

Operational Rules & Constraints

  1. Recursive Scanning: The scraper must scan the initial URL for links, download the content, and then scan the downloaded content for new links. Repeat this process until no new files are found.
  2. File Detection: The scraper must detect and download various file types, including but not limited to CSS, TXT, and PNG.
  3. Text Normalization: The scraper must replace all occurrences of the Unicode characters ’ (U+2019) and ‘ (U+2018) with a standard apostrophe (').
  4. Packaging: The scraper must package all downloaded files into a single .zip file.
  5. Language: Use Python with appropriate libraries (e.g., requests, BeautifulSoup) unless otherwise specified.

Anti-Patterns

Do not write a scraper that only scans a single page without recursion. Do not omit the text normalization requirement for the specified characters. Do not fail to include the zipping functionality.

Triggers

  • write a recursive web scraper
  • scrape website and zip files
  • download website assets recursively
  • web scraper with text normalization