AutoSkill Recursive Web Scraper with Zip Packaging and Text Normalization
Generates code for a web scraper that recursively scans and downloads website assets, normalizes specific Unicode quotes, and packages the results into a ZIP file.
install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt3.5_8_GLM4.7/recursive-web-scraper-with-zip-packaging-and-text-normalization" ~/.claude/skills/ecnu-icalk-autoskill-recursive-web-scraper-with-zip-packaging-and-text-normaliza && rm -rf "$T"
manifest:
SkillBank/ConvSkill/english_gpt3.5_8_GLM4.7/recursive-web-scraper-with-zip-packaging-and-text-normalization/SKILL.md
source content
Prompt
Role & Objective
You are a web scraping specialist. Write code to create a web scraper that recursively downloads content from a website.
Operational Rules & Constraints
- Recursive Scanning: The scraper must scan the initial URL for links, download the content, and then scan the downloaded content for new links. Repeat this process until no new files are found.
- File Detection: The scraper must detect and download various file types, including but not limited to CSS, TXT, and PNG.
- Text Normalization: The scraper must replace all occurrences of the Unicode characters ’ (U+2019) and ‘ (U+2018) with a standard apostrophe (').
- Packaging: The scraper must package all downloaded files into a single .zip file.
- Language: Use Python with appropriate libraries (e.g., requests, BeautifulSoup) unless otherwise specified.
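The recursive scanning and file detection rules above can be sketched as a breadth-first crawl over a work queue. This is a minimal illustration, not the skill's reference output: it uses only the standard library (html.parser and urllib) in place of requests and BeautifulSoup, and the names `extract_links`, `http_fetch`, `crawl`, the same-host restriction, and the `max_files` cap are all assumptions added for the example.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects href/src attribute values from HTML start tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


# Matches url(...) references inside CSS, with or without quotes.
CSS_URL = re.compile(r"url\(\s*['\"]?([^'\")]+)['\"]?\s*\)")


def extract_links(body, base_url):
    """Return the absolute URLs referenced by an HTML or CSS document."""
    try:
        text = body.decode("utf-8")
    except UnicodeDecodeError:
        return set()  # heuristic: undecodable content (e.g. PNG) is binary
    parser = LinkParser()
    parser.feed(text)
    raw = parser.links + CSS_URL.findall(text)
    # Resolve relative links against the page URL and drop #fragments.
    return {urldefrag(urljoin(base_url, link)).url for link in raw}


def http_fetch(url):
    """Default fetcher; requests.get(url).content would work equally well."""
    with urlopen(url, timeout=10) as resp:
        return resp.read()


def crawl(start_url, fetch=http_fetch, max_files=500):
    """Download start_url, then keep following new same-host links."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    files = {}  # maps URL -> raw bytes
    while queue and len(files) < max_files:
        url = queue.popleft()
        try:
            body = fetch(url)
        except Exception:
            continue  # skip URLs that cannot be downloaded
        files[url] = body
        for link in extract_links(body, url):
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return files
```

The loop ends when the queue is empty, which corresponds to "no new files are found". The `seen` set keeps pages that link to each other from being downloaded twice, and passing a custom `fetch` callable makes the crawl testable without network access.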
Anti-Patterns
- Do not write a scraper that scans only a single page without recursion.
- Do not omit the text normalization step for the specified characters (U+2018 and U+2019).
- Do not omit the ZIP packaging step.
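The normalization and packaging steps named above could look like the following sketch. It assumes the downloaded files are held in a dict mapping URL to raw bytes; the helper names (`normalize_quotes`, `archive_name`, `package`), the `TEXT_EXTENSIONS` list, and the choice to treat text files as UTF-8 are illustrative assumptions, not part of the skill.

```python
import io
import zipfile
from urllib.parse import urlparse

# Map both curly single quotes to a plain apostrophe.
QUOTE_MAP = str.maketrans({"\u2018": "'", "\u2019": "'"})

# Assumption: only these extensions are treated as text and normalized.
TEXT_EXTENSIONS = (".html", ".htm", ".css", ".txt", ".js")


def normalize_quotes(text):
    """Replace U+2018 and U+2019 with a standard apostrophe."""
    return text.translate(QUOTE_MAP)


def archive_name(url):
    """Derive a path inside the archive from the URL path."""
    path = urlparse(url).path.lstrip("/")
    if not path or path.endswith("/"):
        path += "index.html"  # directory-style URLs get a default name
    return path


def package(files, zip_target):
    """Write all files (URL -> bytes) into one ZIP, normalizing text files."""
    with zipfile.ZipFile(zip_target, "w", zipfile.ZIP_DEFLATED) as zf:
        for url, body in files.items():
            name = archive_name(url)
            if name.endswith(TEXT_EXTENSIONS):
                # Assumption: text files are UTF-8; binary files are untouched.
                text = body.decode("utf-8", errors="replace")
                body = normalize_quotes(text).encode("utf-8")
            zf.writestr(name, body)


if __name__ == "__main__":
    demo = {"http://example.test/": "<p>It\u2019s fine</p>".encode("utf-8")}
    buffer = io.BytesIO()  # a filename such as "site.zip" also works
    package(demo, buffer)
    with zipfile.ZipFile(buffer) as zf:
        print(zf.read("index.html").decode("utf-8"))
```

Binary assets such as PNG files are written unchanged, since rewriting their bytes would corrupt them. URLs that differ only in their query string map to the same archive name in this sketch, so a real scraper would need a collision-free naming scheme.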
Triggers
- write a recursive web scraper
- scrape website and zip files
- download website assets recursively
- web scraper with text normalization