PythonClaw summarize
Summarize or extract text from URLs, articles, PDFs, and local files. Use when: user asks to summarize a link, article, web page, or document, or asks 'what is this link about?'. NOT for: YouTube transcripts (use youtube skill), full document editing, or real-time news (use news skill).
install
source · Clone the upstream repo
git clone https://github.com/ericwang915/PythonClaw
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ericwang915/PythonClaw "$T" && mkdir -p ~/.claude/skills && cp -r "$T/pythonclaw/templates/skills/web/summarize" ~/.claude/skills/ericwang915-pythonclaw-summarize && rm -rf "$T"
manifest: `pythonclaw/templates/skills/web/summarize/SKILL.md`
Summarize
Extract and summarize content from URLs, articles, and local files.
When to Use
✅ USE this skill when:
- "Summarize this article/URL"
- "What's this link about?"
- "Give me the key points from this page"
- "TL;DR this document"
- "Extract the main content from this URL"
When NOT to Use
❌ DON'T use this skill when:
- YouTube video transcripts → use the `youtube` skill
- Real-time news search → use the `news` skill
- PDF text extraction → use the `pdf_reader` skill
- Web search for information → use the `web_search` or `tavily_search` tool
Usage
Summarize a URL
```shell
python {skill_path}/summarize_url.py "https://example.com/article"
```
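The same command can be driven from Python. A minimal sketch, assuming the Claude Code install path from the install step above; adjust `skill_path` to wherever the skill actually lives:

```python
import subprocess

# Hypothetical wrapper: build the command the skill expects and capture stdout.
# The path below matches the Claude Code install step; adjust for your setup.
skill_path = "~/.claude/skills/ericwang915-pythonclaw-summarize"
cmd = ["python", f"{skill_path}/summarize_url.py", "https://example.com/article"]

# Uncommented, this runs the script and prints the summary:
# result = subprocess.run(cmd, capture_output=True, text=True, check=True)
# print(result.stdout)
print(" ".join(cmd))
```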
Options
```shell
# Short summary (default)
python {skill_path}/summarize_url.py "https://example.com" --length short

# Detailed summary
python {skill_path}/summarize_url.py "https://example.com" --length long

# Extract text only (no summarization)
python {skill_path}/summarize_url.py "https://example.com" --extract-only

# JSON output
python {skill_path}/summarize_url.py "https://example.com" --format json
```
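With `--format json` the output can be consumed programmatically. A minimal sketch; the payload shape and field names (`url`, `title`, `summary`) are assumptions, not documented by the skill, so inspect the script's actual output before relying on them:

```python
import json

# Stand-in for the script's stdout; the field names here are assumed.
raw = '{"url": "https://example.com", "title": "Example", "summary": "A short summary."}'

data = json.loads(raw)
print(data["title"])    # page title
print(data["summary"])  # generated summary
```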
Quick Alternative (curl + readability)
For simple text extraction without the script:
```shell
curl -sL "https://example.com" | python -c "
import sys
from html.parser import HTMLParser

class S(HTMLParser):
    def __init__(s):
        super().__init__()
        s.t = []                 # collected text fragments
        s.skip = {'script', 'style', 'nav', 'footer', 'header'}
        s.s = set()              # currently-open skipped tags

    def handle_starttag(s, t, a):
        if t in s.skip:
            s.s.add(t)

    def handle_endtag(s, t):
        s.s.discard(t)

    def handle_data(s, d):
        if not s.s and d.strip():
            s.t.append(d.strip())

p = S()
p.feed(sys.stdin.read())
print('\n'.join(p.t[:100]))
"
```
Notes
- Requires `requests` and `beautifulsoup4` (installed by default)
- Some sites block automated access; try adding a User-Agent header
- For paywalled content, the script can only extract freely available text
- Large pages are truncated to avoid context overflow
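The User-Agent workaround mentioned in the notes can be sketched with the standard library alone; the URL and header string below are examples, not values the skill prescribes:

```python
import urllib.request

# Some sites reject Python's default User-Agent; send a browser-like one.
req = urllib.request.Request(
    "https://example.com",
    headers={"User-Agent": "Mozilla/5.0 (compatible; summarize-skill)"},
)
# Uncomment to actually fetch the page:
# html = urllib.request.urlopen(req).read().decode("utf-8", "replace")
print(req.get_header("User-agent"))
```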
Resources
| File | Description |
|---|---|
| `summarize_url.py` | Fetch, extract, and summarize web page content |