PythonClaw summarize

Summarize or extract text from URLs, articles, PDFs, and local files. Use when: user asks to summarize a link, article, web page, or document, or asks 'what is this link about?'. NOT for: YouTube transcripts (use youtube skill), full document editing, or real-time news (use news skill).

install
source · Clone the upstream repo
git clone https://github.com/ericwang915/PythonClaw
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ericwang915/PythonClaw "$T" && mkdir -p ~/.claude/skills && cp -r "$T/pythonclaw/templates/skills/web/summarize" ~/.claude/skills/ericwang915-pythonclaw-summarize && rm -rf "$T"
manifest: pythonclaw/templates/skills/web/summarize/SKILL.md
source content

Summarize

Extract and summarise content from URLs, articles, and local files.

When to Use

USE this skill when:

  • "Summarize this article/URL"
  • "What's this link about?"
  • "Give me the key points from this page"
  • "TL;DR this document"
  • "Extract the main content from this URL"

When NOT to Use

DON'T use this skill when:

  • YouTube video transcripts → use
    youtube
    skill
  • Real-time news search → use
    news
    skill
  • PDF text extraction → use
    pdf_reader
    skill
  • Web search for information → use
    tavily_search
    or
    web_search
    tool

Usage

Summarize a URL

python {skill_path}/summarize_url.py "https://example.com/article"

Options

# Short summary (default)
python {skill_path}/summarize_url.py "https://example.com" --length short

# Detailed summary
python {skill_path}/summarize_url.py "https://example.com" --length long

# Extract text only (no summarization)
python {skill_path}/summarize_url.py "https://example.com" --extract-only

# JSON output
python {skill_path}/summarize_url.py "https://example.com" --format json

Quick Alternative (curl + readability)

For simple text extraction without the script:

curl -sL "https://example.com" | python -c "
import sys
from html.parser import HTMLParser
class S(HTMLParser):
    def __init__(s): super().__init__(); s.t=[]; s.skip={'script','style','nav','footer','header'}; s.s=set()
    def handle_starttag(s,t,a): (s.s.add(t) if t in s.skip else None)
    def handle_endtag(s,t): s.s.discard(t)
    def handle_data(s,d): (s.t.append(d.strip()) if not s.s and d.strip() else None)
p=S(); p.feed(sys.stdin.read()); print('\n'.join(p.t[:100]))
"

Notes

  • Requires
    requests
    and
    beautifulsoup4
    (installed by default)
  • Some sites block automated access — try adding a User-Agent header
  • For paywalled content, the script can only extract freely available text
  • Large pages are truncated to avoid context overflow

Resources

FileDescription
summarize_url.py
Fetch, extract, and summarise web page content