Awesome-openclaw-skills pymupdf-pdf
Fast local PDF parsing with PyMuPDF (fitz) for Markdown/JSON outputs and optional images/tables. Use when speed matters more than robustness, or as a fallback while heavier parsers are unavailable. Default to single-PDF parsing with per-document output folders.
install
source · Clone the upstream repo
git clone https://github.com/sundial-org/awesome-openclaw-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/sundial-org/awesome-openclaw-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/pymupdf-pdf" ~/.claude/skills/sundial-org-awesome-openclaw-skills-pymupdf-pdf && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/sundial-org/awesome-openclaw-skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/pymupdf-pdf" ~/.openclaw/skills/sundial-org-awesome-openclaw-skills-pymupdf-pdf && rm -rf "$T"
manifest: skills/pymupdf-pdf/SKILL.md
source content
PyMuPDF PDF
Overview
Parse PDFs locally using PyMuPDF for fast, lightweight extraction into Markdown by default, with optional JSON and image/table outputs in a per-document directory.
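As a rough illustration of the extraction described above, the sketch below reads plain text page by page with PyMuPDF and joins it into one Markdown string. It is not the skill's script: the function names and the "## Page N" heading layout are assumptions made for illustration only.

```python
from typing import Iterable, List


def extract_page_texts(pdf_path: str) -> List[str]:
    """Return the plain text of each page, in page order."""
    # PyMuPDF is imported lazily so the formatting helper below
    # can be used (and tested) without the library installed.
    import fitz  # PyMuPDF

    with fitz.open(pdf_path) as doc:
        return [page.get_text("text") for page in doc]


def pages_to_markdown(page_texts: Iterable[str]) -> str:
    """Join page texts into Markdown (hypothetical layout, one heading per page)."""
    sections = []
    for number, text in enumerate(page_texts, start=1):
        sections.append(f"## Page {number}\n\n{text.strip()}\n")
    return "\n".join(sections)
```

Typical use would be `pages_to_markdown(extract_page_texts("/path/to/file.pdf"))`; the skill's actual Markdown layout may differ.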
Prereqs / when to read references
If you hit import errors (PyMuPDF not installed) or Nix libstdc++ issues, read references/pymupdf-notes.md.
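Before running the parser, a quick pre-flight check can tell whether PyMuPDF is installed at all. The sketch below is an assumption-laden helper, not part of the skill: `find_importable` is a hypothetical name, and it only locates the module without importing it, so it will not catch a libstdc++ loading failure.

```python
import importlib.util
from typing import Optional, Sequence


def find_importable(names: Sequence[str] = ("pymupdf", "fitz")) -> Optional[str]:
    """Return the first module name that can be located, else None.

    PyMuPDF is importable as "fitz" and, in recent releases, as "pymupdf".
    """
    for name in names:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    found = find_importable()
    if found is None:
        print("PyMuPDF not found; see references/pymupdf-notes.md")
    else:
        print(f"PyMuPDF importable as '{found}'")
```

If the module is found but `import fitz` still fails, the shared-library case described in the notes file is the more likely cause.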
Quick start (single PDF)
# Run from the skill directory
./scripts/pymupdf_parse.py /path/to/file.pdf \
  --format md \
  --outroot ./pymupdf-output
Options
- --format md|json|both (default: md)
- --images to extract images
- --tables to extract a simple line-based table JSON (quick/rough)
- --outroot DIR to change output root
- --lang adds a language hint into JSON output metadata
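For readers who want to see how the flags above fit together, here is a minimal argparse sketch that mirrors them. It is not the real parser in scripts/pymupdf_parse.py; the positional name `pdf`, the help strings, and the assumption that `--lang` takes a value are all illustrative guesses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Parse one PDF with PyMuPDF")
    parser.add_argument("pdf", help="path to the input PDF")
    parser.add_argument("--format", choices=["md", "json", "both"], default="md")
    parser.add_argument("--images", action="store_true", help="extract images")
    parser.add_argument("--tables", action="store_true",
                        help="extract rough line-based tables")
    parser.add_argument("--outroot", default="./pymupdf-output", metavar="DIR")
    # Assumption: --lang takes a value such as "en".
    parser.add_argument("--lang", default=None,
                        help="language hint for JSON metadata")
    return parser


if __name__ == "__main__":
    # Demo with a fixed argument list instead of sys.argv.
    print(build_parser().parse_args(["file.pdf", "--format", "both", "--images"]))
```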
Output conventions
- Create ./pymupdf-output/<pdf-basename>/ by default.
- Markdown output: output.md
- JSON output: output.json (includes lang)
- Images: images/ subdir
- Tables: tables.json (rough line-based)
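The layout above can be sketched as a small path helper. This is an illustration under stated assumptions, not the skill's code: `output_paths` is a hypothetical name, and it assumes <pdf-basename> means the file name without its .pdf extension.

```python
from pathlib import Path
from typing import Dict


def output_paths(pdf_path: str, outroot: str = "./pymupdf-output") -> Dict[str, Path]:
    """Map one PDF to its per-document output locations."""
    # Assumption: the per-document folder is named after the PDF's stem.
    doc_dir = Path(outroot) / Path(pdf_path).stem
    return {
        "dir": doc_dir,
        "markdown": doc_dir / "output.md",
        "json": doc_dir / "output.json",
        "images": doc_dir / "images",
        "tables": doc_dir / "tables.json",
    }
```

Keeping every artifact under one per-document folder means two PDFs parsed into the same output root do not overwrite each other, provided their base names differ.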
Notes
- PyMuPDF is fast but less robust on complex PDFs.
- For more robust parsing, use a heavy-duty OCR parser (e.g., MinerU) if installed.