Awesome-openclaw-skills mineru-pdf
Parse PDFs locally (CPU) into Markdown/JSON using MinerU. Assumes MinerU creates per‑doc output folders; supports table/image extraction.
install
source · Clone the upstream repo
git clone https://github.com/sundial-org/awesome-openclaw-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/sundial-org/awesome-openclaw-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/mineru-pdf" ~/.claude/skills/sundial-org-awesome-openclaw-skills-mineru-pdf && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/sundial-org/awesome-openclaw-skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/mineru-pdf" ~/.openclaw/skills/sundial-org-awesome-openclaw-skills-mineru-pdf && rm -rf "$T"
manifest:
skills/mineru-pdf/SKILL.md
source content
MinerU PDF
Overview
Parse a PDF locally with MinerU (CPU). Default output is Markdown + JSON. Enable table and image extraction (--tables, --images) only when requested.
Quick start (single PDF)
# Run from the skill directory
./scripts/mineru_parse.sh /path/to/file.pdf
Optional examples:
./scripts/mineru_parse.sh /path/to/file.pdf --format json
./scripts/mineru_parse.sh /path/to/file.pdf --tables --images
When to read references
If flags differ from your wrapper or you need advanced defaults (backend/method/device/threads/format mapping), read:
references/mineru-cli.md
Output conventions
- Output root defaults to ../mineru-output/
- MinerU creates the per-document subfolder under the output root (e.g., ../mineru-output/<basename>/...)
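To find the results of a run, the convention above can be turned into a small path helper. This is a minimal sketch: the function name expected_output_dir is hypothetical, and it assumes <basename> means the PDF file name without its .pdf extension, which the skill does not state explicitly. Check the actual folder name after a first run.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill): guess the per-document
# output folder for a given PDF.
# Assumption: the subfolder is named after the PDF without its .pdf extension.
expected_output_dir() {
  local pdf="$1"
  local root="${2:-../mineru-output}"   # default output root from the conventions above
  local name
  name="$(basename "$pdf" .pdf)"        # strip the directory part and the .pdf suffix
  printf '%s/%s\n' "${root%/}" "$name"  # drop a trailing slash on the root before joining
}

expected_output_dir /path/to/file.pdf
# prints ../mineru-output/file
```

A second argument overrides the output root, for example expected_output_dir /path/to/file.pdf /data/out.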
Batching
Default is single-PDF parsing. Only implement batch folder parsing if explicitly requested.
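If batch parsing is explicitly requested, one simple approach is to call the single-PDF wrapper once per file. The sketch below is an assumption, not something the skill ships: the function name parse_folder is hypothetical, and the only thing taken from the skill is the wrapper path used in the quick start.

```shell
#!/usr/bin/env bash
# Hypothetical batch sketch: run the single-PDF wrapper once per PDF in a folder.
# The second argument overrides the wrapper, which is useful for dry runs.
parse_folder() {
  local dir="$1"
  local parser="${2:-./scripts/mineru_parse.sh}"  # wrapper path from the quick start
  local pdf
  for pdf in "$dir"/*.pdf; do
    [ -e "$pdf" ] || continue                     # folder has no PDFs: skip the literal glob
    "$parser" "$pdf" || echo "failed: $pdf" >&2   # report the failure and keep going
  done
}
```

Passing echo as the second argument, as in parse_folder /path/to/pdfs echo, lists the files that would be parsed without running MinerU.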