Awesome-openclaw-skills mineru-pdf

Parse PDFs locally (CPU) into Markdown/JSON using MinerU. Assumes MinerU creates per‑doc output folders; supports table/image extraction.

install
source · Clone the upstream repo
git clone https://github.com/sundial-org/awesome-openclaw-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/sundial-org/awesome-openclaw-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/mineru-pdf" ~/.claude/skills/sundial-org-awesome-openclaw-skills-mineru-pdf && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/sundial-org/awesome-openclaw-skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/mineru-pdf" ~/.openclaw/skills/sundial-org-awesome-openclaw-skills-mineru-pdf && rm -rf "$T"
manifest: skills/mineru-pdf/SKILL.md
source content

MinerU PDF

Overview

Parse a PDF locally with MinerU (CPU). Default output is Markdown + JSON. Enable table and image extraction (--tables, --images) only when requested.

Quick start (single PDF)

# Run from the skill directory
./scripts/mineru_parse.sh /path/to/file.pdf

Optional examples:

./scripts/mineru_parse.sh /path/to/file.pdf --format json
./scripts/mineru_parse.sh /path/to/file.pdf --tables --images

When to read references

If flags differ from your wrapper or you need advanced defaults (backend/method/device/threads/format mapping), read:

  • references/mineru-cli.md

Output conventions

  • Output root defaults to ./mineru-output/.
  • MinerU creates the per-document subfolder under the output root (e.g., ./mineru-output/<basename>/...).
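
To locate results after a run, the convention above can be turned into a small helper. This is only a sketch: expected_output_dir is a hypothetical name, not part of the skill, and it assumes that <basename> means the PDF file name with its .pdf extension removed. Check the actual folder MinerU creates before relying on it.

```shell
# Hypothetical helper (not shipped with the skill): print the folder where
# output for a given PDF is expected to land.
# Assumption: <basename> is the file name without the ".pdf" extension.
expected_output_dir() {
  pdf_path=$1
  out_root=${2:-./mineru-output}   # default output root from the conventions above
  base=$(basename "$pdf_path")     # drop leading directories
  base=${base%.pdf}                # drop a trailing ".pdf", if present
  printf '%s/%s\n' "${out_root%/}" "$base"
}

expected_output_dir /path/to/file.pdf
```

A second argument overrides the output root, for example expected_output_dir /path/to/file.pdf /data/out.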

Batching

Default is single-PDF parsing. Only implement batch folder parsing if explicitly requested.
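
If batch parsing is explicitly requested, one possible shape is a thin loop around the single-PDF wrapper. The sketch below is an assumption, not part of the skill: parse_folder and the MINERU_PARSE_CMD override are hypothetical names, and it only uses the ./scripts/mineru_parse.sh invocation shown in the quick start.

```shell
# Hypothetical batch helper (not shipped with the skill): parse every *.pdf
# in a folder, one file at a time, using the single-PDF wrapper.
parse_folder() {
  dir=$1
  # MINERU_PARSE_CMD is a hypothetical override, useful for dry runs.
  parse_cmd=${MINERU_PARSE_CMD:-./scripts/mineru_parse.sh}
  failed=0
  for pdf in "$dir"/*.pdf; do
    [ -e "$pdf" ] || continue        # the glob matched nothing
    if ! "$parse_cmd" "$pdf"; then
      echo "failed: $pdf" >&2        # report the failure and keep going
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]                # non-zero status if any PDF failed
}

# Usage (run from the skill directory):
#   parse_folder /path/to/pdfs
```

Processing files sequentially keeps the behaviour identical to repeated single-PDF runs, and one failing PDF does not stop the rest of the folder.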