Openakita openakita/skills@baidu-paddleocr-doc

PaddleOCR document parsing skill based on PaddleOCR-VL-1.5. Provides SOTA-level document understanding with high-precision recognition and parsing. Use it when the user needs to parse, extract, or understand document content.

install
source · Clone the upstream repo
git clone https://github.com/openakita/openakita
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/openakita/openakita "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/baidu-paddleocr-doc" ~/.claude/skills/openakita-openakita-openakita-skills-baidu-paddleocr-doc && rm -rf "$T"
manifest: skills/baidu-paddleocr-doc/SKILL.md
source content

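Wenxin (ERNIE) Derivative · PaddleOCR Document Parsing

Built on PaddleOCR-VL-1.5, a SOTA document parsing model, this skill gives an Agent "eyes" for recognizing and parsing documents with very high precision.

Configuration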

export BAIDU_API_KEY="your_key"
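
The export above sets BAIDU_API_KEY, while the bundled script described further down names BAIDU_OCR_AK and BAIDU_OCR_SK. A minimal sketch of a pre-flight check that reports which of these variables are unset might look like the following. The helper name `missing_env_vars` and the grouping constants are hypothetical and not part of the skill; which variables a given entry point actually reads is an assumption to verify against the script.

```python
import os

# Variable names are taken from this README. Which set a given entry point
# actually reads is an assumption; check the script before relying on it.
API_KEY_VARS = ("BAIDU_API_KEY",)
OCR_SCRIPT_VARS = ("BAIDU_OCR_AK", "BAIDU_OCR_SK")


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in env (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    # Report the status of each group of variables without printing any secrets.
    for group in (API_KEY_VARS, OCR_SCRIPT_VARS):
        missing = missing_env_vars(group)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{', '.join(group)} -> {status}")
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.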

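Features

  • Document structure recognition
  • Table extraction and reconstruction
  • Formula recognition
  • Mixed text-and-image layout parsing
  • Multilingual document support

Bundled scripts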

scripts/baidu_ocr_doc.py

Baidu document and table OCR. Requires BAIDU_OCR_AK and BAIDU_OCR_SK to be set.

python3 scripts/baidu_ocr_doc.py doc /path/to/document.jpg
python3 scripts/baidu_ocr_doc.py table /path/to/table.png
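
The two commands above could also be driven from Python, for example when an agent needs to call the script programmatically. The sketch below only assumes the command-line shape shown above (a `doc` or `table` subcommand followed by an image path). The helper names `build_command` and `run_ocr` are hypothetical, and the script's output format is not documented here, so stdout is returned as raw text.

```python
import subprocess
import sys
from pathlib import Path

SCRIPT = Path("scripts/baidu_ocr_doc.py")  # path as listed in this README
MODES = ("doc", "table")  # the two subcommands shown above


def build_command(mode, image_path, python=sys.executable):
    """Build the argument list for one OCR run, rejecting unknown modes."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {mode!r}")
    return [python, str(SCRIPT), mode, str(image_path)]


def run_ocr(mode, image_path):
    """Run the script and return its stdout as text.

    Raises subprocess.CalledProcessError if the script exits non-zero.
    The structure of the output is an unknown here, so it is not parsed.
    """
    result = subprocess.run(
        build_command(mode, image_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `run_ocr("table", "/path/to/table.png")` from the skill directory would then be equivalent to the second command above, provided the required environment variables are set.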