Skills general-ocr-struct
General-purpose offline OCR and post-processing for Chinese/English screenshots, scanned images, receipts, tables, chat screenshots, statement screenshots, and other text-heavy images. Use when you need to: (1) extract text from an image locally, (2) return raw OCR text before interpretation, (3) clean broken OCR lines into structured content, (4) reorganize recognized text into rows/fields for downstream use, or (5) separate recognition from later table entry, summarization, or document drafting.
install
source · Clone the upstream repo
git clone https://github.com/openclaw/skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/9penny/general-ocr-struct" ~/.claude/skills/openclaw-skills-general-ocr-struct && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/9penny/general-ocr-struct" ~/.openclaw/skills/openclaw-skills-general-ocr-struct && rm -rf "$T"
manifest:
skills/9penny/general-ocr-struct/SKILL.md
source content
General OCR Struct
Use this skill to separate OCR recognition from downstream content cleanup and organization.
Workflow
- Run the local OCR script on the image first.
- Return the raw OCR text before making business interpretations when accuracy matters.
- If the image is a transaction-detail screenshot, run structuring mode to group rows into fields.
- Mark uncertain fields explicitly as 待确认 ("to be confirmed"); do not guess missing content.
- Only after the user confirms recognition quality, use the result for tables, summaries, or documents.
Commands
Raw OCR
python3 scripts/general_ocr.py raw /path/to/image.jpg
Structured transaction extraction
python3 scripts/general_ocr.py transactions /path/to/image.jpg
JSON output
python3 scripts/general_ocr.py transactions /path/to/image.jpg --json
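The three invocations above can also be driven from Python. The following is a minimal sketch, not part of the skill itself: the helper names build_command and run_ocr are hypothetical, and it assumes the script prints its result to standard output. Because --json is only shown together with transactions, the sketch refuses to combine it with raw rather than guess whether that is supported.

```python
import subprocess

# Script path exactly as it appears in the commands above.
SCRIPT = "scripts/general_ocr.py"


def build_command(mode, image_path, as_json=False):
    """Build the argument list for one of the documented invocations.

    Hypothetical helper: only the "raw" and "transactions" modes and the
    "--json" flag come from the skill's documented commands.
    """
    if mode not in ("raw", "transactions"):
        raise ValueError(f"unsupported mode: {mode!r}")
    if as_json and mode != "transactions":
        # --json is only documented with the transactions mode.
        raise ValueError("--json is only documented for transactions mode")
    cmd = ["python3", SCRIPT, mode, image_path]
    if as_json:
        cmd.append("--json")
    return cmd


def run_ocr(mode, image_path, as_json=False):
    """Run the script and return its stdout (assumes results go to stdout)."""
    result = subprocess.run(
        build_command(mode, image_path, as_json),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
    return result.stdout
```

Calling run_ocr("raw", "/path/to/image.jpg") first and showing that text to the user before calling the transactions mode matches the order described in the workflow.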
Output rules
- Prefer showing the recognition result first, then the cleaned structure.
- Preserve source wording where possible.
- For uncertain content, use 待确认 instead of inferring.
- Adapt the structure to the source image type. For statement-like screenshots, common fields are: card_last4, date, time, currency, merchant, amount.
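The rule about uncertain content can be illustrated with a short sketch. It assumes each recognized row is a plain dict keyed by the statement fields listed above; that row shape and the normalize_row helper are assumptions for illustration, not the script's actual output format. Recognized text is kept as-is apart from trimming surrounding whitespace, and anything missing or blank becomes 待确认 rather than a guess.

```python
# Field names taken from the statement-like screenshot example above.
FIELDS = ("card_last4", "date", "time", "currency", "merchant", "amount")

# Marker the skill uses for content that still needs confirmation.
UNCERTAIN = "待确认"


def normalize_row(row):
    """Return a dict containing every expected field.

    Hypothetical helper: missing, None, or blank values are replaced with
    the uncertainty marker; recognized text is preserved (only trimmed).
    """
    out = {}
    for name in FIELDS:
        value = row.get(name)
        text = "" if value is None else str(value).strip()
        out[name] = text if text else UNCERTAIN
    return out


# Made-up sample row in which time and currency were not recognized.
sample = {"card_last4": "1234", "date": "2024-01-05", "merchant": "", "amount": " 12.50 "}
print(normalize_row(sample))
```

Keeping the marker as a plain string means it survives unchanged in tables, JSON, or drafted documents until the user confirms or corrects the field.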
Notes
- This skill uses RapidOCR locally.
- The first run may require installing Python packages; after setup, the skill runs offline.
- If OCR quality is weak, request a higher-resolution original screenshot before doing deeper cleanup.