Chatgpt-skills ocr-document-processor
Extract text and structure from scans, images, and scanned PDFs. Use for OCR, searchable PDFs, table extraction, receipt parsing, and business card parsing.
install
source · Clone the upstream repo
git clone https://github.com/dkyazzentwatwa/chatgpt-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/dkyazzentwatwa/chatgpt-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/ocr-document-processor" ~/.claude/skills/dkyazzentwatwa-chatgpt-skills-ocr-document-processor && rm -rf "$T"
manifest:
ocr-document-processor/SKILL.md
source content
OCR Document Processor
Handle OCR-heavy inputs where text must be recovered from images or scanned pages.
Use This For
- OCR on images and scanned PDFs
- Searchable PDF export
- Structured extraction to text, markdown, JSON, or HTML
- Table extraction from scanned material
- Receipt parsing and business card parsing
Workflow
- Decide whether plain OCR, structured extraction, or document-specific parsing is needed.
- Preprocess noisy inputs before extraction when skew, blur, or shadows are present.
- Use scripts/ocr_processor.py for core OCR tasks.
- Use the focused helpers when the input is specialized:
  - scripts/business_card_scanner.py
  - scripts/receipt_scanner.py
- Return confidence caveats when the source is low quality, rotated, handwritten, or multilingual.
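The routing decision in this workflow can be sketched as a small lookup. This is a minimal illustration, not the skill's own code: the function name `pick_script` and the input labels (`"receipt"`, `"business_card"`) are hypothetical, and only the three script paths come from the skill text. How each script is invoked is not specified here, so the sketch stops at choosing a path.

```python
# Hypothetical routing helper. Only the script paths come from the skill text;
# the labels and the function name are assumptions made for illustration.
SPECIALIZED_SCRIPTS = {
    "receipt": "scripts/receipt_scanner.py",
    "business_card": "scripts/business_card_scanner.py",
}
CORE_SCRIPT = "scripts/ocr_processor.py"


def pick_script(input_kind: str) -> str:
    """Return the script path suited to the given kind of input.

    Specialized inputs go to their focused helper; anything else
    falls back to the core OCR script.
    """
    key = input_kind.strip().lower()
    return SPECIALIZED_SCRIPTS.get(key, CORE_SCRIPT)


print(pick_script("receipt"))
print(pick_script("scanned_pdf"))
```

Keeping the fallback on the core script means an unrecognized input still gets plain OCR rather than failing outright.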
Guardrails
- Prefer explicit language selection when accuracy matters.
- Do not claim fields are exact when OCR confidence is weak.
- Route non-scanned digital PDFs to document-converter-suite instead of OCR by default.
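One way to honor the confidence guardrail is to flag any extracted field whose OCR confidence falls below a cutoff, so it is reported with a caveat rather than as exact. The sketch below assumes each field arrives as a (value, confidence) pair with confidence on a 0 to 100 scale. The field format, the name `annotate_fields`, the cutoff of 80, and the sample data are all illustrative assumptions, not output of the skill's scripts.

```python
from typing import Dict, Tuple

# Assumed cutoff on a 0-100 scale; a real value would depend on source quality.
LOW_CONFIDENCE_CUTOFF = 80.0


def annotate_fields(
    fields: Dict[str, Tuple[str, float]],
    cutoff: float = LOW_CONFIDENCE_CUTOFF,
) -> Dict[str, dict]:
    """Attach a needs_review flag to every field below the confidence cutoff."""
    annotated = {}
    for name, (value, confidence) in fields.items():
        annotated[name] = {
            "value": value,
            "confidence": confidence,
            "needs_review": confidence < cutoff,
        }
    return annotated


# Made-up sample input, for illustration only.
sample = {"total": ("42.10", 96.5), "merchant": ("ACME C0rp", 61.0)}
result = annotate_fields(sample)
for name, info in result.items():
    caveat = " (low confidence, verify manually)" if info["needs_review"] else ""
    print(f"{name}: {info['value']}{caveat}")
```

Carrying the flag alongside the value lets downstream output (text, markdown, JSON, or HTML) decide how to word the caveat.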