Full-stack-skills ocrmypdf-api
OCRmyPDF Python API and plugin skill — use OCRmyPDF programmatically from Python, integrate with applications, and extend with plugins (EasyOCR, PaddleOCR, AppleOCR). Use when the user needs to call OCRmyPDF from Python code, build OCR pipelines, or use alternative OCR engines.
git clone https://github.com/partme-ai/full-stack-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/partme-ai/full-stack-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/ocrmypdf-skills/ocrmypdf-api" ~/.claude/skills/partme-ai-full-stack-skills-ocrmypdf-api && rm -rf "$T"
OCRmyPDF: Python API & Plugins Guide
Overview
OCRmyPDF provides a Python API for programmatic use and a plugin interface for extending or replacing OCR engines. This skill covers the Python API, integration patterns, and the plugin ecosystem.
For CLI usage, see the ocrmypdf skill. For batch scripting, see ocrmypdf-batch.
Python API
Basic usage
    import ocrmypdf

    # Basic OCR
    exit_code = ocrmypdf.ocr('input.pdf', 'output.pdf')

    # With options
    exit_code = ocrmypdf.ocr(
        'input.pdf',
        'output.pdf',
        language='eng+fra',
        deskew=True,
        rotate_pages=True,
        skip_text=True,
        optimize=2,
        jobs=4,
    )
Return codes
    import ocrmypdf

    result = ocrmypdf.ocr('input.pdf', 'output.pdf')

    if result == ocrmypdf.ExitCode.ok:
        print("OCR completed successfully")
    elif result == ocrmypdf.ExitCode.already_done_ocr:
        print("PDF already has OCR text")
    elif result == ocrmypdf.ExitCode.input_file:
        print("Input file issue")
    else:
        print(f"Error: {result}")
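Depending on the installed OCRmyPDF version, some failures may surface as raised exceptions rather than as a returned exit code. The sketch below wraps the call so that both paths are reported the same way. `run_ocr` and its `ocr_func` parameter are hypothetical names introduced for illustration. The sketch assumes only that `ocrmypdf.ocr` exists and that OCRmyPDF's own exceptions carry an `exit_code` attribute.

```python
from typing import Any, Callable, Optional, Tuple


def run_ocr(
    input_pdf: str,
    output_pdf: str,
    ocr_func: Optional[Callable[..., Any]] = None,
    **options: Any,
) -> Tuple[bool, str]:
    """Run OCR and return (succeeded, message) instead of raising.

    ocr_func is a hypothetical injection point for testing; by default
    ocrmypdf.ocr is imported lazily and used.
    """
    if ocr_func is None:
        import ocrmypdf  # imported lazily so this helper loads without it

        ocr_func = ocrmypdf.ocr

    try:
        result = ocr_func(input_pdf, output_pdf, **options)
    except Exception as exc:  # broad on purpose: report, do not crash
        # Assumption: OCRmyPDF's own exceptions expose an `exit_code`.
        code = getattr(exc, "exit_code", None)
        label = getattr(code, "name", None) or type(exc).__name__
        return False, f"{label}: {exc}"

    # ExitCode.ok compares equal to 0.
    if result == 0:
        return True, "ok"
    return False, f"exit code {getattr(result, 'name', result)}"
```

A caller can then branch on the boolean and log the message, for example `ok, message = run_ocr('input.pdf', 'output.pdf', language='eng')`.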
Common API parameters
| Parameter | Type | Description |
|---|---|---|
| `language` | str | Tesseract language(s), e.g. `'eng+fra'` |
| `deskew` | bool | Straighten crooked pages |
| `rotate_pages` | bool | Auto-rotate pages |
| `skip_text` | bool | Skip pages that already have text |
| `force_ocr` | bool | Force OCR on all pages |
| `redo_ocr` | bool | Replace existing OCR |
| `optimize` | int | Optimization level (0-3) |
| `output_type` | str | `'pdfa'`, `'pdf'`, `'pdfa-1'`, `'pdfa-2'` |
| `jobs` | int | Number of parallel workers |
| `sidecar` | str | Path for sidecar text file |
| `image_dpi` | int | DPI for image inputs |
| `clean` | bool | Clean pages with unpaper (OCR only) |
| `clean_final` | bool | Clean pages and use in output |
| `remove_background` | bool | Remove noisy backgrounds |
| `oversample` | int | Oversample DPI for low-res images |
| `pages` | str | Page range, e.g. `'1-3,5'` |
| `title` | str | Output PDF title |
| `author` | str | Output PDF author |
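The skip, force, and redo options in the table describe conflicting ways to treat pages that already contain text, and OCRmyPDF accepts only one of them per run. A minimal sketch of a pre-flight check that catches the conflict before any OCR work starts follows. `build_ocr_options` is a hypothetical helper and not part of OCRmyPDF.

```python
from typing import Any, Dict

# The three text-handling modes; only one may be enabled at a time.
_EXCLUSIVE_MODES = ("skip_text", "force_ocr", "redo_ocr")


def build_ocr_options(**options: Any) -> Dict[str, Any]:
    """Validate a few options and return kwargs for ocrmypdf.ocr()."""
    enabled = [name for name in _EXCLUSIVE_MODES if options.get(name)]
    if len(enabled) > 1:
        raise ValueError(f"choose only one of: {', '.join(enabled)}")

    optimize = options.get("optimize")
    if optimize is not None and optimize not in range(4):
        raise ValueError("optimize must be 0, 1, 2 or 3")

    # Drop unset values so OCRmyPDF's own defaults apply.
    return {key: value for key, value in options.items() if value is not None}
```

The result can be unpacked straight into the call, for example `ocrmypdf.ocr('input.pdf', 'output.pdf', **build_ocr_options(language='eng', skip_text=True))`.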
Integration example: Flask web service
    from flask import Flask, request, send_file
    import ocrmypdf
    import tempfile
    import os

    app = Flask(__name__)


    @app.route('/ocr', methods=['POST'])
    def ocr_endpoint():
        """OCR a PDF via HTTP POST."""
        if 'file' not in request.files:
            return {'error': 'No file uploaded'}, 400

        uploaded = request.files['file']

        # Save the upload to a temporary file that outlives the with block
        with tempfile.NamedTemporaryFile(suffix='.pdf', delete=False) as inp:
            uploaded.save(inp.name)

        out_path = inp.name.replace('.pdf', '_ocr.pdf')

        try:
            result = ocrmypdf.ocr(
                inp.name,
                out_path,
                language='eng',
                skip_text=True,
                optimize=2,
            )
            if result == ocrmypdf.ExitCode.ok:
                return send_file(out_path, as_attachment=True,
                                 download_name='ocr_output.pdf')
            return {'error': f'OCR failed: {result}'}, 500
        finally:
            # Remove both temporary files whether or not OCR succeeded
            os.unlink(inp.name)
            if os.path.exists(out_path):
                os.unlink(out_path)


    if __name__ == '__main__':
        app.run(port=5000)
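The endpoint above derives its output path with `str.replace`, which rewrites every `.pdf` occurrence in the path, including one inside a directory name. A small sketch of a stricter alternative based on `pathlib` follows. `ocr_output_path` is a hypothetical helper name.

```python
from pathlib import Path


def ocr_output_path(input_path: str, suffix: str = "_ocr") -> str:
    """Return a sibling path such as 'scan_ocr.pdf' for 'scan.pdf'."""
    path = Path(input_path)
    # with_name only touches the final path component, so '.pdf' inside
    # a directory name is left alone.
    return str(path.with_name(f"{path.stem}{suffix}.pdf"))
```

In the endpoint, `out_path = ocr_output_path(inp.name)` would be a drop-in replacement for the `replace` call.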
Streamlit web UI
OCRmyPDF provides an optional Streamlit-based web UI:
    # Quote the extra so shells such as zsh do not expand the brackets
    pip install "ocrmypdf[webservice]"

    # See OCRmyPDF docs for launching the web service
Plugin Ecosystem
OCRmyPDF's plugin interface allows replacing the OCR engine. Available plugins:
OCRmyPDF-EasyOCR
Replaces Tesseract with EasyOCR (PyTorch-based). GPU strongly recommended.
    pip install ocrmypdf-easyocr

    # Usage
    ocrmypdf --plugin ocrmypdf_easyocr -l en input.pdf output.pdf
OCRmyPDF-PaddleOCR
Replaces Tesseract with PaddleOCR. Powerful GPU-accelerated engine.
    pip install ocrmypdf-paddleocr

    # Usage
    ocrmypdf --plugin ocrmypdf_paddleocr input.pdf output.pdf
OCRmyPDF-AppleOCR
Replaces Tesseract with Apple Vision Framework. macOS only.
    pip install ocrmypdf-appleocr

    # Usage
    ocrmypdf --plugin ocrmypdf_appleocr input.pdf output.pdf
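The plugin commands above use the CLI. From Python, `ocrmypdf.ocr` takes a `plugins` argument that plays the same role as `--plugin`. The sketch below maps an engine name to the module names used in those commands. `plugin_options` and the `ENGINE_PLUGINS` table are hypothetical helpers introduced for illustration.

```python
from typing import Any, Dict

# Module names as passed to --plugin in the commands above.
ENGINE_PLUGINS = {
    "easyocr": "ocrmypdf_easyocr",
    "paddleocr": "ocrmypdf_paddleocr",
    "appleocr": "ocrmypdf_appleocr",
}


def plugin_options(engine: str = "tesseract") -> Dict[str, Any]:
    """Return keyword arguments that select an OCR engine plugin."""
    if engine == "tesseract":
        return {}  # built-in engine, no plugin needed
    try:
        return {"plugins": [ENGINE_PLUGINS[engine]]}
    except KeyError:
        raise ValueError(f"unknown OCR engine: {engine!r}") from None
```

For example, `ocrmypdf.ocr('input.pdf', 'output.pdf', **plugin_options('easyocr'))` would request the EasyOCR plugin, provided the plugin package is installed.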
paperless-ngx Integration
paperless-ngx uses OCRmyPDF internally for searchable document management. See paperless-ngx docs for configuration.
Custom Plugins
Create a custom OCR plugin by implementing the OCRmyPDF plugin interface:
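    # my_ocr_plugin.py
    from ocrmypdf import hookimpl
    from ocrmypdf.pluginspec import OcrEngine


    class MyOcrEngine(OcrEngine):
        """Custom OCR engine implementation (skeleton)."""

        @staticmethod
        def version():
            return "1.0.0"

        @staticmethod
        def creator_tag(options):
            return "MyOCR"

        @staticmethod
        def generate_pdf(input_file, output_pdf, output_text, options):
            # Implement OCR logic here: write the OCR PDF and the text file
            raise NotImplementedError

        # OcrEngine is abstract. Before use, also implement __str__,
        # languages, get_orientation, get_deskew and generate_hocr;
        # check the plugin docs for your OCRmyPDF version.


    @hookimpl
    def get_ocr_engine():
        return MyOcrEngine()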
    # Use custom plugin
    ocrmypdf --plugin my_ocr_plugin input.pdf output.pdf
Quick Reference
| Task | Code / Command |
|---|---|
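| Python API basic | `ocrmypdf.ocr('input.pdf', 'output.pdf')` |
| With options | `ocrmypdf.ocr('input.pdf', 'output.pdf', language='eng+fra', deskew=True)` |
| Check result | `result == ocrmypdf.ExitCode.ok` |
| EasyOCR plugin | `ocrmypdf --plugin ocrmypdf_easyocr -l en input.pdf output.pdf` |
| PaddleOCR plugin | `ocrmypdf --plugin ocrmypdf_paddleocr input.pdf output.pdf` |
| AppleOCR plugin | `ocrmypdf --plugin ocrmypdf_appleocr input.pdf output.pdf` |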
Troubleshooting
- Import error: Ensure `pip install ocrmypdf` has been run in your Python environment.
- Plugin not found: Check that the plugin is installed (`pip install ocrmypdf-easyocr`).
- GPU not used (EasyOCR/PaddleOCR): Ensure CUDA/GPU drivers are installed.
- Memory issues: Use `jobs=1` for large files; process in batches.
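For the first two items, it can help to confirm which modules the active Python environment can actually see. A minimal diagnostic sketch using only the standard library follows. `check_modules` is a hypothetical helper, and the module names passed to it are the ones used elsewhere in this guide.

```python
from importlib.util import find_spec
from typing import Dict, Iterable


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Report whether each top-level module is importable here."""
    # find_spec returns None when a top-level module cannot be located.
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    report = check_modules(["ocrmypdf", "ocrmypdf_easyocr"])
    for name, available in report.items():
        print(f"{name}: {'installed' if available else 'missing'}")
```

Running the script with the same interpreter that runs the OCR code shows whether a missing package or a mismatched virtual environment is the cause.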