Desktop pdf
PDF document processing — reading, creating, table extraction, merging/splitting, form handling, watermarking, encryption.
install
source · Clone the upstream repo
git clone https://github.com/openyak/openyak
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/openyak/openyak "$T" && mkdir -p ~/.claude/skills && cp -r "$T/backend/app/data/skills/pdf" ~/.claude/skills/openyak-desktop-pdf && rm -rf "$T"
manifest:
backend/app/data/skills/pdf/SKILL.mdsource content
PDF Document Processing
When the user needs to read, create, or manipulate PDF files, follow this guide.
1. Reading PDFs
Quick text extraction
Use the
read tool directly — it natively extracts text from PDF files:
read(file_path="report.pdf")
This returns page-by-page text content. Sufficient for most reading tasks.
Table extraction (advanced)
For structured table data, use
pdfplumber via a Python script:
import pdfplumber with pdfplumber.open("report.pdf") as pdf: for page in pdf.pages: tables = page.extract_tables() for table in tables: for row in table: print(row)
Run this directly with
code_execute. Only use write + bash if you need to save the script for reuse.
Image-based / scanned PDFs
If
read returns "(No text content)", the PDF is likely image-based. Use OCR:
# Requires: pip install pytesseract pdf2image from pdf2image import convert_from_path import pytesseract images = convert_from_path("scanned.pdf") for i, img in enumerate(images, 1): text = pytesseract.image_to_string(img, lang="chi_sim+eng") print(f"--- Page {i} ---") print(text)
2. Creating PDFs
Simple text PDFs with reportlab
from reportlab.lib.pagesizes import A4 from reportlab.pdfgen import canvas c = canvas.Canvas("output.pdf", pagesize=A4) width, height = A4 c.setFont("Helvetica", 12) c.drawString(72, height - 72, "Hello, World!") c.save()
Structured documents with Platypus
For multi-page documents with paragraphs, tables, and images:
from reportlab.lib.pagesizes import A4 from reportlab.lib.styles import getSampleStyleSheet from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, Table, TableStyle from reportlab.lib import colors from reportlab.lib.units import inch doc = SimpleDocTemplate("report.pdf", pagesize=A4) styles = getSampleStyleSheet() story = [] # Title story.append(Paragraph("Quarterly Report", styles["Title"])) story.append(Spacer(1, 12)) # Body text story.append(Paragraph("This is the report content.", styles["Normal"])) story.append(Spacer(1, 12)) # Table data = [ ["Quarter", "Revenue", "Growth"], ["Q1", "$1.2M", "+5%"], ["Q2", "$1.4M", "+17%"], ] table = Table(data, colWidths=[1.5 * inch, 1.5 * inch, 1.5 * inch]) table.setStyle(TableStyle([ ("BACKGROUND", (0, 0), (-1, 0), colors.grey), ("TEXTCOLOR", (0, 0), (-1, 0), colors.whitesmoke), ("GRID", (0, 0), (-1, -1), 1, colors.black), ])) story.append(table) doc.build(story)
Important: subscript/superscript
Never use Unicode subscript/superscript characters (like
₂, ³) in reportlab. They cause font errors. Use XML markup instead:
Paragraph("H<sub>2</sub>O and x<super>2</super>", style)
Chinese / CJK text
Register a CJK font before using Chinese text:
from reportlab.pdfbase import pdfmetrics from reportlab.pdfbase.ttfonts import TTFont pdfmetrics.registerFont(TTFont("SimSun", "simsun.ttc")) c.setFont("SimSun", 12) c.drawString(72, height - 72, "中文文本")
3. Manipulating PDFs
Merge multiple PDFs
from pypdf import PdfWriter writer = PdfWriter() for pdf_file in ["part1.pdf", "part2.pdf", "part3.pdf"]: writer.append(pdf_file) writer.write("merged.pdf") writer.close()
Split / extract pages
from pypdf import PdfReader, PdfWriter reader = PdfReader("large.pdf") writer = PdfWriter() # Extract pages 5-10 (0-indexed) for page in reader.pages[4:10]: writer.add_page(page) writer.write("pages_5_to_10.pdf") writer.close()
Rotate pages
from pypdf import PdfReader, PdfWriter reader = PdfReader("input.pdf") writer = PdfWriter() for page in reader.pages: page.rotate(90) # 90, 180, or 270 degrees writer.add_page(page) writer.write("rotated.pdf") writer.close()
Add watermark
from pypdf import PdfReader, PdfWriter reader = PdfReader("document.pdf") watermark = PdfReader("watermark.pdf") writer = PdfWriter() for page in reader.pages: page.merge_page(watermark.pages[0]) writer.add_page(page) writer.write("watermarked.pdf") writer.close()
Password protection
from pypdf import PdfWriter writer = PdfWriter() writer.append("document.pdf") writer.encrypt(user_password="read_pass", owner_password="admin_pass") writer.write("encrypted.pdf") writer.close()
4. PDF form handling
See
forms.md for detailed form field extraction and filling.
Quick overview:
- Read form fields:
→pypdf.PdfReaderreader.get_fields() - Fill fillable forms:
→pypdf.PdfWriterwriter.update_page_form_field_values() - Non-fillable forms: Add text annotations at specific coordinates
5. Quick reference
| Task | Tool / Library |
|---|---|
| Read text | tool (built-in) |
| Extract tables | |
| OCR scanned PDFs | + |
| Create PDFs | |
| Merge / Split | |
| Rotate pages | |
| Watermark | |
| Encrypt / Decrypt | |
| Form fields | |
6. Dependencies
Core (included with Muse):
— read, merge, split, rotate, encrypt, formspypdf
Optional (install as needed):
— table extraction:pdfplumberpip install pdfplumber
— PDF creation:reportlabpip install reportlab
— OCR:pytesseract
(requires Tesseract binary)pip install pytesseract
— PDF to image:pdf2image
(requires Poppler)pip install pdf2image