Skillsbench marker
Convert PDF documents to Markdown using marker_single. Use when Claude needs to extract text content from PDFs while preserving LaTeX formulas, equations, and document structure. Ideal for academic papers and technical documents containing mathematical notation.
install
source · Clone the upstream repo
git clone https://github.com/benchflow-ai/skillsbench
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/benchflow-ai/skillsbench "$T" && mkdir -p ~/.claude/skills && cp -r "$T/tasks/latex-formula-extraction/environment/skills/marker" ~/.claude/skills/benchflow-ai-skillsbench-marker && rm -rf "$T"
manifest: tasks/latex-formula-extraction/environment/skills/marker/SKILL.md
source content
Marker PDF-to-Markdown Converter
Convert PDFs to Markdown while preserving LaTeX formulas and document structure. Uses the
marker_single CLI from the marker-pdf package.
Dependencies
- marker_single on PATH (pip install marker-pdf if missing)
- Python 3.10+ (available in the task image)
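Both requirements above can be checked up front before attempting a conversion. The following is a minimal sketch using only the standard library; the helper name check_dependencies is hypothetical and not part of the skill.

```python
import shutil
import sys


def check_dependencies(cli_name="marker_single", min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means all good.

    Hypothetical helper for illustration, not part of the skill's scripts.
    """
    problems = []
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(cli_name) is None:
        problems.append(f"{cli_name} not found on PATH; try: pip install marker-pdf")
    # sys.version_info compares element-wise against a plain tuple.
    if sys.version_info < min_python:
        found = ".".join(map(str, sys.version_info[:3]))
        needed = ".".join(map(str, min_python))
        problems.append(f"Python {needed}+ required, found {found}")
    return problems
```

Calling check_dependencies() with no arguments and printing each returned string gives a quick preflight report.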
Quick Start
    from scripts.marker_to_markdown import pdf_to_markdown

    markdown_text = pdf_to_markdown("paper.pdf")
    print(markdown_text)
Python API
- pdf_to_markdown(pdf_path, *, timeout=600, cleanup=True) -> str
  - Runs marker_single --output_format markdown --disable_image_extraction
  - cleanup=True: use a temp directory and delete it after reading the Markdown
  - cleanup=False: keep outputs in <pdf_stem>_marker/ next to the PDF
  - Exceptions: FileNotFoundError if the PDF is missing, RuntimeError for marker failures, TimeoutError if it exceeds the timeout
- Tips: bump timeout for large PDFs; set cleanup=False to inspect intermediate files
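One way to act on the timeout tip is to retry with a larger limit when TimeoutError is raised. The sketch below assumes only the documented signature and exceptions. The helper name convert_with_retry and the timeout schedule are hypothetical choices, not part of the skill.

```python
from typing import Callable, Sequence


def convert_with_retry(
    convert: Callable[..., str],
    pdf_path: str,
    timeouts: Sequence[int] = (600, 1800),
) -> str:
    """Call convert(pdf_path, timeout=...) with each timeout in turn.

    Only TimeoutError triggers a retry. FileNotFoundError and RuntimeError
    propagate immediately, since a longer timeout would not help.
    The (600, 1800) schedule is an assumed example, not a recommendation.
    """
    if not timeouts:
        raise ValueError("timeouts must not be empty")
    last_error = None
    for timeout in timeouts:
        try:
            return convert(pdf_path, timeout=timeout)
        except TimeoutError as exc:
            last_error = exc
    # Every attempt timed out; surface the last failure to the caller.
    raise last_error
```

With the import from Quick Start in place, a call would look like convert_with_retry(pdf_to_markdown, "paper.pdf").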
Command-Line Usage
    # Basic conversion (prints markdown to stdout)
    python scripts/marker_to_markdown.py paper.pdf

    # Keep temporary files
    python scripts/marker_to_markdown.py paper.pdf --keep-temp

    # Custom timeout
    python scripts/marker_to_markdown.py paper.pdf --timeout 600
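The flags above appear to map directly onto the Python API arguments. The sketch below shows one way such a mapping could be written with argparse. It is an illustration under that assumption, not the actual contents of scripts/marker_to_markdown.py; build_parser and to_api_kwargs are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented flags (assumed layout)."""
    parser = argparse.ArgumentParser(
        description="Convert a PDF to Markdown with marker_single."
    )
    parser.add_argument("pdf", help="path to the input PDF")
    parser.add_argument(
        "--keep-temp",
        action="store_true",
        help="keep marker outputs instead of deleting them",
    )
    parser.add_argument(
        "--timeout",
        type=int,
        default=600,  # matches the documented API default
        help="seconds to wait before giving up",
    )
    return parser


def to_api_kwargs(args: argparse.Namespace) -> dict:
    # Assumption: --keep-temp is the inverse of cleanup in the Python API.
    return {"timeout": args.timeout, "cleanup": not args.keep_temp}
```

Under this mapping, passing --keep-temp on the command line corresponds to calling pdf_to_markdown with cleanup=False.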
Output Locations
- cleanup=True: outputs stored in a temporary directory and removed automatically
- cleanup=False: outputs saved to <pdf_stem>_marker/; markdown lives at <pdf_stem>_marker/<pdf_stem>/<pdf_stem>.md when present (otherwise the first .md file is used)
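When outputs are kept, the lookup rule above can be reproduced by hand to locate the Markdown file. This is a minimal sketch. The helper name find_markdown_output is hypothetical, and "first .md file" is interpreted here as first in sorted recursive order, which is an assumption about the skill's behavior.

```python
from pathlib import Path
from typing import Optional


def find_markdown_output(pdf_path: str) -> Optional[Path]:
    """Locate the Markdown produced for pdf_path when outputs were kept.

    Hypothetical helper: checks <pdf_stem>_marker/<pdf_stem>/<pdf_stem>.md
    first, then falls back to any .md file under <pdf_stem>_marker/.
    """
    pdf = Path(pdf_path)
    out_dir = pdf.parent / f"{pdf.stem}_marker"
    if not out_dir.is_dir():
        return None
    preferred = out_dir / pdf.stem / f"{pdf.stem}.md"
    if preferred.is_file():
        return preferred
    # Assumption: "first" means first in sorted order of a recursive search.
    candidates = sorted(out_dir.rglob("*.md"))
    return candidates[0] if candidates else None
```

A return value of None means no output folder or no Markdown file was found, which is the situation the Troubleshooting section addresses.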
Troubleshooting
- marker_single not found: install marker-pdf or ensure the CLI is on PATH
- No Markdown output: re-run with --keep-temp/cleanup=False and check stdout/stderr saved in the output folder