Learn-skills.dev markitdown

Convert local documents to Markdown using Microsoft's markitdown CLI. Best for: PDF, Word, Excel, PowerPoint, images (OCR), audio. Can fetch URLs but Jina is faster for web. Triggers on: convert to markdown, read PDF, parse document, extract text from, docx, xlsx, pptx, OCR image, local file.

install
source · Clone the upstream repo
git clone https://github.com/NeverSight/learn-skills.dev
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/NeverSight/learn-skills.dev "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/skills-md/0xdarkmatter/claude-mods/markitdown" ~/.claude/skills/neversight-learn-skills-dev-markitdown && rm -rf "$T"
manifest: data/skills-md/0xdarkmatter/claude-mods/markitdown/SKILL.md
source content

markitdown - Document to Markdown

Convert local documents to clean Markdown. One tool for PDF, Word, Excel, PowerPoint, images, and more.

When to Use markitdown

Use CaseRecommendation
Local files (PDF, Word, Excel)Use markitdown - unique capability
Web pages❌ Use Jina (
r.jina.ai/
) - 5x faster
Blocked/anti-bot sites❌ Use Firecrawl
OCR on imagesUse markitdown
Audio transcriptionUse markitdown

Basic Usage

# Local files (primary use case)
markitdown document.pdf
markitdown report.docx
markitdown data.xlsx
markitdown slides.pptx
markitdown screenshot.png    # OCR

# URLs (works, but Jina is faster)
markitdown https://example.com

# Save output
markitdown document.pdf > document.md

Supported Formats

FormatExtensionsNotes
PDF
.pdf
Text extraction, tables
Word
.docx
Formatting preserved
Excel
.xlsx
Tables to markdown
PowerPoint
.pptx
Slides as sections
Images
.jpg
,
.png
OCR text extraction
HTML
.html
Clean conversion
Audio
.mp3
,
.wav
Speech-to-text
Text
.txt
,
.csv
,
.json
,
.xml
Pass-through/structure
URLs
https://...
Works but slower than Jina

Benchmarked Performance (URLs)

ToolAvg SpeedSuccess Rate
Jina0.5s10/10
markitdown2.5s9/10
Firecrawl4.5s10/10

Verdict: For URLs, use Jina. For local files, markitdown is the only option.

Examples

# PDF to markdown (primary use case)
markitdown report.pdf > report.md

# Excel spreadsheet
markitdown financials.xlsx

# Image with text (OCR)
markitdown screenshot.png

# PowerPoint deck
markitdown presentation.pptx > slides.md

# Audio transcription
markitdown meeting.mp3 > transcript.md

Comparison with Alternatives

TaskmarkitdownAlternative
PDF text
markitdown file.pdf
PyMuPDF, pdfplumber
Word docs
markitdown file.docx
python-docx
Excel
markitdown file.xlsx
pandas, openpyxl
OCR
markitdown image.png
Tesseract
Web pagesUse Jina instead
r.jina.ai/URL
(5x faster)

markitdown's advantage: One CLI for all local document formats. No code needed.