Learn-skills.dev markitdown
Convert local documents to Markdown using Microsoft's markitdown CLI. Best for: PDF, Word, Excel, PowerPoint, images (OCR), audio. Can fetch URLs but Jina is faster for web. Triggers on: convert to markdown, read PDF, parse document, extract text from, docx, xlsx, pptx, OCR image, local file.
install
source · Clone the upstream repo
git clone https://github.com/NeverSight/learn-skills.dev
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/NeverSight/learn-skills.dev "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/skills-md/0xdarkmatter/claude-mods/markitdown" ~/.claude/skills/neversight-learn-skills-dev-markitdown && rm -rf "$T"
manifest:
data/skills-md/0xdarkmatter/claude-mods/markitdown/SKILL.md
source content
markitdown - Document to Markdown
Convert local documents to clean Markdown. One tool for PDF, Word, Excel, PowerPoint, images, and more.
When to Use markitdown
| Use Case | Recommendation |
|---|---|
| Local files (PDF, Word, Excel) | ✅ Use markitdown - unique capability |
| Web pages | ❌ Use Jina - 5x faster |
| Blocked/anti-bot sites | ❌ Use Firecrawl |
| OCR on images | ✅ Use markitdown |
| Audio transcription | ✅ Use markitdown |
Basic Usage
    # Local files (primary use case)
    markitdown document.pdf
    markitdown report.docx
    markitdown data.xlsx
    markitdown slides.pptx
    markitdown screenshot.png  # OCR

    # URLs (works, but Jina is faster)
    markitdown https://example.com

    # Save output
    markitdown document.pdf > document.md
Supported Formats
| Format | Extensions | Notes |
|---|---|---|
| PDF | | Text extraction, tables |
| Word | | Formatting preserved |
| Excel | | Tables to markdown |
| PowerPoint | | Slides as sections |
| Images | , | OCR text extraction |
| HTML | | Clean conversion |
| Audio | , | Speech-to-text |
| Text | , , , | Pass-through/structure |
| URLs | | Works but slower than Jina |
Benchmarked Performance (URLs)
| Tool | Avg Time | Success Rate |
|---|---|---|
| Jina | 0.5s | 10/10 |
| markitdown | 2.5s | 9/10 |
| Firecrawl | 4.5s | 10/10 |
Verdict: For URLs, use Jina. For local files, markitdown is the only one of the three benchmarked tools that applies.
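The recommendation above comes down to one check on the input: is it a URL or a local path? A minimal sketch of that check follows. `pick_tool` is a hypothetical helper name, not part of markitdown or Jina, and it only prints the name of the recommended tool rather than invoking anything.

```shell
# Hypothetical helper: pick_tool is an illustration, not part of any tool named here.
# It prints which tool this document recommends for a given input.
pick_tool() {
  case "$1" in
    http://*|https://*) echo "jina" ;;        # web pages: Jina had the lowest average time
    *)                  echo "markitdown" ;;  # anything else is treated as a local file
  esac
}

pick_tool "https://example.com"   # prints: jina
pick_tool "report.pdf"            # prints: markitdown
```

A blocked or anti-bot site cannot be recognized from the URL alone, so falling back to Firecrawl would have to happen after a failed fetch. This sketch does not attempt that.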
Examples
    # PDF to markdown (primary use case)
    markitdown report.pdf > report.md

    # Excel spreadsheet
    markitdown financials.xlsx

    # Image with text (OCR)
    markitdown screenshot.png

    # PowerPoint deck
    markitdown presentation.pptx > slides.md

    # Audio transcription
    markitdown meeting.mp3 > transcript.md
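Because every example above has the same shape (`markitdown <file> > <file>.md`), converting a whole folder is a short loop. The sketch below makes several assumptions: `convert_dir` is a hypothetical helper name, `MARKITDOWN_BIN` is an invented override variable that allows a dry run with a stand-in command, and the extension list is limited to the four document types used in the examples.

```shell
# Hypothetical helper: convert_dir is an illustration, not part of markitdown.
# MARKITDOWN_BIN is an assumed override; it defaults to the markitdown CLI.
convert_dir() {
  src_dir="$1"
  out_dir="$2"
  mkdir -p "$out_dir"
  for f in "$src_dir"/*.pdf "$src_dir"/*.docx "$src_dir"/*.xlsx "$src_dir"/*.pptx; do
    [ -e "$f" ] || continue   # skip glob patterns that matched no file
    name=$(basename "$f")
    # Same form as the examples above: markitdown <file> > <file>.md
    if "${MARKITDOWN_BIN:-markitdown}" "$f" > "$out_dir/${name%.*}.md"; then
      echo "converted: $name"
    else
      echo "failed: $name" >&2
    fi
  done
}
```

A call such as `convert_dir ./docs ./markdown` would write one `.md` file per matching document. Two inputs that share a stem (for example `report.pdf` and `report.docx`) would write to the same output file, so a real script might keep the original extension in the output name.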
Comparison with Alternatives
| Task | markitdown | Alternative |
|---|---|---|
| PDF text | | PyMuPDF, pdfplumber |
| Word docs | | python-docx |
| Excel | | pandas, openpyxl |
| OCR | | Tesseract |
| Web pages | Use Jina instead | (5x faster) |
markitdown's advantage: One CLI for all local document formats. No code needed.