Learn-skills.dev translate-book
Translate a book or long document (epub, docx, pdf, markdown) to another language using the agent's built-in intelligence. Interactive — asks user for file, language, and style.
git clone https://github.com/NeverSight/learn-skills.dev
T=$(mktemp -d) && git clone --depth=1 https://github.com/NeverSight/learn-skills.dev "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/skills-md/ac1982/translate-book/translate-book" ~/.claude/skills/neversight-learn-skills-dev-translate-book && rm -rf "$T"
data/skills-md/ac1982/translate-book/translate-book/SKILL.md
Translate Book
Use the agent's built-in intelligence (like Claude Code, OpenAI Codex, Gemini CLI, GitHub Copilot) to translate an entire book or long document. No external translation API — you ARE the translator. Supports epub, docx, pdf (text-based), and markdown.
Interaction Language
Detect the user's language from their message and use it as the primary TUI response language throughout the entire workflow. For example, if the user writes in Chinese, all prompts, status updates, and results should be in Chinese.
Interactive Setup
Ask the user 3 questions before starting (glob for supported files in cwd to offer choices):
- Which file? (scan for `*.epub`, `*.docx`, `*.pdf`, `*.md`)
- Target language?
- Translation style? (e.g. faithful, expressive, and elegant (信达雅), colloquial, academic, literal, children's book…)
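A minimal sketch of the file scan behind the first question, assuming a non-recursive look at the current directory. The helper name `find_candidate_files` is hypothetical and not part of the skill.

```python
from pathlib import Path

# Extensions the skill says it supports.
SUPPORTED_EXTENSIONS = {".epub", ".docx", ".pdf", ".md"}


def find_candidate_files(directory="."):
    """Return supported files directly inside `directory`, sorted by path."""
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

The resulting list can be offered to the user as numbered choices; matching on the lowercased suffix also picks up names such as `BOOK.EPUB`.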
Workflow
1. Unpack
Most book formats (epub, docx) are ZIP archives internally. Use `uv` + Python `zipfile` to extract them to `work/`. For markdown, read the file directly. For PDF, extract text via PyMuPDF (warn and stop if the file is scanned/image-only).
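The ZIP-based part of this step could look roughly like the sketch below. It uses only the standard library; the function name `unpack_book` and its return value are illustrative assumptions, and PDF extraction is not covered here.

```python
import zipfile
from pathlib import Path


def unpack_book(book_path, work_dir="work"):
    """Extract an EPUB or DOCX (both ZIP containers) into `work_dir`.

    Returns the entry names in archive order, which can help when repacking.
    """
    book_path = Path(book_path)
    if not zipfile.is_zipfile(book_path):
        raise ValueError(f"{book_path} is not a ZIP-based format")
    target = Path(work_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(book_path) as archive:
        archive.extractall(target)
        return archive.namelist()
```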
2. Translate content — only text, preserve everything else
Analyze structure, split content files into ~6 balanced batches, launch parallel background agents.
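One way to get roughly balanced batches is a greedy split by file size: take the largest files first and always add to the currently lightest batch. This is a sketch under that assumption; the skill does not prescribe a particular algorithm, and `split_into_batches` is a hypothetical name.

```python
import heapq


def split_into_batches(file_sizes, batch_count=6):
    """Split {name: size} into `batch_count` lists with similar total size."""
    # Never create more batches than there are files (but at least one).
    batch_count = max(1, min(batch_count, len(file_sizes)))
    heap = [(0, index) for index in range(batch_count)]  # (total size, batch index)
    batches = [[] for _ in range(batch_count)]
    largest_first = sorted(file_sizes.items(), key=lambda item: item[1], reverse=True)
    for name, size in largest_first:
        total, index = heapq.heappop(heap)  # lightest batch so far
        batches[index].append(name)
        heapq.heappush(heap, (total + size, index))
    return batches
```

Character or word counts may be a better weight than byte size when files carry a lot of markup.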
Each agent only translates human-readable text. Everything else stays untouched:
| Translate | Preserve as-is |
|---|---|
| Paragraphs, headings, TOC entries | Markup structure (HTML, XML, markdown syntax) |
| Chapter titles, section names | Tags, attributes, CSS classes, IDs, links |
| Body text, dialogue, quotes | Images |
| Metadata: book title, description | Fonts |
| | Stylesheets |
| | Config files |
| | Footnote citations, URLs, ISBN, copyright notices |
Format-specific notes:
- EPUB: content is in `OEBPS/*.xhtml`; also translate the `nav.xhtml` TOC and `toc.ncx`; update the `content.opf` language code
- DOCX: text lives in `<w:t>` elements within `word/document.xml`, `word/header*.xml`, `word/footer*.xml`; update `docProps/core.xml`
- PDF: lossy; export to markdown first, translate, then output as `.md` or `.docx` (inform user)
- Markdown: preserve all syntax (links, images, code blocks, front matter)
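For the DOCX case, the "translate only text, keep the markup" rule can be sketched as a walk over `<w:t>` elements. The `translate` callable is a placeholder for whatever the agent produces, and `translate_text_runs` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the `w:` prefix in DOCX files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def translate_text_runs(document_xml, translate):
    """Apply `translate` to every non-blank <w:t>; leave all other nodes alone."""
    root = ET.fromstring(document_xml)
    for node in root.iter(f"{{{W_NS}}}t"):
        if node.text and node.text.strip():
            node.text = translate(node.text)
    return root
```

Note that serializing the tree with ElementTree rewrites namespace prefixes unless each one is registered first via `ET.register_namespace`, so a real pipeline would need to handle that before writing the file back.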
3. Repack
Rebuild in the original format. Output as `<name>-<lang>.<ext>`.
- EPUB special rule: `mimetype` must be the first ZIP entry, uncompressed (`ZIP_STORED`)
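The EPUB rule above can be sketched with `zipfile` as follows. The function name `repack_epub` is hypothetical, and the sketch assumes the output file lives outside the work directory.

```python
import zipfile
from pathlib import Path


def repack_epub(work_dir, output_path):
    """Zip `work_dir` into an EPUB with `mimetype` first and uncompressed."""
    work_dir = Path(work_dir)
    with zipfile.ZipFile(output_path, "w") as archive:
        # The mimetype entry goes in first and is stored without compression.
        archive.write(work_dir / "mimetype", "mimetype",
                      compress_type=zipfile.ZIP_STORED)
        for path in sorted(work_dir.rglob("*")):
            arcname = path.relative_to(work_dir).as_posix()
            if path.is_file() and arcname != "mimetype":
                archive.write(path, arcname, compress_type=zipfile.ZIP_DEFLATED)
    return Path(output_path)
```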
4. Validate & cleanup
Run a structure check, spot-check 2-3 chapters, then clean up `work/`.
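What a "structure check" covers is not spelled out; one plausible reading for EPUB output is sketched below. It confirms the `mimetype` rule and that XML-like entries still parse. The suffix list and the name `check_epub_structure` are assumptions.

```python
import xml.etree.ElementTree as ET
import zipfile

# Entries assumed to be XML; adjust for the book at hand.
XML_SUFFIXES = (".xhtml", ".xml", ".opf", ".ncx")


def check_epub_structure(epub_path):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    with zipfile.ZipFile(epub_path) as archive:
        entries = archive.infolist()
        if not entries or entries[0].filename != "mimetype":
            problems.append("mimetype is not the first entry")
        elif entries[0].compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype is compressed")
        for info in entries:
            if info.filename.lower().endswith(XML_SUFFIXES):
                try:
                    ET.fromstring(archive.read(info))
                except ET.ParseError as error:
                    problems.append(f"{info.filename}: {error}")
    return problems
```

This parser does not know named HTML entities such as `&nbsp;`, so files that use them may be flagged even though an EPUB reader would accept them.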