article-translation

Translate web pages and PDF documents to Korean and save them as markdown files. Supports image/table preservation, JS-rendered pages, and chunked processing of large PDFs. Use when the request mentions 번역, translation, translate, 한글로, or Korean.

install

source · Clone the upstream repo

git clone https://github.com/seanlion/awesome-skills-for-claude

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/seanlion/awesome-skills-for-claude "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/article-translation" ~/.claude/skills/seanlion-awesome-skills-for-claude-article-translation && rm -rf "$T"

manifest: .claude/skills/article-translation/SKILL.md

source content

Article Translation

Translate web pages or PDF documents to Korean and save as markdown files.

Quick Start

Extract content: Determine source type (URL or PDF path), use appropriate extraction method
Translate: Create paragraph-level plan with TodoWrite, translate to Korean
Save: Generate a `.md` file with the original title, then review and finalize

Workflow

Step 0: Check for Existing Checkpoint

ALWAYS check for checkpoint file before starting:

Look for `.translation-checkpoint-{filename}.json` in the current directory
If checkpoint exists, read it and resume from last completed section
If no checkpoint, start fresh from Step 1
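The resume check above can be sketched roughly as follows. This is a minimal illustration, not the skill's own code: the helper names `checkpoint_path` and `next_section` are hypothetical, and it assumes sections are numbered from 1 and that the checkpoint uses the `completed_sections` field shown later in this document.

```python
import json
from pathlib import Path


def checkpoint_path(filename: str, directory: str = ".") -> Path:
    """Build the checkpoint path for a given source name (hypothetical helper)."""
    return Path(directory) / f".translation-checkpoint-{filename}.json"


def next_section(filename: str, directory: str = ".") -> int:
    """Return the section to translate next.

    1 means "no checkpoint, start fresh"; otherwise the first section
    that is not listed in completed_sections.
    """
    path = checkpoint_path(filename, directory)
    if not path.exists():
        return 1  # no checkpoint: start fresh from Step 1
    state = json.loads(path.read_text(encoding="utf-8"))
    done = set(state.get("completed_sections", []))
    section = 1
    while section in done:
        section += 1
    return section
```

Deriving the resume point from `completed_sections` rather than trusting `current_section` alone is a design choice of this sketch: it still gives a sensible answer if the run stopped between the two checkpoint updates.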

Step 1: Source Analysis

Determine source type and select extraction method:

| Source Type | Detection Criteria | Processing Method |
|---|---|---|
| PDF | `.pdf` extension | Use pdfplumber script (NEVER use Read tool for PDF) |
| Static Web | `curl` returns content | Download with curl |
| JS-rendered Web | `curl` returns empty/incomplete | Use Playwright script |
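The decision in the table can be expressed as a small classifier. This is only a sketch: the function name `classify_source`, the return labels, and the 500-character cutoff for "incomplete" are assumptions made for illustration. The document does not define what counts as incomplete.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed cutoff: a curl response shorter than this is treated as incomplete.
MIN_BODY_CHARS = 500


def classify_source(source: str, curl_body: Optional[str] = None) -> str:
    """Return 'pdf', 'static_web', or 'js_web' for a URL or local path."""
    # For URLs, look only at the path so query strings do not hide the extension.
    path = urlparse(source).path if "://" in source else source
    if path.lower().endswith(".pdf"):
        return "pdf"
    body = (curl_body or "").strip()
    if len(body) < MIN_BODY_CHARS:
        return "js_web"  # empty or suspiciously short: needs JS rendering
    return "static_web"
```

A length check on the raw response is a rough proxy. A page can be long and still lack its main content, so Step 2 still calls for verifying that the full content was captured.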

Step 2: Content Extraction

For PDF:

Check file info (size, page count)
ALWAYS use pdfplumber script from scripts.md - Read tool will fail with "Too large" error
Use page range for chunk processing if needed
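For orientation, page-range chunking with pdfplumber might look like the sketch below. It is not the script from scripts.md, which is not reproduced here: the function names and the default chunk size of 10 pages are assumptions. Only `pdfplumber.open`, `pdf.pages`, and `page.extract_text()` are actual pdfplumber calls.

```python
from typing import Iterator, Tuple


def page_ranges(total_pages: int, chunk_size: int = 10) -> Iterator[Tuple[int, int]]:
    """Yield 1-based, inclusive (start, end) page ranges covering the PDF."""
    for start in range(1, total_pages + 1, chunk_size):
        yield start, min(start + chunk_size - 1, total_pages)


def extract_pages(pdf_path: str, start: int, end: int) -> str:
    """Extract text for pages start..end (1-based, inclusive)."""
    import pdfplumber  # imported lazily; install with `pip3 install pdfplumber`

    with pdfplumber.open(pdf_path) as pdf:
        pages = pdf.pages[start - 1:end]
        # Pages without a text layer may yield no text, hence the `or ""`.
        return "\n\n".join(page.extract_text() or "" for page in pages)
```

For a 50-page document and a chunk size of 20, `page_ranges(50, 20)` yields `(1, 20)`, `(21, 40)`, and `(41, 50)`, and each range can be passed to `extract_pages` in turn.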

For Web pages:

Try `curl` first
If content is empty/incomplete, use Playwright script for JS rendering
Verify full content is captured
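The curl-first, Playwright-fallback order can be sketched as below. This is an illustration under assumptions, not the Playwright script from scripts.md: the function names are hypothetical, and the default completeness check (any non-blank response) is a placeholder for whatever verification fits the page.

```python
import subprocess
from typing import Callable


def fetch_with_curl(url: str) -> str:
    """Download a page with curl (-s silent, -L follow redirects)."""
    result = subprocess.run(
        ["curl", "-sL", url],
        capture_output=True, encoding="utf-8", errors="replace", check=False,
    )
    return result.stdout


def fetch_with_playwright(url: str) -> str:
    """Render the page in headless Chromium and return the final HTML."""
    from playwright.sync_api import sync_playwright  # lazy import

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        html = page.content()
        browser.close()
    return html


def fetch_page(
    url: str,
    static: Callable[[str], str] = fetch_with_curl,
    rendered: Callable[[str], str] = fetch_with_playwright,
    is_complete: Callable[[str], bool] = lambda html: bool(html.strip()),
) -> str:
    """Try the static fetch first; fall back to JS rendering if it looks incomplete."""
    html = static(url)
    return html if is_complete(html) else rendered(url)
```

Passing the fetchers and the completeness check as parameters keeps the fallback logic testable without network access or a browser install.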

Step 3: Translation with Checkpointing

Create paragraph-level translation plan with TodoWrite
Before translating each section: Update checkpoint file with current progress
Translate paragraph by paragraph to Korean
After completing each section: Save checkpoint with completed sections list
Download images to local storage
Preserve table structure in markdown, translate cell content only
Run large file downloads in background
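To make "translate cell content only" concrete, here is a small sketch that rewrites one markdown table row at a time. The `translate` callable is a stand-in for whatever performs the translation, and the function name is hypothetical. Escaped pipes (`\|`) inside cells are not handled.

```python
import re
from typing import Callable

# Matches a separator cell such as ---, :---, or :---:
SEPARATOR_CELL = re.compile(r"^:?-+:?$")


def translate_table_row(row: str, translate: Callable[[str], str]) -> str:
    """Translate the cells of one table row while keeping the pipe layout."""
    inner = row.strip()
    if inner.startswith("|"):
        inner = inner[1:]
    if inner.endswith("|"):
        inner = inner[:-1]
    cells = [cell.strip() for cell in inner.split("|")]
    if all(SEPARATOR_CELL.match(cell) for cell in cells):
        return row  # the header separator row has nothing to translate
    return "| " + " | ".join(translate(c) if c else c for c in cells) + " |"
```

Calling it with `str.upper` in place of a real translator shows the effect: `| Name | Value |` becomes `| NAME | VALUE |`, while `|------|------|` is returned unchanged.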

Checkpoint file format (`.translation-checkpoint-{filename}.json`):

{
  "source_url": "https://example.com/article",
  "source_type": "web",
  "output_file": "Article Title.md",
  "total_sections": 10,
  "completed_sections": [1, 2, 3],
  "current_section": 4,
  "last_updated": "2024-01-15T10:30:00",
  "partial_content": "... translated content so far ..."
}
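Saving a checkpoint in this format after each completed section might look like the following sketch. The function name and its parameters are hypothetical. Writing to a temporary file and renaming it is an added precaution of this sketch, not something the skill specifies.

```python
import json
import os
from datetime import datetime


def save_checkpoint(path: str, state: dict, completed: int, partial: str) -> dict:
    """Record one finished section and write the checkpoint to disk."""
    done = sorted(set(state.get("completed_sections", [])) | {completed})
    state = {
        **state,
        "completed_sections": done,
        "current_section": completed + 1,
        "last_updated": datetime.now().isoformat(timespec="seconds"),
        "partial_content": partial,
    }
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps the Korean text readable in the file
        json.dump(state, f, ensure_ascii=False, indent=2)
    # Rename over the old file so an interrupted write leaves it intact.
    os.replace(tmp, path)
    return state
```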

Step 4: Review & Correction

Compare each paragraph with original
Identify and fix awkward translations
Check for missing content
Save final markdown file
Delete checkpoint file after successful completion

Output Format

Filename Rule

Use the original title (remove special characters)
Example: `Understanding React Hooks.md`
Use a Korean title only when the original title is itself in Korean: `리액트 훅 이해하기.md`
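One possible reading of the filename rule is sketched below. The rule does not say which characters count as "special", so this assumes everything except letters, digits, underscores, spaces, and hyphens is removed. The function name is hypothetical.

```python
import re


def title_to_filename(title: str) -> str:
    """Turn an article title into a markdown filename."""
    # Assumption: drop anything that is not a word character, whitespace,
    # or hyphen. \w is Unicode-aware in Python 3, so Hangul is preserved.
    cleaned = re.sub(r"[^\w\s-]", "", title)
    # Collapse runs of whitespace left behind by removed characters.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return f"{cleaned}.md"
```

With this reading, `What's New: React 19?` becomes `Whats New React 19.md`, and a Korean title such as `리액트 훅 이해하기` passes through unchanged apart from the added extension.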

Markdown Structure

# Translated title

> 원문: [Original title](Original URL)

## Section 1

Body text...

![Image description](./images/image1.png)

| Column 1 | Column 2 |
|------|------|
| Content 1 | Content 2 |

## Section 2

...

Best Practices

Paragraph-level translation: Never translate the entire document at once. Process it paragraph by paragraph for quality control
Technical terms: Include the original term in parentheses when needed (e.g. "상태 관리(State Management)")
Image alt text: Translate image descriptions for accessibility
Table structure: Preserve the markdown table format, translate cell content only
Code blocks: Never translate code. Translate comments only if necessary
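The "never translate code" rule can be enforced mechanically by splitting the markdown on fenced blocks before translating. The sketch below assumes triple-backtick fences only and uses a placeholder `translate` callable. It does not cover inline code spans or the optional translation of comments.

```python
import re
from typing import Callable

TICKS = "`" * 3  # a markdown code fence marker
FENCE = re.compile(f"({TICKS}.*?{TICKS})", re.DOTALL)


def translate_outside_code(markdown: str, translate: Callable[[str], str]) -> str:
    """Apply translate() to prose only; fenced code blocks pass through unchanged."""
    parts = FENCE.split(markdown)
    # With one capture group, re.split puts the fenced blocks at odd indexes.
    return "".join(
        part if i % 2 == 1 else translate(part) for i, part in enumerate(parts)
    )
```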

Common Issues

| Issue | Cause | Solution |
|---|---|---|
| PDF "Too large" error | Attempted to read large PDF with Read tool | Use pdfplumber script instead (scripts.md) |
| pdfplumber not working | Not installed | `pip3 install pdfplumber` |
| Empty web page content | JS-rendered page | Use Playwright script |
| Broken images | Relative path issue | Convert to absolute path or download locally |
| Awkward translation | Literal translation | Consider context, paraphrase in review step |
| Playwright not working | chromium not installed | `playwright install chromium` |
| Translation interrupted | Session timeout or error | Resume from checkpoint file (Step 0) |
| Duplicate translation work | Didn't check for checkpoint | Always run Step 0 first to check existing progress |
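For the broken-image case, converting relative image paths to absolute URLs could be done along these lines. This is a sketch with a hypothetical function name. It handles only the plain `![alt](src)` form, not images with titles or HTML `<img>` tags.

```python
import re
from urllib.parse import urljoin

# Matches ![alt](src) where src contains no spaces or closing parenthesis
IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def absolutize_images(markdown: str, base_url: str) -> str:
    """Rewrite relative image paths against the page URL they came from."""
    def fix(match: "re.Match[str]") -> str:
        alt, src = match.group(1), match.group(2)
        # urljoin leaves already-absolute URLs untouched
        return f"![{alt}]({urljoin(base_url, src)})"

    return IMAGE.sub(fix, markdown)
```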

Examples

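Example 1: Web page translation (new)

Request:

https://example.com/react-hooks-guide 번역해줘 ("translate this")

Processing flow:

Check for a checkpoint file → none found, start fresh
Try downloading the page with curl
Verify the content, then create a paragraph-level plan with TodoWrite
Translate each section (save a checkpoint after each completed section)
Create the `React Hooks Guide.md` file
Compare against the original, review, and save the final version
Delete the checkpoint file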

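Example 2: Resuming a translation (from a checkpoint)

Request:

https://example.com/react-hooks-guide 번역해줘 ("translate this")

Processing flow:

Check for a checkpoint file → found `.translation-checkpoint-react-hooks-guide.json`
Read the checkpoint: sections 1-3 completed, current section 4
Resume translation from section 4
Translate the remaining sections (update the checkpoint after each completed section)
Save the final file, then delete the checkpoint file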

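Example 3: PDF translation

Request:

/path/to/whitepaper.pdf 한글로 번역해줘 ("translate this into Korean")

Processing flow:

Check for a checkpoint file → none found, start fresh
Check the PDF info (50 pages, 5MB)
Extract the full text with pdfplumber
Create a paragraph-level plan with TodoWrite (10 sections)
Translate section by section (save a checkpoint after each completed section)
Download images (in the background)
Create the `Whitepaper.md` file
Review and correct, then delete the checkpoint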

Code Reference

See scripts.md for the detailed scripts:

PDF extraction (pdfplumber) - full or partial page ranges
JS-rendered page extraction (Playwright)
Installation guide