Claude-skill-registry document-docx
Create, edit, and analyze Microsoft Word .docx files (reports, contracts, proposals) with styles, tables, headers/footers, template filling, content extraction, and conversion to HTML; support review workflows (comments/highlights) and inspect tracked changes via OOXML when needed using Python/Node.js (python-docx, docxtpl, mammoth.js, docx).
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/document-docx" ~/.claude/skills/majiayu000-claude-skill-registry-document-docx && rm -rf "$T"
manifest:
skills/data/document-docx/SKILL.mdsource content
Document DOCX Skill - Quick Reference
This skill enables creation, editing, and analysis of
.docx files for reports, contracts, proposals, documentation, and template-driven outputs.
Modern best practices (2026):
- Prefer templates + styles over manual formatting.
- Treat
as the editable source; treat PDF as a release artifact..docx - If distributing externally, include basic accessibility hygiene (headings, table headers, alt text).
Quick Reference
| Task | Tool/Library | Language | When to Use |
|---|---|---|---|
| Create DOCX | python-docx | Python | Reports, contracts, proposals |
| Create DOCX | docx | Node.js | Server-side document generation |
| Convert to HTML | mammoth.js | Node.js | Web display, content extraction |
| Parse DOCX | python-docx | Python | Extract text, tables, metadata |
| Template fill | docxtpl | Python | Mail merge, template-based generation |
| Review workflow | Word compare, comments/highlights | Any | Human review without OOXML surgery |
| Tracked changes | OOXML inspection, docx4j/OpenXML SDK/Aspose | Any | True redlines or parsing tracked changes |
Tool Selection
- Prefer
when non-developers must edit layout/design in Word.docxtpl - Prefer
for structural edits (paragraphs/tables/headers/footers) when formatting complexity is moderate.python-docx - Prefer
(Node.js) for server-side generation in TypeScript-heavy stacks.docx - Prefer
for text-first extraction or DOCX-to-HTML (best effort; may drop some layout fidelity).mammoth
Known Limits (Plan Around These)
(legacy) is not supported by these libraries; convert to.doc
first (e.g., LibreOffice)..docx
cannot reliably create true tracked changes; use Word compare or specialized OOXML tooling.python-docx- Tables of Contents and many fields are placeholders until opened/updated in Word.
Core Operations
Create Document (Python - python-docx)
from docx import Document from docx.shared import Inches, Pt from docx.enum.text import WD_ALIGN_PARAGRAPH doc = Document() # Title title = doc.add_heading('Document Title', 0) title.alignment = WD_ALIGN_PARAGRAPH.CENTER # Paragraph with formatting para = doc.add_paragraph() run = para.add_run('Bold and ') run.bold = True run = para.add_run('italic text.') run.italic = True # Table table = doc.add_table(rows=3, cols=3) table.style = 'Table Grid' for i, row in enumerate(table.rows): for j, cell in enumerate(row.cells): cell.text = f'Row {i+1}, Col {j+1}' # Image doc.add_picture('image.png', width=Inches(4)) # Save doc.save('output.docx')
Create Document (Node.js - docx)
import { Document, Packer, Paragraph, TextRun, Table, TableRow, TableCell } from 'docx'; import * as fs from 'fs'; const doc = new Document({ sections: [{ properties: {}, children: [ new Paragraph({ children: [ new TextRun({ text: 'Bold text', bold: true }), new TextRun({ text: ' and normal text.' }), ], }), new Table({ rows: [ new TableRow({ children: [ new TableCell({ children: [new Paragraph('Cell 1')] }), new TableCell({ children: [new Paragraph('Cell 2')] }), ], }), ], }), ], }], }); Packer.toBuffer(doc).then((buffer) => { fs.writeFileSync('output.docx', buffer); });
Template-Based Generation (Python - docxtpl)
from docxtpl import DocxTemplate doc = DocxTemplate('template.docx') context = { 'company_name': 'Acme Corp', 'date': '2025-01-15', 'items': [ {'name': 'Widget A', 'price': 100}, {'name': 'Widget B', 'price': 200}, ] } doc.render(context) doc.save('filled_template.docx')
Extract Content (Python - python-docx)
from docx import Document doc = Document('input.docx') # Extract all text full_text = [] for para in doc.paragraphs: full_text.append(para.text) # Extract tables for table in doc.tables: for row in table.rows: row_data = [cell.text for cell in row.cells] print(row_data)
Styling Reference
| Element | Python Method | Node.js Class |
|---|---|---|
| Heading 1 | | |
| Bold | | |
| Italic | | |
| Font size | | (half-points) |
| Alignment | | |
| Page break | | |
Do / Avoid (Dec 2025)
Do
- Use consistent heading levels and a table of contents for long docs.
- Capture decisions and action items with owners and due dates.
- Store docs in a versioned, searchable system.
Avoid
- Manual formatting instead of styles (breaks consistency).
- Docs with no owner or review cadence (stale quickly).
- Copy/pasting without updating definitions and links.
Output Quality Checklist
- Structure: consistent heading hierarchy, styles, and (when needed) an auto-generated table of contents.
- Decisions: decisions/actions captured with owner + due date (not buried in prose).
- Versioning: doc ID + version + change summary; review cadence defined.
- Accessibility hygiene: headings/reading order are correct; table headers are marked; alt text for non-decorative images.
- Reuse: use
for decision logs and recurring doc types.assets/doc-template-pack.md
Optional: AI / Automation
Use only when explicitly requested and policy-compliant.
- Summarize meeting notes into decisions/actions; humans verify accuracy.
- Draft first-pass docs from outlines; do not invent facts or quotes.
Navigation
Resources
- references/docx-patterns.md - Advanced formatting, styles, headers/footers
- references/template-workflows.md - Mail merge, batch generation
- references/tracked-changes.md - Tracked changes: what is feasible, and what is not
- data/sources.json - Library documentation links
Scripts
- Dependency-free OOXML inspection (including tracked changes signals)scripts/docx_inspect_ooxml.py
- Extract text/tables to JSON (requiresscripts/docx_extract.py
)python-docx
- Render ascripts/docx_render_template.py
template (requiresdocxtpl
)docxtpl
- Convertscripts/docx_to_html.mjs
to HTML (requires.docx
)mammoth
Templates
- assets/report-template.md - Standard report structure
- assets/contract-template.md - Legal document structure
- assets/doc-template-pack.md - Decision log, meeting notes, changelog templates
Related Skills
- ../document-pdf/SKILL.md - PDF generation and conversion
- ../docs-codebase/SKILL.md - Technical writing patterns