Claude-skill-registry document-docx

Create, edit, and analyze Microsoft Word .docx files (reports, contracts, proposals) with styles, tables, headers/footers, template filling, content extraction, and conversion to HTML; support review workflows (comments/highlights) and inspect tracked changes via OOXML when needed using Python/Node.js (python-docx, docxtpl, mammoth.js, docx).

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/document-docx" ~/.claude/skills/majiayu000-claude-skill-registry-document-docx && rm -rf "$T"
manifest: skills/data/document-docx/SKILL.md
source content

Document DOCX Skill - Quick Reference

This skill enables creation, editing, and analysis of

.docx
files for reports, contracts, proposals, documentation, and template-driven outputs.

Modern best practices (2026):

  • Prefer templates + styles over manual formatting.
  • Treat
    .docx
    as the editable source; treat PDF as a release artifact.
  • If distributing externally, include basic accessibility hygiene (headings, table headers, alt text).

Quick Reference

TaskTool/LibraryLanguageWhen to Use
Create DOCXpython-docxPythonReports, contracts, proposals
Create DOCXdocxNode.jsServer-side document generation
Convert to HTMLmammoth.jsNode.jsWeb display, content extraction
Parse DOCXpython-docxPythonExtract text, tables, metadata
Template filldocxtplPythonMail merge, template-based generation
Review workflowWord compare, comments/highlightsAnyHuman review without OOXML surgery
Tracked changesOOXML inspection, docx4j/OpenXML SDK/AsposeAnyTrue redlines or parsing tracked changes

Tool Selection

  • Prefer
    docxtpl
    when non-developers must edit layout/design in Word.
  • Prefer
    python-docx
    for structural edits (paragraphs/tables/headers/footers) when formatting complexity is moderate.
  • Prefer
    docx
    (Node.js) for server-side generation in TypeScript-heavy stacks.
  • Prefer
    mammoth
    for text-first extraction or DOCX-to-HTML (best effort; may drop some layout fidelity).

Known Limits (Plan Around These)

  • .doc
    (legacy) is not supported by these libraries; convert to
    .docx
    first (e.g., LibreOffice).
  • python-docx
    cannot reliably create true tracked changes; use Word compare or specialized OOXML tooling.
  • Tables of Contents and many fields are placeholders until opened/updated in Word.

Core Operations

Create Document (Python - python-docx)

from docx import Document
from docx.shared import Inches, Pt
from docx.enum.text import WD_ALIGN_PARAGRAPH

doc = Document()

# Title
title = doc.add_heading('Document Title', 0)
title.alignment = WD_ALIGN_PARAGRAPH.CENTER

# Paragraph with formatting
para = doc.add_paragraph()
run = para.add_run('Bold and ')
run.bold = True
run = para.add_run('italic text.')
run.italic = True

# Table
table = doc.add_table(rows=3, cols=3)
table.style = 'Table Grid'
for i, row in enumerate(table.rows):
    for j, cell in enumerate(row.cells):
        cell.text = f'Row {i+1}, Col {j+1}'

# Image
doc.add_picture('image.png', width=Inches(4))

# Save
doc.save('output.docx')

Create Document (Node.js - docx)

import { Document, Packer, Paragraph, TextRun, Table, TableRow, TableCell } from 'docx';
import * as fs from 'fs';

const doc = new Document({
  sections: [{
    properties: {},
    children: [
      new Paragraph({
        children: [
          new TextRun({ text: 'Bold text', bold: true }),
          new TextRun({ text: ' and normal text.' }),
        ],
      }),
      new Table({
        rows: [
          new TableRow({
            children: [
              new TableCell({ children: [new Paragraph('Cell 1')] }),
              new TableCell({ children: [new Paragraph('Cell 2')] }),
            ],
          }),
        ],
      }),
    ],
  }],
});

Packer.toBuffer(doc).then((buffer) => {
  fs.writeFileSync('output.docx', buffer);
});

Template-Based Generation (Python - docxtpl)

from docxtpl import DocxTemplate

doc = DocxTemplate('template.docx')
context = {
    'company_name': 'Acme Corp',
    'date': '2025-01-15',
    'items': [
        {'name': 'Widget A', 'price': 100},
        {'name': 'Widget B', 'price': 200},
    ]
}
doc.render(context)
doc.save('filled_template.docx')

Extract Content (Python - python-docx)

from docx import Document

doc = Document('input.docx')

# Extract all text
full_text = []
for para in doc.paragraphs:
    full_text.append(para.text)

# Extract tables
for table in doc.tables:
    for row in table.rows:
        row_data = [cell.text for cell in row.cells]
        print(row_data)

Styling Reference

ElementPython MethodNode.js Class
Heading 1
add_heading(text, 1)
HeadingLevel.HEADING_1
Bold
run.bold = True
TextRun({ bold: true })
Italic
run.italic = True
TextRun({ italics: true })
Font size
run.font.size = Pt(12)
TextRun({ size: 24 })
(half-points)
Alignment
WD_ALIGN_PARAGRAPH.CENTER
AlignmentType.CENTER
Page break
doc.add_page_break()
new PageBreak()

Do / Avoid (Dec 2025)

Do

  • Use consistent heading levels and a table of contents for long docs.
  • Capture decisions and action items with owners and due dates.
  • Store docs in a versioned, searchable system.

Avoid

  • Manual formatting instead of styles (breaks consistency).
  • Docs with no owner or review cadence (stale quickly).
  • Copy/pasting without updating definitions and links.

Output Quality Checklist

  • Structure: consistent heading hierarchy, styles, and (when needed) an auto-generated table of contents.
  • Decisions: decisions/actions captured with owner + due date (not buried in prose).
  • Versioning: doc ID + version + change summary; review cadence defined.
  • Accessibility hygiene: headings/reading order are correct; table headers are marked; alt text for non-decorative images.
  • Reuse: use
    assets/doc-template-pack.md
    for decision logs and recurring doc types.

Optional: AI / Automation

Use only when explicitly requested and policy-compliant.

  • Summarize meeting notes into decisions/actions; humans verify accuracy.
  • Draft first-pass docs from outlines; do not invent facts or quotes.

Navigation

Resources

Scripts

  • scripts/docx_inspect_ooxml.py
    - Dependency-free OOXML inspection (including tracked changes signals)
  • scripts/docx_extract.py
    - Extract text/tables to JSON (requires
    python-docx
    )
  • scripts/docx_render_template.py
    - Render a
    docxtpl
    template (requires
    docxtpl
    )
  • scripts/docx_to_html.mjs
    - Convert
    .docx
    to HTML (requires
    mammoth
    )

Templates

Related Skills