Hve-core powerpoint
PowerPoint slide deck generation and management using python-pptx with YAML-driven content and styling - Brought to you by microsoft/hve-core
git clone https://github.com/microsoft/hve-core
T=$(mktemp -d) && git clone --depth=1 https://github.com/microsoft/hve-core "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.github/skills/experimental/powerpoint" ~/.claude/skills/microsoft-hve-core-powerpoint && rm -rf "$T"
PowerPoint Skill
Generates, updates, and manages PowerPoint slide decks using
python-pptx with YAML-driven content and styling definitions.
Overview
This skill provides Python scripts that consume YAML configuration files to produce PowerPoint slide decks. Each slide is defined by a
content.yaml file describing its layout, text, and shapes. A style.yaml file defines dimensions, template configuration, layout mappings, metadata, and defaults.
SKILL.md covers technical reference: prerequisites, commands, script architecture, API constraints, and troubleshooting. For conventions and design rules (element positioning, visual quality, color and contrast, contextual styling), follow
pptx.instructions.md.
Prerequisites
PowerShell
The
Invoke-PptxPipeline.ps1 script handles virtual environment creation and dependency installation automatically via uv sync. Requires uv, Python 3.11+, and PowerShell 7+.
Installing uv
If
uv is not installed:
    # macOS / Linux
    curl -LsSf https://astral.sh/uv/install.sh | sh

    # Windows
    powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

    # Via pip (fallback)
    pip install uv
System Dependencies (Export and Validation)
The Export and Validate actions require LibreOffice for PPTX-to-PDF conversion and optionally
pdftoppm from poppler for PDF-to-JPG rendering. When pdftoppm is not available, PyMuPDF handles the image rendering.
The Validate action's vision-based checks require the GitHub Copilot CLI for model access.
    # macOS
    brew install --cask libreoffice
    brew install poppler  # optional, provides pdftoppm

    # Linux
    sudo apt-get install libreoffice poppler-utils

    # Windows (winget preferred, choco fallback)
    winget install TheDocumentFoundation.LibreOffice
    # choco install libreoffice-still  # alternative
    # poppler: no winget package; use choco install poppler (optional, provides pdftoppm)
Copilot CLI (Vision Validation)
The
validate_slides.py script uses the GitHub Copilot SDK to send slide images to vision-capable models. The Copilot CLI must be installed and authenticated:
    # Install Copilot CLI
    npm install -g @github/copilot-cli

    # Authenticate (uses the same GitHub account as VS Code Copilot)
    copilot auth login

    # Verify
    copilot --version
Required Files
- style.yaml: Dimensions, defaults, template configuration, and metadata
- content.yaml: Per-slide content definition (text, shapes, images, layout)
- (Optional) content-extra.py: Custom Python for complex slide drawings
Content Directory Structure
All slide content lives under the working directory's
content/ folder:
    content/
    ├── global/
    │   ├── style.yaml          # Dimensions, defaults, template config, and theme metadata
    │   └── voice-guide.md      # Voice and tone guidelines
    ├── slide-001/
    │   ├── content.yaml        # Slide 1 content and layout
    │   └── images/             # Slide-specific images
    │       ├── background.png
    │       └── background.yaml # Image metadata sidecar
    ├── slide-002/
    │   ├── content.yaml        # Slide 2 content and layout
    │   ├── content-extra.py    # Custom Python for complex drawings
    │   └── images/
    │       └── screenshot.png
    ├── slide-003/
    │   ├── content.yaml
    │   └── images/
    │       ├── diagram.png
    │       └── diagram.yaml
    └── ...
Global Style Definition (style.yaml)
The global
style.yaml defines dimensions, template configuration, layout mappings, metadata, and defaults. Color and font choices are specified per-element in each slide's content.yaml rather than centralized in the style file.
See the style.yaml template for the full template, field reference, and usage instructions.
Per-Slide Content Definition (content.yaml)
Each slide's
content.yaml defines layout, text, shapes, and positioning. All position and size values are in inches. Color values use #RRGGBB hex format or @theme_name references.
See the content.yaml template for the full template, supported element types, supported shape types, and usage instructions.
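The two value conventions above (inches for geometry, `#RRGGBB` or `@theme_name` for color) can be illustrated with a small sketch. The helper names `parse_color` and `inches_to_emu` and the theme name `accent_1` are hypothetical and are not taken from the skill's scripts; only the 914400 EMU-per-inch constant is a fixed property of the PowerPoint file format.

```python
import re

HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")
EMU_PER_INCH = 914400  # English Metric Units per inch, as stored in PPTX files


def parse_color(value: str):
    """Classify a color string as an RGB triple or a theme reference.

    Hypothetical helper: the skill's real resolver may behave differently.
    """
    if HEX_COLOR.fullmatch(value):
        # "#1A2B3C" -> (26, 43, 60)
        rgb = tuple(int(value[i:i + 2], 16) for i in (1, 3, 5))
        return ("rgb", rgb)
    if value.startswith("@") and len(value) > 1:
        # "@accent_1" -> theme name without the "@" prefix
        return ("theme", value[1:])
    raise ValueError(f"Unsupported color value: {value!r}")


def inches_to_emu(inches: float) -> int:
    """Convert a position or size in inches to integer EMU."""
    return round(inches * EMU_PER_INCH)
```

Rejecting anything that is neither form early keeps malformed colors from reaching the rendering step.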
Complex Drawings (content-extra.py)
When a slide requires complex drawings that cannot be expressed through
content.yaml element definitions, create a content-extra.py file in the slide folder. The render() function signature is fixed. The build script calls it after placing standard content.yaml elements.
See the content-extra.py template for the full template, function parameters, and usage guidelines.
Security Validation
Before executing a
content-extra.py file, the build script performs AST-based static analysis to reject dangerous code. Validation runs automatically unless the --allow-scripts flag is passed.
Allowed imports:
- pptx and all pptx.* submodules
- Safe standard-library modules (e.g., math, copy, json, re, pathlib, collections, itertools, functools, typing, enum, dataclasses, decimal, fractions, string, textwrap)
Blocked imports:
- subprocess, os, shutil, socket, ctypes, signal, multiprocessing, threading, http, urllib, ftplib, smtplib, imaplib, poplib, xmlrpc, webbrowser, code, codeop, compileall, py_compile, zipimport, pkgutil, runpy, ensurepip, venv, sqlite3, tempfile, shelve, dbm, pickle, marshal, importlib, sys, telnetlib
- Any third-party package not on the allowlist
Blocked builtins:
- Dangerous: eval, exec, __import__, compile, breakpoint
- Indirect bypass: getattr, setattr, delattr, globals, locals, vars
Runtime namespace restriction:
Even after AST validation passes, the executed module runs in a restricted namespace where
__builtins__ is limited to safe builtins only. The dangerous and indirect-bypass builtins listed above are removed from the module namespace before execution (__import__ is kept because the import machinery requires it; the AST checker blocks direct __import__() calls).
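A minimal sketch of the namespace restriction described above, assuming the simplest possible approach: copy the standard builtins, drop the blocked names, and pass the result as `__builtins__` to `exec`. The function names are hypothetical and the build script's actual implementation may differ. This is an illustration of the idea, not a hardened sandbox.

```python
import builtins

# Names from the "Blocked builtins" list; __import__ is deliberately kept
# because the import statement needs it.
REMOVED_BUILTINS = {
    "eval", "exec", "compile", "breakpoint",
    "getattr", "setattr", "delattr", "globals", "locals", "vars",
}


def restricted_builtins() -> dict:
    """Return a copy of the builtins mapping without the blocked names."""
    return {
        name: value
        for name, value in vars(builtins).items()
        if name not in REMOVED_BUILTINS
    }


def run_restricted(source: str, filename: str = "content-extra.py") -> dict:
    """Execute source in a fresh namespace that only sees the safe builtins."""
    namespace = {"__builtins__": restricted_builtins()}
    exec(compile(source, filename, "exec"), namespace)
    return namespace
```

Inside the executed module, a call such as `eval("1 + 1")` then fails with `NameError`, while `import math` still works.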
--allow-scripts flag:
Pass
--allow-scripts to skip AST validation and namespace restriction for trusted content. This flag is required when a content-extra.py script legitimately needs blocked imports or builtins.
    python scripts/build_deck.py \
      --content-dir content/ \
      --style content/global/style.yaml \
      --output slide-deck/presentation.pptx \
      --allow-scripts
When validation fails, the build raises
ContentExtraError with a message identifying the violation and file path.
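The AST-based check in this section can be sketched roughly as follows. This is an assumption-laden illustration: the allowlist is abbreviated, `check_source` is a hypothetical name, and the `ContentExtraError` class here is a local stand-in for the build script's own exception.

```python
import ast

# Abbreviated for illustration; the full lists appear earlier in this section.
ALLOWED_ROOTS = {"pptx", "math", "copy", "json", "re", "pathlib"}
BLOCKED_CALLS = {"eval", "exec", "compile", "__import__", "breakpoint",
                 "getattr", "setattr", "delattr", "globals", "locals", "vars"}


class ContentExtraError(Exception):
    """Local stand-in for the error the build raises on a violation."""


def check_source(source: str, path: str = "content-extra.py") -> None:
    """Raise ContentExtraError on a disallowed import or builtin call."""
    tree = ast.parse(source, filename=path)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have module=None and are rejected below.
            modules = [node.module or ""]
        else:
            modules = []
        for module in modules:
            if module.split(".")[0] not in ALLOWED_ROOTS:
                raise ContentExtraError(
                    f"{path}: import of '{module}' is not allowed")
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in BLOCKED_CALLS):
            raise ContentExtraError(
                f"{path}: call to '{node.func.id}()' is not allowed")
```

Because the check works on the parsed tree rather than on text, a blocked name inside a string or comment does not trigger a false positive.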
Script Reference
All operations are available through the PowerShell orchestrator (
Invoke-PptxPipeline.ps1) or directly via the Python scripts. The PowerShell script manages the Python virtual environment and dependency installation automatically via uv sync.
Build a Slide Deck
    ./scripts/Invoke-PptxPipeline.ps1 -Action Build `
      -ContentDir content/ `
      -StylePath content/global/style.yaml `
      -OutputPath slide-deck/presentation.pptx

    python scripts/build_deck.py \
      --content-dir content/ \
      --style content/global/style.yaml \
      --output slide-deck/presentation.pptx
Reads all
content/slide-*/content.yaml files in numeric order and generates the complete deck. Executes content-extra.py files when present.
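Numeric ordering matters because a plain string sort would place `slide-10` before `slide-2` if folder names were not zero-padded. The sketch below shows one way to discover slide folders in numeric order; `ordered_slide_files` is a hypothetical helper, not the build script's actual function.

```python
import re
from pathlib import Path

SLIDE_DIR = re.compile(r"slide-(\d+)")


def ordered_slide_files(content_dir: Path) -> list:
    """Return (slide_number, content.yaml path) pairs sorted by number."""
    found = []
    for child in content_dir.iterdir():
        match = SLIDE_DIR.fullmatch(child.name)
        content = child / "content.yaml"
        # Skip global/ and any slide folder without a content.yaml file.
        if match and content.is_file():
            found.append((int(match.group(1)), content))
    return sorted(found, key=lambda pair: pair[0])
```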
Build from a Template
[!WARNING] --template creates a NEW presentation inheriting only slide masters, layouts, and theme from the template. All existing slides are discarded. Use --source for partial rebuilds.
    ./scripts/Invoke-PptxPipeline.ps1 -Action Build `
      -ContentDir content/ `
      -StylePath content/global/style.yaml `
      -OutputPath slide-deck/presentation.pptx `
      -TemplatePath corporate-template.pptx

    python scripts/build_deck.py \
      --content-dir content/ \
      --style content/global/style.yaml \
      --output slide-deck/presentation.pptx \
      --template corporate-template.pptx
Loads slide masters and layouts from the template PPTX. Layout names in each slide's
content.yaml resolve against the template's layouts, with optional name mapping via the layouts section in style.yaml. Populate themed layout placeholders using the placeholders section in content YAML.
Update Specific Slides
[!IMPORTANT] Use --source (not --template) for partial rebuilds. Combining --template and --source is not supported.
    ./scripts/Invoke-PptxPipeline.ps1 -Action Build `
      -ContentDir content/ `
      -StylePath content/global/style.yaml `
      -OutputPath slide-deck/presentation.pptx `
      -SourcePath slide-deck/presentation.pptx `
      -Slides "3,7,15"

    python scripts/build_deck.py \
      --content-dir content/ \
      --style content/global/style.yaml \
      --source slide-deck/presentation.pptx \
      --output slide-deck/presentation.pptx \
      --slides 3,7,15
Opens the existing deck, clears shapes on the specified slides, rebuilds them in-place from their
content.yaml, and saves. All other slides remain untouched. After building, verify the output slide count matches the original deck.
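As an illustration of how a `--slides 3,7,15` value and the post-build slide count check might be handled, here is a short sketch. Both function names are hypothetical; the skill's shared utilities may parse the filter differently (for example, they may accept forms not shown here).

```python
def parse_slide_filter(spec: str) -> list:
    """Turn a comma-separated string such as "3,7,15" into sorted slide numbers."""
    numbers = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        number = int(part)  # raises ValueError on non-numeric input
        if number < 1:
            raise ValueError(f"Slide numbers start at 1, got {number}")
        numbers.add(number)
    return sorted(numbers)


def check_slide_count(before: int, after: int) -> None:
    """Fail loudly when a partial rebuild changed the number of slides."""
    if before != after:
        raise RuntimeError(
            f"Slide count changed from {before} to {after} during partial rebuild")
```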
Extract Content from Existing PPTX
    ./scripts/Invoke-PptxPipeline.ps1 -Action Extract `
      -InputPath existing-deck.pptx `
      -OutputDir content/

    python scripts/extract_content.py \
      --input existing-deck.pptx \
      --output-dir content/
Extracts text, shapes, images, and styling from an existing PPTX into the
content/ folder structure. Creates content.yaml files for each slide and populates the global/style.yaml from detected patterns.
Extract Specific Slides
    ./scripts/Invoke-PptxPipeline.ps1 -Action Extract `
      -InputPath existing-deck.pptx `
      -OutputDir content/ `
      -Slides "3,7,15"

    python scripts/extract_content.py \
      --input existing-deck.pptx \
      --output-dir content/ \
      --slides 3,7,15
Extracts only the specified slides (plus the global style). Useful for targeted updates on large decks.
Extraction Limitations
- Picture shapes that reference external (linked) images instead of embedded blobs are recorded with path: LINKED_IMAGE_NOT_EMBEDDED. The script does not crash, but the image must be re-embedded manually.
- When text elements inherit font, size, or color from the slide master or layout, the extraction records no inline styling. Content YAML for these elements needs explicit font properties added before rebuild.
- The detect_global_style() function uses frequency analysis across all slides. For decks with mixed styling, review and adjust style.yaml values manually after extraction.
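To make the frequency-analysis limitation concrete, the sketch below picks the most common value from a list of observed font names and reports how dominant it is. It is an assumed illustration of the general technique, not the body of `detect_global_style()`; `most_common_value` and the sample font names are hypothetical.

```python
from collections import Counter


def most_common_value(values):
    """Return (value, share) for the most frequent non-None entry.

    share is the fraction of non-None observations that use the value.
    """
    counts = Counter(v for v in values if v is not None)
    if not counts:
        return None, 0.0
    value, count = counts.most_common(1)[0]
    return value, count / sum(counts.values())
```

A low share (for example, a winning font that covers only a third of the text runs) is a sign that the deck has mixed styling and that the detected default deserves a manual review.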
Validate a Deck
    ./scripts/Invoke-PptxPipeline.ps1 -Action Validate `
      -InputPath slide-deck/presentation.pptx `
      -ContentDir content/
The Validate action runs a two- or three-step pipeline:
- Export: Clears stale slide images from the output directory, then renders slides to JPG images via LibreOffice (PPTX → PDF → JPG). When -Slides is used, output images are named to match original slide numbers (e.g., slide-023.jpg for slide 23), not sequential PDF page numbers.
- PPTX validation: Checks PPTX-only properties (validate_deck.py) for speaker notes and slide count.
- Vision validation (optional): Sends slide images to a vision-capable model via the Copilot SDK (validate_slides.py) for visual quality checks. Runs when -ValidationPrompt or -ValidationPromptFile is provided.
For validation criteria (element positioning, visual quality, color contrast, content completeness), see
pptx.instructions.md Validation Criteria.
Built-in System Message
The
validate_slides.py script includes a built-in system message that focuses on issue detection only (not full slide description). It checks overlapping elements, text overflow/cutoff, decorative line mismatch after title wraps, citation/footer collisions, tight spacing, uneven gaps, insufficient edge margins, alignment inconsistencies, low contrast, narrow text boxes, and leftover placeholders. For dense slides, near-edge placement or tight boundaries are acceptable when readability is not materially affected. The -ValidationPrompt parameter provides supplementary user-level context and does not need to repeat these checks.
Validate with Vision Checks
    ./scripts/Invoke-PptxPipeline.ps1 -Action Validate `
      -InputPath slide-deck/presentation.pptx `
      -ContentDir content/ `
      -ValidationPrompt "Validate visual quality. Focus on recently modified slides for content accuracy." `
      -ValidationModel claude-haiku-4.5
Vision validation results are written to
validation-results.json in the image output directory, containing raw model responses per slide with quality findings. Per-slide response text is also written to slide-NNN-validation.txt files next to each slide image.
Validate Specific Slides
    ./scripts/Invoke-PptxPipeline.ps1 -Action Validate `
      -InputPath slide-deck/presentation.pptx `
      -ContentDir content/ `
      -Slides "3,7,15"
Validates only the specified slides. When content directories cover fewer slides than the PPTX, the slide count check reports an informational note rather than an error.
validate_slides.py CLI Reference
| Flag | Required | Default | Description |
|---|---|---|---|
| | Yes | - | Directory containing images |
| | One of / | - | Validation prompt text |
| | One of / | - | Path to file containing the validation prompt |
| | No | | Vision model ID |
| | No | stdout | JSON results file path |
| | No | all | Comma-separated slide numbers to validate |
| | No | - | Enable debug-level logging |
validate_deck.py CLI Reference
| Flag | Required | Default | Description |
|---|---|---|---|
| | Yes | - | Input PPTX file path |
| | No | - | Content directory for slide count comparison |
| | No | all | Comma-separated slide numbers to validate |
| | No | stdout | JSON results file path |
| | No | - | Markdown report file path |
| | No | - | Directory for per-slide JSON files () |
Validation Outputs
When run through the pipeline, validation produces these files in the image output directory:
| File | Format | Content |
|---|---|---|
| | JSON | Per-slide PPTX property issues (speaker notes, slide count) |
| | Markdown | Human-readable report for PPTX property validation |
| validation-results.json | JSON | Consolidated vision model responses with quality findings |
| slide-NNN-validation.txt | Text | Per-slide vision response text (next to slide-NNN.jpg) |
| | JSON | Per-slide PPTX property validation result (next to slide-NNN.jpg) |
Per-slide vision text files are written alongside their corresponding
slide-NNN.jpg images, enabling agents to read validation findings for individual slides without parsing the consolidated JSON file.
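An agent or script could collect those per-slide text files along the following lines. This is a sketch that relies only on the `slide-NNN-validation.txt` naming stated above; the function name `read_slide_findings` is hypothetical.

```python
import re
from pathlib import Path

VALIDATION_FILE = re.compile(r"slide-(\d+)-validation\.txt")


def read_slide_findings(image_dir: Path) -> dict:
    """Map slide number to the text of its slide-NNN-validation.txt file."""
    findings = {}
    for path in image_dir.iterdir():
        match = VALIDATION_FILE.fullmatch(path.name)
        if match:
            findings[int(match.group(1))] = path.read_text(encoding="utf-8").strip()
    # Return entries in slide order for predictable reporting.
    return dict(sorted(findings.items()))
```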
Validation Scope for Changed Slides
When validating after modifying or adding specific slides, always validate a block that includes one slide before and one slide after the changed or added slides. This catches edge-proximity issues, transition inconsistencies, and spacing problems that arise between adjacent slides.
For example, when slides 5 and 6 were changed, validate slides 4 through 7:
    ./scripts/Invoke-PptxPipeline.ps1 -Action Validate `
      -InputPath slide-deck/presentation.pptx `
      -ContentDir content/ `
      -Slides "4,5,6,7" `
      -ValidationPrompt "Check for text overlay, overflow, margin issues, color contrast"
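The one-before, one-after rule can be computed mechanically. The helper below is a hypothetical sketch (not part of the skill) that expands a list of changed slides into the block to pass to `-Slides`, clamped to the deck's slide count.

```python
def validation_block(changed, slide_count: int) -> list:
    """Expand changed slide numbers to include one neighbor on each side."""
    selected = set()
    for number in changed:
        for neighbor in (number - 1, number, number + 1):
            # Clamp to valid 1-based slide numbers.
            if 1 <= neighbor <= slide_count:
                selected.add(neighbor)
    return sorted(selected)


# Slides 5 and 6 changed in a 20-slide deck: validate 4 through 7.
slides_arg = ",".join(str(n) for n in validation_block([5, 6], 20))
```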
Export Slides to Images
    ./scripts/Invoke-PptxPipeline.ps1 -Action Export `
      -InputPath slide-deck/presentation.pptx `
      -ImageOutputDir slide-deck/validation/ `
      -Slides "1,3,5" `
      -Resolution 150

    # Step 1: PPTX to PDF
    python scripts/export_slides.py \
      --input slide-deck/presentation.pptx \
      --output slide-deck/validation/slides.pdf \
      --slides 1,3,5

    # Step 2: PDF to JPG (pdftoppm from poppler)
    pdftoppm -jpeg -r 150 slide-deck/validation/slides.pdf slide-deck/validation/slide
Converts specified slides to JPG images for visual inspection. The PowerShell orchestrator handles both steps automatically, clears stale images before exporting, names output images to match original slide numbers when
-Slides is used, and uses a PyMuPDF fallback when pdftoppm is not installed.
When running the two-step process manually (outside the pipeline), note that
render_pdf_images.py uses sequential numbering by default. Pass --slide-numbers to map output images to original slide positions:
    python scripts/render_pdf_images.py \
      --input slide-deck/validation/slides.pdf \
      --output-dir slide-deck/validation/ \
      --dpi 150 \
      --slide-numbers 1,3,5
Dependencies: Requires LibreOffice for PPTX-to-PDF conversion and either
pdftoppm (from poppler) or pymupdf (pip) for PDF-to-JPG rendering.
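The difference between sequential numbering and slide-number-based naming can be shown with a short sketch. The `slide-NNN.jpg` pattern comes from the text above; the function name `image_names` is hypothetical, and the assumption that sequential output uses the same zero-padded pattern is ours.

```python
from typing import List, Optional


def image_names(page_count: int,
                slide_numbers: Optional[List[int]] = None) -> List[str]:
    """Name one image per PDF page, optionally mapped to original slide numbers."""
    if slide_numbers is None:
        # Sequential default: page 1 -> slide-001.jpg, page 2 -> slide-002.jpg, ...
        slide_numbers = list(range(1, page_count + 1))
    if len(slide_numbers) != page_count:
        raise ValueError(
            f"Got {len(slide_numbers)} slide numbers for {page_count} PDF pages")
    return [f"slide-{number:03d}.jpg" for number in slide_numbers]
```

With a three-page PDF exported from slides 1, 3, and 5, the mapped form yields `slide-001.jpg`, `slide-003.jpg`, and `slide-005.jpg`, so validation findings line up with the original deck.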
Script Architecture
The build and extraction scripts use shared modules in the
scripts/ directory:
| Module | Purpose |
|---|---|
| | Shared utilities: exit codes, logging configuration, slide filter parsing, unit conversion (), YAML loading |
| | Color resolution (, , dict with brightness), theme color map (16 entries) |
| | Font resolution, family normalization, weight suffix handling, alignment mapping |
| | Shape constant map (29 entries + circle alias), auto-shape name mapping, rotation utilities |
| | Solid, gradient, and pattern fill application/extraction; line/border styling with dash styles |
| | Text frame properties (margins, auto-size, vertical anchor), paragraph properties (spacing, level), run properties (underline, hyperlink) |
| | Table element creation and extraction with cell merging, banding, and per-cell styling |
| | Chart element creation and extraction for 12 chart types (column, bar, line, pie, scatter, bubble, etc.) |
| validate_deck.py | PPTX-only validation for speaker notes and slide count |
| validate_slides.py | Vision-based slide issue detection and quality validation via Copilot SDK with built-in checks and plain-text per-slide output |
| render_pdf_images.py | PDF-to-JPG rendering via PyMuPDF with optional slide-number-based naming |
python-pptx Constraints
- python-pptx does NOT support SVG images. Always convert to PNG via cairosvg or Pillow.
- python-pptx cannot create new slide masters or layouts programmatically. Use blank layouts or start from a template PPTX with the --template argument.
- Transitions and animations are preserved when opening and saving existing files, but cannot be created or modified via the API.
- When extracting content, slide master and layout inheritance means many text elements have no inline styling. Add explicit font properties in content YAML before rebuilding.
- The Export and Validate actions require LibreOffice for PPTX-to-PDF conversion. The PowerShell orchestrator checks for LibreOffice availability before starting and provides platform-specific install instructions if missing.
- Accessing background.fill on slides with inherited backgrounds replaces them with NoFill. Check slide.follow_master_background before accessing the fill property.
- Gradient fills use the python-pptx GradientFill API with GradientStop objects. Each stop specifies a position (0–100) and a color.
- Theme colors resolve via MSO_THEME_COLOR enum. Brightness adjustments apply through the color format's brightness property.
- Template-based builds load layouts by name or index. Layout name resolution falls back to index 6 (blank) when no match is found.
Troubleshooting
| Issue | Cause | Solution |
|---|---|---|
| SVG runtime error | python-pptx cannot embed SVG | Convert to PNG via cairosvg or Pillow before adding |
| Text overlay between elements | Insufficient vertical spacing | Follow element positioning conventions in pptx.instructions.md |
| Width overflow off-slide | Element extends beyond slide boundary | Follow element positioning conventions in pptx.instructions.md |
| Bright accent color unreadable as fill | White text on bright background | Darken accent to ~60% saturation for box fills |
| Background fill replaced with NoFill | Accessed background.fill on inherited background | Check slide.follow_master_background before accessing |
| Missing speaker notes | Notes not specified in | Add field to every content slide |
| LibreOffice not found during Validate | Validate exports slides to images first | Install LibreOffice: brew install --cask libreoffice (macOS) |
| uv not found | uv package manager not installed | Install uv (see Installing uv above) |
| Python not found by uv | No Python 3.11+ on PATH | Install via uv python install 3.11 |
| fails | Missing or corrupt | Delete at the skill root and re-run |
| Import errors in scripts | Dependencies not installed or stale venv | Run uv sync from the skill root to recreate the environment |
Environment Recovery
When scripts fail due to missing modules, import errors, or a corrupt virtual environment, recover with:
    cd .github/skills/experimental/powerpoint
    rm -rf .venv
    uv sync
This recreates the virtual environment from scratch using
pyproject.toml as the single source of truth. The Invoke-PptxPipeline.ps1 orchestrator runs uv sync automatically on each invocation unless -SkipVenvSetup is passed.
When
uv itself is not available, install it first (see Installing uv above), then retry. When Python 3.11+ is not available, run uv python install 3.11 to have uv fetch and manage the interpreter.
🤖 Crafted with precision by ✨Copilot following brilliant human instruction, then carefully refined by our team of discerning human reviewers.