git clone https://github.com/ComeOnOliver/skillshub
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/luwill/research-skills/paper-slide-deck" ~/.claude/skills/comeonoliver-skillshub-paper-slide-deck && rm -rf "$T"
Paper Slide Deck Generator
Transform academic papers and content into professional slide deck images with automatic figure extraction.
Usage
    /paper-slide-deck path/to/paper.pdf
    /paper-slide-deck path/to/paper.pdf --style academic-paper
    /paper-slide-deck path/to/content.md --style sketch-notes
    /paper-slide-deck path/to/content.md --audience executives
    /paper-slide-deck path/to/content.md --lang zh
    /paper-slide-deck path/to/content.md --slides 10
    /paper-slide-deck path/to/content.md --outline-only
    /paper-slide-deck  # Then paste content
Script Directory
Important: All scripts are located in the scripts/ subdirectory of this skill.
Agent Execution Instructions:
- Determine this SKILL.md file's directory path as SKILL_DIR
- Script path = ${SKILL_DIR}/scripts/<script-name>.ts
- Replace all ${SKILL_DIR} in this document with the actual path
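The path rule above can be sketched in Python as follows. The helper name script_path and the example install location are illustrative assumptions, not part of the skill itself.

```python
from pathlib import Path


def script_path(skill_md: str, script_name: str) -> Path:
    """Return ${SKILL_DIR}/scripts/<script_name> for a given SKILL.md path.

    Hypothetical helper: SKILL_DIR is taken to be the directory that
    contains SKILL.md, as the instructions describe.
    """
    skill_dir = Path(skill_md).parent
    return skill_dir / "scripts" / script_name


if __name__ == "__main__":
    # Example location only; use wherever the skill is actually installed.
    print(script_path("/opt/skills/paper-slide-deck/SKILL.md", "detect-figures.ts"))
```

Resolving the directory once and joining names onto it keeps every later command free of hard-coded absolute paths.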
Script Reference:
| Script | Purpose |
|---|---|
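| scripts/generate-slides.py | Generate AI slides via Gemini API (Python) |
| scripts/merge-to-pptx.ts | Merge slides into PowerPoint |
| scripts/merge-to-pdf.ts | Merge slides into PDF |
| scripts/detect-figures.ts | Auto-detect figures/tables in PDF |
| scripts/extract-figure.ts | Extract figure from PDF page (uses PyMuPDF fallback) |
| scripts/apply-template.ts | Apply figure container template |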
Options
| Option | Description |
|---|---|
| --style | Visual style (see Style Gallery) |
| --audience | Target audience: beginners, intermediate, experts, executives, general |
| --lang | Output language (en, zh, ja, etc.) |
| --slides | Target slide count |
| --outline-only | Generate outline only, skip image generation |
Style Gallery
| Style | Description | Best For |
|---|---|---|
| Clean professional, precise charts | Conference talks, thesis defense |
(Default) | Technical schematics, grid texture | Architecture, system design |
| Black chalkboard, colorful chalk | Education, tutorials, classroom |
| SaaS dashboard, card-based layouts | Product demos, SaaS, B2B |
| Magazine cover, bold typography, dark | Product launches, keynotes |
| Navy/gold, structured layouts | Investor decks, proposals |
| Cinematic dark mode, glowing accents | Entertainment, gaming |
| Magazine explainers, flat illustrations | Tech explainers, research |
| Ghibli/Disney style, hand-drawn | Educational, storytelling |
| Technical briefing, bilingual labels | Technical docs, academic |
| Ultra-clean, maximum whitespace | Executive briefings, premium |
| Retro 8-bit, chunky pixels | Gaming, developer talks |
| Academic diagrams, precise labeling | Biology, chemistry, medical |
| Hand-drawn, warm & friendly | Educational, tutorials |
| Flat vector, retro & cute | Creative, children's content |
| Aged-paper, historical styling | Historical, heritage, biography |
| Hand-painted textures, natural warmth | Lifestyle, wellness, travel |
Auto Style Selection
| Content Signals | Selected Style |
|---|---|
| paper, thesis, defense, conference, ieee, acm, icml, neurips, cvpr, acl, aaai, iclr | |
| tutorial, learn, education, guide, intro, beginner | |
| classroom, teaching, school, chalkboard, blackboard | |
| architecture, system, data, analysis, technical | |
| creative, children, kids, cute, illustration | |
| briefing, bilingual, infographic, concept | |
| executive, minimal, clean, simple, elegant | |
| saas, product, dashboard, metrics, productivity | |
| investor, quarterly, business, corporate, proposal | |
| launch, marketing, keynote, bold, impact, magazine | |
| entertainment, music, gaming, creative, atmospheric | |
| explainer, journalism, science communication | |
| story, fantasy, animation, magical, whimsical | |
| gaming, retro, pixel, developer, nostalgia | |
| biology, chemistry, medical, pathway, scientific | |
| history, heritage, vintage, expedition, historical | |
| lifestyle, wellness, travel, artistic, natural | |
| Default | |
Layout Gallery
Optional layout hints for individual slides. Specify them in the outline's // LAYOUT section.
Slide-Specific Layouts
| Layout | Description | Best For |
|---|---|---|
| Large centered title + subtitle | Cover slides, section breaks |
| Featured quote with attribution | Testimonials, key insights |
| Single large number as focal point | Impact statistics, metrics |
| Half image, half text | Feature highlights, comparisons |
| Grid of icons with labels | Features, capabilities, benefits |
| Content in balanced columns | Paired information, dual points |
| Content in three columns | Triple comparisons, categories |
| Full-bleed image + text overlay | Visual storytelling, emotional |
| Numbered list with highlights | Session overview, roadmap |
| Structured bullet points | Simple content, lists |
Infographic-Derived Layouts
| Layout | Description | Best For |
|---|---|---|
| Sequential flow left-to-right | Timelines, step-by-step |
| Side-by-side A vs B | Before/after, pros-cons |
| Multi-factor grid | Feature comparisons |
| Pyramid or stacked levels | Priority, importance |
| Central node with radiating items | Concept maps, ecosystems |
| Varied-size tiles | Overview, summary |
| Narrowing stages | Conversion, filtering |
| Metrics with charts/numbers | KPIs, data display |
| Overlapping circles | Relationships, intersections |
| Continuous cycle | Recurring processes |
| Curved path with milestones | Journey, timeline |
| Parent-child hierarchy | Org charts, taxonomies |
| Visible vs hidden layers | Surface vs depth |
| Gap with connection | Problem-solution |
Academic-Specific Layouts
| Layout | Description | Best For |
|---|---|---|
| Title, authors, affiliations, venue | Conference paper cover |
| Numbered section list with highlights | Talk structure overview |
| Central architecture/pipeline diagram | Methods, system design |
| Chart area + data annotations | Quantitative results |
| Centered equation + variable definitions | Mathematical derivations |
| 2x2 or 3x2 image comparison grid | Visual results, ablations |
| Numbered citation list | Key references slide |
| Numbered contribution points | Contributions summary |
Usage: Add Layout: <name> in the slide's // LAYOUT section to guide visual composition.
Design Philosophy
This deck is designed for reading and sharing, not live presentation:
- Each slide must be self-explanatory without verbal commentary
- Structure content for logical flow when scrolling
- Include all necessary context within each slide
- Optimize for social media sharing and offline reading
File Management
Output Directory
Each session creates an independent directory named by content slug:
    slide-deck/{topic-slug}/
    ├── source-{slug}.{ext}      # Source files (text, images, etc.)
    ├── outline.md
    ├── outline-{style}.md       # Style variant outlines
    ├── prompts/
    │   └── 01-slide-cover.md, 02-slide-{slug}.md, ...
    ├── 01-slide-cover.png, 02-slide-{slug}.png, ...
    ├── {topic-slug}.pptx
    └── {topic-slug}.pdf
Slug Generation:
- Extract main topic from content (2-4 words, kebab-case)
- Example: "Introduction to Machine Learning" → intro-machine-learning
Conflict Resolution
If slide-deck/{topic-slug}/ already exists:
- Append timestamp: {topic-slug}-YYYYMMDD-HHMMSS
- Example: intro-ml exists → intro-ml-20260118-143052
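A minimal Python sketch of this conflict rule, assuming the deck root is passed in as a path. The function name resolve_output_dir and the injectable now argument are hypothetical, added only so the behaviour can be checked deterministically.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def resolve_output_dir(base: Path, topic_slug: str, now: Optional[datetime] = None) -> Path:
    """Pick base/{topic-slug}, appending -YYYYMMDD-HHMMSS if it already exists."""
    candidate = base / topic_slug
    if not candidate.exists():
        return candidate
    # Directory is taken: fall back to a timestamped sibling name.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return base / f"{topic_slug}-{stamp}"


if __name__ == "__main__":
    print(resolve_output_dir(Path("slide-deck"), "intro-ml"))
```

The sketch only computes the name; creating the directory is left to the caller.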
Source Files
Copy all sources with naming source-{slug}.{ext}:
- source-article.md (main text content)
- source-diagram.png (image from conversation)
- source-data.xlsx (additional file)
Multiple sources supported: text, images, files from conversation.
Workflow
Step 1: Analyze Content
- Save source content (if pasted, save as source.md)
- Follow references/analysis-framework.md for deep content analysis
- Determine style (use --style or auto-select from signals)
- Detect languages (source vs. user preference)
- Plan slide count (--slides or dynamic)
- For academic papers (PDF with figures): Run automatic figure detection:

      npx -y bun ${SKILL_DIR}/scripts/detect-figures.ts --pdf source-paper.pdf --output figures.json

  This outputs a JSON file with all detected figures/tables, their page numbers, and captions.
Step 2: Generate Outline Variants
- Generate 3 style variant outlines based on content analysis
- Follow references/outline-template.md for structure
- Auto-populate IMAGE_SOURCE for academic papers:
  - Read figures.json from Step 1
  - Map figures to slides using rules in references/analysis-framework.md Section 8
  - Automatically add // IMAGE_SOURCE blocks to appropriate slides:
    - Architecture/pipeline figures → Methods slides (Source: extract)
    - Results tables → Quantitative results slides (Source: extract)
    - Comparison images → Qualitative results slides (Source: extract)
    - Conceptual/simple diagrams → Leave for AI generation (Source: generate or omit)
- Save as outline-{style}.md for each variant
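The figure-to-slide mapping can be pictured with the rough Python sketch below. It is a stand-in, not the skill's logic: the real rules live in references/analysis-framework.md Section 8, which is not reproduced here. The "kind" and "caption" field names, the keyword lists, and the slide-type labels are all assumptions made for illustration. In particular, it treats every table as a results table.

```python
from typing import Dict, Tuple

# Hypothetical keyword rules; the actual mapping is defined in
# references/analysis-framework.md Section 8.
ARCHITECTURE_WORDS = ("architecture", "pipeline", "framework", "overview")
COMPARISON_WORDS = ("comparison", "qualitative", "ablation")


def plan_image_source(figure: Dict[str, object]) -> Tuple[str, str]:
    """Return (target slide type, Source value) for one detected figure.

    Assumes each figures.json entry has "kind" ("figure" or "table")
    and "caption" keys; the real schema may differ.
    """
    caption = str(figure.get("caption", "")).lower()
    if figure.get("kind") == "table":
        return ("quantitative-results", "extract")
    if any(word in caption for word in ARCHITECTURE_WORDS):
        return ("methods", "extract")
    if any(word in caption for word in COMPARISON_WORDS):
        return ("qualitative-results", "extract")
    # Conceptual or simple diagrams are left for AI generation.
    return ("any", "generate")
```

Anything the rules do not recognise falls through to Source: generate, which matches the default of leaving such slides to the image model.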
Step 3: User Confirmation
Single AskUserQuestion with all applicable options:
| Question | When to Ask |
|---|---|
| Style variant | Always (3 options + custom) |
| Language | Only if source ≠ user language |
After selection:
- Copy selected outline-{style}.md to outline.md
- Regenerate in different language if requested
- User may edit outline.md for fine-tuning
If --outline-only, stop here.
Step 4: Generate Prompts
- Read references/base-prompt.md
- Combine with style instructions from outline
- Add slide-specific content
- If Layout: is specified in the outline, include layout guidance in the prompt:
  - Reference layout characteristics for image composition
  - Example: Layout: hub-spoke → "Central concept in middle with related items radiating outward"
- Save to prompts/ directory
Step 5: Image Generation Method Selection
Before generating images, ask user to choose generation method:
Use AskUserQuestion with options:
| Option | Label | Description |
|---|---|---|
| 1 | Gemini API (Recommended) | Official Google API via Python. Requires GOOGLE_API_KEY env var. |
| 2 | Gemini Web (Browser-based) | ⚠️ Uses reverse-engineered web API. No API key needed but may break. |
Based on selection:
Option 1: Gemini API (Python)
- Verify API key: Check GOOGLE_API_KEY or GEMINI_API_KEY environment variable
- Run generation script:

      python ${SKILL_DIR}/scripts/generate-slides.py <slide-deck-dir> --model gemini-3-pro-image-preview

Script Features:
- Auto-installs google-genai package if missing
- Retry logic with exponential backoff (3 retries)
- Skips already-generated slides (> 10KB)
- Supports custom model via --model flag
- Outputs to slides/ subdirectory
Troubleshooting:
- If server disconnection errors occur, script auto-retries
- For persistent failures, re-run the script (it skips completed slides)
- Check API quota if many failures occur
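The skip-and-retry behaviour listed under Script Features might look roughly like the Python sketch below. It is not the code in generate-slides.py. The helper names, the 2-second base delay, and reading "10KB" as 10 * 1024 bytes are assumptions. Only the three retries and the size threshold come from the feature list.

```python
import os
import time
from typing import Callable, TypeVar

T = TypeVar("T")

MIN_SLIDE_BYTES = 10 * 1024  # "> 10KB" threshold; exact byte count is assumed


def already_generated(path: str, min_bytes: int = MIN_SLIDE_BYTES) -> bool:
    """True if a slide image exists and is larger than the size threshold."""
    return os.path.isfile(path) and os.path.getsize(path) > min_bytes


def with_backoff(
    call: Callable[[], T],
    retries: int = 3,
    base_delay: float = 2.0,  # assumed starting delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying up to `retries` times with doubling delays."""
    for attempt in range(retries + 1):
        try:
            return call()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

Because finished slides are detected by file size, re-running after a failure only repeats the slides that are missing or truncated.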
Option 2: Gemini Web Skill
- Consent Check: Read consent file at:
  - Windows: $APPDATA/baoyu-skills/gemini-web/consent.json
  - macOS: ~/Library/Application Support/baoyu-skills/gemini-web/consent.json
  - Linux: ~/.local/share/baoyu-skills/gemini-web/consent.json
- If no consent or version mismatch, display disclaimer and ask:

      ⚠️ DISCLAIMER: This uses a reverse-engineered Gemini Web API (NOT official).
      Risks: May break anytime, no support, possible account risk.

- For each slide, run:

      npx -y bun ${GEMINI_WEB_SKILL_DIR}/scripts/main.ts \
        --promptfiles prompts/01-slide-cover.md \
        --image 01-slide-cover.png \
        --sessionId slides-{topic-slug}-{timestamp}

  Where GEMINI_WEB_SKILL_DIR = path to baoyu-danger-gemini-web skill directory.
- Proxy support: If user is in restricted network, prepend:

      HTTP_PROXY=http://127.0.0.1:7890 HTTPS_PROXY=http://127.0.0.1:7890
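As an illustration of the consent check's path lookup, here is a small Python sketch that chooses the consent file location per platform. The function name consent_file and its injectable arguments are hypothetical. Treating every non-Windows, non-macOS platform as Linux is also an assumption.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def consent_file(
    platform: str = sys.platform,
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Path:
    """Return the expected consent.json location for the given platform."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    tail = Path("baoyu-skills") / "gemini-web" / "consent.json"
    if platform.startswith("win"):
        # Raises KeyError if APPDATA is not set.
        return Path(env["APPDATA"]) / tail
    if platform == "darwin":
        return home / "Library" / "Application Support" / tail
    # Assumption: everything else follows the Linux layout.
    return home / ".local" / "share" / tail


if __name__ == "__main__":
    print(consent_file())
```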
Step 5.5: Process IMAGE_SOURCE (Automatic Figure Extraction)
For academic presentations, IMAGE_SOURCE metadata was auto-populated in Step 2 based on figure detection from Step 1.
Automatic Execution:
- Parse outline to identify slides with Source: extract
- Create figures directory:

      mkdir -p figures

- For each extract slide, automatically:
  - Read the Figure number, Page, and Caption from metadata
  - Run figure extraction script:

        npx -y bun ${SKILL_DIR}/scripts/extract-figure.ts \
          --pdf source-paper.pdf \
          --page <page-number> \
          --output figures/figure-<N>.png

  - Run template application script:

        npx -y bun ${SKILL_DIR}/scripts/apply-template.ts \
          --figure figures/figure-<N>.png \
          --title "<slide-headline>" \
          --caption "Figure <N>: <caption-text>" \
          --output <NN>-slide-<slug>.png

  - Report: "Extracted: Figure N → slide NN"
- For slides with Source: generate (or no IMAGE_SOURCE):
  - Proceed to Step 6 for AI generation
Note: Source PDF must be saved as source-paper.pdf in output directory.
Troubleshooting:
- If figure detection missed a figure: manually add // IMAGE_SOURCE block to outline
- If wrong figure mapped: edit the Figure: and Page: values in outline
- If extraction fails: check PDF page number (1-indexed)
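PyMuPDF Fallback for Page Extraction: If extract-figure.ts fails with an "Image or Canvas expected" error (common with complex PDFs), render the page with PyMuPDF instead:

    import os
    import fitz  # PyMuPDF

    page_num = 3  # 1-indexed page number from the outline (example value)
    os.makedirs("extracted", exist_ok=True)  # pix.save() does not create directories
    doc = fitz.open("source-paper.pdf")
    page = doc[page_num - 1]  # PyMuPDF pages are 0-indexed
    mat = fitz.Matrix(3, 3)  # 3x scale for high resolution
    pix = page.get_pixmap(matrix=mat)
    pix.save(f"extracted/page-{page_num}.png")

Then apply the template using apply-template.ts.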
Step 6: Generate Images
- Use selected method from Step 5
- Skip slides already processed in Step 5.5 (those with Source: extract)
- Generate session ID: slides-{topic-slug}-{timestamp}
- Generate each remaining slide with same session ID
- Report progress: "Generated X/N"
- Auto-retry once on generation failure
Step 7: Merge to PPTX and PDF
    npx -y bun ${SKILL_DIR}/scripts/merge-to-pptx.ts <slide-deck-dir>
    npx -y bun ${SKILL_DIR}/scripts/merge-to-pdf.ts <slide-deck-dir>
Step 8: Output Summary
    Slide Deck Complete!

    Topic: [topic]
    Style: [style name]
    Location: [directory path]
    Slides: N total

    - 01-slide-cover.png ✓ Cover
    - 02-slide-intro.png ✓ Content
    - ...
    - {NN}-slide-back-cover.png ✓ Back Cover

    Outline: outline.md
    PPTX: {topic-slug}.pptx
    PDF: {topic-slug}.pdf
Slide Modification
See references/modification-guide.md for:
- Edit single slide workflow
- Add new slide (with renumbering)
- Delete slide (with renumbering)
- File naming conventions
Image Generation Dependencies
Gemini API (Option 1 - Recommended)
Requires:
- GOOGLE_API_KEY or GEMINI_API_KEY environment variable
- Python 3.8+ with pip
- google-genai package (auto-installed by script)
Model: gemini-3-pro-image-preview (default)
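A pre-flight check for the key requirement could be sketched in Python as below. The function name find_api_key is hypothetical. Checking GOOGLE_API_KEY before GEMINI_API_KEY is an assumption, since the text does not say which takes precedence when both are set.

```python
import os
from typing import Mapping, Optional


def find_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-empty key among GOOGLE_API_KEY and GEMINI_API_KEY."""
    env = os.environ if env is None else env
    for name in ("GOOGLE_API_KEY", "GEMINI_API_KEY"):  # assumed precedence
        value = env.get(name, "").strip()
        if value:
            return value
    return None


if __name__ == "__main__":
    if find_api_key() is None:
        print("Set GOOGLE_API_KEY or GEMINI_API_KEY before generating slides.")
```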
Gemini Web Skill (Option 2)
Requires:
- baoyu-danger-gemini-web skill installed at .claude/skills/baoyu-danger-gemini-web
- Google Chrome browser with logged-in Google account
- User consent for reverse-engineered API disclaimer
PDF Figure Extraction
Requires:
- Primary: pdfjs-dist npm package (use legacy build for Node.js)
- Fallback: pymupdf Python package (more reliable for complex PDFs)
- canvas npm package for apply-template.ts
References
| File | Content |
|---|---|
| Deep content analysis for presentations |
| Outline structure and STYLE_INSTRUCTIONS format |
| Edit, add, delete slide workflows |
| Content and style guidelines |
| Base prompt for image generation |
| Visual specs for extracted figure containers |
| Full style specifications |
Notes
Image Generation
- Nano Banana Pro API: Recommended. Stable, reliable, requires API key
- Gemini Web: No API key needed, but uses reverse-engineered API with account risk
- Generation time: 10-30 seconds per slide
- Auto-retry once on generation failure
- Maintain style consistency via session ID
Content Guidelines
- Use stylized alternatives for sensitive public figures
- Both methods use the same underlying Gemini model for image generation
Extension Support
Custom styles and configurations via EXTEND.md.
Check paths (priority order):
- .paper-skills/paper-slide-deck/EXTEND.md (project)
- ~/.paper-skills/paper-slide-deck/EXTEND.md (user)
If found, load before Step 1. Extension content overrides defaults.
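The priority lookup can be sketched in Python as follows, assuming "project" means the current working directory and "user" means the home directory. The function name find_extend_file is hypothetical.

```python
from pathlib import Path
from typing import Optional

EXTEND_RELATIVE = Path(".paper-skills") / "paper-slide-deck" / "EXTEND.md"


def find_extend_file(project_dir: Path, home_dir: Path) -> Optional[Path]:
    """Return the highest-priority EXTEND.md that exists, or None."""
    for root in (project_dir, home_dir):  # project first, then user
        candidate = root / EXTEND_RELATIVE
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_extend_file(Path.cwd(), Path.home()))
```

Returning the first match means a project-level file fully shadows the user-level one in this sketch. Whether the skill merges the two is not stated.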