Paper Slide Deck Generator

install
source · Clone the upstream repo
git clone https://github.com/ComeOnOliver/skillshub
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/luwill/research-skills/paper-slide-deck" ~/.claude/skills/comeonoliver-skillshub-paper-slide-deck && rm -rf "$T"
manifest: skills/luwill/research-skills/paper-slide-deck/SKILL.md

Transform academic papers and content into professional slide deck images with automatic figure extraction.

Usage

/paper-slide-deck path/to/paper.pdf
/paper-slide-deck path/to/paper.pdf --style academic-paper
/paper-slide-deck path/to/content.md --style sketch-notes
/paper-slide-deck path/to/content.md --audience executives
/paper-slide-deck path/to/content.md --lang zh
/paper-slide-deck path/to/content.md --slides 10
/paper-slide-deck path/to/content.md --outline-only
/paper-slide-deck  # Then paste content

Script Directory

Important: All scripts are located in the scripts/ subdirectory of this skill.

Agent Execution Instructions:

  1. Determine this SKILL.md file's directory path as SKILL_DIR
  2. Script path = ${SKILL_DIR}/scripts/<script-name> (the extension is .ts for the Bun scripts and .py for generate-slides.py)
  3. Replace every ${SKILL_DIR} in this document with the actual path
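
The path resolution above can be sketched in a few lines of Python. This is only an illustration: the helper names `resolve_script` and `expand_skill_dir` are hypothetical and are not part of the skill.

```python
from pathlib import Path


def resolve_script(skill_md: str, script_name: str) -> Path:
    """Return <SKILL_DIR>/scripts/<script_name>, where SKILL_DIR is the folder holding SKILL.md."""
    skill_dir = Path(skill_md).resolve().parent
    return skill_dir / "scripts" / script_name


def expand_skill_dir(text: str, skill_dir: Path) -> str:
    """Replace every ${SKILL_DIR} placeholder in a command line with the actual path."""
    return text.replace("${SKILL_DIR}", str(skill_dir))
```

For example, `expand_skill_dir("npx -y bun ${SKILL_DIR}/scripts/detect-figures.ts", skill_dir)` yields a command that can be run as-is.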

Script Reference:

| Script | Purpose |
|---|---|
| scripts/generate-slides.py | Generate AI slides via Gemini API (Python) |
| scripts/merge-to-pptx.ts | Merge slides into PowerPoint |
| scripts/merge-to-pdf.ts | Merge slides into PDF |
| scripts/detect-figures.ts | Auto-detect figures/tables in PDF |
| scripts/extract-figure.ts | Extract figure from PDF page (uses PyMuPDF fallback) |
| scripts/apply-template.ts | Apply figure container template |

Options

| Option | Description |
|---|---|
| --style <name> | Visual style (see Style Gallery) |
| --audience <type> | Target audience: beginners, intermediate, experts, executives, general |
| --lang <code> | Output language (en, zh, ja, etc.) |
| --slides <number> | Target slide count |
| --outline-only | Generate outline only, skip image generation |

Style Gallery

| Style | Description | Best For |
|---|---|---|
| academic-paper | Clean professional, precise charts | Conference talks, thesis defense |
| blueprint (Default) | Technical schematics, grid texture | Architecture, system design |
| chalkboard | Black chalkboard, colorful chalk | Education, tutorials, classroom |
| notion | SaaS dashboard, card-based layouts | Product demos, SaaS, B2B |
| bold-editorial | Magazine cover, bold typography, dark | Product launches, keynotes |
| corporate | Navy/gold, structured layouts | Investor decks, proposals |
| dark-atmospheric | Cinematic dark mode, glowing accents | Entertainment, gaming |
| editorial-infographic | Magazine explainers, flat illustrations | Tech explainers, research |
| fantasy-animation | Ghibli/Disney style, hand-drawn | Educational, storytelling |
| intuition-machine | Technical briefing, bilingual labels | Technical docs, academic |
| minimal | Ultra-clean, maximum whitespace | Executive briefings, premium |
| pixel-art | Retro 8-bit, chunky pixels | Gaming, developer talks |
| scientific | Academic diagrams, precise labeling | Biology, chemistry, medical |
| sketch-notes | Hand-drawn, warm & friendly | Educational, tutorials |
| vector-illustration | Flat vector, retro & cute | Creative, children's content |
| vintage | Aged-paper, historical styling | Historical, heritage, biography |
| watercolor | Hand-painted textures, natural warmth | Lifestyle, wellness, travel |

Auto Style Selection

| Content Signals | Selected Style |
|---|---|
| paper, thesis, defense, conference, ieee, acm, icml, neurips, cvpr, acl, aaai, iclr | academic-paper |
| tutorial, learn, education, guide, intro, beginner | sketch-notes |
| classroom, teaching, school, chalkboard, blackboard | chalkboard |
| architecture, system, data, analysis, technical | blueprint |
| creative, children, kids, cute, illustration | vector-illustration |
| briefing, bilingual, infographic, concept | intuition-machine |
| executive, minimal, clean, simple, elegant | minimal |
| saas, product, dashboard, metrics, productivity | notion |
| investor, quarterly, business, corporate, proposal | corporate |
| launch, marketing, keynote, bold, impact, magazine | bold-editorial |
| entertainment, music, gaming, creative, atmospheric | dark-atmospheric |
| explainer, journalism, science communication | editorial-infographic |
| story, fantasy, animation, magical, whimsical | fantasy-animation |
| gaming, retro, pixel, developer, nostalgia | pixel-art |
| biology, chemistry, medical, pathway, scientific | scientific |
| history, heritage, vintage, expedition, historical | vintage |
| lifestyle, wellness, travel, artistic, natural | watercolor |
| Default | blueprint |
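
One possible way to turn these signals into a selection is a keyword count with a fallback to the default. The skill does not say how signals are matched or how ties are broken, so the whole-word matching and the "most hits wins, first style on a tie" rule below are assumptions. The table is abridged to four styles for brevity.

```python
import re

# Signal keywords copied from the table above (abridged to four styles).
STYLE_SIGNALS = {
    "academic-paper": ["paper", "thesis", "defense", "conference", "ieee", "acm",
                       "icml", "neurips", "cvpr", "acl", "aaai", "iclr"],
    "sketch-notes": ["tutorial", "learn", "education", "guide", "intro", "beginner"],
    "chalkboard": ["classroom", "teaching", "school", "chalkboard", "blackboard"],
    "scientific": ["biology", "chemistry", "medical", "pathway", "scientific"],
}
DEFAULT_STYLE = "blueprint"


def select_style(content: str) -> str:
    """Pick the style whose signals occur most often as whole words (assumed rule)."""
    words = set(re.findall(r"[a-z]+", content.lower()))
    best_style, best_hits = DEFAULT_STYLE, 0
    for style, signals in STYLE_SIGNALS.items():
        hits = sum(1 for signal in signals if signal in words)
        if hits > best_hits:  # strict ">" keeps the earlier style on a tie
            best_style, best_hits = style, hits
    return best_style
```

An explicit `--style` flag would bypass this function entirely.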

Layout Gallery

Optional layout hints for individual slides. Specify them in the outline's // LAYOUT section.

Slide-Specific Layouts

| Layout | Description | Best For |
|---|---|---|
| title-hero | Large centered title + subtitle | Cover slides, section breaks |
| quote-callout | Featured quote with attribution | Testimonials, key insights |
| key-stat | Single large number as focal point | Impact statistics, metrics |
| split-screen | Half image, half text | Feature highlights, comparisons |
| icon-grid | Grid of icons with labels | Features, capabilities, benefits |
| two-columns | Content in balanced columns | Paired information, dual points |
| three-columns | Content in three columns | Triple comparisons, categories |
| image-caption | Full-bleed image + text overlay | Visual storytelling, emotional |
| agenda | Numbered list with highlights | Session overview, roadmap |
| bullet-list | Structured bullet points | Simple content, lists |

Infographic-Derived Layouts

| Layout | Description | Best For |
|---|---|---|
| linear-progression | Sequential flow left-to-right | Timelines, step-by-step |
| binary-comparison | Side-by-side A vs B | Before/after, pros-cons |
| comparison-matrix | Multi-factor grid | Feature comparisons |
| hierarchical-layers | Pyramid or stacked levels | Priority, importance |
| hub-spoke | Central node with radiating items | Concept maps, ecosystems |
| bento-grid | Varied-size tiles | Overview, summary |
| funnel | Narrowing stages | Conversion, filtering |
| dashboard | Metrics with charts/numbers | KPIs, data display |
| venn-diagram | Overlapping circles | Relationships, intersections |
| circular-flow | Continuous cycle | Recurring processes |
| winding-roadmap | Curved path with milestones | Journey, timeline |
| tree-branching | Parent-child hierarchy | Org charts, taxonomies |
| iceberg | Visible vs hidden layers | Surface vs depth |
| bridge | Gap with connection | Problem-solution |

Academic-Specific Layouts

| Layout | Description | Best For |
|---|---|---|
| paper-title | Title, authors, affiliations, venue | Conference paper cover |
| outline-agenda | Numbered section list with highlights | Talk structure overview |
| methods-diagram | Central architecture/pipeline diagram | Methods, system design |
| results-chart | Chart area + data annotations | Quantitative results |
| equation-focus | Centered equation + variable definitions | Mathematical derivations |
| qualitative-grid | 2x2 or 3x2 image comparison grid | Visual results, ablations |
| references-list | Numbered citation list | Key references slide |
| contributions | Numbered contribution points | Contributions summary |

Usage: Add Layout: <name> to a slide's // LAYOUT section to guide its visual composition.

Design Philosophy

This deck is designed for reading and sharing, not live presentation:

  • Each slide must be self-explanatory without verbal commentary
  • Structure content for logical flow when scrolling
  • Include all necessary context within each slide
  • Optimize for social media sharing and offline reading

File Management

Output Directory

Each session creates an independent directory named by content slug:

slide-deck/{topic-slug}/
├── source-{slug}.{ext}    # Source files (text, images, etc.)
├── outline.md
├── outline-{style}.md     # Style variant outlines
├── prompts/
│   └── 01-slide-cover.md, 02-slide-{slug}.md, ...
├── 01-slide-cover.png, 02-slide-{slug}.png, ...
├── {topic-slug}.pptx
└── {topic-slug}.pdf

Slug Generation:

  1. Extract main topic from content (2-4 words, kebab-case)
  2. Example: "Introduction to Machine Learning" →
    intro-machine-learning
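
A minimal sketch of slug generation that reproduces the example above. The stop-word list and the `introduction` → `intro` abbreviation are assumptions made for illustration; the skill itself leaves topic extraction to the agent.

```python
import re

STOPWORDS = {"a", "an", "the", "to", "of", "for", "and", "in", "on"}  # assumed list
ABBREVIATIONS = {"introduction": "intro"}  # hypothetical shortening table


def make_slug(topic: str, max_words: int = 4) -> str:
    """Lowercase the topic, drop stop words, shorten known words, join as kebab-case."""
    words = re.findall(r"[a-z0-9]+", topic.lower())
    kept = [ABBREVIATIONS.get(w, w) for w in words if w not in STOPWORDS]
    return "-".join(kept[:max_words])
```

With these assumptions, `make_slug("Introduction to Machine Learning")` returns `intro-machine-learning`.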

Conflict Resolution

If slide-deck/{topic-slug}/ already exists:

  • Append timestamp: {topic-slug}-YYYYMMDD-HHMMSS
  • Example: intro-ml exists → intro-ml-20260118-143052
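
The conflict rule can be sketched as follows. The function name `resolve_deck_dir` is hypothetical; the timestamp format follows the YYYYMMDD-HHMMSS pattern given above.

```python
from datetime import datetime
from pathlib import Path


def resolve_deck_dir(root: Path, slug: str, now: datetime) -> Path:
    """Return root/slug, or root/slug-YYYYMMDD-HHMMSS if root/slug already exists."""
    candidate = root / slug
    if candidate.exists():
        candidate = root / f"{slug}-{now.strftime('%Y%m%d-%H%M%S')}"
    return candidate
```

Passing `now` in explicitly (for example `datetime.now()`) keeps the function deterministic and easy to check.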

Source Files

Copy all sources with naming

source-{slug}.{ext}
:

  • source-article.md
    (main text content)
  • source-diagram.png
    (image from conversation)
  • source-data.xlsx
    (additional file)

Multiple sources supported: text, images, files from conversation.

Workflow

Step 1: Analyze Content

  1. Save source content (if pasted, save as
    source.md
    )
  2. Follow
    references/analysis-framework.md
    for deep content analysis
  3. Determine style (use
    --style
    or auto-select from signals)
  4. Detect languages (source vs. user preference)
  5. Plan slide count (
    --slides
    or dynamic)
  6. For academic papers (PDF with figures): Run automatic figure detection:
    npx -y bun ${SKILL_DIR}/scripts/detect-figures.ts --pdf source-paper.pdf --output figures.json
    
    This outputs a JSON file with all detected figures/tables, their page numbers, and captions.

Step 2: Generate Outline Variants

  1. Generate 3 style variant outlines based on content analysis
  2. Follow
    references/outline-template.md
    for structure
  3. Auto-populate IMAGE_SOURCE for academic papers:
    • Read
      figures.json
      from Step 1
    • Map figures to slides using rules in
      references/analysis-framework.md
      Section 8
    • Automatically add
      // IMAGE_SOURCE
      blocks to appropriate slides:
      • Architecture/pipeline figures → Methods slides (
        Source: extract
        )
      • Results tables → Quantitative results slides (
        Source: extract
        )
      • Comparison images → Qualitative results slides (
        Source: extract
        )
      • Conceptual/simple diagrams → Leave for AI generation (
        Source: generate
        or omit)
  4. Save as
    outline-{style}.md
    for each variant
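
The mapping rules in step 3 could look roughly like the sketch below. The real schema of figures.json and the rules in references/analysis-framework.md are not reproduced here, so the `type` and `caption` keys, the caption keywords, and the slide names are all assumptions.

```python
def image_source_for(figure: dict) -> dict:
    """Map one detected figure to a slide kind and an IMAGE_SOURCE mode (assumed rules)."""
    caption = figure.get("caption", "").lower()
    if figure.get("type") == "table":
        # Results tables go to quantitative results slides.
        return {"slide": "quantitative-results", "source": "extract"}
    if any(k in caption for k in ("architecture", "pipeline", "overview")):
        # Architecture/pipeline figures go to methods slides.
        return {"slide": "methods", "source": "extract"}
    if any(k in caption for k in ("comparison", "qualitative")):
        # Comparison images go to qualitative results slides.
        return {"slide": "qualitative-results", "source": "extract"}
    # Conceptual or simple diagrams are left for AI generation.
    return {"slide": None, "source": "generate"}
```

Anything the heuristic gets wrong can still be corrected by editing the // IMAGE_SOURCE block in the outline by hand.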

Step 3: User Confirmation

Single AskUserQuestion with all applicable options:

| Question | When to Ask |
|---|---|
| Style variant | Always (3 options + custom) |
| Language | Only if source ≠ user language |

After selection:

  • Copy selected
    outline-{style}.md
    to
    outline.md
  • Regenerate in different language if requested
  • User may edit
    outline.md
    for fine-tuning

If

--outline-only
, stop here.

Step 4: Generate Prompts

  1. Read
    references/base-prompt.md
  2. Combine with style instructions from outline
  3. Add slide-specific content
  4. If
    Layout:
    specified in outline, include layout guidance in prompt:
    • Reference layout characteristics for image composition
    • Example:
      Layout: hub-spoke
      → "Central concept in middle with related items radiating outward"
  5. Save to
    prompts/
    directory
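
As a rough sketch of steps 1 to 4, prompt assembly amounts to concatenating the three text sources and appending layout guidance when a layout is named. The `LAYOUT_GUIDANCE` table and `build_prompt` are hypothetical; only the hub-spoke wording comes from the example above, and the section order is an assumption.

```python
# Hypothetical lookup table; only the hub-spoke entry is taken from the text above.
LAYOUT_GUIDANCE = {
    "hub-spoke": "Central concept in middle with related items radiating outward",
}


def build_prompt(base_prompt, style_instructions, slide_content, layout=None):
    """Join base prompt, style instructions, and slide content; add layout guidance if known."""
    parts = [base_prompt.strip(), style_instructions.strip(), slide_content.strip()]
    guidance = LAYOUT_GUIDANCE.get(layout) if layout else None
    if guidance:
        parts.append(f"Layout: {layout}\n{guidance}")
    return "\n\n".join(parts) + "\n"
```

The returned string would then be written to a file such as prompts/02-slide-{slug}.md.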

Step 5: Image Generation Method Selection

Before generating images, ask user to choose generation method:

Use AskUserQuestion with options:

| Option | Label | Description |
|---|---|---|
| 1 | Gemini API (Recommended) | Official Google API via Python. Requires the GOOGLE_API_KEY (or GEMINI_API_KEY) env var. |
| 2 | Gemini Web (Browser-based) | ⚠️ Uses reverse-engineered web API. No API key needed but may break. |

Based on selection:

Option 1: Gemini API (Python)

  1. Verify API key: Check
    GOOGLE_API_KEY
    or
    GEMINI_API_KEY
    environment variable
  2. Run generation script:
    python ${SKILL_DIR}/scripts/generate-slides.py <slide-deck-dir> --model gemini-3-pro-image-preview
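
The API key check in step 1 might be done like this before launching the script. The lookup order (GOOGLE_API_KEY first) is an assumption, and `find_api_key` is a hypothetical helper.

```python
import os


def find_api_key(env=None):
    """Return (variable name, value) for the first API key variable that is set, else None."""
    env = os.environ if env is None else env
    for name in ("GOOGLE_API_KEY", "GEMINI_API_KEY"):  # order is an assumption
        value = env.get(name)
        if value:
            return name, value
    return None
```

If it returns `None`, the agent should ask the user to export one of the two variables instead of starting generation.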
    

Script Features:

  • Auto-installs
    google-genai
    package if missing
  • Retry logic with exponential backoff (3 retries)
  • Skips already-generated slides (> 10KB)
  • Supports custom model via
    --model
    flag
  • Outputs to
    slides/
    subdirectory
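
The retry and skip behaviour listed above can be illustrated with a small sketch. This is not the code of generate-slides.py: the one-second base delay, the doubling factor, and reading "3 retries" as three attempts after the first failure are assumptions.

```python
import os
import time

MIN_BYTES = 10 * 1024  # slides larger than 10KB count as already generated


def already_generated(path, min_bytes=MIN_BYTES):
    """True if the slide image exists and is larger than the size threshold."""
    return os.path.isfile(path) and os.path.getsize(path) > min_bytes


def with_retries(fn, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt and retry, up to `retries` times."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)
```

Because completed slides are skipped, re-running the script after a persistent failure only regenerates the slides that are still missing.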

Troubleshooting:

  • If server disconnection errors occur, script auto-retries
  • For persistent failures, re-run the script (it skips completed slides)
  • Check API quota if many failures occur

Option 2: Gemini Web Skill

  1. Consent Check: Read consent file at:

    • Windows:
      $APPDATA/baoyu-skills/gemini-web/consent.json
    • macOS:
      ~/Library/Application Support/baoyu-skills/gemini-web/consent.json
    • Linux:
      ~/.local/share/baoyu-skills/gemini-web/consent.json
  2. If no consent or version mismatch, display disclaimer and ask:

    ⚠️ DISCLAIMER: This uses a reverse-engineered Gemini Web API (NOT official).
    Risks: May break anytime, no support, possible account risk.
    
  3. For each slide, run:

    npx -y bun ${GEMINI_WEB_SKILL_DIR}/scripts/main.ts \
      --promptfiles prompts/01-slide-cover.md \
      --image 01-slide-cover.png \
      --sessionId slides-{topic-slug}-{timestamp}
    

    Where

    GEMINI_WEB_SKILL_DIR
    = path to
    baoyu-danger-gemini-web
    skill directory.

  4. Proxy support: If the user is on a restricted network, prepend:

    HTTP_PROXY=http://127.0.0.1:7890 HTTPS_PROXY=http://127.0.0.1:7890
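
For the consent check in step 1, the per-platform path can be computed as sketched below. The fallback used when APPDATA is unset is an assumption, and `consent_path` is a hypothetical helper rather than part of either skill.

```python
import os
import sys
from pathlib import Path


def consent_path(platform=sys.platform, env=os.environ, home=None):
    """Return the consent.json location for Windows, macOS, or Linux."""
    home = Path.home() if home is None else Path(home)
    tail = Path("baoyu-skills") / "gemini-web" / "consent.json"
    if platform.startswith("win"):
        # Assumed fallback if APPDATA is not set.
        appdata = env.get("APPDATA", str(home / "AppData" / "Roaming"))
        return Path(appdata) / tail
    if platform == "darwin":
        return home / "Library" / "Application Support" / tail
    return home / ".local" / "share" / tail
```

A missing file at the returned path is treated the same as missing consent.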
    

Step 5.5: Process IMAGE_SOURCE (Automatic Figure Extraction)

For academic presentations, IMAGE_SOURCE metadata was auto-populated in Step 2 based on figure detection from Step 1.

Automatic Execution:

  1. Parse outline to identify slides with

    Source: extract

  2. Create figures directory:

    mkdir -p figures

  3. For each extract slide, automatically:

    • Read the Figure number, Page, and Caption from metadata
    • Run figure extraction script:
      npx -y bun ${SKILL_DIR}/scripts/extract-figure.ts \
        --pdf source-paper.pdf \
        --page <page-number> \
        --output figures/figure-<N>.png
      
    • Run template application script:
      npx -y bun ${SKILL_DIR}/scripts/apply-template.ts \
        --figure figures/figure-<N>.png \
        --title "<slide-headline>" \
        --caption "Figure <N>: <caption-text>" \
        --output <NN>-slide-<slug>.png
      
    • Report: "Extracted: Figure N → slide NN"
  4. For slides with

    Source: generate
    (or no IMAGE_SOURCE):

    • Proceed to Step 6 for AI generation
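
The two commands run for each extract slide can be assembled programmatically. The sketch below only builds the argument lists from the flags shown above; `extraction_commands` and its parameter names are hypothetical.

```python
def extraction_commands(skill_dir, figure_no, page, headline, caption, slide_file):
    """Build the extract-figure and apply-template command lines for one slide."""
    figure_png = f"figures/figure-{figure_no}.png"
    extract = [
        "npx", "-y", "bun", f"{skill_dir}/scripts/extract-figure.ts",
        "--pdf", "source-paper.pdf",
        "--page", str(page),  # 1-indexed PDF page
        "--output", figure_png,
    ]
    template = [
        "npx", "-y", "bun", f"{skill_dir}/scripts/apply-template.ts",
        "--figure", figure_png,
        "--title", headline,
        "--caption", f"Figure {figure_no}: {caption}",
        "--output", slide_file,
    ]
    return extract, template
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the deck's output directory, which avoids shell quoting problems with titles and captions.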

Note: The source PDF must be saved as source-paper.pdf in the output directory.

Troubleshooting:

  • If figure detection missed a figure: manually add
    // IMAGE_SOURCE
    block to outline
  • If wrong figure mapped: edit the
    Figure:
    and
    Page:
    values in outline
  • If extraction fails: check PDF page number (1-indexed)

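PyMuPDF Fallback for Page Extraction: If extract-figure.ts fails with an "Image or Canvas expected" error (common with complex PDFs), render the page with PyMuPDF instead:

import os
import fitz  # PyMuPDF

page_num = 3  # 1-indexed page to render; replace with the figure's page
os.makedirs("extracted", exist_ok=True)  # pix.save() does not create directories
doc = fitz.open("source-paper.pdf")
page = doc[page_num - 1]  # PyMuPDF pages are 0-indexed
mat = fitz.Matrix(3, 3)  # 3x scale for high resolution
pix = page.get_pixmap(matrix=mat)
pix.save(f"extracted/page-{page_num}.png")

Then apply the template using apply-template.ts.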

Step 6: Generate Images

  1. Use selected method from Step 5
  2. Skip slides already processed in Step 5.5 (those with
    Source: extract
    )
  3. Generate session ID:
    slides-{topic-slug}-{timestamp}
  4. Generate each remaining slide with same session ID
  5. Report progress: "Generated X/N"
  6. Auto-retry once on generation failure
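
A small sketch of steps 2 and 3: filter out the slides already handled in Step 5.5 and build one session ID for the rest. The timestamp format is not specified here, so reusing YYYYMMDD-HHMMSS is an assumption, as is representing each slide as a dict with a `source` key.

```python
from datetime import datetime


def session_id(topic_slug, now):
    """Build slides-{topic-slug}-{timestamp}; the timestamp format is assumed."""
    return f"slides-{topic_slug}-{now.strftime('%Y%m%d-%H%M%S')}"


def slides_to_generate(slides):
    """Keep only slides that were not produced by figure extraction."""
    return [s for s in slides if s.get("source") != "extract"]
```

Reusing the same session ID for every remaining slide is what keeps the visual style consistent across the deck.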

Step 7: Merge to PPTX and PDF

npx -y bun ${SKILL_DIR}/scripts/merge-to-pptx.ts <slide-deck-dir>
npx -y bun ${SKILL_DIR}/scripts/merge-to-pdf.ts <slide-deck-dir>

Step 8: Output Summary

Slide Deck Complete!

Topic: [topic]
Style: [style name]
Location: [directory path]
Slides: N total

- 01-slide-cover.png ✓ Cover
- 02-slide-intro.png ✓ Content
- ...
- {NN}-slide-back-cover.png ✓ Back Cover

Outline: outline.md
PPTX: {topic-slug}.pptx
PDF: {topic-slug}.pdf

Slide Modification

See

references/modification-guide.md
for:

  • Edit single slide workflow
  • Add new slide (with renumbering)
  • Delete slide (with renumbering)
  • File naming conventions

Image Generation Dependencies

Gemini API (Option 1 - Recommended)

Requires:

  • GOOGLE_API_KEY
    or
    GEMINI_API_KEY
    environment variable
  • Python 3.8+ with pip
  • google-genai
    package (auto-installed by script)

Model:

gemini-3-pro-image-preview
(default)

Gemini Web Skill (Option 2)

Requires:

  • baoyu-danger-gemini-web
    skill installed at
    .claude/skills/baoyu-danger-gemini-web
  • Google Chrome browser with logged-in Google account
  • User consent for reverse-engineered API disclaimer

PDF Figure Extraction

Requires:

  • Primary:
    pdfjs-dist
    npm package (use legacy build for Node.js)
  • Fallback:
    pymupdf
    Python package (more reliable for complex PDFs)
  • canvas
    npm package for apply-template.ts

References

| File | Content |
|---|---|
| references/analysis-framework.md | Deep content analysis for presentations |
| references/outline-template.md | Outline structure and STYLE_INSTRUCTIONS format |
| references/modification-guide.md | Edit, add, delete slide workflows |
| references/content-rules.md | Content and style guidelines |
| references/base-prompt.md | Base prompt for image generation |
| references/figure-container-template.md | Visual specs for extracted figure containers |
| references/styles/<style>.md | Full style specifications |

Notes

Image Generation

  • Gemini API (Nano Banana Pro, Option 1): Recommended. Stable, reliable, requires API key
  • Gemini Web: No API key needed, but uses reverse-engineered API with account risk
  • Generation time: 10-30 seconds per slide
  • Auto-retry once on generation failure
  • Maintain style consistency via session ID

Content Guidelines

  • Use stylized alternatives for sensitive public figures
  • Both methods use the same underlying Gemini model for image generation

Extension Support

Custom styles and configurations via EXTEND.md.

Check paths (priority order):

  1. .paper-skills/paper-slide-deck/EXTEND.md
    (project)
  2. ~/.paper-skills/paper-slide-deck/EXTEND.md
    (user)

If found, load before Step 1. Extension content overrides defaults.
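
The priority lookup can be sketched as follows, with the project path checked before the user path. `find_extend_md` is a hypothetical helper; how the loaded content overrides defaults is left to the agent.

```python
from pathlib import Path


def find_extend_md(project_dir, home_dir):
    """Return the first existing EXTEND.md (project first, then user), or None."""
    rel = Path(".paper-skills") / "paper-slide-deck" / "EXTEND.md"
    for base in (Path(project_dir), Path(home_dir)):
        candidate = base / rel
        if candidate.is_file():
            return candidate
    return None
```

A typical call would be `find_extend_md(Path.cwd(), Path.home())` before Step 1 begins.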