Claude-skills gemini-visual

Visual and front-end development assistant powered by Google Gemini's multimodal models. Use for UI analysis, design comparison, accessibility audits, color palette extraction, screenshot-to-code conversion, generating UI assets, and text-based design assistance from briefs.

install

source · Clone the upstream repo

git clone https://github.com/ckorhonen/claude-skills

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/ckorhonen/claude-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/gemini-visual" ~/.claude/skills/ckorhonen-claude-skills-gemini-visual && rm -rf "$T"

manifest: skills/gemini-visual/SKILL.md

source content

Gemini Visual - Front-End & Visual Development Assistant

Overview

A comprehensive toolkit leveraging Google Gemini's advanced visual reasoning capabilities for front-end development and design tasks. Gemini provides state-of-the-art multimodal understanding with spatial reasoning, document understanding, and high-resolution image processing.

When to Use

UI/UX Analysis: Analyze screenshots for layout issues, visual hierarchy, and design patterns
Accessibility Audits: Check contrast ratios, text readability, and WCAG compliance
Design Comparison: Compare mockups, before/after screenshots, or different design variations
Color Palette Extraction: Extract colors from images with HEX, RGB, and HSL values
Screenshot to Code: Generate HTML/CSS from design screenshots
UI Asset Generation: Create icons, backgrounds, gradients, and UI graphics
Responsive Design Review: Analyze multi-device screenshots for consistency
Visual Debugging: Identify rendering issues, broken layouts, or visual bugs
Design from Brief: Generate designs, code, and components from text descriptions
Interactive Design Sessions: Multi-turn conversations for iterative design refinement

Prerequisites

Python 3.9+
```
google-genai
```
package
```
GEMINI_API_KEY
```
environment variable

Installation

pip install google-genai

Getting an API Key

Go to Google AI Studio
Sign in with your Google account
Click "Create API Key"
Copy the key and set it as an environment variable:

export GEMINI_API_KEY="your-api-key"

Add to your shell profile (

~/.zshrc

~/.bashrc

) for persistence:

echo 'export GEMINI_API_KEY="your-api-key"' >> ~/.zshrc

Scripts Overview

Script	Purpose
`analyze_ui.py`	Analyze UI screenshots for issues, patterns, and suggestions
`generate_ui_assets.py`	Generate icons, backgrounds, and UI graphics
`compare_designs.py`	Compare two designs and highlight differences
`extract_colors.py`	Extract color palettes from images
`screenshot_to_code.py`	Convert screenshots to HTML/CSS code
`design_from_brief.py`	Generate designs and code from text briefs (no image required)

Available Models

Model	Best For	Notes
`gemini-2.5-flash`	Visual analysis, code generation, fast reasoning	Best cost/quality for analysis
`gemini-2.5-pro`	Complex visual reasoning, code generation	Highest quality analysis
`gemini-2.5-flash-image-preview`	Asset generation, image editing	Native image output
`gemini-3.1-flash-image-preview`	High-quality image generation, 4K output	Latest image generation model
`gemini-3-pro-image-preview`	Professional image generation	Previous-gen high quality

Note: For image analysis tasks (analyze_ui, compare_designs, screenshot_to_code, extract_colors), use the text+vision models (
gemini-2.5-flash
or
gemini-2.5-pro
). For image generation tasks (generate_ui_assets), use the image-output models.

Media Resolution Options

Control token usage and detail level with

--resolution

Resolution	Tokens/Image	Best For
`low`	70	Quick scans, thumbnails
`medium`	560	Standard screenshots, OCR
`high`	1120	Detailed UI analysis
`ultra_high`	2240+	Fine text, complex layouts

Script Reference

1. analyze_ui.py - UI Analysis

Analyze UI screenshots for design issues, accessibility problems, and improvement suggestions.

python scripts/analyze_ui.py [options] IMAGE

Required:
  IMAGE                    Path to UI screenshot to analyze

Options:
  -m, --mode MODE          Analysis mode (default: comprehensive)
                           Modes: comprehensive, accessibility, layout, ux
  -r, --resolution RES     Media resolution (default: high)
  -o, --output FILE        Save analysis to file (JSON or text)
  -f, --format FORMAT      Output format: text, json, markdown (default: text)
  --thinking LEVEL         Thinking level: low, high (default: high)
  -v, --verbose            Show detailed progress

Examples:

# Comprehensive UI analysis
python scripts/analyze_ui.py screenshot.png

# Accessibility-focused analysis
python scripts/analyze_ui.py -m accessibility app_screen.png

# Layout analysis with JSON output
python scripts/analyze_ui.py -m layout -f json -o report.json mockup.png

# Quick UX review
python scripts/analyze_ui.py -m ux --thinking low mobile_app.png

Analysis Modes:

comprehensive: Full analysis including layout, colors, typography, accessibility, and UX
accessibility: WCAG compliance, contrast ratios, screen reader compatibility, touch targets
layout: Visual hierarchy, spacing, alignment, responsive issues
ux: User flow, affordances, cognitive load, interaction patterns

2. generate_ui_assets.py - Asset Generation

Generate UI assets like icons, backgrounds, patterns, and graphics.

python scripts/generate_ui_assets.py [options]

Required:
  -p, --prompt TEXT        Description of asset to generate

Options:
  -t, --type TYPE          Asset type (default: icon)
                           Types: icon, background, pattern, illustration, badge
  -s, --style STYLE        Design style (default: modern)
                           Styles: modern, minimal, flat, gradient, glassmorphism,
                                   neumorphism, material, ios, outlined
  -c, --colors COLORS      Color palette (comma-separated HEX or names)
  -a, --aspect-ratio RATIO Aspect ratio (default: 1:1)
  --size SIZE              Resolution: 1K, 2K, 4K (default: 1K)
  -o, --output FILE        Output file path
  -r, --reference IMAGE    Reference image for style guidance
  -v, --verbose            Show detailed progress

Examples:

# Generate app icon
python scripts/generate_ui_assets.py -p "Weather app icon with sun and clouds" -t icon

# Create gradient background
python scripts/generate_ui_assets.py -p "Soft gradient for login screen" -t background \
  -c "#667eea,#764ba2" -a 9:16 -o login_bg.png

# Generate pattern for UI
python scripts/generate_ui_assets.py -p "Subtle geometric pattern for card backgrounds" \
  -t pattern -s minimal -o pattern.png

# Create illustration from reference
python scripts/generate_ui_assets.py -p "Onboarding illustration, person using phone" \
  -t illustration -r brand_style.png -o onboarding.png

# Generate badge/label
python scripts/generate_ui_assets.py -p "Premium badge with star" -t badge \
  -c "gold,white" -s gradient

3. compare_designs.py - Design Comparison

Compare two design screenshots and analyze differences.

python scripts/compare_designs.py [options] IMAGE1 IMAGE2

Required:
  IMAGE1                   First design image (before/version A)
  IMAGE2                   Second design image (after/version B)

Options:
  -m, --mode MODE          Comparison mode (default: full)
                           Modes: full, visual, content, accessibility
  -f, --format FORMAT      Output format: text, json, markdown (default: text)
  -o, --output FILE        Save comparison to file
  -r, --resolution RES     Media resolution (default: high)
  -v, --verbose            Show detailed progress

Examples:

# Full design comparison
python scripts/compare_designs.py before.png after.png

# Visual-only comparison
python scripts/compare_designs.py -m visual old_design.png new_design.png

# Compare for accessibility changes
python scripts/compare_designs.py -m accessibility v1.png v2.png -o report.md

# A/B test comparison as JSON
python scripts/compare_designs.py -m full variant_a.png variant_b.png -f json

Comparison Modes:

full: Comprehensive comparison of all visual and functional aspects
visual: Focus on colors, typography, spacing, visual hierarchy
content: Text changes, layout shifts, information architecture
accessibility: Contrast, readability, touch targets, WCAG compliance changes

4. extract_colors.py - Color Palette Extraction

Extract color palettes from images with multiple output formats.

python scripts/extract_colors.py [options] IMAGE

Required:
  IMAGE                    Image to extract colors from

Options:
  -n, --count COUNT        Number of colors to extract (default: 6)
  -f, --format FORMAT      Output format: text, json, css, tailwind, scss (default: text)
  -o, --output FILE        Save palette to file
  --named                  Include closest CSS color names
  --contrast               Calculate contrast ratios between colors
  -v, --verbose            Show detailed progress

Examples:

# Extract 6 main colors
python scripts/extract_colors.py screenshot.png

# Extract palette as CSS variables
python scripts/extract_colors.py -f css -o colors.css brand_image.png

# Get Tailwind config
python scripts/extract_colors.py -f tailwind -o tailwind.config.js design.png

# Detailed palette with contrast info
python scripts/extract_colors.py -n 8 --named --contrast hero_image.jpg

# SCSS variables output
python scripts/extract_colors.py -f scss -o _colors.scss mockup.png

5. screenshot_to_code.py - Screenshot to Code

Convert UI screenshots to HTML/CSS code.

python scripts/screenshot_to_code.py [options] IMAGE

Required:
  IMAGE                    UI screenshot to convert

Options:
  -f, --framework FRAME    CSS framework (default: tailwind)
                           Frameworks: tailwind, css, bootstrap, vanilla
  -c, --components         Extract as reusable components
  --responsive             Generate responsive code
  -o, --output DIR         Output directory for files
  -r, --resolution RES     Media resolution (default: ultra_high)
  --thinking LEVEL         Thinking level: low, high (default: high)
  -v, --verbose            Show detailed progress

Examples:

# Convert to Tailwind HTML
python scripts/screenshot_to_code.py landing_page.png

# Generate vanilla CSS
python scripts/screenshot_to_code.py -f vanilla -o ./output mockup.png

# Create responsive Bootstrap components
python scripts/screenshot_to_code.py -f bootstrap -c --responsive card.png

# Full page conversion with components
python scripts/screenshot_to_code.py -f tailwind -c --responsive -o ./components page.png

6. design_from_brief.py - Text-Based Design Assistant

Generate frontend designs, code, and components from text descriptions without needing visual input.

python scripts/design_from_brief.py [options]

Input (one required):
  -p, --prompt TEXT      Design brief or prompt text
  -b, --brief-file FILE  Read brief from a file
  --interactive          Start interactive design session

Options:
  -m, --mode MODE        Generation mode (default: code)
                         Modes: design, code, component, review, brainstorm
  -fw, --framework FW    Framework for code generation (default: tailwind)
                         Frameworks: tailwind, css, bootstrap, react, vue, svelte, vanilla
  -c, --context TEXT     Additional context (existing code, constraints)
  -f, --format FORMAT    Output format: text, json, markdown (default: text)
  -o, --output FILE      Save output to file
  -v, --verbose          Show detailed progress

Examples:

# Generate code from a brief
python scripts/design_from_brief.py -p "Create a pricing table with 3 tiers" -m code -fw tailwind

# Get design advice and guidance
python scripts/design_from_brief.py -p "Design a modern SaaS landing page" -m design

# Generate a React component
python scripts/design_from_brief.py -p "A toggle switch with smooth animation" -m component -fw react

# Review a design idea
python scripts/design_from_brief.py -p "Is a hamburger menu good for desktop navigation?" -m review

# Brainstorm creative ideas
python scripts/design_from_brief.py -p "Ideas for a fitness app dashboard" -m brainstorm

# Read brief from file
python scripts/design_from_brief.py -b project_brief.txt -m code -fw vue

# Interactive multi-turn session
python scripts/design_from_brief.py --interactive -m code -fw tailwind

Generation Modes:

design: Get design guidance including colors, typography, layout, and UX recommendations
code: Generate complete, production-ready frontend code
component: Design and code reusable UI components with props and variants
review: Get feedback and recommendations on design ideas or code
brainstorm: Generate creative ideas and multiple design directions

Supported Frameworks:

Framework	Description
`tailwind`	Tailwind CSS utility classes
`css`	Custom CSS with variables and BEM
`bootstrap`	Bootstrap 5 components
`react`	React functional components with TypeScript
`vue`	Vue 3 Composition API with TypeScript
`svelte`	Svelte components
`vanilla`	Plain HTML/CSS/JavaScript

Interactive Session Commands:

When using

--interactive

, you can use these commands:

Command	Description
`/mode <mode>`	Change generation mode
`/framework <fw>`	Change framework
`/save <file>`	Save last response to file
`/clear`	Clear conversation history
`/quit`	Exit session

Use Cases

Front-End Development Workflow

# 1. Analyze a design mockup
python scripts/analyze_ui.py mockup.png -f markdown -o analysis.md

# 2. Extract brand colors
python scripts/extract_colors.py mockup.png -f tailwind -o colors.js

# 3. Generate code from screenshot
python scripts/screenshot_to_code.py mockup.png -f tailwind -c -o ./src

# 4. Generate placeholder icons
python scripts/generate_ui_assets.py -p "Settings gear icon" -t icon -s outlined

Design Review Process

# Compare design iterations
python scripts/compare_designs.py v1.png v2.png -f markdown -o review.md

# Check accessibility
python scripts/analyze_ui.py final_design.png -m accessibility -o a11y_report.json

Asset Production Pipeline

# Generate icon set
for icon in "home" "search" "profile" "settings"; do
  python scripts/generate_ui_assets.py -p "${icon} icon" -t icon -s outlined -o icons/${icon}.png
done

# Generate backgrounds for different screens
python scripts/generate_ui_assets.py -p "Auth screen gradient" -t background -a 9:16 -o bg_auth.png
python scripts/generate_ui_assets.py -p "Dashboard header" -t background -a 21:9 -o bg_header.png

Design from Text Brief

# Start with design exploration
python scripts/design_from_brief.py -p "E-commerce product page for sneakers" -m design -o design_spec.md

# Generate the code
python scripts/design_from_brief.py -p "E-commerce product page with image gallery,
size selector, add to cart button, and reviews section" -m code -fw react -o product_page.tsx

# Create reusable components
python scripts/design_from_brief.py -p "Star rating component, 1-5 stars,
supports half stars, shows count" -m component -fw react

# Interactive refinement session
python scripts/design_from_brief.py --interactive -m code -fw tailwind
# Then iterate: "Add a sticky header", "Make the CTA more prominent", etc.

Full Project Workflow (No Images Needed)

# 1. Brainstorm ideas
python scripts/design_from_brief.py -p "Modern dashboard for analytics SaaS" -m brainstorm

# 2. Get detailed design spec
python scripts/design_from_brief.py -p "Analytics dashboard with sidebar nav,
KPI cards, charts, and data tables" -m design -f markdown -o design.md

# 3. Generate components
python scripts/design_from_brief.py -p "KPI card showing metric, trend, and sparkline" \
  -m component -fw react -o components/KPICard.tsx

python scripts/design_from_brief.py -p "Data table with sorting, filtering, pagination" \
  -m component -fw react -o components/DataTable.tsx

# 4. Generate full page layout
python scripts/design_from_brief.py -p "Dashboard layout combining sidebar, header,
and main content area with the KPI cards and data table" -m code -fw react

API Capabilities Summary

Gemini 3 Visual Features

Spatial Reasoning: Understands element positioning, alignment, and relationships
Document Understanding: OCR, text extraction, layout parsing
High Resolution: Up to ultra-high resolution for fine details and small text
Multi-Image Input: Compare up to 3,600 images per request
Thought Signatures: Maintains reasoning context for multi-turn editing

Image Support

Input Formats: PNG, JPEG, WebP, HEIC, HEIF
Max File Size: 20MB inline, larger via File API
Output Formats: PNG (generated images)
Resolutions: 1K, 2K, 4K (generation), configurable input resolution

Troubleshooting

"GEMINI_API_KEY environment variable not set"

Set your API key:

export GEMINI_API_KEY="your-api-key"

"Rate limit exceeded"

Wait a few minutes and retry. For batch operations, add delays between requests.

Low quality analysis

Use higher resolution:

python scripts/analyze_ui.py -r ultra_high screenshot.png

Missing fine details

For detailed UI analysis, use ultra_high resolution:

python scripts/analyze_ui.py -r ultra_high -m comprehensive app.png

Code generation issues

Use high thinking level and ultra_high resolution:

python scripts/screenshot_to_code.py --thinking high -r ultra_high mockup.png

Asset generation blocked by content moderation

If you see "No image content returned. The request may have been blocked by content moderation", try:

Simplifying or rephrasing your prompt
Removing specific brand names or copyrighted references
Using more generic descriptions

This is a safety filter from the Gemini API and not all prompts will be accepted.

Resolution affects token usage

The

--resolution

parameter controls how many tokens are used for image processing:

```
low
```
: ~70 tokens/image - Quick scans, thumbnails
```
medium
```
: ~560 tokens/image - Standard screenshots, OCR
```
high
```
: ~1120 tokens/image - Detailed UI analysis (default)

Higher resolution provides better accuracy but uses more tokens.

Best Practices

Use appropriate resolution: Higher resolution = more tokens but better accuracy
Match model to task: Use
```
gemini-3-pro-preview
```
for analysis,
```
gemini-3-pro-image-preview
```
for generation
Provide context: Add relevant details to prompts for better results
Iterate on assets: Use multi-turn conversations for refinement
Combine scripts: Use extraction + generation for consistent design systems
Save outputs: Use
```
-o
```
flag to save reports for documentation
Batch similar tasks: Process multiple similar items together for efficiency

Claude-skills gemini-visual

Gemini Visual - Front-End & Visual Development Assistant

Overview

When to Use

Prerequisites

Installation

Getting an API Key

Scripts Overview

Available Models

Media Resolution Options

Script Reference

1. analyze_ui.py - UI Analysis

2. generate_ui_assets.py - Asset Generation

3. compare_designs.py - Design Comparison

4. extract_colors.py - Color Palette Extraction

5. screenshot_to_code.py - Screenshot to Code

6. design_from_brief.py - Text-Based Design Assistant

Use Cases

Front-End Development Workflow

Design Review Process

Asset Production Pipeline

Design from Text Brief

Full Project Workflow (No Images Needed)

API Capabilities Summary

Gemini 3 Visual Features

Image Support

Troubleshooting

"GEMINI_API_KEY environment variable not set"

"Rate limit exceeded"

Low quality analysis

Missing fine details

Code generation issues

Asset generation blocked by content moderation

Resolution affects token usage

Best Practices

Sources