Claude-skills svg-portrait-mode

"Portrait Mode" for SVGs — foveated vectorization with 4-zone selective detail. Combines Claude vision annotations, MediaPipe segmentation/landmarks, and optional saliency. Like phone portrait mode, but vectorized.

Install

Source · Clone the upstream repo

git clone https://github.com/oaustegard/claude-skills

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/oaustegard/claude-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/svg-portrait-mode" ~/.claude/skills/oaustegard-claude-skills-svg-portrait-mode && rm -rf "$T"

Manifest: svg-portrait-mode/SKILL.md

Source content

SVG Portrait Mode

Selective simplification: one pipeline pass at high K → zone-aware contour simplification → optional per-zone style transforms. Like phone portrait mode, but vectorized — not blur, but stylistic separation of foreground and background.

Quick Start

Agent-annotated (recommended)

The agent looks at the image first, identifies important regions with rough bounding boxes, then calls:

from portrait_mode import portrait_mode

svg, stats = portrait_mode("photo.jpg",
    focus_targets=[
        {'bbox': (215, 125, 295, 195), 'label': 'face'},
    ],
    focus_edges=[
        {'bbox': (214, 170, 310, 290), 'label': 'beard'},
        {'bbox': (210, 415, 300, 505), 'label': 'hands'},
        {'bbox': (195, 95, 330, 140), 'label': 'hat'},
    ])

With style transforms

svg, stats = portrait_mode("photo.jpg",
    focus_targets=[{'bbox': (215, 125, 295, 195), 'label': 'face'}],
    focus_edges=[{'bbox': (214, 170, 310, 290), 'label': 'beard'}],
    style_transforms={
        'background': 'desaturate:0.7',
        'periphery': 'desaturate:0.3',
    })

Backward-compatible (MP-only)

Without annotations, the skill falls back to MediaPipe face detection:

svg, stats = portrait_mode("photo.jpg")

How It Works

image-to-svg pipeline (K=96, unified palette)
    → zone detection (agent bboxes + optional MediaPipe landmarks)
    → assign contours to zones (by centroid)
    → per-zone simplification (epsilon, min_area)
    → optional per-zone style transforms
    → single SVG output (zone-labeled <g> groups, no clipPaths)

Subtractive, not additive. One pipeline pass produces full detail everywhere. Zones that don't need detail get simplified by coarsening contour approximation and raising minimum area thresholds. Subject stays sharp; background gets abstract.
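The coarsening step can be sketched in plain Python. Everything below (function names, the shoelace and Ramer-Douglas-Peucker helpers) is illustrative rather than the skill's actual code; only the per-zone constants and the 0.002 × perimeter epsilon rule come from this document:

```python
import math

ZONES = {  # (epsilon multiplier, minimum area in px^2) per zone
    "target":     (0.5, 15),
    "edge":       (1.0, 40),
    "periphery":  (2.5, 100),
    "background": (5.0, 200),
}

def shoelace_area(pts):
    """Absolute polygon area via the shoelace formula."""
    n = len(pts)
    s = sum(pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
            for i in range(n))
    return abs(s) / 2

def perimeter(pts):
    """Closed-polygon perimeter."""
    n = len(pts)
    return sum(math.dist(pts[i], pts[(i + 1) % n]) for i in range(n))

def rdp(pts, eps):
    """Ramer-Douglas-Peucker: drop vertices closer than eps to the chord."""
    if len(pts) < 3:
        return pts
    (x1, y1), (x2, y2) = pts[0], pts[-1]
    dmax, idx = 0.0, 0
    for i in range(1, len(pts) - 1):
        x0, y0 = pts[i]
        num = abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1)
        den = math.hypot(x2 - x1, y2 - y1) or 1.0
        if num / den > dmax:
            dmax, idx = num / den, i
    if dmax <= eps:
        return [pts[0], pts[-1]]
    return rdp(pts[:idx + 1], eps)[:-1] + rdp(pts[idx:], eps)

def simplify_for_zone(contour, zone):
    """Drop tiny shapes, then coarsen vertices at the zone's epsilon."""
    mult, min_area = ZONES[zone]
    if shoelace_area(contour) < min_area:
        return None                          # filtered out entirely
    eps = 0.002 * perimeter(contour) * mult  # base factor from this doc
    return rdp(contour, eps)
```

The same 100 px² shape survives in the target zone but is dropped outright in the background zone, which is the whole subtractive idea.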

Why This Works

v0.5.0 ran four independent pipelines (one per zone) with different K-means palettes, then composited via clipPaths. This produced tonal discontinuities at zone boundaries and was 4-7x slower. v0.6.0 uses a single palette so colors harmonize naturally, and simplification is a cheap post-process on contours the pipeline already extracted.

Agent Workflow

  1. Look at the image — identify what's compositionally important
  2. Provide rough bounding boxes as (x1, y1, x2, y2) pixel coordinates
     • Precision is NOT required (±30px is fine)
     • Use focus_targets for where the eye goes first (face, eyes)
     • Use focus_edges for compositionally important areas (beard, hands, hat, props)
  3. Call portrait_mode() — the skill handles zone detection, extraction, and assembly
  4. Review output — check stats for path distribution across zones

Four Zones

| Zone | Purpose | Epsilon | Min Area | Examples |
|---|---|---|---|---|
| Target | Where the eye goes first | 0.5× (tight) | 15 px² | Face, eyes, key subject |
| Edge | Compositionally important | 1.0× (default) | 40 px² | Beard, hands, hat, props |
| Periphery | Context, not focal | 2.5× (loose) | 100 px² | Torso, clothing, limbs |
| Background | Atmosphere | 5.0× (very loose) | 200 px² | Sky, walls, landscape |

Epsilon multiplies the base simplification factor (0.002 × perimeter). Higher = fewer vertices = more abstract. Min area filters out small shapes entirely.

Zone Assignment

Each contour is assigned to the highest-priority zone covering >30% of its area. This prevents focal shapes that straddle a zone boundary from being over-simplified by the lower-priority side. Small contours (<500 px²) use a faster centroid lookup instead of the full overlap test.
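A simplified sketch of that assignment rule, using axis-aligned bounding boxes as a stand-in for the real contour-area overlap test (the function names and the bbox approximation are assumptions; the centroid fast path is omitted):

```python
PRIORITY = ["target", "edge", "periphery", "background"]  # highest first

def bbox_overlap_frac(contour_bbox, zone_bbox):
    """Fraction of the contour's bbox area covered by the zone bbox."""
    ax1, ay1, ax2, ay2 = contour_bbox
    bx1, by1, bx2, by2 = zone_bbox
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    area = max(1, (ax2 - ax1) * (ay2 - ay1))
    return iw * ih / area

def assign_zone(contour_bbox, zone_bboxes, threshold=0.3):
    """Highest-priority zone covering > threshold of the contour; else background."""
    for zone in PRIORITY[:-1]:  # background is the fallback, not tested
        for zb in zone_bboxes.get(zone, []):
            if bbox_overlap_frac(contour_bbox, zb) > threshold:
                return zone
    return "background"
```

Scanning zones in priority order is what makes a shape that is 40% face and 60% periphery land in the target zone rather than the looser one.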

Periphery Generation

When focus targets or edges are specified, periphery is automatically generated as a buffer zone around the foreground (dilated union of target + edge zones). This creates a smooth detail gradient from subject to background.
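A minimal sketch of that buffer, approximating the mask dilation by expanding each foreground bbox by a fixed margin and clamping to the image frame (the function name and margin value are assumptions):

```python
def dilate_boxes(boxes, margin, img_w, img_h):
    """Expand each foreground bbox by `margin` px, clamped to the image frame."""
    return [(max(0, x1 - margin), max(0, y1 - margin),
             min(img_w, x2 + margin), min(img_h, y2 + margin))
            for (x1, y1, x2, y2) in boxes]

# Periphery candidates: the union of these expanded boxes, minus the
# original target/edge zones, forms the intermediate-detail ring.
periphery = dilate_boxes([(215, 125, 295, 195)], 40, 800, 600)
```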

Per-Zone Style Transforms

With zone-tagged shapes sharing a unified palette, backgrounds can be independently styled without affecting subject colors:

style_transforms={
    'background': 'desaturate:0.7',   # 70% desaturated
    'periphery': 'mute:0.3',          # 30% toward mid-gray
}

Available transforms:

| Transform | Effect |
|---|---|
| desaturate:N | Shift toward gray (0 = none, 1 = grayscale) |
| grayscale | Full grayscale |
| mute:N | Shift toward mid-gray (0 = none, 1 = flat gray) |
| warm:N | Warmer color temperature |
| cool:N | Cooler color temperature |
| opacity:N | Group opacity (0 = invisible, 1 = full) |
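As a sketch of what these transforms might do to a palette entry, here is one plausible reading of desaturate and mute on hex colors. The helper names and the Rec. 601 luminance weights are assumptions, not the skill's actual implementation:

```python
def _hex_to_rgb(h):
    h = h.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

def _rgb_to_hex(rgb):
    return "#%02x%02x%02x" % tuple(round(c) for c in rgb)

def desaturate(hex_color, n):
    """Blend each channel toward the color's own luminance gray (n=1 -> grayscale)."""
    r, g, b = _hex_to_rgb(hex_color)
    gray = 0.299 * r + 0.587 * g + 0.114 * b  # Rec. 601 luma weights (assumed)
    return _rgb_to_hex(tuple(c + (gray - c) * n for c in (r, g, b)))

def mute(hex_color, n):
    """Blend each channel toward flat mid-gray 128 (n=1 -> uniform gray)."""
    return _rgb_to_hex(tuple(c + (128 - c) * n for c in _hex_to_rgb(hex_color)))
```

The key difference: desaturate preserves each color's brightness while removing hue; mute flattens both hue and brightness toward a single gray.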

Parameters

portrait_mode(image_path,
    # Zone annotations
    focus_targets=None,   # [{'bbox': (x1,y1,x2,y2), 'label': str}, ...]
    focus_edges=None,     # [{'bbox': (x1,y1,x2,y2), 'label': str}, ...]

    # Pipeline settings
    K=96,                 # Color clusters (higher = more tonal detail)
    smooth=None,          # ImageMagick preprocessing ("oilpaint", "kuwahara:N")
    svg_width=800,

    # MediaPipe options
    use_landmarks=True,   # Try MP face landmarks for precise face geometry

    # Per-zone simplification overrides
    zone_simplification=None,  # {ZONE_TARGET: {'epsilon_mult': 0.3, 'min_area': 10}}

    # Per-zone style transforms
    style_transforms=None,  # {'background': 'desaturate:0.7'}
)

Performance

Single pipeline pass (~8-12s for a typical photo at K=96) vs v0.5.0's four independent passes (~40-60s). Zone detection and contour assignment add <1s. Style transforms rewrite color strings in the output, so their cost is negligible.

Requirements

Cross-skill dependencies:

  • image-to-svg pipeline (/mnt/skills/user/image-to-svg/)
  • flowing DAG runner (/mnt/skills/user/flowing/)
  • seeing-images (/mnt/skills/user/seeing-images/) — for the agent's visual inspection

Optional MediaPipe models (auto-downloaded on first use):

  • blaze_face_short_range.tflite — face detection fallback
  • face_landmarker.task — precise face oval (478 mesh points)

Note: MediaPipe selfie segmenter is NOT used in v0.6.0. Zone detection comes from agent bboxes, with MP used only for face landmark refinement.

pip install opencv-python-headless scikit-image scipy scikit-learn --break-system-packages -q
apt-get install -y librsvg2-bin -qq

Verification Protocol

After EVERY run, render and visually compare side-by-side. Same as image-to-svg.

import subprocess
from PIL import Image

# Rasterize the SVG so it can be compared against the source photo
subprocess.run(['rsvg-convert', '-w', '1400', 'output.svg', '-o', 'output.png'],
               check=True)

# Scale both images to a common height and paste them side-by-side
orig = Image.open('source.jpg')
rendered = Image.open('output.png')
target_h = 800
orig_r = orig.resize((int(orig.width * target_h / orig.height), target_h))
rend_r = rendered.resize((int(rendered.width * target_h / rendered.height), target_h))
gap = 20
comp = Image.new('RGB', (orig_r.width + rend_r.width + gap, target_h), (255, 255, 255))
comp.paste(orig_r, (0, 0))
comp.paste(rend_r, (orig_r.width + gap, 0))
comp.save('comparison.png')

What Changed from v0.5.0

Deleted

  • Per-zone image-to-svg calls (4 pipeline runs)
  • Per-zone smoothing (kuwahara, oilpaint per zone)
  • ClipPath compositing
  • Opaque crop + translate trick
  • Multi-pass segmentation (21 IM transforms × MP segmenter)
  • MediaPipe selfie segmenter dependency

Kept

  • Agent annotation API (focus_targets, focus_edges with bboxes)
  • MediaPipe face landmarks for precise face ovals
  • Four-zone concept (target / edge / periphery / background)

Added

  • Single-pass pipeline with unified palette
  • Zone-aware contour simplification (epsilon + min_area per zone)
  • Per-zone style transforms (desaturate, mute, warm/cool, opacity)
  • Automatic periphery generation (dilated foreground buffer)
  • Zone-tagged shapes in SVG output (<g> groups)