Genmz-shop photo-editor
Edit, resize, crop, filter, and optimize images — backgrounds, watermarks, and batch processing.
git clone https://github.com/Darsh20009/genmz-shop
T=$(mktemp -d) && git clone --depth=1 https://github.com/Darsh20009/genmz-shop "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.local/secondary_skills/photo-editor" ~/.claude/skills/darsh20009-genmz-shop-photo-editor && rm -rf "$T"
.local/secondary_skills/photo-editor/SKILL.mdPhoto Editor
Resize, crop, filter, and optimize images. Pillow for Python, sharp for Node. Clarify intent before starting.
Clarify Intent First
When a user asks to "edit a photo" or "change an image," the request could mean two very different things. Ask before proceeding if it's ambiguous:
- Edit the existing image — crop, resize, recolor, adjust brightness/contrast, add text, remove background, apply filters, watermark, etc. → Use the tools below (Pillow, sharp, OpenCV).
- Generate a new AI image — create something from scratch or heavily reimagine the photo (e.g., "make this photo look like a painting," "put me on a beach," "create a logo from this concept"). → Use image generation tools instead, not this skill.
When to ask
-
"Can you fix this photo?" → Probably editing. Ask what specifically needs fixing.
-
"Make this look better" → Ambiguous. Ask: "Do you want me to adjust the existing photo (brightness, contrast, cropping, etc.) or generate a new version with AI?"
-
"Change the background" → Could be either. Ask: "Should I remove the current background (I can make it transparent or a solid color), or do you want an AI-generated scene behind you?"
-
"Make a profile picture from this" → Likely crop/resize, but could mean AI enhancement. Clarify.
Don't ask when it's obvious
-
"Crop this to 1080x1080" → Just crop it.
-
"Make this a PNG" → Just convert it.
-
"Remove the background" → Use rembg.
-
"Generate a photo of a sunset" → No existing photo to edit — use image generation.
Tool Selection
| Tool | Use when | Install |
|---|---|---|
| Pillow | Default: resize, crop, filters, text, format conversion |
pip install Pillow |
| OpenCV | Computer vision: face detection, perspective transform, inpainting, contours |
pip install opencv-python numpy |
| sharp (Node) | High-volume pipelines — 4-5x faster than Pillow (libvips-backed) |
npm install sharp |
| rembg | AI background removal |
pip install rembg |
| ImageMagick | CLI batch ops, 200+ formats |
apt install imagemagick |
Open — ALWAYS Fix Orientation First
from PIL import Image, ImageOps img = Image.open("photo.jpg") img = ImageOps.exif_transpose(img) \# CRITICAL: applies EXIF rotation, then strips tag # Without this, phone photos appear sideways after processing
Resize & Crop
from PIL import Image, ImageOps # --- Fit inside box, keep aspect ratio (shrink only) --- img.thumbnail((1080, 1080), Image.Resampling.LANCZOS) \# modifies in place # --- Exact size, keep aspect, center-crop overflow (best for thumbnails) --- thumb = ImageOps.fit(img, (300, 300), Image.Resampling.LANCZOS, centering=(0.5, 0.5)) # --- Exact size, keep aspect, pad with color (letterbox) --- padded = ImageOps.pad(img, (1920, 1080), color=(0, 0, 0)) # --- Exact size, ignore aspect (will distort) --- stretched = img.resize((800, 600), Image.Resampling.LANCZOS) # --- Scale by factor --- half = img.resize((img.width // 2, img.height // 2), Image.Resampling.LANCZOS) # --- Manual crop (left, upper, right, lower) — NOT (x, y, w, h) --- cropped = img.crop((100, 50, 900, 650))
Resampling filters:
LANCZOSfor photo downscale (best quality),BICUBICfor upscale,NEAREST for pixel art/icons (no smoothing).
Face-Aware Cropping
For portraits and headshots, detect the face first and crop around it instead of guessing coordinates. This produces much better results for profile pictures.
import cv2 import numpy as np from PIL import Image, ImageOps img = Image.open("portrait.jpg") img = ImageOps.exif_transpose(img) cv_img = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR) gray = cv2.cvtColor(cv_img, cv2.COLOR_BGR2GRAY) cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml") faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(100, 100)) if len(faces) > 0: # Use the largest detected face fx, fy, fw, fh = max(faces, key=lambda f: f[2] * f[3]) face_cx = fx + fw // 2 face_cy = fy + fh // 2 # Square crop centered on face with padding (3x face height for head+shoulders) crop_size = min(img.width, img.height, fh * 3) left = max(0, face_cx - crop_size // 2) top = max(0, face_cy - int(crop_size * 0.35)) \# face in upper third right = left + crop_size bottom = top + crop_size # Clamp to image bounds if right > img.width: left -= (right - img.width) right = img.width if bottom > img.height: top -= (bottom - img.height) bottom = img.height left = max(0, left) top = max(0, top) cropped = img.crop((left, top, right, bottom)) profile = cropped.resize((800, 800), Image.Resampling.LANCZOS) profile.save("profile_800x800.jpg", quality=92, optimize=True) else: # Fallback: center crop profile = ImageOps.fit(img, (800, 800), Image.Resampling.LANCZOS, centering=(0.5, 0.4)) profile.save("profile_800x800.jpg", quality=92, optimize=True)
Tips
in the fallback biases the crop slightly toward the top — better for portraits than dead center.centering=(0.5, 0.4)- For group photos with multiple faces, you may want to fit all detected faces in the crop instead of picking the largest.
Color & Exposure
from PIL import ImageEnhance, ImageOps # --- Enhancers: 1.0 = unchanged, <1 less, >1 more --- img = ImageEnhance.Brightness(img).enhance(1.15) img = ImageEnhance.Contrast(img).enhance(1.2) img = ImageEnhance.Color(img).enhance(1.1) \# saturation img = ImageEnhance.Sharpness(img).enhance(1.5) # --- Quick ops --- gray = ImageOps.grayscale(img) inverted = ImageOps.invert(img.convert("RGB")) auto = ImageOps.autocontrast(img, cutoff=1) \# stretch histogram, clip 1% extremes equalized = ImageOps.equalize(img) \# flatten histogram
Filters
from PIL import ImageFilter img.filter(ImageFilter.GaussianBlur(radius=5)) img.filter(ImageFilter.UnsharpMask(radius=2, percent=150, threshold=3)) \# better than SHARPEN img.filter(ImageFilter.BoxBlur(10)) img.filter(ImageFilter.FIND_EDGES) img.filter(ImageFilter.MedianFilter(size=3)) \# denoise, removes salt-and-pepper
Watermark / Logo Removal
Use OpenCV's
cv2.inpaint()— it fills a masked region by sampling surrounding pixels, producing seamless results. Do not use pixel-by-pixelgetpixel/putpixel loops — they are slow and produce visible artifacts.
Step 1: Find the watermark boundaries
Always inspect the image at full resolution first. Watermarks are often much larger than they appear in thumbnails. Save a crop of the watermark region to verify coordinates before attempting removal.
import cv2 import numpy as np from PIL import Image, ImageOps img = Image.open("photo.jpg") img = ImageOps.exif_transpose(img) w, h = img.size # Save a debug crop of the suspected watermark area to verify its extent debug = img.crop((0, 0, min(w, 1000), min(h, 500))) debug.save("debug_watermark_area.jpg") # IMPORTANT: View this debug image to confirm where the watermark actually is # before proceeding. Guessing coordinates wastes iterations.
Step 2: Create a mask and inpaint
cv_img = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR) # --- Option A: Color-based mask (best for colored logos on neutral backgrounds) --- hsv = cv2.cvtColor(cv_img, cv2.COLOR_BGR2HSV) # Example: detect blue watermark pixels (adjust ranges for your watermark color) lower_blue = np.array([90, 40, 40]) upper_blue = np.array([130, 255, 255]) mask = cv2.inRange(hsv, lower_blue, upper_blue) # Also catch dark text pixels in the watermark region roi_gray = cv2.cvtColor(cv_img[:wm_h, :wm_w], cv2.COLOR_BGR2GRAY) _, dark_mask = cv2.threshold(roi_gray, 80, 255, cv2.THRESH_BINARY_INV) # Combine: place dark_mask into the full-size mask mask[:wm_h, :wm_w] = cv2.bitwise_or(mask[:wm_h, :wm_w], dark_mask) # --- Option B: Region-based mask (when you know the bounding box) --- # Simpler but removes everything in the box, not just the watermark pixels mask = np.zeros(cv_img.shape[:2], dtype=np.uint8) mask[0:wm_h, 0:wm_w] = 255 \# fill the entire watermark region # Dilate the mask slightly to catch anti-aliased edges kernel = np.ones((5, 5), np.uint8) mask = cv2.dilate(mask, kernel, iterations=2) # Inpaint — fills masked area using surrounding pixel data result = cv2.inpaint(cv_img, mask, inpaintRadius=7, flags=cv2.INPAINT_TELEA) # INPAINT_TELEA: fast marching method (best for most cases) # INPAINT_NS: Navier-Stokes (better for large regions, slower) result_pil = Image.fromarray(cv2.cvtColor(result, cv2.COLOR_BGR2RGB)) result_pil.save("clean.jpg", quality=92, optimize=True)
Step 3: Verify the result
# Save a crop of the same region after removal to confirm it's clean verify = result_pil.crop((0, 0, min(w, 1000), min(h, 500))) verify.save("debug_watermark_removed.jpg") # View this image before proceeding to resize/crop
Tips (2)
-
For watermarks on uniform backgrounds (studio portraits, product photos),
withINPAINT_TELEA
works well.inpaintRadius=5-10 -
For watermarks over textured areas (landscapes, fabric), use
with a larger radius (10-15).INPAINT_NS -
If the watermark is semi-transparent, color-based masking (Option A) is more precise than a rectangular region mask.
-
Always verify at full resolution — artifacts invisible in thumbnails may be obvious when zoomed in.
Text & Watermark (Adding)
from PIL import Image, ImageDraw, ImageFont draw = ImageDraw.Draw(img) try: font = ImageFont.truetype("DejaVuSans-Bold.ttf", 48) \# Linux default except OSError: font = ImageFont.load_default() \# fallback (tiny, ugly) # --- Text with outline --- draw.text((50, 50), "Caption", font=font, fill="white", stroke_width=3, stroke_fill="black") # --- Centered text --- bbox = draw.textbbox((0, 0), "Centered", font=font) tw, th = bbox[2] - bbox[0], bbox[3] - bbox[1] draw.text(((img.width - tw) // 2, (img.height - th) // 2), "Centered", font=font, fill="white") # --- Watermark (semi-transparent PNG overlay) --- logo = Image.open("logo.png").convert("RGBA") logo.thumbnail((img.width // 5, img.height // 5)) # Fade to 40% opacity alpha = logo.split()[3].point(lambda p: int(p * 0.4)) logo.putalpha(alpha) pos = (img.width - logo.width - 20, img.height - logo.height - 20) img.paste(logo, pos, logo) \# third arg = alpha mask — REQUIRED for transparency
Save & Optimize
# --- JPEG --- img.convert("RGB").save("out.jpg", quality=85, optimize=True, progressive=True) # convert("RGB") REQUIRED if source has alpha — JPEG can't store transparency # --- PNG (lossless — quality param does nothing) --- img.save("out.png", optimize=True, compress_level=9) # --- WebP (best web format: ~30% smaller than JPEG at same quality) --- img.save("out.webp", quality=85, method=6) \# method 0-6, 6=slowest/best compression # --- AVIF (smallest files, Pillow 11+, slower encode) --- img.save("out.avif", quality=75) \# 75 ≈ JPEG 85 visually, ~50% smaller # --- Strip all metadata (privacy) --- clean = Image.new(img.mode, img.size) clean.putdata(list(img.getdata())) clean.save("stripped.jpg", quality=85)
Quality guide: JPEG/WebP 85 = sweet spot. 90+ = diminishing returns. <70 = visible artifacts. Never re-save JPEGs repeatedly — each save degrades (generation loss).
Batch Processing
from pathlib import Path from PIL import Image, ImageOps out = Path("optimized"); out.mkdir(exist_ok=True) for p in Path("photos").glob("*.[jJ][pP]*[gG]"): \# matches jpg, jpeg, JPG, JPEG img = ImageOps.exif_transpose(Image.open(p)) img.thumbnail((1920, 1920), Image.Resampling.LANCZOS) img.convert("RGB").save(out / f"{p.stem}.webp", quality=85, method=6)
sharp (Node.js — use for high throughput)
const sharp = require('sharp'); // Resize + convert + optimize, streaming (flat memory) await sharp('in.jpg') .rotate() // auto-rotate from EXIF (like exif_transpose) .resize(1080, 1080, { fit: 'cover', position: 'center' }) // = ImageOps.fit .webp({ quality: 85 }) .toFile('out.webp'); // fit options: 'cover' (crop), 'contain' (letterbox), 'inside' (shrink to fit), 'fill' (stretch) // Composite watermark await sharp('photo.jpg') .composite([{ input: 'logo.png', gravity: 'southeast' }]) .toFile('watermarked.jpg');
sharp strips all metadata by default. Use
.withMetadata() to preserve EXIF/ICC.
OpenCV (when Pillow isn't enough)
import cv2 img = cv2.imread("in.jpg") \# BGR order, not RGB! gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # Face detection cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml") faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5) for (x, y, w, h) in faces: cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2) cv2.imwrite("out.jpg", img) # Pillow <-> OpenCV import numpy as np cv_img = cv2.cvtColor(np.array(pil_img), cv2.COLOR_RGB2BGR) pil_img = Image.fromarray(cv2.cvtColor(cv_img, cv2.COLOR_BGR2RGB))
Bulk Pixel Manipulation — Use numpy, Not getpixel/putpixel
Never loop over pixels with
/getpixel()
for large regions — it is extremely slow (minutes for a full image). Convert to a numpy array, operate on the array, then convert back.putpixel()
import numpy as np from PIL import Image img = Image.open("photo.jpg").convert("RGB") arr = np.array(img) \# shape: (height, width, 3), dtype: uint8 # Example: replace a region with sampled background color bg_color = arr[50, -100, :] \# sample one pixel from the right side arr[0:300, 0:800, :] = bg_color \# fill the region instantly # Example: blend two regions with a gradient mask alpha = np.linspace(1, 0, 100).reshape(1, 100, 1) \# horizontal fade over 100px region = arr[0:300, 700:800, :] bg_strip = np.full_like(region, bg_color) arr[0:300, 700:800, :] = (region * (1 - alpha) + bg_strip * alpha).astype(np.uint8) result = Image.fromarray(arr)
Speed comparison:
putpixel on a 700x250 region = ~175,000 calls = 30+ seconds. numpy array slice = instant.
Platform Dimensions
| Platform | Size | Ratio |
|---|---|---|
| Instagram post | 1080x1080 | 1:1 |
| Instagram story / TikTok | 1080x1920 | 9:16 |
| LinkedIn profile photo | 400x400 | 1:1 |
| LinkedIn banner | 1584x396 | 4:1 |
| LinkedIn post | 1200x627 | 1.91:1 |
| Twitter/X profile | 400x400 | 1:1 |
| Twitter/X post | 1200x675 | 16:9 |
| Facebook profile | 320x320 | 1:1 |
| Facebook cover | 851x315 | 2.7:1 |
| YouTube thumbnail | 1280x720 | 16:9 |
| WhatsApp profile | 500x500 | 1:1 |
| Open Graph (link preview) | 1200x630 | 1.91:1 |
Debug Workflow
When edits don't look right, follow this process instead of re-running the whole pipeline blindly:
-
Save a debug crop of the target area before and after processing. View both to confirm what changed.
-
Work at full resolution first. Watermarks and artifacts that look small in a thumbnail can be large at native resolution. Always inspect at the original size before resizing.
-
Save intermediate results. After each major processing step (watermark removal, crop, resize), save a checkpoint image so you can identify which step introduced a problem.
-
Spot-check specific pixels to verify processing took effect:
# Quick pixel check after watermark removal for (x, y) in [(60, 50), (200, 130), (400, 100)]: print(f"({x},{y}): {img.getpixel((x, y))}") # If these still show watermark colors (e.g. bright blue), removal failed
Gotchas
-
box isimg.crop()
— absolute coords, NOT(left, top, right, bottom)(x, y, width, height) -
mutates in place and returnsthumbnail()
— don't doNoneimg = img.thumbnail(...) -
Paste with transparency needs the image as the third (mask) arg:
bg.paste(fg, pos, fg) -
Palette mode ("P") breaks many filters —
firstimg.convert("RGB") -
Fonts:
needs a real font file. Linux:ImageFont.truetype
. Ship a/usr/share/fonts/truetype/dejavu/
with your code for portability..ttf -
OpenCV needs numpy — always
togetherpip install opencv-python numpy -
OpenCV uses BGR, Pillow uses RGB — convert when switching between them or colors will be wrong