# product-image-processor — Product Image Processor

Download, resize, and remove backgrounds from product images at scale.

Part of the Skills-for-architects collection, at `plugins/06-materials-research/skills/product-image-processor/SKILL.md`. Install the full collection:

```shell
git clone https://github.com/AlpacaLabsLLC/skills-for-architects
```

Or copy just this skill into `~/.claude/skills`:

```shell
T=$(mktemp -d) && git clone --depth=1 https://github.com/AlpacaLabsLLC/skills-for-architects "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/06-materials-research/skills/product-image-processor" ~/.claude/skills/alpacalabsllc-skills-for-architects-product-image-processor && rm -rf "$T"
```
Download product images from a Google Sheet, normalize sizing, and remove backgrounds. Saves output at each processing stage.
Works with the master Google Sheet — the 33-column schema defined in `../../schema/product-schema.md`. Image URLs are in column AC, product names in column E. Read `../../schema/sheet-conventions.md` for CRUD patterns with MCP tools.
## Step 1: Get Input
If no arguments are provided, ask the user for:

1. Spreadsheet ID — the Google Sheets ID (from the URL: `docs.google.com/spreadsheets/d/{ID}/...`).
2. Image URL column — which column contains image URLs (default: `AC` in the master schema, or the user can specify).
3. Name column (optional) — which column has product names for file naming (default: `E` in the master schema). If not provided, derive names from the image URL/filename.
4. Output location — where to save the images. Suggest `./product-images-YYYY-MM-DD/` as a default, but let the user pick any path.
5. Header row — whether row 1 is a header (default: yes; data starts at row 2 in the master schema).
## Step 2: Read URLs from Google Sheet
Use `mcp__google-sheets__list_sheets` to inspect the sheet, then `mcp__google-sheets__get_sheet_data` to read the image URL column and optional name column.

Build a list of `{ index, url, name }` entries. Skip empty rows.
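Under the assumption that the sheet data comes back as rows of cell strings, the list-building step might look like the sketch below. Both helpers (`col_to_index`, `build_entries`) are illustrative names, not part of the skill or the MCP tools:

```python
def col_to_index(col: str) -> int:
    """Convert a spreadsheet column letter (A, B, ..., AA, AB, ...) to a 0-based index."""
    idx = 0
    for ch in col.upper():
        idx = idx * 26 + (ord(ch) - ord("A") + 1)
    return idx - 1

def build_entries(rows, url_col="AC", name_col="E", has_header=True):
    """Build {index, url, name} entries from sheet rows, skipping rows with no URL."""
    u, n = col_to_index(url_col), col_to_index(name_col)
    start = 1 if has_header else 0
    entries = []
    for i, row in enumerate(rows[start:], start=start + 1):  # 1-based sheet row numbers
        url = row[u].strip() if len(row) > u and row[u] else ""
        if not url:
            continue  # skip empty rows
        name = row[n].strip() if len(row) > n and row[n] else ""
        entries.append({"index": i, "url": url, "name": name})
    return entries
```

Column `AC` maps to 0-based index 28 and `E` to 4, matching the master schema defaults.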
## Step 3: Create Output Folders
Create the output directory at the user's chosen path with 3 subfolders:
```
<output-path>/
├── originals/   # Raw downloads
├── resized/     # Normalized sizing
└── nobg/        # Background removed
```

If the folder already exists, append a suffix: `-2`, `-3`, etc.
## Step 4: Download Images
Download each image using `curl` in Bash:

```shell
curl -L -o "<output-path>" "<url>"
```
IMPORTANT: Use `curl`, NOT WebFetch. WebFetch processes content through an AI model, which corrupts binary image data.
Name files as `001-product-name.png`, `002-product-name.png`, etc.:
- Slugify the product name: lowercase, replace spaces/special chars with hyphens, strip consecutive hyphens
- If no name column, extract a name from the URL filename (strip extension and query params)
- If the URL gives no usable name, use `001-image.png`, `002-image.png`, etc.
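The naming rules above can be sketched as a pair of helpers (`slugify` and `file_name` are illustrative names, not part of the skill):

```python
import os
import re
from urllib.parse import urlparse

def slugify(name: str) -> str:
    """Lowercase, replace spaces/special chars with hyphens, strip repeats."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

def file_name(index: int, name: str, url: str) -> str:
    """001-product-name.png; fall back to the URL filename, then 'image'."""
    slug = slugify(name) if name else ""
    if not slug:
        # Extract a name from the URL path (extension and query params stripped)
        base = os.path.basename(urlparse(url).path)
        slug = slugify(os.path.splitext(base)[0])
    if not slug:
        slug = "image"
    return f"{index:03d}-{slug}.png"
```

Using `urlparse(...).path` before `basename` drops query parameters, so `.../table-top.jpg?w=500` still yields `table-top`.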
If the downloaded file is not a PNG (check extension or content type), convert it to PNG during the resize step.
## Step 5: Resize Images

Run a Python script to resize all images in `originals/` → `resized/`:
```python
from PIL import Image
import os, sys

input_dir = sys.argv[1]   # originals/
output_dir = sys.argv[2]  # resized/
max_edge = int(sys.argv[3]) if len(sys.argv) > 3 else 2000

for fname in sorted(os.listdir(input_dir)):
    if not fname.lower().endswith(('.png', '.jpg', '.jpeg', '.webp', '.gif', '.bmp', '.tiff')):
        continue
    try:
        img = Image.open(os.path.join(input_dir, fname))
        img = img.convert("RGBA")
        w, h = img.size
        longest = max(w, h)
        if longest > max_edge:
            scale = max_edge / longest
            new_w, new_h = int(w * scale), int(h * scale)
            img = img.resize((new_w, new_h), Image.LANCZOS)
        out_name = os.path.splitext(fname)[0] + ".png"
        img.save(os.path.join(output_dir, out_name), "PNG")
        print(f"OK: {fname} → {out_name} ({img.size[0]}x{img.size[1]})")
    except Exception as e:
        print(f"FAIL: {fname} — {e}")
```
Rules:
- Max 2000px on the longest edge (configurable if user requests)
- Preserve aspect ratio
- Do NOT upscale — if already smaller than max, keep original dimensions
- Convert everything to PNG (RGBA mode for transparency support)
## Step 6: Remove Backgrounds

Check if `rembg` is installed. If not, install it:

```shell
pip3 install rembg onnxruntime
```
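One way to sketch the installed-check, using the standard library's `importlib.util.find_spec` (`needs_install` and `ensure_rembg` are hypothetical helper names):

```python
import importlib.util
import subprocess
import sys

def needs_install(module: str) -> bool:
    """True if the module cannot be found by the current interpreter."""
    return importlib.util.find_spec(module) is None

def ensure_rembg():
    """Install rembg + onnxruntime only when rembg is missing."""
    if needs_install("rembg"):
        subprocess.check_call([sys.executable, "-m", "pip", "install", "rembg", "onnxruntime"])
```

Installing via `sys.executable -m pip` targets the same interpreter that will run the removal script, which avoids a mismatch between `pip3` and `python3` environments.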
Then run background removal on all resized images → `nobg/`:
```python
from rembg import remove
from PIL import Image
import os, sys, io

input_dir = sys.argv[1]   # resized/
output_dir = sys.argv[2]  # nobg/

for fname in sorted(os.listdir(input_dir)):
    if not fname.lower().endswith('.png'):
        continue
    try:
        input_path = os.path.join(input_dir, fname)
        with open(input_path, 'rb') as f:
            input_data = f.read()
        output_data = remove(input_data)
        img = Image.open(io.BytesIO(output_data))
        img.save(os.path.join(output_dir, fname), "PNG")
        print(f"OK: {fname}")
    except Exception as e:
        print(f"FAIL: {fname} — {e}")
```
Note: The first run of rembg downloads the u2net model (~170MB). Warn the user this may take a minute.
## Step 7: Report Results
After processing, print a summary:
```
## Product Image Processing Complete

📁 Output: ./product-images-YYYY-MM-DD/

| Stage      | Success | Failed |
|------------|---------|--------|
| Downloaded | 12      | 1      |
| Resized    | 12      | 0      |
| BG Removed | 12      | 0      |

### Failures
- 003-chair-arm.png: Download failed (404 Not Found)
```
Include the full path to the output folder so the user can open it.
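The summary table could be assembled from per-stage counts along these lines (a sketch only; `summary_table` is a hypothetical helper and the counts come from the pipeline):

```python
def summary_table(stages: dict) -> str:
    """Render a markdown table from {stage: (success, failed)} counts."""
    lines = ["| Stage | Success | Failed |", "|-------|---------|--------|"]
    for stage, (ok, fail) in stages.items():
        lines.append(f"| {stage} | {ok} | {fail} |")
    return "\n".join(lines)
```

Dict insertion order is preserved (Python 3.7+), so listing stages in pipeline order keeps the table rows in pipeline order too.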
## Error Handling
- Download failures: Log and continue. Don't block the pipeline for one bad URL.
- Resize failures: Log and continue. Skip that image in the bg-removal step.
- rembg failures: Log and continue. Some images (vectors, icons) may not process well.
- Sheet read errors: Stop and report. Ask the user to verify the spreadsheet ID and column.
## Notes
- Process images sequentially (not parallel) to avoid overwhelming the network or CPU
- For large batches (50+ images), print progress every 10 images
- The rembg model download only happens once — subsequent runs reuse the cached model
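The sequential loop with periodic progress can be sketched as follows, where `process_one` stands in for the per-image download/resize/bg-removal work:

```python
def process_all(items, process_one, progress_every=10):
    """Process items one at a time, printing progress every N items."""
    done = 0
    for item in items:
        process_one(item)
        done += 1
        if done % progress_every == 0:
            print(f"Progress: {done}/{len(items)} images processed")
    return done
```

Processing strictly one item at a time matches the note above: no parallel downloads or concurrent `rembg` runs competing for network or CPU.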