px-asset-extract
install
source · Clone the upstream repo
git clone https://github.com/JadeLiu-tech/px-asset-extract
Claude Code · Install into ~/.claude/skills/
git clone --depth=1 https://github.com/JadeLiu-tech/px-asset-extract ~/.claude/skills/jadeliu-tech-px-asset-extract-px-asset-extract
manifest: SKILL.md
safety · automated scan (low risk)
This is a pattern-based risk scan, not a security review. Our crawler flagged:
- pip install
Always read a skill's source content before installing. Patterns alone don't mean the skill is malicious — but they warrant attention.
source content
px-asset-extract: Image Asset Extraction
What It Does
Decomposes images into individual transparent PNG assets with classification and a JSON manifest. The full pipeline runs in 2-6 seconds on CPU with zero ML models:
- Background detection — median color from image borders
- Foreground mask — Euclidean color distance thresholding
- Character bridging — dilation connects letters into words
- Connected components — union-find with 8-connectivity
- Classification — heuristic typing into 10 categories
- Text-line merging — groups word fragments into text lines
- Alpha extraction — anti-aliased transparent cropping
- Deduplication — removes overlapping and oversized segments
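The first four steps above can be sketched in pure Python on a tiny nested-list image. This is an illustrative sketch, not the package's implementation (which depends on numpy and Pillow): the function names, the default threshold of 30, and the single dilation pass are all assumptions made for the example.

```python
from statistics import median

# Offsets of the 8 neighbours of a pixel (8-connectivity).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def border_background(img):
    """Median RGB colour of the pixels on the image border."""
    h, w = len(img), len(img[0])
    border = [img[y][x] for y in range(h) for x in range(w)
              if y in (0, h - 1) or x in (0, w - 1)]
    return tuple(median(p[c] for p in border) for c in range(3))


def foreground_mask(img, bg, threshold=30.0):  # threshold value is an assumption
    """True where the Euclidean RGB distance to the background exceeds threshold."""
    t2 = threshold ** 2
    return [[sum((p[c] - bg[c]) ** 2 for c in range(3)) > t2 for p in row]
            for row in img]


def dilate(mask, passes=1):
    """Grow the mask by one pixel per pass so nearby glyphs touch."""
    h, w = len(mask), len(mask[0])
    for _ in range(passes):
        grown = [row[:] for row in mask]
        for y in range(h):
            for x in range(w):
                if mask[y][x]:
                    for dy, dx in NEIGHBOURS:
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            grown[ny][nx] = True
        mask = grown
    return mask


def label_components(mask):
    """Group foreground pixels into 8-connected components with union-find."""
    parent = {}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for y, row in enumerate(mask):
        for x, is_fg in enumerate(row):
            if not is_fg:
                continue
            parent[(y, x)] = (y, x)
            # In raster order only the row above and the left pixel are already seen.
            for dy, dx in ((-1, -1), (-1, 0), (-1, 1), (0, -1)):
                n = (y + dy, x + dx)
                if n in parent:
                    ra, rb = find((y, x)), find(n)
                    if ra != rb:
                        parent[ra] = rb

    groups = {}
    for p in parent:
        groups.setdefault(find(p), []).append(p)
    return list(groups.values())


# Demo: three black pixels on white; the first two are one pixel apart.
W, K = (255, 255, 255), (0, 0, 0)
img = [[W] * 11 for _ in range(5)]
for x in (2, 4, 8):
    img[2][x] = K

bg = border_background(img)
mask = foreground_mask(img, bg)
print(len(label_components(mask)))             # 3 separate pixels
print(len(label_components(dilate(mask, 1))))  # 2: the close pair is bridged
```

One dilation pass merges the two pixels that sit a single pixel apart, while the third stays separate, which is the same effect the bridging step uses to join letters into words.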
When to Use This
| Scenario | Use px-asset-extract? |
|---|---|
| Extract all elements from a slide/poster | Yes — this is the primary use case |
| Get only illustrations, skip text | Yes: use `--types` or `--exclude-types` |
| Extract specific objects by description | Use with `--regions` + a grounding model (e.g., Florence-2) |
| Remove background from a single photo | No — use a background removal model instead |
| Segment a photo scene | No — use SAM/FastSAM for photographic content |
| Image has textured/photographic background | Limited — works best on clean/solid backgrounds |
Installation
    git clone https://github.com/JadeLiu-tech/px-asset-extract.git
    cd px-asset-extract
    pip install .
Usage
CLI
    # Basic extraction
    px-extract <image> -o <output_dir>
    # Only extract illustrations and icons
    px-extract <image> -o <output_dir> --types illustration,icon
    # Extract everything except text, dots, and lines
    px-extract <image> -o <output_dir> --exclude-types text,dot,line
    # Extract from pre-computed bounding boxes (e.g. from px-ground)
    px-extract <image> -o <output_dir> --regions regions.json
    # Segment only: output JSON, no PNGs
    px-extract <image> --segments-only
    # Batch processing
    px-extract images/*.png -o output/ --batch
    # JSON output to stdout
    px-extract <image> -o <output_dir> --json --quiet
Python API
    from px_asset_extract import extract_assets, load_regions

    # Full extraction
    result = extract_assets("slide.png", output_dir="assets/")
    for asset in result.assets:
        print(f"{asset.id}: {asset.label} at ({asset.bbox.x}, {asset.bbox.y}) -> {asset.file_path}")

    # Type filtering
    result = extract_assets("slide.png", output_dir="icons/", types=["illustration", "icon"])
    result = extract_assets("slide.png", output_dir="graphics/", exclude_types=["text", "line", "dot"])

    # Pre-computed regions (from grounding model output)
    regions = load_regions("grounded.json")
    result = extract_assets("slide.png", output_dir="targeted/", regions=regions)

    # Combine regions + type filter
    result = extract_assets("slide.png", output_dir="charts/", regions=regions, types=["chart"])
CLI Options
| Option | Default | Description |
|---|---|---|
| `-o` | | Output directory |
| | | Background color distance (lower = more sensitive) |
| | | Minimum segment area in pixels |
| | | Character gap bridging passes |
| | | Extra pixels around each asset |
| | | Max fraction of image a segment can cover |
| `--types` | | Only extract these types (comma-separated) |
| `--exclude-types` | | Skip these types (comma-separated) |
| `--regions` | | JSON file with bounding boxes (skips segmentation) |
| `--segments-only` | | Output segment JSON without extracting PNGs |
| | | Skip visualization image |
| `--batch` | | Create subdirectories per image |
| `--json` | | Output results as JSON to stdout |
| `--quiet` | | Suppress progress messages |
Output
Each run produces:
- asset_NNN_<type>.png: individual transparent PNGs
- manifest.json: positions, types, and metadata for all assets
- visualization.png: input image with color-coded bounding boxes
Manifest format
    {
      "source_image": "slide.png",
      "source_size": {"width": 1920, "height": 1080},
      "background_color": [255, 255, 255],
      "num_assets": 44,
      "assets": [
        {
          "id": "asset_000_illustration",
          "label": "illustration",
          "file": "asset_000_illustration.png",
          "position": {"x": 100, "y": 50, "width": 400, "height": 300},
          "pixel_area": 120000
        }
      ]
    }
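As a usage sketch, a manifest in the format above can be consumed with the standard `json` module. The helper names `assets_by_label` and `center` are hypothetical and not part of the package. The manifest is inlined here with a single asset so the example is self-contained; in practice it would be read from the manifest.json file in the output directory.

```python
import json
from collections import defaultdict

# Inline copy of the documented manifest shape (one asset only).
MANIFEST_TEXT = """
{
  "source_image": "slide.png",
  "source_size": {"width": 1920, "height": 1080},
  "background_color": [255, 255, 255],
  "num_assets": 44,
  "assets": [
    {
      "id": "asset_000_illustration",
      "label": "illustration",
      "file": "asset_000_illustration.png",
      "position": {"x": 100, "y": 50, "width": 400, "height": 300},
      "pixel_area": 120000
    }
  ]
}
"""


def assets_by_label(manifest):
    """Group asset entries by their "label" field (hypothetical helper)."""
    groups = defaultdict(list)
    for asset in manifest["assets"]:
        groups[asset["label"]].append(asset)
    return dict(groups)


def center(asset):
    """Center point of an asset's bounding box in source-image pixels."""
    pos = asset["position"]
    return (pos["x"] + pos["width"] / 2, pos["y"] + pos["height"] / 2)


manifest = json.loads(MANIFEST_TEXT)
groups = assets_by_label(manifest)
for label, items in groups.items():
    for item in items:
        print(label, item["file"], center(item))
```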
Regions JSON format (for --regions)
    [
      {"x": 100, "y": 50, "width": 400, "height": 300, "label": "chart"},
      {"x1": 600, "y1": 100, "x2": 800, "y2": 300, "label": "logo"}
    ]
Also supports a `{"regions": [...]}` wrapper. Label defaults to "region" if omitted.
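The accepted shapes can be normalized into one representation before use. The sketch below is a hypothetical stand-in written for illustration, not the package's `load_regions`. It assumes that `x2`/`y2` mark the far corner, so width is `x2 - x1` and height is `y2 - y1`.

```python
import json


def normalize_regions(data):
    """Convert either documented region shape to x/y/width/height dicts.

    Hypothetical helper; assumes x2/y2 are the far corner of the box.
    """
    if isinstance(data, dict):  # {"regions": [...]} wrapper
        data = data["regions"]
    normalized = []
    for region in data:
        if "x1" in region:  # corner form
            x, y = region["x1"], region["y1"]
            width = region["x2"] - region["x1"]
            height = region["y2"] - region["y1"]
        else:  # x/y/width/height form
            x, y = region["x"], region["y"]
            width, height = region["width"], region["height"]
        normalized.append({
            "x": x, "y": y, "width": width, "height": height,
            "label": region.get("label", "region"),  # documented default label
        })
    return normalized


raw = json.loads("""
{"regions": [
  {"x": 100, "y": 50, "width": 400, "height": 300, "label": "chart"},
  {"x1": 600, "y1": 100, "x2": 800, "y2": 300}
]}
""")
regions = normalize_regions(raw)
for region in regions:
    print(region)
```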
Asset Types
| Type | Detection Logic |
|---|---|
| text | dark_ratio > 0.4, uniform ink color |
| illustration | Large (>1% image area), colorful |
| icon | Small (<3000px area, <60px max dimension) |
| | Medium-sized, colored |
| line | Thin (min dimension <=5px, extreme aspect ratio) |
| dot | Very small (<150px area, <20px dimension) |
| | Low fill ratio (<0.25) |
| | Spans >80% of image, very low fill |
| | Bright (>200), low contrast, low saturation |
| | Catch-all for unclassified objects |
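A few of the thresholds above can be expressed as a small rule cascade. This is a sketch under several assumptions, not the package's classifier. The labels `dot`, `line`, `text`, `icon`, and `illustration` are taken from the CLI examples, and their pairing with specific rules is inferred. The rule order, the 10:1 cutoff for "extreme aspect ratio", the reading of "<20px dimension" as the larger dimension, and the boolean `uniform_ink` and `colorful` inputs are all hypothetical.

```python
def classify(area, width, height, image_area,
             dark_ratio=0.0, uniform_ink=False, colorful=False):
    """Return a type label for a segment, or None if no rule here matches.

    Rule order and the aspect-ratio cutoff are assumptions for illustration.
    """
    min_dim, max_dim = min(width, height), max(width, height)
    if area < 150 and max_dim < 20:
        return "dot"
    if min_dim <= 5 and max_dim / max(min_dim, 1) >= 10:  # assumed 10:1 cutoff
        return "line"
    if dark_ratio > 0.4 and uniform_ink:
        return "text"
    if area < 3000 and max_dim < 60:
        return "icon"
    if area > 0.01 * image_area and colorful:
        return "illustration"
    return None  # remaining categories are not modeled in this sketch


IMAGE_AREA = 1920 * 1080
print(classify(100, 10, 10, IMAGE_AREA))
print(classify(600, 200, 3, IMAGE_AREA))
print(classify(5000, 200, 25, IMAGE_AREA, dark_ratio=0.6, uniform_ink=True))
print(classify(2000, 50, 40, IMAGE_AREA))
print(classify(120000, 400, 300, IMAGE_AREA, colorful=True))
```

Checking the smallest categories first keeps a tiny speck from being labeled as an icon, since every dot also satisfies the icon size limits.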
Performance
| Image type | Assets | Time |
|---|---|---|
| Presentation slide | 22-44 | 2-6s |
| Poster | 11 | 3.9s |
| Scientific diagram | 43 | 4.2s |
| Technical diagram | 42 | 4.5s |
| Data chart | 26 | 4.8s |
Dependencies
Only Pillow and numpy. Optional: opencv-python for better alpha edges.