Pexo-skills videoagent-image-studio

install
source · Clone the upstream repo
git clone https://github.com/pexoai/pexo-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/pexoai/pexo-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/videoagent-image-studio" ~/.claude/skills/pexoai-pexo-skills-videoagent-image-studio && rm -rf "$T"
manifest: skills/videoagent-image-studio/SKILL.md
source content

🎨 VideoAgent Image Studio

Use when: User asks to generate, draw, create, or make any kind of image, photo, illustration, icon, logo, or artwork.

Generate images with 8 state-of-the-art AI models. This skill picks the best model for each request and handles the mechanics, including Midjourney's async polling, so you can focus on the conversation.


Quick Reference

| User Intent | Model | Speed |
|---|---|---|
| Artistic, cinematic, painterly | midjourney | ~15s |
| Photorealistic, portrait, product | flux-pro | ~8s |
| General purpose, balanced | flux-dev | ~10s |
| Quick draft, fast iteration | flux-schnell | ~2s |
| Image with text, logo, poster | ideogram | ~10s |
| Vector art, icon, flat design | recraft | ~8s |
| Anime, stylized illustration | sdxl | ~5s |
| Gemini-powered, consistent style | nano-banana | ~12s |
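
The table above can be read as a lookup from words in the request to a model ID. The following is a minimal sketch of that idea, assuming plain keyword matching; the name pickModel and the keyword lists are illustrative and not part of the skill, and the flux-dev fallback follows the table's general-purpose row.

```javascript
// Hypothetical helper: maps words in a user request to a model ID from the
// Quick Reference table. The keyword lists are illustrative assumptions.
const MODEL_KEYWORDS = [
  ["ideogram", ["text", "logo", "poster"]],
  ["recraft", ["vector", "icon", "flat"]],
  ["sdxl", ["anime", "stylized"]],
  ["flux-schnell", ["draft", "quick", "fast"]],
  ["flux-pro", ["photorealistic", "portrait", "product"]],
  ["midjourney", ["artistic", "cinematic", "painterly"]],
  ["nano-banana", ["gemini", "consistent"]],
];

function pickModel(request) {
  // Split into lowercase words so "icon," and "Icon" both match "icon".
  const words = request.toLowerCase().split(/[^a-z]+/).filter(Boolean);
  for (const [model, keywords] of MODEL_KEYWORDS) {
    if (keywords.some((k) => words.includes(k))) return model;
  }
  return "flux-dev"; // general-purpose row of the table
}

console.log(pickModel("Make me an app icon, flat style")); // prints: recraft
```

The first matching row wins, so the order of MODEL_KEYWORDS encodes priority. An explicit model name from the user (for example "Use Flux") should still override any keyword match.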

How to Generate an Image

Step 1 — Enhance the prompt

Before calling the script, expand the user's prompt with style, lighting, and quality descriptors appropriate for the chosen model.

  • Midjourney: Add cinematic lighting, ultra detailed, --v 7, --style raw
  • Flux: Add masterpiece, highly detailed, sharp focus, professional photography
  • Ideogram: Be explicit about text content, font style, and layout
  • Recraft: Specify vector illustration, flat design, icon style
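
The per-model additions listed above could be applied mechanically. This is a rough sketch, assuming descriptors are appended with commas and Midjourney flags go at the end of the prompt; enhancePrompt and DESCRIPTORS are hypothetical names, not part of the skill.

```javascript
// Descriptors are taken from the list above; how they are joined is an assumption.
const DESCRIPTORS = {
  midjourney: { words: ["cinematic lighting", "ultra detailed"], flags: ["--v 7", "--style raw"] },
  flux: { words: ["masterpiece", "highly detailed", "sharp focus", "professional photography"], flags: [] },
  recraft: { words: ["vector illustration", "flat design", "icon style"], flags: [] },
};

// Hypothetical helper, not part of the skill.
function enhancePrompt(model, prompt) {
  // flux-pro, flux-dev and flux-schnell share the Flux descriptors.
  const family = model.startsWith("flux") ? "flux" : model;
  const entry = DESCRIPTORS[family];
  if (!entry) return prompt; // e.g. ideogram: describe text and layout by hand

  // Skip anything the user already wrote to avoid duplicates.
  const lower = prompt.toLowerCase();
  const words = entry.words.filter((w) => !lower.includes(w));
  const flags = entry.flags.filter((f) => !lower.includes(f));
  return [[prompt, ...words].join(", "), ...flags].join(" ");
}

console.log(enhancePrompt("midjourney", "a snow leopard, cinematic lighting"));
// prints: a snow leopard, cinematic lighting, ultra detailed --v 7 --style raw
```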

Step 2 — Run the script

node {baseDir}/tools/generate.js \
  --model <model_id> \
  --prompt "<enhanced prompt>" \
  --aspect-ratio <ratio>

All parameters:

| Parameter | Default | Description |
|---|---|---|
| --model | flux-dev | Model ID from the table above |
| --prompt | (required) | The image generation prompt |
| --aspect-ratio | 1:1 | 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 21:9 |
| --num-images | 1 | Number of images (1–4; Midjourney always returns 4) |
| --negative-prompt | (none) | Things to avoid (not supported by Midjourney) |
| --seed | (none) | Seed for reproducibility |
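
When the script is called from code rather than typed by hand, it can help to check options against the parameter list above before spawning the process. The sketch below assumes a hypothetical buildArgs helper; rejecting a negative prompt for Midjourney is a choice made here, since the source only says the option is unsupported.

```javascript
// Allowed values copied from the parameter list above.
const ASPECT_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "21:9"];

// Hypothetical helper: turns an options object into the argument list for
// tools/generate.js. Returning an array (not a string) avoids shell quoting
// problems when the prompt contains quotes or spaces.
function buildArgs({ model = "flux-dev", prompt, aspectRatio = "1:1", numImages = 1, negativePrompt, seed }) {
  if (!prompt) throw new Error("--prompt is required");
  if (!ASPECT_RATIOS.includes(aspectRatio)) {
    throw new Error(`unsupported aspect ratio: ${aspectRatio}`);
  }
  if (!Number.isInteger(numImages) || numImages < 1 || numImages > 4) {
    throw new Error("--num-images must be an integer from 1 to 4");
  }
  if (model === "midjourney" && negativePrompt) {
    throw new Error("--negative-prompt is not supported by Midjourney");
  }

  const args = [
    "--model", model,
    "--prompt", prompt,
    "--aspect-ratio", aspectRatio,
    "--num-images", String(numImages),
  ];
  if (negativePrompt) args.push("--negative-prompt", negativePrompt);
  if (seed !== undefined) args.push("--seed", String(seed));
  return args;
}

console.log(buildArgs({ model: "flux-pro", prompt: "a perfume bottle", aspectRatio: "3:4" }));
```

The returned array can be passed, after the script path, to a process API such as Node's child_process.execFile.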

Step 3 — Return the result

The script always waits and returns the final image URL(s). No polling required.

{
  "success": true,
  "model": "flux-pro",
  "imageUrl": "https://...",
  "images": ["https://..."]
}

Send the imageUrl to the user.
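
A caller that captures the script's standard output might read the result along these lines. This is only a sketch: parseResult is a hypothetical name, the example URL is a placeholder, and the error field is an assumption, since the source documents only the success shape.

```javascript
// Hypothetical helper: parses the JSON printed by the script and returns the
// fields worth showing to the user.
function parseResult(stdout) {
  const result = JSON.parse(stdout);
  if (result.success !== true) {
    // "error" is an assumed field name; fall back to a generic message.
    throw new Error(result.error || "image generation failed");
  }
  const images = Array.isArray(result.images) ? result.images : [];
  const imageUrl = result.imageUrl || images[0];
  if (!imageUrl) throw new Error("no image URL in script output");
  return { model: result.model, imageUrl, images };
}

// Placeholder output in the documented shape.
const sample = JSON.stringify({
  success: true,
  model: "flux-pro",
  imageUrl: "https://example.com/a.png",
  images: ["https://example.com/a.png"],
});
console.log(parseResult(sample).imageUrl); // prints: https://example.com/a.png
```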


Midjourney Actions

After generating a 4-image grid with Midjourney, offer the user these options:

# Upscale image #2 (subtle, preserves details)
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action upscale \
  --index 2 \
  --job-id <job_id>

# Create a strong variation of image #3
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action variation \
  --index 3 \
  --job-id <job_id> \
  --variation-type 1

# Regenerate with same prompt
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action reroll \
  --job-id <job_id>

Upscale types: 0 = Subtle (default, best for photos), 1 = Creative (best for illustrations)

Variation types: 0 = Subtle (default), 1 = Strong (dramatic changes)
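
The three follow-up commands share most of their arguments, so they can be generated from one function. The sketch below uses a hypothetical buildMidjourneyActionArgs helper and only the flags shown in the commands above; the source does not name a flag for the upscale type, so it is left out rather than guessed.

```javascript
const ACTIONS = ["upscale", "variation", "reroll"];

// Hypothetical helper: builds the argument list for a Midjourney follow-up
// action on an existing job.
function buildMidjourneyActionArgs({ action, jobId, index, variationType }) {
  if (!ACTIONS.includes(action)) throw new Error(`unknown action: ${action}`);
  if (!jobId) throw new Error("--job-id is required");

  const args = ["--model", "midjourney", "--action", action];
  if (action !== "reroll") {
    // Upscale and variation target one image of the 4-image grid.
    if (!Number.isInteger(index) || index < 1 || index > 4) {
      throw new Error("--index must be an integer from 1 to 4");
    }
    args.push("--index", String(index));
  }
  args.push("--job-id", jobId);
  if (action === "variation" && variationType !== undefined) {
    if (variationType !== 0 && variationType !== 1) {
      throw new Error("--variation-type must be 0 (subtle) or 1 (strong)");
    }
    args.push("--variation-type", String(variationType));
  }
  return args;
}

// "abc123" is a placeholder job ID.
console.log(buildMidjourneyActionArgs({ action: "variation", jobId: "abc123", index: 3, variationType: 1 }).join(" "));
// prints: --model midjourney --action variation --index 3 --job-id abc123 --variation-type 1
```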


Example Conversations

User: "Draw a snow leopard on a snowy mountain with cinematic lighting"

# Choose midjourney for artistic quality
node {baseDir}/tools/generate.js \
  --model midjourney \
  --prompt "a majestic snow leopard on a snowy mountain peak, cinematic lighting, dramatic atmosphere, ultra detailed --ar 16:9 --v 7" \
  --aspect-ratio 16:9

🎨 Done! Which one to upscale? (U1-U4) Or create a variant? (V1-V4)
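
If the user answers with one of the U1-U4 or V1-V4 shortcuts, the reply has to be turned into an action and an index. A minimal sketch, assuming the reply contains only the shortcut; parseGridChoice is a hypothetical name, not part of the skill.

```javascript
// Hypothetical helper: interprets replies such as "U2" or "v3".
// Returns null when the reply is not a recognized shortcut.
function parseGridChoice(reply) {
  const match = /^\s*([UV])([1-4])\s*$/i.exec(reply);
  if (!match) return null;
  const action = match[1].toUpperCase() === "U" ? "upscale" : "variation";
  return { action, index: Number(match[2]) };
}

console.log(parseGridChoice("U2")); // prints: { action: 'upscale', index: 2 }
```

The result maps directly onto the --action and --index flags of the Midjourney commands.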


User: "Use Flux to generate a perfume product poster, white background"

# Choose flux-pro for photorealistic product shots
node {baseDir}/tools/generate.js \
  --model flux-pro \
  --prompt "a luxury perfume bottle on a clean white background, professional product photography, soft shadows, 8k, highly detailed" \
  --aspect-ratio 3:4

User: "Show me a quick draft"

# flux-schnell for instant previews
node {baseDir}/tools/generate.js \
  --model flux-schnell \
  --prompt "..." \
  --aspect-ratio 1:1

User: "Make me an App icon, flat style, blue theme"

# recraft for vector/icon style
node {baseDir}/tools/generate.js \
  --model recraft \
  --prompt "a minimal flat design app icon, blue color scheme, simple geometric shapes, vector style, white background"

Setup

Zero API keys needed! All requests go through a hosted proxy that handles authentication server-side.

The skill works out of the box — just install and use.

Advanced: Custom proxy or token

If you want to use your own proxy or a persistent token, set these environment variables:

{
  "skills": {
    "entries": {
      "videoagent-image-studio": {
        "enabled": true,
        "env": {
          "IMAGE_STUDIO_PROXY_URL": "https://your-proxy.vercel.app",
          "IMAGE_STUDIO_TOKEN": "your_token_here"
        }
      }
    }
  }
}

| Variable | Required | Description |
|---|---|---|
| IMAGE_STUDIO_PROXY_URL | No | Custom proxy base URL (default: https://image-gen-proxy.vercel.app) |
| IMAGE_STUDIO_TOKEN | No | Persistent token (auto-obtained if not set, 100 free uses per token) |
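
As an illustration of how the two variables and their defaults fit together, a client might resolve them as follows. This is a sketch under stated assumptions: resolveProxyConfig is a hypothetical name, trimming trailing slashes is a convenience added here, and the script's actual lookup logic is not shown in the source.

```javascript
// Default taken from the variable descriptions above.
const DEFAULT_PROXY_URL = "https://image-gen-proxy.vercel.app";

// Hypothetical helper: resolves proxy settings from an environment object
// such as process.env.
function resolveProxyConfig(env) {
  const raw = env.IMAGE_STUDIO_PROXY_URL || DEFAULT_PROXY_URL;
  const proxyUrl = raw.replace(/\/+$/, ""); // drop trailing slashes
  // null means no persistent token was set, so one is obtained automatically.
  const token = env.IMAGE_STUDIO_TOKEN || null;
  return { proxyUrl, token };
}

console.log(resolveProxyConfig({}));
// prints: { proxyUrl: 'https://image-gen-proxy.vercel.app', token: null }
```

In a real run the argument would be process.env rather than an empty object.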

To deploy your own proxy, see the videoagent-audio-studio proxy as a reference implementation. You'll need FAL_KEY and LEGNEXT_KEY as Vercel environment variables.


Changelog

v2.0.0

  • Simplified async: The script now blocks until Midjourney completes. No more --async / --poll flags needed in SKILL.md instructions.
  • Unified output format: All models return the same { success, imageUrl, images } shape.
  • Reference images for Nano Banana: Pass --reference-images "url1,url2" for character/style consistency across generations.

v1.3.0

  • Added non-blocking async mode for Midjourney (--async + --poll).

v1.2.0

  • Midjourney turbo mode enabled by default (~10-20s).

v1.1.0

  • Switched Midjourney provider from TTAPI to Legnext.ai for better stability.

v1.0.0

  • Initial release with Midjourney, Flux, SDXL, Nano Banana, Ideogram, Recraft.