# Awesome-omni-skill runpod
Cloud GPU processing via RunPod serverless. Use when setting up RunPod endpoints, deploying Docker images, managing GPU resources, troubleshooting endpoint issues, or understanding costs. Covers all 5 toolkit images (qwen-edit, realesrgan, propainter, sadtalker, qwen3-tts).
```shell
# Clone the full repo
git clone https://github.com/diegosouzapw/awesome-omni-skill

# Or install just this skill into ~/.claude/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/devops/runpod" ~/.claude/skills/diegosouzapw-awesome-omni-skill-runpod && rm -rf "$T"
```

`skills/devops/runpod/SKILL.md`

# RunPod Cloud GPU
Run open-source AI models on cloud GPUs via RunPod serverless. Pay-per-second, no minimums.
## Setup

```shell
# 1. Create account at https://runpod.io

# 2. Add API key to .env
echo "RUNPOD_API_KEY=your_key_here" >> .env

# 3. Deploy any tool with --setup
python tools/image_edit.py --setup
python tools/upscale.py --setup
python tools/dewatermark.py --setup
python tools/sadtalker.py --setup
python tools/qwen3_tts.py --setup
```
Each `--setup` command:

- Creates a RunPod template from the Docker image
- Creates a serverless endpoint with an appropriate GPU
- Saves the endpoint ID to `.env` (e.g. `RUNPOD_QWEN_EDIT_ENDPOINT_ID`)
## Available Images
All images are public on GHCR — no authentication needed.
| Tool | Docker Image | GPU | VRAM | Typical Cost |
|---|---|---|---|---|
| image_edit | `video-toolkit-qwen-edit` | A6000/L40S | 48GB+ | ~$0.05-0.15/job |
| upscale | `video-toolkit-realesrgan` | RTX 3090/4090 | 24GB | ~$0.01-0.05/job |
| dewatermark | `video-toolkit-propainter` | RTX 3090/4090 | 24GB | ~$0.05-0.30/job |
| sadtalker | `video-toolkit-sadtalker` | RTX 4090 | 24GB | ~$0.05-0.15/job |
| qwen3_tts | `video-toolkit-qwen3-tts` | ADA 24GB | 24GB | ~$0.01-0.05/job |
Total monthly cost: Rarely exceeds $10 even with heavy use.
## How It Works
All tools follow the same pattern:
Local CLI → Upload input to cloud storage → RunPod API → Poll for result → Download output
- **File transfer**: Tools use Cloudflare R2 when configured (`R2_ACCOUNT_ID`, `R2_ACCESS_KEY_ID`, `R2_SECRET_ACCESS_KEY`, `R2_BUCKET_NAME`), falling back to free upload services
- **RunPod API**: Tools call the `/run` endpoint, then poll `/status/{job_id}` until complete
- **Cold vs warm start**: First request after idle spins up a worker (~30-90s). Subsequent requests are fast (~5-15s)
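The run-and-poll pattern above can be sketched as follows. The REST paths match RunPod's serverless API, but `submit_job`, `poll_until_done`, and the injected `fetch_status` callable are illustrative, not the toolkit's actual code:

```python
import json
import time
import urllib.request

API_BASE = "https://api.runpod.ai/v2"

def submit_job(api_key: str, endpoint_id: str, payload: dict) -> str:
    """POST the job input to /run and return the queued job's ID."""
    req = urllib.request.Request(
        f"{API_BASE}/{endpoint_id}/run",
        data=json.dumps({"input": payload}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["id"]

def poll_until_done(fetch_status, interval: float = 2.0, timeout: float = 600.0) -> dict:
    """Poll a status callable until the job reaches a terminal state.

    `fetch_status` returns a dict like {"status": "IN_QUEUE" | "IN_PROGRESS"
    | "COMPLETED" | "FAILED", ...}; injecting it keeps the loop testable.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status["status"] in ("COMPLETED", "FAILED"):
            return status
        time.sleep(interval)
    raise TimeoutError("job did not finish in time")
```

A real caller would pass a `fetch_status` that GETs `/status/{job_id}`; during a cold start the first few polls report `IN_QUEUE` while a worker boots.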
## Endpoint Management

### Workers

- `workersMin: 0` — Scale to zero when idle (no cost)
- `workersMax: 1` — Max concurrent jobs (increase for throughput)
- `idleTimeout: 5` — Seconds before worker scales down
Across all endpoints, you share a total worker pool based on your RunPod plan. If you hit limits, reduce `workersMax` on endpoints you're not actively using.
### Checking Endpoint Status

Each tool stores its endpoint ID in `.env`:
| Tool | Env Var |
|---|---|
| image_edit | `RUNPOD_QWEN_EDIT_ENDPOINT_ID` |
| upscale | `RUNPOD_REALESRGAN_ENDPOINT_ID` |
| dewatermark | `RUNPOD_PROPAINTER_ENDPOINT_ID` |
| sadtalker | `RUNPOD_SADTALKER_ENDPOINT_ID` |
| qwen3_tts | `RUNPOD_QWEN3_TTS_ENDPOINT_ID` |
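A minimal sketch of reading an endpoint ID back out of `.env` (this dotenv parser is illustrative; the toolkit's own loader may differ):

```python
from pathlib import Path
from typing import Optional

def read_env_var(name: str, env_path: str = ".env") -> Optional[str]:
    """Return the value of the KEY=value line named `name` in a dotenv file."""
    for line in Path(env_path).read_text().splitlines():
        line = line.strip()
        if line.startswith(f"{name}="):
            # Split on the first '=' only; strip optional surrounding quotes
            return line.split("=", 1)[1].strip().strip('"')
    return None
```

For example, `read_env_var("RUNPOD_QWEN_EDIT_ENDPOINT_ID")` returns the ID saved by `python tools/image_edit.py --setup`.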
### Disabling an Endpoint

To free worker slots without deleting the endpoint, set `workersMax=0` via the RunPod dashboard or GraphQL API.
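The GraphQL route can be sketched as below. Note that the `saveEndpoint` mutation name and its input fields are assumptions about RunPod's GraphQL schema; verify them against the current API docs before use:

```python
def build_set_workers_max(endpoint_id: str, workers_max: int) -> dict:
    """Build a GraphQL request payload to change an endpoint's workersMax.

    ASSUMPTION: `saveEndpoint` and its `EndpointInput` fields are taken from
    RunPod's GraphQL schema as the author understands it; check the docs.
    """
    mutation = """
    mutation SaveEndpoint($input: EndpointInput!) {
      saveEndpoint(input: $input) { id workersMax }
    }
    """
    return {
        "query": mutation,
        "variables": {"input": {"id": endpoint_id, "workersMax": workers_max}},
    }
```

POST the payload to `https://api.runpod.io/graphql` with your API key; `workers_max=0` frees the worker slot while keeping the endpoint and its template intact.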
## Troubleshooting

### Force Image Pull
When you push a new Docker image version, RunPod may still use the cached old one. To force a pull:
- Update the template's `imageName` to use `@sha256:DIGEST` notation
- Wait for the worker to restart
- Revert to the `:latest` tag after confirming
### Cold Start Too Slow
- **qwen3-tts**: ~70s cold start, ~7s warm
- **sadtalker**: ~60s cold start, ~10s warm
- **image_edit**: ~90s cold start, ~15s warm
If cold starts are a problem, set `workersMin: 1` (costs money when idle).
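To gauge what that costs, a back-of-the-envelope calculation; the ~$0.40/hr RTX 4090 rate here is an assumption for illustration, so check RunPod's current pricing:

```python
# ASSUMED rate for one always-on serverless worker (USD/hr); not from RunPod docs
HOURLY_RATE = 0.40

def idle_cost(hours: float, hourly_rate: float = HOURLY_RATE) -> float:
    """Cost of keeping workersMin: 1 warm for `hours` hours."""
    return round(hours * hourly_rate, 2)

# A full month of workersMin: 1 dwarfs the ~$10/month pay-per-job figure above
monthly = idle_cost(24 * 30)  # 288.0
```

In other words, keep `workersMin: 0` unless latency genuinely matters more than cost.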
### Job Fails with OOM
The model needs more VRAM than the GPU provides. Options:
- Use a larger GPU tier
- For dewatermark: reduce `--resize-ratio` (default 0.5 for safety)
- For image_edit: reduce `--steps`
"No workers available"
You've hit your plan's concurrent worker limit. Either:
- Wait for a running job to finish
- Set `workersMax=0` on endpoints you're not using
- Upgrade your RunPod plan
## Docker Images
All Dockerfiles live in `docker/runpod-*/`. Images use `runpod/pytorch` as the base to share layers across tools.
Building for RunPod (from an Apple Silicon Mac):

```shell
# Build for linux/amd64 and push in one step: buildx images for a non-native
# platform are not loaded into the local daemon by default, so a separate
# `docker push` of the tag would fail
docker buildx build --platform linux/amd64 \
  -t ghcr.io/conalmullan/video-toolkit-<name>:latest \
  --push docker/runpod-<name>/
```
GHCR packages default to private — you must manually make them public for RunPod to pull them. Go to GitHub > Packages > Package Settings > Change Visibility.
## Cost Optimization

- Keep `workersMin: 0` on all endpoints (scale to zero)
- Only deploy endpoints you actively need
- Use `workersMax=0` to disable idle endpoints without deleting them
- Qwen3-TTS is significantly cheaper than ElevenLabs for voiceovers
- Check the RunPod dashboard for usage and billing