Learn-skills.dev agent-media
Agent-first media toolkit for image, video, and audio processing. Use when you need to resize, convert, generate images, remove backgrounds, extract audio, transcribe speech, or generate videos. All commands return deterministic JSON output.
install
source · Clone the upstream repo
git clone https://github.com/NeverSight/learn-skills.dev
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/NeverSight/learn-skills.dev "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/skills-md/agntswrm/agent-media/agent-media" ~/.claude/skills/neversight-learn-skills-dev-agent-media && rm -rf "$T"
manifest:
data/skills-md/agntswrm/agent-media/agent-media/SKILL.md
source content
Agent Media
Agent Media is an agent-first media toolkit that provides CLI-accessible commands for image, video, and audio processing. All commands produce deterministic, machine-readable JSON output.
Available Commands
Image Commands
- Resize an image: agent-media image resize
- Convert image format: agent-media image convert
- Remove image background: agent-media image remove-background
- Generate image from text: agent-media image generate
Audio Commands
- Extract audio from video: agent-media audio extract
- Transcribe audio to text: agent-media audio transcribe
Video Commands
- Generate video from text or image: agent-media video generate
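For an agent that calls the CLI from a script, the subcommands listed above can be kept in a small lookup table so that unknown commands are rejected before anything is spawned. The following is a minimal sketch, not part of agent-media itself: COMMANDS and build_argv are hypothetical names, and since this section does not document per-command flags, any extra arguments are passed through untouched.

```python
# Subcommands named in this section, keyed by (media type, action).
COMMANDS = {
    ("image", "resize"): "Resize an image",
    ("image", "convert"): "Convert image format",
    ("image", "remove-background"): "Remove image background",
    ("image", "generate"): "Generate image from text",
    ("audio", "extract"): "Extract audio from video",
    ("audio", "transcribe"): "Transcribe audio to text",
    ("video", "generate"): "Generate video from text or image",
}


def build_argv(media_type, action, extra_args=()):
    """Return the argv list for a known agent-media subcommand.

    Hypothetical helper: extra_args is forwarded as-is because the
    per-command flags are not described in this section.
    """
    if (media_type, action) not in COMMANDS:
        raise ValueError(f"unknown command: {media_type} {action}")
    return ["agent-media", media_type, action, *extra_args]
```

The resulting list can be handed to subprocess.run(..., capture_output=True, text=True) once the CLI is installed.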
Output Format
All commands return JSON to stdout:
{ "ok": true, "media_type": "image", "action": "resize", "provider": "local", "output_path": "output_123.webp", "mime": "image/webp", "bytes": 12345 }
On error:
{ "ok": false, "error": { "code": "INVALID_INPUT", "message": "input file not found" } }
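A caller would typically branch on the "ok" field before reading anything else. The sketch below assumes only the two JSON shapes shown above. AgentMediaError and parse_result are hypothetical names introduced for illustration, and the "UNKNOWN" fallback code is an assumption for the case where the error object is missing.

```python
import json


class AgentMediaError(Exception):
    """Raised when a command reports ok == false (hypothetical wrapper type)."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_result(stdout):
    """Parse the JSON a command printed to stdout.

    Returns the result dict on success and raises AgentMediaError on failure.
    """
    result = json.loads(stdout)
    if not result.get("ok"):
        err = result.get("error") or {}
        # "UNKNOWN" is an assumed fallback, not a documented error code.
        raise AgentMediaError(err.get("code", "UNKNOWN"), err.get("message", ""))
    return result
```

Raising on the error shape keeps the success path free of repeated "ok" checks, and the code field stays available for programmatic handling.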
Providers
- local - Default provider using Sharp (resize, convert) and Transformers.js (remove-background, transcribe)
- fal - fal.ai provider (generate, edit, remove-background, transcribe, video)
- replicate - Replicate API (generate, edit, remove-background, transcribe, video)
- runpod - Runpod API (generate, edit)
- ai-gateway - Vercel AI Gateway (generate, edit)
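The provider list above amounts to a capability table, which can be useful to encode when deciding which providers are worth trying for a given action. This is a sketch that copies the action names exactly as they appear in the parentheses above. PROVIDER_ACTIONS and providers_for are hypothetical names, and the table may not match the CLI's internal naming.

```python
# Actions per provider, transcribed from the list above.
PROVIDER_ACTIONS = {
    "local": {"resize", "convert", "remove-background", "transcribe"},
    "fal": {"generate", "edit", "remove-background", "transcribe", "video"},
    "replicate": {"generate", "edit", "remove-background", "transcribe", "video"},
    "runpod": {"generate", "edit"},
    "ai-gateway": {"generate", "edit"},
}


def providers_for(action):
    """Return the providers that list the given action, in document order."""
    return [name for name, actions in PROVIDER_ACTIONS.items() if action in actions]
```

For example, providers_for("resize") returns only the local provider, while providers_for("generate") returns the four remote providers.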
Provider Selection
- Explicit: --provider <name>
- Auto-detect from environment variables
- Fallback to local provider
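Read as a priority order, these three rules can be sketched as follows. This is an illustration rather than the toolkit's actual logic: select_provider and ENV_PROVIDERS are hypothetical names, the variable names come from the Environment Variables section, and the order in which the variables are checked is an assumption, since the source does not say which provider wins when several keys are set.

```python
import os

# Environment variable -> provider it enables.
# The check order is an assumption; the source does not specify it.
ENV_PROVIDERS = [
    ("FAL_API_KEY", "fal"),
    ("REPLICATE_API_TOKEN", "replicate"),
    ("RUNPOD_API_KEY", "runpod"),
    ("AI_GATEWAY_API_KEY", "ai-gateway"),
]


def select_provider(explicit=None, env=None):
    """Pick a provider: explicit choice, then environment, then local."""
    env = os.environ if env is None else env
    if explicit:
        # Corresponds to passing --provider <name>.
        return explicit
    for var, name in ENV_PROVIDERS:
        if env.get(var):
            return name
    return "local"
```

Passing env explicitly makes the function easy to exercise without touching the real process environment.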
Environment Variables
- Custom output directory: AGENT_MEDIA_DIR
- Enable fal provider: FAL_API_KEY
- Enable replicate provider: REPLICATE_API_TOKEN
- Enable runpod provider: RUNPOD_API_KEY
- Enable ai-gateway provider: AI_GATEWAY_API_KEY