Openclaw-master-skills ltx-video
install
source · Clone the upstream repo
git clone https://github.com/LeoYeAI/openclaw-master-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/LeoYeAI/openclaw-master-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/ltx-video" ~/.claude/skills/leoyeai-openclaw-master-skills-ltx-video && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/LeoYeAI/openclaw-master-skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/ltx-video" ~/.openclaw/skills/leoyeai-openclaw-master-skills-ltx-video && rm -rf "$T"
manifest:
skills/ltx-video/SKILL.md
safety · automated scan (medium risk)
This is a pattern-based risk scan, not a security review. Our crawler flagged:
- makes HTTP requests (curl)
- references .env files
- references API keys
Always read a skill's source content before installing. Patterns alone don't mean the skill is malicious — but they warrant attention.
source content
LTX-2.3 Video API
API Reference
Base URL: https://api.ltx.video/v1
Auth: Authorization: Bearer <API_KEY>
Response: MP4 binary (direct download, no polling)
Endpoints
| Endpoint | Input | Use |
|---|---|---|
| /v1/text-to-video | prompt | Generate video from text |
| | image_uri + prompt | Animate a still image |
| /v1/audio-to-video | audio_uri + image_uri + prompt | Lip-sync video from audio + image |
| /v1/extend | video_uri + prompt | Extend a video at start or end |
| | video_uri + time range | Regenerate a section of a video |
Models
| Model | Speed | Quality |
|---|---|---|
| | ~17s | Good (use for tests) |
| ltx-2-3-pro | ~30-60s | Best (use for final) |
Supported Resolutions
- 1920x1080 (landscape 16:9)
- 1080x1920 (portrait 9:16; native vertical, trained on vertical data)
- 1440x1080
- 4096x2160 (text-to-video only)

audio-to-video supports only 1920x1080 or 1080x1920.
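The resolution rules above can be checked before a request is sent. The following is a minimal sketch, not part of the LTX API: the `ALLOWED` table and `check_resolution` are hypothetical names, and the fallback set for endpoints the list does not single out is an assumption.

```python
# Resolution rules copied from the list above. The dictionary keys are
# illustrative endpoint labels, and treating every other endpoint as
# "common resolutions only" is an assumption, not documented behaviour.
COMMON = {"1920x1080", "1080x1920", "1440x1080"}
ALLOWED = {
    "text-to-video": COMMON | {"4096x2160"},        # 4K is text-to-video only
    "audio-to-video": {"1920x1080", "1080x1920"},   # lip-sync is more restricted
}


def check_resolution(endpoint: str, resolution: str) -> bool:
    """Return True if the resolution is listed as supported for the endpoint."""
    return resolution in ALLOWED.get(endpoint, COMMON)
```

Calling `check_resolution("audio-to-video", "1440x1080")` returns `False`, which catches the mismatch locally instead of waiting for an API error.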
Quick Examples
Text to Video
    curl -X POST "https://api.ltx.video/v1/text-to-video" \
      -H "Authorization: Bearer $LTX_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "prompt": "A man in a navy blue suit sits at a luxury restaurant table...",
        "model": "ltx-2-3-pro",
        "duration": 8,
        "resolution": "1920x1080"
      }' -o output.mp4
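For callers who prefer Python, the same request can be assembled with the standard library. This is a sketch that mirrors the curl call above field for field; `build_text_to_video_request` is a hypothetical helper name, and the request is only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.ltx.video/v1"


def build_text_to_video_request(prompt, api_key, model="ltx-2-3-pro",
                                duration=8, resolution="1920x1080"):
    """Build (but do not send) a POST request matching the curl example."""
    payload = {
        "prompt": prompt,
        "model": model,
        "duration": duration,
        "resolution": resolution,
    }
    return urllib.request.Request(
        f"{BASE_URL}/text-to-video",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To send it and save the MP4 (assumes the response body is the video):
#   with urllib.request.urlopen(req, timeout=300) as resp, open("output.mp4", "wb") as f:
#       f.write(resp.read())
```

Separating request construction from sending makes it possible to inspect the payload and headers before spending a generation.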
Audio to Video (Lip-sync)
    curl -X POST "https://api.ltx.video/v1/audio-to-video" \
      -H "Authorization: Bearer $LTX_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "audio_uri": "https://example.com/voice.mp3",
        "image_uri": "https://example.com/portrait.jpg",
        "prompt": "A man speaks directly to camera...",
        "model": "ltx-2-3-pro",
        "resolution": "1920x1080"
      }' -o output.mp4
Python Wrapper
    import requests

    def ltx_audio_to_video(audio_url, image_url, prompt, api_key,
                           model="ltx-2-3-pro", resolution="1920x1080",
                           output_path="output.mp4"):
        # POST the job; the API responds with the MP4 bytes directly.
        r = requests.post(
            "https://api.ltx.video/v1/audio-to-video",
            headers={"Authorization": f"Bearer {api_key}",
                     "Content-Type": "application/json"},
            json={"audio_uri": audio_url, "image_uri": image_url,
                  "prompt": prompt, "model": model, "resolution": resolution},
            timeout=300,
            stream=True,
        )
        if r.status_code != 200:
            raise RuntimeError(f"LTX error {r.status_code}: {r.text}")
        # Stream the response to disk in 8 KB chunks.
        with open(output_path, "wb") as f:
            for chunk in r.iter_content(8192):
                f.write(chunk)
        return output_path
⚠️ Critical Rules (learned from experience)
File Hosting
- URLs must be HTTPS — HTTP is rejected
- Files must return the correct MIME type (not application/octet-stream)
- uguu.se works: upload with curl -F "files[]=@file.mp3" https://uguu.se/upload
- Audio: upload as MP3 (not WAV) → uguu returns audio/mpeg ✅
- 4K images fail → resize to 1920x1080 before uploading
    # Upload MP3 to uguu.se
    AUDIO_URL=$(curl -s -F "files[]=@audio.mp3" "https://uguu.se/upload" | \
      python3 -c "import sys,json; print(json.load(sys.stdin)['files'][0]['url'])")

    # Upload image
    IMAGE_URL=$(curl -s -F "files[]=@portrait.jpg" "https://uguu.se/upload" | \
      python3 -c "import sys,json; print(json.load(sys.stdin)['files'][0]['url'])")
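The hosting rules above (HTTPS only, a real MIME type rather than application/octet-stream) can be checked before the URL is passed to the API. A minimal sketch, assuming the caller has already obtained the `Content-Type` header from the host; `media_url_problems` is a hypothetical helper, not part of any library.

```python
from urllib.parse import urlparse


def media_url_problems(url: str, content_type: str) -> list[str]:
    """Return a list of rule violations for a hosted media file (empty if OK)."""
    problems = []
    # Rule 1: URLs must be HTTPS; plain HTTP is rejected.
    if urlparse(url).scheme.lower() != "https":
        problems.append("URL must use HTTPS")
    # Rule 2: the host must serve a specific MIME type.
    # Drop any parameters such as "; charset=utf-8" before comparing.
    mime = content_type.split(";", 1)[0].strip().lower()
    if not mime or mime == "application/octet-stream":
        problems.append("host must return a specific MIME type, e.g. audio/mpeg")
    return problems
```

For example, `media_url_problems("https://example.com/voice.mp3", "audio/mpeg")` returns an empty list, while an `http://` URL served as `application/octet-stream` reports both problems.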
Image Size Limit
    # Resize large images before upload
    ffmpeg -y -i input_4k.png -vf "scale=1920:1080" output_1080.jpg
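A fixed `scale=1920:1080` stretches any image that is not already 16:9. The sketch below computes a target size that fits inside 1920x1080 while keeping the aspect ratio; `fit_within` is a hypothetical helper, and rounding down to even dimensions is a conservative assumption for encoder compatibility, not a documented LTX requirement.

```python
def fit_within(width, height, max_w=1920, max_h=1080):
    """Scale (width, height) down to fit max_w x max_h, preserving aspect ratio."""
    # Never upscale: images already inside the box are returned unchanged.
    if width <= max_w and height <= max_h:
        return width, height
    # Integer cross-multiplication avoids floating-point rounding surprises.
    if width * max_h > height * max_w:
        # Wider than the box: width is the limiting dimension.
        new_w, new_h = max_w, height * max_w // width
    else:
        # Taller than (or same shape as) the box: height is the limit.
        new_w, new_h = width * max_h // height, max_h
    # Round down to even numbers (assumption: safer for video pipelines).
    return new_w - new_w % 2, new_h - new_h % 2


w, h = fit_within(3840, 2160)
print(f"scale={w}:{h}")  # prints "scale=1920:1080"
```

The resulting string can be substituted into the `-vf` argument of the ffmpeg command.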
Face Consistency
- Avoid prompts where the character looks down — breaks face consistency
- Keep head level and gaze forward throughout
- Place objects already in frame instead of having character reach below frame
Last Frame
- LTX does not support first+last frame natively
- Workaround: generate clip A, generate clip B, then use /v1/extend to chain them
Prompting Guide (LTX-2.3)
LTX-2.3 has a much stronger text connector. Specificity wins.
1. Use Verbs, Not Nouns
❌ "A dramatic portrait of a man standing"
✅ "A man stands on a rooftop. His coat flaps in the wind. He adjusts his collar and steps forward as the camera tracks right."
2. Block the Scene Like a Director
- Specify left vs right, foreground vs background
- Describe who moves, what moves, how they move, what the camera does
- Spatial relationships are now respected
3. Describe Audio Explicitly (for text-to-video)
- Name the type of sound: dialogue, ambient, music
- Specify tone and intensity
- Example:
"His voice is clear and warm. Restaurant ambient sound softly in the background."
4. Avoid Static Photo-Like Prompts
- If the prompt reads like a still image → the output behaves like one
- Add wind, motion, breathing, gestures, camera movement
5. Describe Texture and Material
- Hair, fabric, surface finish, lighting fall-off
- "Individual hair strands visible in the backlight" → now renders correctly
6. Portrait (9:16) Native
- resolution: "1080x1920" → trained on vertical data
- Frame for vertical intentionally; don't treat it as cropped landscape
7. Complex Shots Work Now
- Layer multiple actions: "He picks up the banana, raises it to his ear, and smirks"
- Combine character performance + environment + camera motion
Lip-Sync Prompt Template
A [description of person] sits/stands [location]. He/she speaks directly to camera, lips moving in perfect sync with his/her voice. [Gesture details]. Head stays level and gaze remains locked on camera throughout. [Environment description softly blurred in background]. [Lighting]. [Camera: holds steady at eye level, front-on].
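The template lends itself to a small helper that fills the bracketed slots. This is only a sketch: `build_lipsync_prompt` and its parameter names are hypothetical, and the fixed sentences are copied from the template above.

```python
def build_lipsync_prompt(person, location, gesture, environment, lighting,
                         pose="sits", subject="He", possessive="his"):
    """Fill the lip-sync template; fixed sentences keep head and gaze stable."""
    return (
        f"A {person} {pose} {location}. "
        f"{subject} speaks directly to camera, lips moving in perfect sync "
        f"with {possessive} voice. "
        f"{gesture}. "
        "Head stays level and gaze remains locked on camera throughout. "
        f"{environment} softly blurred in background. "
        f"{lighting}. "
        "Camera holds steady at eye level, front-on."
    )


prompt = build_lipsync_prompt(
    person="man in a navy blue suit",
    location="at a luxury restaurant table",
    gesture="He gestures lightly with one hand",
    environment="Warm restaurant interior",
    lighting="Soft key light from the left",
)
```

Keeping the head-level and camera sentences hard-coded means the face-consistency rules are applied to every prompt regardless of what the caller passes in.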
ComfyUI Node
Custom nodes for ComfyUI (no manual API calls):
    cd ComfyUI/custom_nodes
    git clone https://github.com/PauldeLavallaz/comfyui-ltx-node
Nodes:
LTX Text to Video, LTX Image to Video, LTX Extend Video
Category: LTX Video
API Key
Paul's key: stored in ~/clawd/.env as LTX_API_KEY
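Reading the key from the environment or the .env file at runtime keeps it out of scripts and published documents. A minimal sketch, assuming simple `NAME=value` lines; `read_env_value` is a hypothetical helper, and it does not handle `export` prefixes or multi-line values.

```python
import os
from pathlib import Path


def read_env_value(env_path, name="LTX_API_KEY"):
    """Return the variable from the process environment, else from a .env file."""
    value = os.environ.get(name)
    if value:
        return value
    path = Path(env_path).expanduser()  # supports paths like ~/clawd/.env
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            return val.strip().strip("'\"")  # drop optional surrounding quotes
    return None
```

A caller would use `read_env_value("~/clawd/.env")` and pass the result as the Bearer token, failing early if it returns `None`.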