Claude-skill-registry create-movie

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/create-movie" ~/.claude/skills/majiayu000-claude-skill-registry-create-movie && rm -rf "$T"
manifest: skills/data/create-movie/SKILL.md
source content

create-movie

Orchestrated movie creation for the Horus persona. Creates mockumentaries, short films, music videos, and educational content through a phased workflow.

Philosophy

"AI isn't the artist, it's the amplifier" - Nobody & The Computer

Horus uses AI to turn imagination into audiovisual reality. He doesn't just use pre-built tools - he writes code to create his own tools.

Phases

HARDWARE CHECK → RESEARCH → SCRIPT → BUILD TOOLS → GENERATE → ASSEMBLE → LEARN

Phase 0: Hardware Detection (Automatic)

Before any generation, the orchestrator automatically detects hardware via /ops-workstation:

# Automatic hardware check on startup
./run.sh create "prompt"
# → Calls /ops-workstation gpu to detect VRAM
# → Calls /ops-workstation memory to detect RAM
# → Auto-selects optimal model variant

Auto-Selection Logic:

| Detected VRAM | Model Selected     | Settings                       |
|---------------|--------------------|--------------------------------|
| ≥24GB         | LTX-2 19B FP8      | 720p/1080p, audio on, batch=1  |
| 16-23GB       | LTX-2 19B FP4      | 720p only, audio on, batch=1   |
| 12-15GB       | LTX-2 Distilled 2B | 720p, audio optional, batch=1  |
| <12GB         | RunPod suggested   | Prompts to use /ops-runpod     |

RAM-Based Optimizations:

| Detected RAM | Optimization                               |
|--------------|--------------------------------------------|
| ≥128GB       | Weight streaming enabled (offload to RAM)  |
| 64-127GB     | Partial offloading                         |
| <64GB        | No offloading, strict VRAM limits          |

Override Auto-Detection:

# Force specific model variant
./run.sh create "prompt" --model ltx2-fp4
./run.sh create "prompt" --model ltx2-distilled
./run.sh create "prompt" --runpod  # Force cloud generation

Phase 1: Research (Library-First)

  1. Check Horus's Library First:
    • horus-filmmaking scope (past techniques, learnings)
    • horus_lore scope (YouTube transcripts, film analysis)
    • Ingested movies with emotion tags
    • Episodic archive (past filmmaking sessions)
  2. Search for New Resources:
    • /ingest-movie search for films to watch
    • /ingest-youtube search for tutorials
  3. Deep Web Research:
    • /dogpile for comprehensive multi-source search
    • /surf for specific tutorials/references

Phase 2: Script (via /create-story)

  • Integrates with the /create-story skill for screenplay generation
  • Uses Chutes models (chimera, qwen, deepseek-r1) for creative writing
  • Parses INT./EXT. headings, dialogue, action, and audio cues (see the sketch below)
  • Outputs a structured scene breakdown with visual descriptions
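
As a rough illustration of the scene-heading convention the parser looks for (screenplay.txt is a hypothetical filename; the phase itself reads research.json and writes script.json):

# Hypothetical illustration: scene headings follow the standard INT./EXT. pattern
grep -nE '^(INT\.|EXT\.)' screenplay.txt
# matches lines such as "INT. DETECTIVE'S OFFICE - NIGHT"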

Format Options:

  • screenplay (default) - Standard INT./EXT. scene headings
  • mockumentary - Interview segments with talking heads + B-roll
  • reconstruction - Historical recreation with narrator framing

Phase 3: Build Tools

  • Write code in Docker-isolated sandbox
  • Create custom tools for specific effects
  • Iterate on approaches

Phase 4: Generate

  • Use ComfyUI and Stable Diffusion for images
  • Use the auto-selected video model based on hardware (LTX-2 FP8/FP4/Distilled)
  • Use Whisper and IndexTTS2 for audio
  • If hardware is insufficient, automatically suggests /ops-runpod

Phase 5: Assemble

  • Combine assets with FFmpeg (see the sketch below)
  • Output MP4 video or interactive HTML
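
A minimal sketch of the FFmpeg step, assuming generated clips are listed in a clips.txt concat file and a narration.wav track exists (both filenames are assumptions; run.sh assemble wraps a call along these lines):

# Hypothetical assembly: concatenate clips and mux a narration track
# clips.txt contains lines like: file 'assets/scene_01.mp4'
ffmpeg -f concat -safe 0 -i clips.txt -i narration.wav \
       -map 0:v:0 -map 1:a:0 -c:v libx264 -r 24 -pix_fmt yuv420p -c:a aac \
       -shortest movie.mp4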

Phase 6: Learn

  • Store successful techniques in /memory
  • Remember what worked for future movies

Quick Start

cd .pi/skills/create-movie

# Full orchestrated workflow (recommended)
./run.sh create "A 30-second film about discovering colors"

# With options
./run.sh create "film noir detective" \
    --duration 60 \
    --style "high contrast, shadows, venetian blinds" \
    --format mp4 \
    --work-dir ./noir_project

# Individual phases (for manual control)
./run.sh research "film noir lighting techniques"
./run.sh script --from-research research.json --duration 30 --use-create-story
./run.sh build-tools --script script.json
./run.sh generate --tools ./tools --script script.json --style "cinematic"
./run.sh assemble --assets ./assets --output movie.mp4 --format mp4
./run.sh learn --project-dir ./movie_project

CLI Commands

create

Full orchestrated workflow through all phases.

./run.sh create PROMPT [OPTIONS]
  --output, -o       Output file (default: movie.mp4)
  --work-dir, -w     Working directory (default: ./movie_project)
  --duration, -d     Target duration in seconds (default: 30)
  --style, -s        Visual style (e.g., 'cinematic', 'film noir')
  --format, -f       Output format: mp4 or html (default: mp4)
  --store-learnings  Store learnings in memory (default: true)
  --skip-research    Skip research phase if research.json exists

research

Library-first research: checks Horus's memory and ingested content before external search.

./run.sh research TOPIC [OPTIONS]
  --output, -o       Output file (default: research.json)
  --skip-external    Only search library, skip external sources

script

Generate screenplay with scene breakdown. Integrates with /create-story.

./run.sh script [OPTIONS]
  --from-research, -r  Research JSON file (required)
  --prompt, -p         Override topic from research
  --duration, -d       Target duration in seconds
  --use-create-story   Use /create-story skill for screenplay
  --model, -m          LLM model (default: chimera)
  --output, -o         Output file (default: script.json)

build-tools

Generate custom tools in Docker sandbox.

./run.sh build-tools [OPTIONS]
  --script, -s       Script JSON file (required)
  --output-dir, -o   Output directory (default: ./tools)
  --skip-docker      Use host instead of Docker sandbox

generate

Create images, video, and audio assets.

./run.sh generate [OPTIONS]
  --tools, -t        Tools directory (default: ./tools)
  --script, -s       Script JSON file (required)
  --output-dir, -o   Assets output directory (default: ./assets)
  --style            Visual style to apply

assemble

Combine assets into final output.

./run.sh assemble [OPTIONS]
  --assets, -a       Assets directory (required)
  --output, -o       Output file/directory (required)
  --format, -f       Output format: mp4 or html (default: mp4)
  --fps              Frames per second for MP4 (default: 24)

learn

Store filmmaking insights in memory after a project.

./run.sh learn [OPTIONS]
  --project-dir, -p  Project directory (required)
  --scope            Memory scope (default: horus-filmmaking)
  --dry-run          Show learnings without storing

study

Pre-phase: learn filmmaking topics BEFORE creating movies. Runs a targeted /dogpile search over internal (memory) and external (web) sources, then stores the results via /memory learn.

./run.sh study TOPIC [OPTIONS]
  --scope            Memory scope (default: horus-filmmaking)
  --deep/--quick     Deep research (dogpile) vs quick (YouTube search)
  --list-topics      Show suggested filmmaking topics

# Examples:
./run.sh study "cinematography lighting techniques" --deep
./run.sh study "camera framing composition" --deep
./run.sh study --list-topics

study-all

Comprehensive learning session - studies all core filmmaking topics.

./run.sh study-all [OPTIONS]
  --scope            Memory scope (default: horus-filmmaking)

Output Formats

MP4 Video

Standard video file, playable anywhere.

Interactive HTML

Web-based experience with:

  • Frame-by-frame navigation
  • Audio controls
  • Scene metadata viewer

Available Skills

Horus has access to all skills in .pi/skills/:

| Skill              | Purpose in Movie Creation                  |
|--------------------|--------------------------------------------|
| /dogpile           | Deep research on techniques, references    |
| /surf              | Visit websites, tutorials, references      |
| /memory            | Recall prior techniques, store learnings   |
| /create-image      | Generate images for scenes                 |
| /tts-train         | Horus's voice for narration                |
| /ingest-movie      | Ingest reference movies for style analysis |
| /create-paper      | Write stories, scripts, creative content   |
| /episodic-archiver | Archive movie creation sessions            |
| /anvil             | Debug and harden custom tools              |
| /ingest-book       | Search books for story inspiration         |

Free/Open-Source Tools

| Purpose          | Tool                                                 |
|------------------|------------------------------------------------------|
| Image Generation | Stable Diffusion (ComfyUI)                           |
| Video Generation | LTX-2 (recommended), Mochi 1, CogVideoX (fallbacks)  |
| Video Processing | FFmpeg                                               |
| Speech-to-Text   | faster-whisper                                       |
| Text-to-Speech   | IndexTTS2                                            |

Video Model Selection Guide

Choose a video model based on your GPU VRAM and use case. VRAM figures include 3-5GB of headroom for pipeline overhead (ComfyUI/loader/audio), assume batch=1, and use FP8/FP4 where noted.

| VRAM                  | Recommended Models                            | Best For                   |
|-----------------------|-----------------------------------------------|----------------------------|
| 12GB (RTX 3060/4070)  | LTX-2 Distilled (2B), CogVideoX-2B            | Quick iterations, pre-viz  |
| 16GB (RTX 4080/A4000) | LTX-2 19B FP4 (720p, ≤10s), WAN 2.2, SVD      | Medium quality production  |
| 24GB (RTX 4090/A5000) | LTX-2 19B FP8 (recommended), WAN 2.2, Mochi   | High quality production    |
| 40GB+ (A100/H100)     | LTX-2 BF16 (43GB), Full Mochi, Open-Sora 2.0  | Maximum quality            |

Safe Defaults (RTX A5000 24GB)

Model: LTX-2 19B FP8
Resolution: 720p
Clip length: 10s
Batch size: 1
Seed: fixed
Audio: on

If runtime VRAM >22GB or instability occurs: lower resolution to 540p, disable audio, or shorten clips. Avoid parallel jobs on 24GB.

Model Characteristics

| Model           | Speed   | Quality   | Audio | Best Use Case                                  |
|-----------------|---------|-----------|-------|------------------------------------------------|
| LTX-2 19B FP8   | Fast    | High      | Yes   | Recommended - camera controls, audio sync      |
| LTX-2 Distilled | Fastest | Medium    | Yes   | Rapid iteration, light VRAM                    |
| WAN 2.2 14B     | Slow    | Very High | No    | Silent films, German Expressionism, art films  |
| Mochi 1         | Slow    | High      | No    | Final renders, prompt adherence                |
| HunyuanVideo    | Medium  | High      | No    | Production quality                             |
| CogVideoX-5B    | Medium  | High      | No    | General purpose (fallback)                     |

Recommendation:

  • Use LTX-2 19B FP8 for production work with audio sync and camera controls
  • Use WAN 2.2 for silent films or when audio isn't needed (higher visual quality for same VRAM)
  • Fallback to Mochi for maximum quality or CogVideoX for compatibility

LTX-2: Recommended Video Model

LTX-2 is a 19B parameter DiT-based audio-video foundation model.

Model Variants:

| Model              | Size                    | VRAM  | Quality | Recommended For                                  |
|--------------------|-------------------------|-------|---------|--------------------------------------------------|
| LTX-2 19B FP8      | ~19GB (+3-5GB overhead) | 24GB  | High    | Production (A5000, 720p/1080p ≤12-15s, batch=1)  |
| LTX-2 19B FP4      | ~12GB (+3-5GB overhead) | 16GB  | High    | Faster, slightly less quality (720p ≤10s)        |
| LTX-2 BF16 (full)  | ~43GB                   | 40GB+ | Highest | RunPod/A100 only                                 |
| LTX-2 Distilled 2B | ~4GB                    | 12GB  | Medium  | Rapid iteration                                  |

FP8 Compatibility: Requires compatible CUDA/cuDNN/PyTorch builds. Follow LTX-Video docs for driver requirements.

Key Features:

  • Synchronized Audio-Video Generation: Generates coherent audio + video together
  • Camera Controls: Dolly, jib, static shots with natural camera motion
  • IC-LoRA: Style transformations (anime, sketch, etc.) with ~1GB VRAM
  • Keyframe Interpolation: Morphing between keyframes
  • Pose/Depth/Canny Controls: Precise composition control (Canny edge detection)
  • Text-to-Video and Image-to-Video: Both workflows supported

ComfyUI Templates:

| Template            | Use Case                          |
|---------------------|-----------------------------------|
| LTX2 Text-to-Video  | Generate from text prompts        |
| LTX2 Image-to-Video | Animate a still image             |
| LTX2 Canny-to-Video | Edge detection guided generation  |
| LTX2 Distilled      | Fast iteration, lower VRAM        |

Installation:

# ComfyUI (recommended)
# Install "LTX-Video" from ComfyUI Manager
# Templates appear automatically

# Or standalone
pip install ltx-video

ComfyUI VRAM Optimization Flags:

# Reserve VRAM for other operations (prevents OOM during generation)
python -m main --reserve-vram 5

# Low VRAM mode - offloads to system RAM (slower but prevents OOM)
python -m main --lowvram

# Weight streaming - NVIDIA/ComfyUI collaboration for 256GB RAM systems
# Automatically offloads model weights to system RAM when VRAM exhausted

Camera Control Reference (LTX-2)

LTX-2 supports cinematic camera movements via prompt keywords:

| Movement  | Prompt Keywords               | Effect                                 |
|-----------|-------------------------------|----------------------------------------|
| Static    | static shot, locked camera    | Fixed camera position                  |
| Dolly     | dolly in, dolly out, push in  | Camera moves toward/away from subject  |
| Jib/Crane | jib up, jib down, crane shot  | Vertical camera sweep                  |
| Pan       | pan left, pan right           | Horizontal rotation                    |
| Tilt      | tilt up, tilt down            | Vertical rotation                      |
| Tracking  | tracking shot, follow shot    | Camera follows subject                 |
| Zoom      | zoom in, zoom out             | Focal length change                    |

Example Prompts:

# Dramatic reveal
"Dolly in slowly to a detective examining evidence, noir lighting, static hold on face"

# Action sequence
"Tracking shot following runner through city streets, handheld, dynamic"

# Interview setup
"Static medium shot, subject centered, shallow depth of field, jib down to hands"

Combining Movements:

"Jib up while dolly out, revealing vast landscape, golden hour, cinematic"

WAN 2.2: Silent Film Alternative

WAN 2.2 is a 14B parameter model optimized for visual quality without audio:

Best For:

  • Silent films and art cinema
  • German Expressionism era aesthetics (Nosferatu, Metropolis, Cabinet of Dr. Caligari)
  • High visual fidelity when audio isn't needed
  • Projects where audio will be added separately

Comparison to LTX-2:

| Aspect                   | LTX-2 19B FP8 | WAN 2.2 14B |
|--------------------------|---------------|-------------|
| Audio                    | Synchronized  | None        |
| Speed (10-sec HD, A5000) | ~3.5-4.5 min  | ~5-6 min    |
| Visual Quality           | High          | Very High   |
| VRAM (24GB)              | Works         | Works       |

When to Choose WAN 2.2:

  • Creating silent films with intertitles
  • German Expressionism homages
  • Music videos where audio is pre-recorded
  • Art films with separate sound design

Practical Notes: Seed control is recommended for stable multi-shot outputs; 720p is preferred on 24GB for consistent speeds.

Performance Expectations

Video generation is compute-intensive. Plan for overnight batch processing rather than real-time iteration.

Local Generation Times (RTX A5000, 24GB VRAM)

| Video Length | Resolution      | Model          | Time          |
|--------------|-----------------|----------------|---------------|
| 5 seconds    | HD (720p)       | LTX-2 19B FP8  | ~1-1.5 min    |
| 10 seconds   | HD (720p)       | LTX-2 19B FP8  | ~3.5-4.5 min  |
| 10 seconds   | Full HD (1080p) | LTX-2 19B FP8  | ~5-6.5 min    |
| 15 seconds   | HD (720p)       | LTX-2 19B FP8  | ~6-7.5 min    |
| 10 seconds   | HD (720p)       | WAN 2.2        | ~5-6 min      |

Notes:

  • Timings based on Alex Ziskind's benchmarks (RTX 5080) with +15-25% buffer for A5000
  • Audio synchronization adds ~10-15% time vs video-only runs
  • IO/storage affects throughput; prefer local NVMe, avoid network mounts

Realistic Workflow

For a 2-minute film (12 x 10-second clips):

  • Generation time: ~42-54 min for LTX-2 at 720p (12 × ~3.5-4.5 min per clip) to ~60-72 min for WAN 2.2 (12 × ~5-6 min)
  • With retakes and iterations: 2-4 hours
  • Full production with assembly: overnight task

Recommendation: Queue video generation as overnight background tasks. Use /task-monitor to track progress.

# Example: run generation overnight (nohup keeps the job alive after logout)
nohup ./run.sh generate --script script.json --output-dir ./assets > generate.log 2>&1 &
# Check progress next morning
tail generate.log

RunPod for Large Tasks

Use /ops-runpod when local generation would cause OOM errors.

When to Use RunPod

| Scenario                         | Local (A5000 24GB) | RunPod Needed                          |
|----------------------------------|--------------------|----------------------------------------|
| LTX-2 19B FP8, 10-sec HD         | Works              | No                                     |
| LTX-2 19B FP8, 15-sec 1080p      | Works (batch=1)    | No                                     |
| 1080p clips >12-15 sec (FP8)     | May OOM            | Prefer 720p or split; RunPod optional  |
| LTX-2 BF16 (43GB full model)     | OOM                | Yes (A100 40GB+)                       |
| Very long videos (>20 sec 1080p) | Likely OOM         | Yes                                    |
| Batch processing (10+ clips)     | Slow but works     | Optional (faster)                      |
| WAN 2.2 + LTX-2 parallel         | High OOM risk      | Prefer sequential or RunPod            |

OOM Threshold Guidance (A5000 24GB):

  • LTX-2 FP8: 1080p clips over ~12-15s may OOM with audio; use 720p, shorten clips, or disable audio
  • Control nets (pose/depth/canny) and multiple LoRAs increase memory; enable selectively
  • Monitor runtime VRAM; keep ≤22GB to avoid instability (see the one-liner below)
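
One way to watch that ceiling during a run, assuming nvidia-smi is available (not part of run.sh):

# Refresh VRAM usage once per second (used / total, in MiB); stay under ~22GB on a 24GB card
watch -n 1 nvidia-smi --query-gpu=memory.used,memory.total --format=csv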

RunPod Workflow

# Provision GPU for large task
/ops-runpod provision --gpu a100-40gb --task "LTX-2 BF16 generation"

# Run generation on RunPod
/ops-runpod run --script generate.sh

# Download results and terminate
/ops-runpod download --output ./assets
/ops-runpod terminate

RunPod GPU Options:

  • BF16/full precision: A100 40-80GB, H100 (required)
  • FP8/FP4 tasks: L40S 48GB, A10G 24GB (cheaper alternatives)

Cost Consideration: RunPod charges by the hour. For overnight tasks, local generation is more cost-effective. Consider spot/preemptible instances for savings.

Troubleshooting & Fallbacks

OOM Mitigation:

  1. Reduce resolution (720p → 540p)
  2. Shorten clip length
  3. Set batch=1
  4. Switch FP mode (BF16 → FP8 → FP4)
  5. Disable audio
  6. Split long clips into segments

Stability:

  • Fix seed for reproducibility
  • Avoid parallel jobs on 24GB
  • Reduce control nets and LoRA stacks

Fallback Path: If LTX-2 fails, switch to WAN 2.2 (video-only) or CogVideoX; add audio separately in post.
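
When audio is added separately in post, a plain FFmpeg mux covers that last step (filenames are placeholders):

# Mux a separately produced soundtrack onto a silent render
ffmpeg -i silent_render.mp4 -i soundtrack.wav -map 0:v:0 -map 1:a:0 \
       -c:v copy -c:a aac -shortest final_with_audio.mp4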

Memory Integration

After each movie, stores:

  • Successful prompts
  • Working tool code
  • Technique insights
  • Concept relationships

Scope: horus-filmmaking
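
For example, using only the documented learn flags:

# Preview what would be stored, then commit it to the default scope
./run.sh learn --project-dir ./movie_project --dry-run
./run.sh learn --project-dir ./movie_project --scope horus-filmmaking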

Workflow Patterns (from Nobody & The Computer)

Multi-Model Collaboration

Different AI models handle different creative aspects, inspired by "Bach x Coltrane x Kuti x Takemitsu":

  • Model A (Claude): Structure, composition, narrative arc
  • Model B (GPT): Improvisation, dialogue, variation
  • Model C (Grok): Energy, rhythm, pacing
  • Model D (DeepSeek): Texture, atmosphere, silence

Each model builds on previous work. Constraints: 100 words max per turn for focused output.

Critique Loop

From "A.I.thoven" sessions - "roast the piece with love":

  1. Generate initial draft
  2. Critique constructively (what works, what doesn't)
  3. Iterate based on feedback
  4. Repeat until satisfied

Iteration Speed

Use LTX-2 Distilled for rapid iterations during creative exploration. Use LTX-2 19B FP8 for production with camera controls and audio sync. Fall back to Mochi for maximum quality when camera control isn't needed.

Example Session

Horus: I want to create a mockumentary about AI learning to paint.

[RESEARCH] Searching for documentary interview techniques, AI art history...
[SCRIPT] Breaking into 5 scenes: intro, discovery, struggle, breakthrough, reflection
[BUILD TOOLS] Writing code for interview framing effect, paint brush animation...
[GENERATE] Creating 45 frames, 3 audio tracks, 2 voice segments...
[ASSEMBLE] Combining into 2-minute video with transitions...
[LEARN] Storing 8 insights in memory for future films.

Output: ai_painter_mockumentary.mp4 (2:14)

Dependencies

  • Docker (for isolated code execution)
  • FFmpeg (video processing)
  • Python 3.11+ (orchestrator)
  • GPU recommended (for Stable Diffusion, video models)