vibeship-spawner-skills / text-to-video

Text-to-Video AI Skill

Video generation from text and images with Runway, Kling, Luma, and Wan.

Install (clone the upstream repo):

git clone https://github.com/vibeforge1111/vibeship-spawner-skills

Manifest: ai/text-to-video/skill.yaml

Source content

id: text-to-video
name: Text to Video
display_name: Text-to-Video AI
description: Expert patterns for AI video generation including text-to-video, image-to-video, video editing, and API integration with Runway, Kling, Luma, Wan, and Replicate
version: 1.0.0
category: ai
tags:

  • video-generation
  • text-to-video
  • image-to-video
  • runway
  • kling
  • luma
  • wan
  • replicate
  • ai-video

triggers:

  • "text to video"
  • "video generation"
  • "image to video"
  • "runway api"
  • "kling video"
  • "luma dream machine"
  • "wan video"
  • "animate image"
  • "ai video"

capabilities:

  • "Text-to-video generation"
  • "Image-to-video animation"
  • "Video-to-video transformation"
  • "Camera control (pan, zoom, orbit)"
  • "Character consistency"
  • "Video extension/continuation"
  • "Multi-platform API integration"
  • "Async video processing"

patterns:

  • id: replicate-text-to-video
    name: Text-to-Video with Replicate
    description: |
      Generate videos from text prompts using Replicate models.

    Model recommendations:

    • Wan 2.1/2.2: Open-source, fast, good quality (the examples below pin Wan 2.1 model IDs)
    • Kling 2.1: High quality, character consistency
    • minimax/hailuo: Physics realism

    Videos typically take 30-150 seconds to generate.

    code_example: |
      import os
      import asyncio

      import replicate

    Initialize client

    client = replicate.Client(api_token=os.environ["REPLICATE_API_TOKEN"])

    Text-to-Video with Wan 2.1 (fast, open-source)

  def generate_video_wan(
      prompt: str,
      resolution: str = "480p",  # 480p or 720p
      duration: int = 5,  # seconds
  ) -> str:
      """Generate video from text with Wan 2.1."""

      model = f"wavespeedai/wan-2.1-t2v-{resolution}"
    
      output = client.run(
          model,
          input={
              "prompt": prompt,
              "num_frames": duration * 16,  # ~16 FPS
              "guidance_scale": 6.0,
              "num_inference_steps": 30,
          }
      )
    
      return output  # Video URL
    

    Text-to-Video with Kling (high quality)

  def generate_video_kling(
      prompt: str,
      duration: int = 5,  # 5 or 10 seconds
  ) -> str:
      """Generate video with Kling v2.1."""

      output = client.run(
          "kwaivgi/kling-v2.1",
          input={
              "prompt": prompt,
              "duration": duration,
              "aspect_ratio": "16:9",
              "negative_prompt": "blurry, distorted, low quality",
          }
      )
    
      return output
    

    Async generation with polling

  async def generate_video_async(
      prompt: str,
      model: str = "wavespeedai/wan-2.1-t2v-480p",
      timeout: int = 300,
  ) -> str:
      """Generate video asynchronously with status polling."""

      prediction = client.predictions.create(
          model=model,
          input={"prompt": prompt}
      )
    
      start = asyncio.get_event_loop().time()
    
      while True:
          prediction = client.predictions.get(prediction.id)
    
          if prediction.status == "succeeded":
              return prediction.output
    
          if prediction.status == "failed":
              raise Exception(f"Generation failed: {prediction.error}")
    
          if asyncio.get_event_loop().time() - start > timeout:
              client.predictions.cancel(prediction.id)
              raise TimeoutError("Video generation timed out")
    
          await asyncio.sleep(5)  # Poll every 5 seconds
    

    Batch generation

  async def generate_batch(prompts: list[str]) -> list[str]:
      """Generate multiple videos concurrently."""

      tasks = [
          generate_video_async(prompt)
          for prompt in prompts
      ]
    
      return await asyncio.gather(*tasks)
    

    anti_patterns:

    • pattern: "Blocking synchronous generation"
      why: "Videos take 30-150 seconds to generate"
      fix: "Use async with polling or webhooks (see the sketch below)"

    • pattern: "No timeout on generation"
      why: "Failed generations can hang forever"
      fix: "Add timeout and cancellation logic"

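    For long generations, Replicate can also push status updates instead of being polled. A minimal sketch of the webhook route, assuming a FastAPI receiver; the endpoint path and URL are placeholders, and the handler assumes Replicate's prediction payload shape ("status", "output"):

      import replicate
      from fastapi import FastAPI, Request

      app = FastAPI()
      client = replicate.Client()

      def generate_video_webhook(prompt: str) -> str:
          """Submit a generation; Replicate calls the webhook when done."""
          prediction = client.predictions.create(
              model="wavespeedai/wan-2.1-t2v-480p",
              input={"prompt": prompt},
              webhook="https://example.com/webhooks/replicate",  # placeholder URL
              webhook_events_filter=["completed"],  # skip intermediate events
          )
          return prediction.id  # correlate with the webhook payload later

      @app.post("/webhooks/replicate")
      async def on_prediction_update(request: Request):
          payload = await request.json()
          if payload.get("status") == "succeeded":
              video_url = payload.get("output")  # hand off to storage/post-processing
          return {"ok": True}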
    references:

  • id: image-to-video
    name: Image-to-Video Animation
    description: |
      Animate still images into videos. Preserve image content while adding motion.

    Best for:

    • Product animations
    • Character motion
    • Scene transitions
    • Social media content

    code_example: |
      import replicate
      import fal_client

    Image-to-Video with Wan (Replicate)

  def animate_image_wan(
      image_url: str,
      prompt: str,
      duration: int = 5,
  ) -> str:
      """Animate image with Wan 2.1."""

      client = replicate.Client()
    
      output = client.run(
          "wavespeedai/wan-2.1-i2v-480p",
          input={
              "image": image_url,
              "prompt": prompt,
              "num_frames": duration * 16,
              "guidance_scale": 6.0,
          }
      )
    
      return output
    

    Image-to-Video with Kling (high quality)

  def animate_image_kling(
      image_url: str,
      prompt: str,
      duration: int = 5,  # 5 or 10 seconds
      motion_amount: float = 0.5,  # 0-1, forwarded as cfg_scale below
  ) -> str:
      """Animate image with Kling v2.1."""

      client = replicate.Client()
    
      output = client.run(
          "kwaivgi/kling-v2.1",
          input={
              "image": image_url,
              "prompt": prompt,
              "duration": duration,
              "cfg_scale": motion_amount,
              "negative_prompt": "static, frozen, no motion",
          }
      )
    
      return output
    

    Image-to-Video with Fal.ai

  def animate_image_fal(
      image_url: str,
      prompt: str,
  ) -> dict:
      """Animate image with Fal.ai."""

      result = fal_client.submit(
          "fal-ai/wan-i2v",
          arguments={
              "image_url": image_url,
              "prompt": prompt,
              "num_inference_steps": 30,
          }
      )
    
      return result.get()
    

    Camera-motion animation (Luma Dream Machine style, via prompt)

  def animate_with_camera_motion(
      image_url: str,
      camera_motion: str = "zoom_in",  # zoom_in, zoom_out, pan_left, pan_right, orbit
      prompt: str = "",
  ) -> str:
      """Animate with specific camera motion."""

      # Map camera motion to prompt additions
      MOTION_PROMPTS = {
          "zoom_in": "smooth zoom in, camera pushing forward",
          "zoom_out": "smooth zoom out, camera pulling back",
          "pan_left": "smooth pan left, camera sliding left",
          "pan_right": "smooth pan right, camera sliding right",
          "orbit": "camera orbiting around subject, 3D parallax",
          "tilt_up": "camera tilting upward, looking up",
          "tilt_down": "camera tilting downward, looking down",
      }
    
      motion_prompt = MOTION_PROMPTS.get(camera_motion, "")
      full_prompt = f"{prompt}, {motion_prompt}".strip(", ")
    
      client = replicate.Client()
      output = client.run(
          "wavespeedai/wan-2.1-i2v-720p",
          input={
              "image": image_url,
              "prompt": full_prompt,
              "guidance_scale": 7.0,
          }
      )
    
      return output
    

    Product turntable animation

  def create_product_turntable(
      product_image_url: str,
      rotation_degrees: int = 360,
      duration: int = 5,
  ) -> str:
      """Create 360-degree product rotation video."""

      prompt = f"product rotating {rotation_degrees} degrees on turntable, smooth continuous rotation, studio lighting, white background"
    
      return animate_image_kling(
          image_url=product_image_url,
          prompt=prompt,
          duration=duration,
          motion_amount=0.7,
      )
    

    anti_patterns:

    • pattern: "Vague motion prompts"
      why: "Results in random or no motion"
      fix: "Be specific about motion type and direction"

    • pattern: "Expecting perfect character consistency"
      why: "Current models drift across frames"
      fix: "Use shorter clips, or models with element/character features"

    references:

  • id: video-prompting
    name: Video Prompting Best Practices
    description: |
      Write effective prompts for video generation. Key elements: subject, action, camera, lighting, style.

    Video prompts differ from image prompts:

    • Emphasize motion and action
    • Describe camera movement
    • Specify timing and pacing

    code_example: |

    Video prompt structure

  class VideoPrompt:
      """Build structured video prompts."""

      def __init__(self):
          self.subject = ""
          self.action = ""
          self.camera = ""
          self.lighting = ""
          self.style = ""
          self.negative = ""
    
      def with_subject(self, subject: str) -> "VideoPrompt":
          self.subject = subject
          return self
    
      def with_action(self, action: str) -> "VideoPrompt":
          self.action = action
          return self
    
      def with_camera(self, camera: str) -> "VideoPrompt":
          self.camera = camera
          return self
    
      def with_lighting(self, lighting: str) -> "VideoPrompt":
          self.lighting = lighting
          return self
    
      def with_style(self, style: str) -> "VideoPrompt":
          self.style = style
          return self
    
      def build(self) -> str:
          parts = [
              self.subject,
              self.action,
              self.camera,
              self.lighting,
              self.style,
          ]
          return ", ".join(p for p in parts if p)
    

    Usage examples

  prompt = (
      VideoPrompt()
      .with_subject("a young woman in a red dress")
      .with_action("walking confidently through a busy city street")
      .with_camera("tracking shot, following from the side")
      .with_lighting("golden hour sunlight, dramatic shadows")
      .with_style("cinematic, film grain, shallow depth of field")
      .build()
  )

    Camera motion vocabulary

  CAMERA_MOTIONS = {
      "static": "locked off shot, stationary camera",
      "pan": "horizontal pan, smooth lateral movement",
      "tilt": "vertical tilt, camera looking up/down",
      "zoom": "slow zoom in, pushing forward",
      "dolly": "dolly shot, camera moving forward",
      "tracking": "tracking shot, following subject",
      "orbit": "orbital shot, camera circling subject",
      "crane": "crane shot, camera rising upward",
      "handheld": "handheld camera, slight shake",
      "steadicam": "steadicam shot, smooth floating movement",
  }

    Action speed modifiers

  SPEED_MODIFIERS = {
      "slow": "slow motion, graceful movement, 0.5x speed",
      "normal": "natural pace, realistic timing",
      "fast": "quick movement, energetic, rapid motion",
      "timelapse": "time lapse, accelerated, fast forward",
  }

    Prompt templates by genre

  GENRE_TEMPLATES = {
      "commercial": "{subject}, {action}, professional lighting, clean composition, 4K quality, commercial production",
      "cinematic": "{subject}, {action}, cinematic lighting, film grain, anamorphic lens, movie quality",
      "social": "{subject}, {action}, vibrant colors, engaging, vertical format, social media style",
      "documentary": "{subject}, {action}, natural lighting, authentic, observational, documentary style",
  }

  def build_genre_prompt(
      subject: str,
      action: str,
      genre: str = "commercial",
  ) -> str:
      template = GENRE_TEMPLATES.get(genre, GENRE_TEMPLATES["commercial"])
      return template.format(subject=subject, action=action)

    Negative prompts for video

  NEGATIVE_PROMPTS = {
      "quality": "blurry, distorted, low resolution, pixelated, artifacts",
      "motion": "jittery, stuttering, frozen, static, no motion",
      "anatomy": "deformed, extra limbs, missing limbs, bad anatomy",
      "all": "blurry, distorted, low quality, jittery, deformed, extra limbs, artifacts, watermark",
  }

    anti_patterns:

    • pattern: "Image-style prompts for video"
      why: "Video needs motion and action descriptions"
      fix: "Add verbs, camera motion, and timing"

    • pattern: "Overly complex prompts"
      why: "Models struggle with many elements"
      fix: "Focus on key subject and single clear action (see the lint sketch below)"

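    One way to act on these anti-patterns before spending credits is a small pre-submission lint pass. A minimal sketch; the clause threshold and motion vocabulary are illustrative assumptions, not model requirements:

      def lint_video_prompt(prompt: str, max_clauses: int = 6) -> list[str]:
          """Warn about prompts likely to confuse video models."""
          warnings = []
          clauses = [c.strip() for c in prompt.split(",") if c.strip()]
          if len(clauses) > max_clauses:
              warnings.append(
                  f"{len(clauses)} comma-separated elements; trim to one subject and one clear action"
              )
          # Video prompts need motion: check for action/camera vocabulary
          motion_words = ("walk", "run", "zoom", "pan", "orbit", "rotat", "fly", "track", "moving")
          if not any(w in prompt.lower() for w in motion_words):
              warnings.append("no motion vocabulary found; add an action verb or camera movement")
          return warnings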
    references:

  • id: runway-gen3-integration
    name: Runway Gen-3 API Integration
    description: |
      Integrate Runway's Gen-3 Alpha for professional video. Features: 4K resolution, camera controls, lip sync.

    Note: Requires Runway API access (paid). Credits: ~10 credits/second for Gen-3 Alpha.

    code_example: |
      import os
      import asyncio
      from typing import Optional

      import httpx

  class RunwayClient:
      """Runway Gen-3 API client."""

      def __init__(self, api_key: str):
          self.api_key = api_key
          self.base_url = "https://api.runwayml.com/v1"
    
      async def generate_video(
          self,
          prompt: str,
          image_url: Optional[str] = None,
          duration: int = 5,  # seconds
          resolution: str = "720p",
          aspect_ratio: str = "16:9",
      ) -> dict:
          """Generate video with Gen-3 Alpha."""
    
          async with httpx.AsyncClient() as client:
              # Create generation task
              response = await client.post(
                  f"{self.base_url}/generation",
                  headers={
                      "Authorization": f"Bearer {self.api_key}",
                      "Content-Type": "application/json",
                  },
                  json={
                      "model": "gen3a_turbo",  # or "gen3a" for Alpha
                      "prompt_text": prompt,
                      "prompt_image": image_url,
                      "duration": duration,
                      "resolution": resolution,
                      "aspect_ratio": aspect_ratio,
                  },
                  timeout=30,
              )
    
              if response.status_code != 200:
                  raise Exception(f"Generation failed: {response.text}")
    
              task = response.json()
              return await self._wait_for_completion(task["id"])
    
      async def _wait_for_completion(
          self,
          task_id: str,
          timeout: int = 300,
          poll_interval: int = 5,
      ) -> dict:
          """Poll for task completion."""
    
          start = asyncio.get_event_loop().time()
    
          async with httpx.AsyncClient() as client:
              while True:
                  response = await client.get(
                      f"{self.base_url}/generation/{task_id}",
                      headers={
                          "Authorization": f"Bearer {self.api_key}",
                      },
                  )
    
                  task = response.json()
    
                  if task["status"] == "completed":
                      return task
    
                  if task["status"] == "failed":
                      raise Exception(f"Task failed: {task.get('error')}")
    
                  if asyncio.get_event_loop().time() - start > timeout:
                      raise TimeoutError("Generation timed out")
    
                  await asyncio.sleep(poll_interval)
    
      async def extend_video(
          self,
          video_url: str,
          prompt: str,
          extend_seconds: int = 4,
      ) -> dict:
          """Extend existing video."""
    
          async with httpx.AsyncClient() as client:
              response = await client.post(
                  f"{self.base_url}/extend",
                  headers={
                      "Authorization": f"Bearer {self.api_key}",
                      "Content-Type": "application/json",
                  },
                  json={
                      "model": "gen3a_turbo",
                      "video_url": video_url,
                      "prompt_text": prompt,
                      "extend_duration": extend_seconds,
                  },
              )
    
              task = response.json()
              return await self._wait_for_completion(task["id"])
    

    Usage

    runway = RunwayClient(os.environ["RUNWAY_API_KEY"])

    Text-to-video

  result = await runway.generate_video(
      prompt="A drone shot flying over misty mountains at sunrise",
      duration=5,
      resolution="1080p",
  )
  print(f"Video URL: {result['output_url']}")

    Image-to-video

  result = await runway.generate_video(
      prompt="Camera slowly zooming in, subject looking at camera",
      image_url="https://example.com/portrait.jpg",
      duration=5,
  )

    anti_patterns:

    • pattern: "Not tracking credit usage"
      why: "Gen-3 uses ~10 credits/second, costs add up"
      fix: "Track usage and set budget limits (see the budget sketch below)"

    • pattern: "Synchronous API calls"
      why: "Generation takes 30+ seconds"
      fix: "Use async with proper polling"

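    A simple way to enforce the budget fix above is a client-side credit meter checked before each request. A minimal sketch; the 10 credits/second rate is the figure quoted in this pattern's description, and the budget values are placeholders:

      class CreditBudget:
          """Track estimated Gen-3 credit spend and refuse over-budget requests."""

          CREDITS_PER_SECOND = 10  # rate quoted above for Gen-3 Alpha

          def __init__(self, max_credits: int = 1000):
              self.max_credits = max_credits
              self.spent = 0

          def charge(self, duration_seconds: int) -> None:
              cost = duration_seconds * self.CREDITS_PER_SECOND
              if self.spent + cost > self.max_credits:
                  raise RuntimeError(
                      f"Credit budget exceeded: {self.spent} spent + {cost} requested > {self.max_credits}"
                  )
              self.spent += cost

      budget = CreditBudget(max_credits=500)
      budget.charge(duration_seconds=5)  # call before runway.generate_video(...)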
    references:

  • id: luma-dream-machine
    name: Luma Dream Machine Integration
    description: |
      Integrate Luma's Dream Machine for cinematic video. Features: HDR output, keyframes, character reference.

    Ray3: Latest model with studio-grade HDR. Draft Mode: Fast exploration before final render.

    code_example: |
      import os
      import asyncio
      from typing import Optional

      import httpx

  class LumaClient:
      """Luma Dream Machine API client."""

      def __init__(self, api_key: str):
          self.api_key = api_key
          self.base_url = "https://api.lumalabs.ai/dream-machine/v1"
    
      async def generate(
          self,
          prompt: str,
          aspect_ratio: str = "16:9",
          loop: bool = False,
          keyframes: Optional[dict] = None,
      ) -> dict:
          """Generate video with Dream Machine."""
    
          payload = {
              "prompt": prompt,
              "aspect_ratio": aspect_ratio,
              "loop": loop,
          }
    
          if keyframes:
              payload["keyframes"] = keyframes
    
          async with httpx.AsyncClient() as client:
              response = await client.post(
                  f"{self.base_url}/generations",
                  headers={
                      "Authorization": f"Bearer {self.api_key}",
                      "Content-Type": "application/json",
                  },
                  json=payload,
              )
    
              if response.status_code != 201:
                  raise Exception(f"Failed: {response.text}")
    
              task = response.json()
              return await self._poll(task["id"])
    
      async def generate_from_image(
          self,
          image_url: str,
          prompt: str,
          end_image_url: Optional[str] = None,  # Ray3 keyframe feature
      ) -> dict:
          """Generate video from start image, optionally with end keyframe."""
    
          keyframes = {
              "frame0": {"type": "image", "url": image_url},
          }
    
          if end_image_url:
              keyframes["frame1"] = {"type": "image", "url": end_image_url}
    
          return await self.generate(
              prompt=prompt,
              keyframes=keyframes,
          )
    
      async def _poll(
          self,
          generation_id: str,
          timeout: int = 300,
      ) -> dict:
          """Poll for completion."""
    
          start = asyncio.get_event_loop().time()
    
          async with httpx.AsyncClient() as client:
              while True:
                  response = await client.get(
                      f"{self.base_url}/generations/{generation_id}",
                      headers={
                          "Authorization": f"Bearer {self.api_key}",
                      },
                  )
    
                  result = response.json()
    
                  if result["state"] == "completed":
                      return result
    
                  if result["state"] == "failed":
                      raise Exception(f"Failed: {result.get('failure_reason')}")
    
                  if asyncio.get_event_loop().time() - start > timeout:
                      raise TimeoutError("Generation timed out")
    
                  await asyncio.sleep(5)
    

    Usage

    luma = LumaClient(os.environ["LUMA_API_KEY"])

    Text-to-video

  result = await luma.generate(
      prompt="A spaceship traveling through a colorful nebula, cinematic",
      aspect_ratio="16:9",
  )

    Image-to-video with camera motion

  result = await luma.generate_from_image(
      image_url="https://example.com/landscape.jpg",
      prompt="Camera slowly panning right, revealing the scene",
  )

    Transition between two keyframes (Ray3)

  result = await luma.generate_from_image(
      image_url="https://example.com/start.jpg",
      prompt="Smooth transition, morphing",
      end_image_url="https://example.com/end.jpg",
  )

    anti_patterns:

    • pattern: "Ignoring HDR capabilities"
      why: "Ray3 supports native HDR for pro workflows"
      fix: "Export as 16-bit EXR for high-end projects"

    references:

  • id: video-processing-pipeline
    name: Video Processing Pipeline
    description: |
      Build end-to-end video generation pipelines. Handle upload, generation, post-processing, delivery.

    Components:

    • Input validation
    • Queue management
    • Progress tracking
    • Error handling

    code_example: |
      import asyncio
      import uuid
      from dataclasses import dataclass
      from enum import Enum
      from typing import Callable, Optional

  class VideoStatus(Enum):
      PENDING = "pending"
      GENERATING = "generating"
      PROCESSING = "processing"
      COMPLETED = "completed"
      FAILED = "failed"

  @dataclass
  class VideoJob:
      id: str
      prompt: str
      image_url: Optional[str]
      status: VideoStatus
      progress: int
      output_url: Optional[str]
      error: Optional[str]
      created_at: float
      updated_at: float

  class VideoGenerationPipeline:
      """End-to-end video generation pipeline."""

      def __init__(
          self,
          replicate_client,
          storage_client,
          max_concurrent: int = 5,
      ):
          self.replicate = replicate_client
          self.storage = storage_client
          self.max_concurrent = max_concurrent
          self.jobs: dict[str, VideoJob] = {}
          self.semaphore = asyncio.Semaphore(max_concurrent)
    
      async def submit(
          self,
          prompt: str,
          image_url: Optional[str] = None,
          model: str = "wavespeedai/wan-2.1-t2v-480p",
          callback: Optional[Callable] = None,
      ) -> str:
          """Submit video generation job."""
    
          job_id = str(uuid.uuid4())
          now = asyncio.get_event_loop().time()
    
          job = VideoJob(
              id=job_id,
              prompt=prompt,
              image_url=image_url,
              status=VideoStatus.PENDING,
              progress=0,
              output_url=None,
              error=None,
              created_at=now,
              updated_at=now,
          )
    
          self.jobs[job_id] = job
    
          # Start processing in background
          asyncio.create_task(
              self._process_job(job, model, callback)
          )
    
          return job_id
    
      async def _process_job(
          self,
          job: VideoJob,
          model: str,
          callback: Optional[Callable],
      ):
          """Process a single video job."""
    
          async with self.semaphore:
              try:
                  # Update status
                  job.status = VideoStatus.GENERATING
                  job.progress = 10
                  await self._notify(callback, job)
    
                  # Generate video
                  input_data = {"prompt": job.prompt}
                  if job.image_url:
                      input_data["image"] = job.image_url
    
                  prediction = self.replicate.predictions.create(
                      model=model,
                      input=input_data,
                  )
    
                  # Poll for completion
                  while True:
                      prediction = self.replicate.predictions.get(prediction.id)
    
                      if prediction.status == "processing":
                          job.progress = min(80, job.progress + 10)
                          await self._notify(callback, job)
    
                      if prediction.status == "succeeded":
                          video_url = prediction.output
                          break
    
                      if prediction.status == "failed":
                          raise Exception(prediction.error)
    
                      await asyncio.sleep(5)
    
                  # Post-process and store
                  job.status = VideoStatus.PROCESSING
                  job.progress = 90
                  await self._notify(callback, job)
    
                  # Upload to permanent storage
                  permanent_url = await self.storage.upload_from_url(
                      video_url,
                      f"videos/{job.id}.mp4",
                  )
    
                  # Complete
                  job.status = VideoStatus.COMPLETED
                  job.progress = 100
                  job.output_url = permanent_url
                  await self._notify(callback, job)
    
              except Exception as e:
                  job.status = VideoStatus.FAILED
                  job.error = str(e)
                  await self._notify(callback, job)
    
      async def _notify(
          self,
          callback: Optional[Callable],
          job: VideoJob,
      ):
          """Notify callback of job update."""
          job.updated_at = asyncio.get_event_loop().time()
    
          if callback:
              await callback(job)
    
      def get_status(self, job_id: str) -> Optional[VideoJob]:
          """Get job status."""
          return self.jobs.get(job_id)
    
      def get_queue_stats(self) -> dict:
          """Get queue statistics."""
          statuses = [job.status for job in self.jobs.values()]
          return {
              "total": len(self.jobs),
              "pending": statuses.count(VideoStatus.PENDING),
              "generating": statuses.count(VideoStatus.GENERATING),
              "completed": statuses.count(VideoStatus.COMPLETED),
              "failed": statuses.count(VideoStatus.FAILED),
          }
    

    Usage

  pipeline = VideoGenerationPipeline(
      replicate_client=replicate.Client(),
      storage_client=s3_client,
      max_concurrent=5,
  )

    Submit job

  async def on_update(job):
      # The pipeline's _notify awaits the callback, so it must be a coroutine function
      print(f"Job {job.id}: {job.status.value} - {job.progress}%")

  job_id = await pipeline.submit(
      prompt="A cat playing piano in a jazz club",
      callback=on_update,
  )

    Check status

    job = pipeline.get_status(job_id)

    anti_patterns:

    • pattern: "No concurrency limits"
      why: "Can overwhelm API rate limits"
      fix: "Use semaphores to limit concurrent jobs"

    • pattern: "Storing videos in memory"
      why: "Videos are large, exhaust memory"
      fix: "Stream to cloud storage (see the streaming sketch below)"

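    The storage_client.upload_from_url call in the pipeline is left abstract. One way to implement the streaming fix above is to pipe the HTTP response directly into S3's multipart uploader so the video never sits fully in memory. A minimal sketch, assuming boto3 and requests, with a placeholder bucket name; shown synchronously for brevity (the pipeline awaits it, so wrap with asyncio.to_thread or use an async S3 client in practice):

      import boto3
      import requests

      def upload_from_url(source_url: str, key: str, bucket: str = "my-video-bucket") -> str:
          """Stream a remote video into S3 without buffering it in memory."""
          s3 = boto3.client("s3")
          with requests.get(source_url, stream=True, timeout=60) as response:
              response.raise_for_status()
              # upload_fileobj reads in chunks and uses multipart upload under the hood
              s3.upload_fileobj(response.raw, bucket, key)
          return f"s3://{bucket}/{key}"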
    references:

handoff_triggers:

  • condition: "needs audio/music"
    target_skill: ai-music-audio
    context: "Background music, sound effects"

  • condition: "needs image editing first"
    target_skill: ai-image-editing
    context: "Prepare images before animation"

  • condition: "needs lip sync"
    target_skill: voice-synthesis
    context: "Speaking character videos"

  • condition: "needs batch processing"
    target_skill: workflow-automation
    context: "Video processing pipelines"