Antigravity-awesome-skills videodb

Video and audio perception, indexing, and editing. Ingest files/URLs/live streams, build visual/spoken indexes, search with timestamps, edit timelines, add overlays/subtitles, generate media, and create real-time alerts.

install
source · Clone the upstream repo
git clone https://github.com/sickn33/antigravity-awesome-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/sickn33/antigravity-awesome-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/antigravity-awesome-skills/skills/videodb" ~/.claude/skills/sickn33-antigravity-awesome-skills-videodb-a52804 && rm -rf "$T"
manifest: plugins/antigravity-awesome-skills/skills/videodb/SKILL.md
safety · automated scan (medium risk)
This is a pattern-based risk scan, not a security review. Our crawler flagged:
  • pip install
  • references .env files
Always read a skill's source content before installing. Patterns alone don't mean the skill is malicious — but they warrant attention.
source content

VideoDB Skill

Perception + memory + actions for video, live streams, and desktop sessions.

When to Use

  • You need video or audio perception, indexing, search, or timeline editing from files, URLs, desktop sessions, or live streams.
  • The task involves timestamps, searchable evidence, subtitles, clips, overlays, or real-time monitoring alerts.
  • You want one workflow that combines ingestion, understanding, retrieval, and media actions.

Use this skill when you need to:

1) Desktop Perception

  • Start/stop a desktop session capturing screen, mic, and system audio
  • Stream live context and store episodic session memory
  • Run real-time alerts/triggers on what's spoken and what's happening on screen
  • Produce session summaries, a searchable timeline, and playable evidence links

2) Video ingest + stream

  • Ingest a file or URL and return a playable web stream link
  • Transcode/normalize: codec, bitrate, fps, resolution, aspect ratio

3) Index + search (timestamps + evidence)

  • Build visual, spoken, and keyword indexes
  • Search and return exact moments with timestamps and playable evidence
  • Auto-create clips from search results

4) Timeline editing + generation

  • Subtitles: generate, translate, burn-in
  • Overlays: text/image/branding, motion captions
  • Audio: background music, voiceover, dubbing
  • Programmatic composition and exports via timeline operations

5) Live streams (RTSP) + monitoring

  • Connect RTSP/live feeds
  • Run real-time visual and spoken understanding and emit events/alerts for monitoring workflows

Common inputs

  • Local file path, public URL, or RTSP URL
  • Desktop capture request: start / stop / summarize session
  • Desired operations: get context for understanding, transcode spec, index spec, search query, clip ranges, timeline edits, alert rules

Common outputs

  • Stream URL
  • Search results with timestamps and evidence links
  • Generated assets: subtitles, audio, images, clips
  • Event/alert payloads for live streams
  • Desktop session summaries and memory entries

Canonical prompts (examples)

  • "Start desktop capture and alert when a password field appears."
  • "Record my session and produce an actionable summary when it ends."
  • "Ingest this file and return a playable stream link."
  • "Index this folder and find every scene with people, return timestamps."
  • "Generate subtitles, burn them in, and add light background music."
  • "Connect this RTSP URL and alert when a person enters the zone."

Running Python code

Before running any VideoDB code, change to the project directory and load environment variables:

from dotenv import load_dotenv
load_dotenv(".env")

import videodb
conn = videodb.connect()

This reads VIDEO_DB_API_KEY from:

  1. The environment (if already exported)
  2. The project's .env file in the current directory

If the key is missing, videodb.connect() raises AuthenticationError automatically.

Do NOT write a script file when a short inline command works.

When writing inline Python (python -c "..."), always use properly formatted code: use semicolons to separate statements and keep it readable. For anything longer than ~3 statements, use a heredoc instead:

python << 'EOF'
from dotenv import load_dotenv
load_dotenv(".env")

import videodb
conn = videodb.connect()
coll = conn.get_collection()
print(f"Videos: {len(coll.get_videos())}")
EOF

Setup

When the user asks to "setup videodb" or similar:

1. Install SDK

pip install "videodb[capture]" python-dotenv

If videodb[capture] fails on Linux, install without the capture extra:

pip install videodb python-dotenv

2. Configure API key

The user must set VIDEO_DB_API_KEY using either method:

  • Export in terminal (before starting Claude): export VIDEO_DB_API_KEY=your-key
  • Project .env file: save VIDEO_DB_API_KEY=your-key in the project's .env file

Get a free API key at https://console.videodb.io (50 free uploads, no credit card).

Do NOT read, write, or handle the API key yourself. Always let the user set it.

Quick Reference

Upload media

# URL
video = coll.upload(url="https://example.com/video.mp4")

# YouTube
video = coll.upload(url="https://www.youtube.com/watch?v=VIDEO_ID")

# Local file
video = coll.upload(file_path="/path/to/video.mp4")

Transcript + subtitle

# force=True skips the error if the video is already indexed
video.index_spoken_words(force=True)
text = video.get_transcript_text()
stream_url = video.add_subtitle()

Search inside videos

from videodb.exceptions import InvalidRequestError

video.index_spoken_words(force=True)

# search() raises InvalidRequestError when no results are found.
# Always wrap in try/except and treat "No results found" as empty.
try:
    results = video.search("product demo")
    shots = results.get_shots()
    stream_url = results.compile()
except InvalidRequestError as e:
    if "No results found" in str(e):
        shots = []
    else:
        raise
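
The try/except pattern above recurs for every search call, so it can be wrapped once. The following is a minimal sketch, assuming the "No results found" message text shown above. `shots_or_empty` is a hypothetical helper, not part of the VideoDB SDK, and it takes the exception class as a parameter so the sketch runs without the SDK installed.

```python
from typing import Any, Callable, List, Type


def shots_or_empty(
    search_fn: Callable[..., Any],
    *args: Any,
    no_results_exc: Type[Exception] = Exception,
    **kwargs: Any,
) -> List[Any]:
    """Run a search callable and return its shots, or [] when nothing matched.

    Hypothetical helper: pass video.search as search_fn and
    videodb.exceptions.InvalidRequestError as no_results_exc.
    """
    try:
        results = search_fn(*args, **kwargs)
    except no_results_exc as e:
        # Only the "No results found" case is treated as empty;
        # any other error is re-raised unchanged.
        if "No results found" in str(e):
            return []
        raise
    return results.get_shots()
```

With the SDK available, a call might look like `shots = shots_or_empty(video.search, "product demo", no_results_exc=InvalidRequestError)`.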

Scene search

import re
from videodb import SearchType, IndexType, SceneExtractionType
from videodb.exceptions import InvalidRequestError

# index_scenes() has no force parameter — it raises an error if a scene
# index already exists. Extract the existing index ID from the error.
try:
    scene_index_id = video.index_scenes(
        extraction_type=SceneExtractionType.shot_based,
        prompt="Describe the visual content in this scene.",
    )
except Exception as e:
    match = re.search(r"id\s+([a-f0-9]+)", str(e))
    if match:
        scene_index_id = match.group(1)
    else:
        raise

# Use score_threshold to filter low-relevance noise (recommended: 0.3+)
try:
    results = video.search(
        query="person writing on a whiteboard",
        search_type=SearchType.semantic,
        index_type=IndexType.scene,
        scene_index_id=scene_index_id,
        score_threshold=0.3,
    )
    shots = results.get_shots()
    stream_url = results.compile()
except InvalidRequestError as e:
    if "No results found" in str(e):
        shots = []
    else:
        raise
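
The ID-recovery step in the first try/except above can be isolated and tested without calling the API. This is a small sketch that reuses the same regular expression. `extract_scene_index_id` is a hypothetical name, and the sample message format is taken from the "Common pitfalls" entry rather than from verified SDK output.

```python
import re
from typing import Optional

# Same pattern as the snippet above; it assumes IDs are lowercase hex.
_SCENE_INDEX_ID = re.compile(r"id\s+([a-f0-9]+)")


def extract_scene_index_id(message: str) -> Optional[str]:
    """Return the scene index ID embedded in an error message, or None."""
    match = _SCENE_INDEX_ID.search(message)
    return match.group(1) if match else None
```

Note that the pattern is loose: it matches any "id" followed by whitespace and a hex character, including the tail of a word such as "Invalid" followed by a word starting with a to f. Check that the exception really is the "already exists" case before trusting the extracted value.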

Timeline editing

Important: Always validate timestamps before building a timeline:

  • start must be >= 0 (negative values are silently accepted but produce broken output)
  • start must be < end
  • end must be <= video.length

from videodb.timeline import Timeline
from videodb.asset import VideoAsset, TextAsset, TextStyle

timeline = Timeline(conn)
timeline.add_inline(VideoAsset(asset_id=video.id, start=10, end=30))
timeline.add_overlay(0, TextAsset(text="The End", duration=3, style=TextStyle(fontsize=36)))
stream_url = timeline.generate_stream()
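
The timestamp rules listed before the timeline example can be checked in code before any VideoAsset is created. Below is a minimal sketch. `validate_clip_range` is a hypothetical helper, and it assumes the video length is available as a number of seconds.

```python
def validate_clip_range(start: float, end: float, video_length: float) -> None:
    """Raise ValueError unless 0 <= start < end <= video_length."""
    if start < 0:
        raise ValueError(f"start must be >= 0, got {start}")
    if start >= end:
        raise ValueError(f"start ({start}) must be < end ({end})")
    if end > video_length:
        raise ValueError(f"end ({end}) must be <= video length ({video_length})")
```

Calling `validate_clip_range(10, 30, float(video.length))` before constructing the asset turns a silently broken stream into an immediate, readable error.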

Transcode video (resolution / quality change)

from videodb import TranscodeMode, VideoConfig, AudioConfig

# Change resolution, quality, or aspect ratio server-side
job_id = conn.transcode(
    source="https://example.com/video.mp4",
    callback_url="https://example.com/webhook",
    mode=TranscodeMode.economy,
    video_config=VideoConfig(resolution=720, quality=23, aspect_ratio="16:9"),
    audio_config=AudioConfig(mute=False),
)

Reframe aspect ratio (for social platforms)

Warning: reframe() is a slow server-side operation. For long videos it can take several minutes and may time out. Best practices:

  • Always limit to a short segment using start/end when possible
  • For full-length videos, use callback_url for async processing
  • Trim the video on a Timeline first, then reframe the shorter result

from videodb import ReframeMode

# Always prefer reframing a short segment:
reframed = video.reframe(start=0, end=60, target="vertical", mode=ReframeMode.smart)

# Async reframe for full-length videos (returns None, result via webhook):
video.reframe(target="vertical", callback_url="https://example.com/webhook")

# Presets: "vertical" (9:16), "square" (1:1), "landscape" (16:9)
reframed = video.reframe(start=0, end=60, target="square")

# Custom dimensions
reframed = video.reframe(start=0, end=60, target={"width": 1280, "height": 720})
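
The best practices above amount to a small decision: reframe short videos directly, send long ones through a webhook, and otherwise limit the call to a short segment. The sketch below encodes that choice as keyword arguments for video.reframe(). `reframe_kwargs` is a hypothetical helper, and the 60-second threshold is an illustrative assumption taken from the examples above, not a documented limit.

```python
from typing import Any, Dict, Optional


def reframe_kwargs(
    video_length: float,
    target: str = "vertical",
    max_sync_seconds: float = 60.0,
    callback_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for video.reframe() following the advice above.

    max_sync_seconds is an illustrative threshold, not a documented limit.
    """
    if video_length <= max_sync_seconds:
        # Short enough to reframe synchronously in one call.
        return {"start": 0, "end": video_length, "target": target}
    if callback_url is not None:
        # Long video: process asynchronously, result arrives via webhook.
        return {"target": target, "callback_url": callback_url}
    # Long video and no webhook: fall back to the first short segment only.
    return {"start": 0, "end": max_sync_seconds, "target": target}
```

The result would be passed as `video.reframe(**reframe_kwargs(float(video.length)))`. The last branch reframes only the opening segment, which may not be what the user wants, so a real workflow should probably ask which segment to keep.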

Generative media

image = coll.generate_image(
    prompt="a sunset over mountains",
    aspect_ratio="16:9",
)

Error handling

import videodb
from videodb.exceptions import AuthenticationError, InvalidRequestError

try:
    conn = videodb.connect()
except AuthenticationError:
    print("Check your VIDEO_DB_API_KEY")

try:
    video = coll.upload(url="https://example.com/video.mp4")
except InvalidRequestError as e:
    print(f"Upload failed: {e}")

Common pitfalls

| Scenario | Error message | Solution |
| --- | --- | --- |
| Indexing an already-indexed video | Spoken word index for video already exists | Use video.index_spoken_words(force=True) to skip if already indexed |
| Scene index already exists | Scene index with id XXXX already exists | Extract the existing scene_index_id from the error with re.search(r"id\s+([a-f0-9]+)", str(e)) |
| Search finds no matches | InvalidRequestError: No results found | Catch the exception and treat as empty results (shots = []) |
| Reframe times out | Blocks indefinitely on long videos | Use start/end to limit segment, or pass callback_url for async |
| Negative timestamps on Timeline | Silently produces broken stream | Always validate start >= 0 before creating VideoAsset |
| generate_video() / create_collection() fails | Operation not allowed or maximum limit | Plan-gated features; inform the user about plan limits |

Additional docs

Reference documentation is in the reference/ directory adjacent to this SKILL.md file. Use the Glob tool to locate it if needed.

Screen Recording (Desktop Capture)

Use ws_listener.py to capture WebSocket events during recording sessions. Desktop capture supports macOS only.

Quick Start

  1. Start listener: python scripts/ws_listener.py &
  2. Get WebSocket ID: cat /tmp/videodb_ws_id
  3. Run capture code (see reference/capture.md for full workflow)
  4. Events written to: /tmp/videodb_events.jsonl

Query Events

import json
import time

# Load every event from the JSONL file (one JSON object per line)
with open("/tmp/videodb_events.jsonl") as f:
    events = [json.loads(line) for line in f if line.strip()]

# Get all transcripts
transcripts = [e["data"]["text"] for e in events if e.get("channel") == "transcript"]

# Get visual descriptions from the last 5 minutes
cutoff = time.time() - 300
recent_visual = [e for e in events
                 if e.get("channel") == "visual_index" and e["unix_ts"] > cutoff]
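
The two queries above share one shape: filter by channel, optionally by time. The sketch below factors that into reusable helpers and runs them on synthetic lines so it works without a live capture. `parse_events` and `filter_events` are hypothetical names, and the sample events only assume the `channel`, `unix_ts`, and `data.text` fields used above; the real event schema may carry more.

```python
import json
import time
from typing import Any, Dict, Iterable, List, Optional


def parse_events(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Parse JSONL lines into event dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def filter_events(
    events: Iterable[Dict[str, Any]],
    channel: str,
    since: Optional[float] = None,
) -> List[Dict[str, Any]]:
    """Keep events on `channel`, optionally only those newer than `since`."""
    return [
        e for e in events
        if e.get("channel") == channel
        and (since is None or e.get("unix_ts", 0) > since)
    ]


# Synthetic sample lines shaped like the fields used above (assumed schema).
now = time.time()
sample = [
    json.dumps({"channel": "transcript", "unix_ts": now - 10, "data": {"text": "hello"}}),
    json.dumps({"channel": "visual_index", "unix_ts": now - 600, "data": {"text": "old frame"}}),
    json.dumps({"channel": "visual_index", "unix_ts": now - 30, "data": {"text": "new frame"}}),
    "",
]
events = parse_events(sample)
recent_visual = filter_events(events, "visual_index", since=now - 300)
transcripts = [e["data"]["text"] for e in filter_events(events, "transcript")]
```

Against a real session, `parse_events` can be fed the open file handle for /tmp/videodb_events.jsonl instead of the sample list.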

Utility Scripts

For the complete capture workflow, see reference/capture.md.

Do not use ffmpeg, moviepy, or local encoding tools when VideoDB supports the operation. The following are all handled server-side by VideoDB — trimming, combining clips, overlaying audio or music, adding subtitles, text/image overlays, transcoding, resolution changes, aspect-ratio conversion, resizing for platform requirements, transcription, and media generation. Only fall back to local tools for operations listed under Limitations in reference/editor.md (transitions, speed changes, crop/zoom, colour grading, volume mixing).

When to use what

| Problem | VideoDB solution |
| --- | --- |
| Platform rejects video aspect ratio or resolution | video.reframe() or conn.transcode() with VideoConfig |
| Need to resize video for Twitter/Instagram/TikTok | video.reframe(target="vertical") or target="square" |
| Need to change resolution (e.g. 1080p → 720p) | conn.transcode() with VideoConfig(resolution=720) |
| Need to overlay audio/music on video | AudioAsset on a Timeline |
| Need to add subtitles | video.add_subtitle() or CaptionAsset |
| Need to combine/trim clips | VideoAsset on a Timeline |
| Need to generate voiceover, music, or SFX | coll.generate_voice(), generate_music(), generate_sound_effect() |

Repository

https://github.com/video-db/skills

Maintained By: VideoDB

Limitations

  • Use this skill only when the task clearly matches the scope described above.
  • Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
  • Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.