Claude-skill-registry gemini-video

Invoke Google Gemini for video understanding and analysis using the Python google-genai SDK. Supports gemini-3-pro-preview and gemini-2.5-flash for video analysis, transcription, and content extraction.

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/gemini-video" ~/.claude/skills/majiayu000-claude-skill-registry-gemini-video && rm -rf "$T"
manifest: skills/data/gemini-video/SKILL.md

Gemini Video Skill

Invoke Google Gemini models for video understanding, analysis, transcription, and content extraction using the Python google-genai SDK.

Available Models

| Model ID | Description | Best For |
|---|---|---|
| gemini-3-pro-preview | Best multimodal understanding | Complex video analysis, detailed descriptions |
| gemini-2.5-pro | Advanced reasoning | Deep video analysis with reasoning |
| gemini-2.5-flash | Fast processing | Quick video summaries, high throughput |

Configuration

API Key: Set the GEMINI_API_KEY environment variable.
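
A small pre-flight check can make a missing key fail early with a clear message instead of an authentication error later. The sketch below is illustrative only: `require_api_key` is a hypothetical helper name, not part of the google-genai SDK.

```python
import os


def require_api_key(env=None, name="GEMINI_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not an SDK function.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the snippets")
    return key
```

It could then be used as `genai.Client(api_key=require_api_key())` in place of the direct `os.environ.get(...)` lookup.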

Usage

Video Analysis (Local File)

For local video files, use the File API to upload first:

python -c "
import os
import time
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

# Upload video file
video_file = client.files.upload(file='VIDEO_PATH')
print(f'Uploaded file: {video_file.name}')

# Wait for processing
while video_file.state.name == 'PROCESSING':
    print('Processing video...')
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

if video_file.state.name == 'FAILED':
    raise ValueError('Video processing failed')

# Analyze video
response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='Describe what happens in this video'),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"

Video Analysis (From URL)

python -c "
import os
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

# Publicly accessible video URL (the Gemini API docs describe YouTube URLs for this use)
video_url = 'VIDEO_URL_HERE'

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='Analyze this video and provide a detailed summary'),
            types.Part(file_data=types.FileData(file_uri=video_url, mime_type='video/mp4'))
        ])
    ]
)
print(response.text)
"

Video Transcription

python -c "
import os
import time
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

# Upload video
video_file = client.files.upload(file='VIDEO_PATH')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

# Transcribe
response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='Transcribe all spoken words in this video. Include timestamps if possible.'),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"

Workflow

When this skill is invoked:

  1. Determine the task type:

    • Video Summary: Generate overview of video content
    • Transcription: Extract spoken words
    • Visual Analysis: Describe visual elements, scenes, actions
    • Content Extraction: Pull specific information from video
    • Q&A: Answer questions about video content
  2. Prepare the video:

    • Local file → Upload via File API
    • Remote URL → Use directly (if publicly accessible)
    • Wait for processing if needed
  3. Select the appropriate model:

    • Complex analysis → gemini-3-pro-preview
    • Quick summaries → gemini-2.5-flash
  4. Execute and return results
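
Steps 2 and 3 above can be sketched as two small decisions. This is a minimal illustration, not the skill's actual logic: `needs_upload` and `choose_model` are hypothetical names, and treating any http(s) URL as "use directly" is an assumption that still depends on the URL being publicly accessible.

```python
from urllib.parse import urlparse


def needs_upload(source):
    """True for local paths, False for http(s) URLs that are referenced directly."""
    return urlparse(source).scheme not in ("http", "https")


def choose_model(quick=False):
    """Follow step 3: the fast model for quick summaries, otherwise the pro preview."""
    return "gemini-2.5-flash" if quick else "gemini-3-pro-preview"
```

For example, `needs_upload('meeting_recording.mp4')` selects the File API path, while `choose_model(quick=True)` picks the model suggested for high-throughput summaries.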

Example Invocations

Summarize Meeting Recording

python -c "
import os
import time
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='meeting_recording.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='''Summarize this meeting recording:
1. List all participants mentioned
2. Key discussion points
3. Action items and decisions made
4. Any deadlines mentioned'''),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"

Analyze Tutorial Video

python -c "
import os
import time
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='tutorial.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='''Analyze this tutorial video and create:
1. A step-by-step guide based on the content
2. Key concepts explained
3. Any tips or best practices mentioned
4. Prerequisites needed to follow along'''),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"

Extract Code from Coding Video

python -c "
import os
import time
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='coding_session.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='''Extract all code shown in this video:
1. Identify the programming language
2. Capture the complete code snippets
3. Note any explanations given for the code
4. List any libraries or dependencies mentioned'''),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"

Timestamp-Based Analysis

python -c "
import os
import time
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='presentation.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='''Create a timestamped outline of this video:
Format: [MM:SS] - Topic/Event
Include major topic changes, key points, and notable moments.'''),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"

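The timestamped outline requested in the prompt comes back as plain text, so it can help to turn it into structured data before saving or searching it. The sketch below assumes the model followed the `[MM:SS] - Topic/Event` format; `parse_outline` is a hypothetical helper, and lines that do not match are skipped rather than guessed at.

```python
import re

# Matches lines such as "[03:15] - Introduction"; an optional hour field is tolerated.
LINE_RE = re.compile(r"^\s*\[(?:(\d+):)?(\d{1,2}):(\d{2})\]\s*-\s*(.+?)\s*$")


def parse_outline(text):
    """Return (seconds, topic) pairs for every line in the [MM:SS] - Topic format."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip headings, blank lines, or free-form commentary
        hours, minutes, seconds, topic = match.groups()
        total = int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)
        entries.append((total, topic))
    return entries
```

Calling `parse_outline(response.text)` would yield a list that can be sorted, filtered by time range, or written out as JSON.
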
Q&A About Video

python -c "
import os
import time
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='lecture.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='YOUR_QUESTION_ABOUT_VIDEO_HERE'),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"

File Management

List Uploaded Files

python -c "
import os
from google import genai

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

for f in client.files.list():
    print(f'{f.name}: {f.state.name} ({f.mime_type})')
"

Delete Uploaded File

python -c "
import os
from google import genai

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))
client.files.delete(name='files/FILE_ID_HERE')
print('File deleted')
"

Supported Video Formats

  • MP4 (video/mp4)
  • MOV (video/quicktime)
  • AVI (video/x-msvideo)
  • FLV (video/x-flv)
  • MKV (video/x-matroska)
  • WebM (video/webm)
  • WMV (video/x-ms-wmv)
  • 3GPP (video/3gpp)
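
The format list above maps naturally onto a lookup table, which is handy when a MIME type has to be supplied by hand (as in the URL example). This is a sketch under assumptions: the file extensions are the conventional ones for each format (the list itself only gives MIME types, and `.3gp` for 3GPP is assumed), and `video_mime_type` is a hypothetical helper.

```python
from pathlib import Path

# MIME types copied from the list above; the extensions are assumed conventions.
VIDEO_MIME_TYPES = {
    ".mp4": "video/mp4",
    ".mov": "video/quicktime",
    ".avi": "video/x-msvideo",
    ".flv": "video/x-flv",
    ".mkv": "video/x-matroska",
    ".webm": "video/webm",
    ".wmv": "video/x-ms-wmv",
    ".3gp": "video/3gpp",
}


def video_mime_type(path):
    """Return the MIME type for a listed video extension, or None if unlisted."""
    return VIDEO_MIME_TYPES.get(Path(path).suffix.lower())
```

A `None` result is a cheap signal to stop before uploading a file the list does not cover.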

Video Limitations

  • Maximum file size: Check current API limits (typically 2GB)
  • Maximum duration: Varies by model (typically up to 1 hour)
  • Processing time: Longer videos take more time to process
  • Quota: Video analysis consumes more tokens than text
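
Checking the file size locally avoids starting an upload that is likely to be rejected. The sketch below takes its limit from the "typically 2GB" figure above, which is an assumption to confirm against the current API limits; `check_upload_size` is a hypothetical helper.

```python
import os

# Assumed limit based on the "typically 2GB" figure; verify against current API limits.
MAX_UPLOAD_BYTES = 2 * 1024**3


def check_upload_size(path, limit=MAX_UPLOAD_BYTES):
    """Return the file size in bytes, raising ValueError if it exceeds the limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(f"{path} is {size} bytes, above the {limit}-byte limit")
    return size
```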

Error Handling

Common errors and solutions:

  • PROCESSING state stuck: Video may be too large or corrupted
  • FAILED state: Unsupported format or processing error
  • File not found: Upload before analysis
  • Rate limiting: Implement retry with exponential backoff
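
Two of the cases above lend themselves to small helpers: a polling loop with a deadline (so a stuck PROCESSING state cannot hang forever) and retry with exponential backoff. Both are minimal sketches with hypothetical names. They take plain callables rather than SDK objects, so which exception type signals rate limiting is left to the caller, and the default timeouts and delays are arbitrary illustrative values.

```python
import random
import time


def wait_until_active(get_state, timeout_s=600.0, interval_s=5.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it stops returning 'PROCESSING'.

    get_state is a zero-argument callable returning the state name, e.g.
    lambda: client.files.get(name=video_file.name).state.name
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == "FAILED":
            raise ValueError("Video processing failed")
        if state != "PROCESSING":
            return state
        if clock() >= deadline:
            raise TimeoutError(f"Still PROCESSING after {timeout_s} seconds")
        sleep(interval_s)


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call call() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries
```

In practice `retry_on` would be narrowed to whatever exception the installed SDK version raises for rate limiting, so that genuine bugs are not retried.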

Notes

  • Upload local videos via the File API before analysis; inline data is only practical for small files, and remote videos can be passed by URL as shown under Video Analysis (From URL)
  • Processing time depends on video length and complexity
  • Uploaded files are automatically deleted after 48 hours
  • For very long videos, consider chunking or asking specific timestamp questions
  • Gemini 3 Pro provides the most detailed video analysis
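
For the long-video note above, one way to ask "specific timestamp questions" is to walk the video in fixed windows and issue one prompt per window. The sketch below only computes the windows; the 600-second chunk and 10-second overlap are arbitrary illustrative defaults, and `chunk_windows` and `mmss` are hypothetical helpers.

```python
def chunk_windows(duration_s, chunk_s=600, overlap_s=10):
    """Split [0, duration_s) into (start, end) windows in seconds, with a small overlap."""
    if chunk_s <= overlap_s:
        raise ValueError("chunk_s must be larger than overlap_s")
    windows = []
    start = 0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        windows.append((start, end))
        if end == duration_s:
            break
        start = end - overlap_s  # step back slightly so boundaries are not missed
    return windows


def mmss(seconds):
    """Format whole seconds as MM:SS for use in a prompt."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"
```

Each window can then feed a prompt such as `f'Describe only what happens between {mmss(a)} and {mmss(b)}'`, reusing the same uploaded file for every request.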

Tools to Use

  • Bash: Execute Python commands
  • Read: Load local video file paths
  • Write: Save transcriptions or analysis to files
  • Glob: Find video files in directories