Claude-skill-registry gemini-video
Invoke Google Gemini for video understanding and analysis using the Python google-genai SDK. Supports gemini-3-pro-preview and gemini-2.5-flash for video analysis, transcription, and content extraction.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/gemini-video" ~/.claude/skills/majiayu000-claude-skill-registry-gemini-video && rm -rf "$T"
manifest: skills/data/gemini-video/SKILL.md
Gemini Video Skill
Invoke Google Gemini models for video understanding, analysis, transcription, and content extraction using the Python `google-genai` SDK.
Available Models
| Model ID | Description | Best For |
|---|---|---|
| `gemini-3-pro-preview` | Best multimodal understanding | Complex video analysis, detailed descriptions |
| | Advanced reasoning | Deep video analysis with reasoning |
| `gemini-2.5-flash` | Fast processing | Quick video summaries, high throughput |
Configuration
API Key: Use the `GEMINI_API_KEY` environment variable.
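A minimal sketch of a fail-fast check for the key, so a missing variable produces a clear message instead of an authentication error later. `require_api_key` is a hypothetical helper name, not part of the google-genai SDK; it only reads the environment.

```python
import os


def require_api_key(env=None, name="GEMINI_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of google-genai.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the examples")
    return key
```

It could be used as `genai.Client(api_key=require_api_key())` in place of the bare `os.environ.get(...)` lookup.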
Usage
Video Analysis (Local File)
For local video files, use the File API to upload first:
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

# Upload video file
video_file = client.files.upload(file='VIDEO_PATH')
print(f'Uploaded file: {video_file.name}')

# Wait for processing
while video_file.state.name == 'PROCESSING':
    print('Processing video...')
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

if video_file.state.name == 'FAILED':
    raise ValueError('Video processing failed')

# Analyze video
response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='Describe what happens in this video'),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
Video Analysis (From URL)
python -c "
import os

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

# For publicly accessible video URLs
video_url = 'VIDEO_URL_HERE'

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='Analyze this video and provide a detailed summary'),
            types.Part(file_data=types.FileData(file_uri=video_url, mime_type='video/mp4'))
        ])
    ]
)
print(response.text)
"
Video Transcription
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

# Upload video
video_file = client.files.upload(file='VIDEO_PATH')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

# Transcribe
response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='Transcribe all spoken words in this video. Include timestamps if possible.'),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
Workflow
When this skill is invoked:
1. Determine the task type:
   - Video Summary: Generate overview of video content
   - Transcription: Extract spoken words
   - Visual Analysis: Describe visual elements, scenes, actions
   - Content Extraction: Pull specific information from video
   - Q&A: Answer questions about video content
2. Prepare the video:
   - Local file → Upload via File API
   - Remote URL → Use directly (if publicly accessible)
   - Wait for processing if needed
3. Select the appropriate model:
   - Complex analysis → `gemini-3-pro-preview`
   - Quick summaries → `gemini-2.5-flash`
4. Execute and return results
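The preparation and model-selection steps of the workflow can be sketched as a small planning helper. This is an illustration only: `needs_upload` and `plan_request` are hypothetical names, and treating every non-http(s) source as a local file is an assumption.

```python
from urllib.parse import urlparse

# Model IDs taken from the workflow; the helper names below are hypothetical.
COMPLEX_MODEL = "gemini-3-pro-preview"
QUICK_MODEL = "gemini-2.5-flash"


def needs_upload(source: str) -> bool:
    """True for local paths, False for http(s) URLs that can be passed directly."""
    return urlparse(source).scheme not in ("http", "https")


def plan_request(source: str, quick: bool = False) -> dict:
    """Pick a model and decide whether the File API upload step is needed."""
    return {
        "model": QUICK_MODEL if quick else COMPLEX_MODEL,
        "upload_first": needs_upload(source),
    }
```

For example, `plan_request('meeting_recording.mp4')` selects the complex-analysis model and flags the file for upload, while an `https://` source with `quick=True` selects the fast model and skips the upload.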
Example Invocations
Summarize Meeting Recording
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='meeting_recording.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='''Summarize this meeting recording:
1. List all participants mentioned
2. Key discussion points
3. Action items and decisions made
4. Any deadlines mentioned'''),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
Analyze Tutorial Video
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='tutorial.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='''Analyze this tutorial video and create:
1. A step-by-step guide based on the content
2. Key concepts explained
3. Any tips or best practices mentioned
4. Prerequisites needed to follow along'''),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
Extract Code from Coding Video
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='coding_session.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='''Extract all code shown in this video:
1. Identify the programming language
2. Capture the complete code snippets
3. Note any explanations given for the code
4. List any libraries or dependencies mentioned'''),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
Timestamp-Based Analysis
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='presentation.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='''Create a timestamped outline of this video:
Format: [MM:SS] - Topic/Event
Include major topic changes, key points, and notable moments.'''),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
Q&A About Video
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='lecture.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='YOUR_QUESTION_ABOUT_VIDEO_HERE'),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
File Management
List Uploaded Files
python -c "
import os

from google import genai

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

# Print the name, processing state, and MIME type of each uploaded file
for f in client.files.list():
    print(f'{f.name}: {f.state.name} ({f.mime_type})')
"
Delete Uploaded File
python -c "
import os

from google import genai

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

# Delete a file by its File API name
client.files.delete(name='files/FILE_ID_HERE')
print('File deleted')
"
Supported Video Formats
- MP4 (`video/mp4`)
- MOV (`video/quicktime`)
- AVI (`video/x-msvideo`)
- FLV (`video/x-flv`)
- MKV (`video/x-matroska`)
- WebM (`video/webm`)
- WMV (`video/x-ms-wmv`)
- 3GPP (`video/3gpp`)
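A quick way to reject unsupported files before uploading is to look the extension up in a table built from the list above. This is a sketch: the MIME types come from the list, but the file extensions are an assumption about how those formats are usually named, and `video_mime_type` is a hypothetical helper.

```python
from pathlib import Path

# MIME types copied from the supported-formats list; the extensions are an
# assumption about common naming, not something the API guarantees.
VIDEO_MIME_TYPES = {
    ".mp4": "video/mp4",
    ".mov": "video/quicktime",
    ".avi": "video/x-msvideo",
    ".flv": "video/x-flv",
    ".mkv": "video/x-matroska",
    ".webm": "video/webm",
    ".wmv": "video/x-ms-wmv",
    ".3gp": "video/3gpp",
}


def video_mime_type(path):
    """Return the MIME type for a supported video path, or None if unknown."""
    return VIDEO_MIME_TYPES.get(Path(path).suffix.lower())
```

A `None` result is a reasonable signal to skip the upload and report an unsupported format instead of waiting for a `FAILED` state.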
Video Limitations
- Maximum file size: Check current API limits (typically 2GB)
- Maximum duration: Varies by model (typically up to 1 hour)
- Processing time: Longer videos take more time to process
- Quota: Video analysis consumes more tokens than text
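The size limit can be checked locally before spending time on an upload. The sketch below assumes the "typically 2GB" figure above as its default ceiling; the constant and the `check_size` helper are hypothetical, and the real limit should be confirmed against the current API documentation.

```python
import os

# Assumed ceiling based on the "typically 2GB" figure; verify current limits.
ASSUMED_MAX_BYTES = 2 * 1024**3


def check_size(path, max_bytes=ASSUMED_MAX_BYTES):
    """Return the file size in bytes, raising ValueError if it exceeds max_bytes."""
    size = os.path.getsize(path)
    if size > max_bytes:
        raise ValueError(f"{path} is {size} bytes, over the {max_bytes}-byte limit")
    return size
```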
Error Handling
Common errors and solutions:
- PROCESSING state stuck: Video may be too large or corrupted
- FAILED state: Unsupported format or processing error
- File not found: Upload before analysis
- Rate limiting: Implement retry with exponential backoff
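Two of the cases above, rate limiting and a stuck `PROCESSING` state, can be handled with small generic helpers. The sketch below is an assumption-laden illustration: `retry_with_backoff` and `wait_until_processed` are hypothetical names, the default delays and timeout are arbitrary, and `retry_on` defaults to `Exception` only because the source does not name a specific SDK error class; it should be narrowed in real use.

```python
import random
import time


def retry_with_backoff(call, retries=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call `call()` until it succeeds, doubling the wait after each failure."""
    for attempt in range(retries):
        try:
            return call()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so parallel callers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))


def wait_until_processed(get_state, timeout=600.0, interval=5.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll `get_state()` until it stops returning 'PROCESSING' or time runs out."""
    deadline = clock() + timeout
    while True:
        state = get_state()
        if state != "PROCESSING":
            return state
        if clock() >= deadline:
            raise TimeoutError(f"still PROCESSING after {timeout} seconds")
        sleep(interval)
```

With the client from the usage examples, the polling loop could become `wait_until_processed(lambda: client.files.get(name=video_file.name).state.name)`, and a `generate_content` call could be wrapped in a `lambda` and passed to `retry_with_backoff`.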
Notes
- Local videos should be uploaded via the File API before analysis; this skill does not send video as inline data the way images can be sent
- Processing time depends on video length and complexity
- Uploaded files are automatically deleted after 48 hours
- For very long videos, consider chunking or asking specific timestamp questions
- `gemini-3-pro-preview` provides the most detailed video analysis of the models listed above
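For the long-video note above, one option is to ask one question per time window rather than one question about the whole file. The sketch below only generates the window labels in the `[MM:SS]` style used by the timestamped-outline example; `mmss`, `timestamp_windows`, and the 10-minute default window are assumptions for illustration.

```python
def mmss(seconds):
    """Format a number of seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def timestamp_windows(duration_s, window_s=600):
    """Split a duration into consecutive (start, end) labels in MM:SS form."""
    windows = []
    start = 0
    while start < duration_s:
        end = min(start + window_s, duration_s)
        windows.append((mmss(start), mmss(end)))
        start = end
    return windows
```

Each pair can then be dropped into a prompt such as `f'Describe what happens between {start} and {end}'` and sent against the same uploaded file.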
Tools to Use
- Bash: Execute Python commands
- Read: Load local video file paths
- Write: Save transcriptions or analysis to files
- Glob: Find video files in directories