Claude-code-plugins-plus-skills Vertex AI Media Master
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/backups/skills-migration-20251108-070147/plugins/productivity/003-jeremy-vertex-ai-media-master/skills/vertex-media-master" ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-skills-vertex-ai-media-master-54d5ba && rm -rf "$T"
backups/skills-migration-20251108-070147/plugins/productivity/003-jeremy-vertex-ai-media-master/skills/vertex-media-master/SKILL.md
Vertex AI Media Master - Comprehensive Multimodal AI Operations
This Agent Skill covers Google Vertex AI multimodal capabilities for video, audio, image, and text processing, with a focus on marketing applications.
Core Capabilities
🎥 Video Processing (Gemini 2.0/2.5)
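- Video Understanding: Maximum video length depends on the model's context window. The 6-hour (low resolution) and 2-hour (default resolution) figures apply to a 2M-token window; a 1M-token window supports roughly half of that
- Context Window: Gemini 2.5 Pro accepts about 1M input tokens (the 2M figure belongs to Gemini 1.5 Pro), so confirm current limits in the video understanding documentation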
- Audio Track Processing: Automatic audio transcription from video
- Multi-video Analysis: Process multiple videos in single request
- Video Summarization: Extract key moments, scenes, and insights
- Marketing Use Cases:
  - Analyze competitor video ads
  - Extract highlights from long-form content
  - Generate video summaries for social media
  - Transcribe and caption video content
  - Identify brand mentions and product placements
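The length limits above follow from how video is tokenized. A minimal sketch of that arithmetic, assuming roughly 300 tokens per second of video at default resolution and 100 at low resolution (approximate figures from Google's published guidance; the function names are illustrative and not part of any SDK):

```python
# Approximate token cost per second of video, audio track included.
# These rates are assumptions based on published guidance; verify them for your model.
TOKENS_PER_SECOND = {"default": 300, "low": 100}


def estimate_video_tokens(duration_seconds: int, resolution: str = "default") -> int:
    """Estimate how many input tokens a video of the given length consumes."""
    return duration_seconds * TOKENS_PER_SECOND[resolution]


def max_video_seconds(context_tokens: int, resolution: str = "default",
                      reserved_tokens: int = 0) -> int:
    """Longest video that fits in the window after reserving prompt/output space."""
    usable = max(context_tokens - reserved_tokens, 0)
    return usable // TOKENS_PER_SECOND[resolution]


if __name__ == "__main__":
    for window in (1_000_000, 2_000_000):
        for resolution in ("default", "low"):
            hours = max_video_seconds(window, resolution) / 3600
            print(f"{window:>9,} tokens, {resolution:<7}: ~{hours:.1f} h")
```

With these rates a 2M-token window lands near the 2-hour and 6-hour figures, and a 1M-token window near half of that. Reserve some tokens for the prompt and the response when planning real requests.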
🎵 Audio Generation & Processing
- Lyria Model (2025): Native audio and music generation
- Speech-to-Text: Transcribe audio with speaker diarization
- Text-to-Speech: Generate natural voiceovers
- Music Composition: Background music for campaigns
- Audio Enhancement: Noise reduction and quality improvement
- Marketing Use Cases:
  - Generate podcast scripts and voiceovers
  - Create audio ads and radio spots
  - Produce background music for video campaigns
  - Transcribe customer interviews
  - Generate multilingual voiceovers
🖼️ Image Generation (Imagen 4 & Gemini 2.5 Flash Image)
- Imagen 4: Highest quality text-to-image generation
- Gemini 2.5 Flash Image: Interleaved image generation with text
- Style Transfer: Apply brand styles to generated images
- Product Visualization: Generate product mockups
- Campaign Assets: Create ad creatives and social media graphics
- Marketing Use Cases:
  - Generate personalized ad images (Adios solution)
  - Create social media graphics at scale
  - Produce product lifestyle images
  - Generate A/B test variations
  - Create branded campaign visuals
📢 Marketing Campaign Automation
- ViGenAiR: Convert long-form video ads to short formats automatically
- Adios: Generate personalized ad images tailored to audience context
- Campaign Asset Generation: Photos, soundtracks, voiceovers from prompts
- Content Pipeline: Email copy, blog posts, social media, PMax assets
- Catalog Enrichment: Multi-agent workflow for product onboarding
- Marketing Use Cases:
  - Automated campaign asset production
  - Personalized content at scale
  - Multi-channel content distribution
  - Product catalog enhancement
  - Visual merchandising automation
🔧 Technical Implementation
API Integration:
    import vertexai
    from vertexai.generative_models import GenerativeModel, Part
    from vertexai.preview.vision_models import ImageGenerationModel

    # Initialize Vertex AI
    vertexai.init(project="your-project", location="us-central1")

    # Gemini 2.5 Pro for video
    model = GenerativeModel("gemini-2.5-pro")

    # Reference the video in Cloud Storage (placeholder URI); a bare path string is read as text
    video_file = Part.from_uri("gs://your-bucket/your-video.mp4", mime_type="video/mp4")

    # Process video with audio (maximum length depends on the model's context window)
    response = model.generate_content([
        "Analyze this video and extract key marketing insights",
        video_file,
    ])
    print(response.text)

    # Imagen 4 for image generation
    # "imagen-4" is shorthand; replace it with the versioned model ID from the Imagen docs
    imagen = ImageGenerationModel.from_pretrained("imagen-4")
    images = imagen.generate_images(
        prompt="Professional product photo, studio lighting, white background",
        number_of_images=4,
    )
Gemini 2.5 Flash Image (Interleaved Generation):
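    # Generate images within text responses
    # Assumes GenerativeModel is imported and vertexai.init() has already been called
    model = GenerativeModel("gemini-2.5-flash-image")
    response = model.generate_content([
        "Create a 5-step recipe with images for each step"
    ])
    # Returns text and images interleaved as separate response parts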
Audio Generation (Lyria):
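    # Illustrative only: this module and class are not confirmed to exist in the
    # published Vertex AI SDK. Check the Lyria documentation for the supported call path.
    from vertexai.preview.audio_models import AudioGenerationModel

    lyria = AudioGenerationModel.from_pretrained("lyria")
    audio = lyria.generate_audio(
        prompt="Upbeat background music for product launch video, 30 seconds",
        duration=30,
    )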
📊 Marketing Workflow Automation
1. Multi-Channel Campaign Creation:
    # A single prompt returns a text plan covering every asset; the images and
    # audio themselves still come from separate image and audio generation calls
    campaign = model.generate_content([
        """Create a product launch campaign for [product]:
        - Hero image (1920x1080)
        - 3 social media graphics (1080x1080)
        - 30-second video script
        - Background music description
        - Email marketing copy
        - Instagram caption"""
    ])
2. Video Repurposing Pipeline:
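    # Long-form to short-form conversion (ViGenAiR approach)
    # Assumes Part is imported from vertexai.generative_models
    long_video = Part.from_uri("gs://bucket/original-ad-60s.mp4", mime_type="video/mp4")
    response = model.generate_content([
        "Extract 3 engaging 15-second clips from this video for TikTok/Reels",
        long_video,
    ])
    # The response describes the clips as text; cutting and reformatting them
    # requires a separate video tool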
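Because the model replies with text, the clip boundaries have to be parsed out before any cutting tool can use them. A minimal sketch, assuming the prompt asked for ranges written as MM:SS-MM:SS (that reply format, the sample reply, and the helper names are all hypothetical):

```python
import re

# Matches ranges such as "00:12-00:27" or "1:05 - 1:20" (assumed reply format).
CLIP_RANGE = re.compile(r"(\d{1,2}):(\d{2})\s*-\s*(\d{1,2}):(\d{2})")


def parse_clip_ranges(reply_text: str, max_length: int = 15) -> list[tuple[int, int]]:
    """Return (start, end) pairs in seconds, keeping clips of 1..max_length seconds."""
    clips = []
    for m1, s1, m2, s2 in CLIP_RANGE.findall(reply_text):
        start = int(m1) * 60 + int(s1)
        end = int(m2) * 60 + int(s2)
        if 0 < end - start <= max_length:
            clips.append((start, end))
    return clips


# Hand-written stand-in for a model reply; a real reply may be formatted differently.
sample_reply = """
1. Product reveal: 00:03-00:18
2. Customer reaction: 00:25 - 00:40
3. Full demo: 00:10-00:55 (too long, so it is dropped)
"""
clips = parse_clip_ranges(sample_reply)
print(clips)  # [(3, 18), (25, 40)]
```

Validating the length here catches replies that ignore the 15-second constraint before they reach the editing step.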
3. Personalized Ad Generation:
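    # Context-aware image generation (Adios approach)
    # audiences and product are placeholders supplied by the caller
    ad_images = []
    for audience in audiences:
        ad_image = imagen.generate_images(
            prompt=(
                f"Product ad for {product}, targeting {audience.demographics}, "
                f"{audience.style_preference}"
            ),
            aspect_ratio="16:9",
        )
        # Keep every result instead of overwriting the previous one
        ad_images.append(ad_image)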
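The loop above relies on audience objects that expose demographics and style_preference. A minimal sketch of such a structure and of the prompt it produces; the Audience class, the build_ad_prompt helper, and the sample values are assumptions, and no API call is made:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Audience:
    """Hypothetical audience segment; field names mirror the loop above."""
    name: str
    demographics: str
    style_preference: str


def build_ad_prompt(product: str, audience: Audience) -> str:
    """Compose the text prompt that would be passed to generate_images."""
    return (
        f"Product ad for {product}, targeting {audience.demographics}, "
        f"{audience.style_preference}"
    )


audiences = [
    Audience("students", "university students aged 18-24", "bright flat illustration"),
    Audience("executives", "senior managers aged 40-55", "muted editorial photography"),
]
prompts = {a.name: build_ad_prompt("noise-cancelling headphones", a) for a in audiences}
for name, prompt in prompts.items():
    print(f"{name}: {prompt}")
```

Building the prompts first, without calling the model, makes it easy to review the wording for each segment before spending quota on image generation.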
🎯 Best Practices for Jeremy
1. Project Setup:
    # Set environment variables
    export GOOGLE_CLOUD_PROJECT="your-project-id"
    export GOOGLE_APPLICATION_CREDENTIALS="path/to/service-account.json"

    # Install the SDK (the vertexai package ships inside google-cloud-aiplatform)
    pip install --upgrade google-cloud-aiplatform
2. Rate Limits & Quotas:
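- Gemini 2.5 Pro: 2M tokens/min (video processing); quotas vary by project and region, so treat this as a starting figure
- Imagen 4: 100 images/min; also project-dependent
- Check the actual limits and monitor usage in Cloud Console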
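Staying under a per-minute quota usually means throttling on the client side. A minimal sliding-window sketch, assuming a limit expressed as calls per 60 seconds; the class name and the injectable clock are illustrative choices, and the real quota should come from Cloud Console:

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_calls within any window_seconds span."""

    def __init__(self, max_calls: int, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock
        self._calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.window_seconds:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.window_seconds - (now - self._calls[0])

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the window, else False."""
        if self.wait_time() > 0:
            return False
        self._calls.append(self._clock())
        return True


limiter = SlidingWindowLimiter(max_calls=100)  # e.g. an assumed 100 requests/min quota
if limiter.try_acquire():
    pass  # safe to send one request here
else:
    time.sleep(limiter.wait_time())
```

The clock is injectable so the limiter can be tested without real waiting. Server-side quota errors can still occur, so pairing this with retries on quota-exceeded responses is a reasonable addition.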
3. Cost Optimization:
- Use Gemini 2.5 Flash for faster, cheaper operations
- Batch image generation requests
- Cache video embeddings for repeated analysis
- Use low-resolution video setting when appropriate
4. Security & Compliance:
- Keep API keys in Secret Manager, never in code
- Use service accounts with minimal permissions
- Enable VPC Service Controls to limit data exfiltration, and pin a regional location when data residency matters
- Log all API calls for audit trails
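One way to keep credentials out of code is to read every setting from the environment and fail fast when something is missing. A minimal sketch, assuming the variable names from the project setup step; the VertexSettings class, the load_settings helper, and the GOOGLE_CLOUD_LOCATION variable with its default are assumptions for illustration:

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_VARS = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_APPLICATION_CREDENTIALS")


@dataclass(frozen=True)
class VertexSettings:
    """Hypothetical settings container; holds a path to the key file, never the key."""
    project: str
    location: str
    credentials_path: str


def load_settings(env: Mapping[str, str] = os.environ) -> VertexSettings:
    """Build settings from environment variables; raise if a required one is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return VertexSettings(
        project=env["GOOGLE_CLOUD_PROJECT"],
        location=env.get("GOOGLE_CLOUD_LOCATION", "us-central1"),  # assumed default
        credentials_path=env["GOOGLE_APPLICATION_CREDENTIALS"],
    )


# Example with an explicit mapping, so the sketch runs without touching real secrets.
settings = load_settings({
    "GOOGLE_CLOUD_PROJECT": "your-project-id",
    "GOOGLE_APPLICATION_CREDENTIALS": "path/to/service-account.json",
})
print(settings.project, settings.location)
```

In production, calling load_settings() with no argument reads the process environment, which a deployment platform can populate from Secret Manager.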
🚀 Advanced Marketing Use Cases
1. Campaign Performance Analysis:
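    # Analyze competitor campaigns
    # Assumes Part is imported from vertexai.generative_models
    competitor_uris = ["gs://bucket/competitor1.mp4", "gs://bucket/competitor2.mp4"]
    # Wrap each URI in a Part; plain strings would be read as prompt text
    competitor_videos = [Part.from_uri(uri, mime_type="video/mp4") for uri in competitor_uris]
    analysis = model.generate_content([
        "Compare these competitor videos: themes, messaging, CTAs, production quality",
        *competitor_videos,
    ])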
2. Content Localization:
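    # Generate multilingual campaigns
    # campaign_brief (text) and hero_image (an image Part) are placeholders supplied by the caller
    localized = {}
    for lang in ["en", "es", "fr", "de", "ja"]:
        # Store each result by language instead of overwriting the previous one
        localized[lang] = model.generate_content([
            f"Translate and culturally adapt this campaign for the {lang} market:",
            campaign_brief,
            hero_image,
        ])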
3. A/B Test Generation:
    # Generate variations automatically
    # brand_guidelines is a placeholder string supplied by the caller
    variations = []
    for style in ["minimalist", "bold", "luxury", "playful"]:
        variation = imagen.generate_images(
            prompt=f"Product ad, {style} style, {brand_guidelines}",
            number_of_images=1,
        )
        variations.append(variation)
📚 Reference Documentation
Official Documentation:
- Vertex AI Multimodal: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/overview
- Gemini 2.5 Pro: https://cloud.google.com/vertex-ai/generative-ai/docs/models
- Imagen 4: https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview
- Video Understanding: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/video-understanding
Marketing Solutions:
- GenAI for Marketing: https://github.com/GoogleCloudPlatform/genai-for-marketing
- ViGenAiR (video repurposing)
- Adios (personalized ad images)
Pricing:
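- Gemini 2.5 Pro: about $1.25/1M input tokens and $10/1M output tokens for prompts up to 200K tokens, with higher rates above that (the $3.50 and $10.50 figures match older Gemini 1.5 Pro pricing)
- Imagen 4: $0.04/image
- Video processing: Included in Gemini token pricing
- Prices change, so confirm them on the Vertex AI pricing page before budgeting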
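Per-token and per-image prices combine into a campaign estimate with simple arithmetic. A minimal sketch in which every rate is passed in explicitly; the function name and the token rates in the example are placeholders, so substitute the current values from the pricing page:

```python
def estimate_cost(input_tokens: int, output_tokens: int, images: int,
                  input_price_per_million: float,
                  output_price_per_million: float,
                  price_per_image: float) -> float:
    """Return the estimated spend in USD for one batch of work."""
    token_cost = (
        (input_tokens / 1_000_000) * input_price_per_million
        + (output_tokens / 1_000_000) * output_price_per_million
    )
    return round(token_cost + images * price_per_image, 4)


# Example: one 10-minute video (~180,000 input tokens at an assumed 300 tokens/s),
# a 2,000-token summary, and 20 generated images.
cost = estimate_cost(
    input_tokens=180_000,
    output_tokens=2_000,
    images=20,
    input_price_per_million=2.00,    # placeholder rate, not a quoted price
    output_price_per_million=10.00,  # placeholder rate, not a quoted price
    price_per_image=0.04,            # per-image figure from the pricing list
)
print(f"Estimated cost: ${cost:.2f}")
```

Keeping the rates as arguments means the same function still works after a price change or for a cheaper model such as Gemini 2.5 Flash.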
When This Skill Activates
This skill automatically activates when you mention:
- Video processing, analysis, or understanding
- Audio generation, music composition, or voiceovers
- Image generation, ad creatives, or visual content
- Marketing campaigns, content automation, or asset production
- Gemini multimodal capabilities
- Vertex AI media operations
- Social media content, email marketing, or PMax campaigns
Integration with Other Tools
Google Cloud Services:
- Cloud Storage for media asset management
- BigQuery for campaign analytics
- Cloud Functions for automation triggers
- Vertex AI Pipelines for content workflows
Third-Party Integrations:
- Social media APIs (LinkedIn, Twitter, Instagram)
- Marketing automation platforms (HubSpot, Marketo)
- CMS integrations (WordPress, Contentful)
- DAM systems (Bynder, Cloudinary)
Success Metrics
Track These KPIs:
- Asset generation speed (baseline: 5 images/min)
- Content approval rate (target: >80%)
- Campaign personalization scale (target: 1000+ variants)
- Cost per asset (target: <$0.10/image)
- Time saved vs manual production (target: 90% reduction)
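These targets can be checked mechanically once each generated asset is logged. A minimal sketch, assuming a hypothetical AssetRecord with an approval flag and a cost; the default thresholds are the approval-rate and cost-per-asset targets listed above:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetRecord:
    """Hypothetical log entry for one generated asset."""
    asset_id: str
    approved: bool
    cost_usd: float


def kpi_report(records: list[AssetRecord],
               approval_target: float = 0.80,
               cost_target: float = 0.10) -> dict[str, object]:
    """Compute approval rate and cost per asset and compare them with the targets."""
    if not records:
        raise ValueError("at least one asset record is required")
    approval_rate = sum(r.approved for r in records) / len(records)
    cost_per_asset = sum(r.cost_usd for r in records) / len(records)
    return {
        "approval_rate": approval_rate,
        "cost_per_asset": cost_per_asset,
        "approval_ok": approval_rate > approval_target,  # target: >80%
        "cost_ok": cost_per_asset < cost_target,         # target: <$0.10
    }


# Sample data for illustration only.
records = [
    AssetRecord("img-001", True, 0.04),
    AssetRecord("img-002", True, 0.04),
    AssetRecord("img-003", False, 0.04),
    AssetRecord("img-004", True, 0.04),
]
report = kpi_report(records)
print(report)
```

In this sample three of four assets are approved, so the approval target is missed while the cost target is met.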
This skill makes Jeremy a Vertex AI multimodal expert with instant access to video processing, audio generation, image creation, and marketing automation capabilities.