Claude-skill-registry autocut-shorts
Main orchestration skill for automatic creation of short-form content (TikTok, YouTube Shorts, Instagram Reels) from long videos. Fully automated workflow: download video, transcribe, detect highlights (transcript + laughter + sentiment + scenes), trim segments, resize to 9:16 portrait, and add subtitles. Finds viral-worthy moments like OpusClip and Vizard.ai.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/autocut-shorts" ~/.claude/skills/majiayu000-claude-skill-registry-autocut-shorts && rm -rf "$T"
skills/data/autocut-shorts/SKILL.mdAutocut Shorts
This is the main orchestration skill that combines all other skills to automatically create short-form content from long videos.
What It Does
This skill automates the entire workflow:
- Download video from YouTube URL (if provided)
- Transcribe audio using Whisper or Gemini API
- Perform speaker diarization (pyannote or Gemini) - identifies who speaks when
- Detect highlights using combined analysis:
- Transcript analysis (hooks, viral phrases)
- Speaker dynamics (debates, interactions, overlapping speech)
- Laughter detection (humorous moments)
- Sentiment analysis (emotional peaks)
- Scene detection (cut points)
- Select best segments (15-60 seconds each)
- Trim video to highlight segments
- Resize to 9:16 portrait format (1080x1920)
- Add burned-in subtitles with speaker labels
- Export multiple clips ready for upload
When to Use
- User wants to create TikTok clips from a YouTube video
- Converting podcasts to short-form content
- Finding viral moments in vlogs or tutorials
- Repurposing gaming content for Shorts/Reels
- Batch processing multiple videos
Available Scripts
scripts/autocut.py
scripts/autocut.pyMain autocut workflow script.
Usage:
python skills/autocut-shorts/scripts/autocut.py <video_or_url> [options]
Options:
: Source type (file, youtube) - auto-detected--source
: Number of clips to generate (default: 5)--num-clips
: Minimum clip duration in seconds (default: 15)--min-duration
: Maximum clip duration in seconds (default: 60)--max-duration
: Target platform (tiktok, shorts, reels, facebook) - default: tiktok--platform
: Output directory (default:--output-dir
)./shorts/
: Transcription model (auto, whisper, gemini) - default: auto--transcription-model
: Speaker diarization (auto, pyannote, gemini, none) - default: auto--diarization-model
: HuggingFace token for pyannote (or use env var)--huggingface-token
: Extract clips only for specific speaker (SPEAKER_00, etc.)--focus-speaker
: Gemini API key (or use env var)--gemini-api-key
: Skip transcription if already have transcript--skip-transcribe
: Skip speaker diarization--skip-diarization
: Skip scene detection--skip-scenes
: Skip laughter detection--skip-laughter
: Skip sentiment analysis--skip-sentiment
: Use existing transcript file--transcript-path
: Subtitle style (tiktok, shorts, reels) - default: tiktok--style
Examples:
Basic autocut from file:
python skills/autocut-shorts/scripts/autocut.py video.mp4
Autocut from YouTube URL:
python skills/autocut-shorts/scripts/autocut.py "https://www.youtube.com/watch?v=VIDEO_ID"
Generate 10 clips for Instagram Reels:
python skills/autocut-shorts/scripts/autocut.py video.mp4 --num-clips 10 --platform reels --style reels
Use Gemini for transcription:
python skills/autocut-shorts/scripts/autocut.py video.mp4 --transcription-model gemini
Custom duration range:
python skills/autocut-shorts/scripts/autocut.py video.mp4 --min-duration 20 --max-duration 45
Use existing transcript:
python skills/autocut-shorts/scripts/autocut.py video.mp4 --transcript-path video.srt --skip-transcribe
scripts/quick_cut.py
scripts/quick_cut.pyQuick cut without full analysis (faster).
Usage:
python skills/autocut-shorts/scripts/quick_cut.py <video_path> [options]
Options:
: JSON file with timestamps to cut--timestamps
: Output directory--output-dir
: Target platform--platform
Example:
python skills/autocut-shorts/scripts/quick_cut.py video.mp4 --timestamps cuts.json
Workflow Steps
Step 1: Download (Optional)
If URL provided:
- Downloads from YouTube using yt-dlp
- Best quality MP4
- Saves to temp directory
Step 2: Transcribe
Extracts audio and transcribes:
- Auto mode: Chooses based on requirements
- Whisper: Local processing, good for privacy
- Gemini: Cloud processing, better quality + features
Step 3: Detect Highlights
Runs detection modules:
- Transcript analysis: Viral phrases, hooks, questions
- Laughter detection: Funny moments (if enabled)
- Sentiment analysis: Emotional peaks (if enabled)
- Scene detection: Visual cut points (if enabled)
Step 4: Score and Rank
Combines all signals:
Virality Score = 35% Transcript (hooks, viral content) + 25% Laughter (humor) + 25% Sentiment (emotion) + 15% Scenes (visual transitions)
Ranks all segments and selects top N.
Step 5: Trim
For each highlight:
- Extends 2-3 seconds before/after for context
- Trims using FFmpeg (stream copy for speed)
- Validates duration constraints
Step 6: Resize to Portrait
Converts to 9:16:
- Smart crop (focus on subjects)
- 1080x1920 resolution
- Maintains quality
Step 7: Add Subtitles
Burns in captions:
- Platform-specific styling
- White text with black outline
- Bottom position
- Readable size (24-28px)
Step 8: Export
Saves final clips:
- Named:
{original}_short_{index}.mp4 - Organized in output directory
- JSON report with metadata
Output Format
Directory Structure
shorts/ video_short_001.mp4 video_short_002.mp4 video_short_003.mp4 report.json
JSON Report
{ "success": true, "source": { "type": "youtube", "url": "https://youtube.com/watch?v=...", "title": "Video Title", "duration": 1200.5 }, "processing": { "transcription_model": "gemini-flash-lite-latest", "detection_methods": ["transcript", "laughter", "sentiment", "scenes"], "platform": "tiktok" }, "results": { "total_clips": 5, "clips": [ { "rank": 1, "filename": "video_short_001.mp4", "start_time": 45.2, "end_time": 72.5, "duration": 27.3, "virality_score": 0.92, "text": "This is the key moment...", "output_path": "shorts/video_short_001.mp4" } ], "total_duration": 135.5, "avg_virality_score": 0.78 }, "performance": { "total_time": 180.5, "transcription_time": 45.2, "analysis_time": 67.3, "processing_time": 68.0 } }
Platform Presets
TikTok
- Resolution: 1080x1920
- Duration: 15-60 seconds
- Subtitle style: TikTok
- Output naming:
_tiktok_{index}.mp4
YouTube Shorts
- Resolution: 1080x1920
- Duration: 15-60 seconds
- Subtitle style: Shorts
- Output naming:
_shorts_{index}.mp4
Instagram Reels
- Resolution: 1080x1920
- Duration: 15-90 seconds
- Subtitle style: Reels
- Output naming:
_reels_{index}.mp4
Facebook Reels
- Resolution: 1080x1920
- Duration: 15-90 seconds
- Subtitle style: Default
- Output naming:
_facebook_{index}.mp4
Viral Detection Algorithm
High-Value Signals
Transcript (35% weight):
- Viral phrases ("you won't believe", "this changes everything")
- Hooks ("let me tell you", "here's the secret")
- Questions and answers
- Story beats
Laughter (25% weight):
- Explicit laughter markers
- High-confidence laughter detection
- Audience reactions
Sentiment (25% weight):
- Positive emotions (excitement, joy)
- Surprise moments
- Negative emotions (controversy, drama)
- Emotional intensity > 0.7
Scenes (15% weight):
- Scene transitions
- Visual changes
- Topic shifts
Scoring
virality_score = ( transcript_score * 0.35 + laughter_score * 0.25 + sentiment_score * 0.25 + scene_score * 0.15 )
Premium Clips (0.8-1.0): Must include Excellent Clips (0.6-0.8): High priority Good Clips (0.4-0.6): Consider including
AI Agent Decision Guide
When to Use pyannote vs Gemini Diarization
Default Behavior (--diarization-model auto): The AI agent automatically selects based on context:
# Use pyannote when: if "podcast" in user_request or "interview" in user_request: return "pyannote" # Multi-speaker, needs accuracy if "accurate" in user_request or "precise" in user_request: return "pyannote" # User explicitly wants accuracy if "panel" in user_request or "debate" in user_request: return "pyannote" # Complex multi-speaker scenarios if "overlapping" in user_request or "talk over" in user_request: return "pyannote" # Overlapping speech detection if "privacy" in user_request or "offline" in user_request: return "pyannote" # Local processing needed # Use Gemini when: if "quick" in user_request or "fast" in user_request: return "gemini" # Speed priority if "single speaker" in user_request or "monologue" in user_request: return "gemini" # Simple scenario if "no diarization" in user_request or "skip speakers" in user_request: return "none" # User doesn't want speaker detection # Default for ambiguous cases: return "pyannote" if likely_multi_speaker(video) else "gemini"
Decision Matrix:
| Scenario | Recommended | Reason |
|---|---|---|
| Podcast with 2-3 hosts | pyannote | High accuracy for multi-speaker |
| Interview (host + guest) | pyannote | Precise speaker separation |
| Panel discussion | pyannote | Handles 4+ speakers well |
| Single speaker vlog | gemini | Faster, good enough |
| Gaming commentary | gemini | Usually 1-2 speakers |
| Tutorial video | gemini | Single speaker, speed matters |
| Debate/competitive | pyannote | Overlapping speech detection |
| Privacy-sensitive | pyannote | Local processing |
Examples by Use Case:
# Podcast - use pyannote automatically python skills/autocut-shorts/scripts/autocut.py podcast.mp4 # Interview - use pyannote for accuracy python skills/autocut-shorts/scripts/autocut.py interview.mp4 # Vlog - use gemini (single speaker, faster) python skills/autocut-shorts/scripts/autocut.py vlog.mp4 # Force pyannote explicitly python skills/autocut-shorts/scripts/autocut.py video.mp4 --diarization-model pyannote # Skip diarization for simple content python skills/autocut-shorts/scripts/autocut.py tutorial.mp4 --diarization-model none # Extract only host's segments python skills/autocut-shorts/scripts/autocut.py podcast.mp4 --focus-speaker SPEAKER_00
Smart Defaults
The agent automatically detects:
- Content type (podcast, vlog, tutorial, gaming, etc.)
- Likely speaker count based on audio patterns
- User priority (speed vs accuracy vs privacy)
- Available resources (GPU, internet, API keys)
Override any time: Users can always override with
--diarization-model flag.
Integration
This skill uses all other skills:
: Download from URLyoutube-downloader
: Transcribe audiovideo-transcriber
: Find visual cut pointsscene-detector
: Find funny momentslaughter-detector
: Find emotional peakssentiment-analyzer
: Combine all signalshighlight-scanner
: Cut segmentsvideo-trimmer
: Convert to 9:16portrait-resizer
: Add captionssubtitle-overlay
Common Use Cases
Podcast to Shorts
python skills/autocut-shorts/scripts/autocut.py podcast.mp4 --num-clips 10 --platform shorts
Vlog Highlights
python skills/autocut-shorts/scripts/autocut.py vlog.mp4 --num-clips 5 --platform tiktok
YouTube to TikTok
python skills/autocut-shorts/scripts/autocut.py "https://youtube.com/watch?v=..." --platform tiktok
Tutorial Clips
python skills/autocut-shorts/scripts/autocut.py tutorial.mp4 --min-duration 30 --max-duration 60
Performance
Processing Time (approximate):
- 1-minute video: ~30-60 seconds
- 10-minute video: ~3-5 minutes
- 30-minute video: ~8-12 minutes
- 1-hour video: ~15-25 minutes
Breakdown:
- Download: 5-30 seconds (depends on video)
- Transcription: 20-60 seconds
- Detection: 10-30 seconds per method
- Trimming: 1-5 seconds per clip
- Resizing: 5-10 seconds per clip
- Subtitles: 5-10 seconds per clip
Error Handling
- Download failure: Retries up to 3 times
- Transcription failure: Falls back to alternative model
- No highlights found: Returns error with suggestions
- Processing failure: Reports which step failed
- Partial success: Reports successful clips vs failed
Tips
- Use Gemini transcription for best highlight detection
- Provide more clips requested than needed (filter by score)
- 15-30 second clips perform best on TikTok
- 30-60 second clips work well for Shorts/Reels
- Keep 2-3 second buffer around highlights
- Test different platforms for best engagement
- Use transcript-only mode for faster processing
- Batch process multiple videos for efficiency
References
- OpusClip: https://www.opus.pro/
- Vizard.ai: https://vizard.ai/
- TikTok specs: https://www.tiktok.com/business/en-US/solutions/tiktok-specs
- YouTube Shorts specs: https://support.google.com/youtube/answer/10059066
- Instagram Reels specs: https://help.instagram.com/609412256345459