Learn-skills.dev video-edit
Complete video editing toolkit - silence removal, auto-captions, vertical crop, YouTube clipping, 3D transitions, and social media compression. Use when user asks to edit video, remove silences, add captions/subtitles, crop to vertical/shorts, download YouTube clips, compress video, or create video teasers.
install
source · Clone the upstream repo
git clone https://github.com/NeverSight/learn-skills.dev
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/NeverSight/learn-skills.dev "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/skills-md/aiagentwithdhruv/skills/video-edit" ~/.claude/skills/neversight-learn-skills-dev-video-edit && rm -rf "$T"
manifest:
data/skills-md/aiagentwithdhruv/skills/video-edit/SKILL.md
Video Editing Toolkit
Goal
Complete video production pipeline: silence removal, auto-captions, vertical cropping, YouTube clipping, compression, and 3D transitions.
Scripts (7 total)
| Script | Purpose |
|---|---|
| `jump_cut_vad_singlepass.py` | Remove silences with neural VAD (Silero) |
| `auto_captions.py` | Generate + burn styled subtitles (Whisper + FFmpeg) |
| `vertical_crop.py` | Auto-crop 16:9 → 9:16 with face tracking |
| `youtube_clip.py` | Download YouTube + AI chapter clipping |
| `compress_video.py` | Social media compression with platform presets |
| `simple_video_edit.py` | Full pipeline: silence removal + transcription + metadata + upload |
| `insert_3d_transition.py` | 3D swivel teaser insertion |
Quick Start Recipes
Recipe 1: Full Social Media Pipeline
    # 1. Remove silences
    python3 ./scripts/jump_cut_vad_singlepass.py input.mp4 .tmp/edited.mp4

    # 2. Add captions
    python3 ./scripts/auto_captions.py .tmp/edited.mp4 .tmp/captioned.mp4 --word-level --max-words 2

    # 3. Crop to vertical for Shorts/Reels
    python3 ./scripts/vertical_crop.py .tmp/captioned.mp4 .tmp/vertical.mp4

    # 4. Compress for platform
    python3 ./scripts/compress_video.py .tmp/vertical.mp4 output.mp4 --preset instagram-reel
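If you would rather drive the four steps of Recipe 1 from Python, the sketch below builds the same command lines and runs them in order. The function names `social_pipeline` and `run_pipeline` are hypothetical helpers, not part of the toolkit; the script paths and flags are copied from the recipe.

```python
import subprocess
from typing import List


def social_pipeline(src: str, dst: str, preset: str = "instagram-reel",
                    tmp: str = ".tmp", max_words: int = 2) -> List[List[str]]:
    """Build the four Recipe 1 commands without running them."""
    edited, captioned, vertical = (
        f"{tmp}/{name}.mp4" for name in ("edited", "captioned", "vertical")
    )
    py = ["python3"]
    return [
        py + ["./scripts/jump_cut_vad_singlepass.py", src, edited],
        py + ["./scripts/auto_captions.py", edited, captioned,
              "--word-level", "--max-words", str(max_words)],
        py + ["./scripts/vertical_crop.py", captioned, vertical],
        py + ["./scripts/compress_video.py", vertical, dst, "--preset", preset],
    ]


def run_pipeline(commands: List[List[str]]) -> None:
    """Run each step in order; check=True stops at the first failing step."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Call `run_pipeline(social_pipeline("input.mp4", "output.mp4"))` from the skill directory so the relative `./scripts/` paths resolve.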
Recipe 2: YouTube → Shorts Pipeline
    # 1. Download + auto-clip YouTube video
    python3 ./scripts/youtube_clip.py "https://youtube.com/watch?v=..." --auto-clip --output-dir clips/

    # 2. Add captions to best clip
    python3 ./scripts/auto_captions.py clips/chapters/01_intro.mp4 .tmp/captioned.mp4 --word-level

    # 3. Crop to vertical
    python3 ./scripts/vertical_crop.py .tmp/captioned.mp4 .tmp/vertical.mp4

    # 4. Compress for YouTube Shorts
    python3 ./scripts/compress_video.py .tmp/vertical.mp4 short.mp4 --preset youtube-shorts
Recipe 3: Quick Edit + Upload
    # All-in-one: silence removal + transcription + metadata + Auphonic upload
    python3 ./scripts/simple_video_edit.py --video input.mp4 --title "My Video"
Script 1: VAD Silence Removal
File:
./scripts/jump_cut_vad_singlepass.py
How It Works
- Extracts audio as WAV (16kHz mono)
- Runs Silero VAD to detect speech segments
- Merges close segments, adds padding
- Uses FFmpeg trim+concat to join segments in single pass
- Hardware encodes with hevc_videotoolbox (H.265, 17Mbps, 30fps)
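The merge, padding, and single-pass concat steps above can be sketched as follows. This is a minimal illustration, not the script's actual code: `plan_keep_segments` and `build_filter_complex` are hypothetical names, the speech segments are assumed to come from Silero VAD as `(start, end)` pairs in seconds, and the defaults mirror the CLI defaults documented below.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def merge_close(segments: List[Segment], max_gap: float) -> List[Segment]:
    """Merge segments whose gap to the previous segment is at most max_gap."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def plan_keep_segments(speech: List[Segment], duration: float,
                       merge_gap: float = 0.3, padding_ms: int = 100,
                       min_speech: float = 0.25) -> List[Segment]:
    """Drop very short speech, merge close segments, then pad each one."""
    kept = [(s, e) for s, e in speech if e - s >= min_speech]
    merged = merge_close(kept, merge_gap)
    pad = padding_ms / 1000.0
    padded = [(max(0.0, s - pad), min(duration, e + pad)) for s, e in merged]
    # Padding can make neighbours overlap, so merge once more with no gap.
    return merge_close(padded, 0.0)


def build_filter_complex(segments: List[Segment]) -> str:
    """Build an FFmpeg filter graph that trims every segment and concatenates them."""
    parts, labels = [], []
    for i, (start, end) in enumerate(segments):
        parts.append(f"[0:v]trim=start={start:.3f}:end={end:.3f},setpts=PTS-STARTPTS[v{i}]")
        parts.append(f"[0:a]atrim=start={start:.3f}:end={end:.3f},asetpts=PTS-STARTPTS[a{i}]")
        labels.append(f"[v{i}][a{i}]")
    parts.append(f"{''.join(labels)}concat=n={len(segments)}:v=1:a=1[outv][outa]")
    return ";".join(parts)
```

The resulting string would be passed to `ffmpeg -filter_complex`, with `-map "[outv]" -map "[outa]"` selecting the joined streams, so the whole edit happens in one encode.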
CLI Arguments
| Argument | Default | Description |
|---|---|---|
| `--min-silence` | 0.5 | Min silence duration to cut (seconds) |
|  | 0.25 | Min speech duration to keep (seconds) |
| `--padding` | 100 | Padding around speech (ms) |
|  | 0.3 | Merge segments closer than this (seconds) |
|  | true | Always start from 0:00 |
Usage
    python3 ./scripts/jump_cut_vad_singlepass.py input.mp4 output.mp4
    python3 ./scripts/jump_cut_vad_singlepass.py input.mp4 output.mp4 --min-silence 1.0 --padding 200
Processing Time
~8 minutes for a 49-min 4K video
Script 2: Auto-Captions
File:
./scripts/auto_captions.py
How It Works
- Transcribes video using faster-whisper (word-level timestamps)
- Generates SRT subtitles (segment-level or word-level)
- Burns styled captions into video with FFmpeg
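As an illustration of the SRT step in word-level mode, here is a small sketch that groups word timestamps into short captions. It assumes the transcription has already produced `(text, start, end)` tuples in seconds; `srt_timestamp` and `words_to_srt` are hypothetical names, and the real script may group or style captions differently.

```python
from typing import List, Tuple

Word = Tuple[str, float, float]  # (text, start_seconds, end_seconds)


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT files."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Word], max_words: int = 3) -> str:
    """Emit one SRT cue per group of at most max_words consecutive words."""
    blocks = []
    for index, i in enumerate(range(0, len(words), max_words), start=1):
        chunk = words[i:i + max_words]
        text = " ".join(word[0] for word in chunk)
        start, end = chunk[0][1], chunk[-1][2]
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)
```

Smaller `max_words` values give the short, punchy cues described for word-level mode; each cue spans from the first word's start to the last word's end.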
CLI Arguments
| Argument | Default | Description |
|---|---|---|
| `--model` | base | Whisper model: tiny, base, small, medium, large-v3 |
|  | auto | Force language (en, es, hi, etc.) |
| `--word-level` | false | Short punchy captions (1-3 words at a time) |
| `--max-words` | 3 | Max words per caption in word-level mode |
|  | 22 | Caption font size (auto-scales for vertical) |
| `--font-color` | white | Text color: white, yellow, cyan, green, red, orange, pink |
|  | black | Outline color |
| `--position` | bottom | Caption position: bottom, top, middle |
| `--srt-only` | false | Only generate SRT, don't burn |
|  | auto | Custom SRT output path |
|  | true | Bold text |
Usage
    # Basic captions
    python3 ./scripts/auto_captions.py input.mp4 output.mp4

    # Word-level (CapCut/Submagic style)
    python3 ./scripts/auto_captions.py input.mp4 output.mp4 --word-level --max-words 2

    # Yellow captions at top
    python3 ./scripts/auto_captions.py input.mp4 output.mp4 --font-color yellow --position top

    # SRT only (no burn)
    python3 ./scripts/auto_captions.py input.mp4 --srt-only

    # High accuracy
    python3 ./scripts/auto_captions.py input.mp4 output.mp4 --model large-v3
Dependencies
pip install faster-whisper
Script 3: Vertical Crop
File:
./scripts/vertical_crop.py
How It Works
- Samples frames throughout the video
- Detects faces using OpenCV Haar cascade
- Smooths face positions to avoid jitter
- Crops 16:9 → 9:16 centered on the speaker
- For moving subjects: segments video into 2s chunks with per-segment tracking
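The smoothing and crop geometry above can be sketched in plain Python. Face detection itself is left out: the sketch assumes you already have one face-centre x coordinate per sampled frame, and the names `moving_average` and `crop_window` are hypothetical. It also assumes a landscape source, where the crop keeps the full height.

```python
from typing import List, Tuple


def moving_average(values: List[float], window: int) -> List[float]:
    """Centred moving average; the window shrinks near the start and end."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def crop_window(src_w: int, src_h: int, face_cx: float,
                ratio: Tuple[int, int] = (9, 16)) -> Tuple[int, int, int, int]:
    """Return (x, y, w, h) of a full-height crop centred on face_cx."""
    crop_w = int(src_h * ratio[0] / ratio[1])
    crop_w -= crop_w % 2           # keep the width even for yuv420 encoders
    crop_w = min(crop_w, src_w)
    x = int(round(face_cx - crop_w / 2))
    x = max(0, min(x, src_w - crop_w))  # clamp so the crop stays inside the frame
    return x, 0, crop_w, src_h
```

The tuple maps directly onto FFmpeg's `crop=w:h:x:y` filter. For moving subjects, the same function would be applied once per 2s chunk using that chunk's smoothed face position.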
CLI Arguments
| Argument | Default | Description |
|---|---|---|
| `--ratio` | 9:16 | Target aspect ratio (e.g., 9:16, 4:5, 1:1) |
| `--position` | auto | Crop position: auto (face tracking), left, center, right |
|  | 30 | Smoothing window in frames |
|  | 5 | Sample every N frames for detection |
Usage
    # Auto face-tracking crop
    python3 ./scripts/vertical_crop.py input.mp4 output.mp4

    # Square crop (Instagram post)
    python3 ./scripts/vertical_crop.py input.mp4 output.mp4 --ratio 1:1

    # 4:5 crop (Instagram feed)
    python3 ./scripts/vertical_crop.py input.mp4 output.mp4 --ratio 4:5

    # Center crop (no tracking)
    python3 ./scripts/vertical_crop.py input.mp4 output.mp4 --position center
Dependencies
pip install opencv-python numpy
Script 4: YouTube Clip
File:
./scripts/youtube_clip.py
How It Works
- Downloads video via yt-dlp (up to 4K)
- Extracts YouTube chapters if available
- Falls back to AI chapter generation (Whisper + Claude)
- Clips video into individual chapter files
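To make the final clipping step concrete, the sketch below turns a chapter list into per-chapter clip ranges and FFmpeg stream-copy arguments. It assumes chapters arrive as `(start_seconds, title)` pairs sorted by start time; `slugify`, `plan_clips`, and `ffmpeg_copy_args` are hypothetical names, and the `01_intro.mp4` naming pattern is inferred from Recipe 2 rather than confirmed by the script.

```python
import re
from typing import List, Optional, Tuple


def slugify(title: str) -> str:
    """Lowercase the title and replace runs of non-alphanumerics with underscores."""
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return slug or "chapter"


def plan_clips(chapters: List[Tuple[float, str]], duration: float,
               max_clips: Optional[int] = None) -> List[Tuple[float, float, str]]:
    """Each chapter runs until the next chapter starts; the last runs to the end."""
    clips = []
    for i, (start, title) in enumerate(chapters):
        end = chapters[i + 1][0] if i + 1 < len(chapters) else duration
        clips.append((start, end, f"{i + 1:02d}_{slugify(title)}.mp4"))
    return clips[:max_clips] if max_clips else clips


def ffmpeg_copy_args(src: str, start: float, end: float, dst: str) -> List[str]:
    """Stream-copy one range. No re-encode, so cuts snap to nearby keyframes."""
    return ["ffmpeg", "-y", "-ss", str(start), "-i", src,
            "-t", str(end - start), "-c", "copy", dst]
```

Stream copy is fast but imprecise at the cut points, which is presumably why the script offers a re-encode option for precise cuts.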
CLI Arguments
| Argument | Default | Description |
|---|---|---|
| `--download-only` | false | Only download, don't clip |
|  | 1080 | Max quality: 480, 720, 1080, 1440, 2160 |
| `--audio-only` | false | Download audio only (MP3) |
| `--start` | none | Manual clip start (seconds) |
| `--end` | none | Manual clip end (seconds) |
| `--auto-clip` | false | AI-powered chapter detection and clipping |
|  | true | Use YouTube chapters if available |
| `--max-clips` | none | Limit number of clips |
|  | base | Whisper model for transcription |
|  | false | Re-encode clips (precise cuts) |
| `--output-dir` | .tmp/clips | Output directory |
Usage
    # Download only
    python3 ./scripts/youtube_clip.py "https://youtube.com/watch?v=..." --download-only

    # Auto-clip into chapters
    python3 ./scripts/youtube_clip.py "https://youtube.com/watch?v=..." --auto-clip

    # Extract specific range
    python3 ./scripts/youtube_clip.py "https://youtube.com/watch?v=..." --start 60 --end 180 -o clip.mp4

    # Download audio only
    python3 ./scripts/youtube_clip.py "https://youtube.com/watch?v=..." --audio-only

    # Top 3 clips only
    python3 ./scripts/youtube_clip.py "https://youtube.com/watch?v=..." --auto-clip --max-clips 3
Dependencies
pip install yt-dlp faster-whisper anthropic
Script 5: Video Compressor
File:
./scripts/compress_video.py
Platform Presets
| Preset | Resolution | CRF | Max Duration | Max Size | Notes |
|---|---|---|---|---|---|
|  | 1080p | 18 | none | none | High quality |
|  | 1080x1920 | 20 | 60s | none | Vertical |
|  | 1080x1920 | 23 | 90s | none | Vertical |
|  | 1080x1080 | 23 | 60s | none | Square/vertical |
|  | 1080x1920 | 23 | 180s | none | Vertical |
|  | 1080p | 23 | 140s | 512MB | Auto-bitrate |
|  | 1080p | 23 | 600s | 200MB | Auto-bitrate |
|  | 720p | 28 | none | 50MB | For bots |
|  | 720p | 28 | 120s | 16MB | Aggressive |
|  | 480p | 30 | none | none | Quick sharing |
CLI Arguments
| Argument | Default | Description |
|---|---|---|
| `--preset` | none | Platform preset (see table above) |
| `--resolution` | 1080 | Max height in pixels |
| `--crf` | 23 | Quality (0=lossless, 51=worst) |
|  | 128k | Audio bitrate |
| `--target-size` | none | Target file size in MB |
|  | none | Max duration in seconds |
| `--list-presets` | false | Show all presets |
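When a target file size is given, a bitrate has to be derived from the clip length. The sketch below shows the usual arithmetic; it is an assumption about how such a calculation works in general, not the script's confirmed formula. The `video_bitrate_kbps` name and the 2% container overhead are illustrative.

```python
def video_bitrate_kbps(target_mb: float, duration_s: float,
                       audio_kbps: int = 128, overhead: float = 0.02) -> int:
    """Video bitrate (kbit/s) that fits target_mb after audio and container overhead."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    # 1 MB is taken as 10^6 bytes, i.e. 8000 kilobits.
    total_kbits = target_mb * 8 * 1000 * (1 - overhead)
    video = total_kbits / duration_s - audio_kbps
    if video <= 0:
        raise ValueError("target size is too small for this duration and audio bitrate")
    return int(video)
```

For example, a 120s clip with a 16MB cap leaves under 1 Mbit/s for video, which is why small-size presets also lower the resolution.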
Usage
    # Platform preset
    python3 ./scripts/compress_video.py input.mp4 output.mp4 --preset instagram-reel
    python3 ./scripts/compress_video.py input.mp4 output.mp4 --preset whatsapp

    # Target file size
    python3 ./scripts/compress_video.py input.mp4 output.mp4 --target-size 25

    # Custom
    python3 ./scripts/compress_video.py input.mp4 output.mp4 --resolution 720 --crf 28

    # List all presets
    python3 ./scripts/compress_video.py input.mp4 output.mp4 --list-presets
Script 6: Simple Video Edit (Full Pipeline)
File:
./scripts/simple_video_edit.py
How It Works
- FFmpeg silence detection + cutting
- Audio normalization (loudnorm)
- Whisper transcription
- Claude-generated YouTube metadata (summary + chapters)
- Auphonic upload → YouTube (private draft)
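The first step, FFmpeg silence detection and cutting, can be sketched with FFmpeg's `silencedetect` audio filter, which logs `silence_start` and `silence_end` lines to stderr. Whether the script uses exactly this filter and these parsing rules is an assumption; the helper names are hypothetical, and the defaults mirror the -35 dB threshold and 3.0s minimum silence listed below.

```python
import re
from typing import List, Tuple

SILENCE_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
SILENCE_END = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def silencedetect_args(src: str, noise_db: int = -35, min_silence: float = 3.0) -> List[str]:
    """FFmpeg command that only analyses audio and writes detections to stderr."""
    return ["ffmpeg", "-hide_banner", "-i", src,
            "-af", f"silencedetect=noise={noise_db}dB:d={min_silence}",
            "-f", "null", "-"]


def parse_silences(log: str) -> List[Tuple[float, float]]:
    """Pair up silence_start / silence_end values from the FFmpeg log."""
    starts = [float(value) for value in SILENCE_START.findall(log)]
    ends = [float(value) for value in SILENCE_END.findall(log)]
    return list(zip(starts, ends))


def keep_segments(silences: List[Tuple[float, float]], duration: float) -> List[Tuple[float, float]]:
    """Invert the silent ranges into the ranges that should be kept."""
    keep, cursor = [], 0.0
    for start, end in silences:
        if start > cursor:
            keep.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        keep.append((cursor, duration))
    return keep
```

Run the command with `subprocess.run(..., capture_output=True, text=True)` and pass its `stderr` to `parse_silences`.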
CLI Arguments
| Argument | Default | Description |
|---|---|---|
| `--video` | required | Input video path |
| `--title` | required | YouTube title |
|  | none | Thumbnail image path |
| `--no-upload` | false | Skip Auphonic upload |
|  | false | Skip audio normalization |
|  | false | Skip editing, just upload |
|  | -35 | Silence threshold in dB |
|  | 3.0 | Min silence duration (seconds) |
Usage
    # Full pipeline
    python3 ./scripts/simple_video_edit.py --video input.mp4 --title "My Video"

    # Local only
    python3 ./scripts/simple_video_edit.py --video input.mp4 --title "Test" --no-upload
Dependencies
    pip install anthropic faster-whisper requests python-dotenv
    # Requires: ANTHROPIC_API_KEY, AUPHONIC_API_KEY in .env
Script 7: 3D Swivel Teaser
File:
./scripts/insert_3d_transition.py
How It Works
- Extracts frames from later in video (default: 60s onwards)
- Creates 3D rotating "swivel" animation via Remotion
- Splits video: intro → transition → main content
- Re-encodes and concatenates with audio preserved
CLI Arguments
| Argument | Default | Description |
|---|---|---|
|  | 3 | Where to insert teaser (seconds) |
|  | 5 | Teaser duration (seconds) |
|  | 60 | Where to sample content from (seconds) |
|  | #2d3436 | Background color (hex) |
|  | none | Background image path |
Final Timeline
    [0-3s intro] [3-8s swivel teaser @ 100x] [8s onwards: edited content]
    Audio: Original audio plays continuously
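The timeline above follows from the three timing defaults. A small sketch, assuming "@ 100x" means the teaser plays the sampled content at 100 times normal speed (that reading is an assumption, as is the `teaser_timeline` name):

```python
from typing import Dict


def teaser_timeline(insert_at: float = 3.0, teaser_duration: float = 5.0,
                    content_start: float = 60.0, speed: float = 100.0) -> Dict[str, object]:
    """Compute output-side ranges and the span of source content the teaser covers."""
    teaser_end = insert_at + teaser_duration
    return {
        "intro": (0.0, insert_at),
        "teaser": (insert_at, teaser_end),
        "main_from": teaser_end,
        # At `speed`x playback, the teaser consumes duration * speed seconds of source.
        "teaser_source": (content_start, content_start + teaser_duration * speed),
    }
```

Under that assumption, the default 5s teaser would need about 500s of footage after the 60s mark, so shorter videos would need a shorter teaser or an earlier sample point.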
Dependencies
    pip install torch                  # For Silero VAD (jump_cut script)
    brew install ffmpeg node           # macOS
    cd video_effects && npm install    # For Remotion 3D rendering
All Dependencies (Install Once)
    # Core (required for all scripts)
    brew install ffmpeg

    # Silence removal
    pip install torch

    # Auto-captions + YouTube clip
    pip install faster-whisper

    # Vertical crop
    pip install opencv-python numpy

    # YouTube download
    pip install yt-dlp

    # AI features (chapters, metadata)
    pip install anthropic

    # Full pipeline (simple_video_edit)
    pip install requests python-dotenv

    # 3D transitions
    brew install node
    cd scripts/video_effects && npm install
Troubleshooting
| Issue | Solution |
|---|---|
| Cuts feel abrupt | Increase `--padding` in jump_cut |
| Too much cut | Increase `--min-silence` in jump_cut |
| Captions too small | Increase the caption font size in auto_captions |
| Vertical video detected | Font auto-scales 1.3x |
| Won't play in QuickTime | Ensure hvc1 codec tag |
| Face not detected | Try `--position center` in vertical_crop |
| YouTube download fails | Update yt-dlp: `pip install -U yt-dlp` |
| File too large | Use `--target-size` or `--preset` |
| Hardware encoder fails | Auto-falls back to software (libx264) |
Technical Details
- macOS: Hardware encoding (hevc_videotoolbox / h264_videotoolbox)
- Fallback: libx264/libx265 with CRF
- Audio: AAC 128-192kbps
- Uses `hvc1` codec tag for QuickTime compatibility
- All scripts support `--help` for full argument list
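The encoder fallback and the `hvc1` tag can be sketched as a function that picks FFmpeg arguments from a set of available encoders. This is an illustration only: `encoder_args` is a hypothetical name, the caller is assumed to know which encoders work on the machine, and the 17M bitrate and CRF 23 defaults are taken from values quoted elsewhere in this document.

```python
from typing import Iterable, List


def encoder_args(available: Iterable[str], crf: int = 23,
                 hw_bitrate: str = "17M", audio_bitrate: str = "128k") -> List[str]:
    """Prefer the macOS hardware HEVC encoder, then fall back to software encoders."""
    available = set(available)
    if "hevc_videotoolbox" in available:
        # Hardware encoders are bitrate-driven; hvc1 keeps QuickTime happy.
        video = ["-c:v", "hevc_videotoolbox", "-b:v", hw_bitrate, "-tag:v", "hvc1"]
    elif "libx265" in available:
        video = ["-c:v", "libx265", "-crf", str(crf), "-tag:v", "hvc1"]
    elif "libx264" in available:
        video = ["-c:v", "libx264", "-crf", str(crf)]
    else:
        raise RuntimeError("no usable video encoder found")
    return video + ["-c:a", "aac", "-b:a", audio_bitrate]
```

The `-tag:v hvc1` option matters only for H.265 output; H.264 files from libx264 play in QuickTime without it.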
Schema
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
|  | file_path | Yes | Input video file path |
|  | string | No | Pipeline recipe: full-social, youtube-shorts, quick-edit |
|  | string | No | Compression preset: youtube, instagram-reel, tiktok, whatsapp, etc. |
Outputs
| Name | Type | Description |
|---|---|---|
|  | file_path | Processed video file path |
Credentials
| Name | Source |
|---|---|
| `ANTHROPIC_API_KEY` | .env (for AI chapters) |
| `AUPHONIC_API_KEY` | .env (for upload) |
Composable With
Skills that chain well with this one:
pan-3d-transition, recreate-thumbnails
Cost
Free locally (FFmpeg + Silero VAD)