Openclip video-clip-extractor

Processes videos to identify engaging moments, generate transcripts, and create highlight clips with artistic titles and custom cover images. Use when user needs to: extract highlights from long videos or livestreams, clip or cut best moments from videos, cut video highlights, process Bilibili/YouTube URLs or local video files, generate transcripts via Whisper, analyze content for engaging moments, create short-form clips with styled titles and covers, adjust cover text position and colors, find and export memorable scenes from recordings, burn subtitles into clips (with optional translation), guide clip selection with user intent, or identify speakers in multi-person conversations.

Install

Source · clone the upstream repo:

git clone https://github.com/linzzzzzz/openclip

Claude Code · install into ~/.claude/skills/:

T=$(mktemp -d) && git clone --depth=1 https://github.com/linzzzzzz/openclip "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.cursor/skills/video-clip-extractor" ~/.claude/skills/linzzzzzz-openclip-video-clip-extractor-d53d38 && rm -rf "$T"

Manifest: .cursor/skills/video-clip-extractor/SKILL.md

Source content

Video Clip Extractor Skill

Run the video orchestrator to process videos and extract engaging highlights.

When Triggered

  1. Get the source — if the user didn't provide a video URL or file path, ask for it.
  2. Clarify intent (optional) — if the user wants clips focused on a specific topic, capture it for --user-intent. If unclear, ask: "Any specific topic or moments to focus on? (e.g. 'funny moments', 'key arguments')"
  3. Check environment — does video_orchestrator.py exist in the current directory? If yes, run directly. Otherwise use the global install at ~/.local/share/openclip.
  4. Verify prerequisites — check that ffmpeg is installed and at least one API key is set; warn if missing before running. (A sketch of steps 3–4 follows this list.)
  5. Run the command and stream output to the user.
  6. Report results — after completion, list the generated clips with timestamps and titles.
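A minimal shell sketch of steps 3 and 4, assuming the paths and environment variable names used in this document (illustrative, not part of the skill):

```bash
# Step 3: resolve the execution context.
if [ -f video_orchestrator.py ]; then
  RUN_DIR=.                               # inside the openclip repo
elif [ -d "$HOME/.local/share/openclip" ]; then
  RUN_DIR="$HOME/.local/share/openclip"   # global install
else
  echo "openclip not found; see Setup below" >&2; exit 1
fi

# Step 4: warn about missing prerequisites before running.
command -v ffmpeg >/dev/null || echo "warning: ffmpeg is not installed" >&2
```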

Setup (first use only)

Before running, determine the execution context:

  1. Inside openclip repo — if video_orchestrator.py exists in the current directory, skip setup and run directly.
  2. Global install — if ~/.local/share/openclip does not exist, run these steps:

Prerequisites: git and uv must be installed.

  • Install uv if missing (macOS: brew install uv · Linux/Windows: pip install uv)
git clone https://github.com/linzzzzzz/openclip.git ~/.local/share/openclip
cd ~/.local/share/openclip && uv sync

To update openclip later:

git -C ~/.local/share/openclip pull && cd ~/.local/share/openclip && uv sync

Execution

If inside the openclip repo (current directory contains video_orchestrator.py):

uv run python video_orchestrator.py [options] <source>

If running globally (from any other directory):

cd ~/.local/share/openclip && uv run python video_orchestrator.py -o "$OLDPWD/processed_videos" [options] <source>

$OLDPWD captures the user's original directory so clips are saved there, not inside the openclip install.

Where <source> is a video URL (Bilibili/YouTube) or a local file path (MP4, WebM, AVI, MOV, MKV).

For local files with existing subtitles, place the .srt file in the same directory with the same base filename (e.g. video.mp4 + video.srt).
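For example, a local-file run from inside the repo (file names are placeholders; video.srt is picked up automatically if present):

```bash
uv run python video_orchestrator.py --max-clips 3 --language en ./video.mp4
```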

Preflight Checklist

  • Inside openclip repo: run from the repo root so relative paths (e.g. references/, prompts/) resolve correctly
  • ffmpeg must be installed (required for all clip generation; a quick check is sketched after this list):
    • macOS: brew install ffmpeg
    • Ubuntu: sudo apt install ffmpeg
    • Windows: download from ffmpeg.org
    • If using --burn-subtitles: needs ffmpeg with libass (see README for details)
  • Set one API key:
    • QWEN_API_KEY (default provider: qwen), or
    • OPENROUTER_API_KEY (if --llm-provider openrouter), or
    • GLM_API_KEY (if --llm-provider glm), or
    • MINIMAX_API_KEY (if --llm-provider minimax)
  • If using --speaker-references: run uv sync --extra speakers and set HUGGINGFACE_TOKEN
CLI Reference

Required

| Argument | Description |
| --- | --- |
| source | Video URL or local file path |

Optional

| Flag | Default | Description |
| --- | --- | --- |
| -o, --output <dir> | processed_videos | Output directory |
| --max-clips <n> | 5 | Maximum number of highlight clips |
| --browser <browser> | firefox | Browser for cookies: chrome, firefox, edge, safari |
| --title-style <style> | fire_flame | Title style: gradient_3d, neon_glow, metallic_gold, rainbow_3d, crystal_ice, fire_flame, metallic_silver, glowing_plasma, stone_carved, glass_transparent |
| --title-font-size <size> | medium | Font size preset for artistic titles: small (30px), medium (40px), large (50px), xlarge (60px) |
| --cover-text-location <loc> | center | Cover text position: top, upper_middle, bottom, center |
| --cover-fill-color <color> | yellow | Cover text fill color: yellow, red, white, cyan, green, orange, pink, purple, gold, silver |
| --cover-outline-color <color> | black | Cover text outline color: yellow, red, white, cyan, green, orange, pink, purple, gold, silver, black |
| --language <lang> | zh | Output language: zh (Chinese), en (English) |
| --llm-provider <provider> | qwen | LLM provider: qwen, openrouter, glm, minimax |
| --user-intent <text> | (none) | Free-text focus description (e.g. "moments about AI risks"); steers LLM clip selection toward this topic |
| --subtitle-translation <lang> | (none) | Translate subtitles to this language before burning (e.g. "Simplified Chinese"); requires --burn-subtitles and QWEN_API_KEY |
| --speaker-references <dir> | (none) | Directory of reference WAV files (one per speaker, filename = speaker name) for speaker diarization; requires uv sync --extra speakers and HUGGINGFACE_TOKEN |
| -f, --filename <template> | (none) | yt-dlp template: %(title)s, %(uploader)s, %(id)s, etc. |
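A representative invocation combining several of these options (the URL is a placeholder):

```bash
uv run python video_orchestrator.py \
  --max-clips 3 --language en --llm-provider openrouter \
  --title-style neon_glow --cover-fill-color cyan \
  "https://www.youtube.com/watch?v=VIDEO_ID"
```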

Flags

| Flag | Description |
| --- | --- |
| --force-whisper | Ignore platform subtitles, use Whisper |
| --skip-download | Use existing downloaded video |
| --skip-transcript | Skip transcript generation, use existing transcript file |
| --skip-analysis | Skip analysis, use existing analysis file for clip generation |
| --use-background | Include background info (streamer names/nicknames) in analysis prompts |
| --skip-clips | Skip clip generation |
| --add-titles | Add artistic titles to clips (disabled by default) |
| --skip-cover | Skip cover image generation |
| --burn-subtitles | Burn SRT subtitles into video; output goes to clips_post_processed/; requires ffmpeg with libass |
| -v, --verbose | Enable verbose logging |
| --debug | Export full prompts sent to the LLM (saved to debug_prompts/) |

Custom Filename Template (-f)

Uses yt-dlp template syntax. Common variables: %(title)s, %(uploader)s, %(upload_date)s, %(id)s, %(ext)s, %(duration)s.

Example:

-f "%(upload_date)s_%(title)s.%(ext)s"

Environment Variables

Set the appropriate API key for the chosen --llm-provider:

  • QWEN_API_KEY — for --llm-provider qwen
  • OPENROUTER_API_KEY — for --llm-provider openrouter
  • GLM_API_KEY — for --llm-provider glm
  • MINIMAX_API_KEY — for --llm-provider minimax
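For example, for the default qwen provider (the key value is a placeholder):

```bash
export QWEN_API_KEY="sk-..."   # or OPENROUTER_API_KEY / GLM_API_KEY / MINIMAX_API_KEY
```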

Workflow

The orchestrator runs this pipeline automatically:

  1. Download — fetch video + platform subtitles (Bilibili/YouTube) or accept local file
  2. Split — divide videos longer than the built-in threshold into segments for parallel analysis
  3. Transcribe — use platform subtitles or Whisper AI; --force-whisper overrides
  4. Analyze — LLM scores transcript segments for engagement; --user-intent steers selection
  5. Generate clips — ffmpeg cuts the video at identified timestamps
  6. Add titles (opt-in) — render artistic text overlays using --title-style
  7. Generate covers — create a thumbnail image for each clip

Use --skip-clips and --skip-cover to skip specific steps. Use --add-titles to enable artistic titles. Use --skip-download and --skip-analysis to resume from intermediate results, as in the example below.
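For instance, to re-cut clips from an earlier run without re-downloading or re-analyzing (assumes the intermediate files from the first run are still in the output directory, and <source> is the same source as before):

```bash
uv run python video_orchestrator.py --skip-download --skip-analysis --add-titles <source>
```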

Output Example

After a successful run, report results like this:

✅ Processing complete — 5 clips generated
📁 processed_videos/video_name/clips/

  clip_01.mp4  [00:12:34 – 00:15:20]  "Title of the moment"
  clip_02.mp4  [00:28:45 – 00:31:10]  "Another highlight"
  clip_03.mp4  [00:45:00 – 00:47:30]  "Key discussion point"
  ...

Cover images: clips/*.jpg

Output Structure

processed_videos/{video_name}/
├── downloads/              # Original video, subtitles, and metadata (URL sources)
├── local_videos/           # Copied video and subtitles (local file sources)
├── splits/                 # Split parts and AI analysis results
├── clips/                  # Generated highlight clips + cover images
└── clips_post_processed/   # Post-processed clips when using --add-titles and/or --burn-subtitles

Option Selection Guide

Whisper model — Default base works for clear audio. Use small for background noise, multiple speakers, or accents. Use turbo for speed + accuracy. Use large/medium only when transcript quality is critical.

--force-whisper — Use when platform subtitles are auto-generated (often inaccurate), when "no engaging moments found" occurs (better transcripts improve analysis), or for non-native-language content where platform captions are unreliable.

--use-background — Use for content featuring recurring personalities (streamers, hosts) where nicknames and community references matter. Reads from prompts/background/background.md.

Multi-part analysis — Videos that get split are analyzed per-segment, then aggregated to the top 5 engaging moments across all segments.

--user-intent — Steers LLM clip selection at both the per-segment and cross-segment aggregation stages. Useful when you want to find clips about a specific topic (e.g. "AI safety predictions", "funny moments").
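For example (the URL is a placeholder):

```bash
uv run python video_orchestrator.py --user-intent "AI safety predictions" \
  "https://www.youtube.com/watch?v=VIDEO_ID"
```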

--burn-subtitles — Hardcodes the SRT subtitles into the video frame. Use when you want subtitles always visible (e.g. for social media). Combine with --subtitle-translation to add a translated subtitle track below the original.
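For example, to burn bilingual subtitles into a local video (file name is a placeholder; requires QWEN_API_KEY per the CLI reference):

```bash
uv run python video_orchestrator.py --burn-subtitles \
  --subtitle-translation "Simplified Chinese" ./video.mp4
```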

--speaker-references — Enables speaker diarization for interviews/podcasts. Provide a directory of 10–30 second clean WAV clips (one per speaker), named after the speaker (e.g. references/Host.wav).
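A sketch of the full setup (file names and the token value are placeholders):

```bash
# references/Host.wav, references/Guest.wav: 10-30 s of each speaker alone
uv sync --extra speakers            # install the diarization extras
export HUGGINGFACE_TOKEN="hf_..."   # placeholder token value
uv run python video_orchestrator.py --speaker-references ./references ./interview.mp4
```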

Troubleshooting

| Error | Fix |
| --- | --- |
| "ffmpeg not found" / clip generation fails silently | Install ffmpeg: brew install ffmpeg (macOS) or sudo apt install ffmpeg (Ubuntu) |
| "No API key provided" | Set the QWEN_API_KEY, OPENROUTER_API_KEY, GLM_API_KEY, or MINIMAX_API_KEY env var |
| "Video download failed" | Check network/URL; try a different --browser; or use a local file |
| "Transcript generation failed" | Try --force-whisper or check audio quality |
| "No engaging moments found" | Try --force-whisper for better transcript accuracy |
| "Clip generation failed" | Ensure analysis completed; check for an existing analysis file |