videocaptioner
Process video subtitles — transcribe speech, optimize/translate text, burn styled subtitles into video. Use when you need to add subtitles to a video, transcribe audio, translate subtitles, or customize subtitle styles.
install
source · Clone the upstream repo
git clone https://github.com/WEIFENG2333/VideoCaptioner
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/WEIFENG2333/VideoCaptioner "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills" ~/.claude/skills/weifeng2333-videocaptioner-videocaptioner && rm -rf "$T"
manifest:
skills/SKILL.mdsource content
VideoCaptioner CLI
AI-powered video captioning: transcribe speech → optimize subtitles → translate → burn into video with beautiful styles.
When to use
- User wants to add subtitles to a video
- User wants to transcribe audio/video to text
- User wants to translate subtitles to another language
- User wants to customize subtitle appearance (colors, fonts, rounded backgrounds)
- User wants to download and subtitle online videos
Before you start
Always run
first to check the latest options and defaults before executing a command. The examples below are common patterns, but --help is the source of truth.videocaptioner <command> --help
- Install:
pip install videocaptioner - FFmpeg required for video synthesis (
on macOS)brew install ffmpeg - Free (no API key): transcription (bijian/jianying), translation (Bing/Google)
- Requires LLM API key: subtitle optimization, subtitle re-segmentation, LLM translation. Set via
env var orOPENAI_API_KEY
flag--api-key
Common scenarios
1. Give a Chinese video English subtitles (one command, all free)
videocaptioner process video.mp4 --asr bijian --translator bing --target-language en \ --subtitle-mode hard --quality high -o output.mp4
2. Transcribe a video to SRT (free)
videocaptioner transcribe video.mp4 --asr bijian -o output.srt # Output as JSON format to a directory videocaptioner transcribe video.mp4 --asr bijian --format json -o ./subtitles/
3. Translate existing subtitles
# Free Bing → English, bilingual output with translation above original videocaptioner subtitle input.srt --translator bing --target-language en --layout target-above -o translated.srt # Free Google → Japanese, translation only (discard original text) videocaptioner subtitle input.srt --translator google --target-language ja --no-optimize --layout target-only -o output_ja.srt # High quality LLM translation with reflective mode videocaptioner subtitle input.srt --translator llm --target-language en --reflect \ --api-key $OPENAI_API_KEY -o output_en.srt
4. Full pipeline with beautiful styled subtitles
# Anime-style subtitles (warm color + orange outline), high quality video videocaptioner process video.mp4 --asr bijian --translator bing --target-language ja \ --subtitle-mode hard --style anime --quality high -o output_ja.mp4 # Modern rounded background subtitles videocaptioner process video.mp4 --asr bijian --translator google --target-language ko \ --subtitle-mode hard --render-mode rounded -o output_ko.mp4 # Custom colors: white text with red outline, ultra quality videocaptioner process video.mp4 --asr bijian --translator bing --target-language en \ --subtitle-mode hard --quality ultra \ --style-override '{"outline_color": "#ff0000", "primary_color": "#ffffff"}' -o output_en.mp4
5. Subtitle only, output as ASS format (no video synthesis)
videocaptioner process video.mp4 --asr bijian --translator bing --target-language en \ --format ass --no-synthesize -o ./output/
6. Step-by-step control (transcribe → translate → synthesize separately)
# Step 1: Transcribe videocaptioner transcribe video.mp4 --asr bijian -o video.srt # Step 2: Translate (bilingual, original text above translation) videocaptioner subtitle video.srt --translator bing --target-language en --layout source-above -o video_en.srt # Step 3: Burn into video with rounded background, high quality videocaptioner synthesize video.mp4 -s video_en.srt --subtitle-mode hard \ --render-mode rounded --quality high -o video_with_subs.mp4
7. Process audio file (auto-skips video synthesis)
videocaptioner process podcast.mp3 --asr bijian --translator bing --target-language en -o ./output/
8. Transcribe other languages (whisper-api)
videocaptioner transcribe french_video.mp4 --asr whisper-api \ --whisper-api-key $OPENAI_API_KEY --language fr -o french.srt
9. Only optimize subtitles with LLM (fix ASR errors, no translation)
videocaptioner subtitle raw_subtitle.srt --no-translate --api-key $OPENAI_API_KEY -o optimized.srt
10. Custom rounded background style with custom font
videocaptioner synthesize video.mp4 -s subtitle.srt --subtitle-mode hard \ --style-override '{"text_color": "#ffffff", "bg_color": "#000000cc", "corner_radius": 10, "font_size": 36}' \ --font-file ./NotoSansSC.ttf --quality high -o styled_video.mp4
Command reference
| Command | Purpose |
|---|---|
| Speech → subtitles. Engines: (free) (free) |
| Optimize (LLM) and/or translate (LLM/Bing/Google) subtitle files |
| Burn subtitles into video with customizable styles |
| Full pipeline: transcribe → optimize → translate → synthesize |
| Download video from YouTube, Bilibili, etc. |
| Manage settings ( ) |
| List all subtitle style presets with parameters |
Run
videocaptioner <command> --help for full options.
Subtitle styles
Two rendering modes for beautiful subtitles:
ASS mode (default) — outline/shadow style:
- Presets:
(white+black outline),default
(warm+orange outline),anime
(portrait videos)vertical - Customizable fields:
,font_name
,font_size
,primary_color
,outline_color
,outline_width
,bold
,spacingmargin_bottom
Rounded mode — modern rounded background boxes:
- Preset:
(dark text on semi-transparent background)rounded - Customizable fields:
,font_name
,font_size
,text_color
(#rrggbbaa),bg_color
,corner_radius
,padding_h
,padding_vmargin_bottom
Style options only work with
--subtitle-mode hard.
Target languages
BCP 47 codes:
zh-Hans zh-Hant en ja ko fr de es ru pt it ar th vi id and 23 more.
Environment variables
| Variable | Purpose |
|---|---|
| LLM API key |
| LLM API base URL |
Exit codes
0 success · 2 bad arguments · 3 file not found · 4 missing dependency · 5 runtime error
Tips
- Use
for scripting (stdout = result path only)-q - Bing/Google translation is free, no API key needed
/bijian
ASR is free but only supports Chinese & Englishjianying- Run
to see all style presetsvideocaptioner style