Clawfu-skills voice-localization
Scale your brand voice across multiple languages using AI voice synthesis, maintaining consistent character and quality for global content. Use when: Expanding video content to new language markets; Creating multilingual courses or training; Localizing ads and marketing videos; Dubbing existing content for international audiences; Building consistent global brand voice
git clone https://github.com/guia-matthieu/clawfu-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/guia-matthieu/clawfu-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/audio/voice-localization" ~/.claude/skills/guia-matthieu-clawfu-skills-voice-localization && rm -rf "$T"
skills/audio/voice-localization/SKILL.md
AI Voice Localization
Scale your brand voice across multiple languages using AI voice synthesis, maintaining consistent character and quality for global content.
When to Use This Skill
- Expanding video content to new language markets
- Creating multilingual courses or training
- Localizing ads and marketing videos
- Dubbing existing content for international audiences
- Building consistent global brand voice
- Deciding between dubbing vs. subtitles
Methodology Foundation
Source: ElevenLabs Multilingual + Global Content Best Practices
Core Principle: True localization means the same perceived person speaks each language natively: not a translated voice, but a voice that sounds local while keeping the brand's character. AI voice synthesis makes this possible at scale by preserving voice identity while adapting pronunciation and rhythm to each language.
Why This Matters: Global content traditionally required a separate voice actor per language, which cost brand consistency. AI voice localization keeps the same "person" across 29+ languages, creating a unified brand experience worldwide while reducing production costs by roughly 70-90% (the worked examples in this skill land at 72-77%).
What Claude Does vs What You Decide
| Claude Does | You Decide |
|---|---|
| Structures production workflow | Final creative direction |
| Suggests technical approaches | Equipment and tool choices |
| Creates templates and checklists | Quality standards |
| Identifies best practices | Brand/voice decisions |
| Generates script outlines | Final script approval |
What This Skill Does
- Maintains voice identity across languages - Same character, different language
- Handles cultural adaptation - Beyond translation to localization
- Manages multilingual production - Efficient workflows for many languages
- Ensures quality per market - Native speaker validation
- Calculates ROI - Traditional dubbing vs. AI localization costs
How to Use
Plan Localization Project
Help me plan voice localization for [content].
Source language: [original]
Target languages: [list]
Content type: [video/audio/course]
Volume: [duration/number of assets]
Evaluate Localization Approach
Should I use AI voice localization or traditional dubbing?
Content: [describe]
Markets: [target countries]
Budget: [range]
Timeline: [deadline]
Instructions
When localizing voice content, follow this methodology:
Step 1: Assess Localization Needs
Determine the right approach for your content.
## Localization Decision Matrix

### When to Use AI Voice Localization
✓ Same brand voice needed across markets
✓ Frequent content updates (efficiency matters)
✓ Educational/informational content
✓ Budget constraints
✓ Quick turnaround needed
✓ 5+ languages needed

### When to Use Traditional Dubbing
✓ Character-driven content (emotions critical)
✓ One-time major production
✓ Markets expect dubbed content (Germany, France)
✓ Complex lip-sync requirements
✓ Budget allows $1,000+ per language

### When to Use Subtitles Instead
✓ Documentary/interview content
✓ Authenticity of original voice matters
✓ Lowest budget option
✓ Markets prefer subtitles (Nordics, Netherlands)
✓ Legal/compliance content (exact words matter)

### Hybrid Approach
Hero content → Traditional dubbing
Supporting content → AI localization
Supplementary → Subtitles
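The matrix above can be read as a simple tally: count how many criteria a project matches under each approach and take the highest count. The Python sketch below is illustrative only. The trait names and the `recommend` function are hypothetical labels for the criteria listed above, and a real decision would probably weigh the criteria rather than count them equally.

```python
from __future__ import annotations

# Hypothetical trait labels, one per criterion in the decision matrix.
CRITERIA = {
    "ai_localization": {
        "same_brand_voice", "frequent_updates", "educational",
        "tight_budget", "quick_turnaround", "five_plus_languages",
    },
    "traditional_dubbing": {
        "character_driven", "one_time_production", "dubbing_market",
        "complex_lip_sync", "budget_1000_plus",
    },
    "subtitles": {
        "documentary", "original_voice_matters", "lowest_budget",
        "subtitle_market", "legal_compliance",
    },
}


def score(traits: set[str]) -> dict[str, int]:
    """Count how many criteria of each approach the project matches."""
    return {approach: len(wanted & traits) for approach, wanted in CRITERIA.items()}


def recommend(traits: set[str]) -> str:
    """Return the approach with the most matches (ties go to the first listed)."""
    scores = score(traits)
    return max(scores, key=scores.get)


print(recommend({"educational", "five_plus_languages", "frequent_updates"}))
```

A close score between two approaches is a hint that the hybrid split (hero content dubbed, supporting content AI-localized) deserves a look.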
Step 2: Select Languages Strategically
Prioritize languages based on market opportunity.
## Language Prioritization Framework

### Tier 1: High Volume Languages (500M+ speakers)

| Language | Global Speakers | Key Markets |
|----------|----------------|-------------|
| English | 1.5B | Global |
| Mandarin | 1.1B | China |
| Spanish | 550M | LATAM, Spain |
| Hindi | 600M | India |

### Tier 2: High Value Languages

| Language | Economic Value | Markets |
|----------|---------------|---------|
| German | High GDP | DACH |
| French | Francophone reach | France, Africa |
| Japanese | High spending | Japan |
| Portuguese | Large market | Brazil |

### Tier 3: Strategic Languages

| Language | Strategic Value | Markets |
|----------|----------------|---------|
| Arabic | Growing middle class | MENA |
| Korean | Tech-forward | South Korea |
| Italian | Fashion/luxury | Italy |
| Dutch | High English proficiency | Benelux |

### ElevenLabs Supported Languages (29+)
English, Spanish, French, German, Italian, Portuguese, Polish, Dutch, Hindi, Arabic, Chinese, Japanese, Korean, Turkish, Swedish, Indonesian, Filipino, Malay, Russian, Czech, Danish, Finnish, Greek, Romanian, Ukrainian, Vietnamese, Norwegian, Hungarian, Tamil, and more.
Step 3: Prepare Content for Localization
Translation alone isn't enough—prepare for voice adaptation.
## Content Preparation Checklist

### Script Adaptation

**Text expansion/contraction**:

| Language | vs English |
|----------|-----------|
| German | +30% longer |
| French | +15-20% longer |
| Spanish | +15-25% longer |
| Chinese | -30% shorter |
| Japanese | Variable |

**Implications**:
- Video may need re-timing
- Allow flexibility in pacing
- Consider sentence splitting for longer languages

**Localization notes to provide**:
□ Brand terms (don't translate, keep English)
□ Product names (pronunciation guide)
□ Numbers (format varies by locale)
□ Dates (format varies by locale)
□ Currency (localize amounts)
□ Cultural references (adapt or explain)

### Voice Consistency Notes

**Preserve across languages**:
- Character/personality
- Energy level
- Authority/warmth balance
- Pace relative to content

**Adapt per language**:
- Natural rhythm and cadence
- Pronunciation of brand terms
- Formal/informal register (varies by culture)
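To see what the expansion table means for timing, the sketch below estimates narration length per language and flags where re-timing is likely. It assumes spoken duration scales linearly with text length, which is a simplification. The language codes, function names, and the 10% tolerance are illustrative choices, not values from any tool.

```python
# Expansion factors relative to English, from the table above.
# Midpoints are used where the table gives a range; Japanese is
# omitted because the table lists it as variable.
EXPANSION = {"de": 1.30, "fr": 1.175, "es": 1.20, "zh": 0.70}


def estimated_seconds(source_seconds: float, lang: str) -> float:
    """Estimate narration length, assuming speech time tracks text length."""
    return source_seconds * EXPANSION[lang]


def needs_retiming(source_seconds: float, lang: str, tolerance: float = 0.10) -> bool:
    """Flag a language whose estimated length drifts past the tolerance."""
    drift = abs(estimated_seconds(source_seconds, lang) - source_seconds)
    return drift > tolerance * source_seconds


# A 60-second English segment, checked against each listed language.
for lang in EXPANSION:
    print(lang, round(estimated_seconds(60, lang), 1), needs_retiming(60, lang))
```

With the default tolerance every listed language is flagged, which matches the checklist's advice to plan for pacing flexibility from the start.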
Step 4: Production Workflow
Efficient process for multilingual voice production.
## Multilingual Production Pipeline

### Phase 1: Source Production
1. Finalize English script
2. Record/generate English voice
3. Lock timing and pacing
4. Create master video/audio

### Phase 2: Translation
1. Professional translation (not machine)
2. Localization review (cultural adaptation)
3. Timing adaptation (fit original duration)
4. Brand term glossary enforcement

### Phase 3: Voice Generation
**Per language**:
1. Load translated script
2. Apply same voice settings as source
3. Generate voice in target language
4. Check pronunciation of brand terms
5. Adjust pacing if needed
6. Review for naturalness

### Phase 4: Quality Control
**Native speaker review checklist**:
□ Natural pronunciation
□ Correct emphasis and intonation
□ Brand terms handled correctly
□ No awkward phrasing
□ Appropriate formality level
□ Cultural appropriateness

### Phase 5: Integration
1. Replace audio track in video
2. Re-sync if timing changed
3. Update text overlays
4. Localize captions/subtitles
5. Final review per language
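The per-language loop in Phase 3 can be sketched as a small driver that applies one set of voice settings to every translated script. Everything here is hypothetical scaffolding: `VoiceSettings`, `generate_all`, and the `synthesize` callable are placeholders for whichever speech service is used, and `fake_synthesize` only exists so the sketch runs without calling a real API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical settings bundle; real field names depend on the TTS tool."""
    voice_id: str
    speed: float = 1.0


# Placeholder signature: (script, language code, settings) -> audio bytes.
Synthesizer = Callable[[str, str, VoiceSettings], bytes]


def generate_all(scripts: dict[str, str], settings: VoiceSettings,
                 synthesize: Synthesizer, brand_terms: list[str]) -> dict[str, dict]:
    """Run Phase 3 for every language with identical voice settings."""
    results = {}
    for lang, script in scripts.items():
        audio = synthesize(script, lang, settings)  # same settings as the source
        # Brand terms absent from the translated script are flagged for review.
        missing = [term for term in brand_terms if term not in script]
        results[lang] = {"audio": audio, "review_terms": missing}
    return results


def fake_synthesize(script: str, lang: str, settings: VoiceSettings) -> bytes:
    """Stand-in that returns tagged bytes instead of calling a real service."""
    return f"{settings.voice_id}:{lang}:{len(script)}".encode()


out = generate_all(
    {"es": "Bienvenido a Acme", "fr": "Bienvenue chez nous"},
    VoiceSettings(voice_id="brand-voice"),
    fake_synthesize,
    brand_terms=["Acme"],
)
print({lang: result["review_terms"] for lang, result in out.items()})
```

Passing the synthesizer in as a parameter keeps the loop independent of any one vendor and makes it easy to dry-run the pipeline before spending plan quota.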
Step 5: Quality Assurance
Ensure each language meets standards.
## Localization QA Framework

### Technical QA
□ Audio levels consistent across languages
□ No clipping or distortion
□ Background music balanced correctly
□ Transitions smooth
□ Sync with video acceptable

### Linguistic QA
□ Translation accuracy (spot check 10%)
□ Natural flow and rhythm
□ Brand voice maintained
□ Technical terms correct
□ No machine-translation artifacts

### Cultural QA
□ No offensive content for market
□ References appropriate
□ Humor/idioms adapted correctly
□ Visual content appropriate
□ Call-to-action localized

### Native Speaker Sign-Off
For each language:
- [ ] Spanish (Reviewer: _____) ☐ Approved
- [ ] French (Reviewer: _____) ☐ Approved
- [ ] German (Reviewer: _____) ☐ Approved
- [ ] [Add languages...]
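For the 10% translation spot check, one option is to draw a reproducible random sample of script segments so that every reviewer looks at the same lines. The sketch below is a minimal illustration; the function name, the fixed seed, and the rule of always checking at least one segment are assumptions rather than part of the framework above.

```python
import random


def spot_check_indices(segment_count: int, percent: int = 10, seed: int = 0) -> list[int]:
    """Pick which script segments a reviewer should check."""
    if segment_count <= 0:
        return []
    # Integer ceiling division avoids float rounding surprises.
    sample_size = max(1, (segment_count * percent + 99) // 100)
    sample_size = min(sample_size, segment_count)
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    return sorted(rng.sample(range(segment_count), sample_size))


# A 50-segment script yields 5 segment indices to review.
print(spot_check_indices(50))
```

Using a different seed per language spreads the checks across the script instead of reviewing the same segments in every market.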
Step 6: Calculate ROI
Compare AI localization to traditional approaches.
## Localization Cost Comparison

### Traditional Dubbing (per language)

| Component | Cost |
|-----------|------|
| Translation | $0.15/word |
| Voice talent | $300-1,000 per finished hour |
| Studio time | $100-200/hour |
| Direction | $50-100/hour |
| Engineering | $50-100/hour |

**Example**: 10-minute video (1,500 words)
- Translation: $225
- Voice talent: $400
- Studio: $200
- Direction: $150
- Engineering: $100
- **Total: ~$1,075 per language**

### AI Voice Localization

| Component | Cost |
|-----------|------|
| Translation | $0.15/word |
| ElevenLabs Pro | $99/mo (monthly usage quota applies; verify against the current plan) |
| QA review | $50-100/language |

**Example**: 10-minute video (1,500 words)
- Translation: $225
- Voice generation: ~$0 (within plan)
- QA review: $75
- **Total: ~$300 per language**

### ROI Summary

| Languages | Traditional | AI Localization | Savings |
|-----------|-------------|-----------------|---------|
| 5 | $5,375 | $1,500 | 72% |
| 10 | $10,750 | $3,000 | 72% |
| 20 | $21,500 | $6,000 | 72% |

**Bottom line**: In this example AI localization saves about 72% at every volume, in line with the 70%+ typically cited. The flat $99/mo subscription is not included in the per-language totals, so the saving is slightly lower for small projects.
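The ROI table can be reproduced with a few lines of arithmetic. The sketch below uses only the example figures from the tables above; the function names and the optional `subscription` parameter are illustrative additions, and the fixed production costs apply to the 10-minute example only.

```python
TRANSLATION_RATE = 0.15  # USD per word, from the tables above


def traditional_cost(words: int) -> float:
    """Per-language cost using the 10-minute example's fixed figures."""
    # Voice talent + studio + direction + engineering from the example.
    return words * TRANSLATION_RATE + 400 + 200 + 150 + 100


def ai_cost(words: int, qa_review: float = 75.0) -> float:
    """Per-language cost: translation plus native-speaker QA review."""
    return words * TRANSLATION_RATE + qa_review


def savings_percent(languages: int, words: int = 1500, subscription: float = 0.0) -> float:
    """Percentage saved by AI localization versus traditional dubbing."""
    traditional = languages * traditional_cost(words)
    ai = languages * ai_cost(words) + subscription
    return 100 * (1 - ai / traditional)


for n in (5, 10, 20):
    print(n, round(n * traditional_cost(1500)), round(n * ai_cost(1500)),
          round(savings_percent(n)))
```

Passing `subscription=99` folds one month of the plan into the AI total, which lowers the saving a little for a 5-language project and matters less as the language count grows.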
Examples
Example 1: E-Learning Course Localization
Context: Localizing a 20-video course to 5 languages.
Input:
We have a 20-video SaaS training course in English (total 3 hours). Need to localize to Spanish, French, German, Portuguese, and Japanese. What's the best approach?
Output:
Course Localization Plan
Project Scope
- Content: 20 videos, 3 hours total
- Source: English
- Targets: Spanish, French, German, Portuguese, Japanese
- Content type: Educational (clarity > emotion)
Recommended Approach: AI Voice Localization
Why AI is ideal here:
- Educational content (fact delivery, not drama)
- 5 languages (efficiency critical)
- Likely ongoing updates (need repeatable process)
- Consistent "instructor" voice across all versions
Implementation Plan
Platform: ElevenLabs Pro ($99/mo)
Voice Strategy:
- Clone original instructor OR design consistent voice
- Same voice speaks all 6 languages (English plus the 5 targets)
- Maintains authority and teaching style
Workflow per Video:
1. Translation (outsource)
   - Professional translation (not machine)
   - Provide glossary of product terms
   - Timing guidance for text expansion
2. Voice Generation
   - Generate all 5 languages from same voice
   - ~15 minutes per language per video
   - Total: ~25 hours voice generation
3. QA Review
   - Native speaker review per language
   - Focus: pronunciation, naturalness, accuracy
   - Budget 1 hour review per language per video
4. Video Integration
   - Replace audio tracks
   - Adjust timing if needed
   - Verify captions match
Timeline
| Phase | Duration | Notes |
|---|---|---|
| Translation (all) | 2 weeks | Parallel |
| Voice generation | 1 week | ~5 hours/day |
| QA review | 2 weeks | Parallel per language |
| Integration | 1 week | Parallel |
| Total | 5-6 weeks | With buffer |
Budget Estimate
| Item | Cost |
|---|---|
| Translation (5 lang × 27k words) | $20,250 |
| ElevenLabs (2 months) | $200 |
| QA review (5 lang × 20 videos × $30) | $3,000 |
| Integration (20 hours @ $50) | $1,000 |
| Total | ~$24,500 |
vs. traditional dubbing: ~$107,500 (20 videos × 5 languages at the ~$1,075 per-language figure from Step 6), a 77% saving
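The budget figures for this example can be checked with a short calculation. The sketch below takes its inputs from the budget table; treating each video as one ~$1,075 dubbing unit per language is an assumption about how the traditional figure was derived, and the function name is illustrative.

```python
def course_budget() -> dict:
    """Recompute the Example 1 budget from the figures in the table."""
    languages, videos = 5, 20
    words_per_language = 27_000  # from the budget table

    translation = languages * words_per_language * 15 // 100  # $0.15 per word
    subscription = 200                                        # ElevenLabs, 2 months
    qa_review = languages * videos * 30                       # $30 per video per language
    integration = 20 * 50                                     # 20 hours at $50

    ai_total = translation + subscription + qa_review + integration
    # Assumption: one ~$1,075 traditional dubbing unit per video per language.
    traditional_total = languages * videos * 1075
    savings = round(100 * (1 - ai_total / traditional_total))
    return {"ai": ai_total, "traditional": traditional_total, "savings_pct": savings}


print(course_budget())
# {'ai': 24450, 'traditional': 107500, 'savings_pct': 77}
```

Translation makes up most of the AI-route total, so the per-word rate is the figure worth negotiating first.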
Example 2: Global Ad Campaign
Context: 30-second ad needs localization for global launch.
Input:
We're launching a 30-second video ad globally. Need 10 language versions in 2 weeks. Budget is $15,000.
Output:
Ad Localization: Rapid Global Launch
Constraints
- 30-second ad (tight timing)
- 10 languages
- 2-week deadline
- $15,000 budget (~$1,500/language)
Recommended: Hybrid Approach
Tier 1 (Hero Markets) - Traditional Dubbing
- English (source)
- Spanish (largest reach)
- German (high value)
- French (high value)
Tier 2 (Scale Markets) - AI Localization
- Portuguese, Italian, Dutch, Polish, Japanese, Korean
Rationale
- Hero markets get premium treatment
- AI handles scale efficiently
- Both meet deadline
Production Schedule
Week 1:
| Day | Task |
|---|---|
| 1-2 | All translations complete |
| 2-3 | Traditional dubbing sessions (4 languages) |
| 3-4 | AI voice generation (6 languages) |
| 4-5 | QA review all versions |
Week 2:
| Day | Task |
|---|---|
| 1-2 | Revisions and fixes |
| 3-4 | Video integration all versions |
| 5 | Final review and delivery |
Budget Allocation
| Item | Cost |
|---|---|
| Translation (10 × ~120 words, ~$180 per version) | $1,800 |
| Traditional dubbing (4 lang) | $4,800 |
| AI generation (6 lang) | $600 |
| QA review (10 lang) | $2,000 |
| Integration (10 lang) | $2,500 |
| Buffer | $3,300 |
| Total | $15,000 |
Checklists & Templates
Localization Project Checklist
## Pre-Production
□ Languages selected and prioritized
□ Budget allocated per language
□ Timeline established
□ Translation vendor selected
□ Brand glossary prepared
□ Voice consistency plan defined

## Production
□ Translations complete
□ Translations reviewed for brand terms
□ Voice generated per language
□ Pronunciation verified
□ Timing adjusted if needed

## Quality Assurance
□ Native speaker review complete
□ Technical QA passed
□ Brand guidelines verified
□ Cultural review passed
□ Legal/compliance check (if needed)

## Delivery
□ Files named correctly per language
□ All formats delivered
□ Captions/subtitles provided
□ Documentation complete
□ Source files archived
Brand Glossary Template
## [Brand] Localization Glossary

### Never Translate

| English | Note |
|---------|------|
| [Brand Name] | Keep English, pronunciation: [X] |
| [Product Name] | Keep English |
| [Feature Name] | Keep English, explain in context |

### Translate Consistently

| English | Spanish | French | German |
|---------|---------|--------|--------|
| Dashboard | Panel | Tableau de bord | Dashboard |
| Workflow | Flujo de trabajo | Flux de travail | Arbeitsablauf |
| [Term] | | | |

### Pronunciation Guide

| Term | Pronunciation |
|------|--------------|
| [Brand] | /brănd/ |
| [Feature] | /fē-chər/ |
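A glossary like the one above can be enforced with a simple automated check before voice generation. The following sketch is a rough illustration: "Acme" is a made-up brand name standing in for the bracketed placeholders, `check_script` is a hypothetical helper, and plain substring matching will miss inflected forms, so it supplements native-speaker review without replacing it.

```python
from __future__ import annotations

# "Acme" is a placeholder brand name used only for this illustration.
NEVER_TRANSLATE = ["Acme"]

# Entries copied from the "Translate Consistently" table above.
GLOSSARY = {
    "Dashboard": {"es": "Panel", "fr": "Tableau de bord", "de": "Dashboard"},
    "Workflow": {"es": "Flujo de trabajo", "fr": "Flux de travail", "de": "Arbeitsablauf"},
}


def check_script(source: str, translated: str, lang: str) -> list[str]:
    """Return glossary problems found in a translated script."""
    problems = []
    for term in NEVER_TRANSLATE:
        # Brand terms must survive translation unchanged.
        if term in source and term not in translated:
            problems.append(f"brand term missing: {term}")
    for english, by_lang in GLOSSARY.items():
        expected = by_lang.get(lang)
        if (expected and english.lower() in source.lower()
                and expected.lower() not in translated.lower()):
            problems.append(f"expected '{expected}' for '{english}'")
    return problems


print(check_script("Open the Acme Dashboard", "Abra el Panel de Acme", "es"))
print(check_script("Open the Acme Dashboard", "Abra el tablero", "es"))
```

The first call returns an empty list; the second reports both the missing brand term and the missing glossary translation.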
Skill Boundaries
What This Skill Does Well
- Structuring audio production workflows
- Providing technical guidance
- Creating quality checklists
- Suggesting creative approaches
What This Skill Cannot Do
- Replace audio engineering expertise
- Make subjective creative decisions
- Access or edit audio files directly
- Guarantee commercial success
References
- ElevenLabs. "Multilingual Voice Synthesis" - Platform documentation
- CSA Research. "Global Content Strategy" - Localization best practices
- Unbabel. "The State of Localization" - Industry benchmarks
- Nimdzi. "Localization Market Research" - Cost and ROI data
Related Skills
- voice-design - Creating the base voice
- voiceover-direction - Quality control principles
- transcription-to-content - Preparing source content
Skill Metadata (Internal Use)
name: voice-localization
category: audio
subcategory: voice
version: 1.0
author: MKTG Skills
source_expert: ElevenLabs, Localization Best Practices
source_work: Multilingual Content Production
difficulty: intermediate
estimated_value: 70%+ cost savings vs. traditional dubbing
tags: [localization, multilingual, dubbing, ai-voice, global]
created: 2026-01-26
updated: 2026-01-26