Claude-skill-registry elevenlabs-voice-cloning
ElevenLabs voice cloning techniques, audio quality requirements, recording best practices, and training data optimization for professional-quality voice clones. Use when creating custom voices, cloning voices, or optimizing voice clone quality.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/elevenlabs-voice-cloning" ~/.claude/skills/majiayu000-claude-skill-registry-elevenlabs-voice-cloning && rm -rf "$T"
manifest:
skills/data/elevenlabs-voice-cloning/SKILL.md
source content
ElevenLabs Voice Cloning
Recording Requirements (CRITICAL)
Audio Quality
Environment:
- Acoustically-treated room (no echo/reverb)
- No background noise, music, or interference
- Professional or high-quality USB microphone
- Quiet location (no HVAC, traffic, roommates)
Microphone Technique:
- Distance: 2 fists from microphone
- Consistent positioning throughout
- Pop filter recommended
- Proper gain staging (neither too loud nor too quiet; aim for peaks between -12 dB and -6 dB)
Training Data Specifications
Length Requirements:
- Instant Cloning: 60 seconds minimum
- Professional: 30 minutes minimum
- Optimal Quality: 3 hours ideal
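As a rough illustration of the thresholds above, the following sketch sums clip durations and reports the highest tier the total reaches. The function name `classifyTrainingLength` and the `TIERS` table are hypothetical helpers for this example, not part of any ElevenLabs tooling; durations are assumed to be known in seconds.

```javascript
// Thresholds in seconds, taken from the length requirements above.
const TIERS = [
  { name: "optimal", minSeconds: 3 * 60 * 60 },
  { name: "professional", minSeconds: 30 * 60 },
  { name: "instant", minSeconds: 60 },
];

// Sum clip durations (seconds) and report the highest tier the total satisfies.
// Returns "insufficient" when the total is below the instant-cloning minimum.
function classifyTrainingLength(clipDurationsSeconds) {
  const total = clipDurationsSeconds.reduce((sum, d) => sum + d, 0);
  const tier = TIERS.find((t) => total >= t.minSeconds);
  return { totalSeconds: total, tier: tier ? tier.name : "insufficient" };
}

// Example: three clips totalling 35 minutes.
console.log(classifyTrainingLength([600, 900, 600]));
// { totalSeconds: 2100, tier: 'professional' }
```

Running a check like this before uploading can show early whether more recording is needed for the intended cloning mode.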
Content Diversity:
Include:
- Varied emotions (happy, sad, neutral, excited)
- Different speaking styles (casual, professional, energetic)
- Questions and statements
- Different paces (fast, slow, normal)
- Emphasis variations
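One way to keep track of this variety is a hand-tagged manifest of clips. The sketch below assumes such a manifest exists; the field names (`emotion`, `style`, `pace`, `sentenceType`) and the helper `findCoverageGaps` are hypothetical. Emphasis variations are left out because they do not map cleanly to a fixed set of tags.

```javascript
// Categories and values taken from the content-diversity list above.
const REQUIRED = {
  emotion: ["happy", "sad", "neutral", "excited"],
  style: ["casual", "professional", "energetic"],
  pace: ["fast", "slow", "normal"],
  sentenceType: ["question", "statement"],
};

// `samples` is a hypothetical manifest: one hand-tagged object per clip.
// Returns an object mapping each field to the values no clip covers yet.
function findCoverageGaps(samples) {
  const gaps = {};
  for (const [field, values] of Object.entries(REQUIRED)) {
    const seen = new Set(samples.map((s) => s[field]));
    const missing = values.filter((v) => !seen.has(v));
    if (missing.length > 0) gaps[field] = missing;
  }
  return gaps;
}

const exampleManifest = [
  { file: "a.wav", emotion: "happy", style: "casual", pace: "fast", sentenceType: "question" },
  { file: "b.wav", emotion: "neutral", style: "professional", pace: "normal", sentenceType: "statement" },
];

console.log(findCoverageGaps(exampleManifest));
// { emotion: [ 'sad', 'excited' ], style: [ 'energetic' ], pace: [ 'slow' ] }
```

An empty result means every listed category has at least one clip; it says nothing about how convincing each performance is.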
Language Considerations:
- Record in target output language
- Cross-language output may carry the accent of the sample language
- Example: English sample → Spanish output = Spanish spoken with an English accent
Audio Characteristics
What AI Learns:
- Emotional range in samples
- Prosody variations
- Speaking pace patterns
- Voice timbre and characteristics
- Inflection patterns
Important: The model can only replicate what it is trained on. Flat, monotonous samples produce a flat, monotonous voice.
Cloning Workflow
    // 1. Prepare samples (3+ files recommended)
    const samples = [
      'sample1_conversational.mp3',
      'sample2_professional.mp3',
      'sample3_emotional.mp3'
    ]

    // 2. Clone voice
    await mcp__elevenlabs__voice_clone({
      name: "Professional Narrator",
      files: samples,
      description: "Warm, authoritative voice for educational content"
    })

    // 3. Test and refine
    // Generate test samples
    // Evaluate quality
    // Re-record if needed
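Before step 2, it can help to sanity-check the sample list. The following is a minimal sketch, assuming the file list is an array of names as in the workflow above; `checkSampleList` is a hypothetical helper, and the allowed extensions simply mirror the MP3/WAV export formats named under Post-Recording rather than any documented tool constraint.

```javascript
const ALLOWED_EXTENSIONS = [".mp3", ".wav"]; // assumption: formats named under Post-Recording
const MIN_FILES = 3; // "3+ files recommended"

// Returns a list of human-readable problems; an empty list means the
// sample list passed these basic checks.
function checkSampleList(files) {
  const problems = [];
  if (files.length < MIN_FILES) {
    problems.push(`only ${files.length} file(s); ${MIN_FILES}+ recommended`);
  }
  const seen = new Set();
  for (const file of files) {
    const lower = file.toLowerCase();
    if (!ALLOWED_EXTENSIONS.some((ext) => lower.endsWith(ext))) {
      problems.push(`unexpected format: ${file}`);
    }
    if (seen.has(lower)) {
      problems.push(`duplicate: ${file}`);
    }
    seen.add(lower);
  }
  return problems;
}

console.log(checkSampleList([
  "sample1_conversational.mp3",
  "sample2_professional.mp3",
  "sample3_emotional.mp3",
]));
// []
```

This only inspects file names; it does not open the audio or verify its quality.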
Quality Optimization
Pre-Recording Checklist
- Silent room test (record 10s silence, check for noise)
- Microphone positioned correctly
- Pop filter in place
- Recording levels set (peaks between -12 dB and -6 dB)
- Script prepared with varied content
- Warm up voice (read aloud 5 minutes)
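The silent room test and the level check can be approximated numerically. This sketch assumes the audio has already been decoded to floating-point samples in the range -1 to 1, and that the -12 dB to -6 dB target refers to dBFS (decibels relative to full scale). The -60 dBFS noise-floor limit and all function names are hypothetical choices for illustration, not values from the checklist.

```javascript
// Convert a linear amplitude (0..1) to dBFS; silence maps to -Infinity.
function toDbfs(amplitude) {
  return amplitude > 0 ? 20 * Math.log10(amplitude) : -Infinity;
}

// Highest absolute sample value, in dBFS.
function peakDbfs(samples) {
  let peak = 0;
  for (const s of samples) peak = Math.max(peak, Math.abs(s));
  return toDbfs(peak);
}

// Root-mean-square level, in dBFS; a rough measure of the noise floor
// when applied to a recording of room silence.
function rmsDbfs(samples) {
  if (samples.length === 0) return -Infinity;
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  return toDbfs(Math.sqrt(sumSquares / samples.length));
}

// Checklist target: peaks between -12 dB and -6 dB (assumed dBFS).
function peaksInTargetRange(samples) {
  const peak = peakDbfs(samples);
  return peak >= -12 && peak <= -6;
}

// Hypothetical threshold for the 10-second silence recording.
const NOISE_FLOOR_LIMIT_DBFS = -60;

function roomIsQuiet(silenceSamples) {
  return rmsDbfs(silenceSamples) <= NOISE_FLOOR_LIMIT_DBFS;
}

console.log(peaksInTargetRange([0.1, -0.5, 0.3])); // peak 0.5 is about -6.02 dBFS
console.log(roomIsQuiet(new Float32Array(1000).fill(0.0001))); // about -80 dBFS
```

A suitable noise-floor limit depends on the microphone and room, so the threshold is best tuned by ear against a known-good recording.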
Post-Recording
- Listen to each sample
- Remove takes with mistakes
- Normalize volume if needed
- Export as high-quality MP3 or WAV
- Verify no clipping or distortion
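The clipping check and volume normalization can be sketched in a few lines, assuming decoded floating-point samples in the range -1 to 1. The names `countClippedSamples` and `normalizePeak`, the 0.999 clipping threshold, and the -3 dBFS default target are hypothetical choices for this example; an audio editor's built-in tools do the same job.

```javascript
// Count samples at or near full scale, a common sign of clipping.
// The 0.999 threshold is an illustrative assumption.
function countClippedSamples(samples, threshold = 0.999) {
  let clipped = 0;
  for (const s of samples) {
    if (Math.abs(s) >= threshold) clipped++;
  }
  return clipped;
}

// Peak-normalize: scale every sample so the loudest one lands at targetDbfs.
// Returns a new Float32Array and leaves the input untouched.
function normalizePeak(samples, targetDbfs = -3) {
  let peak = 0;
  for (const s of samples) peak = Math.max(peak, Math.abs(s));
  if (peak === 0) return Float32Array.from(samples); // silence: nothing to scale
  const gain = Math.pow(10, targetDbfs / 20) / peak;
  return Float32Array.from(samples, (s) => s * gain);
}

console.log(countClippedSamples([0.5, 1.0, -1.0, 0.2])); // 2
console.log(normalizePeak([0.1, -0.2], -6));
```

Peak normalization cannot repair audio that clipped during recording; takes with a nonzero clipped count are better re-recorded at lower gain.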
Common Issues & Solutions
Issue: Clone sounds robotic
- Add more expressive samples
- Include varied emotional range
- Record more training audio (toward the 3-hour ideal)
Issue: Inconsistent voice
- Ensure consistent microphone technique
- Record all samples in one session
- Same environment/equipment
Issue: Background noise
- Re-record in quieter location
- Use noise reduction (carefully)
- Better microphone/acoustic treatment