Clawfu-skills pydub-automation
Automate repetitive audio tasks with Python using PyDub for batch processing, format conversion, normalization, and content assembly. Use when: Processing large numbers of audio files consistently; Converting between audio formats at scale; Normalizing loudness across a batch of files; Assembling intros/outros automatically to episodes; Trimming silence or extracting segments programmatically
git clone https://github.com/guia-matthieu/clawfu-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/guia-matthieu/clawfu-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/audio/pydub-automation" ~/.claude/skills/guia-matthieu-clawfu-skills-pydub-automation && rm -rf "$T"
skills/audio/pydub-automation/SKILL.mdPyDub Audio Automation
Automate repetitive audio tasks with Python using PyDub for batch processing, format conversion, normalization, and content assembly.
When to Use This Skill
- Processing large numbers of audio files consistently
- Converting between audio formats at scale
- Normalizing loudness across a batch of files
- Assembling intros/outros automatically to episodes
- Trimming silence or extracting segments programmatically
- Building audio pipelines for content production
Methodology Foundation
Source: PyDub Library (James Robert) + Python Audio Processing
Core Principle: "Audio operations that take hours manually can run in minutes with code." PyDub provides a high-level interface that abstracts FFmpeg's complexity, making common operations accessible to non-audio engineers.
Why This Matters: Content teams producing regular podcasts, courses, or video content spend significant time on repetitive audio tasks. Automation enables consistent quality at scale while freeing humans for creative work.
What Claude Does vs What You Decide
| Claude Does | You Decide |
|---|---|
| Structures production workflow | Final creative direction |
| Suggests technical approaches | Equipment and tool choices |
| Creates templates and checklists | Quality standards |
| Identifies best practices | Brand/voice decisions |
| Generates script outlines | Final script approval |
What This Skill Does
- Batch processes audio files - Apply same operations to hundreds of files
- Converts formats - MP3, WAV, FLAC, OGG, and more
- Normalizes loudness - Consistent levels across episodes
- Assembles content - Concatenate intros, content, outros
- Extracts segments - Trim, split, and slice audio programmatically
How to Use
Generate Processing Script
Help me write a PyDub script to [describe task]. Input files: [format, location] Output requirements: [format, specs]
Create Batch Workflow
Create a Python script that processes all audio files in a folder: - Input: [source folder, file type] - Operations: [what to do] - Output: [destination, naming convention]
Debug Audio Script
This PyDub script isn't working as expected: [paste code] Expected: [what you want] Actual: [what's happening]
Instructions
When automating audio with PyDub, follow this methodology:
Step 1: Setup and Prerequisites
## Installation # Install PyDub pip install pydub # FFmpeg is required (PyDub uses it under the hood) # macOS: brew install ffmpeg # Ubuntu/Debian: sudo apt-get install ffmpeg # Windows: # Download from ffmpeg.org, add to PATH
## Basic Imports from pydub import AudioSegment from pydub.effects import normalize, compress_dynamic_range from pydub.silence import detect_silence, split_on_silence import os from pathlib import Path
Step 2: Core Operations
## Loading and Saving Audio # Load audio file (format auto-detected from extension) audio = AudioSegment.from_file("input.mp3") audio = AudioSegment.from_file("input.wav", format="wav") # Save audio file audio.export("output.mp3", format="mp3", bitrate="192k") audio.export("output.wav", format="wav") # Export with metadata audio.export( "output.mp3", format="mp3", bitrate="192k", tags={"artist": "Brand Name", "album": "Podcast"} )
## Basic Properties print(f"Duration: {len(audio)} ms") print(f"Channels: {audio.channels}") print(f"Frame rate: {audio.frame_rate} Hz") print(f"Sample width: {audio.sample_width} bytes") print(f"dBFS: {audio.dBFS}") # Volume level
Step 3: Volume and Normalization
## Volume Adjustments # Increase volume by 6 dB louder = audio + 6 # Decrease volume by 3 dB quieter = audio - 3 # Normalize to target level (0 dB = maximum) normalized = normalize(audio) # Normalize to specific headroom def normalize_to_target(audio, target_dBFS=-16): """Normalize audio to target loudness.""" change_in_dBFS = target_dBFS - audio.dBFS return audio.apply_gain(change_in_dBFS) normalized = normalize_to_target(audio, target_dBFS=-16)
## Batch Normalization def normalize_folder(input_dir, output_dir, target_dBFS=-16): """Normalize all audio files in a folder.""" input_path = Path(input_dir) output_path = Path(output_dir) output_path.mkdir(exist_ok=True) for file in input_path.glob("*.mp3"): audio = AudioSegment.from_file(file) normalized = normalize_to_target(audio, target_dBFS) output_file = output_path / file.name normalized.export(output_file, format="mp3", bitrate="192k") print(f"Processed: {file.name}") # Usage normalize_folder("raw_episodes/", "processed_episodes/", target_dBFS=-16)
Step 4: Concatenation and Assembly
## Basic Concatenation intro = AudioSegment.from_file("intro.mp3") content = AudioSegment.from_file("episode.mp3") outro = AudioSegment.from_file("outro.mp3") # Concatenate (+ operator) full_episode = intro + content + outro # Add silence between segments silence = AudioSegment.silent(duration=2000) # 2 seconds full_episode = intro + silence + content + silence + outro full_episode.export("final_episode.mp3", format="mp3")
## Podcast Assembly Script def assemble_episode( content_file, intro_file="assets/intro.mp3", outro_file="assets/outro.mp3", output_file=None, intro_fade_ms=500, outro_fade_ms=500 ): """ Assemble podcast episode with intro and outro. Includes crossfade for professional sound. """ intro = AudioSegment.from_file(intro_file) content = AudioSegment.from_file(content_file) outro = AudioSegment.from_file(outro_file) # Apply fade out to intro, fade in to content intro = intro.fade_out(intro_fade_ms) content = content.fade_in(intro_fade_ms).fade_out(outro_fade_ms) outro = outro.fade_in(outro_fade_ms) # Crossfade join episode = intro.append(content, crossfade=intro_fade_ms) episode = episode.append(outro, crossfade=outro_fade_ms) # Generate output filename if not provided if output_file is None: output_file = content_file.replace(".mp3", "_final.mp3") episode.export(output_file, format="mp3", bitrate="192k") print(f"Assembled: {output_file} ({len(episode)/1000:.1f}s)") return output_file # Usage assemble_episode("episode_042_raw.mp3")
Step 5: Trimming and Splitting
## Time-Based Trimming # Extract segment (milliseconds) # audio[start:end] first_30_seconds = audio[:30000] last_minute = audio[-60000:] middle_section = audio[60000:120000] # Remove first 5 seconds (skip intro) without_intro = audio[5000:]
## Silence-Based Operations from pydub.silence import detect_silence, split_on_silence # Detect silence regions # Returns list of [start, end] in milliseconds silence_ranges = detect_silence( audio, min_silence_len=1000, # Minimum 1 second silence silence_thresh=-40 # dB threshold for "silence" ) # Split on silence (useful for chapter markers) chunks = split_on_silence( audio, min_silence_len=500, silence_thresh=-40, keep_silence=250 # Keep 250ms of silence on each side ) # Export chunks for i, chunk in enumerate(chunks): chunk.export(f"segment_{i:03d}.mp3", format="mp3")
## Trim Silence from Start/End def trim_silence(audio, silence_thresh=-50, chunk_size=10): """Remove silence from beginning and end of audio.""" # Find first non-silent moment start_trim = 0 for i in range(0, len(audio), chunk_size): if audio[i:i+chunk_size].dBFS > silence_thresh: start_trim = max(0, i - 100) # Keep 100ms before break # Find last non-silent moment end_trim = len(audio) for i in range(len(audio), 0, -chunk_size): if audio[i-chunk_size:i].dBFS > silence_thresh: end_trim = min(len(audio), i + 100) # Keep 100ms after break return audio[start_trim:end_trim]
Step 6: Format Conversion
## Batch Format Conversion def convert_folder(input_dir, output_dir, output_format="mp3", **export_kwargs): """ Convert all audio files to specified format. Example kwargs for MP3: bitrate="192k" Example kwargs for WAV: parameters=["-ac", "1"] # mono """ input_path = Path(input_dir) output_path = Path(output_dir) output_path.mkdir(exist_ok=True) # Supported input formats extensions = ["*.mp3", "*.wav", "*.flac", "*.ogg", "*.m4a"] for ext in extensions: for file in input_path.glob(ext): audio = AudioSegment.from_file(file) output_file = output_path / f"{file.stem}.{output_format}" audio.export(output_file, format=output_format, **export_kwargs) print(f"Converted: {file.name} → {output_file.name}") # Usage: Convert WAVs to MP3 convert_folder("recordings/", "mp3_output/", output_format="mp3", bitrate="192k") # Usage: Convert to mono WAV convert_folder("stereo/", "mono/", output_format="wav", parameters=["-ac", "1"])
Step 7: Complete Pipeline Example
## Full Podcast Processing Pipeline from pydub import AudioSegment from pydub.effects import normalize from pathlib import Path import json from datetime import datetime class PodcastProcessor: """ Complete podcast processing pipeline. Operations: 1. Normalize loudness 2. Trim silence 3. Add intro/outro 4. Export with metadata """ def __init__(self, config_file="podcast_config.json"): with open(config_file) as f: self.config = json.load(f) self.intro = AudioSegment.from_file(self.config["intro_file"]) self.outro = AudioSegment.from_file(self.config["outro_file"]) def process_episode(self, input_file, episode_number, title): """Process a single episode through the full pipeline.""" print(f"Processing Episode {episode_number}: {title}") # 1. Load and normalize audio = AudioSegment.from_file(input_file) target_db = self.config.get("target_loudness", -16) audio = self._normalize_to_target(audio, target_db) print(f" ✓ Normalized to {target_db} dBFS") # 2. Trim silence audio = self._trim_silence(audio) print(f" ✓ Trimmed silence (duration: {len(audio)/1000:.1f}s)") # 3. Add intro/outro with crossfade fade_ms = self.config.get("crossfade_ms", 500) intro = self.intro.fade_out(fade_ms) outro = self.outro.fade_in(fade_ms) audio = audio.fade_in(fade_ms).fade_out(fade_ms) final = intro.append(audio, crossfade=fade_ms) final = final.append(outro, crossfade=fade_ms) print(f" ✓ Added intro/outro (total: {len(final)/1000:.1f}s)") # 4. Export with metadata output_dir = Path(self.config["output_dir"]) output_dir.mkdir(exist_ok=True) filename = f"episode_{episode_number:03d}.mp3" output_path = output_dir / filename final.export( output_path, format="mp3", bitrate=self.config.get("bitrate", "192k"), tags={ "title": f"Episode {episode_number}: {title}", "artist": self.config["podcast_name"], "album": self.config["podcast_name"], "track": episode_number, "date": datetime.now().strftime("%Y"), } ) print(f" ✓ Exported: {output_path}") return output_path def _normalize_to_target(self, audio, target_dBFS): change = target_dBFS - audio.dBFS return audio.apply_gain(change) def _trim_silence(self, audio, threshold=-50): # Simple trim - just first/last 100ms chunks chunk_size = 100 start = 0 for i in range(0, min(5000, len(audio)), chunk_size): if audio[i:i+chunk_size].dBFS > threshold: start = max(0, i - 100) break end = len(audio) for i in range(len(audio), max(0, len(audio)-5000), -chunk_size): if audio[i-chunk_size:i].dBFS > threshold: end = min(len(audio), i + 100) break return audio[start:end] # Example config file (podcast_config.json): """ { "podcast_name": "Marketing Mastery", "intro_file": "assets/intro.mp3", "outro_file": "assets/outro.mp3", "output_dir": "processed_episodes/", "target_loudness": -16, "crossfade_ms": 500, "bitrate": "192k" } """ # Usage: # processor = PodcastProcessor("podcast_config.json") # processor.process_episode("raw/ep42.mp3", 42, "The Power of Audio")
Examples
Example 1: Batch Normalize Podcast Backlog
Context: 50 podcast episodes with inconsistent loudness levels.
Input:
Write a script to normalize all MP3 files in my episodes/ folder to -16 LUFS, keeping the originals as backup.
Output:
#!/usr/bin/env python3 """ Batch normalize podcast episodes to -16 dBFS. Creates backups before processing. """ from pydub import AudioSegment from pathlib import Path import shutil def batch_normalize( input_dir: str, target_dBFS: float = -16, create_backup: bool = True ): """ Normalize all MP3 files in directory to target loudness. Args: input_dir: Folder containing MP3 files target_dBFS: Target loudness level create_backup: If True, saves originals to backup/ folder """ input_path = Path(input_dir) # Create backup folder if needed if create_backup: backup_path = input_path / "backup" backup_path.mkdir(exist_ok=True) # Find all MP3 files mp3_files = list(input_path.glob("*.mp3")) print(f"Found {len(mp3_files)} MP3 files") for i, file in enumerate(mp3_files, 1): print(f"\n[{i}/{len(mp3_files)}] Processing: {file.name}") # Backup original if create_backup: shutil.copy(file, backup_path / file.name) # Load and check current level audio = AudioSegment.from_file(file) original_dBFS = audio.dBFS print(f" Original level: {original_dBFS:.1f} dBFS") # Calculate adjustment needed adjustment = target_dBFS - original_dBFS if abs(adjustment) < 0.5: print(f" Already at target, skipping") continue # Apply normalization normalized = audio.apply_gain(adjustment) # Export (overwrite original) normalized.export(file, format="mp3", bitrate="192k") print(f" Normalized: {original_dBFS:.1f} → {target_dBFS} dBFS") print(f"\n✓ Complete! Processed {len(mp3_files)} files") if create_backup: print(f" Backups saved to: {backup_path}") if __name__ == "__main__": import sys input_dir = sys.argv[1] if len(sys.argv) > 1 else "episodes/" batch_normalize(input_dir, target_dBFS=-16)
Usage:
python normalize_episodes.py episodes/
Example 2: Auto-Add Intro/Outro to New Episodes
Context: Weekly podcast needs intro/outro added to each new recording.
Input:
Create a script that watches a folder for new recordings and automatically adds my standard intro and outro.
Output:
#!/usr/bin/env python3 """ Auto-assemble podcast episodes when new recordings are added. Watch 'incoming/' folder, output to 'ready/' folder. """ from pydub import AudioSegment from pathlib import Path import time # Configuration WATCH_DIR = Path("incoming/") OUTPUT_DIR = Path("ready/") INTRO_FILE = Path("assets/intro.mp3") OUTRO_FILE = Path("assets/outro.mp3") CROSSFADE_MS = 500 TARGET_DBFS = -16 def assemble_episode(input_file: Path) -> Path: """Add intro/outro and normalize a podcast episode.""" print(f"\nProcessing: {input_file.name}") # Load audio intro = AudioSegment.from_file(INTRO_FILE) content = AudioSegment.from_file(input_file) outro = AudioSegment.from_file(OUTRO_FILE) # Normalize content to target adjustment = TARGET_DBFS - content.dBFS content = content.apply_gain(adjustment) print(f" Normalized: {content.dBFS:.1f} dBFS") # Apply fades intro = intro.fade_out(CROSSFADE_MS) content = content.fade_in(CROSSFADE_MS).fade_out(CROSSFADE_MS) outro = outro.fade_in(CROSSFADE_MS) # Assemble with crossfade episode = intro.append(content, crossfade=CROSSFADE_MS) episode = episode.append(outro, crossfade=CROSSFADE_MS) # Export output_file = OUTPUT_DIR / input_file.name episode.export(output_file, format="mp3", bitrate="192k") print(f" Created: {output_file} ({len(episode)/1000/60:.1f} min)") return output_file def watch_and_process(): """Watch folder and process new files.""" WATCH_DIR.mkdir(exist_ok=True) OUTPUT_DIR.mkdir(exist_ok=True) processed = set() print(f"Watching {WATCH_DIR} for new recordings...") print(f"Output to: {OUTPUT_DIR}") print("Press Ctrl+C to stop\n") while True: for file in WATCH_DIR.glob("*.mp3"): if file.name not in processed: try: assemble_episode(file) processed.add(file.name) # Move original to archive archive = WATCH_DIR / "processed" archive.mkdir(exist_ok=True) file.rename(archive / file.name) except Exception as e: print(f" Error: {e}") time.sleep(5) # Check every 5 seconds if __name__ == "__main__": watch_and_process()
Checklists & Templates
PyDub Project Setup
## Requirements □ Python 3.7+ installed □ pip install pydub □ FFmpeg installed and in PATH □ Test: python -c "from pydub import AudioSegment; print('OK')" ## Project Structure project/ ├── scripts/ │ ├── normalize.py │ ├── assemble.py │ └── convert.py ├── assets/ │ ├── intro.mp3 │ └── outro.mp3 ├── incoming/ # Raw recordings ├── processed/ # Final output └── config.json # Settings
Common Operations Reference
## Quick Reference # Load audio = AudioSegment.from_file("file.mp3") # Save audio.export("out.mp3", format="mp3", bitrate="192k") # Volume louder = audio + 6 # +6 dB quieter = audio - 3 # -3 dB # Trim first_30s = audio[:30000] # milliseconds last_min = audio[-60000:] # Concatenate combined = audio1 + audio2 + audio3 # Fade audio = audio.fade_in(500).fade_out(500) # Crossfade combined = audio1.append(audio2, crossfade=500) # Normalize from pydub.effects import normalize normalized = normalize(audio) # Silence silence = AudioSegment.silent(duration=2000) # Properties print(len(audio)) # duration in ms print(audio.dBFS) # volume level
Skill Boundaries
What This Skill Does Well
- Structuring audio production workflows
- Providing technical guidance
- Creating quality checklists
- Suggesting creative approaches
What This Skill Cannot Do
- Replace audio engineering expertise
- Make subjective creative decisions
- Access or edit audio files directly
- Guarantee commercial success
References
- PyDub. "GitHub Repository" (jiaaro/pydub) - Library documentation
- GeeksforGeeks. "Create an Audio Editor in Python Using PyDub"
- Real Python. "Working with Audio in Python"
- FFmpeg Documentation - Underlying encoder
Related Skills
- audio-editing - Manual editing fundamentals
- transcription-to-content - Processing transcripts
- podcast-production - Full podcast workflow
Skill Metadata (Internal Use)
name: pydub-automation category: audio subcategory: automation version: 1.0 author: MKTG Skills source_expert: PyDub Library source_work: jiaaro/pydub difficulty: intermediate estimated_value: Hours saved per batch (5-50 hours depending on scale) tags: [python, automation, audio, batch-processing, pydub] created: 2026-01-26 updated: 2026-01-26