Gsd-skill-creator audio-engineering

Audio engineering — mastering, mixing, EQ, compression, loudness standards, synthesis, podcast production, music theory, spectrum analysis.

install

source · Clone the upstream repo

git clone https://github.com/Tibsfox/gsd-skill-creator

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/Tibsfox/gsd-skill-creator "$T" && mkdir -p ~/.claude/skills && cp -r "$T/examples/skills/media/audio-engineering" ~/.claude/skills/tibsfox-gsd-skill-creator-audio-engineering && rm -rf "$T"

manifest: examples/skills/media/audio-engineering/SKILL.md

source content

Audio Engineering Skill

Built on research from 30 Music cluster projects, deep dives into 360 PNW musicians (S36/SPS), Ableton Live research (ABL), Deep Audio (DAA), Dead Frequencies (DFQ), and High Fidelity amplifier analysis (HFR/HFE).

Expert-level audio engineering covering mastering, mixing, loudness standards, synthesis, podcast production, music theory, and spectrum analysis. Works alongside the ffmpeg-media skill for codec/format operations.

Loudness Standards

Target Levels by Platform

Platform	Target LUFS	True Peak	Standard
Spotify	-14 LUFS	-1 dBTP	AES streaming
Apple Music	-16 LUFS	-1 dBTP	Sound Check
YouTube	-14 LUFS	-1 dBTP	ITU-R BS.1770
Podcast (Apple)	-16 LUFS	-1 dBTP	Apple spec
Podcast (Spotify)	-14 LUFS	-1 dBTP	Spotify spec
Broadcast TV	-23 LUFS	-1 dBTP	EBU R128
Broadcast US	-24 LKFS	-2 dBTP	ATSC A/85
CD master	-9 to -12 LUFS	-0.3 dBTP	Red Book
Film/Cinema	-24 LUFS	-1 dBTP	SMPTE RP 200
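
Reaching one of the targets above with static gain is simple arithmetic: the gain is the target minus the measured integrated loudness, and the true peak shifts by the same amount. The sketch below is an illustration only; the function name `gain_to_target` and its argument order are hypothetical, and it assumes plain linear gain with no limiter in the path.

```shell
# gain_to_target MEASURED_LUFS MEASURED_TP TARGET_LUFS CEILING_DBTP
# Prints the static gain (dB) needed to reach the target, the resulting
# true peak, and whether that peak would exceed the ceiling.
# Hypothetical helper: assumes linear gain only, no limiting.
gain_to_target() {
  awk -v i="$1" -v tp="$2" -v target="$3" -v limit="$4" 'BEGIN {
    gain = target - i          # dB of gain needed to hit the target
    new_tp = tp + gain         # true peak moves by the same amount
    verdict = (new_tp > limit) ? "needs-limiting" : "ok"
    printf "%+.1f dB gain, peak %+.1f dBTP, %s\n", gain, new_tp, verdict
  }'
}

gain_to_target -18.5 -0.5 -14 -1   # quiet master pushed up to a -14 LUFS target
gain_to_target -12.0 -3.0 -16 -1   # loud master pulled down to a -16 LUFS target
```

When the verdict is `needs-limiting`, static gain alone would overshoot the true-peak ceiling, so a limiter or a dynamic normalizer is needed.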

Measurement Commands

# Measure integrated loudness (LUFS) with ffmpeg
ffmpeg -i input.wav -af loudnorm=print_format=json -f null - 2>&1 | grep -A20 "Parsed_loudnorm"

# Full EBU R128 scan
ffmpeg -i input.wav -af ebur128=peak=true -f null - 2>&1 | tail -20

# Loudness normalization to -14 LUFS (two-pass for accuracy)
# Pass 1: measure (loudnorm prints its JSON report to stderr; keep only the JSON block)
ffmpeg -hide_banner -nostats -i input.wav -af loudnorm=I=-14:LRA=11:TP=-1:print_format=json -f null - 2>&1 | sed -n '/^{/,/^}/p' > /tmp/loudnorm.json
# Pass 2: apply (replace the example measured_* values with the ones reported by pass 1)
ffmpeg -i input.wav -af loudnorm=I=-14:LRA=11:TP=-1:measured_I=-18.5:measured_LRA=9.2:measured_TP=-0.5:measured_thresh=-28.3 output.wav
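
The measured values for pass 2 do not have to be copied by hand. The sketch below builds the pass-2 filter string from a saved pass-1 report. It assumes the file contains only the flat JSON object that loudnorm prints (keys such as `input_i` and `target_offset`); the helper names `json_field` and `loudnorm_pass2_filter` are hypothetical, and the sed-based lookup is not a general JSON parser.

```shell
# json_field FILE KEY
# Print the string value of KEY from a flat JSON object such as the
# loudnorm report. Hypothetical helper, not a general JSON parser.
json_field() {
  sed -n 's/.*"'"$2"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# loudnorm_pass2_filter FILE [I] [LRA] [TP]
# Build the pass-2 loudnorm filter string from the pass-1 report.
# Targets default to the values used in the commands above.
loudnorm_pass2_filter() {
  json=$1
  printf 'loudnorm=I=%s:LRA=%s:TP=%s:measured_I=%s:measured_LRA=%s:measured_TP=%s:measured_thresh=%s:offset=%s:linear=true\n' \
    "${2:--14}" "${3:-11}" "${4:--1}" \
    "$(json_field "$json" input_i)" \
    "$(json_field "$json" input_lra)" \
    "$(json_field "$json" input_tp)" \
    "$(json_field "$json" input_thresh)" \
    "$(json_field "$json" target_offset)"
}

# Usage (requires ffmpeg and a saved pass-1 report):
#   ffmpeg -i input.wav -af "$(loudnorm_pass2_filter /tmp/loudnorm.json)" output.wav
```

Passing `offset` and `linear=true` along with the measured values lets loudnorm apply linear gain when the targets allow it, instead of falling back to dynamic processing.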

With sox

# Normalize peak to -1 dBFS
sox input.wav output.wav gain -n -1

# Compressor (threshold -20dB, ratio 4:1, attack 5ms, release 50ms)
sox input.wav output.wav compand 0.005,0.05 -20,-20,0,-15

# 3-band EQ (low shelf +3dB at 200Hz, mid cut -2dB at 2kHz, high shelf +1dB at 8kHz)
sox input.wav output.wav bass +3 200 equalizer 2000 1q -2 treble +1 8000

# Noise reduction (profile then reduce)
sox noisy.wav -n noiseprof /tmp/noise.prof
sox noisy.wav clean.wav noisered /tmp/noise.prof 0.21

# Generate tone (440Hz sine, 3 seconds)
sox -n -r 44100 -c 1 tone.wav synth 3 sine 440

# Spectrum analysis (generate spectrogram PNG)
sox input.wav -n spectrogram -o spectrum.png
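
The compand transfer function used for the compressor above is a list of input,output level pairs in dB. For a hard-knee compressor with threshold T and ratio R, the signal passes at unity up to T, and an input at 0 dB comes out at T - T/R. The sketch below derives those points; `compand_points` is a hypothetical helper name, and it assumes a single hard knee with no make-up gain.

```shell
# compand_points THRESHOLD_DB RATIO
# Print a SoX compand transfer function: unity gain up to the threshold,
# then RATIO:1 compression from the threshold up to 0 dB input.
# Hypothetical helper: single hard knee, no make-up gain.
compand_points() {
  awk -v t="$1" -v r="$2" 'BEGIN { printf "%g,%g,0,%g\n", t, t, t - t / r }'
}

compand_points -20 4    # prints -20,-20,0,-15
compand_points -18 3    # prints -18,-18,0,-12

# Usage with sox (attack 5 ms, decay 50 ms):
#   sox input.wav output.wav compand 0.005,0.05 "$(compand_points -20 4)"
```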

Mastering Chain

Standard Mastering Signal Flow

Input → EQ (corrective) → Compression → EQ (tonal) → Stereo Width → Limiting → Dithering → Output

With ffmpeg Filters

# Full mastering chain: EQ → compression → limiting → loudness normalization
ffmpeg -i mix.wav -af "\
  equalizer=f=80:t=h:w=100:g=2,\
  equalizer=f=3000:t=h:w=1000:g=-1.5,\
  equalizer=f=12000:t=h:w=2000:g=1,\
  acompressor=threshold=-18dB:ratio=3:attack=10:release=100:knee=6,\
  alimiter=limit=-1dB:level=false,\
  loudnorm=I=-14:LRA=11:TP=-1\
" -ar 44100 -sample_fmt s16 mastered.wav

# Dithering (16-bit with triangular dither for CD; ffmpeg applies dither through the aresample filter)
ffmpeg -i master_24bit.wav -af "aresample=osr=44100:osf=s16:dither_method=triangular" -c:a pcm_s16le cd_master.wav

EQ Reference

Frequency Bands and Characteristics

Band	Range	Character	Common Uses
Sub-bass	20-60 Hz	Felt, not heard	Kick fundamental, sub bass
Bass	60-250 Hz	Warmth, body	Bass guitar, kick punch, vocal warmth
Low-mid	250-500 Hz	Muddiness zone	Cut here to clean up mixes
Mid	500-2000 Hz	Body, presence	Vocal clarity, guitar body
Upper-mid	2-4 kHz	Presence, bite	Vocal intelligibility, guitar attack
Presence	4-6 kHz	Definition, edge	Consonant clarity, string attack
Brilliance	6-12 kHz	Air, shimmer	Cymbals, vocal air, acoustic sparkle
Ultra-high	12-20 kHz	Air, sparkle	Subtle sheen (careful: sibilance)

Common Problem Frequencies

200-300 Hz — boominess in vocals, acoustic guitar
400-600 Hz — cardboard/boxy sound
1-2 kHz — nasal, telephone quality
3-5 kHz — harshness, listening fatigue
6-8 kHz — sibilance (de-ess here)

Compression Reference

Settings by Source

Source	Threshold	Ratio	Attack	Release	Knee
Vocals	-18 to -12 dB	2:1 to 4:1	5-15 ms	40-80 ms	Soft
Drums (bus)	-15 to -10 dB	3:1 to 6:1	10-30 ms	50-100 ms	Hard
Bass	-15 to -8 dB	3:1 to 8:1	10-30 ms	100-200 ms	Hard
Acoustic guitar	-20 to -12 dB	2:1 to 4:1	10-25 ms	100-150 ms	Soft
Mix bus	-20 to -15 dB	1.5:1 to 2:1	10-30 ms	100-300 ms	Soft
Podcast	-20 to -15 dB	3:1 to 5:1	5-10 ms	50-100 ms	Soft

Compression Types

VCA — fast, transparent, precise (SSL, dbx 160)
Optical — smooth, musical, slow (LA-2A, CL 1B)
FET — aggressive, colorful, fast (1176, Distressor)
Variable-mu — warm, glue, gentle (Fairchild 670, Manley Vari-Mu)

Synthesis Reference

Synthesis Types

Type	How It Works	Character	Classic Synths
Subtractive	Oscillator → Filter → Amplifier	Warm, analog, rich	Minimoog, Prophet-5, Juno-106
FM	Operators modulating each other's frequency	Metallic, bell-like, bright	DX7, FM8
Wavetable	Morphing between stored waveforms	Evolving, complex, modern	PPG Wave, Serum, Vital
Granular	Tiny audio grains layered and scattered	Atmospheric, textural, ambient	Granulator, Pigments
Additive	Sum of individual sine wave partials	Precise, organ-like	Kawai K5, Razor
Physical modeling	Mathematical model of physical instrument	Realistic, expressive	Chromaphone, Pianoteq
Sample-based	Recorded audio, pitch-shifted and layered	Realistic, natural	Kontakt, Sampler

ADSR Envelope Quick Reference

Pad: A=500ms, D=200ms, S=0.8, R=1000ms
Pluck: A=1ms, D=200ms, S=0, R=100ms
Bass: A=5ms, D=100ms, S=0.6, R=50ms
Lead: A=10ms, D=50ms, S=0.7, R=200ms
Kick drum: A=0ms, D=150ms, S=0, R=50ms
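
To make the presets above concrete, the sketch below evaluates a piecewise-linear ADSR envelope at a given time. It is an illustration under stated assumptions: segments are linear (many synths use exponential curves), the note is held for `GATE` milliseconds before release starts, `GATE` is at least A + D, and the function name `adsr_level` is hypothetical.

```shell
# adsr_level T A D S R GATE
# T, A, D, R and GATE are in milliseconds; S is the sustain level (0..1).
# Prints the envelope level (0..1) at time T for a linear ADSR.
# Hypothetical helper: assumes GATE >= A + D.
adsr_level() {
  awk -v t="$1" -v a="$2" -v d="$3" -v s="$4" -v r="$5" -v g="$6" 'BEGIN {
    if (t < 0)            lvl = 0                           # before note-on
    else if (t < a)       lvl = t / a                       # attack: 0 -> 1
    else if (t < a + d)   lvl = 1 - (1 - s) * (t - a) / d   # decay: 1 -> S
    else if (t < g)       lvl = s                           # sustain
    else if (t < g + r)   lvl = s * (1 - (t - g) / r)       # release: S -> 0
    else                  lvl = 0                           # after release
    printf "%.2f\n", lvl
  }'
}

# Pad preset (A=500, D=200, S=0.8, R=1000) held for 2000 ms
adsr_level 250  500 200 0.8 1000 2000   # halfway through the attack: 0.50
adsr_level 600  500 200 0.8 1000 2000   # halfway through the decay: 0.90
adsr_level 2500 500 200 0.8 1000 2000   # halfway through the release: 0.40

# Kick preset (A=0, D=150, S=0, R=50) held for 150 ms
adsr_level 75 0 150 0 50 150            # halfway through the decay: 0.50
```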

Music Theory Quick Reference

Circle of Fifths (Major Keys)

            C
       F         G
    Bb              D
   Eb                A
    Ab              E
       Db        B
          F#/Gb

Common Chord Progressions

Name	Numerals	Example in C	Use
Pop	I-V-vi-IV	C-G-Am-F	Very common in pop
Blues	I-IV-V	C-F-G	Blues, rock
Jazz ii-V-I	ii-V-I	Dm7-G7-Cmaj7	Jazz standard
Andalusian	i-VII-VI-V	Am-G-F-E	Flamenco, dramatic
Canon	I-V-vi-iii-IV-I-IV-V	C-G-Am-Em-F-C-F-G	Pachelbel, ballads
Minor blues	i-iv-V	Am-Dm-E	Minor blues
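
The major-key rows of the table can be reproduced mechanically. The sketch below maps Roman numerals to diatonic triads in C major only: uppercase numerals become major triads, lowercase become minor, and lowercase vii is labeled diminished. The function name `numerals_to_chords` is hypothetical, and the sketch outputs plain triads, so the jazz row comes out as Dm-G-C rather than the seventh chords shown in the table. Minor-key rows (Andalusian, minor blues) are out of scope.

```shell
# numerals_to_chords "I-V-vi-IV"
# Map hyphen-separated Roman numerals to diatonic triads in C major.
# Hypothetical helper: C major only, triads only.
numerals_to_chords() {
  printf '%s\n' "$1" | awk -F- '
    BEGIN {
      split("C D E F G A B", note, " ")
      deg["I"] = 1; deg["II"] = 2; deg["III"] = 3; deg["IV"] = 4
      deg["V"] = 5; deg["VI"] = 6; deg["VII"] = 7
    }
    {
      out = ""
      for (i = 1; i <= NF; i++) {
        up = toupper($i)
        if (!(up in deg)) {
          print "unknown numeral: " $i > "/dev/stderr"
          exit 1
        }
        # Uppercase = major; lowercase = minor, except vii (diminished)
        quality = ($i == up) ? "" : (up == "VII" ? "dim" : "m")
        out = out (i > 1 ? "-" : "") note[deg[up]] quality
      }
      print out
    }'
}

numerals_to_chords "I-V-vi-IV"              # prints C-G-Am-F
numerals_to_chords "I-V-vi-iii-IV-I-IV-V"   # prints C-G-Am-Em-F-C-F-G
```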

Scales

Major (Ionian): W-W-H-W-W-W-H
Natural Minor (Aeolian): W-H-W-W-H-W-W
Pentatonic Major: 1-2-3-5-6
Pentatonic Minor: 1-b3-4-5-b7
Blues: 1-b3-4-#4-5-b7
Dorian: W-H-W-W-W-H-W (minor with raised 6th — jazz, funk)
Mixolydian: W-W-H-W-W-H-W (major with flat 7th — blues rock)
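
The W/H step patterns above translate directly into note names by walking the 12-note chromatic scale, two semitones for W and one for H. The sketch below does that; `build_scale` is a hypothetical helper, and as a simplification it spells every accidental as a sharp, so flat keys will not use conventional spellings.

```shell
# build_scale ROOT PATTERN
# ROOT is a note name spelled with sharps (C, C#, D, ...).
# PATTERN is a hyphen-separated list of W (whole) and H (half) steps.
# Hypothetical helper: accidentals are always spelled as sharps.
build_scale() {
  awk -v root="$1" -v pattern="$2" 'BEGIN {
    n = split("C C# D D# E F F# G G# A A# B", note, " ")
    pos = 0
    for (i = 1; i <= n; i++) if (note[i] == root) pos = i
    if (pos == 0) { print "unknown root: " root > "/dev/stderr"; exit 1 }
    out = note[pos]
    steps = split(pattern, step, "-")
    for (i = 1; i <= steps; i++) {
      pos += (step[i] == "W") ? 2 : 1          # W = 2 semitones, H = 1
      out = out " " note[(pos - 1) % 12 + 1]   # wrap around the octave
    }
    print out
  }'
}

build_scale C "W-W-H-W-W-W-H"   # major:       C D E F G A B C
build_scale A "W-H-W-W-H-W-W"   # nat. minor:  A B C D E F G A
build_scale D "W-H-W-W-W-H-W"   # Dorian:      D E F G A B C D
build_scale G "W-W-H-W-W-H-W"   # Mixolydian:  G A B C D E F G
```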

Podcast Production Workflow

Recording

# Record from default mic (sox)
sox -d -r 44100 -c 1 -b 16 recording.wav

# Record with ffmpeg (specify ALSA device on Linux)
ffmpeg -f alsa -i default -ar 44100 -ac 1 recording.wav

Processing Chain

# 1. Noise reduction
sox recording.wav -n trim 0 0.5 noiseprof /tmp/noise.prof
sox recording.wav clean.wav noisered /tmp/noise.prof 0.21

# 2. Normalize + compress + EQ for voice
ffmpeg -i clean.wav -af "\
  highpass=f=80,\
  lowpass=f=12000,\
  equalizer=f=3000:t=h:w=1000:g=2,\
  acompressor=threshold=-20dB:ratio=4:attack=5:release=50,\
  loudnorm=I=-16:TP=-1\
" -ar 44100 podcast_ready.wav

# 3. Export MP3 for distribution
ffmpeg -i podcast_ready.wav -c:a libmp3lame -b:a 128k \
  -metadata title="Episode Title" \
  -metadata artist="Show Name" \
  -metadata album="Podcast Name" \
  -metadata genre="Podcast" \
  episode.mp3

# 4. Generate waveform for show notes
ffmpeg -i episode.mp3 -filter_complex "showwavespic=s=1920x200:colors=0x1a1a2e" -frames:v 1 waveform.png

ID3 Tags

# Set all metadata
ffmpeg -i episode.mp3 -c copy \
  -metadata title="EP 42: The Memory Architecture" \
  -metadata artist="GSD Podcast" \
  -metadata album="Getting Shit Done" \
  -metadata track="42" \
  -metadata date="2026" \
  -metadata comment="LOD-tiered memory system deep dive" \
  tagged.mp3

BPM and Key Detection

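With aubio

# Install aubio for beat/pitch detection
# apt install aubio-tools
# The `aubio` command used below ships with the Python package: pip install aubio

# BPM detection
aubio tempo input.wav

# Pitch tracking (fundamental frequency per frame; this does not estimate the musical key)
aubiopitch -i input.wav -p yinfft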

# Onset detection (transient markers)
aubioonset input.wav

With sox

# Generate stats (includes RMS, peak, DC offset)
sox input.wav -n stats 2>&1

Sample Rate / Bit Depth Reference

Format	Sample Rate	Bit Depth	Use
CD	44.1 kHz	16-bit	Consumer playback
DVD	48 kHz	24-bit	Video soundtrack
Hi-Res	96 kHz	24-bit	Audiophile streaming
Studio	96-192 kHz	32-bit float	Recording/mixing
Podcast	44.1 kHz	16-bit	Voice distribution
Phone/VoIP	8-16 kHz	16-bit	Voice calls
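
The formats above differ a lot in storage cost. Uncompressed PCM size is sample rate × bit depth / 8 × channels × seconds. The sketch below computes it with shell integer arithmetic; `pcm_bytes` is a hypothetical helper, and the result excludes container headers such as the WAV header.

```shell
# pcm_bytes RATE_HZ BIT_DEPTH CHANNELS SECONDS
# Print the size in bytes of uncompressed PCM audio (headers excluded).
# Hypothetical helper using integer arithmetic only.
pcm_bytes() {
  echo $(( $1 * $2 / 8 * $3 * $4 ))
}

pcm_bytes 44100 16 2 60     # CD-quality stereo, one minute: 10584000
pcm_bytes 96000 24 2 60     # 96 kHz / 24-bit stereo, one minute: 34560000
pcm_bytes 44100 16 1 3600   # mono podcast, one hour: 317520000
```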

Conversion

# Downsample from 96kHz/24-bit to 44.1kHz/16-bit with dither
sox input_96_24.wav -r 44100 -b 16 output_441_16.wav dither -s

# Same with ffmpeg (dither is applied by the aresample filter)
ffmpeg -i input_96_24.wav -af "aresample=osr=44100:osf=s16:dither_method=triangular" -c:a pcm_s16le output.wav

Related Skills & Agents

ffmpeg-media — codec/format operations, video+audio conversion
ffmpeg-processor agent — media processing specialist
gource-visualizer — repository visualization with audio sync capability
Audio research: ABL, DAA, DFQ, HFR, HFE, S36/SPS (360 musicians)

When This Skill Activates

Audio mastering, mixing, EQ, compression
Loudness measurement and normalization (LUFS, EBU R128)
Podcast recording, editing, production
Music theory questions (chords, scales, progressions)
Synthesis design (FM, subtractive, granular, wavetable)
Spectrum analysis and audio visualization
Sample rate/bit depth conversion
Noise reduction and audio cleanup
BPM/key detection