Gsd-skill-creator audio-engineering

Audio engineering — mastering, mixing, EQ, compression, loudness standards, synthesis, podcast production, music theory, spectrum analysis.

install
source · Clone the upstream repo
git clone https://github.com/Tibsfox/gsd-skill-creator
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Tibsfox/gsd-skill-creator "$T" && mkdir -p ~/.claude/skills && cp -r "$T/examples/skills/media/audio-engineering" ~/.claude/skills/tibsfox-gsd-skill-creator-audio-engineering && rm -rf "$T"
manifest: examples/skills/media/audio-engineering/SKILL.md
source content

Audio Engineering Skill

Built on research from 30 Music cluster projects, 360 PNW musicians deep-dived (S36/SPS), Ableton Live research (ABL), Deep Audio (DAA), Dead Frequencies (DFQ), and High Fidelity amplifier analysis (HFR/HFE).

Expert-level audio engineering covering mastering, mixing, loudness standards, synthesis, podcast production, music theory, and spectrum analysis. Works alongside the

ffmpeg-media
skill for codec/format operations.

Loudness Standards

Target Levels by Platform

PlatformTarget LUFSTrue PeakStandard
Spotify-14 LUFS-1 dBTPAES streaming
Apple Music-16 LUFS-1 dBTPSound Check
YouTube-14 LUFS-1 dBTPITU-R BS.1770
Podcast (Apple)-16 LUFS-1 dBTPApple spec
Podcast (Spotify)-14 LUFS-1 dBTPSpotify spec
Broadcast TV-24 LUFS-2 dBTPEBU R128
Broadcast US-24 LKFS-2 dBTPATSC A/85
CD master-9 to -12 LUFS-0.3 dBTPRed Book
Film/Cinema-24 LUFS-1 dBTPSMPTE RP 200

Measurement Commands

# Measure integrated loudness (LUFS) with ffmpeg
ffmpeg -i input.wav -af loudnorm=print_format=json -f null - 2>&1 | grep -A20 "Parsed_loudnorm"

# Full EBU R128 scan
ffmpeg -i input.wav -af ebur128=peak=true -f null - 2>&1 | tail -20

# Loudness normalization to -14 LUFS (two-pass for accuracy)
# Pass 1: measure
ffmpeg -i input.wav -af loudnorm=I=-14:LRA=11:TP=-1:print_format=json -f null - 2>&1 > /tmp/loudnorm.json
# Pass 2: apply (use measured values from pass 1)
ffmpeg -i input.wav -af loudnorm=I=-14:LRA=11:TP=-1:measured_I=-18.5:measured_LRA=9.2:measured_TP=-0.5:measured_thresh=-28.3 output.wav

With sox

# Normalize peak to -1 dBFS
sox input.wav output.wav gain -n -1

# Compressor (threshold -20dB, ratio 4:1, attack 5ms, release 50ms)
sox input.wav output.wav compand 0.005,0.05 -20,-20,-10,-10,0,-6

# 3-band EQ (low shelf +3dB at 200Hz, mid cut -2dB at 2kHz, high shelf +1dB at 8kHz)
sox input.wav output.wav bass +3 200 equalizer 2000 1q -2 treble +1 8000

# Noise reduction (profile then reduce)
sox noisy.wav -n noiseprof /tmp/noise.prof
sox noisy.wav clean.wav noisered /tmp/noise.prof 0.21

# Generate tone (440Hz sine, 3 seconds)
sox -n -r 44100 -c 1 tone.wav synth 3 sine 440

# Spectrum analysis (generate spectrogram PNG)
sox input.wav -n spectrogram -o spectrum.png

Mastering Chain

Standard Mastering Signal Flow

Input → EQ (corrective) → Compression → EQ (tonal) → Stereo Width → Limiting → Dithering → Output

With ffmpeg Filters

# Full mastering chain: EQ → compression → limiting → loudness normalization
ffmpeg -i mix.wav -af "\
  equalizer=f=80:t=h:w=100:g=2,\
  equalizer=f=3000:t=h:w=1000:g=-1.5,\
  equalizer=f=12000:t=h:w=2000:g=1,\
  acompressor=threshold=-18dB:ratio=3:attack=10:release=100:knee=6,\
  alimiter=limit=-1dBFS:level=false,\
  loudnorm=I=-14:LRA=11:TP=-1\
" -ar 44100 -sample_fmt s16 mastered.wav

# Dithering (16-bit with triangular dither for CD)
ffmpeg -i master_24bit.wav -af "dither=method=triangular" -sample_fmt s16 -ar 44100 cd_master.wav

EQ Reference

Frequency Bands and Characteristics

BandRangeCharacterCommon Uses
Sub-bass20-60 HzFelt, not heardKick fundamental, sub bass
Bass60-250 HzWarmth, bodyBass guitar, kick punch, vocal warmth
Low-mid250-500 HzMuddiness zoneCut here to clean up mixes
Mid500-2000 HzBody, presenceVocal clarity, guitar body
Upper-mid2-4 kHzPresence, biteVocal intelligibility, guitar attack
Presence4-6 kHzDefinition, edgeConsonant clarity, string attack
Brilliance6-12 kHzAir, shimmerCymbals, vocal air, acoustic sparkle
Ultra-high12-20 kHzAir, sparkleSubtle sheen (careful: sibilance)

Common Problem Frequencies

  • 200-300 Hz — boominess in vocals, acoustic guitar
  • 400-600 Hz — cardboard/boxy sound
  • 1-2 kHz — nasal, telephone quality
  • 3-5 kHz — harshness, listening fatigue
  • 6-8 kHz — sibilance (de-ess here)

Compression Reference

Settings by Source

SourceThresholdRatioAttackReleaseKnee
Vocals-18 to -12 dB2:1 to 4:15-15 ms40-80 msSoft
Drums (bus)-15 to -10 dB3:1 to 6:110-30 ms50-100 msHard
Bass-15 to -8 dB3:1 to 8:110-30 ms100-200 msHard
Acoustic guitar-20 to -12 dB2:1 to 4:110-25 ms100-150 msSoft
Mix bus-20 to -15 dB1.5:1 to 2:110-30 ms100-300 msSoft
Podcast-20 to -15 dB3:1 to 5:15-10 ms50-100 msSoft

Compression Types

  • VCA — fast, transparent, precise (SSL, dbx 160)
  • Optical — smooth, musical, slow (LA-2A, CL 1B)
  • FET — aggressive, colorful, fast (1176, Distressor)
  • Variable-mu — warm, glue, gentle (Fairchild 670, Manley Vari-Mu)

Synthesis Reference

Synthesis Types

TypeHow It WorksCharacterClassic Synths
SubtractiveOscillator → Filter → AmplifierWarm, analog, richMinimoog, Prophet-5, Juno-106
FMOperators modulating each other's frequencyMetallic, bell-like, brightDX7, FM8
WavetableMorphing between stored waveformsEvolving, complex, modernPPG Wave, Serum, Vital
GranularTiny audio grains layered and scatteredAtmospheric, textural, ambientGranulator, Pigments
AdditiveSum of individual sine wave partialsPrecise, organ-likeKawai K5, Razor
Physical modelingMathematical model of physical instrumentRealistic, expressiveChromaphone, Pianoteq
Sample-basedRecorded audio, pitch-shifted and layeredRealistic, naturalKontakt, Sampler

ADSR Envelope Quick Reference

  • Pad: A=500ms, D=200ms, S=0.8, R=1000ms
  • Pluck: A=1ms, D=200ms, S=0, R=100ms
  • Bass: A=5ms, D=100ms, S=0.6, R=50ms
  • Lead: A=10ms, D=50ms, S=0.7, R=200ms
  • Kick drum: A=0ms, D=150ms, S=0, R=50ms

Music Theory Quick Reference

Circle of Fifths (Major Keys)

        C
    F       G
  Bb          D
    Eb      A
       Ab/E

Common Chord Progressions

NameNumeralsExample in CUse
PopI-V-vi-IVC-G-Am-F80% of pop music
BluesI-IV-VC-F-GBlues, rock
Jazz ii-V-Iii-V-IDm7-G7-Cmaj7Jazz standard
Andalusiani-VII-VI-VAm-G-F-EFlamenco, dramatic
CanonI-V-vi-iii-IV-I-IV-VC-G-Am-Em-F-C-F-GPachelbel, ballads
Minor bluesi-iv-VAm-Dm-EMinor blues

Scales

  • Major (Ionian): W-W-H-W-W-W-H
  • Natural Minor (Aeolian): W-H-W-W-H-W-W
  • Pentatonic Major: 1-2-3-5-6
  • Pentatonic Minor: 1-b3-4-5-b7
  • Blues: 1-b3-4-#4-5-b7
  • Dorian: W-H-W-W-W-H-W (minor with raised 6th — jazz, funk)
  • Mixolydian: W-W-H-W-W-H-W (major with flat 7th — blues rock)

Podcast Production Workflow

Recording

# Record from default mic (sox)
sox -d -r 44100 -c 1 -b 16 recording.wav

# Record with ffmpeg (specify ALSA device on Linux)
ffmpeg -f alsa -i default -ar 44100 -ac 1 recording.wav

Processing Chain

# 1. Noise reduction
sox recording.wav -n trim 0 0.5 noiseprof /tmp/noise.prof
sox recording.wav clean.wav noisered /tmp/noise.prof 0.21

# 2. Normalize + compress + EQ for voice
ffmpeg -i clean.wav -af "\
  highpass=f=80,\
  lowpass=f=12000,\
  equalizer=f=3000:t=h:w=1000:g=2,\
  acompressor=threshold=-20dB:ratio=4:attack=5:release=50,\
  loudnorm=I=-16:TP=-1\
" -ar 44100 podcast_ready.wav

# 3. Export MP3 for distribution
ffmpeg -i podcast_ready.wav -c:a libmp3lame -b:a 128k \
  -metadata title="Episode Title" \
  -metadata artist="Show Name" \
  -metadata album="Podcast Name" \
  -metadata genre="Podcast" \
  episode.mp3

# 4. Generate waveform for show notes
ffmpeg -i episode.mp3 -filter_complex "showwavespic=s=1920x200:colors=0x1a1a2e" -frames:v 1 waveform.png

ID3 Tags

# Set all metadata
ffmpeg -i episode.mp3 -c copy \
  -metadata title="EP 42: The Memory Architecture" \
  -metadata artist="GSD Podcast" \
  -metadata album="Getting Shit Done" \
  -metadata track="42" \
  -metadata date="2026" \
  -metadata comment="LOD-tiered memory system deep dive" \
  tagged.mp3

BPM and Key Detection

With ffmpeg/aubio

# Install aubio for beat/pitch detection
# apt install aubio-tools

# BPM detection
aubiotempo input.wav

# Pitch/key detection
aubiopitch -i input.wav -p yinfft

# Onset detection (transient markers)
aubioonset input.wav

With sox

# Generate stats (includes RMS, peak, DC offset)
sox input.wav -n stats 2>&1

Sample Rate / Bit Depth Reference

FormatSample RateBit DepthUse
CD44.1 kHz16-bitConsumer playback
DVD48 kHz24-bitVideo soundtrack
Hi-Res96 kHz24-bitAudiophile streaming
Studio96-192 kHz32-bit floatRecording/mixing
Podcast44.1 kHz16-bitVoice distribution
Phone/VoIP8-16 kHz16-bitVoice calls

Conversion

# Downsample from 96kHz/24-bit to 44.1kHz/16-bit with dither
sox input_96_24.wav -r 44100 -b 16 output_441_16.wav dither -s

# Same with ffmpeg
ffmpeg -i input_96_24.wav -ar 44100 -sample_fmt s16 -af "dither=method=triangular" output.wav

Related Skills & Agents

  • ffmpeg-media — codec/format operations, video+audio conversion
  • ffmpeg-processor agent — media processing specialist
  • gource-visualizer — repository visualization with audio sync capability
  • Audio research: ABL, DAA, DFQ, HFR, HFE, S36/SPS (360 musicians)

When This Skill Activates

  • Audio mastering, mixing, EQ, compression
  • Loudness measurement and normalization (LUFS, EBU R128)
  • Podcast recording, editing, production
  • Music theory questions (chords, scales, progressions)
  • Synthesis design (FM, subtractive, granular, wavetable)
  • Spectrum analysis and audio visualization
  • Sample rate/bit depth conversion
  • Noise reduction and audio cleanup
  • BPM/key detection