# Skillshub: audio-voice-recovery

Forensic Audio Research: Audio Voice Recovery Best Practices

Install by cloning the repository:

```shell
git clone https://github.com/ComeOnOliver/skillshub
```

Or copy the skill into your local skills directory in one step:

```shell
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/pproenca/dot-skills/audio-voice-recovery" ~/.claude/skills/comeonoliver-skillshub-audio-voice-recovery && rm -rf "$T"
```

Source: `skills/pproenca/dot-skills/audio-voice-recovery/SKILL.md`
Comprehensive audio forensics and voice recovery guide providing CSI-level capabilities for recovering voice from low-quality, low-volume, or damaged audio recordings. Contains 45 rules across 8 categories, prioritized by impact to guide audio enhancement, forensic analysis, and transcription workflows.
## When to Apply
Reference these guidelines when:
- Recovering voice from noisy or low-quality recordings
- Enhancing audio for transcription or legal evidence
- Performing forensic audio authentication
- Analyzing recordings for tampering or splices
- Building automated audio processing pipelines
- Transcribing difficult or degraded speech
## Rule Categories by Priority

| Priority | Category | Impact | Prefix | Rules |
|---|---|---|---|---|
| 1 | Signal Preservation & Analysis | CRITICAL | `signal-` | 5 |
| 2 | Noise Profiling & Estimation | CRITICAL | `noise-` | 5 |
| 3 | Spectral Processing | HIGH | `spectral-` | 6 |
| 4 | Voice Isolation & Enhancement | HIGH | `voice-` | 7 |
| 5 | Temporal Processing | MEDIUM-HIGH | `temporal-` | 5 |
| 6 | Transcription & Recognition | MEDIUM | `transcribe-` | 5 |
| 7 | Forensic Authentication | MEDIUM | `forensic-` | 5 |
| 8 | Tool Integration & Automation | LOW-MEDIUM | `tool-` | 7 |
## Quick Reference
### 1. Signal Preservation & Analysis (CRITICAL)
- Never modify original recordings (`signal-preserve-original`)
- Use lossless formats for processing (`signal-lossless-format`)
- Preserve native sample rate (`signal-sample-rate`)
- Use maximum bit depth for processing (`signal-bit-depth`)
- Analyze before processing (`signal-analyze-first`)
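As a minimal illustration of these rules, the sketch below fingerprints a WAV file and records its native parameters before anything is touched. It uses only the Python standard library; `analyze_wav` is a hypothetical helper for this guide, not one of the bundled scripts.

```python
import hashlib
import wave

def analyze_wav(path):
    """Record evidence identity and signal parameters before any processing."""
    # SHA-256 fingerprint lets you prove the original was never modified
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    with wave.open(path, "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return {
            "sha256": digest,
            "sample_rate": rate,                # preserve the native rate
            "bit_depth": w.getsampwidth() * 8,  # process at max bit depth
            "channels": w.getnchannels(),
            "duration_s": frames / rate,
        }
```

Run this on the untouched original and store the result with the case file; re-hash later to confirm the original is byte-identical.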
### 2. Noise Profiling & Estimation (CRITICAL)
- Extract noise profile from silent segments (`noise-profile-silence`)
- Identify noise type before reduction (`noise-identify-type`)
- Use adaptive estimation for non-stationary noise (`noise-adaptive-estimation`)
- Measure SNR before and after (`noise-snr-assessment`)
- Avoid over-processing and musical artifacts (`noise-avoid-overprocessing`)
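The SNR measurement idea behind `noise-profile-silence` and `noise-snr-assessment` can be sketched in a few lines: take a noise-only segment, compare its RMS level to a speech segment, and log the ratio in dB. This is an illustrative simplification (real SNR estimators work per-band and adaptively); `estimate_snr_db` is a hypothetical helper.

```python
import math

def rms(samples):
    """Root-mean-square level of a normalized sample list."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def estimate_snr_db(speech, noise_profile):
    """Crude broadband SNR: speech-segment RMS vs noise-only-segment RMS."""
    noise_rms = rms(noise_profile)
    if noise_rms == 0:
        return float("inf")
    return 20 * math.log10(rms(speech) / noise_rms)
```

Measure this before processing, after each stage, and at the end; if the number stops improving while artifacts grow, you are over-processing.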
### 3. Spectral Processing (HIGH)
- Apply spectral subtraction for stationary noise (`spectral-subtraction`)
- Use Wiener filtering for optimal noise estimation (`spectral-wiener-filter`)
- Apply notch filters for tonal interference (`spectral-notch-filter`)
- Apply frequency band limiting for speech (`spectral-band-limiting`)
- Use forensic equalization to restore intelligibility (`spectral-equalization`)
- Repair clipped audio before other processing (`spectral-declip`)
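Of these, the notch filter (`spectral-notch-filter`) is simple enough to sketch directly. The following is a biquad notch using the widely cited RBJ Audio EQ Cookbook coefficients, suitable for tonal interference such as 50/60 Hz mains hum; in practice you would reach for FFmpeg, SoX, or SciPy rather than a hand-rolled loop.

```python
import math

def notch_filter(samples, fs, f0, q=30.0):
    """Biquad notch (RBJ cookbook): attenuate a narrow band around f0 Hz
    while leaving the rest of the spectrum essentially untouched."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    b0, b1, b2 = 1.0, -2 * cos_w0, 1.0
    a0, a1, a2 = 1 + alpha, -2 * cos_w0, 1 - alpha
    # Normalize by a0, then run the direct-form I difference equation
    b0, b1, b2, a1, a2 = b0 / a0, b1 / a0, b2 / a0, a1 / a0, a2 / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out
```

A narrow Q (here 30) removes the hum without carving a hole in nearby speech energy.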
### 4. Voice Isolation & Enhancement (HIGH)
- Use RNNoise for real-time ML denoising (`voice-rnnoise`)
- Use source separation for complex backgrounds (`voice-dialogue-isolate`)
- Preserve formants during pitch manipulation (`voice-formant-preserve`)
- Apply dereverberation for room echo (`voice-dereverb`)
- Use AI speech enhancement services for quick results (`voice-enhance-speech`)
- Use VAD for targeted processing (`voice-vad-segment`)
- Boost frequency regions for specific phonemes (`voice-frequency-boost`)
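The idea behind `voice-vad-segment` is to find where speech lives so heavier processing can target only those spans. A naive energy-based detector is sketched below for illustration; real pipelines use WebRTC VAD or ML models, which are far more robust to noise. `vad_segments` is a hypothetical helper.

```python
import math

def vad_segments(samples, fs, frame_ms=30, threshold_db=-40.0):
    """Naive energy VAD: return (start_s, end_s) spans whose frame RMS level
    exceeds the threshold, marking likely speech regions."""
    frame = int(fs * frame_ms / 1000)
    segments, active_start = [], None
    for i in range(0, len(samples) - frame + 1, frame):
        chunk = samples[i:i + frame]
        level = math.sqrt(sum(s * s for s in chunk) / frame)
        db = 20 * math.log10(level) if level > 0 else -120.0
        if db > threshold_db and active_start is None:
            active_start = i / fs           # speech onset
        elif db <= threshold_db and active_start is not None:
            segments.append((active_start, i / fs))  # speech offset
            active_start = None
    if active_start is not None:
        segments.append((active_start, len(samples) / fs))
    return segments
```

The returned spans can drive per-segment denoising or targeted transcription.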
### 5. Temporal Processing (MEDIUM-HIGH)
- Use dynamic range compression for level consistency (`temporal-dynamic-range`)
- Apply a noise gate to silence non-speech segments (`temporal-noise-gate`)
- Use time stretching for intelligibility (`temporal-time-stretch`)
- Repair transient damage such as clicks, pops, and dropouts (`temporal-transient-repair`)
- Trim silence and normalize before export (`temporal-silence-trim`)
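The static curve behind `temporal-dynamic-range` can be sketched in a few lines: above a threshold, gain is reduced by the ratio, which evens out loud and quiet speech. This is an instantaneous hard-knee illustration only; a real compressor adds attack/release smoothing and makeup gain, and `compress` is a hypothetical helper.

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Hard-knee static compression curve: magnitudes above `threshold`
    grow only 1/ratio as fast, reducing the dynamic range."""
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)  # restore the sign
    return out
```

With threshold 0.5 and ratio 4:1, a peak at 0.9 comes out at 0.6 while a quiet 0.3 sample passes through unchanged.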
### 6. Transcription & Recognition (MEDIUM)
- Use Whisper for noise-robust transcription (`transcribe-whisper`)
- Use multi-pass transcription for difficult audio (`transcribe-multipass`)
- Segment audio for targeted transcription (`transcribe-segment`)
- Track confidence scores for uncertain words (`transcribe-confidence`)
- Detect and filter ASR hallucinations (`transcribe-hallucination`)
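For `transcribe-confidence` and `transcribe-hallucination`, a post-processing pass over Whisper's segment output is often enough to flag material for human review. The sketch below assumes the segment dicts carry `avg_logprob` and `no_speech_prob` fields, as produced by openai-whisper; the thresholds and the `flag_uncertain` helper are illustrative choices, not canonical values.

```python
def flag_uncertain(segments, logprob_floor=-1.0, no_speech_max=0.6):
    """Split transcript segments into confident vs uncertain.

    Low avg_logprob suggests the model was guessing; high no_speech_prob
    on a segment that still produced text is a common hallucination sign.
    """
    confident, uncertain = [], []
    for seg in segments:
        if seg["avg_logprob"] < logprob_floor or seg["no_speech_prob"] > no_speech_max:
            uncertain.append(seg)
        else:
            confident.append(seg)
    return confident, uncertain
```

Uncertain segments should be bracketed in the transcript (e.g. "[inaudible]") rather than silently trusted, especially for legal evidence.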
### 7. Forensic Authentication (MEDIUM)
- Use ENF analysis for timestamp verification (`forensic-enf-analysis`)
- Extract and verify audio metadata (`forensic-metadata`)
- Detect audio tampering and splices (`forensic-tampering`)
- Document chain of custody for evidence (`forensic-chain-custody`)
- Extract speaker characteristics for identification (`forensic-speaker-id`)
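A minimal building block for `forensic-chain-custody` is a log record that ties who did what to the exact bytes of a file. The sketch below (standard library only; `custody_entry` is a hypothetical helper) produces an append-ready record with a SHA-256 hash and a UTC timestamp.

```python
import hashlib
from datetime import datetime, timezone

def custody_entry(path, action, operator):
    """One chain-of-custody record: file, hash, action, operator, UTC time."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return {
        "file": path,
        "sha256": digest,   # binds the record to these exact bytes
        "action": action,
        "operator": operator,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```

Append one record per handling event (received, copied, enhanced, exported) to a log kept alongside the case file.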
### 8. Tool Integration & Automation (LOW-MEDIUM)
- Master essential FFmpeg audio commands (`tool-ffmpeg-essentials`)
- Use SoX for advanced audio manipulation (`tool-sox-commands`)
- Build Python audio processing pipelines (`tool-python-pipeline`)
- Use Audacity for visual analysis and manual editing (`tool-audacity-workflow`)
- Install the audio forensic toolchain (`tool-install-guide`)
- Automate batch processing workflows (`tool-batch-automation`)
- Measure audio quality metrics (`tool-quality-assessment`)
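For `tool-batch-automation`, one defensive pattern is to build the per-file commands first so the whole batch can be reviewed and logged before any audio is touched, then execute them separately. The sketch below constructs (but does not run) one FFmpeg argv per WAV file; the filter chain is a placeholder and `batch_commands` is a hypothetical helper.

```python
from pathlib import Path

def batch_commands(input_dir, output_dir, filters="highpass=f=80,afftdn=nr=12"):
    """Build one ffmpeg argv per .wav in input_dir, for review before running."""
    out = Path(output_dir)
    cmds = []
    for wav in sorted(Path(input_dir).glob("*.wav")):
        cmds.append([
            "ffmpeg", "-i", str(wav),
            "-af", filters,
            str(out / wav.name),
        ])
    return cmds
```

Each argv can then be passed to `subprocess.run`, with stdout/stderr captured into the case log for repeatability.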
## Essential Tools

| Tool | Purpose | Install |
|---|---|---|
| FFmpeg | Format conversion, filtering | `apt install ffmpeg` / `brew install ffmpeg` |
| SoX | Noise profiling, effects | `apt install sox` / `brew install sox` |
| Whisper | Speech transcription | `pip install openai-whisper` |
| librosa | Python audio analysis | `pip install librosa` |
| noisereduce | ML noise reduction | `pip install noisereduce` |
| Audacity | Visual editing | download from audacityteam.org |
## Workflow Scripts (Recommended)
Use the bundled scripts to generate objective baselines, create a workflow plan, and verify results.
- `scripts/preflight_audio.py`: generate a forensic preflight report (JSON or Markdown)
- `scripts/plan_from_preflight.py`: create a workflow plan template from the preflight report
- `scripts/compare_audio.py`: compare objective metrics between baseline and processed audio
Example usage:
```shell
# 1) Analyze and capture baseline metrics
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

# 2) Generate a workflow plan template
python3 skills/.experimental/audio-voice-recovery/scripts/plan_from_preflight.py --preflight preflight.json --out plan.md

# 3) Compare baseline vs processed metrics
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```
## Forensic Preflight Workflow (Do This Before Any Changes)
Align the preflight with SWGDE Best Practices for the Enhancement of Digital Audio (20-a-001) and SWGDE Best Practices for Forensic Audio (08-a-001). Establish an objective baseline state and plan the workflow so processing does not introduce clipping, artifacts, or false "done" confidence. Use `scripts/preflight_audio.py` to capture baseline metrics and preserve the report with the case file.
Capture and record before processing:
- Record evidence identity and integrity: path, filename, file size, SHA-256 checksum, source, format/container, codec
- Record signal integrity: sample rate, bit depth, channels, duration
- Measure baseline loudness and levels: LUFS/LKFS, true peak, peak, RMS, dynamic range, DC offset
- Detect clipping and document clipped-sample percentage, peak headroom, exact time ranges
- Identify noise profile: stationary vs non-stationary, dominant noise bands, SNR estimate
- Locate the region of interest (ROI) and document time ranges and changes over time
- Inspect spectral content and estimate speech-band energy and intelligibility risk
- Scan for temporal defects: dropouts, discontinuities, splices, drift
- Evaluate channel correlation and phase anomalies (if stereo)
- Extract and preserve metadata: timestamps, device/model tags, embedded notes
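The clipping and level checks in the list above can be sketched as a single report function (standard library only; `clipping_report` is a hypothetical helper, and finding exact clipped time ranges would extend the same scan).

```python
import math

def clipping_report(samples, full_scale=1.0, clip_ratio=0.999):
    """Baseline level anomalies from a normalized sample list:
    clipped-sample percentage, peak headroom in dB, and DC offset."""
    peak = max(abs(s) for s in samples)
    clipped = sum(1 for s in samples if abs(s) >= clip_ratio * full_scale)
    return {
        "clipped_pct": 100.0 * clipped / len(samples),
        "headroom_db": 20 * math.log10(full_scale / peak) if peak > 0 else float("inf"),
        "dc_offset": sum(samples) / len(samples),
    }
```

A non-trivial clipped percentage means declipping must come first; a non-zero DC offset should be removed before any spectral work.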
Procedure:
- Prepare a forensic working copy, verify hashes, and preserve the original untouched.
- Locate ROI and target signal; document exact time ranges and changes across the recording.
- Assess challenges to intelligibility and signal quality; map challenges to mitigation strategies.
- Identify required processing and plan a workflow order that avoids unwanted artifacts. Generate a plan draft with `scripts/plan_from_preflight.py` and complete it with case-specific decisions.
- Measure baseline loudness and true peak per ITU-R BS.1770 / EBU R 128 and record peak/RMS/DC offset.
- Detect clipping and dropouts; if clipping is present, declip first or pause and document limitations.
- Inspect spectral content and noise type; collect representative noise profile segments and estimate SNR.
- If stereo, evaluate channel correlation and phase; document anomalies.
- Create a baseline listening log (multiple devices) and define success criteria for intelligibility and listenability.
Failure-pattern guardrails:
- Do not process until every preflight field is captured.
- Document every process, setting, software version, and time segment to enable repeatability.
- Compare each processed output to the unprocessed input and assess progress toward intelligibility and listenability.
- Avoid over-processing; review removed signal (filter residue) to avoid removing target signal components.
- Keep intermediate files uncompressed and preserve sample rate/bit depth when moving between tools.
- Perform a final review against the original; if unsatisfactory, revise or stop and report limitations.
- If the request is not achievable, communicate limitations and do not declare completion.
- Require objective metrics and A/B listening before declaring completion.
- Do not rely solely on objective metrics; corroborate with critical listening.
- Take listening breaks to avoid ear fatigue during extended reviews.
## Quick Enhancement Pipeline
```shell
# 1. Analyze original (run preflight and capture baseline metrics)
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

# 2. Create working copy with checksum
cp evidence.wav working.wav
sha256sum evidence.wav > evidence.sha256

# 3. Apply enhancement
ffmpeg -i working.wav -af "\
highpass=f=80,\
adeclick=w=55:o=75,\
afftdn=nr=12:nf=-30:nt=w,\
equalizer=f=2500:t=q:w=1:g=3,\
loudnorm=I=-16:TP=-1.5:LRA=11\
" enhanced.wav

# 4. Transcribe
whisper enhanced.wav --model large-v3 --language en

# 5. Verify original unchanged
sha256sum -c evidence.sha256

# 6. Verify improvement (objective comparison + A/B listening)
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```
## How to Use
Read individual reference files for detailed explanations and code examples:
- Section definitions - Category structure and impact levels
- Rule template - Template for adding new rules
## Reference Files
| File | Description |
|---|---|
| AGENTS.md | Complete compiled guide with all rules |
| references/_sections.md | Category definitions and ordering |
| assets/templates/_template.md | Template for new rules |
| metadata.json | Version and reference information |