ML-powered Karaoke app in Rust using Bevy, WhisperX, and Demucs for stem separation, lyrics transcription, and pitch scoring.
git clone https://github.com/Aradotso/trending-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/Aradotso/trending-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/nightingale-karaoke" ~/.claude/skills/aradotso-trending-skills-nightingale-karaoke && rm -rf "$T"
Nightingale Karaoke Skill
Skill by ara.so — Daily 2026 Skills collection.
Nightingale is a self-contained, ML-powered karaoke application written in Rust (Bevy engine). It scans a local music folder, separates vocals from instrumentals (UVR Karaoke model or Demucs), transcribes lyrics with word-level timestamps (WhisperX), and plays back with synchronized highlighting, real-time pitch scoring, player profiles, and GPU shader / video backgrounds. Everything — ffmpeg, Python, PyTorch, ML models — is bootstrapped automatically on first launch.
Installation
Pre-built Binary (Recommended)
Download the latest release from the Releases page for your platform and run it.
macOS only — remove quarantine after extracting:
xattr -cr Nightingale.app
Build from Source
Prerequisites:
- Rust 1.85+ (edition 2024)
- Linux additionally needs: libasound2-dev libudev-dev libwayland-dev libxkbcommon-dev
    git clone https://github.com/rzru/nightingale
    cd nightingale
    # Build with the optimized release profile
    cargo build --release
    # Run directly
    ./target/release/nightingale
Release Packaging
    # Linux / macOS
    scripts/make-release.sh
    # Windows (PowerShell)
    powershell -ExecutionPolicy Bypass -File scripts/make-release.ps1
Outputs a .tar.gz (Linux/macOS) or .zip (Windows) ready for distribution.
First Launch / Bootstrap
On first run, Nightingale downloads and configures:
- ffmpeg binary
- uv (Python package manager)
- Python 3.10 via uv
- PyTorch + WhisperX + audio-separator in a virtual environment
- UVR Karaoke ONNX model and WhisperX large-v3 model
This takes 2–10 minutes depending on network speed. A progress screen is shown in-app.
To force re-bootstrap at any time:
./nightingale --setup
Bootstrap completion is marked by ~/.nightingale/vendor/.ready.
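A minimal sketch of how an external tool might test for that marker, assuming only the path given above. The function name `is_bootstrapped` and the `home` parameter are hypothetical and are not taken from Nightingale's source:

```rust
use std::path::Path;

/// Returns true when the bootstrap marker exists under `home`.
/// `home` stands in for the user's home directory (hypothetical parameter).
pub fn is_bootstrapped(home: &Path) -> bool {
    home.join(".nightingale")
        .join("vendor")
        .join(".ready")
        .exists()
}
```

Passing the home directory in explicitly keeps the check easy to exercise against a temporary directory.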
CLI Flags
| Flag | Description |
|---|---|
| --setup | Force re-run of the first-launch bootstrap (re-downloads vendor deps) |
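For illustration, detecting the `--setup` flag needs nothing beyond the standard library. This is a sketch under the assumption that the flag takes no value; `wants_setup` is a hypothetical helper, and the real application may parse its arguments differently:

```rust
/// Returns true if `--setup` appears anywhere among the given arguments.
pub fn wants_setup<I, S>(args: I) -> bool
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    args.into_iter().any(|arg| arg.as_ref() == "--setup")
}
```

A caller would typically pass `std::env::args().skip(1)` so the program name is not inspected.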
Keyboard & Gamepad Controls
Navigation
| Action | Keyboard | Gamepad |
|---|---|---|
| Move | Arrow keys | D-pad / Left stick |
| Confirm | Enter | A (South) |
| Back | Escape | B (East) / Start |
| Switch panel | Tab | — |
| Search | Type to filter | — |
Playback
| Action | Keyboard | Gamepad |
|---|---|---|
| Pause / Resume | Space | Start |
| Exit to menu | Escape | B (East) |
| Toggle guide vocals | G | — |
| Guide volume up/down | + / - | — |
| Cycle background | T | — |
| Cycle video flavor | F | — |
| Toggle microphone | M | — |
| Next microphone | N | — |
| Toggle fullscreen | F11 | — |
Configuration
Main Config
Located at ~/.nightingale/config.json. Edit directly or via in-app settings.
    {
      "music_folder": "/home/user/Music",
      "separator": "uvr",
      "guide_vocal_volume": 0.3,
      "background_theme": "plasma",
      "video_flavor": "nature",
      "default_profile": "Alice"
    }
- separator options: "uvr" (default, preserves backing vocals) | "demucs"
- background_theme options: "plasma", "aurora", "waves", "nebula", "starfield", "video", "source_video"
- video_flavor options: "nature", "underwater", "space", "city", "countryside"
Profiles
Located at ~/.nightingale/profiles.json:
    {
      "profiles": [
        {
          "name": "Alice",
          "scores": {
            "blake3_hash_of_song": {
              "stars": 4,
              "score": 87250,
              "played_at": "2026-03-18T21:00:00Z"
            }
          }
        }
      ]
    }
Pixabay Video Backgrounds (Dev)
API key is embedded in release builds. For local development, create .env at project root:
    # .env
    PIXABAY_API_KEY=$PIXABAY_API_KEY
The release script (make-release.sh) sources .env automatically.
Data Storage Layout
    ~/.nightingale/
    ├── cache/               # Per-song stems, transcripts, lyrics (keyed by blake3 hash)
    ├── config.json          # App settings
    ├── profiles.json        # Player profiles and per-song scores
    ├── videos/              # Pre-downloaded Pixabay video backgrounds
    ├── sounds/              # Sound effects
    ├── vendor/
    │   ├── ffmpeg           # ffmpeg binary
    │   ├── uv               # uv binary
    │   ├── python/          # Python 3.10
    │   ├── venv/            # ML virtualenv (WhisperX, Demucs, audio-separator)
    │   ├── analyzer/        # Python analyzer scripts
    │   └── .ready           # Bootstrap completion marker
    └── models/
        ├── torch/           # Demucs model weights
        ├── huggingface/     # WhisperX large-v3 weights
        └── audio_separator/ # UVR Karaoke ONNX model
Cache keys are blake3 hashes of the source file — re-analysis only triggers if the file changes or is manually invalidated.
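The lookup this implies can be sketched as follows, assuming each song's cache lives in a directory named after its hash (the `~/.nightingale/cache/<hash>` form used in Troubleshooting). The helper names and the `home` parameter are hypothetical:

```rust
use std::path::{Path, PathBuf};

/// Builds the per-song cache directory: <home>/.nightingale/cache/<hash>.
pub fn song_cache_dir(home: &Path, song_hash: &str) -> PathBuf {
    home.join(".nightingale").join("cache").join(song_hash)
}

/// Assumption: a song needs analysis when no cache directory exists for its hash.
pub fn needs_analysis(home: &Path, song_hash: &str) -> bool {
    !song_cache_dir(home, song_hash).is_dir()
}
```

Because the key is derived from file contents, a renamed or moved file would map to the same cache directory, while an edited file would not.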
Supported File Formats
- Audio: .mp3, .flac, .ogg, .wav, .m4a, .aac, .wma
- Video: .mp4, .mkv, .avi, .webm, .mov, .m4v
For video files, the audio track is extracted, the vocals are separated, and the original video plays as the background automatically.
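A library scanner could classify files by the extensions listed above. The sketch below is illustrative only: `MediaKind` and `classify` are hypothetical names, and the case-insensitive match is an assumption rather than documented behavior:

```rust
use std::path::Path;

#[derive(Debug, PartialEq, Eq)]
pub enum MediaKind {
    Audio,
    Video,
}

/// Classifies a path by extension; returns None for unsupported files.
/// Matching is case-insensitive (an assumption, not confirmed by the source).
pub fn classify(path: &Path) -> Option<MediaKind> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "mp3" | "flac" | "ogg" | "wav" | "m4a" | "aac" | "wma" => Some(MediaKind::Audio),
        "mp4" | "mkv" | "avi" | "webm" | "mov" | "m4v" => Some(MediaKind::Video),
        _ => None,
    }
}
```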
Hardware Acceleration
PyTorch backend is auto-detected:
| Backend | Device | Notes |
|---|---|---|
| CUDA | NVIDIA GPU | Fastest; ~2–5 min/song |
| MPS | Apple Silicon | macOS; WhisperX alignment falls back to CPU |
| CPU | Any | Always works; ~10–20 min/song |
UVR Karaoke model uses ONNX Runtime with CUDA (NVIDIA) or CoreML (Apple Silicon) automatically.
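The selection logic in the table can be modelled as below. This is a sketch, not Nightingale's code: the actual detection happens inside PyTorch, the priority order (CUDA, then MPS, then CPU) is an assumption read off the table, and all names are hypothetical:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Cuda,
    Mps,
    Cpu,
}

/// Picks the preferred backend; CPU is the fallback that always works.
pub fn pick_backend(cuda_available: bool, mps_available: bool) -> Backend {
    if cuda_available {
        Backend::Cuda
    } else if mps_available {
        Backend::Mps
    } else {
        Backend::Cpu
    }
}

/// Device for the WhisperX alignment step: per the table, MPS falls back to CPU.
pub fn alignment_backend(main: Backend) -> Backend {
    match main {
        Backend::Mps => Backend::Cpu,
        other => other,
    }
}
```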
Processing Pipeline
    Audio/Video file
          │
          ▼
    UVR Karaoke (ONNX) or Demucs (PyTorch)
          │  vocals.ogg + instrumental.ogg
          ▼
    LRCLIB API ──▶ Synced lyrics fetch (if available)
          │
          ▼
    WhisperX large-v3 ──▶ Transcription + word-level timestamps
          │
          ▼
    Bevy App (Rust)
      - Plays instrumental audio
      - Synchronized word highlighting
      - Real-time pitch detection & scoring
      - GPU shader / video backgrounds
      - Scoreboards per profile
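To make the synchronized highlighting step concrete, here is one way word-level timestamps could be mapped to the current playback position. It is a sketch under assumptions: `WordTiming` and `active_word` are hypothetical names, times are in seconds, and the words are assumed sorted by start time and non-overlapping. The transcript format Nightingale actually stores is not shown in this document:

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct WordTiming {
    pub text: String,
    pub start: f64, // seconds from the beginning of the song
    pub end: f64,   // seconds; exclusive upper bound
}

/// Returns the index of the word being sung at time `t`, if any.
/// `words` must be sorted by `start`.
pub fn active_word(words: &[WordTiming], t: f64) -> Option<usize> {
    // Count the words that have already started at time `t` (binary search).
    let started = words.partition_point(|w| w.start <= t);
    // The candidate is the last started word; None if nothing has started yet.
    let idx = started.checked_sub(1)?;
    // It is active only while `t` is before its end (gaps between words yield None).
    (t < words[idx].end).then_some(idx)
}
```

The binary search keeps the per-frame cost logarithmic in the number of words, which matters little for a single song but avoids rescanning the lyrics on every frame.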
Code Patterns
Adding a New Background Theme (Bevy System)
    // In your Bevy plugin, register a new background variant.
    // `AppState` is assumed to be the app's existing state enum; import it from where it is defined.
    use bevy::prelude::*;

    #[derive(Component)]
    pub struct MyCustomBackground;

    pub fn spawn_custom_background(mut commands: Commands) {
        commands.spawn((
            MyCustomBackground,
            // ... your background components
        ));
    }

    pub struct CustomBackgroundPlugin;

    impl Plugin for CustomBackgroundPlugin {
        fn build(&self, app: &mut App) {
            // Spawn the background whenever playback starts
            app.add_systems(OnEnter(AppState::Playing), spawn_custom_background);
        }
    }
Extending Config Deserialization
    use serde::{Deserialize, Serialize};

    #[derive(Debug, Clone, Serialize, Deserialize)]
    pub struct NightingaleConfig {
        pub music_folder: String,
        #[serde(default = "default_separator")]
        pub separator: StemSeparator,
        #[serde(default = "default_guide_volume")]
        pub guide_vocal_volume: f32,
    }

    // Fallback used when config.json is missing or cannot be parsed.
    // Required by `unwrap_or_default()` in `load_config`.
    impl Default for NightingaleConfig {
        fn default() -> Self {
            Self {
                music_folder: String::new(),
                separator: default_separator(),
                guide_vocal_volume: default_guide_volume(),
            }
        }
    }

    #[derive(Debug, Clone, Serialize, Deserialize, Default)]
    #[serde(rename_all = "lowercase")]
    pub enum StemSeparator {
        #[default]
        Uvr,
        Demucs,
    }

    fn default_guide_volume() -> f32 { 0.3 }
    fn default_separator() -> StemSeparator { StemSeparator::Uvr }

    // Load config, falling back to defaults on any read or parse error
    fn load_config() -> NightingaleConfig {
        let path = dirs::home_dir()
            .expect("home directory not found")
            .join(".nightingale/config.json");
        let raw = std::fs::read_to_string(&path).unwrap_or_default();
        serde_json::from_str(&raw).unwrap_or_default()
    }
Triggering Re-analysis Programmatically
    use std::fs;

    /// Remove cached stems/transcript for a song to force re-analysis
    fn invalidate_song_cache(song_hash: &str) {
        let cache_dir = dirs::home_dir()
            .unwrap()
            .join(".nightingale/cache")
            .join(song_hash);
        if cache_dir.exists() {
            fs::remove_dir_all(&cache_dir)
                .expect("Failed to remove cache directory");
            println!("Cache invalidated for {}", song_hash);
        }
    }
Computing a Song's Blake3 Hash (for Cache Lookup)
    use blake3::Hasher;
    use std::fs::File;
    use std::io::{BufReader, Read};

    fn hash_file(path: &std::path::Path) -> String {
        let file = File::open(path).expect("Cannot open file");
        let mut reader = BufReader::new(file);
        let mut hasher = Hasher::new();
        // Stream the file in 64 KiB chunks so large files are not loaded into memory at once
        let mut buf = [0u8; 65536];
        loop {
            let n = reader.read(&mut buf).unwrap();
            if n == 0 { break; }
            hasher.update(&buf[..n]);
        }
        hasher.finalize().to_hex().to_string()
    }
Profile Score Update Pattern
    use serde::{Deserialize, Serialize};
    use std::collections::HashMap;

    #[derive(Debug, Serialize, Deserialize)]
    pub struct SongScore {
        pub stars: u8,
        pub score: u32,
        pub played_at: String,
    }

    #[derive(Debug, Serialize, Deserialize)]
    pub struct Profile {
        pub name: String,
        pub scores: HashMap<String, SongScore>, // key = blake3 hash
    }

    // Inserts or overwrites the score stored for this song
    fn update_score(profile: &mut Profile, song_hash: &str, stars: u8, score: u32) {
        profile.scores.insert(song_hash.to_string(), SongScore {
            stars,
            score,
            played_at: chrono::Utc::now().to_rfc3339(),
        });
    }
Troubleshooting
Bootstrap Fails / Stuck on Setup Screen
    # Force re-bootstrap
    ./nightingale --setup
    # Or manually remove the vendor directory and restart
    rm -rf ~/.nightingale/vendor
    ./nightingale
Song Analysis Hangs or Errors
    # Check the analyzer venv is healthy
    ~/.nightingale/vendor/venv/bin/python -c "import whisperx; print('ok')"
    # Re-bootstrap if broken
    ./nightingale --setup
macOS "App is damaged" Error
xattr -cr Nightingale.app
GPU Not Being Used
- NVIDIA: Ensure CUDA drivers are installed and nvidia-smi shows your GPU.
- Apple Silicon: MPS is used automatically on macOS with Apple Silicon; WhisperX alignment falls back to CPU (normal behavior).
- Check ~/.nightingale/vendor/venv: if PyTorch installed the CPU-only build, re-bootstrap after installing CUDA drivers.
Cache Corruption / Wrong Lyrics
    # Find the blake3 hash of your file (build a small tool or use b3sum)
    b3sum /path/to/song.mp3
    # Remove that song's cache
    rm -rf ~/.nightingale/cache/<hash>
Then re-open the song in Nightingale to re-analyze.
Audio Playback Issues (Linux)
Ensure ALSA/PulseAudio/PipeWire is running. Install missing deps:
sudo apt install libasound2-dev libudev-dev libwayland-dev libxkbcommon-dev
Video Backgrounds Not Loading
Video backgrounds are pre-downloaded during setup via the Pixabay API. For development builds, ensure .env contains a valid PIXABAY_API_KEY. If videos are missing in a release build, run --setup to re-trigger the download.
Platform Targets
| Platform | Target Triple |
|---|---|
| Linux x86_64 | |
| Linux aarch64 | aarch64-unknown-linux-gnu |
| macOS ARM | |
| macOS Intel | |
| Windows x86_64 | |
Cross-compile with:
    rustup target add aarch64-unknown-linux-gnu
    cargo build --release --target aarch64-unknown-linux-gnu
License
GPL-3.0-or-later. See LICENSE.