Trending-skills nightingale-karaoke

ML-powered Karaoke app in Rust using Bevy, WhisperX, and Demucs for stem separation, lyrics transcription, and pitch scoring.

install
source · Clone the upstream repo
git clone https://github.com/Aradotso/trending-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Aradotso/trending-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/nightingale-karaoke" ~/.claude/skills/aradotso-trending-skills-nightingale-karaoke && rm -rf "$T"
manifest: skills/nightingale-karaoke/SKILL.md

Nightingale Karaoke Skill

Skill by ara.so — Daily 2026 Skills collection.

Nightingale is a self-contained, ML-powered karaoke application written in Rust (Bevy engine). It scans a local music folder, separates vocals from instrumentals (UVR Karaoke model or Demucs), transcribes lyrics with word-level timestamps (WhisperX), and plays back with synchronized highlighting, real-time pitch scoring, player profiles, and GPU shader / video backgrounds. Everything — ffmpeg, Python, PyTorch, ML models — is bootstrapped automatically on first launch.


Installation

Pre-built Binary (Recommended)

Download the latest release from the Releases page for your platform and run it.

macOS only — remove quarantine after extracting:

xattr -cr Nightingale.app

Build from Source

Prerequisites:

  • Rust 1.85+ (edition 2024)
  • Linux additionally needs:
    libasound2-dev libudev-dev libwayland-dev libxkbcommon-dev

git clone https://github.com/rzru/nightingale
cd nightingale

# Optimized release build
cargo build --release

# Run directly
./target/release/nightingale

Release Packaging

# Linux / macOS
scripts/make-release.sh

# Windows (PowerShell)
powershell -ExecutionPolicy Bypass -File scripts/make-release.ps1

Outputs a .tar.gz (Linux/macOS) or .zip (Windows) ready for distribution.


First Launch / Bootstrap

On first run, Nightingale downloads and configures:

  • ffmpeg binary
  • uv (Python package manager)
  • Python 3.10 via uv
  • PyTorch + WhisperX + audio-separator in a virtual environment
  • UVR Karaoke ONNX model and WhisperX large-v3 model

This takes 2–10 minutes depending on network speed. A progress screen is shown in-app.

To force re-bootstrap at any time:

./nightingale --setup

Bootstrap completion is marked by ~/.nightingale/vendor/.ready.
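
For scripts or tooling that need to know whether setup has finished, the marker can be checked directly. This is a minimal sketch that assumes only the marker path stated above; the function name `is_bootstrapped` and the explicit `home` parameter are hypothetical and not taken from Nightingale's source.

```rust
use std::path::Path;

/// Returns true if the bootstrap marker exists under `<home>/.nightingale/vendor/`.
/// `home` is passed in explicitly (instead of being looked up) so the check is easy to test.
pub fn is_bootstrapped(home: &Path) -> bool {
    home.join(".nightingale")
        .join("vendor")
        .join(".ready")
        .exists()
}
```

If the function returns false, running `./nightingale --setup` re-runs the bootstrap as described above.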


CLI Flags

Flag     Description
--setup  Force re-run of the first-launch bootstrap (re-downloads vendor deps)
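
How Nightingale parses its arguments internally is not shown here. As a rough sketch, detecting a single boolean flag like this with the standard library alone could look as follows; the helper name `wants_setup` is hypothetical.

```rust
/// Returns true if `--setup` appears anywhere among the given arguments.
/// Callers are expected to pass the arguments without the program name.
pub fn wants_setup<I, S>(args: I) -> bool
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    args.into_iter().any(|arg| arg.as_ref() == "--setup")
}
```

A caller would typically pass `std::env::args().skip(1)` so that the program name is excluded.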

Keyboard & Gamepad Controls

Navigation

Action        Keyboard        Gamepad
Move          Arrow keys      D-pad / Left stick
Confirm       Enter           A (South)
Back          Escape          B (East) / Start
Switch panel  Tab
Search        Type to filter

Playback

Action                Keyboard  Gamepad
Pause / Resume        Space     Start
Exit to menu          Escape    B (East)
Toggle guide vocals   G
Guide volume up/down  + / -
Cycle background      T
Cycle video flavor    F
Toggle microphone     M
Next microphone       N
Toggle fullscreen     F11

Configuration

Main Config

Located at ~/.nightingale/config.json. Edit directly or via in-app settings.

{
  "music_folder": "/home/user/Music",
  "separator": "uvr",
  "guide_vocal_volume": 0.3,
  "background_theme": "plasma",
  "video_flavor": "nature",
  "default_profile": "Alice"
}
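
Because this file can be edited by hand, it may help to check the enumerated keys (separator, background_theme, video_flavor) against the option values documented in this section. The sketch below hard-codes those values; the helper names `allowed_values` and `is_valid_option` are hypothetical, and whether Nightingale itself rejects or silently replaces unknown values is not stated in this document.

```rust
// Option values as documented for config.json.
const SEPARATORS: &[&str] = &["uvr", "demucs"];
const BACKGROUND_THEMES: &[&str] = &[
    "plasma", "aurora", "waves", "nebula", "starfield", "video", "source_video",
];
const VIDEO_FLAVORS: &[&str] = &["nature", "underwater", "space", "city", "countryside"];

/// Returns the documented values for an enumerated config key,
/// or None if the key is not one of the enumerated ones.
pub fn allowed_values(key: &str) -> Option<&'static [&'static str]> {
    match key {
        "separator" => Some(SEPARATORS),
        "background_theme" => Some(BACKGROUND_THEMES),
        "video_flavor" => Some(VIDEO_FLAVORS),
        _ => None,
    }
}

/// Returns true only if `key` is an enumerated key and `value` is one of its documented values.
pub fn is_valid_option(key: &str, value: &str) -> bool {
    allowed_values(key).map_or(false, |values| values.iter().any(|v| *v == value))
}
```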

separator options: "uvr" (default, preserves backing vocals) | "demucs"

background_theme options: "plasma", "aurora", "waves", "nebula", "starfield", "video", "source_video"

video_flavor options: "nature", "underwater", "space", "city", "countryside"

Profiles

Located at ~/.nightingale/profiles.json:

{
  "profiles": [
    {
      "name": "Alice",
      "scores": {
        "blake3_hash_of_song": {
          "stars": 4,
          "score": 87250,
          "played_at": "2026-03-18T21:00:00Z"
        }
      }
    }
  ]
}

Pixabay Video Backgrounds (Dev)

API key is embedded in release builds. For local development, create .env at project root:

# .env
PIXABAY_API_KEY=<your-pixabay-api-key>

The release script (make-release.sh) sources .env automatically.
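
Before building, it can be useful to confirm that the key is actually present in .env. The sketch below parses simple `KEY=value` lines using only the standard library. It is an illustration, not how Nightingale or its release script reads the file; the function name `env_value` is hypothetical, and quoting rules are reduced to stripping surrounding double quotes.

```rust
/// Looks up `key` in `.env`-style text: one `KEY=value` pair per line.
/// Blank lines and lines starting with `#` are ignored.
/// Returns the first matching value, with surrounding double quotes removed.
pub fn env_value(contents: &str, key: &str) -> Option<String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .find(|(k, _)| k.trim() == key)
        .map(|(_, v)| v.trim().trim_matches('"').to_string())
}
```

A caller would read the file with `std::fs::read_to_string(".env")` and pass the contents in, treating `None` or an empty string as a missing key.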


Data Storage Layout

~/.nightingale/
├── cache/              # Per-song stems, transcripts, lyrics (keyed by blake3 hash)
├── config.json         # App settings
├── profiles.json       # Player profiles and per-song scores
├── videos/             # Pre-downloaded Pixabay video backgrounds
├── sounds/             # Sound effects
├── vendor/
│   ├── ffmpeg          # ffmpeg binary
│   ├── uv              # uv binary
│   ├── python/         # Python 3.10
│   ├── venv/           # ML virtualenv (WhisperX, Demucs, audio-separator)
│   ├── analyzer/       # Python analyzer scripts
│   └── .ready          # Bootstrap completion marker
└── models/
    ├── torch/          # Demucs model weights
    ├── huggingface/    # WhisperX large-v3 weights
    └── audio_separator/ # UVR Karaoke ONNX model

Cache keys are blake3 hashes of the source file — re-analysis only triggers if the file changes or is manually invalidated.
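
The lookup this implies can be sketched with the standard library alone. The sketch assumes each song's cache lives in a directory named after its hash under ~/.nightingale/cache/, and that the absence of that directory is what triggers analysis; the function names and the explicit `home` parameter are hypothetical.

```rust
use std::path::{Path, PathBuf};

/// Directory that would hold the cached stems and transcript for one song.
pub fn song_cache_dir(home: &Path, song_hash: &str) -> PathBuf {
    home.join(".nightingale").join("cache").join(song_hash)
}

/// Assumption: a song needs (re-)analysis when no cache directory exists for its
/// current hash. A changed file produces a new hash, so its lookup misses.
pub fn needs_analysis(home: &Path, song_hash: &str) -> bool {
    !song_cache_dir(home, song_hash).is_dir()
}
```

Deleting the directory for a given hash therefore has the same effect as changing the file: the next lookup misses and the song is analyzed again.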


Supported File Formats

Audio: .mp3, .flac, .ogg, .wav, .m4a, .aac, .wma

Video: .mp4, .mkv, .avi, .webm, .mov, .m4v

For video files, the audio track is extracted, vocals are separated, and the original video plays as the background automatically.
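
A folder scan needs to sort files into these two groups. The following sketch classifies a path by its extension using the lists above. The names `MediaKind` and `classify` are hypothetical, and matching extensions case-insensitively is an assumption rather than documented behavior.

```rust
use std::path::Path;

/// How a file in the music folder would be treated.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MediaKind {
    Audio,
    Video,
    Unsupported,
}

// Extensions from the supported-formats lists, without the leading dot.
const AUDIO_EXTS: &[&str] = &["mp3", "flac", "ogg", "wav", "m4a", "aac", "wma"];
const VIDEO_EXTS: &[&str] = &["mp4", "mkv", "avi", "webm", "mov", "m4v"];

/// Classifies a path by extension (case-insensitive by assumption).
pub fn classify(path: &Path) -> MediaKind {
    let ext = match path.extension().and_then(|e| e.to_str()) {
        Some(e) => e.to_ascii_lowercase(),
        None => return MediaKind::Unsupported,
    };
    if AUDIO_EXTS.contains(&ext.as_str()) {
        MediaKind::Audio
    } else if VIDEO_EXTS.contains(&ext.as_str()) {
        MediaKind::Video
    } else {
        MediaKind::Unsupported
    }
}
```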


Hardware Acceleration

PyTorch backend is auto-detected:

Backend  Device         Notes
CUDA     NVIDIA GPU     Fastest; ~2–5 min/song
MPS      Apple Silicon  macOS; WhisperX alignment falls back to CPU
CPU      Any            Always works; ~10–20 min/song

UVR Karaoke model uses ONNX Runtime with CUDA (NVIDIA) or CoreML (Apple Silicon) automatically.
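
The actual detection happens on the Python side through PyTorch and is not reproduced here. As an illustration of the fallback order the table suggests (CUDA first, then MPS, then CPU, which is an assumption), the selection could be modeled as below. `Backend`, `pick_backend` and `estimated_minutes` are hypothetical names, and the minute ranges are simply the figures from the table.

```rust
/// PyTorch backends listed in the hardware acceleration table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Cuda,
    Mps,
    Cpu,
}

/// Assumed priority: CUDA if available, otherwise MPS, otherwise CPU (which always works).
pub fn pick_backend(cuda_available: bool, mps_available: bool) -> Backend {
    if cuda_available {
        Backend::Cuda
    } else if mps_available {
        Backend::Mps
    } else {
        Backend::Cpu
    }
}

/// Rough per-song analysis time in minutes as (low, high), where the table gives one.
pub fn estimated_minutes(backend: Backend) -> Option<(u32, u32)> {
    match backend {
        Backend::Cuda => Some((2, 5)),
        Backend::Mps => None, // the table gives no figure for MPS
        Backend::Cpu => Some((10, 20)),
    }
}
```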


Processing Pipeline

Audio/Video file
       │
       ▼
 UVR Karaoke (ONNX) or Demucs (PyTorch)
       │  vocals.ogg + instrumental.ogg
       ▼
 LRCLIB API  ──▶  Synced lyrics fetch (if available)
       │
       ▼
 WhisperX large-v3  ──▶  Transcription + word-level timestamps
       │
       ▼
 Bevy App (Rust)
   - Plays instrumental audio
   - Synchronized word highlighting
   - Real-time pitch detection & scoring
   - GPU shader / video backgrounds
   - Scoreboards per profile

Code Patterns

Adding a New Background Theme (Bevy System)

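// In your Bevy plugin, register a new background variant.
// `AppState` (with a `Playing` variant) is assumed to be the app's existing state
// enum; bring it into scope from wherever it is defined.
use bevy::prelude::*;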

#[derive(Component)]
pub struct MyCustomBackground;

pub fn spawn_custom_background(mut commands: Commands) {
    commands.spawn((
        MyCustomBackground,
        // ... your background components
    ));
}

pub struct CustomBackgroundPlugin;

impl Plugin for CustomBackgroundPlugin {
    fn build(&self, app: &mut App) {
        app.add_systems(OnEnter(AppState::Playing), spawn_custom_background);
    }
}

Extending Config Deserialization

use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct NightingaleConfig {
    pub music_folder: String,
    #[serde(default = "default_separator")]
    pub separator: StemSeparator,
    #[serde(default = "default_guide_volume")]
    pub guide_vocal_volume: f32,
}

#[derive(Debug, Clone, Serialize, Deserialize, Default)]
#[serde(rename_all = "lowercase")]
pub enum StemSeparator {
    #[default]
    Uvr,
    Demucs,
}

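fn default_guide_volume() -> f32 { 0.3 }
fn default_separator() -> StemSeparator { StemSeparator::Uvr }

// `unwrap_or_default()` below requires `Default`. A manual impl keeps the 0.3
// guide volume; a derived one would fall back to 0.0.
impl Default for NightingaleConfig {
    fn default() -> Self {
        Self {
            music_folder: String::new(),
            separator: default_separator(),
            guide_vocal_volume: default_guide_volume(),
        }
    }
}

// Load config, falling back to defaults if the file is missing or invalid
fn load_config() -> NightingaleConfig {
    let path = dirs::home_dir()
        .expect("cannot determine home directory")
        .join(".nightingale/config.json");
    let raw = std::fs::read_to_string(&path).unwrap_or_default();
    serde_json::from_str(&raw).unwrap_or_default()
}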

Triggering Re-analysis Programmatically

use std::fs;

/// Remove cached stems/transcript for a song to force re-analysis
fn invalidate_song_cache(song_hash: &str) {
    let cache_dir = dirs::home_dir()
        .unwrap()
        .join(".nightingale/cache")
        .join(song_hash);

    if cache_dir.exists() {
        fs::remove_dir_all(&cache_dir)
            .expect("Failed to remove cache directory");
        println!("Cache invalidated for {}", song_hash);
    }
}

Computing a Song's Blake3 Hash (for Cache Lookup)

use blake3::Hasher;
use std::fs::File;
use std::io::{BufReader, Read};

fn hash_file(path: &std::path::Path) -> String {
    let file = File::open(path).expect("Cannot open file");
    let mut reader = BufReader::new(file);
    let mut hasher = Hasher::new();
    let mut buf = [0u8; 65536];
    loop {
        let n = reader.read(&mut buf).unwrap();
        if n == 0 { break; }
        hasher.update(&buf[..n]);
    }
    hasher.finalize().to_hex().to_string()
}

Profile Score Update Pattern

use serde::{Deserialize, Serialize};
use std::collections::HashMap;

#[derive(Debug, Serialize, Deserialize)]
pub struct SongScore {
    pub stars: u8,
    pub score: u32,
    pub played_at: String,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Profile {
    pub name: String,
    pub scores: HashMap<String, SongScore>, // key = blake3 hash
}

fn update_score(profile: &mut Profile, song_hash: &str, stars: u8, score: u32) {
    profile.scores.insert(song_hash.to_string(), SongScore {
        stars,
        score,
        played_at: chrono::Utc::now().to_rfc3339(),
    });
}

Troubleshooting

Bootstrap Fails / Stuck on Setup Screen

# Force re-bootstrap
./nightingale --setup

# Or manually remove the vendor directory and restart
rm -rf ~/.nightingale/vendor
./nightingale

Song Analysis Hangs or Errors

# Check the analyzer venv is healthy
~/.nightingale/vendor/venv/bin/python -c "import whisperx; print('ok')"

# Re-bootstrap if broken
./nightingale --setup

macOS "App is damaged" Error

xattr -cr Nightingale.app

GPU Not Being Used

  • NVIDIA: Ensure CUDA drivers are installed and nvidia-smi shows your GPU.
  • Apple Silicon: MPS is used automatically on macOS with Apple Silicon; WhisperX alignment falls back to CPU (normal behavior).
  • Check ~/.nightingale/vendor/venv: if PyTorch installed the CPU-only build, re-bootstrap after installing CUDA drivers.

Cache Corruption / Wrong Lyrics

# Find the blake3 hash of your file (build a small tool or use b3sum)
b3sum /path/to/song.mp3

# Remove that song's cache
rm -rf ~/.nightingale/cache/<hash>

Then re-open the song in Nightingale to re-analyze.

Audio Playback Issues (Linux)

Ensure ALSA/PulseAudio/PipeWire is running. Install missing deps:

sudo apt install libasound2-dev libudev-dev libwayland-dev libxkbcommon-dev

Video Backgrounds Not Loading

Video backgrounds are pre-downloaded during setup via the Pixabay API. For development builds, ensure .env contains a valid PIXABAY_API_KEY. If videos are missing in a release build, run --setup to re-trigger the download.


Platform Targets

Platform        Target Triple
Linux x86_64    x86_64-unknown-linux-gnu
Linux aarch64   aarch64-unknown-linux-gnu
macOS ARM       aarch64-apple-darwin
macOS Intel     x86_64-apple-darwin
Windows x86_64  x86_64-pc-windows-msvc

Cross-compile with:

rustup target add aarch64-unknown-linux-gnu
cargo build --release --target aarch64-unknown-linux-gnu
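
For build tooling that needs to pick the right triple, the platform table can be expressed as a lookup. This is a sketch with a hypothetical function name, `target_triple`; the `os` and `arch` strings are assumed to follow the values of Rust's `std::env::consts::OS` and `std::env::consts::ARCH` ("linux", "macos", "windows"; "x86_64", "aarch64").

```rust
/// Maps an OS/architecture pair to the target triple listed in the platform table.
/// Returns None for combinations the table does not list.
pub fn target_triple(os: &str, arch: &str) -> Option<&'static str> {
    match (os, arch) {
        ("linux", "x86_64") => Some("x86_64-unknown-linux-gnu"),
        ("linux", "aarch64") => Some("aarch64-unknown-linux-gnu"),
        ("macos", "aarch64") => Some("aarch64-apple-darwin"),
        ("macos", "x86_64") => Some("x86_64-apple-darwin"),
        ("windows", "x86_64") => Some("x86_64-pc-windows-msvc"),
        _ => None,
    }
}
```

Calling `target_triple(std::env::consts::OS, std::env::consts::ARCH)` gives the triple for the machine the code was compiled for, or None on an unlisted platform.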

License

GPL-3.0-or-later. See LICENSE.