Awesome-openclaw-skills whisper-mlx-local
Free local speech-to-text for Telegram and WhatsApp using MLX Whisper on Apple Silicon. Private, no API costs.
install
source · Clone the upstream repo
git clone https://github.com/sundial-org/awesome-openclaw-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/sundial-org/awesome-openclaw-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/whisper-mlx-local" ~/.claude/skills/sundial-org-awesome-openclaw-skills-whisper-mlx-local && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/sundial-org/awesome-openclaw-skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/whisper-mlx-local" ~/.openclaw/skills/sundial-org-awesome-openclaw-skills-whisper-mlx-local && rm -rf "$T"
manifest: skills/whisper-mlx-local/SKILL.md
Local Whisper
Transcribe voice messages for free on Telegram and WhatsApp. No API keys. No costs. Runs on your Mac.
The Problem
Voice transcription APIs cost money:
- OpenAI Whisper: $0.006/minute
- Groq: $0.001/minute
- AssemblyAI: $0.01/minute
If you transcribe a lot of Telegram voice messages, it adds up.
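To put a number on "it adds up", here is a minimal sketch that turns the per-minute prices listed above into a monthly estimate. The function name `monthly_cost` and the usage figures (50 messages a day, 30 seconds each) are assumptions made for illustration, not measurements.

```python
# Per-minute prices (USD) as quoted in the list above.
PRICE_PER_MINUTE = {
    "OpenAI Whisper": 0.006,
    "Groq": 0.001,
    "AssemblyAI": 0.01,
}


def monthly_cost(price_per_minute, messages_per_day, avg_seconds, days=30):
    """Estimate the monthly API cost in USD for a given voice-message volume."""
    minutes = messages_per_day * avg_seconds / 60 * days
    return round(price_per_minute * minutes, 2)


def cost_table(messages_per_day, avg_seconds):
    """Return {provider: monthly cost} for every provider listed above."""
    return {
        name: monthly_cost(price, messages_per_day, avg_seconds)
        for name, price in PRICE_PER_MINUTE.items()
    }
```

Calling `cost_table(50, 30)` covers 750 minutes of audio per month; plug in your own volume to see what the local daemon would save.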
The Solution
This skill runs Whisper locally on your Mac. Same quality, zero cost.
- ✅ Free forever
- ✅ Private (audio never leaves your Mac)
- ✅ Fast (~1 second per message)
- ✅ Works offline
⚠️ Important Notes
- First run downloads a ~1.5GB model. Be patient, this only happens once
- First transcription is slow: the model loads into memory (~10-30 seconds). After that, each message takes about a second
- Already using the OpenAI API for transcription? Replace your existing tools.media.audio config with the one below
Quick Start
1. Install dependencies
pip3 install -r requirements.txt
2. Start the daemon
python3 scripts/daemon.py
The first run downloads the Whisper model (~1.5GB). Wait for the "Ready" message.
3. Add to OpenClaw config
Add this to your ~/.openclaw/openclaw.json:
{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "type": "cli",
            "command": "~/.openclaw/workspace/skills/local-whisper/scripts/transcribe.sh",
            "args": ["{{MediaPath}}"],
            "timeoutSeconds": 60
          }
        ]
      }
    }
  }
}
4. Restart gateway
openclaw gateway restart
Now voice messages from Telegram, WhatsApp, etc. will be transcribed locally for free!
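If you already have an openclaw.json with other settings, hand-editing step 3 risks clobbering them. The sketch below merges the audio block from step 3 into an existing config while leaving every other key alone. It assumes the file is plain JSON; `merge_audio_config` and `update_config_file` are hypothetical helper names and are not part of OpenClaw or this skill.

```python
import copy
import json
from pathlib import Path

# Values copied from step 3; adjust "command" to where the skill actually lives.
AUDIO_BLOCK = {
    "enabled": True,
    "models": [
        {
            "type": "cli",
            "command": "~/.openclaw/workspace/skills/local-whisper/scripts/transcribe.sh",
            "args": ["{{MediaPath}}"],
            "timeoutSeconds": 60,
        }
    ],
}


def merge_audio_config(config):
    """Return a copy of config with tools.media.audio replaced by AUDIO_BLOCK."""
    merged = copy.deepcopy(config)
    media = merged.setdefault("tools", {}).setdefault("media", {})
    media["audio"] = copy.deepcopy(AUDIO_BLOCK)
    return merged


def update_config_file(path):
    """Hypothetical helper: rewrite the config in place, keeping a .bak copy."""
    path = Path(path).expanduser()
    original = path.read_text() if path.exists() else "{}"
    if path.exists():
        # Keep the previous version next to the file before overwriting it.
        path.with_name(path.name + ".bak").write_text(original)
    merged = merge_audio_config(json.loads(original))
    path.write_text(json.dumps(merged, indent=2) + "\n")
```

Running `update_config_file("~/.openclaw/openclaw.json")` and then restarting the gateway as in step 4 should have the same effect as the manual edit.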
Manual test
./scripts/transcribe.sh voice_message.ogg
Use Case: Telegram Voice Messages
Instead of paying for the OpenAI API to transcribe incoming voice messages, point OpenClaw to this local daemon. Free transcription forever.
Auto-Start on Login
cp com.local-whisper.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.local-whisper.plist
API
Daemon runs at localhost:8787:
curl -X POST http://localhost:8787/transcribe -F "file=@audio.ogg"
# {"text": "Hello world", "language": "en"}
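The same request can be made from Python using only the standard library. This is a sketch under the assumption that the daemon accepts a standard multipart/form-data upload in a field named "file", as the curl example suggests. `build_multipart` and `transcribe` are illustrative names, not functions shipped with the skill.

```python
import json
import urllib.request
import uuid
from pathlib import Path

DAEMON_URL = "http://localhost:8787/transcribe"  # endpoint from the curl example


def build_multipart(field, filename, data):
    """Encode one file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path, url=DAEMON_URL, timeout=60):
    """POST an audio file to the daemon and return the decoded JSON response."""
    audio = Path(path)
    body, content_type = build_multipart("file", audio.name, audio.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the daemon running, `transcribe("audio.ogg")` should return a dict shaped like the JSON in the curl example, so the transcript is available under the "text" key.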
Translation
Any language → English:
./scripts/transcribe.sh spanish_audio.ogg --translate
Requirements
- macOS with Apple Silicon (M1/M2/M3/M4)
- Python 3.9+
License
MIT