Awesome-openclaw-skills voice-transcribe

Transcribe audio files using OpenAI's gpt-4o-mini-transcribe model with vocabulary hints and text replacements. Requires uv (https://docs.astral.sh/uv/).

install

source · Clone the upstream repo

git clone https://github.com/sundial-org/awesome-openclaw-skills

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/sundial-org/awesome-openclaw-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/voice-transcribe" ~/.claude/skills/sundial-org-awesome-openclaw-skills-voice-transcribe && rm -rf "$T"

OpenClaw · Install into ~/.openclaw/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/sundial-org/awesome-openclaw-skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/voice-transcribe" ~/.openclaw/skills/sundial-org-awesome-openclaw-skills-voice-transcribe && rm -rf "$T"

manifest: skills/voice-transcribe/SKILL.md

source content

voice-transcribe

transcribe audio files using openai's gpt-4o-mini-transcribe model.

when to use

when receiving voice memos (especially via whatsapp), just run:

uv run /Users/darin/clawd/skills/voice-transcribe/transcribe <audio-file>

then respond based on the transcribed content.

fixing transcription errors

if darin says a word was transcribed wrong, add it to vocab.txt (for hints) or replacements.txt (for a guaranteed fix). see the custom vocabulary and text replacements sections below.

supported formats

mp3, mp4, mpeg, mpga, m4a, wav, webm, ogg, opus
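a minimal sketch of checking a file against the list above before sending it off. the helper name is_supported is hypothetical and not part of the skill; only the extension list comes from this document.

```python
from pathlib import Path

# extensions copied from the supported formats list; everything else here is illustrative
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm", "ogg", "opus"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is in the supported list (case-insensitive)."""
    return Path(path).suffix.lower().lstrip(".") in SUPPORTED_EXTENSIONS
```

this only looks at the file name, not the actual container, so a mislabeled file would still pass.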

examples

# transcribe a voice memo
transcribe /tmp/voice-memo.ogg

# pipe to other tools
transcribe /tmp/memo.ogg | pbcopy

setup

add your openai api key to /Users/darin/clawd/skills/voice-transcribe/.env:

OPENAI_API_KEY=sk-...

custom vocabulary

add words to vocab.txt (one per line) to help the model recognize names/jargon:

Clawdis
Clawdbot
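a rough sketch of how a one-per-line vocab file could be turned into a hint string. this is an assumption about the mechanism, not the skill's actual code: openai's transcription endpoint accepts a free-text prompt, and a comma-joined term list is one plausible way to fill it. load_vocab and build_prompt are hypothetical names.

```python
from pathlib import Path


def load_vocab(path: str) -> list[str]:
    """Read one term per line, trimming whitespace and skipping blank lines."""
    file = Path(path)
    if not file.exists():
        return []  # a missing vocab file just means no hints
    lines = file.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def build_prompt(terms: list[str]) -> str:
    """Join terms into a single hint string (the format is an assumption)."""
    return ", ".join(terms)
```

with the two example terms above, build_prompt would return "Clawdis, Clawdbot". hints like this nudge the model but do not guarantee a spelling.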

text replacements

if the model still gets something wrong, add a replacement to replacements.txt:

wrong spelling -> correct spelling
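a minimal sketch of parsing and applying lines in the "wrong -> correct" form shown above. the function names are hypothetical, and plain case-sensitive substring replacement is an assumption; the skill itself may match differently.

```python
def parse_replacements(text: str) -> list[tuple[str, str]]:
    """Parse 'wrong -> correct' lines, ignoring lines without an arrow."""
    pairs = []
    for line in text.splitlines():
        if "->" not in line:
            continue
        wrong, right = line.split("->", 1)
        wrong, right = wrong.strip(), right.strip()
        if wrong:  # skip lines with an empty left-hand side
            pairs.append((wrong, right))
    return pairs


def apply_replacements(transcript: str, pairs: list[tuple[str, str]]) -> str:
    """Apply each replacement in file order as a literal substring substitution."""
    for wrong, right in pairs:
        transcript = transcript.replace(wrong, right)
    return transcript
```

because this runs on the finished transcript rather than inside the model, the fix is deterministic, which is what separates it from a vocabulary hint.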

notes

assumes english (no language detection)
uses gpt-4o-mini-transcribe model specifically
caches by sha256 of audio file
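the caching note could look roughly like the sketch below: hash the audio bytes with sha256 and reuse a stored transcript when the digest has been seen before. the cache layout (one .txt file per digest), the cache_dir argument, and the transcribe callable are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable


def audio_digest(path: str) -> str:
    """Return the sha256 hex digest of the file contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def cached_transcribe(path: str, cache_dir: str, transcribe: Callable[[str], str]) -> str:
    """Return a cached transcript if one exists, otherwise transcribe and store it."""
    cache_file = Path(cache_dir) / f"{audio_digest(path)}.txt"
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")
    text = transcribe(path)  # hypothetical callable that performs the api request
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(text, encoding="utf-8")
    return text
```

keying on content rather than file name means a renamed or re-sent copy of the same memo would hit the cache.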