AutoSkill edge_tts_pyaudio_gapless_streaming

Integrate Microsoft Edge TTS with PyAudio using a callback-driven architecture for gapless, real-time playback, featuring threaded MP3-to-PCM conversion and buffer management.

install

source · Clone the upstream repo

git clone https://github.com/ECNU-ICALK/AutoSkill

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt4_8_GLM4.7/edge_tts_pyaudio_gapless_streaming" ~/.claude/skills/ecnu-icalk-autoskill-edge-tts-pyaudio-gapless-streaming-62848d && rm -rf "$T"

manifest: SkillBank/ConvSkill/english_gpt4_8_GLM4.7/edge_tts_pyaudio_gapless_streaming/SKILL.md

source content

edge_tts_pyaudio_gapless_streaming

Prompt

Role & Objective

You are a Python audio engineer. Your task is to integrate Microsoft Edge TTS (`edge_tts`) with PyAudio to enable real-time, gapless audio playback.

Operational Rules & Constraints

Architecture: Use an async generator to fetch audio chunks from `edge_tts.Communicate`. Convert these chunks from MP3 to PCM and feed them into a buffer (e.g., `queue.Queue`). Use the PyAudio `stream_callback` mechanism to consume this buffer for playback. Do not use blocking write loops.
Classes: Implement classes such as `AudioConfiguration` for format settings and an `AudioBufferManager` for the queue. The `StreamPlayer` should utilize the PyAudio `stream_callback` interface.
Audio Conversion: Convert incoming MP3 byte data to PCM using `pydub.AudioSegment.from_file(BytesIO(chunk), format="mp3")`. Ensure the PCM data matches the `AudioConfiguration` (frame rate, channels, sample width) before placing it in the buffer.
Concurrency: Run the TTS generation and conversion logic in a separate thread or async task to ensure the PyAudio callback is never starved of data.
Format Handling: Explicitly configure the PyAudio stream to match the TTS output specifications (usually 24 kHz, mono, 16-bit for Edge TTS once decoded to PCM); a mismatched rate, channel count, or sample width produces wrong-speed or distorted playback.
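
The buffer and callback rules above can be sketched roughly as follows. This is a minimal illustration and not the skill's reference implementation: the field names, the `None` end-of-stream marker, the silence-padding policy on underrun, and the `open_stream` helper are all assumptions made for the example. The flag constants mirror the values PyAudio exposes as `pyaudio.paContinue` and `pyaudio.paComplete`, and PyAudio itself is imported lazily so the buffer logic can be exercised without audio hardware.

```python
import queue
from dataclasses import dataclass

# PortAudio callback return flags; PyAudio exposes the same values as
# pyaudio.paContinue (0) and pyaudio.paComplete (1).
PA_CONTINUE = 0
PA_COMPLETE = 1


@dataclass(frozen=True)
class AudioConfiguration:
    rate: int = 24000       # Hz, the usual Edge TTS rate per the rules above
    channels: int = 1       # mono
    sample_width: int = 2   # bytes per sample (16-bit PCM)

    @property
    def bytes_per_frame(self) -> int:
        return self.channels * self.sample_width


class AudioBufferManager:
    """Thread-safe PCM buffer between a producer and the PyAudio callback."""

    def __init__(self, config: AudioConfiguration):
        self.config = config
        self._chunks = queue.Queue()
        self._pending = b""      # leftover bytes from a partly consumed chunk
        self._finished = False   # True once the end-of-stream marker is seen

    def put(self, pcm: bytes) -> None:
        self._chunks.put(pcm)

    def close(self) -> None:
        self._chunks.put(None)   # assumption: None marks end of stream

    def callback(self, in_data, frame_count, time_info, status):
        """stream_callback-compatible: returns (bytes, flag) for each request."""
        needed = frame_count * self.config.bytes_per_frame
        while len(self._pending) < needed and not self._finished:
            try:
                chunk = self._chunks.get_nowait()  # never block the audio thread
            except queue.Empty:
                break
            if chunk is None:
                self._finished = True
            else:
                self._pending += chunk
        out, self._pending = self._pending[:needed], self._pending[needed:]
        done = self._finished and not self._pending
        # Pad with silence on underrun or at the tail so the length always matches.
        return out.ljust(needed, b"\x00"), PA_COMPLETE if done else PA_CONTINUE


def open_stream(buffer: AudioBufferManager):
    """Hypothetical helper: open an output stream driven by buffer.callback."""
    import pyaudio  # lazy import; requires PyAudio and an output device

    pa = pyaudio.PyAudio()
    cfg = buffer.config
    stream = pa.open(
        format=pa.get_format_from_width(cfg.sample_width),
        channels=cfg.channels,
        rate=cfg.rate,
        output=True,
        stream_callback=buffer.callback,
    )
    return pa, stream
```

Padding with silence during an underrun keeps the stream alive while the producer catches up; returning fewer bytes than requested would instead end the stream.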

Anti-Patterns

Do not use `stream.write()` in a loop, as this introduces delays and gaps.
Do not write MP3 data directly to the PyAudio stream without converting to PCM first.
Do not run the TTS generation and audio playback in the same thread/block, as this will cause blocking.
Do not assume the audio format is correct without checking `edge_tts` output specifications.
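
One way to stay clear of the second and third anti-patterns is to run fetching and decoding on a background thread that only ever hands PCM to the playback side. The sketch below is illustrative only: `start_producer`, `edge_tts_chunks`, and `mp3_to_pcm` are hypothetical names, the voice `en-US-AriaNeural` is just an example, and the `None` end-of-stream marker is an assumption. `edge_tts` and `pydub` are imported lazily; `pydub` also needs ffmpeg available to decode MP3.

```python
import asyncio
import queue
import threading
from io import BytesIO


def mp3_to_pcm(chunk: bytes, rate: int = 24000, channels: int = 1, width: int = 2) -> bytes:
    """Decode one MP3 chunk with pydub and coerce it to the playback format."""
    from pydub import AudioSegment  # lazy import; needs ffmpeg on PATH

    seg = AudioSegment.from_file(BytesIO(chunk), format="mp3")
    seg = seg.set_frame_rate(rate).set_channels(channels).set_sample_width(width)
    return seg.raw_data


def edge_tts_chunks(text: str, voice: str = "en-US-AriaNeural"):
    """Return an async-generator factory that yields MP3 bytes from Edge TTS."""

    async def gen():
        import edge_tts  # lazy import; requires network access

        async for message in edge_tts.Communicate(text, voice).stream():
            if message["type"] == "audio":   # skip non-audio metadata messages
                yield message["data"]

    return gen


def start_producer(chunk_source, decode, pcm_queue: queue.Queue) -> threading.Thread:
    """Pump chunk_source() through decode() into pcm_queue on a daemon thread."""

    async def pump():
        try:
            async for chunk in chunk_source():
                pcm_queue.put(decode(chunk))
        finally:
            pcm_queue.put(None)  # assumption: None marks end of stream, even on error

    thread = threading.Thread(target=lambda: asyncio.run(pump()), daemon=True)
    thread.start()
    return thread
```

A call such as `start_producer(edge_tts_chunks("Hello"), mp3_to_pcm, pcm_queue)` would then fill the queue while the PyAudio callback drains it. Decoding each chunk in isolation follows the per-chunk `from_file` call in the rules above; it assumes every chunk is independently decodable, which may not hold if an MP3 frame is split across chunks, in which case accumulating bytes before decoding is a possible workaround.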

Triggers

integrate edge_tts with streamplayer
streaming edge tts audio with pyaudio
fix gaps in audio streaming
pyaudio callback example
real-time text to speech python