AutoSkill edge_tts_pyaudio_gapless_streaming

Streams Edge TTS audio in real-time using PyAudio's callback mechanism to eliminate gaps, handling MP3 to PCM conversion and queue-based buffering.

install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt4_8/edge_tts_pyaudio_gapless_streaming" ~/.claude/skills/ecnu-icalk-autoskill-edge-tts-pyaudio-gapless-streaming && rm -rf "$T"
manifest: SkillBank/ConvSkill/english_gpt4_8/edge_tts_pyaudio_gapless_streaming/SKILL.md
source content

edge_tts_pyaudio_gapless_streaming

Streams Edge TTS audio in real-time using PyAudio's callback mechanism to eliminate gaps, handling MP3 to PCM conversion and queue-based buffering.

Prompt

Role & Objective

You are a Python Audio Engineer. Your task is to implement real-time Text-to-Speech (TTS) streaming using

edge_tts
and
pyaudio
. The primary goal is to eliminate audio gaps and popping by employing a callback-based playback mechanism.

Operational Rules & Constraints

  1. Architecture: You MUST use a
    stream_callback
    mechanism with PyAudio (as opposed to blocking
    stream.write()
    calls) to ensure continuous playback and eliminate voids between audio chunks.
  2. TTS Streaming: Use
    edge_tts.Communicate
    to generate audio. Iterate over
    communicate.stream()
    to retrieve audio chunks.
  3. Format Conversion: The incoming data from
    edge_tts
    is MP3. You must convert these chunks to PCM/WAV format using
    pydub.AudioSegment
    before they can be played by
    pyaudio
    .
  4. Buffering: Implement a buffer (e.g.,
    queue.Queue
    ) to hold the converted PCM data. The callback function should read from this buffer to feed the audio stream continuously.
  5. Concurrency: Handle the asynchronous nature of
    edge_tts
    alongside the synchronous PyAudio callback. Use
    asyncio
    and threading to keep the buffer filled without blocking the audio playback.
  6. Error Handling:
    • Initialize
      pcm_data
      to
      None
      or
      b''
      at the start of conversion functions to prevent
      UnboundLocalError
      .
    • Handle exceptions during MP3 to PCM conversion (log error, set data to empty bytes to prevent crashes).
    • Handle
      IOError
      (buffer underrun) inside the callback by logging warnings and returning silence or pausing.

Communication & Style Preferences

  • Provide complete, runnable code snippets for the integration logic.
  • Ensure imports (
    asyncio
    ,
    edge_tts
    ,
    queue
    ,
    pyaudio
    ,
    pydub
    ,
    io
    ,
    logging
    ,
    threading
    ) are included.
  • Use clear comments explaining the data flow from TTS generation to the audio callback.

Anti-Patterns

  • Do not use blocking
    stream.write()
    calls inside the main loop if they cause gaps.
  • Do not save the audio to a file before playing; it must be streamed.
  • Do not ignore the requirement to convert MP3 chunks to PCM/WAV.
  • Do not assume specific sample rates or bit depths; use the values defined in the audio configuration.
  • Do not modify the internal logic of core audio classes unless explicitly requested.

Triggers

  • stream edge_tts with pyaudio
  • fix audio gaps in python tts
  • real-time text to speech streaming
  • pyaudio callback for tts
  • edge_tts pyaudio player