Agent-Skills azure-speech
Expert knowledge for Azure AI Speech development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using Speech/Voice SDKs, STT/TTS, custom speech or neural voices, Voice Live/avatars, or telephony flows, and other Azure AI Speech related development tasks. Not for Azure Communication Services (use azure-communication-services), Azure AI Bot Service (use azure-bot-service), Azure Notification Hubs (use azure-notification-hubs), Azure AI Video Indexer (use azure-video-indexer).
git clone https://github.com/MicrosoftDocs/Agent-Skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/MicrosoftDocs/Agent-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/azure-speech" ~/.claude/skills/microsoftdocs-agent-skills-azure-speech && rm -rf "$T"
skills/azure-speech/SKILL.md
Azure AI Speech Skill
This skill provides expert guidance for Azure AI Speech. Covers troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. It combines local quick-reference content with remote documentation fetching capabilities.
How to Use This Skill
IMPORTANT for Agent: Use the Category Index below to locate relevant sections. For categories with line ranges (e.g., `L35-L120`), use `read_file` with the specified lines. For categories with file links (e.g., `[security.md](security.md)`), use `read_file` on the linked reference file.
IMPORTANT for Agent: If `metadata.generated_at` is more than 3 months old, suggest the user pull the latest version from the repository. If `mcp_microsoftdocs` tools are not available, suggest the user install it: Installation Guide
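The freshness rule above could be checked along the following lines. This is a minimal sketch, not part of the skill itself: it assumes `metadata.generated_at` is an ISO 8601 date or datetime string (the source does not state the format) and approximates "3 months" as 90 days. The name `is_stale` is hypothetical.

```python
from datetime import date, datetime, timedelta
from typing import Optional

# Assumption: "more than 3 months old" is approximated as more than 90 days.
MAX_AGE = timedelta(days=90)


def is_stale(generated_at: str, today: Optional[date] = None) -> bool:
    """Return True if the skill metadata is older than MAX_AGE.

    Assumes generated_at is an ISO 8601 string such as "2025-01-15"
    or "2025-01-15T10:00:00"; the real format is not given in the skill.
    """
    generated = datetime.fromisoformat(generated_at).date()
    reference = today or date.today()
    return reference - generated > MAX_AGE
```

Passing `today` explicitly keeps the check deterministic in tests; when it is omitted, the current date is used.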
This skill requires network access to fetch documentation content:
- Preferred: Use `mcp_microsoftdocs:microsoft_docs_fetch` with query string `from=learn-agent-skill`. Returns Markdown.
- Fallback: Use `fetch_webpage` with query string `from=learn-agent-skill&accept=text/markdown`. Returns Markdown.
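As an illustration of the two query strings above, the sketch below appends them to a documentation URL using only the Python standard library. The helper name `with_skill_query` and its `fallback` flag are hypothetical. The code only builds the URL; the actual fetch is left to whichever tool is available.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_skill_query(url: str, fallback: bool = False) -> str:
    """Return url with the skill's query parameters added.

    fallback=False -> from=learn-agent-skill (preferred tool)
    fallback=True  -> from=learn-agent-skill&accept=text/markdown
    """
    parts = urlsplit(url)
    # Preserve any query parameters the URL already carries.
    query = dict(parse_qsl(parts.query))
    query["from"] = "learn-agent-skill"
    if fallback:
        query["accept"] = "text/markdown"
    # safe="/" keeps "text/markdown" readable instead of "text%2Fmarkdown".
    return urlunsplit(parts._replace(query=urlencode(query, safe="/")))
```

For example, calling it with a URL from the Troubleshooting table and `fallback=True` produces that URL followed by `?from=learn-agent-skill&accept=text/markdown`.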
Category Index
| Category | Lines | Description |
|---|---|---|
| Troubleshooting | L37-L47 | Diagnosing and fixing common Azure Speech/Text-to-Speech/Voice Live API and SDK errors, container and Foundry issues, CRL/compatibility problems, and retrieving session/transcription IDs for support. |
| Best Practices | L48-L64 | Best practices for audio/video prep, custom voice/avatars, latency and memory tuning, phrase/keyword optimization, and handling real-time Voice Live interactions and interruptions |
| Decision Making | L65-L83 | Guidance on choosing speech features (batch STT, custom/embedded/personal/Whisper), evaluating models/devices, and step‑by‑step migration between Speech API versions and services |
| Architecture & Design Patterns | L84-L88 | Architectural guidance for building call center voice agents using Azure AI Speech with Voice Live and Azure Communication Services, including integration patterns and design best practices. |
| Limits & Quotas | L89-L97 | Quotas, limits, and usage patterns for Azure Speech: batch TTS, custom/pro voice training & deployment, and short audio STT, plus throttling and capacity planning guidance. |
| Security | L98-L109 | Configuring security for Azure AI Speech: auth (Entra, RBAC), network isolation (VNet, Private Link, sovereign clouds), BYOS storage, encryption/keys, and voice talent consent management. |
| Configuration | L110-L144 | Configuring Azure AI Speech/Voice: audio inputs, logging, storage, SSML, languages/voices, custom speech & voice training, batch/real-time settings, and Voice Live/avatars options. |
| Integrations & Coding Patterns | L145-L168 | Integrating Azure AI Speech into apps and voice agents: SDK/REST usage, telephony, TTS/avatars, translation, LLM/Foundry/Voice Live flows, consent, and automation patterns. |
| Deployment | L169-L180 | Deploying and scaling Azure AI Speech: Docker/Kubernetes containers, on-prem STT/TTS, custom speech models/endpoints, language ID, and batch/long-form synthesis workflows. |
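The line ranges in the Lines column can be interpreted as 1-indexed, inclusive spans of SKILL.md. The sketch below shows one way to parse a range such as `L37-L47` and slice the matching lines from text already loaded into memory. Both function names are hypothetical stand-ins and are not the agent's `read_file` tool.

```python
import re
from typing import List, Tuple


def parse_line_range(spec: str) -> Tuple[int, int]:
    """Parse a range like "L37-L47" into (37, 47), 1-indexed and inclusive."""
    match = re.fullmatch(r"L(\d+)-L(\d+)", spec.strip())
    if match is None:
        raise ValueError(f"not a line range: {spec!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start < 1 or end < start:
        raise ValueError(f"invalid line range: {spec!r}")
    return start, end


def read_lines(text: str, spec: str) -> List[str]:
    """Return the lines of text covered by spec (hypothetical helper)."""
    start, end = parse_line_range(spec)
    # Convert the 1-indexed inclusive range to a 0-indexed slice.
    return text.splitlines()[start - 1:end]
```

A range that extends past the end of the text returns only the lines that exist, which matches ordinary Python slicing.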
Troubleshooting
| Topic | URL |
|---|---|
| Troubleshoot common Azure text to speech issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/faq-tts |
| Retrieve Speech to text session and transcription IDs for support | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-get-speech-session-id |
| Resolve common Azure Speech in Foundry issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/known-issues |
| Resolve Azure AI Speech SDK CRL compatibility issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-to-sdk-1-48-2 |
| Troubleshoot Speech service container deployments | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-faq |
| Troubleshoot common Azure Speech SDK issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/troubleshooting |
| Resolve common Voice Live API issues in Speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-faq |
Best Practices
Decision Making
Architecture & Design Patterns
| Topic | URL |
|---|---|
| Design call center voice agents with Voice Live and ACS | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-telephony |
Limits & Quotas
| Topic | URL |
|---|---|
| Manage custom speech model and endpoint lifecycle | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-model-and-endpoint-lifecycle |
| Deploy professional voice models to custom endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-deploy-endpoint |
| Train professional voice models and understand duration | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-train-voice |
| Use Speech-to-text REST API for short audio | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-speech-to-text-short |
| Apply Azure Speech quotas, limits, and throttling guidance | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-services-quotas-and-limits |