Claude-skill-registry azure-ai
Build AI solutions with Azure AI services including OpenAI, Cognitive Services, Document Intelligence, and AI Search. Use for enterprise AI, document processing, and intelligent applications on Azure.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/azure-ai" ~/.claude/skills/majiayu000-claude-skill-registry-azure-ai-2643ac && rm -rf "$T"
manifest: skills/data/azure-ai/SKILL.md
Azure AI Skill
Complete guidance for building, configuring, troubleshooting, and managing Azure AI services.
Quick Reference
Service Categories
| Category | Services |
|---|---|
| AI Platform | Microsoft Foundry (Azure AI Foundry), Azure AI Hub, AI Projects |
| Generative AI | Azure OpenAI Service (GPT-4, GPT-4o, o1, DALL-E, Whisper) |
| Search & RAG | Azure AI Search (vector, semantic, hybrid, agentic retrieval) |
| AI Agents | Azure AI Agent Service, Foundry Agent Service, Multi-agent Orchestration |
| Document AI | Document Intelligence (OCR, form extraction, prebuilt models) |
| Cognitive Services | Vision, Speech, Language, Translator, Content Safety |
| ML Platform | Azure Machine Learning (MLOps, training, deployment) |
| Governance | Responsible AI, Content Filtering, Safety Evaluations |
Common CLI Prefixes
    az cognitiveservices   # Cognitive Services & Azure OpenAI
    az search              # Azure AI Search
    az ml                  # Azure Machine Learning
    az ai                  # Azure AI resources (newer)
1. Microsoft Foundry (Azure AI Foundry)
Overview
Microsoft Foundry is the unified platform for enterprise AI operations, combining:
- AI Hub: Shared infrastructure (connections, compute, policies)
- AI Projects: Workspaces for building AI applications
- Model Catalog: Pre-trained models from Azure OpenAI, Meta, Mistral, Cohere
- Prompt Flow: Visual orchestration for LLM workflows
Portal Access
- Foundry (New): https://ai.azure.com
- Foundry (Classic): https://ai.azure.com/build (legacy)
Create AI Hub & Project
    # Create resource group
    az group create --name rg-ai-foundry --location eastus

    # Create AI Hub (shared infrastructure)
    az ml workspace create \
      --name ai-hub-prod \
      --resource-group rg-ai-foundry \
      --kind hub \
      --location eastus

    # Create AI Project (linked to hub)
    az ml workspace create \
      --name ai-project-chatbot \
      --resource-group rg-ai-foundry \
      --kind project \
      --hub-id /subscriptions/{sub}/resourceGroups/rg-ai-foundry/providers/Microsoft.MachineLearningServices/workspaces/ai-hub-prod
Python SDK Setup
    # Install SDK
    # pip install azure-ai-projects azure-identity
    from azure.ai.projects import AIProjectClient
    from azure.identity import DefaultAzureCredential

    # Initialize client
    project = AIProjectClient(
        credential=DefaultAzureCredential(),
        endpoint="https://<hub-name>.api.azureml.ms",
        project_name="ai-project-chatbot"
    )

    # List models in project
    for model in project.models.list():
        print(f"{model.name}: {model.description}")
Connections Management
    # List connections in hub
    az ml connection list --workspace-name ai-hub-prod --resource-group rg-ai-foundry

    # Create Azure OpenAI connection
    az ml connection create \
      --file connection.yml \
      --workspace-name ai-hub-prod \
      --resource-group rg-ai-foundry
Connection YAML example:
    # connection.yml
    name: aoai-connection
    type: azure_open_ai
    target: https://<openai-resource>.openai.azure.com/
    api_key: <your-api-key>
    api_version: "2024-10-21"
2. Azure OpenAI Service
Deployment Types
| Type | Use Case | Billing |
|---|---|---|
| Standard | Development, testing | Pay-per-token |
| Global Standard | Production, global routing | Pay-per-token |
| Provisioned (PTU) | High-throughput, predictable latency | Reserved capacity |
| Data Zone | Data residency requirements | Region-specific |
Available Models (as of 2024)
- GPT-4o (latest multimodal) - Text, images, audio
- GPT-4 Turbo - 128k context window
- GPT-4 - 8k/32k context
- o1-preview / o1-mini - Reasoning models
- DALL-E 3 - Image generation
- Whisper - Speech-to-text
- text-embedding-ada-002 / text-embedding-3-large - Embeddings
Create Azure OpenAI Resource
    # Create Cognitive Services account for OpenAI
    az cognitiveservices account create \
      --name openai-prod \
      --resource-group rg-ai \
      --kind OpenAI \
      --sku S0 \
      --location eastus \
      --custom-domain openai-prod

    # Deploy a model
    az cognitiveservices account deployment create \
      --name openai-prod \
      --resource-group rg-ai \
      --deployment-name gpt-4o-deployment \
      --model-name gpt-4o \
      --model-version "2024-08-06" \
      --model-format OpenAI \
      --sku-name Standard \
      --sku-capacity 10
List Deployments & Models
    # List all deployments
    az cognitiveservices account deployment list \
      --name openai-prod \
      --resource-group rg-ai \
      --output table

    # List available models in region
    az cognitiveservices account list-models \
      --name openai-prod \
      --resource-group rg-ai
Python SDK Usage
    # pip install openai
    from openai import AzureOpenAI

    client = AzureOpenAI(
        api_key="<your-key>",
        api_version="2024-10-21",
        azure_endpoint="https://openai-prod.openai.azure.com"
    )

    # Chat completion
    response = client.chat.completions.create(
        model="gpt-4o-deployment",  # deployment name
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain quantum computing"}
        ],
        max_tokens=500,
        temperature=0.7
    )
    print(response.choices[0].message.content)

    # Embeddings
    embedding = client.embeddings.create(
        model="text-embedding-3-large",
        input="The quick brown fox"
    )
    print(f"Vector dimension: {len(embedding.data[0].embedding)}")

    # Image generation
    image = client.images.generate(
        model="dall-e-3",
        prompt="A futuristic city skyline at sunset",
        size="1024x1024",
        quality="hd"
    )
    print(image.data[0].url)
Content Filtering
    # View content filter configuration
    az cognitiveservices account show \
      --name openai-prod \
      --resource-group rg-ai \
      --query properties.contentFilterConfiguration
Configure custom content filter policy:
    # Categories: hate, violence, sexual, self-harm
    # Severity levels: safe, low, medium, high
    filter_config = {
        "hate": {"severity": "medium", "blocking": True},
        "violence": {"severity": "low", "blocking": True},
        "sexual": {"severity": "medium", "blocking": True},
        "self_harm": {"severity": "low", "blocking": True}
    }
3. Azure AI Search
Search Modes
| Mode | Features | Use Case |
|---|---|---|
| Keyword | BM25 ranking, full-text | Traditional search |
| Vector | Embedding similarity | Semantic similarity |
| Hybrid | Keyword + Vector | Best of both |
| Semantic | Re-ranking with language models | Improved relevance |
| Agentic Retrieval | Knowledge store for AI agents | RAG applications |
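Hybrid mode has to merge a keyword ranking and a vector ranking into one list. Azure AI Search does this with Reciprocal Rank Fusion (RRF). The sketch below illustrates the idea on two made-up ranked lists of document IDs; the function name `rrf_merge`, the sample IDs, and the constant `k=60` are assumptions for illustration, not the service's internal implementation.

```python
from collections import defaultdict

def rrf_merge(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. A higher fused score ranks first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

keyword_hits = ["doc-a", "doc-b", "doc-c"]  # hypothetical BM25 order
vector_hits = ["doc-c", "doc-a", "doc-d"]   # hypothetical vector order
fused = rrf_merge([keyword_hits, vector_hits])
print(fused)
```

Documents that appear near the top of both lists (here `doc-a` and `doc-c`) end up ahead of documents that appear in only one.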
Create Search Service
    # Create search service
    az search service create \
      --name search-prod \
      --resource-group rg-ai \
      --sku standard \
      --location eastus \
      --partition-count 1 \
      --replica-count 1

    # Get admin keys
    az search admin-key show \
      --service-name search-prod \
      --resource-group rg-ai

    # Get query keys
    az search query-key list \
      --service-name search-prod \
      --resource-group rg-ai
Create Vector Index
    # pip install azure-search-documents
    from azure.search.documents.indexes import SearchIndexClient
    from azure.search.documents.indexes.models import (
        SearchIndex,
        SearchField,
        SearchFieldDataType,
        VectorSearch,
        HnswAlgorithmConfiguration,
        VectorSearchProfile,
        SemanticConfiguration,
        SemanticField,
        SemanticPrioritizedFields,
        SemanticSearch
    )
    from azure.core.credentials import AzureKeyCredential

    # Initialize client
    index_client = SearchIndexClient(
        endpoint="https://search-prod.search.windows.net",
        credential=AzureKeyCredential("<admin-key>")
    )

    # Define index with vector field
    index = SearchIndex(
        name="documents-index",
        fields=[
            SearchField(name="id", type=SearchFieldDataType.String, key=True),
            SearchField(name="title", type=SearchFieldDataType.String, searchable=True),
            SearchField(name="content", type=SearchFieldDataType.String, searchable=True),
            SearchField(name="category", type=SearchFieldDataType.String, filterable=True, facetable=True),
            SearchField(
                name="content_vector",
                type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
                searchable=True,
                vector_search_dimensions=1536,  # text-embedding-ada-002
                vector_search_profile_name="vector-profile"
            )
        ],
        vector_search=VectorSearch(
            algorithms=[
                HnswAlgorithmConfiguration(name="hnsw-config")
            ],
            profiles=[
                VectorSearchProfile(
                    name="vector-profile",
                    algorithm_configuration_name="hnsw-config"
                )
            ]
        ),
        semantic_search=SemanticSearch(
            configurations=[
                SemanticConfiguration(
                    name="semantic-config",
                    prioritized_fields=SemanticPrioritizedFields(
                        title_field=SemanticField(field_name="title"),
                        content_fields=[SemanticField(field_name="content")]
                    )
                )
            ]
        )
    )

    # Create index
    index_client.create_or_update_index(index)
Hybrid Search Query
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient
    from azure.search.documents.models import VectorizedQuery

    search_client = SearchClient(
        endpoint="https://search-prod.search.windows.net",
        index_name="documents-index",
        credential=AzureKeyCredential("<query-key>")
    )

    # Get query embedding (from Azure OpenAI).
    # get_embedding is a placeholder for your own helper; it must return a
    # vector with the same dimension as the index's content_vector field.
    query_embedding = get_embedding("What is machine learning?")

    # Hybrid search (keyword + vector) with semantic re-ranking
    results = search_client.search(
        search_text="machine learning",
        vector_queries=[
            VectorizedQuery(
                vector=query_embedding,
                k_nearest_neighbors=5,
                fields="content_vector"
            )
        ],
        query_type="semantic",
        semantic_configuration_name="semantic-config",
        top=10
    )

    for result in results:
        print(f"{result['title']}: {result['@search.score']}")
Agentic Retrieval (Knowledge Store)
    # pip install azure-ai-projects
    from azure.ai.projects import AIProjectClient
    from azure.ai.projects.models import AgentKnowledgeStore

    # Create knowledge store linked to search index
    knowledge_store = project.agents.knowledge_stores.create(
        name="docs-knowledge",
        index_name="documents-index",
        search_endpoint="https://search-prod.search.windows.net",
        semantic_configuration="semantic-config"
    )

    # Use in agent
    agent = project.agents.create(
        name="doc-assistant",
        model="gpt-4o",
        knowledge_store_ids=[knowledge_store.id]
    )
4. Azure AI Agents
Agent Types
| Type | Description | Use Case |
|---|---|---|
| Foundry Agent | Managed agent with tools | Chat assistants |
| Code Interpreter | Python execution sandbox | Data analysis |
| File Search | Document retrieval | RAG applications |
| Function Calling | Custom function execution | API integration |
| Multi-Agent | Orchestrated agent swarm | Complex workflows |
Create Basic Agent
    # pip install azure-ai-projects azure-ai-agents
    import time

    from azure.ai.projects import AIProjectClient
    from azure.ai.agents import AgentsClient
    from azure.identity import DefaultAzureCredential

    # Initialize
    project = AIProjectClient(
        credential=DefaultAzureCredential(),
        endpoint="https://<hub>.api.azureml.ms",
        project_name="my-project"
    )

    # Create agent with tools
    agent = project.agents.create_agent(
        model="gpt-4o",
        name="data-analyst",
        instructions="You are a data analyst. Analyze data and create visualizations.",
        tools=[
            {"type": "code_interpreter"},
            {"type": "file_search"}
        ]
    )

    # Create thread and run
    thread = project.agents.create_thread()
    message = project.agents.create_message(
        thread_id=thread.id,
        role="user",
        content="Analyze the sales data and create a trend chart"
    )
    run = project.agents.create_run(
        thread_id=thread.id,
        agent_id=agent.id
    )

    # Poll until the run leaves the queued/in-progress states
    while run.status in ["queued", "in_progress"]:
        time.sleep(1)
        run = project.agents.get_run(thread_id=thread.id, run_id=run.id)

    # Get response
    messages = project.agents.list_messages(thread_id=thread.id)
    for msg in messages.data:
        if msg.role == "assistant":
            print(msg.content[0].text.value)
Function Calling Agent
    import json

    # Continues the previous example: `project`, `thread`, and `run` already exist.
    # fetch_weather is your own function.

    # Define custom functions
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current weather for a location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "City name"
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"]
                        }
                    },
                    "required": ["location"]
                }
            }
        }
    ]

    agent = project.agents.create_agent(
        model="gpt-4o",
        name="weather-assistant",
        instructions="Help users with weather information.",
        tools=tools
    )

    # Handle function calls in run loop
    while run.status == "requires_action":
        tool_calls = run.required_action.submit_tool_outputs.tool_calls
        tool_outputs = []
        for call in tool_calls:
            if call.function.name == "get_weather":
                args = json.loads(call.function.arguments)
                result = fetch_weather(args["location"])  # Your function
                tool_outputs.append({
                    "tool_call_id": call.id,
                    "output": json.dumps(result)
                })
        run = project.agents.submit_tool_outputs(
            thread_id=thread.id,
            run_id=run.id,
            tool_outputs=tool_outputs
        )
Multi-Agent Orchestration
    # Supervisor pattern - one agent coordinates others
    supervisor = project.agents.create_agent(
        model="gpt-4o",
        name="supervisor",
        instructions="""You are a supervisor coordinating a team:
        - researcher: Finds information
        - writer: Creates content
        - reviewer: Reviews and edits
        Delegate tasks and synthesize results."""
    )

    researcher = project.agents.create_agent(
        model="gpt-4o",
        name="researcher",
        instructions="You research topics and provide factual information.",
        tools=[{"type": "file_search"}]
    )

    writer = project.agents.create_agent(
        model="gpt-4o",
        name="writer",
        instructions="You write clear, engaging content based on research."
    )

    reviewer = project.agents.create_agent(
        model="gpt-4o",
        name="reviewer",
        instructions="You review content for accuracy, clarity, and style."
    )

    # Orchestration logic handles routing between agents
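The routing logic itself is left open above. One simple option is a fixed pipeline in which each role's output becomes the next role's input. The sketch below shows that control flow with plain Python callables standing in for the researcher, writer, and reviewer agents; `run_pipeline`, `Worker`, and the lambda workers are hypothetical names for illustration, and a real implementation would replace each callable with a thread-and-run call against the agent service.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for an agent: takes text in, returns text out.
Worker = Callable[[str], str]

def run_pipeline(task: str, workers: Dict[str, Worker], order: List[str]) -> str:
    """Pass the task through each named worker in order, feeding output forward."""
    result = task
    for name in order:
        if name not in workers:
            raise KeyError(f"No worker registered for role '{name}'")
        result = workers[name](result)
    return result

workers = {
    "researcher": lambda text: f"[facts about: {text}]",
    "writer": lambda text: f"Draft based on {text}",
    "reviewer": lambda text: f"Reviewed: {text}",
}

final = run_pipeline("Azure AI Search", workers, ["researcher", "writer", "reviewer"])
print(final)
```

A supervisor agent could choose the `order` list dynamically instead of hard-coding it; the fixed order here just keeps the example deterministic.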
5. Document Intelligence
Prebuilt Models
| Model | Use Case |
|---|---|
| read | General OCR, text extraction |
| layout | Tables, figures, structure |
| invoice | Invoice data extraction |
| receipt | Receipt parsing |
| id-document | IDs, passports, driver licenses |
| business-card | Contact information |
| tax documents | W-2, 1099, etc. |
| mortgage | Loan documents |
| health-insurance | Insurance cards |
| contract | Legal documents |
Create Document Intelligence Resource
    az cognitiveservices account create \
      --name doc-intel-prod \
      --resource-group rg-ai \
      --kind FormRecognizer \
      --sku S0 \
      --location eastus
Python SDK Usage
    # pip install azure-ai-documentintelligence
    from azure.ai.documentintelligence import DocumentIntelligenceClient
    from azure.ai.documentintelligence.models import AnalyzeDocumentRequest
    from azure.core.credentials import AzureKeyCredential

    client = DocumentIntelligenceClient(
        endpoint="https://doc-intel-prod.cognitiveservices.azure.com",
        credential=AzureKeyCredential("<key>")
    )

    # Analyze invoice
    with open("invoice.pdf", "rb") as f:
        poller = client.begin_analyze_document(
            model_id="prebuilt-invoice",
            analyze_request=AnalyzeDocumentRequest(bytes_source=f.read())
        )
    result = poller.result()

    for invoice in result.documents:
        print(f"Vendor: {invoice.fields.get('VendorName', {}).get('content')}")
        print(f"Total: {invoice.fields.get('InvoiceTotal', {}).get('content')}")
        print(f"Date: {invoice.fields.get('InvoiceDate', {}).get('content')}")
        # Line items
        for item in invoice.fields.get("Items", {}).get("valueArray", []):
            print(f"  - {item.get('content')}")

    # Layout analysis (tables, figures)
    poller = client.begin_analyze_document(
        model_id="prebuilt-layout",
        analyze_request=AnalyzeDocumentRequest(url_source="https://example.com/doc.pdf")
    )
    result = poller.result()

    for table in result.tables:
        print(f"Table: {table.row_count} rows x {table.column_count} cols")
        for cell in table.cells:
            print(f"  [{cell.row_index},{cell.column_index}]: {cell.content}")
Custom Model Training
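    from azure.ai.documentintelligence import DocumentIntelligenceAdministrationClient
    from azure.core.credentials import AzureKeyCredential

    # Model building lives on the administration client, not DocumentIntelligenceClient
    admin_client = DocumentIntelligenceAdministrationClient(
        endpoint="https://doc-intel-prod.cognitiveservices.azure.com",
        credential=AzureKeyCredential("<key>")
    )

    # Train custom extraction model
    training_data = "https://storage.blob.core.windows.net/training-data?sv=..."
    poller = admin_client.begin_build_document_model(
        {
            "modelId": "custom-contract-model",
            "description": "Custom contract extraction",
            "buildMode": "template",  # required; use "neural" for varied layouts
            "azureBlobSource": {
                "containerUrl": training_data
            }
        }
    )
    model = poller.result()

    print(f"Model ID: {model.model_id}")
    print(f"Fields: {list(model.doc_types.values())[0].field_schema.keys()}")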
6. Cognitive Services
Vision
    # pip install azure-ai-vision-imageanalysis
    from azure.ai.vision.imageanalysis import ImageAnalysisClient
    from azure.ai.vision.imageanalysis.models import VisualFeatures
    from azure.core.credentials import AzureKeyCredential

    client = ImageAnalysisClient(
        endpoint="https://vision-prod.cognitiveservices.azure.com",
        credential=AzureKeyCredential("<key>")
    )

    # Analyze image by URL (use client.analyze(image_data=...) for raw bytes)
    result = client.analyze_from_url(
        image_url="https://example.com/image.jpg",
        visual_features=[
            VisualFeatures.CAPTION,
            VisualFeatures.TAGS,
            VisualFeatures.OBJECTS,
            VisualFeatures.DENSE_CAPTIONS,
            VisualFeatures.READ,  # OCR
            VisualFeatures.SMART_CROPS,
            VisualFeatures.PEOPLE
        ]
    )

    print(f"Caption: {result.caption.text} ({result.caption.confidence:.2f})")
    print(f"Tags: {', '.join([t.name for t in result.tags.list])}")
    for obj in result.objects.list:
        print(f"Object: {obj.tags[0].name} at {obj.bounding_box}")
Speech
    # pip install azure-cognitiveservices-speech
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(
        subscription="<key>",
        region="eastus"
    )

    # Speech-to-text
    audio_config = speechsdk.audio.AudioConfig(filename="audio.wav")
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config,
        audio_config=audio_config
    )
    result = recognizer.recognize_once()
    print(f"Recognized: {result.text}")

    # Text-to-speech
    speech_config.speech_synthesis_voice_name = "en-US-JennyNeural"
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
    result = synthesizer.speak_text_async("Hello, this is Azure Speech.").get()
    audio_data = result.audio_data
Language
    # pip install azure-ai-textanalytics
    from azure.ai.textanalytics import TextAnalyticsClient
    from azure.core.credentials import AzureKeyCredential

    client = TextAnalyticsClient(
        endpoint="https://language-prod.cognitiveservices.azure.com",
        credential=AzureKeyCredential("<key>")
    )

    documents = ["Azure AI is amazing! I love using it for my projects."]

    # Sentiment analysis
    result = client.analyze_sentiment(documents)[0]
    print(f"Sentiment: {result.sentiment} ({result.confidence_scores})")

    # Key phrase extraction
    result = client.extract_key_phrases(documents)[0]
    print(f"Key phrases: {result.key_phrases}")

    # Entity recognition
    result = client.recognize_entities(documents)[0]
    for entity in result.entities:
        print(f"Entity: {entity.text} ({entity.category})")

    # Language detection
    result = client.detect_language(documents)[0]
    print(f"Language: {result.primary_language.name}")
Translator
    # pip install azure-ai-translation-text
    from azure.ai.translation.text import TextTranslationClient
    from azure.core.credentials import AzureKeyCredential

    client = TextTranslationClient(
        credential=AzureKeyCredential("<key>"),
        region="eastus"
    )

    # Translate text
    result = client.translate(
        body=["Hello, how are you?"],
        to_language=["es", "fr", "de"]
    )
    for translation in result[0].translations:
        print(f"{translation.to}: {translation.text}")

    # Detect language: when no source language is given, the translate
    # response reports the detected one
    result = client.translate(body=["Bonjour le monde"], to_language=["en"])
    detected = result[0].detected_language
    print(f"Detected: {detected.language} ({detected.score})")
7. Content Safety
Categories & Severity Levels
| Category | Description | Severity (0-7) |
|---|---|---|
| Hate | Discriminatory content | 0=safe, 2=low, 4=medium, 6=high |
| Violence | Violent content | 0=safe, 2=low, 4=medium, 6=high |
| Sexual | Sexual content | 0=safe, 2=low, 4=medium, 6=high |
| SelfHarm | Self-harm content | 0=safe, 2=low, 4=medium, 6=high |
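When logging or reporting results, it can help to turn the numeric severity back into the labels used in the table. The helper below is a minimal sketch: `severity_label` is a hypothetical name, and the treatment of the odd values (1, 3, 5, 7) as belonging to the band below them is an assumption made here for illustration.

```python
def severity_label(severity: int) -> str:
    """Map a 0-7 severity score to the label bands shown in the table.

    Assumption: odd values fall into the band that starts just below them
    (0-1 safe, 2-3 low, 4-5 medium, 6-7 high).
    """
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    if severity < 2:
        return "safe"
    if severity < 4:
        return "low"
    if severity < 6:
        return "medium"
    return "high"

for score in (0, 2, 4, 6):
    print(score, severity_label(score))
```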
Create Content Safety Resource
    az cognitiveservices account create \
      --name content-safety-prod \
      --resource-group rg-ai \
      --kind ContentSafety \
      --sku S0 \
      --location eastus
Python SDK Usage
    # pip install azure-ai-contentsafety
    from azure.ai.contentsafety import ContentSafetyClient
    from azure.ai.contentsafety.models import AnalyzeTextOptions, TextCategory
    from azure.core.credentials import AzureKeyCredential

    client = ContentSafetyClient(
        endpoint="https://content-safety-prod.cognitiveservices.azure.com",
        credential=AzureKeyCredential("<key>")
    )

    # Analyze text
    request = AnalyzeTextOptions(
        text="Sample text to analyze for safety",
        categories=[
            TextCategory.HATE,
            TextCategory.VIOLENCE,
            TextCategory.SEXUAL,
            TextCategory.SELF_HARM
        ]
    )
    result = client.analyze_text(request)

    for category_result in result.categories_analysis:
        print(f"{category_result.category}: severity {category_result.severity}")

    # Check if content should be blocked (threshold-based)
    def should_block(result, threshold=4):
        for cat in result.categories_analysis:
            if cat.severity >= threshold:
                return True
        return False

    if should_block(result):
        print("Content blocked due to safety concerns")
Image Moderation
    from azure.ai.contentsafety.models import AnalyzeImageOptions, ImageData

    # Analyze image
    with open("image.jpg", "rb") as f:
        image_data = f.read()

    request = AnalyzeImageOptions(
        image=ImageData(content=image_data)
    )
    result = client.analyze_image(request)

    for category in result.categories_analysis:
        print(f"{category.category}: {category.severity}")
8. Azure Machine Learning
Workspace Management
    # Create ML workspace
    az ml workspace create \
      --name ml-workspace-prod \
      --resource-group rg-ai \
      --location eastus

    # List workspaces
    az ml workspace list --resource-group rg-ai --output table

    # Create compute cluster
    az ml compute create \
      --name gpu-cluster \
      --type AmlCompute \
      --size Standard_NC6s_v3 \
      --min-instances 0 \
      --max-instances 4 \
      --workspace-name ml-workspace-prod \
      --resource-group rg-ai
Model Registration & Deployment
    # pip install azure-ai-ml
    from azure.ai.ml import MLClient
    from azure.ai.ml.entities import Model, ManagedOnlineEndpoint, ManagedOnlineDeployment
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(
        credential=DefaultAzureCredential(),
        subscription_id="<sub-id>",
        resource_group_name="rg-ai",
        workspace_name="ml-workspace-prod"
    )

    # Register model
    model = ml_client.models.create_or_update(
        Model(
            name="my-classifier",
            path="./model",
            description="Image classification model"
        )
    )

    # Create online endpoint
    endpoint = ManagedOnlineEndpoint(
        name="classifier-endpoint",
        auth_mode="key"
    )
    ml_client.online_endpoints.begin_create_or_update(endpoint).result()

    # Deploy model
    deployment = ManagedOnlineDeployment(
        name="blue",
        endpoint_name="classifier-endpoint",
        model=model.id,
        instance_type="Standard_DS3_v2",
        instance_count=1
    )
    ml_client.online_deployments.begin_create_or_update(deployment).result()

    # Set traffic
    endpoint.traffic = {"blue": 100}
    ml_client.online_endpoints.begin_create_or_update(endpoint).result()
Training Jobs
    from azure.ai.ml import command
    from azure.ai.ml.entities import Environment

    # Define training job
    job = command(
        code="./src",
        command="python train.py --epochs ${{inputs.epochs}} --lr ${{inputs.lr}}",
        inputs={
            "epochs": 10,
            "lr": 0.001
        },
        environment=Environment(
            image="mcr.microsoft.com/azureml/pytorch-2.0-cuda11.8:latest"
        ),
        compute="gpu-cluster",
        display_name="training-run"
    )

    # Submit job (ml_client is the MLClient created earlier)
    returned_job = ml_client.jobs.create_or_update(job)
    print(f"Job URL: {returned_job.studio_url}")

    # Monitor job
    current_job = ml_client.jobs.get(returned_job.name)
    print(f"Status: {current_job.status}")
MLflow Integration
    import mlflow
    from azure.ai.ml import MLClient

    # Set tracking URI
    ml_client = MLClient(...)
    mlflow_tracking_uri = ml_client.workspaces.get(ml_client.workspace_name).mlflow_tracking_uri
    mlflow.set_tracking_uri(mlflow_tracking_uri)

    # Log experiment
    with mlflow.start_run():
        mlflow.log_param("learning_rate", 0.001)
        mlflow.log_metric("accuracy", 0.95)
        mlflow.log_artifact("model.pkl")
        mlflow.sklearn.log_model(model, "model")
9. Observability & Tracing
Application Insights Integration
    # pip install azure-monitor-opentelemetry
    from azure.monitor.opentelemetry import configure_azure_monitor
    from opentelemetry import trace

    # Configure (use connection string from Azure Portal)
    configure_azure_monitor(
        connection_string="InstrumentationKey=...;IngestionEndpoint=..."
    )

    tracer = trace.get_tracer(__name__)

    # Create spans for AI operations
    with tracer.start_as_current_span("llm-inference") as span:
        span.set_attribute("model", "gpt-4o")
        span.set_attribute("tokens.input", 100)
        span.set_attribute("tokens.output", 250)
        response = call_openai(prompt)
        span.set_attribute("tokens.total", response.usage.total_tokens)
Prompt Flow Tracing
    from promptflow.tracing import start_trace

    # Enable tracing
    start_trace(
        resource_attributes={
            "service.name": "chatbot-service",
            "service.version": "1.0.0"
        }
    )

    # Traces are automatically captured for:
    # - Azure OpenAI calls
    # - Azure AI Search queries
    # - Custom function calls
Azure AI Evaluation
    # pip install azure-ai-evaluation
    from azure.ai.evaluation import GroundednessEvaluator, RelevanceEvaluator

    # AI-assisted evaluators need a judge model; point this at your own deployment
    model_config = {
        "azure_endpoint": "https://openai-prod.openai.azure.com",
        "api_key": "<your-key>",
        "azure_deployment": "gpt-4o-deployment",
        "api_version": "2024-10-21"
    }

    # Evaluate response quality (evaluators are callables)
    groundedness = GroundednessEvaluator(model_config)
    relevance = RelevanceEvaluator(model_config)

    result = groundedness(
        query="What is Azure AI?",
        context="Azure AI is Microsoft's cloud AI platform...",
        response="Azure AI provides machine learning and cognitive services."
    )
    print(f"Groundedness score: {result['groundedness']}")

    result = relevance(
        query="What is Azure AI?",
        response="Azure AI provides machine learning and cognitive services."
    )
    print(f"Relevance score: {result['relevance']}")
10. Responsible AI
Six Principles
- Fairness - AI systems should treat all people fairly
- Reliability & Safety - AI systems should perform reliably and safely
- Privacy & Security - AI systems should be secure and respect privacy
- Inclusiveness - AI systems should empower everyone
- Transparency - AI systems should be understandable
- Accountability - People should be accountable for AI systems
Content Filtering Configuration
    # Azure OpenAI content filter settings
    content_filter_config = {
        "prompt": {
            "hate": {"filtering": True, "severity_threshold": "medium"},
            "violence": {"filtering": True, "severity_threshold": "medium"},
            "sexual": {"filtering": True, "severity_threshold": "medium"},
            "self_harm": {"filtering": True, "severity_threshold": "medium"}
        },
        "completion": {
            "hate": {"filtering": True, "severity_threshold": "medium"},
            "violence": {"filtering": True, "severity_threshold": "medium"},
            "sexual": {"filtering": True, "severity_threshold": "medium"},
            "self_harm": {"filtering": True, "severity_threshold": "medium"}
        }
    }
Model Evaluation for Bias
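    from azure.ai.evaluation import HateUnfairnessEvaluator, ViolenceEvaluator
    from azure.identity import DefaultAzureCredential

    # Safety evaluators run against an Azure AI project (values are placeholders)
    azure_ai_project = {
        "subscription_id": "<sub-id>",
        "resource_group_name": "rg-ai-foundry",
        "project_name": "ai-project-chatbot"
    }
    credential = DefaultAzureCredential()

    # Evaluate model outputs for harmful content
    hate_evaluator = HateUnfairnessEvaluator(credential=credential, azure_ai_project=azure_ai_project)
    violence_evaluator = ViolenceEvaluator(credential=credential, azure_ai_project=azure_ai_project)

    # Batch evaluation; model_responses is assumed to be a list of
    # {"query": ..., "response": ...} dicts
    results = []
    for item in model_responses:
        hate_score = hate_evaluator(query=item["query"], response=item["response"])
        violence_score = violence_evaluator(query=item["query"], response=item["response"])
        results.append({
            "response": item["response"],
            "hate_score": hate_score,
            "violence_score": violence_score
        })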
Troubleshooting
Common Issues
Authentication Errors
    # Check logged in identity
    az account show

    # Re-login
    az login

    # Use service principal
    az login --service-principal -u <app-id> -p <password> --tenant <tenant-id>

    # Check role assignments
    az role assignment list --assignee <identity>
Quota Exceeded
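    # Check current usage for the account
    az cognitiveservices account list-usage \
      --name openai-prod \
      --resource-group rg-ai

    # Check usage for a whole region
    az cognitiveservices usage list --location eastus

    # Request quota increase via Azure Portal > Quotas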
Model Not Available
    # List available models in region
    az cognitiveservices account list-models \
      --name openai-prod \
      --resource-group rg-ai \
      --output table

    # Check model availability by region
    # https://learn.microsoft.com/azure/ai-services/openai/concepts/models
Rate Limiting (429 Errors)
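    # pip install tenacity
    from tenacity import retry, wait_exponential, stop_after_attempt

    # Retry up to 5 times, waiting between 1 and 60 seconds with exponential growth.
    # `client` is the AzureOpenAI client created earlier.
    @retry(wait=wait_exponential(min=1, max=60), stop=stop_after_attempt(5))
    def call_with_retry():
        return client.chat.completions.create(...)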
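For readers who prefer not to add a dependency, the same retry-with-exponential-backoff idea can be sketched with the standard library alone. Everything named here (`RateLimited`, `call_with_backoff`, `flaky`) is hypothetical and only illustrates the control flow; in real code you would catch the 429 exception type your client library raises.

```python
import random
import time

class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 error raised by a client."""

def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call func(); on RateLimited, wait base_delay * 2**attempt (capped, jittered) and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))  # jitter spreads out retries

# Usage with a fake call that fails twice, then succeeds.
attempts = {"count": 0}

def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimited()
    return "ok"

waits = []
outcome = call_with_backoff(flaky, sleep=waits.append)  # record waits instead of sleeping
print(outcome, len(waits))
```

Passing `sleep` as a parameter keeps the function testable without real delays; the jitter factor is a design choice to avoid many clients retrying at the same instant.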
Search Index Issues
    # Check search service status (provisioning state, replicas, partitions)
    az search service show --name search-prod --resource-group rg-ai

    # Rebuild index
    # Use indexer reset via REST API or SDK
Logging & Diagnostics
    # Enable diagnostic logging
    az monitor diagnostic-settings create \
      --name ai-diagnostics \
      --resource /subscriptions/{sub}/resourceGroups/rg-ai/providers/Microsoft.CognitiveServices/accounts/openai-prod \
      --logs '[{"category": "RequestResponse", "enabled": true}]' \
      --workspace /subscriptions/{sub}/resourceGroups/rg-ai/providers/Microsoft.OperationalInsights/workspaces/log-analytics-prod

    # Query logs (--workspace expects the workspace ID GUID of log-analytics-prod, not its name)
    az monitor log-analytics query \
      --workspace <workspace-guid> \
      --analytics-query "AzureDiagnostics | where ResourceProvider == 'MICROSOFT.COGNITIVESERVICES'"
Best Practices
Cost Optimization
- Use Provisioned Throughput (PTU) for predictable high-volume workloads
- Implement caching for repeated queries
- Use smaller models when possible (GPT-4o-mini vs GPT-4o)
- Set max_tokens appropriately to avoid waste
- Batch requests when possible
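As an illustration of the caching item above, the sketch below keys an in-memory cache on a hash of the model name and messages, so an identical request is answered without a second API call. `ResponseCache` and `fake_llm` are hypothetical names; a production setup would more likely use a shared store with expiry, which is outside this sketch.

```python
import hashlib
import json

class ResponseCache:
    """In-memory cache keyed by a hash of the model name and messages."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model, messages):
        # sort_keys makes the serialized payload stable across dict orderings
        payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(self, model, messages, call):
        """Return a cached response, or invoke call() once and store its result."""
        key = self._key(model, messages)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        self._store[key] = call()
        return self._store[key]

cache = ResponseCache()
messages = [{"role": "user", "content": "Explain quantum computing"}]
fake_llm = lambda: "Quantum computing uses qubits..."  # stands in for a real API call

first = cache.get_or_call("gpt-4o-deployment", messages, fake_llm)
second = cache.get_or_call("gpt-4o-deployment", messages, fake_llm)
print(cache.hits, cache.misses)
```

Exact-match caching only pays off when requests repeat verbatim and a deterministic answer is acceptable (for example, with `temperature=0`).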
Security
- Use Managed Identities instead of API keys
- Store keys in Azure Key Vault
- Enable Private Endpoints for network isolation
- Configure RBAC with least privilege
- Enable audit logging
Performance
- Deploy to regions close to users
- Use Global Standard deployment for automatic routing
- Implement retry logic with exponential backoff
- Use streaming for long responses
- Pre-compute embeddings for known content
Reliability
- Deploy across multiple regions
- Implement circuit breaker patterns
- Set up alerts for quota and errors
- Have fallback models configured
- Regular backup of custom models and configurations
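The circuit breaker and fallback items can be combined in one small wrapper. The sketch below is a minimal illustration, not a production library: `CircuitBreaker`, `failing_primary`, and the thresholds are hypothetical, and the primary and fallback callables stand in for calls to two different model deployments.

```python
import time

class CircuitBreaker:
    """Open the circuit after max_failures consecutive errors; retry after reset_after seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def is_open(self):
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.reset_after:
            # Cool-down elapsed: close the circuit and allow a trial call.
            self.opened_at = None
            self.failures = 0
            return False
        return True

    def call(self, primary, fallback):
        """Try primary unless the circuit is open; use fallback on failure or open circuit."""
        if self.is_open():
            return fallback()
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result

def failing_primary():
    raise RuntimeError("primary deployment unavailable")  # simulated outage

breaker = CircuitBreaker(max_failures=2, reset_after=30.0)
answers = [breaker.call(failing_primary, lambda: "fallback answer") for _ in range(3)]
print(answers, breaker.is_open())
```

Once the circuit is open, the primary is skipped entirely until the cool-down passes, which avoids piling more requests onto a deployment that is already failing.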