Awesome-omni-skill ai-agent-development

Build production-ready AI agents with Microsoft Foundry and Agent Framework. Use when creating AI agents, selecting LLM models, implementing agent orchestration, adding tracing/observability, or evaluating agent quality. Covers agent architecture, model selection, multi-agent workflows, and production deployment.

install
source · Clone the upstream repo
git clone https://github.com/diegosouzapw/awesome-omni-skill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/ai-agents/ai-agent-development" ~/.claude/skills/diegosouzapw-awesome-omni-skill-ai-agent-development && rm -rf "$T"
manifest: skills/ai-agents/ai-agent-development/SKILL.md
safety · automated scan (medium risk)
This is a pattern-based risk scan, not a security review. Our crawler flagged:
  • pip install
  • references API keys
Always read a skill's source content before installing. Patterns alone don't mean the skill is malicious — but they warrant attention.
source content

AI Agent Development

Purpose: Build production-ready AI agents with Microsoft Foundry and Agent Framework.
Scope: Agent architecture, model selection, orchestration, observability, evaluation.


When to Use This Skill

  • Building AI agents with Microsoft Foundry or Agent Framework
  • Selecting LLM models for agent scenarios
  • Implementing multi-agent orchestration workflows
  • Adding tracing and observability to AI agents
  • Evaluating agent quality and response accuracy

Prerequisites

  • Python 3.11+ or .NET 8+
  • agent-framework-azure-ai package
  • Microsoft Foundry workspace with deployed model

Quick Start

Installation

Python (Recommended):

pip install agent-framework-azure-ai --pre  # --pre required during preview

.NET:

dotnet add package Microsoft.Agents.AI.AzureAI --prerelease
dotnet add package Microsoft.Agents.AI.Workflows --prerelease

Model Selection

Top Production Models (Microsoft Foundry):

Model               Best For                                Context     Cost/1M
gpt-5.2             Enterprise agents, structured outputs   200K/100K   TBD
gpt-5.1-codex-max   Agentic coding workflows                272K/128K   $3.44
claude-opus-4-5     Complex agents, coding, computer use    200K/64K    $10
gpt-5.1             Multi-step reasoning                    200K/100K   $3.44
o3                  Advanced reasoning                      200K/100K   $3.50

Deploy Model:

Ctrl+Shift+P → AI Toolkit: Deploy Model


Agent Patterns

Single Agent

import os
from pathlib import Path

from agent_framework.openai import OpenAIChatClient

# Load prompt from file — NEVER embed prompts as inline strings
prompt = Path("prompts/assistant.md").read_text(encoding="utf-8")

client = OpenAIChatClient(
    model="gpt-5.1",
    api_key=os.getenv("FOUNDRY_API_KEY"),
    endpoint=os.getenv("FOUNDRY_ENDPOINT")
)

agent = {
    "name": "Assistant",
    "instructions": prompt,  # Loaded from prompts/assistant.md
    "tools": []  # Add tools as needed
}

response = await client.chat(
    messages=[{"role": "user", "content": "Hello"}],
    agent=agent
)
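
The chat call is awaited, so it has to run inside a coroutine. A minimal entry point for the snippet above (reusing the same client and agent objects) could look like this:

import asyncio

async def main():
    response = await client.chat(
        messages=[{"role": "user", "content": "Hello"}],
        agent=agent
    )
    print(response)

asyncio.run(main())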

Multi-Agent Orchestration

from pathlib import Path
from agent_framework.workflows import SequentialWorkflow

# Each agent loads its prompt from a dedicated file
researcher = {
    "name": "Researcher",
    "instructions": Path("prompts/researcher.md").read_text(encoding="utf-8")
}
writer = {
    "name": "Writer",
    "instructions": Path("prompts/writer.md").read_text(encoding="utf-8")
}

workflow = SequentialWorkflow(
    agents=[researcher, writer],
    handoff_strategy="on_completion"
)

result = await workflow.run(query="Write about AI agents")

Advanced Patterns: Search github.com/microsoft/agent-framework for:

  • Group Chat, Concurrent, Conditional, Loop
  • Human-in-the-Loop, Reflection, Fan-out/Fan-in
  • MCP, Multimodal, Custom Executors

Best Practices

Prompt & Template File Management

RULE: NEVER embed prompts or output templates as inline strings in code. Always store them as separate files.

Why: Prompts are content, not code. Separating them enables:

  • Version control diffs that show exactly what changed in a prompt
  • Non-developer editing (PMs, prompt engineers) without touching code
  • A/B testing different prompts without code changes
  • Reuse across agents, languages, and test harnesses
  • Clear separation of concerns (logic vs. content)

Directory Convention:

project/
  prompts/                    # All system/agent prompts
    assistant.md              # One file per agent role
    researcher.md
    writer.md
    reviewer.md
  templates/                  # Output templates used by agents
    report-template.md        # Structured output templates
    email-template.md
    summary-template.md
  config/
    models.yaml               # Model configuration

Loading Pattern:

from pathlib import Path

# Load prompt
prompt = Path("prompts/assistant.md").read_text(encoding="utf-8")

# Load output template and inject into prompt
template = Path("templates/report-template.md").read_text(encoding="utf-8")
prompt_with_template = f"{prompt}\n\n## Output Format\n{template}"

Rules:

  • MUST store all system prompts in the prompts/ directory as .md or .txt files
  • MUST store output format templates in the templates/ directory
  • MUST NOT embed prompt text longer than one sentence directly in code
  • SHOULD use Markdown format for prompts (readable, supports structure)
  • SHOULD name files after the agent role: prompts/{agent-name}.md
  • SHOULD include a brief comment header in each prompt file (purpose, version, model target)
  • MAY use template variables ({variable}) for dynamic content injected at runtime (see the sketch below)
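
To illustrate the MAY rule, here is a minimal sketch of runtime variable injection; the placeholder names (customer_name, period) and the report template contents are purely illustrative:

from pathlib import Path

# Assume templates/report-template.md contains placeholders like {customer_name} and {period}
template = Path("templates/report-template.md").read_text(encoding="utf-8")

# Inject runtime values into the {variable} placeholders
rendered = template.format(customer_name="Contoso", period="Q3")

# Append the rendered template to the agent prompt as its output format section
prompt = Path("prompts/assistant.md").read_text(encoding="utf-8")
prompt_with_template = f"{prompt}\n\n## Output Format\n{rendered}"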

Development

DO:

  • Plan agent architecture before coding (Research → Design → Implement)
  • Use Microsoft Foundry models for production
  • Implement tracing from day one
  • Test with evaluation datasets before deployment
  • Use structured outputs for reliable agent responses
  • Implement error handling and retry logic (see the sketch after this list)
  • Version your agents and track changes
  • Store all prompts as separate files in the prompts/ directory
  • Store output templates as separate files in the templates/ directory
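
For the error-handling and retry item above, a minimal sketch with exponential backoff; the chat call mirrors the single-agent example earlier, and Exception is only a placeholder for the client library's transient error types:

import asyncio

async def chat_with_retry(client, agent, messages, max_attempts=3):
    # Retry transient failures with exponential backoff (1s, 2s, 4s, ...)
    for attempt in range(max_attempts):
        try:
            return await client.chat(messages=messages, agent=agent)
        except Exception:  # placeholder: narrow to the client's transient error types
            if attempt == max_attempts - 1:
                raise
            await asyncio.sleep(2 ** attempt)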

DON'T:

  • Hardcode API keys or endpoints
  • Embed prompts or output templates as multi-line strings in code
  • Skip tracing setup (critical for debugging)
  • Deploy without evaluation
  • Use GitHub models in production (free tier has limits)
  • Ignore token limits and context windows
  • Mix agent logic with business logic

Security

  • Store credentials in environment variables or Azure Key Vault (see the sketch below)
  • Validate all tool inputs and outputs
  • Implement rate limiting for agent APIs
  • Log agent actions for audit trails
  • Use role-based access control (RBAC) for Foundry resources
  • Review OWASP Top 10 for AI: owasp.org/AI-Security-and-Privacy-Guide
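
A minimal sketch of the credentials guidance above, falling back from an environment variable to Azure Key Vault via azure-identity and azure-keyvault-secrets; the vault URL and secret name are illustrative:

import os

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

def get_foundry_api_key() -> str:
    # Prefer the environment variable (local dev, CI pipelines)
    key = os.getenv("FOUNDRY_API_KEY")
    if key:
        return key
    # Fall back to Key Vault in production; access is granted to the identity via RBAC
    vault = SecretClient(
        vault_url="https://my-agent-vault.vault.azure.net",  # illustrative vault
        credential=DefaultAzureCredential(),
    )
    return vault.get_secret("foundry-api-key").value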

Performance

  • Cache model responses when appropriate
  • Use batch processing for multiple requests
  • Monitor token usage and costs
  • Implement timeout handling (see the sketch after this list)
  • Use async/await for I/O operations
  • Consider model size vs. latency tradeoffs
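
For the timeout and async items above, a sketch wrapping the chat call in asyncio.wait_for; the 30-second budget is illustrative and should be tuned per model:

import asyncio

async def chat_with_timeout(client, agent, messages, timeout_s=30.0):
    try:
        # Fail fast instead of hanging on a slow model call
        return await asyncio.wait_for(
            client.chat(messages=messages, agent=agent), timeout=timeout_s
        )
    except asyncio.TimeoutError:
        # Surface a clear error, or fall back to a cached / smaller-model response
        raise RuntimeError(f"Model call exceeded {timeout_s}s") from None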

Monitoring

  • Track key metrics: latency, success rate, token usage, cost
  • Set up alerts for failures and anomalies
  • Use structured logging with context
  • Integrate with Azure Monitor / Application Insights
  • Review traces regularly for optimization opportunities

Production Checklist

Development

  • Agent architecture documented
  • Model selected and deployed
  • Tools/plugins implemented and tested
  • Error handling with retries
  • Structured outputs configured
  • No hardcoded secrets
  • All prompts stored as separate files in prompts/ (not inline in code)
  • All output templates stored in templates/ (not inline in code)

Model Change Management (MANDATORY)

  • Model version pinned explicitly (e.g., gpt-5.1-2026-01-15)
  • Model version configurable via environment variable (see the sketch after this checklist)
  • Evaluation baseline saved for current model
  • A/B evaluation run before any model switch
  • Structured output schema verified after model change
  • Tool/function-calling accuracy verified after model change
  • Model change documented in changelog with eval results
  • Weekly evaluation monitoring configured for drift detection
  • Alert threshold set for score drops > 10% from baseline
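
A sketch of the first two items: pin a default model version, but let an environment variable override it so a model switch never requires a code change (the version string and variable name are illustrative):

import os

# Pinned default; override with e.g. AGENT_MODEL=gpt-5.1-2026-02-01 at deploy time
PINNED_MODEL = "gpt-5.1-2026-01-15"
model = os.getenv("AGENT_MODEL", PINNED_MODEL)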

Model Change Test Automation (MANDATORY)

  • Agent designed as model-agnostic (model injected via config)
  • config/models.yaml defines model test matrix with thresholds (see the sketch after this checklist)
  • Tested against ≥2 models (primary + fallback from different provider)
  • Multi-model comparison pipeline in CI/CD (weekly + on model config change)
  • Deployment gated on threshold checks (CI fails on regression)
  • Validated fallback model designated and documented
  • Comparison report generated per run (JSON + human-readable)
  • Cost and latency evaluators included alongside quality metrics
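
A sketch of a threshold-gated comparison step; the config/models.yaml layout and the run_eval_suite helper are assumptions, not a fixed schema:

import sys

import yaml  # pip install pyyaml

# Assumed config/models.yaml layout:
#   models:
#     - name: gpt-5.1-2026-01-15
#       min_accuracy: 0.85
#     - name: claude-opus-4-5
#       min_accuracy: 0.80
with open("config/models.yaml", encoding="utf-8") as f:
    config = yaml.safe_load(f)

failures = []
for entry in config["models"]:
    score = run_eval_suite(entry["name"])  # hypothetical helper returning accuracy in 0..1
    if score < entry["min_accuracy"]:
        failures.append(f"{entry['name']}: {score:.2f} < {entry['min_accuracy']}")

if failures:
    print("Model regression detected:\n" + "\n".join(failures))
    sys.exit(1)  # fail CI so deployment is gated on the threshold check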

Observability

  • OpenTelemetry tracing enabled (see the sketch below)
  • Trace viewer tested
  • Structured logging implemented
  • Metrics collection configured
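
A minimal tracing setup sketch: configure_azure_monitor (azure-monitor-opentelemetry) exports OpenTelemetry spans to Application Insights, and AIInferenceInstrumentor (azure-ai-inference) instruments model calls; exact wiring may vary by SDK version:

import os

from azure.monitor.opentelemetry import configure_azure_monitor
from azure.ai.inference.tracing import AIInferenceInstrumentor

# Export OpenTelemetry traces to Azure Monitor / Application Insights
configure_azure_monitor(
    connection_string=os.getenv("APPLICATIONINSIGHTS_CONNECTION_STRING")
)

# Instrument model calls BEFORE creating any agents (see Troubleshooting)
AIInferenceInstrumentor().instrument()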

Evaluation

  • Evaluation dataset created (see the sketch after this checklist)
  • Evaluators defined (built-in + custom)
  • Evaluation runs passing
  • Results meet quality thresholds
  • Multi-model comparison run (2+ models tested)
  • Fallback model validated and documented
  • Model comparison baseline saved
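
A minimal custom-evaluator sketch over a JSONL dataset; the evaluation/core.jsonl path comes from the Scripts section, while the record fields (input, expected) and the run_agent helper are assumptions:

import json
from pathlib import Path

def evaluate(dataset_path="evaluation/core.jsonl", threshold=0.85):
    lines = Path(dataset_path).read_text(encoding="utf-8").splitlines()
    records = [json.loads(line) for line in lines if line.strip()]
    passed = 0
    for record in records:
        answer = run_agent(record["input"])  # hypothetical helper that calls the agent
        # Simple containment check; swap in built-in or LLM-judge evaluators as needed
        if record["expected"].lower() in answer.lower():
            passed += 1
    score = passed / len(records)
    print(f"Pass rate: {score:.2%} over {len(records)} cases")
    return score >= threshold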

Security & Compliance

  • Credentials in Key Vault/env vars
  • Input validation implemented
  • RBAC configured
  • Audit logging enabled
  • OWASP AI Top 10 reviewed

Operations

  • Health checks implemented
  • Rate limiting configured
  • Monitoring alerts set up
  • Deployment strategy defined
  • Rollback plan documented
  • Cost monitoring enabled

Resources

AI Toolkit:

  • Model Catalog: Ctrl+Shift+P → AI Toolkit: Model Catalog
  • Trace Viewer: Ctrl+Shift+P → AI Toolkit: Open Trace Viewer
  • Playground: Ctrl+Shift+P → AI Toolkit: Model Playground

Related: AGENTS.md for agent behavior guidelines • Skills.md for general production practices

Last Updated: January 17, 2026

Scripts

Script · Purpose · Usage:

  • scaffold-agent.py · Scaffold AI agent project (Python/.NET) with tracing & eval
    python scripts/scaffold-agent.py --name my-agent [--pattern multi-agent] [--with-eval]
  • validate-agent-checklist.ps1 · Validate agent project against production checklist
    ./scripts/validate-agent-checklist.ps1 [-Path ./my-agent] [-Strict]
  • check-model-drift.ps1 · Validate model pinning, data drift signals, and judge LLM readiness
    ./scripts/check-model-drift.ps1 [-Path ./my-agent] [-Strict]
  • run-model-comparison.py · Run eval suite against multiple models and generate comparison report
    python scripts/run-model-comparison.py --config config/models.yaml --dataset evaluation/core.jsonl

Troubleshooting

Issue                      Solution
Model not found            Verify model deployment in Foundry portal and check endpoint URL
Tracing not appearing      Ensure AIInferenceInstrumentor().instrument() is called before agent creation
Agent loops indefinitely   Set a max_turns limit and add termination conditions
