Awesome-omni-skill ai-agents

Production-grade AI agent patterns with MCP integration, agentic RAG, handoff orchestration, multi-layer guardrails, observability, token economics, ROI frameworks, and build-vs-not decision guidance (modern best practices)

install
source · Clone the upstream repo
git clone https://github.com/diegosouzapw/awesome-omni-skill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data-ai/ai-agents-vasilyu1983" ~/.claude/skills/diegosouzapw-awesome-omni-skill-ai-agents-2a2eec && rm -rf "$T"
manifest: skills/data-ai/ai-agents-vasilyu1983/SKILL.md
source content

AI Agents Development — Production Skill Hub

Modern Best Practices (January 2026): deterministic control flow, bounded tools, auditable state, MCP-based tool integration, handoff-first orchestration, multi-layer guardrails, OpenTelemetry tracing, and human-in-the-loop controls (OWASP LLM Top 10: https://owasp.org/www-project-top-10-for-large-language-model-applications/).

This skill provides production-ready operational patterns for designing, building, evaluating, and deploying AI agents. It centralizes procedures, checklists, decision rules, and templates used across RAG agents, tool-using agents, OS agents, and multi-agent systems.

No theory. No narrative. Only operational steps and templates.


When to Use This Skill

Codex should activate this skill whenever the user asks for:

  • Designing an agent (LLM-based, tool-based, OS-based, or multi-agent).
  • Scoping capability maturity and rollout risk for new agent behaviors.
  • Creating action loops, plans, workflows, or delegation logic.
  • Writing tool definitions, MCP tools, schemas, or validation logic.
  • Generating RAG pipelines, retrieval modules, or context injection.
  • Building memory systems (session, long-term, episodic, task).
  • Creating evaluation harnesses, observability plans, or safety gates.
  • Preparing CI/CD, rollout, deployment, or production operational specs.
  • Producing any template in /references/ or /assets/.
  • Implementing MCP servers or integrating Model Context Protocol.
  • Setting up agent handoffs and orchestration patterns.
  • Configuring multi-layer guardrails and safety controls.
  • Evaluating whether to build an agent (build vs not decision).
  • Calculating agent ROI, token costs, or cost/benefit analysis.
  • Assessing hallucination risk and mitigation strategies.
  • Deciding when to kill an agent project (kill triggers).
  • For prompt scaffolds, retrieval tuning, or security depth, see Scope Boundaries below.

Scope Boundaries (Use These Skills for Depth)

Default Workflow (Production)


Quick Reference

| Agent Type | Core Control Flow | Interfaces | MCP/A2A | When to Use |
|---|---|---|---|---|
| Workflow Agent (FSM/DAG) | Explicit state transitions | State store, tool allowlist | MCP | Deterministic, auditable flows |
| Tool-Using Agent | Route → call tool → observe | Tool schemas, retries/timeouts | MCP | External actions (APIs, DB, files) |
| RAG Agent | Retrieve → answer → cite | Retriever, citations, ACLs | MCP | Knowledge-grounded responses |
| Planner/Executor | Plan → execute steps with caps | Planner prompts, step budget | MCP (+A2A) | Multi-step problems with bounded autonomy |
| Multi-Agent (Orchestrated) | Delegate → merge → validate | Handoff contracts, eval gates | A2A | Specialization with explicit handoffs |
| OS Agent | Observe UI → act → verify | Sandbox, UI grounding | MCP | Desktop/browser control under strict guardrails |
| Code/SWE Agent | Branch → edit → test → PR | Repo access, CI gates | MCP | Coding tasks with review/merge controls |

Framework Selection (2026)

| Framework | Architecture | Best For | Ease |
|---|---|---|---|
| LangGraph | Graph-based, stateful | Enterprise, compliance, auditability | Medium |
| OpenAI Agents SDK | Tool-centric, lightweight | Fast prototyping, OpenAI ecosystem | Easy |
| Google ADK | Code-first, multi-language | Gemini/Vertex AI, polyglot teams | Medium |
| Pydantic AI | Type-safe, graph FSM | Production Python, type safety | Medium |
| CrewAI | Role-based crews | Team workflows, content generation | Easiest |
| AutoGen | Conversational | Code generation, research | Medium |
| AWS Bedrock Agents | Managed infrastructure | Enterprise AWS, knowledge bases | Easy |

See references/modern-best-practices.md for a detailed framework comparison and selection guide.


Decision Tree: Choosing Agent Architecture

What does the agent need to do?
    ├─ Answer questions from knowledge base?
    │   ├─ Simple lookup? → RAG Agent (LangChain/LlamaIndex + vector DB)
    │   └─ Complex multi-step? → Agentic RAG (iterative retrieval + reasoning)
    │
    ├─ Perform external actions (APIs, tools, functions)?
    │   ├─ 1-3 tools, linear flow? → Tool-Using Agent (LangGraph + MCP)
    │   └─ Complex workflows, branching? → Planning Agent (ReAct/Plan-Execute)
    │
    ├─ Write/modify code autonomously?
    │   ├─ Single file edits? → Tool-Using Agent with code tools
    │   └─ Multi-file, issue resolution? → Code/SWE Agent (HyperAgent pattern)
    │
    ├─ Delegate tasks to specialists?
    │   ├─ Fixed workflow? → Multi-Agent Sequential (A → B → C)
    │   ├─ Manager-Worker? → Multi-Agent Hierarchical (Manager + Workers)
    │   └─ Dynamic routing? → Multi-Agent Group Chat (collaborative)
    │
    ├─ Control desktop/browser?
    │   └─ OS Agent (Anthropic Computer Use + MCP for system access)
    │
    └─ Hybrid (combination of above)?
        └─ Planning Agent that coordinates:
            - Tool-using for actions (MCP)
            - RAG for knowledge (MCP)
            - Multi-agent for delegation (A2A)
            - Code agents for implementation

Protocol Selection:

  • Use MCP for: Tool access, data retrieval, single-agent integration
  • Use A2A for: Agent-to-agent handoffs, multi-agent coordination, task delegation
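
The first three branches of the tree and the protocol rule can be written as plain lookups, which keeps routing decisions easy to review. This is a minimal sketch, not a complete encoding: the key names (`knowledge`, `actions`, `code`, `simple`, `complex`, `tool_access`, and so on) are hypothetical labels chosen for illustration, and the delegation, OS, and hybrid branches are left out.

```python
# Hypothetical lookup covering three branches of the decision tree above.
ARCHITECTURE = {
    ("knowledge", "simple"): "RAG Agent",
    ("knowledge", "complex"): "Agentic RAG",
    ("actions", "simple"): "Tool-Using Agent",
    ("actions", "complex"): "Planning Agent",
    ("code", "simple"): "Tool-Using Agent with code tools",
    ("code", "complex"): "Code/SWE Agent",
}

# Protocol selection: MCP for tools and data, A2A for agent-to-agent work.
PROTOCOL = {
    "tool_access": "MCP",
    "data_retrieval": "MCP",
    "agent_handoff": "A2A",
    "task_delegation": "A2A",
}


def choose(need: str, complexity: str, interaction: str) -> tuple:
    """Return (architecture, protocol); raise KeyError for inputs the tables do not cover."""
    return ARCHITECTURE[(need, complexity)], PROTOCOL[interaction]


print(choose("actions", "simple", "tool_access"))
```

Raising `KeyError` on an uncovered combination is deliberate: an unknown case should be escalated for a human decision rather than silently mapped to a default architecture.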

Core Concepts (Vendor-Agnostic)

Control Flow Options

  • Reactive: direct tool routing per user request (fast, brittle if unbounded).
  • Workflow (FSM/DAG): explicit states and transitions (default for deterministic production).
  • Planner/Executor: plan with strict budgets, then execute step-by-step (use when branching is unavoidable).
  • Orchestrated multi-agent: separate roles with validated handoffs (use when specialization is required).
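
The workflow (FSM) option can be sketched in a few lines of Python. The state names and the `Run` class below are hypothetical and only illustrate the idea: every legal transition is listed explicitly, anything else is rejected, and each accepted transition is appended to an audit trail.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    RECEIVED = "received"
    RETRIEVED = "retrieved"
    ANSWERED = "answered"
    FAILED = "failed"


# Explicit transition table: a transition not listed here is illegal.
TRANSITIONS = {
    State.RECEIVED: {State.RETRIEVED, State.FAILED},
    State.RETRIEVED: {State.ANSWERED, State.FAILED},
    State.ANSWERED: set(),
    State.FAILED: set(),
}


@dataclass
class Run:
    state: State = State.RECEIVED
    history: list = field(default_factory=list)  # audit trail of (from, to) pairs

    def advance(self, target: State) -> None:
        """Move to target if the table allows it; otherwise raise ValueError."""
        if target not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {target.value}")
        self.history.append((self.state.value, target.value))
        self.state = target


run = Run()
run.advance(State.RETRIEVED)
run.advance(State.ANSWERED)
print(run.history)
```

Because the state is a plain enum and the history is a list of string pairs, a run can be serialized and replayed without hidden context.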

Memory Types (Tradeoffs)

  • Short-term (session): cheap, ephemeral; best for conversational continuity.
  • Episodic (task): scoped to a case/ticket; supports audit and replay.
  • Long-term (profile/knowledge): high risk; requires consent, retention limits, and provenance.

Failure Handling (Production Defaults)

  • Classify errors: retriable vs fatal vs needs-human.
  • Bound retries: max attempts, backoff, jitter; avoid retry storms.
  • Fallbacks: degraded mode, smaller model, cached answers, or safe refusal.
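
The first two defaults can be combined in one small wrapper. This is a sketch under stated assumptions: the exception classes and `call_with_retries` are hypothetical names, only errors classified as retriable are retried, and the delay is capped exponential backoff with full jitter. The `sleep` parameter is injectable so the example runs without waiting.

```python
import random
import time


class RetriableError(Exception):
    """Transient failure, for example a timeout or a rate limit."""


class NeedsHumanError(Exception):
    """Failure that should be escalated instead of retried."""


def call_with_retries(fn, max_attempts=3, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call fn(); retry only RetriableError, up to max_attempts in total."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RetriableError:
            if attempt == max_attempts:
                raise  # budget exhausted; the caller picks a fallback
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))  # full jitter spreads out retries
        # Any other exception (fatal or needs-human) propagates immediately.


attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RetriableError("timeout")
    return "ok"


result = call_with_retries(flaky, sleep=lambda seconds: None)
print(result, attempts["n"])
```

When the retry budget is exhausted the last `RetriableError` is re-raised, which is the point where a degraded mode, cached answer, or safe refusal would take over.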

Do / Avoid

Do

  • Do keep state explicit and serializable (replayable runs).
  • Do enforce tool allowlists, scopes, and idempotency for side effects.
  • Do log traces/metrics for model calls and tool calls (OpenTelemetry GenAI semantic conventions: https://opentelemetry.io/docs/specs/semconv/gen-ai/).

Avoid

  • Avoid runaway autonomy (unbounded loops or step counts).
  • Avoid hidden state (implicit memory that cannot be audited).
  • Avoid untrusted tool outputs without validation/sanitization.
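
Several of these rules can be enforced at a single choke point in front of every tool call. The `ToolGateway` below is a hypothetical sketch, not an MCP API: it rejects tools that are not allowlisted, replays the stored result when an idempotency key repeats, and refuses tool output that does not match a minimal expected shape (here, a dict with a `status` key, which is an assumption made for the example).

```python
from dataclasses import dataclass, field


@dataclass
class ToolGateway:
    allowlist: dict  # tool name -> callable
    _results: dict = field(default_factory=dict, init=False)  # idempotency key -> result

    def call(self, tool: str, args: dict, idempotency_key: str) -> dict:
        if tool not in self.allowlist:
            raise PermissionError(f"tool not allowlisted: {tool}")
        if idempotency_key in self._results:
            return self._results[idempotency_key]  # replay, no second side effect
        result = self.allowlist[tool](**args)
        # Treat tool output as untrusted: check its shape before using it.
        if not isinstance(result, dict) or "status" not in result:
            raise ValueError("tool output failed validation")
        self._results[idempotency_key] = result
        return result


sent = []


def send_email(to: str) -> dict:
    sent.append(to)  # the side effect we must not repeat
    return {"status": "sent", "to": to}


gateway = ToolGateway(allowlist={"send_email": send_email})
first = gateway.call("send_email", {"to": "a@example.com"}, idempotency_key="req-1")
second = gateway.call("send_email", {"to": "a@example.com"}, idempotency_key="req-1")
print(first == second, len(sent))
```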

Navigation: Economics & Decision Framework

Should You Build an Agent?

  • Build vs Not Decision Framework - references/build-vs-not-decision.md
    • 10-second test (volume, cost, error tolerance)
    • Red flags and immediate disqualifiers
    • Alternatives to agents (usually better)
    • Full decision tree with stage gates
    • Kill triggers during development and post-launch
    • Pre-build validation checklist

Agent ROI & Token Economics

  • Agent Economics - references/agent-economics.md
    • Token pricing by model (January 2026)
    • Cost per task by agent type
    • ROI calculation formula and tiers
    • Hallucination cost framework and mitigation ROI
    • Investment decision matrix
    • Monthly tracking dashboard
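
As a rough illustration of the arithmetic behind cost per task and ROI, the sketch below uses the generic formula ROI = (benefit - cost) / cost. It is not necessarily the formula in references/agent-economics.md, and every number in it (token counts, prices per million tokens, task volume, value per task, fixed cost) is a placeholder assumption, not real pricing.

```python
def cost_per_task(input_tokens, output_tokens, calls_per_task, price_in_per_m, price_out_per_m):
    """Dollar cost of one task: per-call token cost times the number of model calls."""
    per_call = (input_tokens / 1_000_000) * price_in_per_m \
        + (output_tokens / 1_000_000) * price_out_per_m
    return per_call * calls_per_task


def monthly_roi(tasks_per_month, value_per_task, task_cost, fixed_monthly_cost):
    """Generic ROI: (benefit - cost) / cost, over one month."""
    benefit = tasks_per_month * value_per_task
    cost = tasks_per_month * task_cost + fixed_monthly_cost
    return (benefit - cost) / cost


# Placeholder inputs only; substitute measured token counts and current prices.
task_cost = cost_per_task(1200, 300, calls_per_task=4, price_in_per_m=3.0, price_out_per_m=15.0)
roi = monthly_roi(10_000, value_per_task=0.50, task_cost=task_cost, fixed_monthly_cost=1000.0)
print(f"cost per task: ${task_cost:.4f}, monthly ROI: {roi:.2f}")
```

Multiplying by `calls_per_task` matters for agents in particular, since a single task usually spans several model calls.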

Navigation: Core Concepts & Patterns

Governance & Maturity

  • Agent Maturity & Governance - references/agent-maturity-governance.md
    • Capability maturity levels (L0-L4)
    • Identity & policy enforcement
    • Fleet control and registry management
    • Deprecation rules and kill switches

Modern Best Practices

  • Modern Best Practices - references/modern-best-practices.md
    • Model Context Protocol (MCP)
    • Agent-to-Agent Protocol (A2A)
    • Agentic RAG (Dynamic Retrieval)
    • Multi-layer guardrails
    • LangGraph over LangChain
    • OpenTelemetry for agents

Context Management

Core Operational Patterns

  • Operational Patterns - references/operational-patterns.md
    • Agent loop pattern (PLAN → ACT → OBSERVE → UPDATE)
    • OS agent action loop
    • RAG pipeline pattern
    • Tool specification
    • Memory system pattern
    • Multi-agent workflow
    • Safety & guardrails
    • Observability
    • Evaluation patterns
    • Deployment & CI/CD
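
The agent loop pattern (PLAN → ACT → OBSERVE → UPDATE) can be sketched with a hard step budget so it cannot run away. This is an illustrative outline, not the template from references/operational-patterns.md; `run_agent`, `plan`, and `act` are hypothetical names, and the planner here is a stub where a real one would call a model.

```python
def run_agent(goal, plan, act, max_steps=5):
    """Run PLAN -> ACT -> OBSERVE -> UPDATE until the planner stops or the budget ends."""
    state = {"goal": goal, "steps": [], "status": "running"}
    for _ in range(max_steps):
        action = plan(state)                # PLAN: next action, or None when finished
        if action is None:
            state["status"] = "done"
            return state
        observation = act(action)           # ACT, then OBSERVE the result
        state["steps"].append(              # UPDATE: explicit, serializable state
            {"action": action, "observation": observation}
        )
    state["status"] = "budget_exhausted"    # hard stop instead of looping forever
    return state


def plan(state):
    # Stub planner: two lookups, then stop.
    n = len(state["steps"])
    return f"lookup-{n}" if n < 2 else None


def act(action):
    # Stub tool call.
    return f"result of {action}"


final = run_agent("answer ticket", plan, act)
print(final["status"], len(final["steps"]))
```

A run that ends with `budget_exhausted` is a natural point to hand off to a human or return a safe refusal.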

Navigation: Protocol Implementation


Navigation: Agent Capabilities

Skill Packaging & Sharing


Navigation: Production Operations


Navigation: Templates (Copy-Paste Ready)

Checklists

Core Agent Templates

RAG Templates

Tool Templates

Multi-Agent Templates

Service Layer Templates


External Sources Metadata

  • Curated References - data/sources.json: authoritative sources spanning standards, protocols, and production agent frameworks

Shared Utilities (Centralized patterns — extract, don't duplicate)


Trend Awareness Protocol

IMPORTANT: When users ask recommendation questions about AI agents, you MUST use WebSearch to check current trends before answering. If WebSearch is unavailable, use data/sources.json plus any available web browsing tools, and explicitly state what you verified vs assumed.

Trigger Conditions

  • "What's the best agent framework for [use case]?"
  • "What should I use for [multi-agent/tool use/orchestration]?"
  • "What's the latest in AI agents?"
  • "Current best practices for [agent architecture/MCP/A2A]?"
  • "Is [LangGraph/CrewAI/AutoGen] still relevant in 2026?"
  • "[Agent framework A] vs [Agent framework B]?"
  • "Best way to build [coding agent/RAG agent/OS agent]?"
  • "What MCP servers are available?"

Required Searches

  1. Search: "AI agent frameworks best practices 2026"
  2. Search: "[LangGraph/CrewAI/AutoGen/Semantic Kernel] comparison 2026"
  3. Search: "AI agent trends January 2026"
  4. Search: "MCP servers available 2026"

What to Report

After searching, provide:

  • Current landscape: What agent frameworks are popular NOW
  • Emerging trends: New patterns gaining traction (MCP, A2A, agentic coding)
  • Deprecated/declining: Frameworks or patterns losing relevance
  • Recommendation: Based on fresh data, not just static knowledge

Example Topics (verify with fresh search)

  • Agent frameworks (LangGraph, CrewAI, AutoGen, Semantic Kernel, Pydantic AI)
  • MCP ecosystem (available servers, new integrations)
  • Agentic coding (Codex CLI, Claude Code, Cursor, Windsurf, Cline)
  • Multi-agent patterns (hierarchical, collaborative, competitive)
  • Tool use protocols (MCP, function calling)
  • Agent evaluation (SWE-Bench, AgentBench, GAIA)
  • OS/computer use agents (computer-use APIs, browser automation)

Related Skills

This skill integrates with complementary skills:

Core Dependencies

  • ../ai-llm/ - LLM patterns, prompt engineering, and model selection for agents
  • ../ai-rag/ - Deep RAG implementation: chunking, embedding, reranking
  • ../ai-prompt-engineering/ - System prompt design, few-shot patterns, reasoning strategies

Production & Operations

Supporting Patterns

Usage pattern: Start here for agent architecture, then reference specialized skills for deep implementation details.


Usage Notes

  • Modern Standards: Default to MCP for tools, agentic RAG for retrieval, handoff-first for multi-agent
  • Lightweight SKILL.md: Use this file for quick reference and navigation
  • Drill-down resources: Reference detailed resources for implementation guidance
  • Copy-paste templates: Use templates when the user asks for structured artifacts
  • External sources: Reference data/sources.json for authoritative documentation links
  • No theory: Never include theoretical explanations; only operational steps

Key Modern Migrations

Traditional → Modern:

  • Custom APIs → Model Context Protocol (MCP)
  • Static RAG → Agentic RAG with contextual retrieval
  • Ad-hoc handoffs → Versioned handoff APIs with JSON Schema
  • Single guardrail → Multi-layer defense (5+ layers)
  • LangChain agents → LangGraph stateful workflows
  • Custom observability → OpenTelemetry GenAI standards
  • Model-centric → Context engineering-centric
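
To make the "versioned handoff" migration concrete, the sketch below checks a handoff message against a versioned contract before the receiving agent accepts it. It is a simplified stand-in written with the standard library, not a JSON Schema validator, and the field names (`schema_version`, `task_id`, `from_agent`, `to_agent`, `payload`) are hypothetical.

```python
# Hypothetical v1.0 contract: required field names and their Python types.
HANDOFF_V1 = {
    "schema_version": "1.0",
    "required": {"task_id": str, "from_agent": str, "to_agent": str, "payload": dict},
}


def validate_handoff(message: dict, contract: dict = HANDOFF_V1) -> list:
    """Return a list of violations; an empty list means the handoff is accepted."""
    errors = []
    if message.get("schema_version") != contract["schema_version"]:
        errors.append("unsupported schema_version")
    for name, expected_type in contract["required"].items():
        if name not in message:
            errors.append(f"missing field: {name}")
        elif not isinstance(message[name], expected_type):
            errors.append(f"wrong type for field: {name}")
    return errors


good = {
    "schema_version": "1.0",
    "task_id": "T-1",
    "from_agent": "triage",
    "to_agent": "billing",
    "payload": {"amount": 12},
}
bad = {"schema_version": "0.9", "task_id": 7, "from_agent": "triage", "payload": {}}

print(validate_handoff(good))
print(validate_handoff(bad))
```

Returning all violations at once, rather than failing on the first, gives the sending agent enough detail to repair the message in a single retry.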

AI-Native SDLC Template