Claude-skill-registry agentic-design

Use when building AI agent systems. Covers agent loops, tool calling, planning patterns, memory systems, multi-agent coordination, and safety guardrails. Apply when creating autonomous AI workflows, coding assistants, or task automation systems.

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/agentic-design" ~/.claude/skills/majiayu000-claude-skill-registry-agentic-design && rm -rf "$T"
manifest: skills/data/agentic-design/SKILL.md
safety · automated scan (low risk)
This is a pattern-based risk scan, not a security review. Our crawler flagged:
  • eval/exec/Function constructor
Always read a skill's source content before installing. Patterns alone don't mean the skill is malicious — but they warrant attention.
source content

Agentic Design

Core Principle

Agents are LLMs in a loop with tools. Design for reliability, observability, and graceful failure—not just capability.

When to Use This Skill

  • Building AI assistants with tool use
  • Creating autonomous task executors
  • Designing multi-agent systems
  • Implementing coding agents
  • Building workflow automation
  • Adding planning capabilities to AI

The Iron Law

AGENTS FAIL SILENTLY. BUILD FOR OBSERVABILITY.

If you can't see what your agent is doing, you can't fix it when it breaks.

Why Agentic Systems?

Benefits:

  • Automate complex multi-step tasks
  • Handle dynamic, open-ended problems
  • Scale human expertise
  • 24/7 autonomous operation
  • Adaptable to new situations

Challenges:

  • Unpredictable behavior
  • Error propagation
  • Cost accumulation
  • Safety concerns
  • Debugging complexity

Agent Architecture Overview

┌─────────────────────────────────────────────────────┐
│                    AGENT LOOP                        │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐      │
│  │  Observe │───▶│  Think   │───▶│   Act    │      │
│  └──────────┘    └──────────┘    └──────────┘      │
│       ▲                               │             │
│       │         ┌──────────┐          │             │
│       └─────────│  Memory  │◀─────────┘             │
│                 └──────────┘                        │
│                      │                              │
│              ┌───────┴───────┐                      │
│              ▼               ▼                      │
│        ┌──────────┐   ┌──────────┐                 │
│        │  Tools   │   │ Knowledge│                 │
│        └──────────┘   └──────────┘                 │
└─────────────────────────────────────────────────────┘

Pattern 1: ReAct (Reasoning + Acting)

The foundational agent pattern: Think → Act → Observe → Repeat.

Basic ReAct Loop

def react_agent(task, tools=TOOLS, max_iterations=10):
    """ReAct pattern: Reason, Act, Observe."""
    messages = [
        {"role": "system", "content": REACT_SYSTEM_PROMPT},
        {"role": "user", "content": task}
    ]

    for i in range(max_iterations):
        # Think: Get model's reasoning and action
        response = llm.chat(messages, tools=tools)

        # Record the assistant turn first: chat APIs require the
        # assistant message (with its tool calls) to precede tool results
        messages.append({
            "role": "assistant",
            "content": response.content,
            "tool_calls": response.tool_calls
        })

        # Check for completion
        if response.stop_reason == "end_turn":
            return extract_final_answer(response)

        # Act: Execute tool calls
        for tool_call in response.tool_calls or []:
            result = execute_tool(
                tool_call.name,
                tool_call.arguments
            )

            # Observe: Add result to context
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": str(result)
            })

    return "Max iterations reached"
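The loop above assumes an `execute_tool` helper. One possible shape — a registry of plain Python callables, with arguments arriving as a JSON string as they do in the OpenAI API (the `add` tool here is a placeholder, not part of the skill):

```python
import json

# Hypothetical registry mapping tool names to plain Python callables
TOOL_REGISTRY = {
    "add": lambda a, b: a + b,
}

def execute_tool(name, arguments):
    """Dispatch a tool call; never let a tool exception kill the loop."""
    if name not in TOOL_REGISTRY:
        return {"error": f"Unknown tool: {name}"}
    try:
        # Most chat APIs deliver arguments as a JSON string
        args = json.loads(arguments) if isinstance(arguments, str) else arguments
        return {"result": TOOL_REGISTRY[name](**args)}
    except Exception as e:
        return {"error": str(e)}
```

Returning errors as values rather than raising keeps failures visible to the model, which can then retry or reroute.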

ReAct System Prompt

REACT_SYSTEM_PROMPT = """You are an AI assistant that solves tasks step by step.

For each step:
1. THINK: Analyze what you know and what you need
2. ACT: Use a tool to gather information or take action
3. OBSERVE: Review the result
4. REPEAT or ANSWER: Continue until you can answer

Always explain your reasoning before acting.
When you have enough information, provide a final answer.

Available tools:
{tool_descriptions}
"""
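The `{tool_descriptions}` placeholder can be filled from the same schemas passed to the API. A minimal renderer, assuming the OpenAI-style tool format used later in this skill:

```python
def render_tool_descriptions(tools):
    """Render OpenAI-style tool schemas into a prompt-friendly list."""
    lines = []
    for tool in tools:
        fn = tool["function"]
        # List parameter names from the JSON-schema properties
        params = ", ".join(fn["parameters"]["properties"])
        lines.append(f"- {fn['name']}({params}): {fn['description']}")
    return "\n".join(lines)
```

Used as `REACT_SYSTEM_PROMPT.format(tool_descriptions=render_tool_descriptions(TOOLS))`.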

Pattern 2: Plan-and-Execute

Separate planning from execution for complex tasks.

Planner-Executor Architecture

class PlanAndExecuteAgent:
    def __init__(self, planner_llm, executor_llm, tools):
        self.planner = planner_llm
        self.executor = executor_llm
        self.tools = tools

    def run(self, task):
        # Phase 1: Create plan
        plan = self.create_plan(task)

        # Phase 2: Execute steps
        results = []
        for step in plan.steps:
            result = self.execute_step(step, results)
            results.append(result)

            # Replan if needed
            if result.needs_replanning:
                plan = self.replan(task, plan, results)

        # Phase 3: Synthesize
        return self.synthesize(task, results)

    def create_plan(self, task):
        """Generate step-by-step plan."""
        response = self.planner.chat([{
            "role": "user",
            "content": f"""Create a plan to accomplish this task:
            {task}

            Return a numbered list of steps.
            Each step should be specific and actionable.
            Consider dependencies between steps."""
        }])

        return parse_plan(response.content)

    def execute_step(self, step, previous_results):
        """Execute a single step with access to tools."""
        context = format_previous_results(previous_results)

        return react_agent(
            f"Previous context:\n{context}\n\nExecute this step:\n{step}",
            tools=self.tools
        )

    def replan(self, task, current_plan, results):
        """Adapt plan based on execution results."""
        response = self.planner.chat([{
            "role": "user",
            "content": f"""Original task: {task}

            Original plan: {current_plan}

            Results so far: {results}

            The plan needs adjustment. Create a revised plan
            for the remaining work."""
        }])

        return parse_plan(response.content)

Pattern 3: Tool Use

Tool Definition (OpenAI Format)

TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for current information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query"
                    },
                    "num_results": {
                        "type": "integer",
                        "description": "Number of results (default 5)",
                        "default": 5
                    }
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read contents of a file",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {
                        "type": "string",
                        "description": "File path to read"
                    }
                },
                "required": ["path"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "write_file",
            "description": "Write content to a file",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string"},
                    "content": {"type": "string"}
                },
                "required": ["path", "content"]
            }
        }
    }
]
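Before dispatching, arguments can be checked against the schema's `required` list and declared types. This sketch covers only the common cases, not full JSON Schema:

```python
# Map JSON-schema type names to Python types (common cases only)
_JSON_TYPES = {"string": str, "integer": int, "number": (int, float),
               "boolean": bool, "object": dict, "array": list}

def validate_arguments(schema, arguments):
    """Return (ok, reason) for a dict of arguments vs. a parameter schema."""
    for name in schema.get("required", []):
        if name not in arguments:
            return False, f"Missing required argument: {name}"
    for name, value in arguments.items():
        prop = schema["properties"].get(name)
        if prop is None:
            return False, f"Unexpected argument: {name}"
        expected = _JSON_TYPES.get(prop.get("type"))
        if expected and not isinstance(value, expected):
            return False, f"{name} should be {prop['type']}"
    return True, None
```

Rejecting malformed calls with a reason gives the model something concrete to correct on the next turn.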

Tool Execution with Safety

import os
from datetime import datetime

class ToolExecutor:
    def __init__(self, tools, sandbox=True):
        self.tools = {t["function"]["name"]: t for t in tools}
        self.sandbox = sandbox
        self.execution_log = []

    def execute(self, tool_name, arguments):
        """Execute tool with safety checks."""
        # Validate tool exists
        if tool_name not in self.tools:
            return {"error": f"Unknown tool: {tool_name}"}

        # Log execution
        self.execution_log.append({
            "tool": tool_name,
            "arguments": arguments,
            "timestamp": datetime.now()
        })

        # Safety checks
        if self.sandbox:
            if tool_name == "write_file":
                if not self._is_safe_path(arguments.get("path")):
                    return {"error": "Path outside sandbox"}

            if tool_name == "execute_code":
                arguments = self._sandbox_code(arguments)

        # Execute
        try:
            handler = getattr(self, f"_tool_{tool_name}")
            result = handler(**arguments)
            return {"success": True, "result": result}
        except Exception as e:
            return {"error": str(e)}

    def _is_safe_path(self, path):
        """Check if the resolved path stays inside an allowed directory."""
        # realpath resolves symlinks and "..", so "/tmp/../etc" is caught
        allowed_dirs = [os.path.realpath(d) for d in ("/tmp", "./workspace")]
        resolved = os.path.realpath(path)
        return any(resolved == d or resolved.startswith(d + os.sep)
                   for d in allowed_dirs)

    def _tool_search_web(self, query, num_results=5):
        """Web search implementation."""
        # Implementation here
        pass

    def _tool_read_file(self, path):
        """File read implementation."""
        with open(path) as f:
            return f.read()

    def _tool_write_file(self, path, content):
        """File write implementation."""
        with open(path, 'w') as f:
            f.write(content)
        return f"Written {len(content)} bytes to {path}"

Pattern 4: Memory Systems

Short-Term Memory (Conversation)

class ConversationMemory:
    def __init__(self, max_tokens=4000):
        self.messages = []
        self.max_tokens = max_tokens

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})
        self._trim_if_needed()

    def _trim_if_needed(self):
        """Keep memory within token limits."""
        while self._count_tokens() > self.max_tokens:
            if len(self.messages) <= 1:
                break  # only the system message left; nothing to trim
            # Keep system message at index 0, remove the oldest turn
            self.messages.pop(1)

    def _count_tokens(self):
        """Rough heuristic: ~4 characters per token."""
        return sum(len(m["content"]) for m in self.messages) // 4

    def get_context(self):
        return self.messages.copy()

Long-Term Memory (Vector Store)

from datetime import datetime

class LongTermMemory:
    def __init__(self, vector_store):
        self.store = vector_store

    def remember(self, content, metadata=None):
        """Store information for later retrieval."""
        self.store.add(
            content=content,
            metadata={
                "timestamp": datetime.now().isoformat(),
                **(metadata or {})
            }
        )

    def recall(self, query, k=5):
        """Retrieve relevant memories."""
        return self.store.search(query, k=k)

    def summarize_and_store(self, conversation):
        """Compress conversation into memorable facts."""
        summary = llm.chat([{
            "role": "user",
            "content": f"""Extract key facts and decisions from this conversation:
            {conversation}

            Return as bullet points."""
        }])

        self.remember(
            content=summary.content,
            metadata={"type": "conversation_summary"}
        )
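`LongTermMemory` assumes a `vector_store` exposing `add(content=..., metadata=...)` and `search(query, k=...)`. For testing without an embedding model, a dependency-free stand-in with the same interface can score by word overlap:

```python
class ToyVectorStore:
    """Stand-in for a real vector DB: scores by word overlap, not embeddings."""

    def __init__(self):
        self.items = []

    def add(self, content, metadata=None):
        self.items.append({"content": content, "metadata": metadata or {}})

    def search(self, query, k=5):
        """Return up to k items sharing at least one word with the query."""
        q = set(query.lower().split())
        scored = [
            (len(q & set(item["content"].lower().split())), item)
            for item in self.items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [item for score, item in scored[:k] if score > 0]
```

In production this slot would be filled by a real vector database; the point is that the memory layer only depends on the two-method interface.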

Working Memory (Scratchpad)

import json

class WorkingMemory:
    def __init__(self):
        self.scratchpad = {}
        self.current_task = None
        self.subtasks = []

    def set_task(self, task):
        self.current_task = task
        self.subtasks = []

    def note(self, key, value):
        """Store intermediate result."""
        self.scratchpad[key] = value

    def get(self, key):
        return self.scratchpad.get(key)

    def get_context(self):
        """Format working memory for prompt."""
        return f"""Current Task: {self.current_task}

Subtasks: {self.subtasks}

Notes:
{json.dumps(self.scratchpad, indent=2)}
"""

Pattern 5: Multi-Agent Systems

Hierarchical Agents

class OrchestratorAgent:
    """Coordinates specialist agents."""

    def __init__(self):
        self.agents = {
            "researcher": ResearchAgent(),
            "coder": CodingAgent(),
            "reviewer": ReviewAgent(),
        }

    def run(self, task):
        # Analyze task and delegate
        plan = self.plan_delegation(task)

        results = {}
        for step in plan:
            agent_name = step["agent"]
            subtask = step["task"]

            # Delegate to specialist
            result = self.agents[agent_name].run(
                subtask,
                context=results
            )
            results[step["id"]] = result

            # Quality check
            if step.get("needs_review"):
                review = self.agents["reviewer"].run(
                    f"Review this output:\n{result}"
                )
                if not review.approved:
                    # Retry with feedback
                    result = self.agents[agent_name].run(
                        subtask,
                        feedback=review.feedback
                    )

        return self.synthesize(results)

Collaborative Agents

from concurrent.futures import ThreadPoolExecutor, as_completed

class CollaborativeSystem:
    """Agents that discuss and reach consensus."""

    def __init__(self, agents):
        self.agents = agents

    def debate(self, topic, rounds=3):
        """Have agents debate a topic."""
        discussion = []

        for round_num in range(rounds):
            for agent in self.agents:
                response = agent.respond(
                    topic=topic,
                    discussion_so_far=discussion
                )
                discussion.append({
                    "agent": agent.name,
                    "response": response
                })

        return self.synthesize_consensus(discussion)

    def divide_and_conquer(self, task):
        """Split task among agents."""
        # Decompose task
        subtasks = self.decompose(task)

        # Parallel execution
        with ThreadPoolExecutor() as executor:
            futures = {
                executor.submit(agent.run, subtask): agent
                for agent, subtask in zip(self.agents, subtasks)
            }

            results = {}
            for future in as_completed(futures):
                agent = futures[future]
                results[agent.name] = future.result()

        return self.merge_results(results)

Pattern 6: Guardrails and Safety

Input Validation

import re

class InputGuardrail:
    def __init__(self):
        self.blocked_patterns = [
            r"ignore.*instructions",
            r"pretend.*you.*are",
            r"jailbreak",
        ]

    def validate(self, user_input):
        """Check input for prompt injection."""
        # Pattern matching
        for pattern in self.blocked_patterns:
            if re.search(pattern, user_input, re.IGNORECASE):
                return False, "Blocked pattern detected"

        # Length check
        if len(user_input) > 10000:
            return False, "Input too long"

        # LLM-based check for sophisticated attacks
        is_safe = self.llm_safety_check(user_input)
        if not is_safe:
            return False, "Potential prompt injection"

        return True, None

Output Validation

class OutputGuardrail:
    def __init__(self):
        self.validators = [
            self.check_pii,
            self.check_harmful_content,
            self.check_code_safety,
        ]

    def validate(self, output, context):
        """Validate agent output before returning."""
        for validator in self.validators:
            is_valid, reason = validator(output, context)
            if not is_valid:
                return self.sanitize(output, reason)

        return output

    def check_pii(self, output, context):
        """Check for leaked PII."""
        pii_patterns = {
            'ssn': r'\d{3}-\d{2}-\d{4}',
            'credit_card': r'\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}',
            'email': r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}',
        }

        for pii_type, pattern in pii_patterns.items():
            if re.search(pattern, output):
                return False, f"Contains {pii_type}"

        return True, None

    def check_code_safety(self, output, context):
        """Check generated code for dangerous patterns."""
        dangerous = [
            'rm -rf',
            'DROP TABLE',
            'exec(',
            'eval(',
            'os.system',
        ]

        for pattern in dangerous:
            if pattern in output:
                return False, f"Dangerous code pattern: {pattern}"

        return True, None
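The `sanitize` method above is left undefined; one common choice is to redact matched spans rather than reject the whole output. A sketch reusing the same PII patterns (function name hypothetical):

```python
import re

PII_PATTERNS = {
    "ssn": r"\d{3}-\d{2}-\d{4}",
    "email": r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}",
}

def redact_pii(text):
    """Replace PII matches with a typed placeholder instead of blocking."""
    for pii_type, pattern in PII_PATTERNS.items():
        text = re.sub(pattern, f"[REDACTED {pii_type.upper()}]", text)
    return text
```

Redaction preserves the rest of the answer, which usually matters more to the user than the blocked span.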

Action Limits

from collections import defaultdict

class LimitExceeded(Exception):
    """Raised when an action or cost limit is hit."""

class ActionLimiter:
    def __init__(self):
        self.limits = {
            'api_calls': 100,
            'file_writes': 10,
            'cost_usd': 5.0,
        }
        self.counts = defaultdict(int)
        self.cost = 0.0

    def check(self, action_type, cost=0):
        """Check if action is within limits."""
        self.counts[action_type] += 1
        self.cost += cost

        if self.counts[action_type] > self.limits.get(action_type, float('inf')):
            raise LimitExceeded(f"{action_type} limit reached")

        if self.cost > self.limits['cost_usd']:
            raise LimitExceeded(f"Cost limit reached: ${self.cost:.2f}")

    def require_approval(self, action):
        """Flag dangerous actions for human approval."""
        dangerous_actions = ['delete_file', 'send_email', 'deploy']

        if action in dangerous_actions:
            return self.request_human_approval(action)

        return True

Pattern 7: Error Handling and Recovery

import time

class ResilientAgent:
    def __init__(self, max_retries=3):
        self.max_retries = max_retries

    def run_with_recovery(self, task):
        """Execute with automatic error recovery."""
        errors = []

        for attempt in range(self.max_retries):
            try:
                result = self.execute(task)

                # Validate result
                if self.validate_result(result):
                    return result

                errors.append(f"Invalid result: {result}")

            except ToolExecutionError as e:
                errors.append(f"Tool error: {e}")
                # Try alternative approach
                task = self.reframe_task(task, e)

            except RateLimitError as e:
                # Exponential backoff
                wait_time = 2 ** attempt
                time.sleep(wait_time)

            except Exception as e:
                errors.append(f"Unexpected error: {e}")

        # All retries failed
        return self.graceful_failure(task, errors)

    def reframe_task(self, task, error):
        """Adjust task based on error."""
        return f"""Previous attempt failed with: {error}

Try a different approach for: {task}"""

    def graceful_failure(self, task, errors):
        """Return partial results or helpful error."""
        return {
            "status": "failed",
            "task": task,
            "errors": errors,
            "partial_results": self.get_partial_results(),
            "suggestions": self.suggest_manual_steps(task)
        }
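The fixed `2 ** attempt` backoff above is deterministic, so concurrent agents hitting the same rate limit retry in lockstep. Full-jitter backoff spreads them out; a minimal sketch:

```python
import random

def backoff_delay(attempt, base=1.0, cap=60.0):
    """Full jitter: random delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

In the `RateLimitError` branch, `time.sleep(backoff_delay(attempt))` would replace the fixed wait.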

Observability

Structured Logging

import structlog

logger = structlog.get_logger()

class ObservableAgent:
    def __init__(self, name):
        self.name = name
        self.trace_id = None

    def run(self, task):
        self.trace_id = generate_trace_id()

        # bind() returns a new logger that attaches this context
        # to every event; it is not a context manager
        log = logger.bind(
            agent=self.name,
            trace_id=self.trace_id,
            task=task[:100]
        )
        log.info("agent_started")

        try:
            result = self._execute(task)
            log.info("agent_completed", result_length=len(str(result)))
            return result

        except Exception as e:
            log.error("agent_failed", error=str(e))
            raise

    def _log_tool_call(self, tool_name, args, result):
        logger.info(
            "tool_called",
            tool=tool_name,
            args=args,
            result_preview=str(result)[:200]
        )

    def _log_llm_call(self, messages, response, tokens, cost):
        logger.info(
            "llm_called",
            message_count=len(messages),
            response_length=len(response),
            tokens=tokens,
            cost_usd=cost
        )

Tracing with LangSmith/Phoenix

from langsmith import traceable

@traceable(name="agent_run")
def run_agent(task):
    """Full agent run with tracing."""
    plan = create_plan(task)

    results = [execute_step(step) for step in plan]

    return synthesize_results(results)

@traceable(name="tool_execution")
def execute_tool(name, args):
    """Traced tool execution."""
    return tool_registry[name](**args)

Testing Agents

Unit Testing Tools

def test_search_tool():
    """Test individual tool."""
    result = tools.search_web("test query")

    assert "results" in result
    assert len(result["results"]) > 0

def test_tool_error_handling():
    """Test tool handles errors gracefully."""
    result = tools.read_file("/nonexistent/path")

    assert "error" in result
    assert "not found" in result["error"].lower()

Integration Testing

def test_agent_simple_task():
    """Test agent on known task."""
    agent = create_agent()

    result = agent.run("What is 2 + 2?")

    assert "4" in result

def test_agent_tool_use():
    """Test agent uses tools correctly."""
    agent = create_agent()

    result = agent.run("Search for the current weather in NYC")

    assert agent.tool_calls_made > 0
    assert "weather" in result.lower()

Evaluation Datasets

EVAL_DATASET = [
    {
        "task": "Find the capital of France",
        "expected_tools": ["search_web"],
        "expected_answer_contains": ["Paris"],
    },
    {
        "task": "Create a file called test.txt with 'hello'",
        "expected_tools": ["write_file"],
        "expected_side_effects": [("file_exists", "test.txt")],
    },
]

def evaluate_agent(agent, dataset):
    """Evaluate agent on test dataset."""
    results = []

    for case in dataset:
        result = agent.run(case["task"])

        score = {
            "task": case["task"],
            "tools_correct": check_tools(agent.tool_calls, case["expected_tools"]),
            "answer_correct": check_answer(result, case.get("expected_answer_contains", [])),
            "side_effects_correct": check_side_effects(case.get("expected_side_effects", [])),
        }
        results.append(score)

    return calculate_metrics(results)
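`check_tools` and `check_answer` are referenced above but not defined; minimal versions, assuming `agent.tool_calls` is recorded as a list of tool names:

```python
def check_tools(tool_calls, expected_tools):
    """True if every expected tool was called at least once."""
    return set(expected_tools).issubset(set(tool_calls))

def check_answer(result, expected_substrings):
    """True if the answer contains every expected substring (case-insensitive)."""
    lowered = str(result).lower()
    return all(s.lower() in lowered for s in expected_substrings)
```

Substring checks are crude but cheap; for fuzzier answers they are often replaced with an LLM-as-judge comparison.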

Common Mistakes

Mistake                | Impact                         | Fix
-----------------------|--------------------------------|-------------------------------
No iteration limit     | Infinite loops, cost explosion | Set max_iterations
Missing error handling | Silent failures                | Wrap tool calls in try/except
No observability       | Can't debug issues             | Add structured logging
Overly complex tools   | LLM confusion                  | Keep tools focused
No output validation   | Harmful/wrong outputs          | Add guardrails
Single LLM for all     | Suboptimal cost/quality        | Use appropriate model per task

Integration with Skills

Use with:

  • rag-architecture
    - RAG as a retrieval tool
  • llm-integration
    - API patterns and error handling
  • subagent-driven-development
    - Coordinating agent tasks
  • test-driven-development
    - Testing agent behavior

Checklist

Before deploying an agent:

  • Clear task boundaries defined
  • Tools are well-documented
  • Input validation in place
  • Output guardrails active
  • Action limits configured
  • Error recovery implemented
  • Logging and tracing enabled
  • Cost monitoring active
  • Human escalation path defined
  • Evaluation suite passing

Authority

Based on:

  • Anthropic agent design patterns
  • OpenAI function calling best practices
  • LangChain agent architectures
  • Academic research on LLM agents
  • Production systems at scale

Bottom Line: Agents = LLM + Tools + Loop. Design for failure, log everything, limit actions, validate outputs. Capability without safety is a liability.