AI Agents
What Is an Agent?
An agent is an AI system that can reason, plan, use tools, and take actions in a loop until a task is complete. Unlike a chatbot that answers one question at a time, an agent works autonomously across multiple steps.
Chatbot: User asks → Model answers → Done
Agent: User asks → Model thinks → Uses tool → Observes result
→ Thinks again → Uses another tool → Observes
→ ... (loop continues) → Delivers final result
Chatbot vs Agent
| Aspect | Chatbot | Agent |
|---|---|---|
| Interaction | Single turn Q&A | Multi-step autonomous execution |
| Tools | None (or minimal) | Multiple tools used in sequence |
| Planning | None | Breaks tasks into steps |
| Memory | Current conversation | Conversation + tool results + state |
| Autonomy | Responds to each message | Works independently toward a goal |
| Example | "What is recursion?" | "Refactor this module and fix the tests" |
The Agent Loop
Every agent follows the same fundamental loop:
┌──────────────┐
│ OBSERVE │ ← Read environment (tool results, errors, files)
└──────┬───────┘
│
┌──────▼───────┐
│ THINK │ ← Reason about what to do next
└──────┬───────┘
│
┌──────▼───────┐
│ ACT │ ← Call a tool, write code, execute command
└──────┬───────┘
│
│ Loop until task is complete
└──────────────→ back to OBSERVE
This loop continues until the agent decides the task is complete, hits a maximum iteration limit, or encounters an unrecoverable error.
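The loop above can be sketched in a few lines of Python. This is a minimal illustrative skeleton, not a real SDK: `call_model` and the `tools` dict are stand-ins for an actual LLM call and real tool implementations.

```python
# Minimal observe-think-act loop. `call_model` is a stand-in for a real
# LLM call; here it is stubbed so the sketch runs end to end.

def call_model(history):
    # A real agent would send `history` to an LLM and parse its reply.
    # This stub calls one tool, then finishes once it sees the result.
    if any(step[0] == "observation" for step in history):
        return {"type": "finish", "answer": history[-1][1]}
    return {"type": "tool", "name": "add", "args": (2, 3)}

tools = {"add": lambda a, b: a + b}

def run_agent(task, max_steps=10):
    history = [("task", task)]
    for _ in range(max_steps):            # hard limit prevents infinite loops
        decision = call_model(history)    # THINK
        if decision["type"] == "finish":
            return decision["answer"]
        result = tools[decision["name"]](*decision["args"])  # ACT
        history.append(("observation", result))              # OBSERVE
    raise RuntimeError("Agent hit the step limit without finishing")

print(run_agent("What is 2 + 3?"))  # → 5
```

The key structural point is that the loop, the step limit, and the tool dispatch live in ordinary code; only the decision of what to do next is delegated to the model.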
The ReAct Pattern
ReAct (Reasoning + Acting) is the most common agent pattern. The model explicitly writes out its reasoning:
User: "What's the population of the capital of France?"
Thought: I need to find the capital of France first, then look up its population.
Action: search(query="capital of France")
Observation: The capital of France is Paris.
Thought: Now I know it's Paris. I need to find Paris's population.
Action: search(query="population of Paris 2024")
Observation: The population of Paris is approximately 2.1 million (city proper).
Thought: I have the answer now.
Answer: The population of Paris, the capital of France, is approximately 2.1 million.
The explicit Thought steps make the agent's reasoning transparent and debuggable.
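One way to drive this pattern in code is to parse `Action:` and `Answer:` lines out of each model turn. The sketch below uses a scripted model and a toy `search` tool (both are stand-ins, not a real API) to replay the two-hop question from the trace above.

```python
import re

# Scripted "model" output for the two-hop question in the example above.
# A real ReAct loop would generate each turn with an LLM call.
SCRIPTED_TURNS = [
    'Thought: Find the capital first.\nAction: search(query="capital of France")',
    'Thought: Now find its population.\nAction: search(query="population of Paris")',
    "Thought: I have the answer.\nAnswer: About 2.1 million people live in Paris.",
]

def search(query):
    # Toy knowledge base standing in for a real search tool.
    kb = {"capital of France": "Paris", "population of Paris": "about 2.1 million"}
    return kb.get(query, "no result")

def react_loop():
    observations = []
    for turn in SCRIPTED_TURNS:                  # real code: call the LLM here
        answer = re.search(r"Answer: (.+)", turn)
        if answer:                               # model decided it is done
            return answer.group(1)
        action = re.search(r'Action: search\(query="(.+?)"\)', turn)
        if action:
            # Feed the observation back into the next model turn.
            observations.append(search(action.group(1)))
    return None

print(react_loop())
```

Production frameworks replace the regex parsing with structured tool calls, but the control flow is the same: act, observe, append, repeat until an answer appears.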
Agent Components
| Component | Role | Analogy |
|---|---|---|
| LLM | Reasoning, planning, decision-making | Brain |
| Tools | Actions the agent can take (APIs, file ops, search) | Hands |
| Memory | Conversation history, tool results, notes | Notepad |
| Planning | Breaking tasks into steps, prioritizing | Strategy |
| Orchestrator | Controls the loop, manages state, enforces limits | Manager |
Example: Claude Code as an Agent
Claude Code is a real-world example of an agent. Here is how it processes a request like "Fix the failing test in auth.test.ts":
1. OBSERVE → Read the test file
2. THINK → "The test expects a 200 status but the route returns 401"
3. ACT → Read the auth middleware code
4. OBSERVE → See that the token validation has a bug
5. THINK → "The token expiry check is using < instead of >"
6. ACT → Edit the middleware file
7. ACT → Run the tests
8. OBSERVE → All tests pass
9. RESPOND → "Fixed the token expiry comparison in auth middleware"
Types of Agents
1. Simple Agent (Single Loop)
One LLM running the observe-think-act loop with tools:
User → [LLM + Tools] → Result
Best for: straightforward tasks with clear steps.
2. Multi-Agent System
Multiple specialized agents collaborating:
User → Orchestrator Agent
├── Research Agent (searches, gathers info)
├── Coding Agent (writes implementation)
└── Review Agent (checks quality)
Each agent has its own system prompt, tools, and expertise. They pass results to each other.
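The hand-off between specialists can be sketched as a pipeline. Each "agent" below is a plain function standing in for an LLM with its own system prompt and tools; the names and return shapes are illustrative assumptions, not a framework API.

```python
# Sketch of an orchestrator passing work between specialist "agents".

def research_agent(task):
    # Stand-in for an agent that searches and gathers information.
    return f"notes on: {task}"

def coding_agent(notes):
    # Stand-in for an agent that writes the implementation.
    return f"implementation based on ({notes})"

def review_agent(code):
    # Stand-in for an agent that checks quality before anything ships.
    return {"code": code, "approved": "implementation" in code}

def orchestrator(task):
    notes = research_agent(task)      # gather information
    code = coding_agent(notes)        # produce an implementation
    return review_agent(code)         # quality gate before returning

result = orchestrator("rate limiting middleware")
print(result["approved"])
```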
3. Hierarchical Agent
An orchestrator delegates to worker agents:
Orchestrator: "Build a user auth system"
→ Worker 1: "Design the database schema"
→ Worker 2: "Implement the API endpoints"
→ Worker 3: "Write the tests"
→ Orchestrator: Combines and verifies results
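The delegate-then-combine shape above can be sketched as follows. The `worker` function is a stub for a worker agent; in a real system each call would be a separate agent run with its own context.

```python
# Sketch: an orchestrator decomposes a goal, delegates each subtask to a
# worker, then combines and verifies the results. Workers are stubs.

def worker(subtask):
    return f"[done: {subtask}]"

def orchestrator(goal, subtasks):
    results = [worker(t) for t in subtasks]              # delegate
    assert all(r.startswith("[done:") for r in results)  # verify
    return f"{goal}: " + ", ".join(results)              # combine

print(orchestrator(
    "Build a user auth system",
    ["design schema", "implement endpoints", "write tests"],
))
```

Because subtasks are independent here, a real orchestrator could also run the workers in parallel before the combine step.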
When to Use Which
| Approach | When to Use |
|---|---|
| Simple agent | Single-domain tasks, one skill set needed |
| Multi-agent | Complex tasks spanning multiple domains |
| Hierarchical | Large projects with clear task decomposition |
Agent Frameworks
| Framework | Description | Best For |
|---|---|---|
| Claude Agent SDK | Anthropic's official SDK for building agents with Claude | Production agents with Claude |
| LangChain | Popular framework with many integrations | Prototyping, diverse tool ecosystem |
| LangGraph | Graph-based agent workflows (from LangChain team) | Complex multi-step flows with branching |
| CrewAI | Multi-agent role-based collaboration | Teams of specialized agents |
| AutoGen | Microsoft's multi-agent conversation framework | Research, conversational agents |
Claude Agent SDK — Quick Example
```python
from claude_agent_sdk import Agent, tool

@tool
def read_file(path: str) -> str:
    """Read contents of a file."""
    with open(path) as f:
        return f.read()

@tool
def write_file(path: str, content: str) -> str:
    """Write content to a file."""
    with open(path, "w") as f:
        f.write(content)
    return f"Wrote {len(content)} chars to {path}"

agent = Agent(
    model="claude-sonnet-4-6-20250514",
    tools=[read_file, write_file],
    system="You are a coding assistant. Read files, make changes, verify results."
)

result = agent.run("Add error handling to the parse_config function in config.py")
```
Challenges & Risks
| Challenge | Description | Mitigation |
|---|---|---|
| Hallucinated actions | Agent "uses" a tool incorrectly or invents data | Validate all tool inputs, verify outputs |
| Infinite loops | Agent keeps trying the same failing approach | Set max iterations (e.g., 25 steps) |
| Cost explosion | Long agent runs burn through tokens | Set token/cost budgets, use cheaper models for sub-tasks |
| Safety | Agent takes destructive actions (deletes files, sends emails) | Require human approval for dangerous operations |
| Context overflow | Long runs fill the context window | Summarize intermediate results, prune history |
| Error cascading | Early mistake compounds across steps | Build in self-verification checkpoints |
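Two of the mitigations above, a step limit and a token budget, are simple enough to enforce in the loop itself rather than trusting the model. A sketch (the `step_fn` interface and numbers are illustrative assumptions):

```python
# Guardrails enforced by the orchestrating loop, not the model:
# a hard iteration cap and a token budget.

class BudgetExceeded(Exception):
    pass

def run_with_limits(step_fn, max_steps=25, token_budget=100_000):
    tokens_used = 0
    for _ in range(max_steps):
        done, tokens = step_fn()          # one observe-think-act iteration
        tokens_used += tokens
        if tokens_used > token_budget:
            raise BudgetExceeded(f"spent {tokens_used} tokens")
        if done:
            return tokens_used
    raise RuntimeError("hit max_steps without finishing")

# Toy step function: finishes on the third call, 1000 tokens per step.
calls = {"n": 0}
def fake_step():
    calls["n"] += 1
    return calls["n"] >= 3, 1000

print(run_with_limits(fake_step))  # → 3000
```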
When to Use Agents vs Simple Prompting
Simple prompt: "Translate this text to French" → No agent needed
Single tool call: "What's the weather?" → Tool use, not an agent
Agent: "Research competitors and write a report" → Multi-step, needs planning
Agent: "Fix all TypeScript errors in this repo" → Iterative, needs tools
Decision Framework
| Factor | Simple Prompting | Tool Use | Full Agent |
|---|---|---|---|
| Steps involved | 1 | 1-2 | 3+ |
| Tools needed | 0 | 1 | Multiple |
| Planning required | No | No | Yes |
| Iteration needed | No | No | Yes |
| Autonomy level | None | Low | High |
Building Your First Agent — Checklist
- Define the task scope clearly (what can the agent do, and what can't it do?)
- Choose tools carefully (start with 3-5, not 30)
- Write excellent tool descriptions (the model reads them)
- Set iteration limits (max 10-25 steps)
- Set a cost/token budget
- Add human-in-the-loop for destructive actions
- Log every step for debugging
- Test with adversarial inputs
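The human-in-the-loop item on this checklist can be implemented as an approval gate in front of tool dispatch. A sketch, where `approve` is a stand-in for a real confirmation prompt or review queue and the tool names are hypothetical:

```python
# Sketch of a human-in-the-loop gate: destructive tools must be approved
# before they run. `approve` stands in for a real confirmation prompt.

DESTRUCTIVE = {"delete_file", "send_email"}

def approve(tool_name, args):
    # Real code would ask a human; this stub denies everything destructive.
    return False

def dispatch(tool_name, args, tools):
    if tool_name in DESTRUCTIVE and not approve(tool_name, args):
        return "blocked: needs human approval"
    return tools[tool_name](*args)

tools = {"read_file": lambda p: f"contents of {p}",
         "delete_file": lambda p: f"deleted {p}"}

print(dispatch("read_file", ("notes.txt",), tools))    # runs normally
print(dispatch("delete_file", ("notes.txt",), tools))  # blocked by the gate
```

Putting the gate in the dispatcher, rather than in the prompt, means the model cannot talk its way past it.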
Previous: 15 - Tool Use & Function Calling | Next: 17 - RAG (Retrieval-Augmented Generation)