AI Agents

What Is an Agent?

An agent is an AI system that can reason, plan, use tools, and take actions in a loop until a task is complete. Unlike a chatbot that answers one question at a time, an agent works autonomously across multiple steps.

Chatbot:  User asks → Model answers → Done

Agent:    User asks → Model thinks → Uses tool → Observes result
          → Thinks again → Uses another tool → Observes
          → ... (loop continues) → Delivers final result

Chatbot vs Agent

| Aspect | Chatbot | Agent |
|---|---|---|
| Interaction | Single turn Q&A | Multi-step autonomous execution |
| Tools | None (or minimal) | Multiple tools used in sequence |
| Planning | None | Breaks tasks into steps |
| Memory | Current conversation | Conversation + tool results + state |
| Autonomy | Responds to each message | Works independently toward a goal |
| Example | "What is recursion?" | "Refactor this module and fix the tests" |

The Agent Loop

Most agents follow the same basic loop:

          ┌──────────────┐
          │   OBSERVE    │ ← Read environment (tool results, errors, files)
          └──────┬───────┘
                 │
          ┌──────▼───────┐
          │    THINK     │ ← Reason about what to do next
          └──────┬───────┘
                 │
          ┌──────▼───────┐
          │     ACT      │ ← Call a tool, write code, execute command
          └──────┬───────┘
                 │
                 │ Loop until task is complete
                 └──────────────→ back to OBSERVE

This loop continues until the agent decides the task is complete, hits a maximum iteration limit, or encounters an unrecoverable error.
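The loop and its three stopping conditions can be sketched in a few lines of Python. This is a minimal illustration, not any framework's implementation: `run_agent_loop`, `LoopResult`, and `scripted_think` are hypothetical names, and `think` stands in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopResult:
    status: str                      # "complete", "max_iterations", or "error"
    steps: int
    history: list = field(default_factory=list)


def run_agent_loop(think: Callable[[list], dict],
                   tools: dict[str, Callable[..., str]],
                   task: str,
                   max_iterations: int = 10) -> LoopResult:
    """Run observe -> think -> act until done, the limit is hit, or a tool fails."""
    history = [("task", task)]                      # everything observed so far
    for step in range(1, max_iterations + 1):
        decision = think(history)                   # THINK: choose the next action
        if decision["action"] == "finish":
            history.append(("answer", decision["answer"]))
            return LoopResult("complete", step, history)
        tool = tools.get(decision["action"])
        if tool is None:                            # unknown tool: unrecoverable here
            history.append(("error", f"unknown tool {decision['action']}"))
            return LoopResult("error", step, history)
        try:
            observation = tool(**decision.get("args", {}))   # ACT
        except Exception as exc:
            history.append(("error", str(exc)))
            return LoopResult("error", step, history)
        history.append(("observation", observation))         # OBSERVE
    return LoopResult("max_iterations", max_iterations, history)


def scripted_think(history: list) -> dict:
    """Stand-in for a model: call the tool once, then finish with its result."""
    if any(kind == "observation" for kind, _ in history):
        return {"action": "finish", "answer": history[-1][1]}
    return {"action": "add", "args": {"a": 2, "b": 3}}


result = run_agent_loop(scripted_think, {"add": lambda a, b: str(a + b)}, "Add 2 and 3")
print(result.status, result.steps, result.history[-1])
```

In a real agent, `think` would send the history to a model and parse its reply; the stopping logic around it would stay much the same.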


The ReAct Pattern

ReAct (Reasoning + Acting) is one of the most widely used agent patterns. Before each action, the model writes out its reasoning as an explicit Thought step:

User: "What's the population of the capital of France?"

Thought: I need to find the capital of France first, then look up its population.
Action: search(query="capital of France")
Observation: The capital of France is Paris.

Thought: Now I know it's Paris. I need to find Paris's population.
Action: search(query="population of Paris 2024")
Observation: The population of Paris is approximately 2.1 million (city proper).

Thought: I have the answer now.
Answer: The population of Paris, the capital of France, is approximately 2.1 million.

The explicit Thought steps make the agent's reasoning transparent and debuggable.
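A ReAct driver mostly parses the model's text for an Action or an Answer and feeds observations back in. The sketch below assumes the exact `Action: tool(query="...")` format shown in the trace; `react`, `fake_search`, and `make_scripted_model` are hypothetical helpers, and the scripted model replays canned outputs instead of calling an LLM.

```python
import re

# Assumed output format: 'Action: tool(query="...")' or 'Answer: ...' on its own line.
ACTION_RE = re.compile(r'^Action:\s*(\w+)\(query="(.*)"\)\s*$', re.MULTILINE)
ANSWER_RE = re.compile(r'^Answer:\s*(.*)$', re.MULTILINE)


def fake_search(query: str) -> str:
    """Hypothetical search tool backed by a tiny lookup table."""
    facts = {
        "capital of France": "The capital of France is Paris.",
        "population of Paris 2024":
            "The population of Paris is approximately 2.1 million (city proper).",
    }
    return facts.get(query, "No results.")


def react(model, question: str, tools: dict, max_steps: int = 5) -> str:
    """Alternate model output and tool observations until an Answer appears."""
    transcript = f"User: {question}\n"
    for _ in range(max_steps):
        output = model(transcript)          # Thought + Action, or Thought + Answer
        transcript += output + "\n"
        answer = ANSWER_RE.search(output)
        if answer:
            return answer.group(1)
        action = ACTION_RE.search(output)
        if action is None:
            raise ValueError("model output had neither an Action nor an Answer")
        name, query = action.groups()
        transcript += f"Observation: {tools[name](query)}\n"
    raise RuntimeError("step limit reached without an answer")


def make_scripted_model(outputs):
    """Stand-in for an LLM: returns the canned outputs in order."""
    remaining = iter(outputs)
    return lambda transcript: next(remaining)


SCRIPT = [
    'Thought: I need the capital first.\nAction: search(query="capital of France")',
    'Thought: Now I need the population of Paris.\n'
    'Action: search(query="population of Paris 2024")',
    'Thought: I have the answer now.\nAnswer: Approximately 2.1 million.',
]
print(react(make_scripted_model(SCRIPT),
            "What's the population of the capital of France?",
            {"search": fake_search}))
```

Because the whole transcript is plain text, logging it gives a step-by-step record of why each tool was called.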


Agent Components

| Component | Role | Analogy |
|---|---|---|
| LLM | Reasoning, planning, decision-making | Brain |
| Tools | Actions the agent can take (APIs, file ops, search) | Hands |
| Memory | Conversation history, tool results, notes | Notepad |
| Planning | Breaking tasks into steps, prioritizing | Strategy |
| Orchestrator | Controls the loop, manages state, enforces limits | Manager |
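To make the Tools component concrete, here is one possible shape for a tool registry: it stores callables and builds the text descriptions a model would read. `ToolRegistry` and `word_count` are hypothetical names for illustration; real frameworks typically use richer schemas.

```python
import inspect
from typing import Callable


class ToolRegistry:
    """Holds callable tools plus the descriptions shown to the model."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable] = {}

    def register(self, fn: Callable) -> Callable:
        self._tools[fn.__name__] = fn
        return fn                        # returning fn makes this usable as a decorator

    def describe(self) -> str:
        """One line per tool: name, signature, and docstring."""
        lines = []
        for name, fn in self._tools.items():
            doc = inspect.getdoc(fn) or "No description."
            lines.append(f"{name}{inspect.signature(fn)}: {doc}")
        return "\n".join(lines)

    def call(self, name: str, **kwargs):
        return self._tools[name](**kwargs)


registry = ToolRegistry()


@registry.register
def word_count(text: str) -> int:
    """Count whitespace-separated words in text."""
    return len(text.split())


print(registry.describe())
print(registry.call("word_count", text="agents use tools"))
```

Deriving descriptions from docstrings keeps the text the model sees next to the code it describes, so the two are less likely to drift apart.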

Example: Claude Code as an Agent

Claude Code is a real-world agent you may already use. Here is how it might process a request like "Fix the failing test in auth.test.ts":

1. OBSERVE  → Read the test file
2. THINK    → "The test expects a 200 status but the route returns 401"
3. ACT      → Read the auth middleware code
4. OBSERVE  → See that the token validation has a bug
5. THINK    → "The token expiry check is using < instead of >"
6. ACT      → Edit the middleware file
7. ACT      → Run the tests
8. OBSERVE  → All tests pass
9. RESPOND  → "Fixed the token expiry comparison in auth middleware"

Types of Agents

1. Simple Agent (Single Loop)

One LLM running the observe-think-act loop with tools:

User → [LLM + Tools] → Result

Best for: straightforward tasks with clear steps.

2. Multi-Agent System

Multiple specialized agents collaborating:

User → Orchestrator Agent
           ├── Research Agent (searches, gathers info)
           ├── Coding Agent (writes implementation)
           └── Review Agent (checks quality)

Each agent has its own system prompt, tools, and expertise. They pass results to each other.
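As a rough sketch of agents passing results to each other, the chain below gives each agent its own system prompt and feeds one agent's output to the next. `SubAgent`, `echo_llm`, and `run_pipeline` are hypothetical, and `echo_llm` only tags the message so the hand-offs are visible.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SubAgent:
    name: str
    system_prompt: str
    llm: Callable[[str, str], str]       # (system_prompt, message) -> output

    def run(self, message: str) -> str:
        return self.llm(self.system_prompt, message)


def echo_llm(system_prompt: str, message: str) -> str:
    """Stand-in for a model call: prefixes the message with the agent's role."""
    role = system_prompt.split()[-1].rstrip(".")
    return f"[{role}] {message}"


def run_pipeline(agents: list[SubAgent], task: str) -> str:
    result = task
    for agent in agents:
        result = agent.run(result)       # each agent's output feeds the next
    return result


team = [
    SubAgent("research", "You are a researcher.", echo_llm),
    SubAgent("coding", "You are a coder.", echo_llm),
    SubAgent("review", "You are a reviewer.", echo_llm),
]
print(run_pipeline(team, "Add rate limiting"))
# prints: [reviewer] [coder] [researcher] Add rate limiting
```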

3. Hierarchical Agent

An orchestrator decomposes the goal into subtasks, delegates each one to a worker agent, and then combines and verifies the results:

Orchestrator: "Build a user auth system"
  → Worker 1: "Design the database schema"
  → Worker 2: "Implement the API endpoints"
  → Worker 3: "Write the tests"
  → Orchestrator: Combines and verifies results
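The delegation step can be sketched as a fan-out over a fixed plan. Everything here is illustrative: `orchestrate` is a hypothetical function, the workers are plain lambdas rather than model-backed agents, and the "verification" is only a non-empty check.

```python
from typing import Callable

Worker = Callable[[str], str]


def orchestrate(goal: str, plan: dict[str, str], workers: dict[str, Worker]) -> dict:
    """Send each subtask to its worker, then check that every result is non-empty."""
    results = {}
    for worker_name, subtask in plan.items():
        results[worker_name] = workers[worker_name](subtask)
    missing = [name for name, output in results.items() if not output.strip()]
    return {"goal": goal, "results": results,
            "verified": not missing, "missing": missing}


plan = {
    "schema": "Design the database schema",
    "api": "Implement the API endpoints",
    "tests": "Write the tests",
}
workers = {
    "schema": lambda task: f"done: {task}",
    "api": lambda task: f"done: {task}",
    "tests": lambda task: f"done: {task}",
}
report = orchestrate("Build a user auth system", plan, workers)
print(report["verified"], list(report["results"]))
```

A production orchestrator would usually generate the plan with a model and verify results more rigorously, for example by running the tests the third worker wrote.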

When to Use Which

| Approach | When to Use |
|---|---|
| Simple agent | Single-domain tasks, one skill set needed |
| Multi-agent | Complex tasks spanning multiple domains |
| Hierarchical | Large projects with clear task decomposition |

Agent Frameworks

| Framework | Description | Best For |
|---|---|---|
| Claude Agent SDK | Anthropic's official SDK for building agents with Claude | Production agents with Claude |
| LangChain | Popular framework with many integrations | Prototyping, diverse tool ecosystem |
| LangGraph | Graph-based agent workflows (from LangChain team) | Complex multi-step flows with branching |
| CrewAI | Multi-agent role-based collaboration | Teams of specialized agents |
| AutoGen | Microsoft's multi-agent conversation framework | Research, conversational agents |

Claude Agent SDK — Quick Example

python
# Illustrative sketch: the Agent class, the bare @tool decorator, and agent.run()
# are kept as written in this lesson. Check the names and signatures against the
# Claude Agent SDK documentation before relying on them.
from claude_agent_sdk import Agent, tool


@tool
def read_file(path: str) -> str:
    """Read contents of a file."""
    with open(path) as f:
        return f.read()


@tool
def write_file(path: str, content: str) -> str:
    """Write content to a file."""
    with open(path, "w") as f:
        f.write(content)
    return f"Wrote {len(content)} chars to {path}"


agent = Agent(
    model="claude-sonnet-4-6-20250514",  # model ID as given; confirm against the current model list
    tools=[read_file, write_file],
    system="You are a coding assistant. Read files, make changes, verify results.",
)

result = agent.run("Add error handling to the parse_config function in config.py")

Challenges & Risks

| Challenge | Description | Mitigation |
|---|---|---|
| Hallucinated actions | Agent "uses" a tool incorrectly or invents data | Validate all tool inputs, verify outputs |
| Infinite loops | Agent keeps trying the same failing approach | Set max iterations (e.g., 25 steps) |
| Cost explosion | Long agent runs burn through tokens | Set token/cost budgets, use cheaper models for sub-tasks |
| Safety | Agent takes destructive actions (deletes files, sends emails) | Require human approval for dangerous operations |
| Context overflow | Long runs fill the context window | Summarize intermediate results, prune history |
| Error cascading | Early mistake compounds across steps | Build in self-verification checkpoints |
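Several of these mitigations can live in one small guard that the orchestrator consults before every tool call. The sketch below is one possible design, not a standard API: `Guardrails`, `GuardrailViolation`, and the default limits are assumptions, and it expects tool arguments to be hashable values keyed by strings.

```python
from dataclasses import dataclass, field


class GuardrailViolation(Exception):
    """Raised when a proposed step breaks one of the configured limits."""


@dataclass
class Guardrails:
    max_steps: int = 25
    token_budget: int = 50_000
    max_repeats: int = 3             # same call this many times in a row = stuck
    dangerous_tools: frozenset = frozenset({"delete_file", "send_email"})
    steps: int = 0
    tokens_used: int = 0
    _recent: list = field(default_factory=list)

    def check(self, tool: str, args: dict, tokens: int, approved: bool = False) -> None:
        """Call before each tool invocation; raises if the step should not run."""
        self.steps += 1
        self.tokens_used += tokens
        if self.steps > self.max_steps:
            raise GuardrailViolation("step limit exceeded")
        if self.tokens_used > self.token_budget:
            raise GuardrailViolation("token budget exceeded")
        if tool in self.dangerous_tools and not approved:
            raise GuardrailViolation(f"{tool} requires human approval")
        signature = (tool, tuple(sorted(args.items())))
        self._recent = (self._recent + [signature])[-self.max_repeats:]
        if len(self._recent) == self.max_repeats and len(set(self._recent)) == 1:
            raise GuardrailViolation("same action repeated; likely stuck in a loop")


guard = Guardrails(max_repeats=3)
violation = None
try:
    for _ in range(3):               # the third identical call trips the loop check
        guard.check("run_tests", {"path": "tests/"}, tokens=500)
except GuardrailViolation as exc:
    violation = str(exc)
print(violation)
```

Context overflow and error cascading need different tools (summarization and verification steps), so they are left out of this guard.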

When to Use Agents vs Simple Prompting

Simple prompt:    "Translate this text to French"           → No agent needed
Single tool call: "What's the weather?"                     → Tool use, not an agent
Agent:            "Research competitors and write a report"  → Multi-step, needs planning
Agent:            "Fix all TypeScript errors in this repo"   → Iterative, needs tools

Decision Framework

| Factor | Simple Prompting | Tool Use | Full Agent |
|---|---|---|---|
| Steps involved | 1 | 1-2 | 3+ |
| Tools needed | 0 | 1 | Multiple |
| Planning required | No | No | Yes |
| Iteration needed | No | No | Yes |
| Autonomy level | None | Low | High |
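The table can be read as a simple rule of thumb. The function below is one hedged encoding of it; `choose_approach` is a hypothetical helper, and the thresholds are taken directly from the rows above rather than from any empirical tuning.

```python
def choose_approach(steps: int, tools: int,
                    needs_planning: bool, needs_iteration: bool) -> str:
    """Map the decision-table factors to an approach name."""
    if steps >= 3 or tools > 1 or needs_planning or needs_iteration:
        return "full agent"
    if tools == 1:
        return "tool use"
    return "simple prompting"


# The examples from this section, with estimated factor values.
print(choose_approach(1, 0, False, False))   # "Translate this text to French"
print(choose_approach(1, 1, False, False))   # "What's the weather?"
print(choose_approach(5, 3, True, True))     # "Research competitors and write a report"
```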

Building Your First Agent — Checklist

  • Define the task scope clearly (what can the agent do, what can't it?)
  • Choose tools carefully (start with 3-5, not 30)
  • Write excellent tool descriptions (the model reads them)
  • Set iteration limits (max 10-25 steps)
  • Set a cost/token budget
  • Add human-in-the-loop for destructive actions
  • Log every step for debugging
  • Test with adversarial inputs
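For the "log every step" item, one lightweight option is to wrap each tool so that every call records its arguments, outcome, and any error. This is a sketch under assumptions: `logged` and `read_config` are hypothetical, and the log is an in-memory list where a real agent would more likely write to a file or tracing system.

```python
import json
import time
from typing import Callable


def logged(tool: Callable, log: list) -> Callable:
    """Wrap a tool so each call appends a record to `log`, even when it fails."""
    def wrapper(**kwargs):
        record = {"tool": tool.__name__, "args": kwargs, "started": time.time()}
        try:
            record["result"] = tool(**kwargs)
            record["ok"] = True
            return record["result"]
        except Exception as exc:
            record["ok"] = False
            record["error"] = repr(exc)
            raise                        # the agent loop still sees the failure
        finally:
            log.append(record)           # runs on both success and failure
    return wrapper


def read_config(path: str) -> str:
    """Hypothetical tool: pretends to read a config file."""
    return f"contents of {path}"


step_log: list = []
safe_read = logged(read_config, step_log)
safe_read(path="config.py")
print(json.dumps(step_log[-1]))
```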
