Tool Use & Function Calling

The Core Idea

LLMs can only generate text. They cannot check the weather, query a database, or call an API. Tool use bridges this gap: you describe available functions to the model, and it decides when and how to call them.

The model does not execute anything. It returns a structured request saying "I want to call this function with these arguments." You execute it and send the result back.


The Tool Use Loop

1. User:      "What's the weather in Paris?"
2. LLM:       I should call get_weather(city="Paris")     ← tool_use block
3. You:       Execute get_weather("Paris") → {"temp": 22, "condition": "sunny"}
4. You:       Send tool result back to the LLM
5. LLM:       "It's 22°C and sunny in Paris right now!"   ← final response

This is a multi-turn loop — not a single request/response. The model may call zero tools, one tool, or many tools before giving a final answer.
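The loop above can be sketched as driver code independent of any particular SDK. Here `call_model` and `execute_tool` are stand-in functions, not real API calls, and the message shapes are simplified for illustration:

```python
# A minimal sketch of the tool-use loop. `call_model` stands in for one
# SDK request; `execute_tool` is your own dispatch to real functions.
def run_tool_loop(call_model, execute_tool, user_message, max_turns=10):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        reply = call_model(messages)                 # one model turn
        messages.append({"role": "assistant", "content": reply})
        if reply["type"] == "text":                  # final answer: done
            return reply["text"]
        # Otherwise the model requested a tool call: run it, send result back
        result = execute_tool(reply["name"], reply["input"])
        messages.append({"role": "user", "content": {"tool_result": result}})
    raise RuntimeError("Tool loop exceeded max_turns")
```

The `max_turns` cap matters: without it, a confused model can request tools forever.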


Defining Tools

You describe tools as JSON schemas so the model knows what's available:

```json
{
  "name": "get_weather",
  "description": "Get current weather for a city. Use this when the user asks about weather conditions.",
  "input_schema": {
    "type": "object",
    "properties": {
      "city": {
        "type": "string",
        "description": "City name, e.g. 'Paris' or 'New York'"
      },
      "units": {
        "type": "string",
        "enum": ["celsius", "fahrenheit"],
        "description": "Temperature units (default: celsius)"
      }
    },
    "required": ["city"]
  }
}
```

The model reads the name, description, and input_schema to decide when and how to call the tool. Clear descriptions are critical — they are the model's only guide.
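Because the schema is a contract the model may violate, it helps to check incoming arguments before executing. A minimal stdlib-only sketch for the tool above (a real app might use the jsonschema package instead):

```python
def validate_weather_input(inputs: dict) -> list:
    """Return a list of error strings; an empty list means the input is valid."""
    errors = []
    # "city" is in the schema's required list and must be a string
    if not isinstance(inputs.get("city"), str):
        errors.append("'city' is required and must be a string")
    # "units" is optional but constrained by the enum
    units = inputs.get("units", "celsius")
    if units not in ("celsius", "fahrenheit"):
        errors.append("'units' must be 'celsius' or 'fahrenheit'")
    return errors
```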


Claude Tool Use — Full Example (Python)

```python
import anthropic
import json

client = anthropic.Anthropic()

# Define tools
tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    }
]

# Your actual function
def get_weather(city: str) -> dict:
    # In reality, call a weather API here
    return {"temp": 22, "condition": "sunny", "city": city}

# Step 1: Send user message with tools
response = client.messages.create(
    model="claude-sonnet-4-6-20250514",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
)

# Step 2: Check if model wants to use a tool
if response.stop_reason == "tool_use":
    # Find the tool_use block
    tool_block = next(b for b in response.content if b.type == "tool_use")

    # Step 3: Execute the function
    result = get_weather(**tool_block.input)

    # Step 4: Send result back
    final = client.messages.create(
        model="claude-sonnet-4-6-20250514",
        max_tokens=1024,
        tools=tools,
        messages=[
            {"role": "user", "content": "What's the weather in Paris?"},
            {"role": "assistant", "content": response.content},
            {
                "role": "user",
                "content": [
                    {
                        "type": "tool_result",
                        "tool_use_id": tool_block.id,
                        "content": json.dumps(result),
                    }
                ],
            },
        ],
    )

    print(final.content[0].text)  # "It's 22°C and sunny in Paris right now!"
```

OpenAI Function Calling

Same concept, different format:

```python
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    tools=tools,
    messages=[{"role": "user", "content": "Weather in Paris?"}],
)

# Check tool_calls in the response
tool_call = response.choices[0].message.tool_calls[0]
# Execute the function, then send the result back with role: "tool"
```
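The follow-up turn has a specific shape: OpenAI expects the function output as a string in a message with role "tool", linked to the request by tool_call_id. A small sketch of that message (the id and result values below are illustrative):

```python
import json

def tool_result_message(tool_call_id: str, result: dict) -> dict:
    # The result must be serialized to a string; tool_call_id links it
    # back to the matching entry in the assistant message's tool_calls.
    return {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": json.dumps(result),
    }
```

Append this message after the assistant turn that contains the tool_calls, then call chat.completions.create again to get the final text answer.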

Key Differences

| Aspect | Claude (Anthropic) | GPT (OpenAI) |
|---|---|---|
| Tool schema field | `input_schema` | `parameters` |
| Response block type | `tool_use` content block | `tool_calls` on the message |
| Result role | `user` with `tool_result` | `tool` role |
| Stop reason | `stop_reason: "tool_use"` | `finish_reason: "tool_calls"` |

MCP — Model Context Protocol

What Is It?

MCP is Anthropic's open standard for connecting AI models to external tools and data sources. Think of it as USB-C for AI tools — one standard interface that works everywhere.

The Problem MCP Solves

Before MCP, every AI app had to build custom integrations:

Claude App  ──custom code──→  GitHub API
Claude App  ──custom code──→  Slack API
Claude App  ──custom code──→  Database
ChatGPT App ──different code──→  GitHub API   (duplicated work!)

With MCP:

Any AI Client ──MCP──→ GitHub MCP Server ──→ GitHub API
Any AI Client ──MCP──→ Slack MCP Server  ──→ Slack API
Any AI Client ──MCP──→ DB MCP Server     ──→ Database

How MCP Works

| Component | Role |
|---|---|
| MCP Server | Exposes tools (e.g., "search_files", "run_query") via the MCP protocol |
| MCP Client | AI app that discovers and calls tools from MCP servers |
| Transport | Communication channel (stdio for local, HTTP+SSE for remote) |

A single MCP server can serve Claude Code, Cursor, VS Code, or any MCP-compatible client.
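Clients typically discover local servers through a configuration file. A hypothetical entry in the style used by Claude Desktop, wiring up the official filesystem server (the server name and project path here are made up for illustration):

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/project"]
    }
  }
}
```

The client launches the command, speaks MCP to it over stdio, and the server's tools appear to the model like any other tool definitions.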


Agentic Tool Use

When the model calls multiple tools in sequence to complete a complex task, it becomes an agent (more in 16 - AI Agents):

User: "Find all Python files with TODO comments and create a GitHub issue for each"

LLM → search_files(pattern="*.py", content="TODO")    → [file1.py:23, file2.py:45]
LLM → read_file("file1.py", line=23)                  → "TODO: refactor auth logic"
LLM → read_file("file2.py", line=45)                  → "TODO: add input validation"
LLM → create_issue(title="Refactor auth logic", ...)   → Issue #42
LLM → create_issue(title="Add input validation", ...)  → Issue #43
LLM → "Done! Created issues #42 and #43."

The model loops autonomously: observe, decide, act, observe again.


Best Practices

Tool Descriptions

BAD:  "name": "query"
      "description": "Queries data"

GOOD: "name": "search_products"
      "description": "Search the product catalog by name, category, or price range.
       Use this when the user asks about available products, pricing, or wants
       to find a specific item. Returns up to 10 results sorted by relevance."

Input Validation

Always validate tool inputs before executing — the model can hallucinate parameters:

```python
def execute_tool(name: str, inputs: dict):
    if name == "delete_file":
        path = inputs.get("path", "")
        if ".." in path or path.startswith("/etc"):
            return {"error": "Invalid path — access denied"}
    # ... proceed with execution
```

Error Handling

Return errors as tool results — don't crash. The model can often recover:

```python
def call_tool_safely(inputs: dict) -> dict:
    try:
        result = call_external_api(inputs)
        return {"success": True, "data": result}
    except Exception as e:
        # The model sees the error and can adjust its approach
        return {"success": False, "error": str(e)}
```

Common Pitfalls

| Pitfall | Solution |
|---|---|
| Vague tool descriptions | Write descriptions as if explaining to a new developer |
| Too many tools (>20) | Group related tools or use a tool-selection step |
| No input validation | Always validate and sanitize tool inputs |
| Infinite tool loops | Set a max iteration count (e.g., 10 tool calls) |
| Ignoring tool errors | Return errors as results so the model can adapt |
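The "too many tools" pitfall can be handled with a cheap pre-selection step: only expose the tool groups relevant to the current request. A sketch with made-up groups and keywords; a real system might use embeddings or a dedicated routing call instead:

```python
# Illustrative tool groups; names and keywords are not from a real catalog.
TOOL_GROUPS = {
    "weather": {"keywords": {"weather", "temperature", "forecast"},
                "tools": ["get_weather", "get_forecast"]},
    "files":   {"keywords": {"file", "read", "search", "todo"},
                "tools": ["search_files", "read_file"]},
}

def select_tools(user_message: str, max_tools: int = 20) -> list:
    """Keep only tools from groups whose keywords appear in the request."""
    words = set(user_message.lower().split())
    selected = []
    for group in TOOL_GROUPS.values():
        if group["keywords"] & words:
            selected.extend(group["tools"])
    return selected[:max_tools]
```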


Previous: 14 - AI APIs & SDKs | Next: 16 - AI Agents