The Claude Agent SDK lets you build autonomous AI agents in Python — programs that use Claude to reason, call tools, maintain conversation state, and complete multi-step tasks without constant human input. This tutorial walks through the SDK from a basic agent to production-ready patterns.
If you've used the Anthropic API directly, the Agent SDK builds on top of it. You still use the same anthropic Python package, but the SDK adds structure for the patterns that every agent needs: tool dispatch, conversation memory, streaming, and error recovery.
What the Agent SDK Is (and What It Isn't)
The Claude Agent SDK is Anthropic's official Python framework for building agents powered by Claude. It wraps the anthropic library and provides:
- Tool use scaffolding: define Python functions, decorate them, and the SDK handles the JSON schema generation and dispatch loop automatically.
- Conversation management: the agent maintains message history across turns, so you don't manually assemble the messages array on every call.
- Streaming support: real-time token output with a clean async interface.
- Agentic loop control: the SDK decides when to call tools, when to stop, and how many iterations to allow — all configurable.
What it isn't: a framework for multi-agent orchestration or a replacement for systems like LangGraph or CrewAI when you need complex agent-to-agent coordination. For single-agent tasks — which covers the majority of real-world use cases — the SDK is the right layer.
Installation and Setup
Install the Anthropic Python SDK with agent support:
pip install anthropic
You need Python 3.9+. Set your API key:
export ANTHROPIC_API_KEY="sk-ant-your-key-here"
Or set it programmatically:
import anthropic
client = anthropic.Anthropic(api_key="sk-ant-your-key-here")
For production, always use an environment variable or secrets manager — never hardcode the key.
Your First Agent: Hello World
A minimal agent that uses Claude to answer a question:
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[
{"role": "user", "content": "What is 2 + 2?"}
]
)
print(response.content[0].text)
That's a single API call, not yet an agent. An agent adds tools and a loop. Here's the same structure with an explicit conversation loop:
import anthropic
client = anthropic.Anthropic()
messages = []
def chat(user_input: str) -> str:
messages.append({"role": "user", "content": user_input})
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=messages
)
assistant_message = response.content[0].text
messages.append({"role": "assistant", "content": assistant_message})
return assistant_message
# Multi-turn conversation preserved in memory
print(chat("My name is Alex."))
print(chat("What's my name?")) # Claude remembers: Alex
This is the foundation. The messages list is the agent's memory. Every turn appends to it. Claude has full context of everything said so far.
Adding Tools
Tools are where agents become useful. A tool is a Python function Claude can decide to call. The SDK sends Claude a JSON schema describing your tools; Claude outputs a tool call when it wants to use one; your code executes it and returns the result.
Here's a complete agent with two tools — a calculator and a file reader:
import anthropic
import json
client = anthropic.Anthropic()
# Define tools as Python functions
def calculate(expression: str) -> str:
"""Evaluate a math expression safely."""
try:
# Only allow safe math operations
allowed = set('0123456789+-*/()., ')
if not all(c in allowed for c in expression):
return "Error: invalid characters in expression"
result = eval(expression)
return str(result)
except Exception as e:
return f"Error: {e}"
def read_file(path: str) -> str:
"""Read a text file and return its contents."""
try:
with open(path, 'r') as f:
return f.read()[:2000] # Limit output
except FileNotFoundError:
return f"File not found: {path}"
# Tool schemas for the API
tools = [
{
"name": "calculate",
"description": "Evaluate a mathematical expression. Use this for any arithmetic.",
"input_schema": {
"type": "object",
"properties": {
"expression": {"type": "string", "description": "Math expression to evaluate"}
},
"required": ["expression"]
}
},
{
"name": "read_file",
"description": "Read a file from the filesystem.",
"input_schema": {
"type": "object",
"properties": {
"path": {"type": "string", "description": "File path to read"}
},
"required": ["path"]
}
}
]
TOOL_MAP = {"calculate": calculate, "read_file": read_file}
def run_agent(user_message: str) -> str:
messages = [{"role": "user", "content": user_message}]
while True:
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
tools=tools,
messages=messages
)
# If Claude is done, return the text response
if response.stop_reason == "end_turn":
return response.content[0].text
# If Claude wants to use a tool
if response.stop_reason == "tool_use":
# Add Claude's response to messages
messages.append({"role": "assistant", "content": response.content})
# Execute each tool call
tool_results = []
for block in response.content:
if block.type == "tool_use":
tool_fn = TOOL_MAP.get(block.name)
if tool_fn:
result = tool_fn(**block.input)
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": result
})
# Add tool results and loop back
messages.append({"role": "user", "content": tool_results})
# Claude decides when to call tools
print(run_agent("What is 847 * 293? Then calculate 847 * 293 / 12."))
The key pattern is the while True agentic loop: Claude runs, decides whether to use a tool, you execute it, and you loop until Claude returns stop_reason == "end_turn". This is the core of every Claude agent.
Streaming Responses
For user-facing applications, streaming lets you display tokens as they arrive instead of waiting for the full response:
import anthropic
client = anthropic.Anthropic()
def stream_agent(user_message: str):
with client.messages.stream(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[{"role": "user", "content": user_message}]
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)
print() # newline at end
stream_agent("Explain how neural networks learn, step by step.")
For streaming with tools, use the async client:
import anthropic
import asyncio
async_client = anthropic.AsyncAnthropic()
async def stream_with_tools(user_message: str):
async with async_client.messages.stream(
model="claude-sonnet-4-6",
max_tokens=1024,
tools=tools, # from earlier example
messages=[{"role": "user", "content": user_message}]
) as stream:
async for event in stream:
if hasattr(event, "delta") and hasattr(event.delta, "text"):
print(event.delta.text, end="", flush=True)
asyncio.run(stream_with_tools("What is 1234 * 5678?"))
Conversation Memory and System Prompts
Real agents need persistent context. Here's a pattern for an agent with a system prompt and multi-session memory:
import anthropic
import json
from pathlib import Path
client = anthropic.Anthropic()
SYSTEM_PROMPT = """You are a helpful coding assistant.
You have access to a calculator tool for math.
Be concise and specific. Return working code when asked."""
class Agent:
def __init__(self, memory_file: str = None):
self.messages = []
self.memory_file = memory_file
if memory_file and Path(memory_file).exists():
self.messages = json.loads(Path(memory_file).read_text())
def chat(self, user_input: str) -> str:
self.messages.append({"role": "user", "content": user_input})
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=2048,
system=SYSTEM_PROMPT,
tools=tools,
messages=self.messages
)
# Handle tool use
while response.stop_reason == "tool_use":
self.messages.append({"role": "assistant", "content": response.content})
tool_results = self._execute_tools(response.content)
self.messages.append({"role": "user", "content": tool_results})
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=2048,
system=SYSTEM_PROMPT,
tools=tools,
messages=self.messages
)
reply = response.content[0].text
self.messages.append({"role": "assistant", "content": reply})
# Persist to file if set
if self.memory_file:
Path(self.memory_file).write_text(json.dumps(self.messages))
return reply
def _execute_tools(self, content):
results = []
for block in content:
if hasattr(block, 'type') and block.type == "tool_use":
fn = TOOL_MAP.get(block.name)
result = fn(**block.input) if fn else "Tool not found"
results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": result
})
return results
agent = Agent(memory_file="agent_memory.json")
print(agent.chat("Calculate the compound interest on ,000 at 7% for 10 years."))
print(agent.chat("Now show me that in code.")) # Remembers previous context
Production Patterns
Three patterns that matter in production:
Rate limit handling. The API returns 429 errors under high load. Use exponential backoff:
import time
import anthropic
client = anthropic.Anthropic()
def call_with_retry(messages, max_retries=3):
for attempt in range(max_retries):
try:
return client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=messages
)
except anthropic.RateLimitError:
if attempt == max_retries - 1:
raise
wait = 2 ** attempt # 1s, 2s, 4s
time.sleep(wait)
Max iteration guard. Always limit your agentic loop to prevent runaway costs:
MAX_ITERATIONS = 10
def safe_agent_loop(messages):
for i in range(MAX_ITERATIONS):
response = client.messages.create(...)
if response.stop_reason == "end_turn":
return response
if response.stop_reason == "tool_use":
# handle tools...
pass
raise RuntimeError("Agent exceeded max iterations")
Context window management. Long conversations hit Claude's context limit. Trim old messages when the history grows:
def trim_history(messages, keep_last=20):
"""Keep the system message and last N turns."""
if len(messages) > keep_last:
return messages[-keep_last:]
return messages
Agent SDK vs Raw API: When to Use Each
Use the Agent SDK pattern (agentic loop + tools) when:
- Claude needs to take actions — read files, call APIs, run code
- The task requires multiple steps where each step depends on the previous
- You're building something users interact with over multiple turns
- You want Claude to decide when a task is done
Use a single API call when:
- You need one transformation: classify this text, summarize this document, extract these fields
- Latency is critical and you can't afford the loop overhead
- The task is deterministic — same input should always give same output
The pattern in this tutorial — system prompt, tool dispatch loop, conversation memory — covers roughly 80% of real-world agent use cases. For the remaining 20% that needs multi-agent coordination, parallel tool execution, or complex state machines, look at frameworks like LangGraph or build your own orchestration on top of these primitives.
Next Steps
The Like One Academy has a full Claude Agent SDK course covering streaming in depth, parallel tool execution, multi-agent patterns, and deploying agents to production. The concepts in this tutorial form the foundation for everything in that course.
Start with the hello world example, add one tool that connects to something real — a database query, an API call, a file system operation — and you'll have the core of a working agent in under 50 lines of Python.