What Python package do I need to use the Claude Agent SDK?

Install the Anthropic Python SDK with pip install anthropic. The agent capabilities — tool use, agentic loop, streaming — are all built into the core SDK. No separate package is required for Python agents.

How do I prevent a Claude agent from running forever?

Add a MAX_ITERATIONS guard to your agentic loop. Set a limit (typically 10-20 iterations) and raise an error or return gracefully when exceeded. Also set a max_tokens budget and use the stop_reason field to detect when Claude has finished rather than polling indefinitely.

Can Claude call external APIs as tools?

Yes. Any Python function can be a tool — HTTP requests, database queries, file system operations, shell commands. Define the function, write a JSON schema describing its inputs, add it to the tools list, and Claude will call it when appropriate. The SDK handles the tool use loop automatically.

How do I give a Claude agent persistent memory across sessions?

Serialize the messages list to JSON and save it to a file or database at the end of each session. On startup, reload it into memory before the first API call. This gives Claude full context of all previous conversations. For long-running agents, implement a trimming strategy to avoid hitting the context window limit.

What is stop_reason tool_use in the Claude API?

When Claude wants to call a tool, the API returns stop_reason='tool_use' instead of 'end_turn'. This signals that Claude has paused and is waiting for tool results before continuing. Your code should execute the requested tool, append the result to the messages array, and call the API again to continue the agent loop.

Claude Agent SDK: Python Tutorial

Step-by-step Python tutorial: build a Claude agent with tool use, streaming, and memory. Working code from hello world to production patterns.

The Claude Agent SDK lets you build autonomous AI agents in Python — programs that use Claude to reason, call tools, maintain conversation state, and complete multi-step tasks without constant human input. This tutorial walks through the SDK from a basic agent to production-ready patterns.

If you've used the Anthropic API directly, the Agent SDK builds on top of it. You still use the same anthropic Python package, but the SDK adds structure for the patterns that every agent needs: tool dispatch, conversation memory, streaming, and error recovery.

What the Agent SDK Is (and What It Isn't)

The Claude Agent SDK is Anthropic's official Python framework for building agents powered by Claude. It wraps the anthropic library and provides:

Tool use scaffolding: define Python functions, decorate them, and the SDK handles the JSON schema generation and dispatch loop automatically.
Conversation management: the agent maintains message history across turns, so you don't manually assemble the messages array on every call.
Streaming support: real-time token output with a clean async interface.
Agentic loop control: the SDK decides when to call tools, when to stop, and how many iterations to allow — all configurable.

What it isn't: a framework for multi-agent orchestration or a replacement for systems like LangGraph or CrewAI when you need complex agent-to-agent coordination. For single-agent tasks — which covers the majority of real-world use cases — the SDK is the right layer.

Installation and Setup

Install the Anthropic Python SDK with agent support:

pip install anthropic

You need Python 3.9+. Set your API key:

export ANTHROPIC_API_KEY="sk-ant-your-key-here"

Or set it programmatically:

import anthropic

client = anthropic.Anthropic(api_key="sk-ant-your-key-here")

For production, always use an environment variable or secrets manager — never hardcode the key.

Your First Agent: Hello World

A minimal agent that uses Claude to answer a question:

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What is 2 + 2?"}
    ]
)

print(response.content[0].text)

That's a single API call, not yet an agent. An agent adds tools and a loop. Here's the same structure with an explicit conversation loop:

import anthropic

client = anthropic.Anthropic()
messages = []

def chat(user_input: str) -> str:
    messages.append({"role": "user", "content": user_input})
    
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=messages
    )
    
    assistant_message = response.content[0].text
    messages.append({"role": "assistant", "content": assistant_message})
    return assistant_message

# Multi-turn conversation preserved in memory
print(chat("My name is Alex."))
print(chat("What's my name?"))  # Claude remembers: Alex

This is the foundation. The messages list is the agent's memory. Every turn appends to it. Claude has full context of everything said so far.

Adding Tools

Tools are where agents become useful. A tool is a Python function Claude can decide to call. The SDK sends Claude a JSON schema describing your tools; Claude outputs a tool call when it wants to use one; your code executes it and returns the result.

Here's a complete agent with two tools — a calculator and a file reader:

import anthropic
import json

client = anthropic.Anthropic()

# Define tools as Python functions
def calculate(expression: str) -> str:
    """Evaluate a math expression safely."""
    try:
        # Only allow safe math operations
        allowed = set('0123456789+-*/()., ')
        if not all(c in allowed for c in expression):
            return "Error: invalid characters in expression"
        result = eval(expression)
        return str(result)
    except Exception as e:
        return f"Error: {e}"

def read_file(path: str) -> str:
    """Read a text file and return its contents."""
    try:
        with open(path, 'r') as f:
            return f.read()[:2000]  # Limit output
    except FileNotFoundError:
        return f"File not found: {path}"

# Tool schemas for the API
tools = [
    {
        "name": "calculate",
        "description": "Evaluate a mathematical expression. Use this for any arithmetic.",
        "input_schema": {
            "type": "object",
            "properties": {
                "expression": {"type": "string", "description": "Math expression to evaluate"}
            },
            "required": ["expression"]
        }
    },
    {
        "name": "read_file",
        "description": "Read a file from the filesystem.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "File path to read"}
            },
            "required": ["path"]
        }
    }
]

TOOL_MAP = {"calculate": calculate, "read_file": read_file}

def run_agent(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]
    
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )
        
        # If Claude is done, return the text response
        if response.stop_reason == "end_turn":
            return response.content[0].text
        
        # If Claude wants to use a tool
        if response.stop_reason == "tool_use":
            # Add Claude's response to messages
            messages.append({"role": "assistant", "content": response.content})
            
            # Execute each tool call
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    tool_fn = TOOL_MAP.get(block.name)
                    if tool_fn:
                        result = tool_fn(**block.input)
                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": result
                        })
            
            # Add tool results and loop back
            messages.append({"role": "user", "content": tool_results})

# Claude decides when to call tools
print(run_agent("What is 847 * 293? Then calculate 847 * 293 / 12."))

The key pattern is the while True agentic loop: Claude runs, decides whether to use a tool, you execute it, and you loop until Claude returns stop_reason == "end_turn". This is the core of every Claude agent.

Streaming Responses

For user-facing applications, streaming lets you display tokens as they arrive instead of waiting for the full response:

import anthropic

client = anthropic.Anthropic()

def stream_agent(user_message: str):
    with client.messages.stream(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": user_message}]
    ) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)
    print()  # newline at end

stream_agent("Explain how neural networks learn, step by step.")

For streaming with tools, use the async client:

import anthropic
import asyncio

async_client = anthropic.AsyncAnthropic()

async def stream_with_tools(user_message: str):
    async with async_client.messages.stream(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=tools,  # from earlier example
        messages=[{"role": "user", "content": user_message}]
    ) as stream:
        async for event in stream:
            if hasattr(event, "delta") and hasattr(event.delta, "text"):
                print(event.delta.text, end="", flush=True)

asyncio.run(stream_with_tools("What is 1234 * 5678?"))

Conversation Memory and System Prompts

Real agents need persistent context. Here's a pattern for an agent with a system prompt and multi-session memory:

import anthropic
import json
from pathlib import Path

client = anthropic.Anthropic()

SYSTEM_PROMPT = """You are a helpful coding assistant. 
You have access to a calculator tool for math.
Be concise and specific. Return working code when asked."""

class Agent:
    def __init__(self, memory_file: str = None):
        self.messages = []
        self.memory_file = memory_file
        if memory_file and Path(memory_file).exists():
            self.messages = json.loads(Path(memory_file).read_text())
    
    def chat(self, user_input: str) -> str:
        self.messages.append({"role": "user", "content": user_input})
        
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=2048,
            system=SYSTEM_PROMPT,
            tools=tools,
            messages=self.messages
        )
        
        # Handle tool use
        while response.stop_reason == "tool_use":
            self.messages.append({"role": "assistant", "content": response.content})
            tool_results = self._execute_tools(response.content)
            self.messages.append({"role": "user", "content": tool_results})
            response = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=2048,
                system=SYSTEM_PROMPT,
                tools=tools,
                messages=self.messages
            )
        
        reply = response.content[0].text
        self.messages.append({"role": "assistant", "content": reply})
        
        # Persist to file if set
        if self.memory_file:
            Path(self.memory_file).write_text(json.dumps(self.messages))
        
        return reply
    
    def _execute_tools(self, content):
        results = []
        for block in content:
            if hasattr(block, 'type') and block.type == "tool_use":
                fn = TOOL_MAP.get(block.name)
                result = fn(**block.input) if fn else "Tool not found"
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result
                })
        return results

agent = Agent(memory_file="agent_memory.json")
print(agent.chat("Calculate the compound interest on ,000 at 7% for 10 years."))
print(agent.chat("Now show me that in code."))  # Remembers previous context

Production Patterns

Three patterns that matter in production:

Rate limit handling. The API returns 429 errors under high load. Use exponential backoff:

import time
import anthropic

client = anthropic.Anthropic()

def call_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                messages=messages
            )
        except anthropic.RateLimitError:
            if attempt == max_retries - 1:
                raise
            wait = 2 ** attempt  # 1s, 2s, 4s
            time.sleep(wait)

Max iteration guard. Always limit your agentic loop to prevent runaway costs:

MAX_ITERATIONS = 10

def safe_agent_loop(messages):
    for i in range(MAX_ITERATIONS):
        response = client.messages.create(...)
        if response.stop_reason == "end_turn":
            return response
        if response.stop_reason == "tool_use":
            # handle tools...
            pass
    raise RuntimeError("Agent exceeded max iterations")

Context window management. Long conversations hit Claude's context limit. Trim old messages when the history grows:

def trim_history(messages, keep_last=20):
    """Keep the system message and last N turns."""
    if len(messages) > keep_last:
        return messages[-keep_last:]
    return messages

Agent SDK vs Raw API: When to Use Each

Use the Agent SDK pattern (agentic loop + tools) when:

Claude needs to take actions — read files, call APIs, run code
The task requires multiple steps where each step depends on the previous
You're building something users interact with over multiple turns
You want Claude to decide when a task is done

Use a single API call when:

You need one transformation: classify this text, summarize this document, extract these fields
Latency is critical and you can't afford the loop overhead
The task is deterministic — same input should always give same output

The pattern in this tutorial — system prompt, tool dispatch loop, conversation memory — covers roughly 80% of real-world agent use cases. For the remaining 20% that needs multi-agent coordination, parallel tool execution, or complex state machines, look at frameworks like LangGraph or build your own orchestration on top of these primitives.

Next Steps

The Like One Academy has a full Claude Agent SDK course covering streaming in depth, parallel tool execution, multi-agent patterns, and deploying agents to production. The concepts in this tutorial form the foundation for everything in that course.

Start with the hello world example, add one tool that connects to something real — a database query, an API call, a file system operation — and you'll have the core of a working agent in under 50 lines of Python.