Local AI Agents
Build AI agents that use tools, make decisions, and execute multi-step tasks -- all running on your hardware with no API keys.
After this lesson you'll know
- What AI agents are and how they differ from simple chatbots
- How to build a tool-using agent with Ollama and Python
- The ReAct pattern for agent reasoning
- Safety guardrails for autonomous local agents
From Chat to Agent
A chatbot takes a prompt and returns a response. An agent takes a goal and figures out the steps to achieve it, using tools along the way. The difference is autonomy: a chatbot answers questions; an agent solves problems.
An agent loop looks like this:
- Observe: Read the current state (user query, previous results, tool outputs)
- Think: Decide what to do next
- Act: Call a tool (search files, run code, query a database, read a webpage)
- Repeat: Use the tool output to decide the next action, until the goal is achieved
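The loop above can be sketched in a few lines of Python. This is a minimal illustration, not a fixed API: `run_agent`, the `ANSWER:` prefix, and the `tool_name: argument` convention are all assumptions made for the sketch.

```python
# Minimal observe-think-act loop (names and text protocol are illustrative).
def run_agent(goal, model, tools, max_steps=10):
    history = [f"GOAL: {goal}"]
    for _ in range(max_steps):
        # Observe + Think: the model sees the transcript so far and decides.
        decision = model("\n".join(history))
        if decision.startswith("ANSWER:"):
            return decision.removeprefix("ANSWER:").strip()
        # Act: the decision names a tool; run it and record the observation.
        tool_name, _, arg = decision.partition(":")
        result = tools[tool_name.strip()](arg.strip())
        history.append(f"OBSERVATION: {result}")
    return "Stopped: max_steps reached without an answer."
```

In a real agent, `model` would be a call to a local model served by Ollama and `tools` would wrap file or web access behind safety checks; `max_steps` is the simplest guardrail, preventing a runaway loop.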
Cloud-based agents (like OpenAI's Assistants API) require API keys and send your data to external servers. Local agents run entirely on your machine: Ollama provides the brain, Python provides the tools, and your data stays private.
The ReAct Pattern
ReAct (Reasoning + Acting) is the standard pattern for building agents. The model alternates between thinking (reasoning about what to do) and acting (calling tools). Here's the core prompt:
ReAct Agent Prompt

```python
SYSTEM_PROMPT = """You are an AI assistant with access to tools.
For each user request, think step by step, then use tools as needed.

Available tools:
- search_files(query): Search local files for content matching query
- read_file(path): Read a file's contents
- run_python(code): Execute Python code and return output
- search_web(query): Search the web (if online)

Respond in this format:
THOUGHT: [your reasoning about what to do next]
ACTION: [tool_name(arguments)]
OBSERVATION: [tool output will be inserted here]
... repeat THOUGHT/ACTION/OBSERVATION as needed ...
ANSWER: [final answer to the user]

Always think before acting. Never fabricate tool outputs."""
```
The key insight: the model generates the tool call as text, your code parses and executes it, then feeds the result back to the model. The model never actually runs code or accesses files -- your Python wrapper does that, with whatever safety checks you define.
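The parse-and-dispatch step can be sketched like this. The `TOOLS` registry and `execute_action` helper are hypothetical names for illustration; a real registry would wrap file and code access behind the safety checks you define.

```python
import re
from pathlib import Path

# Hypothetical tool registry. Dangerous tools (code execution, file writes)
# should be sandboxed or disabled; here run_python is deliberately stubbed out.
TOOLS = {
    "read_file": lambda path: Path(path).read_text(),
    "run_python": lambda code: "code execution disabled in this sketch",
}

# Matches the first line of the form: ACTION: tool_name(arguments)
ACTION_RE = re.compile(r"ACTION:\s*(\w+)\((.*)\)")

def execute_action(model_output):
    """Parse the first ACTION line in the model's text and run the tool."""
    match = ACTION_RE.search(model_output)
    if not match:
        return None  # no tool call; the model likely emitted ANSWER instead
    name, raw_arg = match.group(1), match.group(2).strip().strip("'\"")
    if name not in TOOLS:
        return f"OBSERVATION: unknown tool {name!r}"
    return f"OBSERVATION: {TOOLS[name](raw_arg)}"
```

The returned `OBSERVATION:` line is appended to the conversation and sent back to the model, which then produces its next THOUGHT. Because the wrapper is the only thing that touches the filesystem, this is also where guardrails live: whitelisting paths, rejecting unknown tools, and capping how many actions the agent may take.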