Database Patterns
Your AI needs a brain — a structured place to store memory, context, and agent state. These three patterns are the foundation of every persistent AI system, from simple chatbots to autonomous agents.
Why Database Design Matters for AI
Most AI tutorials focus on prompts and models. But the database is where AI becomes useful. Without persistent memory, every conversation starts from zero. Without structured state, your agent cannot plan or learn. Without event logs, you cannot debug or improve.
The three-table architecture below is not theoretical — it is the exact schema powering production AI systems. Each table serves a distinct purpose, and separating them lets you query and optimize each independently.
Pattern 1: Key-Value Brain
Instead of creating a new column for every piece of information, you store everything as key-value pairs. This means your AI agent can learn new things without database migrations. No ALTER TABLE. No downtime. Just insert a new key.
Keys use dot notation (like identity.name, session.active_work) for namespacing — the dots create a hierarchy, like folders on your computer. Values are JSONB — a Postgres data type that stores structured JSON data. It can hold strings, numbers, arrays, or nested objects, and you can query inside it.
-- Pattern 1: Key-Value Brain (flexible, no migrations needed)
CREATE TABLE brain_context (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  key         TEXT UNIQUE NOT NULL,        -- dot notation: 'identity.name'
  value       JSONB NOT NULL DEFAULT '{}', -- any structured data
  category    TEXT DEFAULT 'general',      -- namespace: session, directive, system
  description TEXT,                        -- human-readable note
  priority    INT DEFAULT 5 CHECK (priority BETWEEN 1 AND 10),
  created_at  TIMESTAMPTZ DEFAULT now(),
  updated_at  TIMESTAMPTZ DEFAULT now()
);
-- Fast lookups by key and category
CREATE INDEX idx_brain_key ON brain_context(key);
CREATE INDEX idx_brain_category ON brain_context(category);
The power of this pattern is the upsert — a single query that either creates a new key or updates an existing one. Your agent never has to check "does this key exist?" before writing.
-- If 'session.active_work' exists → update it
-- If it doesn't exist → create it
INSERT INTO brain_context (key, value, category, description)
VALUES (
  'session.active_work',
  '{"task": "deploy v2", "status": "in_progress"}',
  'session',
  'Current work in progress'
)
ON CONFLICT (key) DO UPDATE SET
  value = EXCLUDED.value,
  description = EXCLUDED.description,
  updated_at = now();
Common categories include session.* (what the agent is doing now), directive.* (rules and behaviors), identity.* (who the agent is), system.* (infrastructure state), and infrastructure.* (deployment config). Each category can hold hundreds of keys.
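Reads are equally simple. As a sketch (key names match the upsert above), dot-notation prefixes pair naturally with LIKE, and individual fields can be pulled out of the JSONB value with the ->> operator:

```sql
-- Fetch everything the agent is doing right now
SELECT key, value
FROM brain_context
WHERE key LIKE 'session.%'
ORDER BY priority DESC;

-- Extract a single field from inside the JSONB value
SELECT value->>'task' AS current_task
FROM brain_context
WHERE key = 'session.active_work';
```

Because ->> returns text, you can filter and sort on fields inside the JSONB without ever changing the table's schema.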
Pattern 2: Append-Only Memory
Deleting data destroys context. An AI agent that forgets past interactions cannot learn or improve. The append-only pattern means you never delete, always append — building a complete history the agent can search through.
The embedding column is what makes this powerful. It stores a vector — a list of numbers that represent the meaning of text. AI models convert text into these vectors, and you can find semantically similar memories even when the words are completely different (e.g., searching "deployment" finds a memory about "pushed code to Vercel").
-- Enable pgvector extension (run once per database)
CREATE EXTENSION IF NOT EXISTS vector;
-- Pattern 2: Append-Only Memory (never delete, always append)
CREATE TABLE agent_memory (
  id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  content    TEXT NOT NULL,           -- what happened
  role       TEXT DEFAULT 'system',   -- user, assistant, system
  importance INT DEFAULT 5 CHECK (importance BETWEEN 1 AND 10),
  embedding  vector(384),             -- semantic meaning (384 = BGE-small)
  metadata   JSONB DEFAULT '{}',      -- tags, source, context
  created_at TIMESTAMPTZ DEFAULT now()
);
-- Create an index for fast vector similarity searches
-- (ivfflat clusters existing rows, so build it after loading data)
CREATE INDEX idx_memory_embedding
  ON agent_memory
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
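Writing to this table is a plain INSERT, never an UPDATE or DELETE. A minimal sketch (the content and tags are illustrative; in practice you would also supply the embedding column with the 384-number vector your model produced for the content):

```sql
INSERT INTO agent_memory (content, role, importance, metadata)
VALUES (
  'Deployed v2 to production',        -- what happened
  'assistant',                        -- who did it
  8,                                  -- high importance: worth recalling later
  '{"tags": ["deploy"], "source": "ci"}'
);
```

The embedding column is nullable, so a row can be stored first and embedded asynchronously, which keeps writes fast even when the embedding model is slow.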
To find relevant memories, you compute the embedding of your query and search by cosine similarity, a measure of how close two vectors are in meaning-space. The <=> operator is pgvector's cosine distance operator (lower means more similar), which is why the query subtracts it from 1 to produce a similarity score.
-- $1 is the embedding of your search query
-- (generated by BGE-small or another embedding model)
SELECT content, importance,
       1 - (embedding <=> $1::vector) AS similarity
FROM agent_memory
WHERE importance >= 7 -- only high-importance memories
ORDER BY embedding <=> $1::vector
LIMIT 5;
The dimension in vector(384) must match your embedding model. BGE-small produces 384 dimensions (free via HuggingFace). OpenAI's ada-002 produces 1536. Larger dimensions capture more nuance but use more storage and are slightly slower to search.
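The metadata column supports the same JSONB queries as the brain table, so structural filters can be combined with semantic search in one statement. A sketch, with illustrative tag names, using the @> containment operator:

```sql
-- Semantic search restricted to memories tagged "deploy"
-- ($1 is the query embedding, as in the search above)
SELECT content,
       1 - (embedding <=> $1::vector) AS similarity
FROM agent_memory
WHERE metadata @> '{"tags": ["deploy"]}' -- JSONB containment filter
ORDER BY embedding <=> $1::vector
LIMIT 3;
```

Filtering before ranking narrows the candidate set, which both sharpens results and reduces the amount of vector math the database has to do.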