Anatomy of a Prompt
Watch your words get sliced into tokens in real time. Understand the hidden structure behind every AI interaction.
After this lesson you'll know
- How AI breaks text into tokens (not words, not characters)
- What the context window is and why it matters
- How system prompts shape AI behavior
- What temperature controls and when to adjust it
Every great prompt has up to five components.
A well-crafted prompt is not just a question — it is an instruction manual. The best prompts give the AI everything it needs to succeed. Here are the five building blocks, from most essential to most advanced:
1. Task. The core instruction. Every prompt has one. "Summarize this article." "Write a Python function that sorts a list." "Translate this to Spanish." Be specific about the action: "summarize" is clearer than "tell me about." A vague task gets a vague answer. A precise task gets a precise answer.
2. Context. Give the AI the information it needs to do the job well. This could be a document to summarize, data to analyze, or background about your situation. The more relevant context you provide, the better the response. Think of it as briefing a consultant before they start work — they cannot help you if they do not understand your situation.
3. Role. "You are a senior data scientist" produces different output than "you are a high school teacher" — even with the same question. Roles shape vocabulary, depth, assumptions, and focus. A security engineer reviews code for vulnerabilities. A UX designer reviews it for user experience. Same code, completely different analysis. Roles are typically set in the system prompt.
4. Format. Tell the AI what shape the answer should take. "Respond in bullet points." "Use JSON format." "Write exactly 3 paragraphs." "Include a code example." Without format instructions, the AI guesses — and it might guess wrong. Specifying format is especially important when you are building applications that need to parse the AI's output programmatically.
5. Constraints. "Do not use jargon." "Keep it under 200 words." "Never make promises about delivery dates." "Only use information from the provided document." Constraints are guardrails. They prevent common failure modes and keep the AI within bounds. They are especially critical for customer-facing applications where the AI should not make promises or share sensitive information.
Here is how all five components look in a single prompt:
# ROLE (who the AI should be)
You are a senior technical writer with 10 years of experience
writing developer documentation.
# CONTEXT (background information)
Here is our API endpoint documentation for the /users route:
[... documentation text ...]
# TASK (what you want done)
Rewrite this documentation to be beginner-friendly.
# FORMAT (how the output should look)
Use markdown. Include a "Quick Start" section with a code
example, followed by a detailed reference table.
# CONSTRAINTS (what to avoid)
Do not assume the reader knows REST APIs.
Keep sentences under 20 words. No jargon without definitions.
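The same five-part structure can also be assembled programmatically. Here is a minimal sketch — the `build_prompt` helper and its section order are illustrative, not part of any library:

```python
def build_prompt(task, context="", role="", output_format="", constraints=""):
    """Assemble the five building blocks into one prompt string.

    Only the task is required; empty sections are skipped.
    """
    sections = [
        ("ROLE", role),
        ("CONTEXT", context),
        ("TASK", task),
        ("FORMAT", output_format),
        ("CONSTRAINTS", constraints),
    ]
    # Emit each non-empty section under a "# NAME" header, like the example above
    return "\n\n".join(f"# {name}\n{text}" for name, text in sections if text)


prompt = build_prompt(
    task="Rewrite this documentation to be beginner-friendly.",
    role="You are a senior technical writer.",
    output_format="Use markdown with a Quick Start section.",
)
print(prompt)
```

Sections you leave empty simply disappear, so the same helper works for a bare one-line task or a fully specified five-part prompt.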
The four things that shape every AI response.
Tokens. AI does not read words or characters. It reads tokens — subword chunks like "un" + "believ" + "able." Common words are one token; rare words get split. A token is roughly 4 characters or 0.75 words of English. This matters because you pay per token and your context window is measured in tokens.
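Real tokenizers are model-specific, but the roughly-4-characters-per-token rule of thumb is easy to sketch. This is a rough heuristic for budgeting, not an actual tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))


# "unbelievable" is 12 characters, so the heuristic guesses 3 tokens --
# which happens to match the "un" + "believ" + "able" split above.
print(estimate_tokens("unbelievable"))
```

For exact counts, use the provider's own tokenizer or the token usage fields returned by the API (shown in the code below).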
Context window. Everything the model can see at once: your prompt, conversation history, and its response. Claude Opus 4.6 has a 1M token context window. GPT-4o has 128K. Once a conversation exceeds the window, applications typically drop the oldest content. This is why long conversations can "forget" earlier context.
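Dropping the oldest turns when a conversation outgrows the window can be sketched like this. The token counts here use the crude characters-divided-by-4 estimate; a real application would use the model's tokenizer:

```python
def trim_history(messages, budget, count_tokens=lambda m: len(m["content"]) // 4):
    """Drop the oldest messages until the conversation fits the token budget."""
    trimmed = list(messages)
    while trimmed and sum(count_tokens(m) for m in trimmed) > budget:
        trimmed.pop(0)  # forget the earliest turn first
    return trimmed


history = [
    {"role": "user", "content": "x" * 400},       # ~100 tokens, oldest turn
    {"role": "assistant", "content": "y" * 40},   # ~10 tokens, newest turn
]
recent = trim_history(history, budget=50)  # only the newest turn survives
```

This oldest-first strategy is the simplest option; production systems often summarize dropped turns instead of discarding them outright.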
System prompt. A hidden message processed before any user input. It defines the AI's persona, rules, and constraints. The user never sees it, but it shapes every response. Think of it as giving the AI a job description before it starts work.
Temperature. Controls randomness in the output. Low (0.0) = always picks the most likely word = deterministic and focused. High (1.0) = sometimes picks unlikely words = creative, surprising, error-prone. Use low for code and facts, high for brainstorming.
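You can see temperature's effect with a toy softmax over next-word scores. The logits below are made up for illustration; real models produce one score per vocabulary entry:

```python
import math


def softmax(logits, temperature):
    """Convert scores to probabilities; temperature rescales scores first."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]      # made-up scores for three candidate next words
cold = softmax(logits, 0.2)   # low temperature: probability piles onto the top word
hot = softmax(logits, 1.5)    # high temperature: the distribution flattens out
```

At low temperature the top candidate absorbs nearly all the probability, which is why the output feels deterministic; at temperature exactly 0, implementations skip sampling and pick the top word directly rather than divide by zero.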
Here is a real API call showing all four concepts in action:
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",    # ← which model
    max_tokens=500,               # ← caps the response length (output tokens)
    temperature=0.2,              # ← low = factual, precise
    system="You are a helpful coding tutor. "   # ← system prompt
           "Explain concepts simply. "
           "Always include a code example.",
    messages=[                    # ← user message (tokenized)
        {"role": "user",
         "content": "What is a list comprehension in Python?"}
    ],
)

# Check token usage
print(f"Input tokens: {response.usage.input_tokens}")
print(f"Output tokens: {response.usage.output_tokens}")
print(response.content[0].text)
Every parameter in this call maps to one of the four concepts. The system message is the invisible instruction. The messages content gets tokenized. max_tokens caps the response length, which counts against the context window budget. temperature controls creativity.