Chain-of-Thought Reasoning
Unlock Claude's deeper reasoning — from simple CoT to extended thinking, with production code
The Power of "Think Step by Step"
Chain-of-thought (CoT) prompting is one of the most powerful techniques in prompt engineering. By asking Claude to show its reasoning before giving a final answer, you dramatically improve accuracy on complex tasks — math, logic, coding, analysis, and multi-step decisions.
The reason is simple: when Claude writes out intermediate steps, each step becomes context for the next. It can catch errors mid-reasoning, check its own logic, and build on solid intermediate conclusions. Without CoT, the model jumps directly from question to answer — and on complex problems, that jump often lands wrong.
For example, asked "A store has a 25% off sale, then offers an additional 10% off the sale price. What is the total discount on a $200 item?", Claude might reason:
Step 1: Start from the original price: $200
Step 2: Apply 25% off: $200 x 0.75 = $150
Step 3: Apply the additional 10% off the sale price: $150 x 0.90 = $135
Step 4: Total discount = $200 - $135 = $65, and $65 / $200 = 32.5%
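The arithmetic above is easy to verify independently with a few lines of Python:

```python
original = 200.0
sale_price = original * 0.75        # 25% off -> $150.00
final_price = sale_price * 0.90     # extra 10% off the sale price -> $135.00
discount_pct = (original - final_price) / original * 100
print(f"${final_price:.2f}, {discount_pct:.1f}% off")  # $135.00, 32.5% off
```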
CoT in the API — Three Approaches
There are three ways to use chain-of-thought with Claude, each with different tradeoffs:
import anthropic

client = anthropic.Anthropic()

# Just add "Think step by step" to your prompt
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": (
            "Think step by step.\n\n"
            "A store has a 25% off sale, then offers an additional "
            "10% off the sale price. What is the total discount "
            "on a $200 item?"
        )
    }]
)
print(response.content[0].text)
# Claude shows: Step 1... Step 2... Step 3... Answer: $135 (32.5% off)
# For production: enforce CoT format in the system prompt
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=2048,
    system=(
        "You are a precise analytical assistant.\n\n"
        "For EVERY question, respond in this exact format:\n"
        "<reasoning>\n"
        "Step-by-step analysis here...\n"
        "</reasoning>\n\n"
        "<answer>\n"
        "Final answer here.\n"
        "</answer>\n\n"
        "<confidence>HIGH/MEDIUM/LOW</confidence>"
    ),
    messages=[{
        "role": "user",
        "content": "Should we use a SQL or NoSQL database for a social media feed?"
    }]
)

# Parse the structured output
text = response.content[0].text
reasoning = text.split("<reasoning>")[1].split("</reasoning>")[0]
answer = text.split("<answer>")[1].split("</answer>")[0]
print(f"Answer: {answer.strip()}")
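Note that chained `split()` calls raise `IndexError` if the model ever omits a tag. A slightly more defensive regex-based extractor (a sketch of ours, not part of the Anthropic SDK) degrades gracefully instead:

```python
import re

def extract_tag(text, tag):
    """Return the contents of <tag>...</tag>, or None if the tag is absent."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
    return match.group(1).strip() if match else None

# Hypothetical model output for demonstration
sample = (
    "<reasoning>\nFeeds are read-heavy and denormalized...\n</reasoning>\n"
    "<answer>\nStart with SQL; add a cache layer.\n</answer>\n"
    "<confidence>MEDIUM</confidence>"
)
print(extract_tag(sample, "answer"))      # Start with SQL; add a cache layer.
print(extract_tag(sample, "confidence"))  # MEDIUM
print(extract_tag(sample, "summary"))     # None (missing tag, no crash)
```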
# Extended thinking: Claude reasons internally before responding
# Available on Opus 4.6 and Sonnet 4.6
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000  # tokens for internal reasoning
    },
    messages=[{
        "role": "user",
        "content": "Analyze the tradeoffs between microservices and a monolith for a startup with 5 engineers."
    }]
)

# Response has both thinking and text blocks
for block in response.content:
    if block.type == "thinking":
        print(f"Internal reasoning ({len(block.thinking)} chars):")
        print(block.thinking[:200] + "...")
    elif block.type == "text":
        print(f"\nFinal answer:\n{block.text}")
1. Prompt instruction. Simplest: add "think step by step" to the prompt. Reasoning is visible in the output and costs output tokens.
2. Structured XML tags. Parseable: tags separate reasoning from the answer. Best for production APIs.
3. Extended thinking. Most powerful: internal reasoning with a token budget, returned as a separate thinking block in the response. Best for hard problems.
How the Thinking Process Works
When Claude uses chain-of-thought, its reasoning unfolds in a visible sequence, as in the discount example above: start from the given numbers, apply one operation per step, and state the conclusion last.
This pattern applies to math puzzles, logic problems, and code debugging alike. The key is that each step becomes context for the next, allowing Claude to catch errors mid-reasoning rather than jumping to a wrong conclusion.
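In practice this often means letting the model reason in full, then keeping only the final line for downstream code. A minimal illustration (the `final_answer` helper and the `Answer:` marker are our own conventions, not an Anthropic API):

```python
def final_answer(cot_text, marker="Answer:"):
    """Pull the final answer out of a chain-of-thought response.

    Scans for the last line containing the marker; falls back to
    the last non-empty line if the marker never appears.
    """
    lines = [ln.strip() for ln in cot_text.splitlines() if ln.strip()]
    for line in reversed(lines):
        if marker in line:
            return line.split(marker, 1)[1].strip()
    return lines[-1] if lines else ""

cot = (
    "Step 1: $200 * 0.75 = $150\n"
    "Step 2: $150 * 0.90 = $135\n"
    "Answer: $135 (32.5% off)"
)
print(final_answer(cot))  # $135 (32.5% off)
```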