Monitoring & Detection

Catching attacks in production: anomaly detection, logging patterns, and abuse signals

Why Pre-Deployment Testing Is Not Enough

Red teaming finds vulnerabilities before launch, but attacks keep evolving. New jailbreak techniques appear regularly, and users find creative exploits you never imagined. Production monitoring is your last line of defense: the system that catches the attacks your pre-deployment testing missed.

Real-world analogy: A home inspection checks for problems before you move in. But you still need smoke detectors, security cameras, and carbon monoxide alarms after you are living there. Pre-deployment testing is the inspection. Monitoring is the smoke detector.

What to Monitor

Input patterns

Log user inputs and scan for injection signatures, unusual length, encoded content, and repeated attack patterns from the same user.
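As a rough illustration of these input checks, here is a minimal Python sketch. The signature list, the length limit, the base64 heuristic, and the repeat threshold are all assumptions made for the example, not recommended values; a real deployment would tune its own and would expect false positives from simple pattern matching.

```python
import re
from collections import Counter

# Hypothetical signature list; maintain and tune your own in practice.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"reveal (your |the )?system prompt", re.I),
    re.compile(r"you are now (in )?developer mode", re.I),
]
# Heuristic: a long run of base64-alphabet characters may be an encoded payload.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")

MAX_INPUT_CHARS = 4000  # assumed limit for "unusual length"
REPEAT_THRESHOLD = 3    # assumed number of flagged inputs before escalating

# Number of flagged inputs seen per user (in memory, for illustration only).
flag_counts = Counter()


def scan_input(user_id: str, text: str) -> list[str]:
    """Return a list of flag names raised by one user input."""
    flags = []
    if any(p.search(text) for p in INJECTION_PATTERNS):
        flags.append("injection_signature")
    if len(text) > MAX_INPUT_CHARS:
        flags.append("unusual_length")
    if BASE64_RUN.search(text):
        flags.append("encoded_content")
    if flags:
        # Track repeated flagged inputs from the same user.
        flag_counts[user_id] += 1
        if flag_counts[user_id] >= REPEAT_THRESHOLD:
            flags.append("repeat_offender")
    return flags
```

The returned flags would typically be written to the request log alongside the user ID, so that later analysis can correlate them with output and tool-usage signals.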

Output anomalies

Watch for responses that contain system prompt fragments, PII patterns, unusual URLs, or content that violates your policies.
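The output checks can be sketched in the same way. In this example the PII patterns (email address and US Social Security number format), the allowed domain list, and the 30-character fragment window are assumptions chosen for illustration; policy-violation checks are left out because they depend entirely on your own policies.

```python
import re

# Simplified PII patterns (assumptions; real PII detection needs much more).
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Captures the host part of http(s) URLs.
URL_HOST = re.compile(r"https?://([^/\s:]+)", re.I)

# Hypothetical allowlist of domains the agent is expected to link to.
ALLOWED_DOMAINS = {"example.com", "docs.example.com"}


def leaks_prompt(response: str, system_prompt: str, window: int = 30) -> bool:
    """True if any `window`-character fragment of the prompt appears verbatim."""
    resp = " ".join(response.lower().split())
    prompt = " ".join(system_prompt.lower().split())
    if not prompt:
        return False
    if len(prompt) <= window:
        return prompt in resp
    return any(
        prompt[i:i + window] in resp
        for i in range(len(prompt) - window + 1)
    )


def scan_output(response: str, system_prompt: str) -> list[str]:
    """Return a list of flag names raised by one model response."""
    flags = []
    if leaks_prompt(response, system_prompt):
        flags.append("system_prompt_leak")
    if EMAIL.search(response) or US_SSN.search(response):
        flags.append("pii_pattern")
    hosts = {h.lower().rstrip(".,)") for h in URL_HOST.findall(response)}
    if hosts - ALLOWED_DOMAINS:
        flags.append("unusual_url")
    return flags
```

A verbatim fragment match only catches direct leaks; a paraphrased or translated system prompt would pass this check, so treat it as one signal among several.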

Tool usage patterns

Track which tools are called, how often, and with what parameters. Spikes in database queries or file reads may indicate exfiltration attempts.
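One simple way to spot such spikes is a per-tool sliding window over recent calls. The sketch below assumes hypothetical tool names and limits; the `ToolCallMonitor` class is illustrative and not part of any agent framework.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class ToolCallMonitor:
    """Counts calls per tool inside a sliding time window."""

    def __init__(self, limits: dict, window_seconds: float = 60.0):
        self.limits = limits            # e.g. {"db_query": 20}; assumed values
        self.window = window_seconds
        self.calls = defaultdict(deque)  # tool name -> deque of (timestamp, params)

    def record(self, tool: str, params: dict, now: Optional[float] = None) -> bool:
        """Record one call; return True if the tool is over its limit."""
        now = time.time() if now is None else now
        recent = self.calls[tool]
        recent.append((now, params))
        # Drop calls that have fallen out of the window.
        while recent and recent[0][0] <= now - self.window:
            recent.popleft()
        limit = self.limits.get(tool)
        return limit is not None and len(recent) > limit
```

For example, `ToolCallMonitor({"db_query": 20})` would return `True` from `record` on the twenty-first `db_query` call within a minute. Keeping the parameters next to each timestamp makes it possible to inspect what a suspicious burst was actually asking for.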

Behavioral drift

Does the agent stay in character? Monitor for responses that break persona, discuss off-topic subjects, or exhibit behavior inconsistent with the system prompt.

Cost anomalies

Sudden cost spikes may indicate abuse: someone running the agent in circles, exfiltrating data through many queries, or exploiting tool loops.
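A basic cost-spike check compares each new measurement against a rolling baseline. The sketch below uses a z-score over recent per-interval costs; the history size, threshold, and minimum history are assumed values, and the `CostSpikeDetector` class is hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class CostSpikeDetector:
    """Flags a cost sample that sits far above the recent baseline."""

    def __init__(self, history_size: int = 24, threshold: float = 3.0,
                 min_history: int = 6):
        self.history = deque(maxlen=history_size)  # recent normal samples
        self.threshold = threshold                 # z-score cutoff (assumed)
        self.min_history = min_history             # samples needed before alerting

    def observe(self, cost: float) -> bool:
        """Record one cost sample; return True if it looks like a spike."""
        is_spike = False
        if len(self.history) >= self.min_history:
            mu = mean(self.history)
            # Floor the deviation so a flat history does not divide by zero.
            sigma = max(pstdev(self.history), 0.05 * mu, 1e-9)
            is_spike = (cost - mu) / sigma > self.threshold
        if not is_spike:
            # Only normal samples update the baseline.
            self.history.append(cost)
        return is_spike
```

Leaving spikes out of the baseline is a deliberate choice in this sketch: sustained abuse keeps triggering alerts instead of gradually becoming the new normal. The trade-off is that a legitimate, permanent increase in usage would need a manual baseline reset.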
