The Production AI Checklist.
Everything that must be true before your AI system goes live.
After this lesson you'll know:
- A comprehensive pre-launch checklist covering all nine domains
- How to run a production readiness review for AI systems
- Common launch failures and how to prevent each one
- Post-launch monitoring and iteration practices
The Nine Domains
Production readiness for AI systems spans nine domains. Missing any one of them leads to a specific class of failure. This checklist synthesizes everything from the previous nine lessons into a single actionable reference. This is not a wish list. Every item exists because a real team learned its importance through a production incident. Treat unchecked items as risks, not aspirations.
How to use this checklist: Review it before every launch, major model change, or significant architecture update. Each item should have an owner and a verification method (automated test, manual review, or monitoring alert). Items marked [P0] are launch blockers. Items marked [P1] should be resolved within two weeks of launch.
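One way to make "owner plus verification method" concrete is to keep the checklist as data and have a script decide whether the launch is blocked. The following is a minimal sketch, not a prescribed format: the `ChecklistItem` class, its field names, and the owner names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Priority(Enum):
    P0 = "P0"  # launch blocker
    P1 = "P1"  # resolve within two weeks of launch


class Verification(Enum):
    AUTOMATED_TEST = "automated test"
    MANUAL_REVIEW = "manual review"
    MONITORING_ALERT = "monitoring alert"


@dataclass
class ChecklistItem:
    # Hypothetical record shape: one row per checklist line.
    domain: str
    description: str
    priority: Priority
    owner: str
    verification: Verification
    done: bool = False


def launch_blockers(items):
    """Return unchecked P0 items; a non-empty result means the launch is blocked."""
    return [i for i in items if i.priority is Priority.P0 and not i.done]


# Example data (owners are placeholders).
items = [
    ChecklistItem("Reliability", "Circuit breakers on all model API calls",
                  Priority.P0, "alice", Verification.AUTOMATED_TEST, done=True),
    ChecklistItem("Security", "Rate limiting per user/API key",
                  Priority.P0, "bob", Verification.MONITORING_ALERT),
    ChecklistItem("Architecture", "Architecture diagram exists and is current",
                  Priority.P1, "carol", Verification.MANUAL_REVIEW),
]

for item in launch_blockers(items):
    print(f"BLOCKED [{item.priority.value}] {item.domain}: "
          f"{item.description} (owner: {item.owner})")
```

Keeping the list in version control alongside the code means an unchecked P0 item can fail a CI job instead of relying on someone remembering to look.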
Domains 1-3: Foundation
**1. Architecture**
- [ ] [P0] System decomposed into independent, testable components (Gateway, Router, Pipeline/Orchestrator)
- [ ] [P0] No single points of failure in the critical path
- [ ] [P1] Component boundaries are documented with input/output contracts
- [ ] [P1] Architecture diagram exists and is current

**2. Reliability**
- [ ] [P0] Retry logic with exponential backoff and jitter on all external calls
- [ ] [P0] Circuit breakers on all model API calls
- [ ] [P0] Fallback chain with cross-provider models configured
- [ ] [P0] Timeout budgets allocated across all pipeline stages
- [ ] [P1] Graceful degradation tested: kill each dependency and verify behavior
- [ ] [P1] Local model fallback available as last resort

**3. Security**
- [ ] [P0] Input guardrails: injection pattern detection active
- [ ] [P0] Output guardrails: PII detection, system prompt leak prevention
- [ ] [P0] API keys in secret manager, not in code or prompts
- [ ] [P0] Rate limiting per user/API key
- [ ] [P1] Sandwich defense in prompt architecture
- [ ] [P1] AI-generated code runs in sandbox with no network access
- [ ] [P1] Red team exercise completed within last 30 days
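The first two Reliability items (retry with exponential backoff and jitter, and a circuit breaker) can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than a production implementation: `backoff_delay`, `call_with_retry`, `CircuitBreaker`, and all default values are hypothetical, the jitter strategy shown is "full jitter" (a uniform delay between zero and the exponential cap), and a real system would retry only on errors known to be transient.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=8.0, rng=random.random):
    """Full-jitter delay in seconds: rng() * min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retry(fn, max_attempts=4, sleep=time.sleep, **delay_kwargs):
    """Call fn(); on any exception, wait with jittered backoff and retry.

    Assumption: every exception is treated as retryable to keep the sketch short.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(backoff_delay(attempt, **delay_kwargs))


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses a call without trying it."""


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; after `cooldown`
    seconds it lets one trial call through (the half-open state)."""

    def __init__(self, threshold=3, cooldown=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown:
                raise CircuitOpenError("circuit open; use the fallback chain")
            # Cooldown elapsed: fall through and allow a single trial call.
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result


# Usage: a simulated model call that times out twice, then succeeds.
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


result = call_with_retry(flaky, sleep=lambda seconds: None)  # no real sleeping here
```

Injecting `sleep`, `rng`, and `clock` as parameters is a design choice that lets an automated test verify the behavior without waiting in real time, which fits the checklist's requirement that each item have a verification method.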