3. Online Observability — The Live Part

Offline eval tells you the system can be good. Observability tells you whether it is good right now. This is where the JD's "tokens/sec, cost-per-request" requirement lives.

3.1 The Three Pillars (extended for LLMs)
3.2 Tracing — Why It's the Most Important Tool
3.3 The LLM-Specific Metrics (what the JD calls out verbatim)
3.4 The Stack
3.5 Sampling Strategy
3.6 PII and Logging — The Security Tie-In