"I have a million pages of internal documents. I need an LLM-powered Q&A that answers from THOSE documents, with citations, and updates when the documents do."

1. Why RAG exists

2. Basic RAG pipeline

3. Chunking strategies

4. Advanced RAG toolkit

5. Failure modes and the eval loop

6. When NOT to use RAG

Diagnostics

Clarify → receive answers → name the diagnosis → propose fix → discuss trade-offs

The two patterns to internalize

Pattern 1 — when no eval exists, building it IS the first task

A customer without an eval set is guessing, and your first job is always to remove the guessing. A 30-query eval set takes 2–4 hours to build and grounds every downstream conversation in measured results instead of anecdotes. Always include "build the eval" as step 1 unless the customer already has one.

The line for the interview:

"If they don't have an eval, the first deliverable is the eval. That's not process for its own sake: every fix from here is a coin flip without it. A few hours of golden-set construction saves weeks of guessed fixes."
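
To make "build the eval" concrete, here is a minimal sketch of a golden set and a retrieval hit-rate check. Everything in it is an assumption for illustration: the GoldenQuery shape, the hit_rate_at_k metric, the three-document toy corpus, and the keyword retriever are hypothetical stand-ins for the customer's real documents and retriever. A real golden set would hold around 30 hand-labeled queries.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class GoldenQuery:
    question: str
    expected_doc_ids: Set[str]  # hand-labeled: docs a correct answer must draw on


def hit_rate_at_k(
    golden: List[GoldenQuery],
    retrieve: Callable[[str], List[str]],
    k: int = 5,
) -> float:
    """Fraction of queries with at least one expected doc in the top-k results."""
    if not golden:
        raise ValueError("golden set is empty")
    hits = 0
    for item in golden:
        top_k = retrieve(item.question)[:k]
        if item.expected_doc_ids & set(top_k):
            hits += 1
    return hits / len(golden)


# Toy corpus and retriever (hypothetical stand-ins for the real system).
CORPUS = {
    "hr-001": "employees accrue vacation days monthly",
    "it-042": "reset your vpn password through the self service portal",
    "fin-007": "expense reports are due by the fifth business day",
}


def keyword_retrieve(question: str) -> List[str]:
    """Rank docs by word overlap with the question; drop docs with no overlap."""
    q_words = set(question.lower().split())
    scored = [
        (len(q_words & set(text.split())), doc_id)
        for doc_id, text in CORPUS.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored if score > 0]


GOLDEN = [
    GoldenQuery("how do i reset my vpn password", {"it-042"}),
    GoldenQuery("when are expense reports due", {"fin-007"}),
    GoldenQuery("how much parental leave do i get", {"hr-001"}),
]

print(f"hit rate@3: {hit_rate_at_k(GOLDEN, keyword_retrieve, k=3):.2f}")
```

The third query misses on purpose: the toy retriever has no word overlap with the vacation document, which is exactly the kind of gap a golden set surfaces before anyone starts guessing at fixes. Re-running the same function after each change turns "I think it's better" into a number.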

Pattern 2 — the retrieval-vs-generation split is your highest-leverage diagnostic

Every "answer is wrong" investigation should fork on this question:

Was the right info in the retrieved chunks?
        ↓                          ↓
       YES                        NO
        ↓                          ↓
Generation problem        Retrieval problem
(prompt, model,           (parsing, chunking,
 lost-in-the-middle,       embedding, ranking,
 hallucination)            filtering)

One question. Halves the search space. Always ask it.
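
The fork above can be sketched as a small triage function. This is an illustration under stated assumptions, not a definitive implementation: FailedCase, required_facts, and triage are hypothetical names, and case-insensitive substring matching is a crude proxy for "the right info was retrieved." In practice a human reviewer or a grader model would make that call.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailedCase:
    question: str
    retrieved_chunks: List[str]
    required_facts: List[str]  # hand-labeled strings a correct answer depends on


def triage(case: FailedCase) -> str:
    """Label a wrong answer as a 'retrieval' or 'generation' problem.

    Assumption: a fact counts as retrieved if it appears as a
    case-insensitive substring of the concatenated chunks.
    """
    context = " ".join(case.retrieved_chunks).lower()
    missing = [fact for fact in case.required_facts if fact.lower() not in context]
    # Any missing fact means the generator never saw what it needed.
    return "retrieval" if missing else "generation"


cases = [
    FailedCase(
        question="What is the refund window?",
        retrieved_chunks=["Refunds are accepted within 30 days of purchase."],
        required_facts=["30 days"],
    ),
    FailedCase(
        question="What is the refund window?",
        retrieved_chunks=["Shipping takes 5 business days."],
        required_facts=["30 days"],
    ),
]

for case in cases:
    print(triage(case), "problem:", case.question)
```

Running this over every failing query in the eval set gives a count per branch, which shows whether to spend the next day on prompts and models or on parsing, chunking, and ranking.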