7. The Customer Mismatch Patterns ⭐ (FDE gold)

These are the canonical RRK trap scenarios. Memorize the failure mode and the right fix.

What they want: a model that answers questions about their docs
What they need: RAG, not fine-tuning
Why: small-dataset fine-tuning does not reliably encode facts — it biases distributions. Facts live in a retrievable knowledge base. Hallucinations don't go away because you fine-tuned.
FDE phrasing: "Fine-tuning is great for teaching the model how to do something, but not for teaching it what to know. For knowledge, we want retrieval — that way when your docs update next quarter, we don't have to retrain."

Catastrophic forgetting from over-tuning on narrow data
Fixes: use LoRA (preserves base weights), lower learning rate, fewer epochs, mix in general-purpose data, or just accept that this needs a separate adapter not a single tuned model

Maybe — if you can fine-tune a smaller cheaper model (Flash, distilled) to match Pro quality on your narrow task
Not if you're tuning the same-size model. Then cost is unchanged or worse.

This is a fine-tuning use case — but specifically preference tuning (DPO) with curated good/bad style pairs, not SFT
Or, often easier: a strong system prompt with 3-5 few-shot examples gets 80% of the way there