Prompt Engineering

The interview-ready synthesis: if asked "how do you approach prompt engineering in production," the structure is: a layered prompt with a strict schema, an eval suite from day one, prompts versioned in git like code, and being clear-eyed about when the bug isn't in the prompt at all. That last clause is what separates an FDE answer from a prompt-influencer answer.
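To make "strict schema plus eval suite" concrete, here is a minimal sketch in plain Python. Everything named here is an assumption for illustration: the two-field schema (`label`, `confidence`), the `validate_output` and `run_evals` helpers, and the `generate` callable standing in for whatever model client you actually use.

```python
import json

# Hypothetical schema for illustration: field name -> required Python type.
REQUIRED_FIELDS = {"label": str, "confidence": float}


def validate_output(raw: str) -> list[str]:
    """Return a list of schema violations; an empty list means the output passes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            errors.append(f"wrong type for {field}")
    extra = set(data) - set(REQUIRED_FIELDS)
    if extra:
        errors.append(f"unexpected fields: {sorted(extra)}")
    return errors


def run_evals(generate, cases) -> float:
    """Run (input_text, expected_label) cases through `generate`; return the pass rate."""
    passed = 0
    for text, expected_label in cases:
        raw = generate(text)
        # A case passes only if the output is schema-valid AND the label matches.
        if not validate_output(raw) and json.loads(raw)["label"] == expected_label:
            passed += 1
    return passed / len(cases)


def stub_generate(text: str) -> str:
    """Stand-in for a real model call (assumption: your client returns a string)."""
    if "refund" in text:
        return '{"label": "billing", "confidence": 0.9}'
    return "Sorry, I can't help with that."


EVAL_CASES = [("refund please", "billing"), ("reset my password", "account")]
pass_rate = run_evals(stub_generate, EVAL_CASES)
print(f"pass rate: {pass_rate:.0%}")
```

The point of checking this into git next to the prompt is that a prompt change and its pass-rate change show up in the same diff.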

Quick reframe for your protocol. I'd suggest tightening the sequence to:

1. Define the bug (what does "inconsistent" mean, examples please)
2. Scope (new vs. always, % of traffic, any recent changes: deploys, model version bumps, traffic mix shifts)
3. Cheapest checks first (sampling params, model version, any A/B test running)
4. Then the eval/versioning audit (do they even know if it got worse?)
5. Then prompt structure
6. Then retrieval / context (is the variance coming from retrieved chunks?)
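One way to pin down what "inconsistent" means before going further is to put a number on it. The sketch below is a rough illustration, not a standard metric: it sends the same prompt `n` times and reports the share of outputs that match the most common one. The `generate` callable, the `consistency_rate` name, and the canned outputs are all assumptions standing in for a real model call.

```python
import itertools
from collections import Counter
from typing import Callable


def consistency_rate(generate: Callable[[str], str], prompt: str, n: int = 20) -> float:
    """Fraction of n outputs equal to the most common output (1.0 = fully consistent)."""
    if n <= 0:
        raise ValueError("n must be positive")
    outputs = [generate(prompt).strip() for _ in range(n)]
    _, top_count = Counter(outputs).most_common(1)[0]
    return top_count / n


# Deterministic stand-in for a model: 3 of every 4 answers agree.
_canned = itertools.cycle(
    ["refund approved", "refund approved", "refund approved", "refund denied"]
)


def fake_generate(prompt: str) -> str:
    return next(_canned)


rate = consistency_rate(fake_generate, "Can I get a refund?", n=8)
print(rate)  # 6 of the 8 outputs agree, so this prints 0.75
```

Exact string match is the crudest possible comparison. For free-text outputs you would likely swap in a looser equivalence check, but even this version turns "it feels inconsistent" into something you can compare before and after a change.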

Notice the principle: cheapest hypothesis tested first. Checking the temperature config takes 30 seconds. Auditing their entire prompt structure takes 2 hours. An interviewer grading "structured diagnosis" wants to see you order by cost-to-test.
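If it helps to see the ordering rule written down, here is a minimal sketch. The `Check` type and `order_by_cost` helper are made up for illustration. Only the 30-second and 2-hour figures come from the discussion above; the other two time estimates are placeholder assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str
    minutes_to_test: float  # rough estimate of how long the check takes


def order_by_cost(checks: list[Check]) -> list[Check]:
    """Return checks cheapest-first; sorted() is stable, so ties keep input order."""
    return sorted(checks, key=lambda c: c.minutes_to_test)


CHECKS = [
    Check("audit prompt structure", 120),              # ~2 hours
    Check("inspect retrieved chunks", 45),             # assumed estimate
    Check("check temperature / sampling config", 0.5), # ~30 seconds
    Check("confirm model version", 5),                 # assumed estimate
]

for check in order_by_cost(CHECKS):
    print(f"{check.minutes_to_test:>6} min  {check.name}")
```

A fuller version would weight cost against how likely each hypothesis is, but in an interview, sorting by cost alone already shows the structure.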