What is RAG? (SOLVED)
RAG has become a foundational architecture for production GenAI applications at companies like Notion, Duolingo, and Morgan Stanley. Interviewers expect you to explain the full retrieval pipeline, not just define the acronym. Follow along to master what RAG is, when to use it over fine-tuning, and how to articulate the trade-offs that separate junior from senior candidates.

TL;DR — Quick Answer
RAG combines retrieval from external knowledge bases with LLM generation, reducing hallucinations and enabling up-to-date answers without retraining.
The Interview Question
Explain Retrieval-Augmented Generation (RAG). How does it work, and when would you choose RAG over fine-tuning?
Deep Explanation
Retrieval-Augmented Generation (RAG) is an architecture that enhances LLM responses by retrieving relevant documents from a knowledge base before generation. Instead of relying solely on the model's parametric memory (the weights learned during training), RAG grounds each response in externally retrieved evidence, which reduces hallucinations and lets the model draw on knowledge that post-dates its training cutoff.
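The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the in-memory `KNOWLEDGE_BASE`, the bag-of-words cosine scoring (standing in for an embedding model and vector store), and the `generate` placeholder (standing in for a real LLM call) are all hypothetical names introduced here for the example.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge base; a real system would use a document store.
KNOWLEDGE_BASE = [
    "The refund window for annual plans is 30 days from purchase.",
    "Support is available by email from 9am to 5pm on weekdays.",
    "Annual plans can be upgraded to the team tier at any time.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query.

    Assumption: bag-of-words scoring is a toy stand-in for embeddings.
    """
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        docs,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, passages):
    """Place the retrieved passages in the prompt so the answer is grounded in them."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


def generate(prompt):
    """Hypothetical placeholder for an LLM call; it only reports the prompt size."""
    return f"[LLM output for a prompt of {len(prompt)} characters]"


def answer(query, docs=KNOWLEDGE_BASE, k=2):
    """Retrieve evidence first, then generate from the augmented prompt."""
    passages = retrieve(query, docs, k)
    return generate(build_prompt(query, passages))
```

Calling `retrieve("What is the refund window?", KNOWLEDGE_BASE, k=1)` selects the refund passage, and `build_prompt` then hands that passage to the model alongside the question. The key design point is that updating the knowledge base changes what the model can answer without touching its weights, which is the contrast with fine-tuning that the interview question asks about.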
The standard RAG pipeline has four stages: