RAG: Retrieval-Augmented Generation (Lewis et al., 2020)
The canonical, neutral primary source for retrieval-augmented-generation, and the one the wiki’s synthesis explicitly said it still wanted: until now the RAG framing came only from advocates of alternatives (llm-wiki, gbrain). The paper is “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks” by Patrick Lewis, Ethan Perez, et al. (Facebook AI Research, 2020). It coined “RAG” and defined the architecture that later systems build on.
What it introduced
A model with two memories:
- Parametric memory: a pre-trained seq2seq generator (BART), whose knowledge is stored in its weights.
- Non-parametric memory: a dense vector index of Wikipedia, searched by a pre-trained neural retriever, DPR (Dense Passage Retrieval).
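The retriever pulls the top-K passages for a query; the generator conditions on them to produce the answer. Two variants differ in how they combine those passages: RAG-Sequence uses the same retrieved passage for the whole output and marginalizes over the top-K passages at the sequence level, while RAG-Token marginalizes at every step, so a different passage can inform each generated token.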
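The difference between the two variants comes down to where the sum over retrieved passages sits. The following is a minimal sketch with toy numbers, not the paper's implementation: it assumes K retrieved passages with retriever weights p(z|x) and per-token generator probabilities p(y_i | x, z, y_<i). The function names and all values are hypothetical.

```python
from math import prod

def rag_sequence_prob(doc_priors, token_probs):
    """RAG-Sequence: sum over passages of p(z|x) * product over tokens.

    doc_priors[k]     = p(z_k | x), the retriever weight of passage k
    token_probs[k][i] = p(y_i | x, z_k, y_<i), the generator probability of
                        token i when conditioned on passage k
    """
    return sum(p_z * prod(tp) for p_z, tp in zip(doc_priors, token_probs))

def rag_token_prob(doc_priors, token_probs):
    """RAG-Token: product over tokens of the sum over passages."""
    n_tokens = len(token_probs[0])
    return prod(
        sum(p_z * tp[i] for p_z, tp in zip(doc_priors, token_probs))
        for i in range(n_tokens)
    )

# Toy case: two passages, two output tokens.
# Passage 0 supports only the first token, passage 1 only the second.
doc_priors = [0.5, 0.5]
token_probs = [
    [1.0, 0.0],  # generator conditioned on passage 0
    [0.0, 1.0],  # generator conditioned on passage 1
]

seq_p = rag_sequence_prob(doc_priors, token_probs)
tok_p = rag_token_prob(doc_priors, token_probs)
print(seq_p, tok_p)
```

With these toy numbers no single passage supports both tokens, so RAG-Sequence assigns the output probability 0.0, while RAG-Token assigns 0.25 because it can draw on a different passage for each token.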
Why it’s canonical
- Set a new state of the art on three open-domain QA tasks, outperforming both parametric-only seq2seq models and task-specific retrieve-and-extract pipelines.
- Generated “more specific, diverse, and factual” language than a parametric-only baseline.
- Named the two properties that became RAG’s whole selling point: provenance (you can cite which passage produced an answer) and updatable knowledge (swap the index, no retraining).
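Provenance and updatable knowledge can be illustrated with a toy pipeline. This is a sketch under loud assumptions: the word-overlap scorer is a stand-in for DPR's dense retrieval, the `answer` function is a hypothetical stub rather than a BART generator, and the passages are fictional. It only shows that the answer carries the id of the passage it came from, and that swapping the index changes the answer without touching the "model".

```python
from collections import Counter

def score(query, passage):
    """Toy relevance: number of shared lowercase words (stand-in for a dense dot product)."""
    q = Counter(query.lower().split())
    p = Counter(passage.lower().split())
    return sum((q & p).values())

def retrieve(query, index, k=1):
    """Return the top-k (passage_id, text) pairs from index, a dict of id -> text."""
    ranked = sorted(index.items(), key=lambda item: score(query, item[1]), reverse=True)
    return ranked[:k]

def answer(query, index):
    """Hypothetical generator stub: echoes the top passage and reports its id."""
    passage_id, text = retrieve(query, index, k=1)[0]
    return {"answer": text, "source": passage_id}

# Two versions of the non-parametric memory (fictional facts).
index_old = {
    "doc-a": "the capital of freedonia is oldtown",
    "doc-b": "bart is a seq2seq model",
}
index_new = {
    "doc-c": "the capital of freedonia is newtown",
    "doc-b": "bart is a seq2seq model",
}

query = "what is the capital of freedonia"
before = answer(query, index_old)  # cites doc-a
after = answer(query, index_new)   # same code, new index, cites doc-c
print(before)
print(after)
```

The `source` field is the provenance; replacing `index_old` with `index_new` is the update, and no retraining step appears anywhere in the sketch.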
How it lands in the wiki’s RAG critique
This is the definition that the wiki’s RAG-gap taxonomy critiques. It also dates the baseline to 2020, which anchors the advocate framings to a concrete design. The original paper’s pitch (provenance + updatability) is real and uncontested; the wiki’s added claims are about where naive vector RAG falls short: no-accumulation (llm-wiki), exact-token (BM25, hybrid-retrieval-rag), factual-connection (typed knowledge-graph, gbrain), and temporal-validity (temporal-knowledge-graph, agent-memory-knowledge-graphs). Having the primary source separates what RAG actually claimed from what its critics extrapolate. Lewis et al. never claimed accumulation or factual-graph connection; those are genuinely beyond the 2020 design, so the critiques extend rather than refute it.
Related
retrieval-augmented-generation · hybrid-retrieval-rag · knowledge-graph · llm-wiki · gbrain · temporal-knowledge-graph