LLM Wiki (gist) — source summary
By andrej-karpathy. A short “idea file” published as a GitHub gist, meant to be
pasted into an LLM agent so it can collaboratively instantiate a personal knowledge
base. Fetched 2026-05-29 from gist.github.com/karpathy/442a6bf555914893e9891c11519de94f.
What it argues
- Defines the llm-wiki pattern: an LLM incrementally builds and maintains a persistent, interlinked markdown wiki, rather than re-deriving knowledge per query.
- Positions this against retrieval-augmented-generation (RAG) as the status quo it improves on.
- Specifies a three-layer architecture (immutable raw sources / LLM-owned wiki / a schema file such as CLAUDE.md) and three operations (ingest, query, lint).
- Recommends two navigation files: a content-oriented index.md and a chronological log.md with a grep-parseable entry prefix.
- Surveys optional tooling: obsidian as the browsing “IDE”, qmd for search, plus Marp (slides), Dataview (frontmatter queries), and the Obsidian Web Clipper.
- Traces the lineage to vannevar-bush’s memex (1945).
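To make the "grep-parseable entry prefix" for log.md concrete, here is a minimal sketch. The exact prefix format (`## [YYYY-MM-DD] operation | title`) and the helper names `format_log_entry` and `parse_log` are assumptions made for illustration; this summary does not record the format the gist uses, only that entries should be chronological and easy to grep.

```python
import re
from datetime import date

# Hypothetical prefix format (an assumption, not taken from the gist):
#   ## [YYYY-MM-DD] <operation> | <title>
# The three operation names come from the summary above.
OPERATIONS = {"ingest", "query", "lint"}
LOG_ENTRY = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (ingest|query|lint) \| (.+)$")


def format_log_entry(day: date, operation: str, title: str) -> str:
    """Build one log.md heading line in the assumed prefix format."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    return f"## [{day.isoformat()}] {operation} | {title}"


def parse_log(text: str) -> list[tuple[str, str, str]]:
    """Return (date, operation, title) for every line matching the prefix."""
    entries = []
    for line in text.splitlines():
        match = LOG_ENTRY.match(line)
        if match:
            entries.append(match.groups())
    return entries
```

With a fixed prefix like this, a plain `grep '^## \['` over log.md would list every entry heading without any parsing code, which is the property the recommendation is after.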
Key claims
- The bottleneck in knowledge bases is bookkeeping, not reading or thinking; humans abandon wikis because maintenance cost grows faster than value.
- LLMs make maintenance cost near-zero (they don’t get bored and can touch ~15 files per pass), so the wiki stays current and becomes the compounding “persistent artifact.”
- Division of labor: the human curates sources, directs analysis, and asks questions; the LLM does everything else.
- The index-file approach scales “surprisingly well” to ~100 sources / hundreds of pages without embedding-based RAG infrastructure.
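The index-file claim can be illustrated with a rough sketch of index-driven lookup that uses no embeddings. Everything specific here is an assumption: the index line shape (`- [Title](path.md): summary`), the function names, and the keyword-overlap scoring, which is only a stand-in for the LLM reading index.md and choosing pages itself.

```python
import re

# Assumed index.md line shape (hypothetical): "- [Title](path.md): one-line summary"
INDEX_LINE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<path>[^)]+)\)(?:: (?P<summary>.*))?$"
)


def parse_index(index_text: str) -> list[dict[str, str]]:
    """Extract title, path, and summary from each index line that matches."""
    pages = []
    for line in index_text.splitlines():
        match = INDEX_LINE.match(line.strip())
        if match:
            pages.append({
                "title": match["title"],
                "path": match["path"],
                "summary": match["summary"] or "",
            })
    return pages


def candidate_pages(index_text: str, question: str, limit: int = 5) -> list[str]:
    """Rank pages by how many question words appear in their title or summary."""
    terms = {t for t in re.findall(r"[a-z0-9]+", question.lower()) if len(t) > 2}
    scored = []
    for page in parse_index(index_text):
        text = f"{page['title']} {page['summary']}".lower()
        words = set(re.findall(r"[a-z0-9]+", text))
        score = len(terms & words)
        if score:
            scored.append((score, page["path"]))
    # Highest overlap first; ties broken alphabetically by path.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored[:limit]]
```

The point of the sketch is the shape of the workflow: one small text file is read in full, a handful of page paths come out, and only those pages are opened, so no vector store is involved at this scale.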
So what
This is the founding document for this wiki — its design choices (schema.org typing, index-by-type, no-git/no-Obsidian) are a concrete instantiation of the abstract pattern described here. The doc is deliberately implementation-agnostic and explicitly modular (“pick what’s useful, ignore what isn’t”).