LLM Wiki (pattern)
A pattern for personal knowledge bases in which an LLM incrementally builds and maintains a persistent, interlinked markdown wiki that sits between a person and their raw sources. Defined in llm-wiki-gist.
Core idea
The wiki is a persistent, compounding artifact. Rather than re-discovering knowledge per query (as in retrieval-augmented-generation), the LLM compiles each source into the wiki once and keeps it current — cross-references, flagged contradictions, and synthesis are already in place. The wiki gets richer with every source ingested and every question asked.
Architecture — three layers
- Raw sources — immutable curated documents; the source of truth.
- The wiki — LLM-owned markdown pages (summaries, entity/concept pages, an evolving synthesis). The LLM creates, updates, and cross-references these.
- The schema — a config file (e.g. CLAUDE.md) defining structure, conventions, and workflows; co-evolved with use. This is what makes the LLM a disciplined maintainer rather than a generic chatbot.
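As a rough illustration of the three layers on disk, here is a minimal scaffold sketch. Only CLAUDE.md and the raw/wiki split come from this page; placing index, log, and synthesis inside wiki/ and the `scaffold` helper itself are assumptions made for the example, not something the gist prescribes.

```python
from pathlib import Path

# Hypothetical layout: the file placement below is an assumption for illustration.
LAYOUT = {
    "dirs": ["raw", "wiki"],  # raw sources (immutable) and LLM-owned pages
    "files": ["CLAUDE.md", "wiki/index.md", "wiki/log.md", "wiki/synthesis.md"],
}


def scaffold(root: Path) -> list[str]:
    """Create the three layers under root; return the relative paths of the layout."""
    paths = []
    for d in LAYOUT["dirs"]:
        (root / d).mkdir(parents=True, exist_ok=True)
        paths.append(d + "/")
    for f in LAYOUT["files"]:
        path = root / f
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            # Seed each file with a title so it is a valid (if empty) page.
            path.write_text(f"# {path.stem}\n", encoding="utf-8")
        paths.append(f)
    return sorted(paths)
```

Calling `scaffold(Path("my-wiki"))` leaves existing files untouched, so it is safe to rerun on a wiki that is already in use.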
Three operations
- Ingest — add a source; the LLM summarizes it, updates entity/concept pages and the synthesis, and logs it (~10–15 pages touched).
- Query — ask a question; the LLM answers with citations and can file good answers back as new pages so explorations compound.
- Lint — periodic health check for contradictions, stale claims, orphans, missing pages/links, and data gaps.
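Two of the lint checks above, orphans and missing pages, are mechanical enough to sketch without an LLM. The example below assumes pages link to each other with Obsidian-style `[[slug]]` wikilinks and that `index` is the only page allowed to have no inbound links; both are assumptions for illustration, and the `lint` function is hypothetical rather than part of any tool named here. Contradictions and stale claims need a model's judgment and are out of scope for this sketch.

```python
import re

# Captures the target of [[target]], [[target|alias]], or [[target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def lint(pages: dict[str, str],
         roots: frozenset[str] = frozenset({"index"})) -> dict[str, list[str]]:
    """Report link targets with no page, and pages with no inbound links.

    pages maps each page slug to its markdown body.
    """
    inbound = {slug: 0 for slug in pages}
    missing = set()
    for slug, body in pages.items():
        for target in WIKILINK.findall(body):
            target = target.strip()
            if target == slug:
                continue  # a self-link does not rescue an orphan
            if target in pages:
                inbound[target] += 1
            else:
                missing.add(target)
    orphans = [s for s, n in inbound.items() if n == 0 and s not in roots]
    return {"missing": sorted(missing), "orphans": sorted(orphans)}


pages = {
    "index": "Start at [[memex]] or [[gbrain]].",
    "memex": "Proposed by [[vannevar-bush]].",
    "gbrain": "Compare with [[memex|the memex]].",
    "stray": "Nothing links here.",
}
report = lint(pages)
# report == {"missing": ["vannevar-bush"], "orphans": ["stray"]}
```

The report is plain data, so an agent running the lint workflow could turn each entry into a to-do: create the missing page, or link the orphan from somewhere relevant.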
Why it works
The hard part of a knowledge base is bookkeeping, which humans abandon. LLMs make maintenance near-free, so the wiki stays maintained. Division of labor: human curates and directs; LLM does the rest. Lineage traces to vannevar-bush’s memex: the maintenance problem Bush couldn’t solve is the one the LLM handles.
Tooling (optional, modular)
obsidian (browsing/graph view), qmd (search at scale), Marp (slides), Dataview (frontmatter queries). All optional — “pick what’s useful.”
Compared to gbrain
gbrain (garry-tan) is the same idea taken further: where the LLM Wiki keeps a human in the loop and navigates by a hand-maintained index, GBrain adds an autonomous “dream cycle” daemon, a self-wiring knowledge-graph, and hybrid vector+BM25 retrieval at ~100K+ pages. The LLM Wiki is the minimal, legible end of this family; GBrain is the maximal, automated end. Both treat a git repo of markdown as the system of record.
Implementations
- gbrain — heavyweight, automated (DB + vector + knowledge graph + daemon).
- llm-wiki-agent — lightweight coding-agent skill, markdown-only, no API key; a near-twin of this wiki.
Together with this wiki itself, that’s multiple independent realizations converging on the same raw/ → wiki/ + index/log/synthesis + ingest/query/lint design, which is evidence that the pattern is a natural attractor, not one author’s idiosyncrasy.
Related
retrieval-augmented-generation · memex · associative-trails · gbrain · llm-wiki-agent · vannevar-bush · andrej-karpathy
Linked from
- index
- log
- synthesis
- agent-memory-knowledge-graphs
- agent-skills
- andrej-karpathy
- as-we-may-think
- associative-trails
- augmenting-human-intellect
- claude-fable-5
- compound-engineering
- douglas-engelbart
- enterprise-context-layer
- gbrain
- knowledge-graph
- lean-theorem-prover
- llm-wiki-agent
- llm-wiki-gist
- logseq
- memex
- microfilm
- model-context-protocol
- niklas-luhmann
- notion
- obsidian
- qmd
- rag-original-paper
- retrieval-augmented-generation
- roam-research-guide
- roam-research
- spaced-repetition
- tana
- tana-supertags-guide
- tech-adoption-curve-twenty-years
- technology-adoption-curve
- tiddlywiki5
- temporal-knowledge-graph
- tools-for-thought
- vannevar-bush
- zettelkasten-introduction
- zettelkasten