Software Source Code source updated Fri May 29 2026 00:00:00 GMT+0000 (Coordinated Universal Time)

GBrain — source summary

By garry-tan (President & CEO of Y Combinator). An open-source (MIT) “brain layer” for AI agents: an LLM-maintained personal/team knowledge base in markdown that adds synthesis, a self-wiring knowledge graph, and gap analysis on top of search. Delivered via Telegram, ingested 2026-05-29. README in raw/gbrain.md.

On-thesis: this is the most direct comparison point to llm-wiki in the wiki. It belongs to the same family (LLM + persistent markdown brain, git repo as system of record) but pushes much further toward automation and scale. It also bridges to the agent tooling now in the sibling agentic-tooling-wiki (links below resolve cross-wiki): it runs over model-context-protocol, plugs into claude-cowork, and is the memory layer of gstack, i.e. a cluster-A knowledge base used as the persistence layer of a coding harness. This is the cleanest single seam between the two wikis.

What it is

The “production brain” behind Tan’s OpenClaw and Hermes agent deployments — 146,646 pages, 24,585 people, 5,339 companies, 66 cron jobs. The agent ingests meetings, emails, tweets, voice calls, and ideas continuously; enriches every person/company; “fixes its own citations and consolidates memory overnight.” Also runs as a multi-user company brain (per-login scoped access, OAuth, fuzz-tested for zero cross-tenant leaks) — explicitly aimed at YC’s “company-brain” Request for Startups.

How it differs from llm-wiki

Autonomous “dream cycle” vs. in-conversation maintenance: cron jobs run while you sleep — dedup people pages, fix citations, score salience, find contradictions, prep tomorrow’s tasks. Maintenance is not just cheap (the LLM Wiki claim) but continuous and unattended. This is the strongest answer yet to Bush’s “who maintains the trails.”
Self-wiring knowledge-graph: every page write extracts entity refs and writes typed edges (works_at, invested_in, founded, advises, attended, …) with zero LLM calls — automated, scaled associative-trails.
Synthesis layer (gbrain think): cited prose answers plus explicit gap analysis (flags stale pages, uncited claims, contradictions, holes) — the LLM Wiki’s synthesis + lint, exposed as a query.
Hybrid retrieval (vector HNSW on pgvector + BM25 + reciprocal-rank fusion + reranker + per-query graph signals) vs. the LLM Wiki’s index-only navigation.
Schema packs: a typed page taxonomy (gbrain-base-v2: person, company, concept, source, deal, …) that the agent can evolve — independent convergence on typed pages, like this wiki’s schema.org choice.
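
The hybrid retrieval item above names reciprocal-rank fusion as the step that combines the vector and BM25 result lists. The following is a minimal sketch of that technique, not GBrain's actual code: the function name, the sample page slugs, and the constant k=60 (a commonly used default) are assumptions for illustration, and the reranker and per-query graph signals are left out.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of page ids into one ranking.

    Each list contributes 1 / (k + rank) to a page's score, where rank
    starts at 1. k=60 is an assumed default, not a value from the source.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, page_id in enumerate(ranking, start=1):
            scores[page_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by page id for determinism.
    return sorted(scores, key=lambda page_id: (-scores[page_id], page_id))


# Hypothetical result lists from two retrievers for the same query.
vector_hits = ["people/alice", "companies/acme", "concepts/rag"]
bm25_hits = ["companies/acme", "concepts/rag", "people/bob"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Pages that appear in both lists (here companies/acme and concepts/rag) rise above pages that only one retriever found, which is the property that makes the fusion useful when the two retrievers' raw scores are not comparable.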

Benchmarks (see retrieval-augmented-generation)

P@5 of 49.1% and R@5 of 97.9% on a 240-page rich-prose corpus. The graph adds 31.4 points of P@5 over a graph-disabled variant and over ripgrep-BM25 + vector-only RAG. GBrain supports the public LongMemEval benchmark; BrainBench scorecards are in the sibling gbrain-evals.
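
For readers unfamiliar with the metrics, here is a small sketch of how P@5 and R@5 are conventionally computed for a single query. The function names and the sample data are illustrative assumptions; the benchmark's own scoring code is not described in this summary.

```python
def precision_at_k(retrieved, relevant, k=5):
    """Fraction of the top-k slots that hold a relevant page."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for page_id in retrieved[:k] if page_id in relevant)
    return hits / k


def recall_at_k(retrieved, relevant, k=5):
    """Fraction of all relevant pages that appear in the top k."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


# Hypothetical query with only two relevant pages in the corpus.
retrieved = ["a", "b", "c", "d", "e"]
relevant = {"a", "c"}

p5 = precision_at_k(retrieved, relevant)
r5 = recall_at_k(retrieved, relevant)
print(p5, r5)
```

In this example both relevant pages are found, so R@5 is 1.0, while P@5 is only 0.4 because three of the five slots cannot be filled with relevant pages. In general, P@5 is capped below 100% for any query with fewer than five relevant pages, so a large gap between the two metrics is not by itself a sign of poor ranking.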

Architecture notes

Brain repo = a git repo of markdown (system of record), synced into Postgres for retrieval. Two engines, one contract: PGLite (zero-config, ~50K pages) and Postgres+pgvector (shared/large). Exposes 30+ tools over model-context-protocol (stdio + HTTP/OAuth). Ships 43 markdown agent-skills as a skillpack; designed to be installed and operated by an agent.
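
The "two engines, one contract" idea and the markdown-to-database sync can be sketched as follows. This is a hedged illustration only: BrainEngine, InMemoryEngine, upsert_page, search, and sync_repo are hypothetical names, and the in-memory class stands in for a real backend such as PGLite or Postgres+pgvector, whose actual interfaces are not given in this summary.

```python
from pathlib import Path
from typing import Protocol


class BrainEngine(Protocol):
    """Hypothetical contract that every storage engine would satisfy."""

    def upsert_page(self, slug: str, markdown: str) -> None: ...

    def search(self, query: str, limit: int = 5) -> list[str]: ...


class InMemoryEngine:
    """Stand-in engine for illustration; a real one would talk to a database."""

    def __init__(self) -> None:
        self.pages: dict[str, str] = {}

    def upsert_page(self, slug: str, markdown: str) -> None:
        self.pages[slug] = markdown

    def search(self, query: str, limit: int = 5) -> list[str]:
        # Naive case-insensitive substring match, ordered by slug.
        needle = query.lower()
        hits = [
            slug
            for slug, text in sorted(self.pages.items())
            if needle in text.lower()
        ]
        return hits[:limit]


def sync_repo(repo_root: Path, engine: BrainEngine) -> int:
    """Copy every markdown file under repo_root into the engine.

    The repo stays the system of record; the engine is a derived index.
    Returns the number of pages synced.
    """
    count = 0
    for path in sorted(repo_root.rglob("*.md")):
        slug = path.relative_to(repo_root).with_suffix("").as_posix()
        engine.upsert_page(slug, path.read_text(encoding="utf-8"))
        count += 1
    return count
```

Because sync_repo depends only on the contract, the same call would work against any engine that implements upsert_page and search, which is the design property the summary attributes to the PGLite and Postgres+pgvector pair.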

Related

llm-wiki · llm-wiki-agent · knowledge-graph · associative-trails · memex · retrieval-augmented-generation · qmd · model-context-protocol · claude-cowork · garry-tan
