About

The operating manuals behind the hub. Rendered verbatim as text.

HUB.md

# Hub — Router Operating Manual

> This dir (`/home/claude-dev/projects/`) is the **hub**. The wikis beside it are
> **spokes**. A session started here acts as the **router**: it classifies each
> incoming source and dispatches it to the right spoke. A session started *inside* a
> spoke ingests there directly and ignores this file.

## Files

- `wikis.md` — the routing registry (domain → spoke). Read it first.
- `QUALITY.md` — the shared **quality rubric** (source tiers, freshness, the 4 dimensions, the ingest
  gate, the Quality Cycle, the scorecard). All quality operations reference it.
- `quality-log.md` — append-only per-spoke quality scorecards.
- `log.md` — append-only log of routing decisions (`route`, `park`, `split`).
- `_inbox/` — holding pen for sources that match no spoke. NOT a wiki.
- `CLAUDE.template.md` — template for scaffolding a new spoke on spin-out.

## Routing flow (never blocks)

A source arriving via the Telegram channel is an implicit "ingest this." The router
**never asks the human and never waits for permission** — the routing decision
*replaces* the old topic-fit question.

1. **Fetch + read** the source.
2. **Classify** (LLM judgment) against `wikis.md`:
   - Read each spoke's `domain`; `keywords`/`sample-pages` are tiebreakers only.
   - **Clear match to exactly one spoke** → route there (step 3).
   - **Plausibly two spokes** → pick the most specific; note the runner-up in the log.
   - **Broad source matching three+ spokes** (e.g. a vendor roundup) → identify the
     **dominant in-scope substance**, route the whole source to the single best-fit
     spoke, and ingest only that substance; record the other facets as **cross-spoke
     context inside the source-summary page** (not as separate pages, not parked). Log
     every runner-up spoke. Do **not** split one source across spokes.
   - **No clear match** → park (step 4).
3. **Route:** open the destination spoke's `CLAUDE.md` and perform its documented
   **Ingest** operation with that spoke as the working directory (write the source
   summary, touch its Thing pages, update its synthesis/index, append its `ingest`
   log entry). Then append a `route` entry to this hub `log.md`.
4. **Park:** write `_inbox/<slug>.md` (format below) and append a `park` entry here.
   Do **not** create a wiki for a single stray.
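
The branching in step 2 can be sketched as below. This is a minimal illustration, not the router's actual code: it assumes the LLM judgment has already produced a list of plausible spokes ordered best fit first, and `Decision` and `decide` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Decision:
    action: str                       # "route" or "park"
    spoke: Optional[str] = None       # destination spoke when routing
    runners_up: List[str] = field(default_factory=list)


def decide(candidates: List[str]) -> Decision:
    """Map a ranked candidate list (best fit first) to a routing decision.

    `candidates` is assumed to come from reading `wikis.md`; this function
    only encodes the never-block branching, not the classification itself.
    """
    if not candidates:
        # No clear match: park in _inbox/ rather than ask the human.
        return Decision(action="park")
    # One, two, or three+ matches all send the whole source to a single
    # spoke; every other candidate is logged as a runner-up.
    return Decision(action="route", spoke=candidates[0], runners_up=candidates[1:])
```

Every branch returns a decision, so there is no path on which the router waits for input.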

## Holding pen — `_inbox/`

A parked source is a lightweight record (this is neutral hub storage, NOT a spoke's
immutable `raw/`):

```yaml
---
url: https://…
fetched: YYYY-MM-DD
cluster: <tentative-topic-tag>
---

<2–3 sentence summary + one line on why it did not route to an existing spoke>
```
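
Writing a record in this shape might look like the sketch below. The function names `render_parked` and `park` are hypothetical; only the frontmatter keys and the `_inbox/<slug>.md` path come from this manual.

```python
from datetime import date
from pathlib import Path


def render_parked(url: str, cluster: str, note: str, fetched: date) -> str:
    """Render a parked-source record in the frontmatter shape shown above."""
    return (
        "---\n"
        f"url: {url}\n"
        f"fetched: {fetched.isoformat()}\n"
        f"cluster: {cluster}\n"
        "---\n"
        "\n"
        f"{note.strip()}\n"
    )


def park(inbox: Path, slug: str, record: str) -> Path:
    """Write the record to `_inbox/<slug>.md` and return its path."""
    inbox.mkdir(parents=True, exist_ok=True)
    path = inbox / f"{slug}.md"
    path.write_text(record, encoding="utf-8")
    return path
```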

## Spin-out — create a new spoke

**Trigger:** when **≥3 sources cohere into a clear topic cluster** — counting both
`_inbox/` records (same `cluster:` tag, or evident on review) **and** a trigger source
arriving now that completes the cluster (it need not be parked first). Single/double
strays just wait.
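
The mechanical half of the trigger (matching `cluster:` tags) can be tallied as in this sketch. It covers only exact tag matches; clusters that are "evident on review" still need judgment. `ready_clusters` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable, List, Optional

SPIN_OUT_THRESHOLD = 3  # the ">=3 sources" rule from the trigger


def ready_clusters(parked_tags: Iterable[str],
                   incoming_tag: Optional[str] = None) -> List[str]:
    """Return cluster tags that have reached the spin-out threshold.

    `parked_tags` are the `cluster:` values of the current `_inbox/` records;
    `incoming_tag` is the tentative tag of a source arriving now, which counts
    toward the total without being parked first.
    """
    tally = Counter(parked_tags)
    if incoming_tag is not None:
        tally[incoming_tag] += 1
    return sorted(tag for tag, n in tally.items() if n >= SPIN_OUT_THRESHOLD)
```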

Procedure:
1. `mkdir -p <topic>-wiki/raw <topic>-wiki/wiki`
2. Copy `CLAUDE.template.md` → `<topic>-wiki/CLAUDE.md`; fill `<<WIKI_TITLE>>`,
   `<<DOMAIN_ONE_LINER>>`, `<<DATE>>`.
3. Create empty spine files in the new spoke: `index.md`, `log.md`, `synthesis.md`
   (with their usual headers).
4. Add a registry block for the new spoke in `wikis.md` (domain copied from the new
   `CLAUDE.md` header).
5. Ingest each clustered source into the new spoke via its normal Ingest; delete the
   corresponding `_inbox/<slug>.md` records.
6. Append a `split` entry to this hub `log.md`.
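
Steps 1 to 3 could be scripted roughly as follows. This is a sketch under assumptions: `scaffold_spoke` is a hypothetical helper, the spine files are assumed to sit at the spoke root, and they are created empty because their "usual headers" are not spelled out here.

```python
from pathlib import Path

PLACEHOLDERS = ("<<WIKI_TITLE>>", "<<DOMAIN_ONE_LINER>>", "<<DATE>>")
SPINE_FILES = ("index.md", "log.md", "synthesis.md")


def scaffold_spoke(hub: Path, topic: str, title: str, domain: str, day: str) -> Path:
    """Create `<topic>-wiki/` with raw/, wiki/, a filled CLAUDE.md, and spine files."""
    spoke = hub / f"{topic}-wiki"
    for sub in ("raw", "wiki"):
        (spoke / sub).mkdir(parents=True, exist_ok=True)

    # Fill the three template placeholders named in step 2.
    text = (hub / "CLAUDE.template.md").read_text(encoding="utf-8")
    for placeholder, value in zip(PLACEHOLDERS, (title, domain, day)):
        text = text.replace(placeholder, value)
    (spoke / "CLAUDE.md").write_text(text, encoding="utf-8")

    # Spine files start empty; headers are left for the operator to add.
    for name in SPINE_FILES:
        path = spoke / name
        if not path.exists():
            path.touch()
    return spoke
```

The registry block, the ingests, and the `split` log entry (steps 4 to 6) are deliberately left out, since they depend on each spoke's own Ingest operation.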

## Quality (see `QUALITY.md`)

Routing places a source; **quality** keeps the corpus good. Three touchpoints, the first two defined in
`QUALITY.md` and the third in `ENTITIES.md`:

- **Ingest quality gate (soft).** Every ingest records the source **tier** (T1–T4), dedups before
  creating, checks gap-relevance, integrates into synthesis, and sources every claim. A weak source is
  still ingested with the weakness recorded — the gate **never blocks or drops** a source.
- **Quality Cycle (periodic).** The reframed daily loop (memory `daily-spoke-expansion-loop`) audits
  every spoke **weakest-scorecard-first**, writes a scorecard to `quality-log.md`, and fixes the
  highest-value issues (gaps, staleness, orphans/dupes/broken links, synthesis drift). A pass may
  improve a spoke with **zero new pages**.
- **Entity discovery (see `ENTITIES.md`).** Every ingest also creates pages for the source's agents and
  context (people, organizations, events, series) as canonical, schema.org-typed nodes, which makes the
  graph denser. Reuse existing nodes cross-wiki via `cd site && npm run entity-index`; respect the
  recursion hard stop.

**Commit-per-run:** a routing run *or* a Quality-Cycle run is "done" only when the site rebuild + verify
is **green AND the run is committed** with a `route:`/`ingest:`/`park:`/`quality:`/`split:`/`merge:`/`chore:`
message prefix. The repo is local-only; do not push.

## Definition of done

A routing action isn't finished until every file it touches is consistent. Run the
matching checklist before you stop.

**`route`** (ingested into a spoke):
- [ ] Source summary written in the spoke (`source: true`; `sources:` or `url:`).
- [ ] Thing pages created/updated with `[[links]]` + provenance.
- [ ] Spoke `synthesis.md` folded in (contradictions flagged); spoke `index.md` updated.
- [ ] Spoke `log.md` `ingest` entry **and** hub `log.md` `route` entry (with runner-up).
- [ ] Site rebuilt + verified green (`cd site && rm -rf .astro dist node_modules/.astro && npm run build && npm run verify`).

**`park`** (held in `_inbox/`):
- [ ] `_inbox/<slug>.md` written (frontmatter: `url`, `fetched`, `cluster`).
- [ ] Hub `log.md` `park` entry, including the updated cluster tally.
- [ ] No site rebuild needed (`_inbox/` is not rendered).

**`split`** (new spoke spun out):
- [ ] `<topic>-wiki/{raw,wiki}` scaffolded; `CLAUDE.md` filled from the template.
- [ ] Spine files created (`index.md`, `log.md`, `synthesis.md`) with a `split` log entry.
- [ ] Registry block added to `wikis.md`; hub `log.md` `split` entry.
- [ ] Each clustered source ingested; its `_inbox/<slug>.md` deleted.
- [ ] Adjacency notes updated on any sources left parked that referenced the migrated ones.
- [ ] Site rebuilt + verified green (clearing both `.astro` and `node_modules/.astro` is required after page moves).

## Edge handling

- **Fetch fails (WebFetch blocked / 403 anti-bot / JS-only SPA):** before settling for a
  stub, **fall back to the `firecrawl` skill**: `python3 ~/.claude/skills/firecrawl/scripts/firecrawl.py scrape "<url>"`
  returns clean markdown for pages WebFetch can't read (Reuters-class hard-blocks, JS-rendered
  SPAs). A hard WebFetch block is **not** a transient socket error, so the retry-3× rule in
  `CLAUDE.md` does not apply; it is the case Firecrawl exists for, and a blocked fetch is no
  longer a reason to ingest blind. Only if Firecrawl also fails (a paywall it can't bypass, a
  `429` credit limit) do you route by URL/title/context and record an honest stub in the
  destination spoke.
- **Re-seen source:** if the URL already maps to a page in a spoke, refresh that page
  in place; do not create an `_inbox` duplicate.
- **Registry vs spoke header drift:** the spoke `CLAUDE.md` header wins; correct
  `wikis.md`.
- **Ambiguous match:** never blocks — pick the most specific spoke and log the
  runner-up so a human can re-file later.
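
The fetch-failure order above can be sketched as a small fallback chain. The two fetchers are passed in as callables because their real interfaces are not defined here; `HardBlock` and `fetch_with_fallback` are hypothetical names, and in practice `firecrawl_scrape` would wrap the `firecrawl.py scrape` command.

```python
from typing import Callable, Optional, Tuple


class HardBlock(Exception):
    """Hypothetical marker for a 403 / anti-bot / JS-only fetch failure."""


def fetch_with_fallback(
    url: str,
    web_fetch: Callable[[str], str],
    firecrawl_scrape: Callable[[str], str],
) -> Tuple[Optional[str], str]:
    """Return (markdown, how) where how is 'webfetch', 'firecrawl', or 'stub'."""
    try:
        return web_fetch(url), "webfetch"
    except HardBlock:
        pass  # a hard block is not transient: skip retries, go to the fallback
    try:
        return firecrawl_scrape(url), "firecrawl"
    except Exception:
        # Paywall or 429 credit limit: the caller routes by URL/title/context
        # and records an honest stub in the destination spoke.
        return None, "stub"
```

Only `HardBlock` triggers the fallback; any other exception from `web_fetch` propagates, so transient errors can still be handled by the separate retry rule.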

## Log format — `log.md`

```
## [YYYY-MM-DD] route | <title> → <spoke>   (runner-up: <spoke|none>)
## [YYYY-MM-DD] park  | <title> → _inbox     (cluster: <tag>)
## [YYYY-MM-DD] split | <topic>-wiki created from _inbox (<n> sources)
```
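
Formatting these entries consistently is easy to get wrong by hand. The helpers below are a hypothetical sketch that reproduces the three line shapes above, including their spacing.

```python
from datetime import date
from typing import Optional


def route_entry(day: date, title: str, spoke: str, runner_up: Optional[str] = None) -> str:
    """Format a `route` line; a missing runner-up is written as 'none'."""
    return f"## [{day.isoformat()}] route | {title} → {spoke}   (runner-up: {runner_up or 'none'})"


def park_entry(day: date, title: str, cluster: str) -> str:
    """Format a `park` line with the tentative cluster tag."""
    return f"## [{day.isoformat()}] park  | {title} → _inbox     (cluster: {cluster})"


def split_entry(day: date, topic: str, n_sources: int) -> str:
    """Format a `split` line for a newly created spoke."""
    return f"## [{day.isoformat()}] split | {topic}-wiki created from _inbox ({n_sources} sources)"
```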

## Previewing the site (Cloudflare tunnel)

To share a live preview of the rendered site (`site/`) over a public URL:

1. **Build first if stale:** `cd site && npm run build` (output is static HTML in `site/dist/`).
2. **Serve the *static* `dist/`, not `astro preview`.** Astro/Vite's preview server rejects
   requests whose `Host` header it doesn't recognize (the tunnel domain) → **HTTP 403**. A plain
   static server has no host check:
   `python3 -m http.server 4321 --bind 127.0.0.1 --directory dist` (run in background).
3. **Get `cloudflared`** (not preinstalled; host is **aarch64**):
   `curl -fsSL -o /tmp/cloudflared https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-arm64 && chmod +x /tmp/cloudflared`
4. **Open a quick tunnel:** `/tmp/cloudflared tunnel --url http://localhost:4321` (background) —
   grab the `https://<random>.trycloudflare.com` URL it prints (`grep -oE 'https://[a-z0-9-]+\.trycloudflare\.com'`).
   Verify with `curl` before sharing.
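
If the tunnel output is captured to a file or string, the URL can also be pulled out in Python with the same pattern as the `grep` above. `find_tunnel_url` is a hypothetical helper, and the exact wording of `cloudflared`'s log lines is not assumed.

```python
import re
from typing import Optional

# Same pattern as the grep in step 4.
TUNNEL_URL_RE = re.compile(r"https://[a-z0-9-]+\.trycloudflare\.com")


def find_tunnel_url(log_text: str) -> Optional[str]:
    """Return the first quick-tunnel URL found in the captured output, if any."""
    match = TUNNEL_URL_RE.search(log_text)
    return match.group(0) if match else None
```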

It's an **ephemeral, public, no-auth** quick tunnel (anyone with the link can view; dies when the
process/session ends). **Tear down** when asked:
`pkill -f "cloudflared tunnel"; fuser -k 4321/tcp` (then confirm the URL returns 530/unreachable).
Alternative no-install tunnel: `ssh -R 80:localhost:4321 nokey@localhost.run` — but the same
static-server caveat applies.

CLAUDE.md

# Spokes.wiki — Operating Manual (entry point)

This directory (`/home/claude-dev/projects/`) is the **hub** of a family of
**LLM-maintained research wikis**. Each subdirectory ending in `-wiki/` is a **spoke**:
a self-contained wiki with its own `CLAUDE.md`, `raw/`, `wiki/`, and spine files
(`index.md`, `log.md`, `synthesis.md`). The human curates sources (often via Telegram);
you (the LLM) do all the bookkeeping.

## Which mode am I in?

- **Session started *inside* a spoke** (e.g. `cloud-wiki/`) → ignore this file; follow
  that spoke's own `CLAUDE.md`. You are just maintaining that one wiki.
- **Session started *here* (the hub root)** → you are the **ROUTER**. Your job is to
  send each incoming source to the right spoke. **Read `HUB.md` and follow it.**

## Router prime directive (detail lives in `HUB.md`)

A source (URL/document) arriving — especially via Telegram — is an implicit
"ingest this." **Never block on permission or topic fit.** Then, per `HUB.md`:

1. **Fetch + read** the source.
2. **Classify** it (LLM judgment) against the spoke domains in **`wikis.md`**.
3. **Route** to the best-matching spoke — adopt that spoke's `CLAUDE.md` and run its
   Ingest — and log a `route` entry in `log.md`.
4. **No clear match** → **park** it in `_inbox/` and log a `park` entry. Don't create a
   wiki for a single stray.
5. **Spin out** a new spoke (scaffold from `CLAUDE.template.md`) only once **≥3 sources
   cohere into a clear topic cluster** — whether already parked in `_inbox/` or arriving
   together (the trigger source counts toward the ≥3); log a `split` entry.

**Too few vs. too many matches.** A source matching *no* spoke is the park case (step 4).
A source matching *several* is the opposite: **route the dominant in-scope substance to
the single most-specific spoke, ingest there, and note the rest as cross-spoke context**
in the source page — don't fragment across spokes or park a partial fit. Log the
runner-up spoke(s). (Detail in `HUB.md` → Classify.)

The routing decision *replaces* the old "does this fit?" question — make the call,
record it, and let the human correct afterward.

### Transient API errors — retry, don't abandon

`API Error: The socket connection was closed unexpectedly` (and similar transient
network/socket failures) is **not** a "can't fetch / no match" signal — it's a flaky
connection. **Retry the same operation** (the `WebFetch`/fetch, the tool call) before
treating it as a real failure:

- **Retry up to 3 times** with short **exponential backoff** (~2s, 4s, 8s).
- Only the **failed call** is retried — never re-ingest or duplicate pages that already
  succeeded; if a page was partly written, resume from where it stopped (Ingest is
  idempotent — re-running refreshes a summary in place, it doesn't duplicate).
- If all 3 retries fail, **don't park or drop the source** for this reason — report the
  error to the human (via the Telegram `reply` tool when the source came that way) and
  hold the source so it can be retried later. A transient error never counts as a routing
  decision and never gets a `log.md` entry.
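
The retry rule can be sketched as follows. `TransientError` and `retry_transient` are hypothetical names; the real tool calls and their error types are not specified here, and the `sleep` parameter exists only so the delays can be observed without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a closed-socket style failure."""


def retry_transient(
    call: Callable[[], T],
    retries: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`; on TransientError retry up to `retries` times (2s, 4s, 8s)."""
    attempt = 0
    while True:
        try:
            return call()
        except TransientError:
            if attempt == retries:
                # All retries spent: report to the human and hold the source.
                raise
            sleep(base_delay * 2 ** attempt)
            attempt += 1
```

Only the failed call is wrapped, so work that already succeeded is never repeated by the retry itself.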

## Hub files

| File | Role |
|------|------|
| `HUB.md` | Full router manual — classification, routing flow, holding pen, spin-out. **Read this when routing.** |
| `wikis.md` | Routing registry: one `domain:` block per spoke. The classification table. Source of truth for a spoke's domain is its own `CLAUDE.md` header. |
| `log.md` | Append-only hub log of routing decisions (`route` / `park` / `split`). |
| `_inbox/` | Holding pen for unrouted sources. **Not** a wiki; not a spoke's immutable `raw/`. |
| `CLAUDE.template.md` | Template a new spoke is scaffolded from on spin-out. |

## Current spokes

See `wikis.md` for the authoritative list and domains (14 spokes as of 2026-06-15).
Spoke lineage — merges, renames, and spin-outs — lives in `log.md`.

## Invariants

- **Never edit or delete anything in any spoke's `raw/`** — it is the source of truth.
- **Provenance always:** every claim in a spoke traces to a source via inline
  `[[source-slug]]` links and frontmatter (`sources:` for held raw docs, `url:` for
  URL-only ingests).
- **Record, don't overwrite:** when a source contradicts an existing page, keep both
  claims and flag the conflict in that spoke's `synthesis.md` (and, for cross-spoke
  tensions, note the adjacency).
- **Write like a human.** When you draft or edit wiki *prose* — source-summary pages,
  `synthesis.md`, page bodies, log narrative — run the **`avoid-ai-writing`** skill over
  the draft (detect mode to flag, or edit mode to fix in place) before the ingest/cycle
  is done, and resolve the flagged AI-isms. Skip frontmatter, code, and quoted source
  text. It's a signal, not a verdict — don't mangle accurate phrasing to satisfy it.
  (Shared rule lives in `QUALITY.md`; every spoke's Ingest gate references it.)
- **Rebuild the site after structural changes.** The Astro site in `site/` renders all
  spokes. Any route that adds or moves pages — and every split / merge / rename — must
  finish with a clean rebuild + verify:
  `cd site && rm -rf .astro dist node_modules/.astro && npm run build && npm run verify`.
  The link check and count check must pass before the work is done. **Clearing both Astro
  caches — `.astro` *and* `node_modules/.astro` (the Astro-5 content layer) — is required
  after a page is deleted/moved/renamed**, or other pages keep stale within-wiki hrefs to
  the gone page and the link check fails.
- **Keep the hub thin.** Detailed routing logic belongs in `HUB.md`, not here; this
  file just boots the router and points at the rest.