Log — Platform Ops Wiki
Append-only history. Each entry starts with ## [YYYY-MM-DD] <op> | <title> where
<op> is ingest, query, lint, or split, so grep "^## \[" log.md | tail -5 works.
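The same lookup can be sketched in Python, assuming entries follow the header convention described above. The function name `last_entries` and the sample text are illustrative only and are not part of the wiki tooling.

```python
import re

# Matches the entry header convention: ## [YYYY-MM-DD] <op> | <title>
HEADER = re.compile(
    r"^## \[(\d{4}-\d{2}-\d{2})\] (ingest|query|lint|split) \| (.+)$"
)

def last_entries(log_text, n=5):
    """Return the last n (date, op, title) tuples found in log_text."""
    entries = []
    for line in log_text.splitlines():
        match = HEADER.match(line)
        if match:
            entries.append(match.groups())
    return entries[-n:]

# Hypothetical sample, shortened from the real entries.
sample = """## [2026-06-05] split | platform-ops-wiki created
body text that is not a header
## [2026-06-05] lint | first health check
"""

for date, op, title in last_entries(sample):
    print(date, op, title)
```

Unlike the bare grep, the pattern also rejects headers whose op is not one of the four allowed values, which doubles as a light format check.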
## [2026-06-05] split | platform-ops-wiki created from _inbox cluster (3 sources)
Spun out by the hub router when the InfoQ piece netflix-service-topology arrived
via Telegram — the third tight ops piece, hitting the ≥3 spin-out threshold for the
platform-ops-sre cluster (the google-sre-agentic-ai park note had explicitly
flagged that a 3rd would trigger a spin-out). Scaffolded from CLAUDE.template.md;
domain = production platform engineering, SRE & observability for cloud-native
distributed systems. Migrated and ingested all three (URL-only, source: true + url:):
- netflix-service-topology → concepts service-topology, observability
- google-sre-agentic-ai → concepts site-reliability-engineering, aiops
- kubernetes-integration-tax → concepts platform-engineering, kubernetes
Created 6 concept pages (DefinedTerm) + 1 SoftwareApplication (kubernetes) + 3 source
summaries (10 pages total). synthesis frames the three as facets of one thesis: in
production the hard problem is the seams, not the components — telemetry-source fusion
(Netflix), platform-tool integration (CNCF), and investigation-signal integration
(Google SRE), meeting at the service-topology.
Did not migrate nvidia-doca-in-silicon-security (ai-infrastructure): it is a distinct hardware/silicon-security layer, left parked with adjacency noted. Cross-spoke split from agentic-tooling-wiki (builder tools vs. their ops application) recorded in synthesis. Open questions: build-vs-buy topology, AIOps reliability, eBPF cost, quantification.
## [2026-06-05] lint | first health check (10 pages, day-of-spin-out)
Swept the new spoke for orphans, thin spots, @type specificity, missing cross-links, and contradictions. Findings + actions:
- Orphans: none — every page has ≥3 inbound links (min kubernetes = 3).
- Thin spots: none — smallest page kubernetes (969 B) is a legitimate substantive anchor.
- Contradictions / stale claims: none — the 3 founding sources are complementary; all pages authored same-day, nothing stale.
- Missing cross-links (2, fixed): observability named SRE/agents in prose without linking → added inbound to site-reliability-engineering + aiops; kubernetes referenced the “platform-ops practice” without linking → linked platform-ops (and dropped a stray game-engine cross-spoke mention in its adjacency note).
- @type (1, left as-is, optional): netflix-service-topology is typed TechArticle; it is an InfoQ /news/ item, so NewsArticle is a defensible lateral alternative, but it is a sibling of TechArticle, not strictly more specific, so not changed. Flagged for a human call only.
Site rebuilt clean; link check + count check PASSED (10/10).
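The orphan check from the findings above can be sketched as an inbound-link count over the page graph. This is a minimal sketch, not the lint tool itself: the `links` mapping is a toy graph invented for illustration, and the function names are hypothetical.

```python
def inbound_counts(links):
    """links maps page -> pages it links to; returns page -> inbound link count."""
    counts = {page: 0 for page in links}
    for source, targets in links.items():
        for target in set(targets):
            # Ignore self-links and links to pages outside the spoke.
            if target != source and target in counts:
                counts[target] += 1
    return counts

def orphans(links, minimum=1):
    """Pages with fewer than `minimum` inbound links, sorted by name."""
    counts = inbound_counts(links)
    return sorted(page for page, count in counts.items() if count < minimum)

# Toy graph (assumption): not the real link structure of this spoke.
links = {
    "kubernetes": ["platform-engineering"],
    "platform-engineering": ["kubernetes", "observability"],
    "observability": ["kubernetes"],
    "draft-note": ["kubernetes"],
}

print(inbound_counts(links)["kubernetes"])  # 3
print(orphans(links))                       # ['draft-note']
```

Raising `minimum` to 3 turns the same function into the stricter "every page has ≥3 inbound links" check reported in the findings.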
## [2026-06-09] ingest | +3 observability substrate (OpenTelemetry, eBPF, Prometheus) — all-spokes cron test
Answered the “build vs buy the topology” + “eBPF operational cost” open questions with the off-the-shelf CNCF stack: opentelemetry (SoftwareApplication, src — vendor-neutral traces/metrics/logs standard, not a backend), ebpf (DefinedTerm, src — sandboxed in-kernel programs; verifier/JIT/maps; CAP_BPF + complexity + kernel-version costs), prometheus (SoftwareApplication, src — pull-based time series + PromQL; 2nd CNCF project; not billing-grade). Tied to netflix-service-topology/kubernetes-integration-tax. Synthesis open questions updated (gap remaining: the topology-graph assembly above the raw signals). url-only. 10 → 13 pages.
## [2026-06-10] ingest | SLOs + GitOps + distributed tracing — all-spokes pass (one foundation per pillar)
Three foundational concepts the spoke referenced but never paged. service-level-objectives (DefinedTerm, source, Google SRE book) — SLI/SLO/SLA + error budgets; the quantification backbone the open questions wanted (reliability-vs-velocity as a measured control loop; toil/MTTR become budget math), and a reframe of the AIOps reliability paradox (agents under an error budget). gitops (DefinedTerm, source, OpenGitOps/CNCF) — Git as single source of truth; four principles (declarative, versioned-immutable, pulled, continuously reconciled; Argo CD/Flux); the deployment face of “seams, not components” and a structural cousin of the aiops control loop. distributed-tracing (DefinedTerm, source, OpenTelemetry) — spans/traces/context-propagation; the per-request view of the service-topology (topology ≈ traces summed over time), the third signal beside metrics/logs, one of Netflix’s three fused telemetry sources, and the source of latency SLIs. Together they close a loop: observability(tracing) → SLIs/SLOs → reconcile/operate(GitOps/AIOps). Folded into synthesis (new 2026-06-10 section) + index (3 DefinedTerm rows). No contradictions. 13 → 16 pages.
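The "budget math" behind error budgets can be made concrete with a small worked example. This is a sketch under stated assumptions: a time-based availability SLO over a 30-day window, with the 99.9% target and the 21.6 bad minutes chosen purely for illustration.

```python
def error_budget_minutes(slo, window_days=30):
    """Minutes of allowed unavailability for an availability SLO over the window."""
    return (1.0 - slo) * window_days * 24 * 60

def budget_remaining(slo, bad_minutes, window_days=30):
    """Fraction of the error budget still unspent (negative once overspent)."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - bad_minutes) / budget

# Illustrative numbers (assumptions, not measurements from this spoke).
slo = 0.999
print(round(error_budget_minutes(slo), 1))    # about 43.2 minutes per 30 days
print(round(budget_remaining(slo, 21.6), 2))  # about half the budget left
```

A negative result from `budget_remaining` is the signal that reliability work should take priority over new releases, which is the control-loop reading of the SLO.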
## [2026-06-12] ingest | DORA metrics (Four Keys) — dora.dev
All-spokes daily expansion. Added dora-metrics (@type DefinedTerm) — the delivery-performance quantification framework completing the “Quantification” open question that service-level-objectives half-answered. SLOs measure the running service’s reliability; DORA measures the delivery pipeline: throughput (deploy frequency, change lead time) + stability (change fail rate, failed-deployment recovery time, deployment rework rate). Captured the “speed and stability are not tradeoffs” finding and the MTTR→“Failed Deployment Recovery Time” term shift. Wired to service-level-objectives (backlink) / gitops / aiops (gives the reliability-paradox a yardstick); synthesis note added; open question reframed (frameworks named, still want them applied to this spoke’s own MTTR/toil claims). 1 new page. Authoritative (Google DORA / State of DevOps). No contradictions.
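As a rough illustration of how the throughput and stability numbers fall out of deployment records, here is a minimal sketch. The `Deployment` record shape and the sample data are assumptions made for this example; deployment rework rate is left out, and the function expects at least one deployment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional

@dataclass
class Deployment:
    committed_at: datetime          # when the change was committed
    deployed_at: datetime           # when it reached production
    failed: bool = False            # did the deployment need remediation?
    recovered_at: Optional[datetime] = None

def dora_summary(deployments, window_days):
    """Summarise throughput and stability for a non-empty list of deployments."""
    lead_hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deployments
    ]
    failures = [d for d in deployments if d.failed]
    recovery_hours = [
        (d.recovered_at - d.deployed_at).total_seconds() / 3600
        for d in failures
        if d.recovered_at is not None
    ]
    return {
        "deploys_per_day": len(deployments) / window_days,
        "median_lead_time_hours": median(lead_hours),
        "change_fail_rate": len(failures) / len(deployments),
        "median_recovery_hours": median(recovery_hours) if recovery_hours else None,
    }

# Hypothetical two-day sample.
t0 = datetime(2026, 6, 1, 9, 0)
h = timedelta(hours=1)
deployments = [
    Deployment(t0, t0 + 2 * h),
    Deployment(t0 + 1 * h, t0 + 5 * h, failed=True, recovered_at=t0 + 6 * h),
    Deployment(t0 + 24 * h, t0 + 30 * h),
    Deployment(t0 + 26 * h, t0 + 34 * h),
]
summary = dora_summary(deployments, window_days=2)
print(summary)
```

Medians are used here rather than means so that a single slow change does not dominate the lead-time figure; that is a choice of this sketch, not a claim about how DORA aggregates its survey data.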
## [2026-06-12] ingest | Project-as-a-Service (Belastingdienst, InfoQ/KubeCon)
Telegram drop, routed → platform-ops-wiki (platform-engineering pillar). Added source project-as-a-service + new concept internal-developer-platform (IDP / golden paths / platform-as-a-product). Completes the platform-engineering pillar: kubernetes-integration-tax was the problem side and the IDP is the cure, which is to pay integration once centrally and expose it as a self-service golden path (one YAML → namespaces/RBAC/quota via the opr-paas operator; “make the right way the easiest way”). GitOps-shaped reconcile applied to project provisioning; half-social (enablement>support, Communities of Practice across 99+ teams, accelerator hackathons). Updated platform-engineering (“the cure: productize the platform”) + synthesis + index. 2 new pages. Caveat: qualitative only (no MTTR/onboarding numbers, a quantification gap); golden-path→golden-cage tension recorded. (Routed pre-git; no commit yet, since git is step 0 of the pending quality-mechanism plan.)
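The "one request in, several resources out, continuously reconciled" shape described in this entry can be sketched generically. This is not the opr-paas API: the request fields, the resource naming scheme, and both function names are hypothetical and exist only to show the expand-then-diff pattern.

```python
def expand_project(request):
    """Expand one self-service request into the resources it implies (hypothetical shape)."""
    desired = {}
    for env in request["namespaces"]:
        ns = f"{request['name']}-{env}"
        desired[f"namespace/{ns}"] = {}
        desired[f"rolebinding/{ns}/admin"] = {"subjects": list(request["admins"])}
        desired[f"quota/{ns}"] = dict(request["quota"])
    return desired

def plan(desired, actual):
    """Return the actions that move `actual` toward `desired` (both: name -> spec)."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions

# Hypothetical request, standing in for the single YAML file.
request = {
    "name": "team-a",
    "namespaces": ["dev", "prod"],
    "admins": ["alice"],
    "quota": {"cpu": "4", "memory": "8Gi"},
}
desired = expand_project(request)

print(plan(desired, {}))       # six create actions on an empty cluster
print(plan(desired, desired))  # [] once the state has converged
```

The second call returning an empty plan is the point of the pattern: reconciling an already-converged state is a no-op, so the loop can run continuously.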