Defined Term · mechanism · updated Wed Jun 10 2026

Distributed tracing

Distributed tracing is the third telemetry pillar named on the observability page, which until now had no page of its own. It is also one of the three sources Netflix fuses into its dependency graph (eBPF flow logs, IPC metrics, and distributed traces). This page draws on the OpenTelemetry docs, the canonical vendor-neutral reference.

What it is

How it complements the other signals

The three observability signals divide labor: metrics (prometheus) answer “is something wrong, and how much?” (cheap, aggregate); logs answer “what exactly happened here?” (detailed, local); traces answer “where, across the whole request path, did the latency/error occur?” (cross-service, causal). Traces are “structured logs with context, correlation, and hierarchy baked in,” which is why they’re the signal that reveals latency sources and service dependencies that metrics and logs alone can’t.
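To make "context, correlation, and hierarchy" concrete, here is a minimal sketch in plain Python. It does not use the OpenTelemetry SDK. The `Span` fields, the service names, and the timings are all hypothetical, and the self-time calculation assumes a span's children run one after another with no overlap. It shows how parent/child links let a trace answer "where did the latency occur?", which an aggregate metric cannot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """One unit of work in a trace (hypothetical, simplified shape)."""
    trace_id: str             # correlation: shared by every span in the request
    span_id: str
    parent_id: Optional[str]  # hierarchy: None marks the root span
    service: str
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def self_time_ms(span: Span, spans: list[Span]) -> int:
    """Time spent in `span` itself, excluding its direct children.

    Assumes children do not overlap in time; real traces may run them
    concurrently, which this sketch ignores.
    """
    children = [s for s in spans if s.parent_id == span.span_id]
    return span.duration_ms - sum(c.duration_ms for c in children)


def slowest_hop(spans: list[Span]) -> Span:
    """Return the span with the largest self time in a single trace."""
    return max(spans, key=lambda s: self_time_ms(s, spans))


# A made-up checkout request crossing four services.
TRACE = [
    Span("t1", "a", None, "gateway", "GET /checkout", 0, 500),
    Span("t1", "b", "a", "cart", "load_cart", 10, 60),
    Span("t1", "c", "a", "payments", "charge", 70, 480),
    Span("t1", "d", "c", "ledger", "write_entry", 100, 140),
]

hop = slowest_hop(TRACE)
print(hop.service, self_time_ms(hop, TRACE))  # payments 370
```

A metric would report only that the request took 500 ms at the gateway. The trace attributes 370 ms of that to the payments service's own work.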

Why it matters to the spoke

Distributed tracing is the per-request view of the same dependency structure the service-topology shows in aggregate — a topology graph is, in part, traces summed over time. It is therefore load-bearing for the spoke’s “seams, not components” thesis: a trace is the seam made visible, the literal record of a request crossing the boundaries between services where the hard problems live. It’s produced through opentelemetry instrumentation (the off-the-shelf layer), feeds the service-topology that site-reliability-engineering reads and aiops agents reason over, and supplies the latency SLIs that SLOs are defined on.
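The claim that a topology graph is "traces summed over time" can be sketched as a small aggregation. This is an illustrative sketch in plain Python, not how any particular topology service is built. The `Span` shape, the service names, the durations, and the 300 ms threshold are all assumptions made for the example. It derives caller-to-callee edges from parent/child spans, and computes a simple latency SLI as the fraction of root spans that finish under a threshold.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span record."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span of a trace
    service: str
    duration_ms: int


def dependency_edges(spans: list[Span]) -> Counter:
    """Count (caller, callee) service pairs across all traces.

    Each parent/child span pair that crosses a service boundary
    contributes one call to the aggregate edge.
    """
    by_id = {(s.trace_id, s.span_id): s for s in spans}
    edges: Counter = Counter()
    for s in spans:
        parent = by_id.get((s.trace_id, s.parent_id))
        if parent is not None and parent.service != s.service:
            edges[(parent.service, s.service)] += 1
    return edges


def latency_sli(spans: list[Span], threshold_ms: int) -> float:
    """Fraction of requests (root spans) faster than `threshold_ms`."""
    roots = [s for s in spans if s.parent_id is None]
    if not roots:
        raise ValueError("no root spans to measure")
    good = sum(1 for s in roots if s.duration_ms < threshold_ms)
    return good / len(roots)


# Three made-up requests through the same gateway.
SPANS = [
    Span("t1", "a", None, "gateway", 120),
    Span("t1", "b", "a", "cart", 40),
    Span("t1", "c", "a", "payments", 60),
    Span("t2", "a", None, "gateway", 480),
    Span("t2", "b", "a", "payments", 450),
    Span("t2", "c", "b", "ledger", 30),
    Span("t3", "a", None, "gateway", 90),
    Span("t3", "b", "a", "cart", 20),
]

for (caller, callee), calls in sorted(dependency_edges(SPANS).items()):
    print(f"{caller} -> {callee}: {calls}")
print(round(latency_sli(SPANS, threshold_ms=300), 3))
```

Each edge here is a seam between two services, and its count is the number of individual requests that crossed it. An SLO would then set a target on the SLI value, for example that some agreed fraction of requests complete under the threshold.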

observability · service-topology · opentelemetry · prometheus · service-level-objectives · netflix-service-topology · platform-ops