Service Topology
A live dependency graph of the services in a distributed system: which service calls which, in near real time. It is a specialization of observability aimed at answering three operational questions: what depends on what, what is a failure’s blast radius, and is an issue local or upstream?
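As a rough illustration of how the three questions map onto graph queries, here is a minimal sketch in Python. It assumes the topology is available as a list of (caller, callee) edges and that "upstream" means the services a given service depends on. The service names, the CALLS list, and the function names are hypothetical, not taken from any real system.

```python
from collections import defaultdict, deque

# Hypothetical call edges (caller, callee); the names are illustrative only.
CALLS = [
    ("web", "checkout"),
    ("web", "search"),
    ("checkout", "payments"),
    ("checkout", "inventory"),
    ("search", "inventory"),
]


def build_indexes(calls):
    """Index edges both ways: caller -> callees and callee -> callers."""
    callees = defaultdict(set)
    callers = defaultdict(set)
    for caller, callee in calls:
        callees[caller].add(callee)
        callers[callee].add(caller)
    return callees, callers


def reachable(start, adjacency):
    """Breadth-first search: every node reachable from start, excluding start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def dependencies(service, calls):
    """What depends on what: everything `service` calls, directly or transitively."""
    callees, _ = build_indexes(calls)
    return reachable(service, callees)


def blast_radius(service, calls):
    """Blast radius: every service that transitively calls `service`."""
    _, callers = build_indexes(calls)
    return reachable(service, callers)


def unhealthy_dependencies(service, unhealthy, calls):
    """Local or upstream: unhealthy services among the dependencies of `service`.

    An empty result suggests the issue is local to `service` (assumption:
    'upstream' here means the services that `service` depends on).
    """
    return dependencies(service, calls) & set(unhealthy)
```

With these edges, blast_radius("inventory", CALLS) returns web, checkout, and search, and unhealthy_dependencies("checkout", {"inventory"}, CALLS) points at inventory rather than at checkout itself.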
Canonical example
netflix-service-topology: Netflix’s internal system for mapping thousands of microservices in near real time. It ingests eBPF flow logs, IPC metrics, and distributed traces into separate graph partitions, merges the partitions, and runs a three-stage aggregation that collapses multi-hop paths through intermediaries into direct application-to-application edges. Historical queries use time-window aggregation rather than snapshots to bound storage cost.
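The collapsing step can be sketched as a graph rewrite. The Python below is a simplified, single-pass illustration and not Netflix’s three-stage aggregation: it assumes the merged graph is a list of (source, destination) edges and that intermediaries (load balancers, proxies, and the like) are identified by a known set of node names. All names in it are hypothetical.

```python
from collections import defaultdict


def collapse_intermediaries(edges, intermediaries):
    """Rewrite paths that run through intermediaries as direct app-to-app edges.

    edges: iterable of (source, destination) pairs from the merged graph.
    intermediaries: set of node names that are not applications (assumption:
    this classification is known in advance).
    Returns a set of (source_app, destination_app) pairs.
    """
    out = defaultdict(set)
    for src, dst in edges:
        out[src].add(dst)

    direct = set()
    for src in list(out):
        if src in intermediaries:
            continue  # only applications may start a collapsed edge
        stack = list(out[src])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if node in intermediaries:
                # Keep walking until the path lands on an application.
                stack.extend(out.get(node, ()))
            else:
                direct.add((src, node))
    return direct


# Hypothetical observed edges: app-a reaches app-b through two intermediaries.
OBSERVED = [
    ("app-a", "lb-1"),
    ("lb-1", "proxy-1"),
    ("proxy-1", "app-b"),
    ("app-a", "app-c"),
]
COLLAPSED = collapse_intermediaries(OBSERVED, {"lb-1", "proxy-1"})
```

Here COLLAPSED contains the edges app-a to app-b and app-a to app-c. Plain reachability is a deliberate simplification: when several callers share one intermediary that fans out to several backends, it attributes every backend to every caller, so a real system would need flow-level or trace-level correlation to pair callers with the backends they actually reached.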
Why it matters for ops
The topology is the map site-reliability-engineering reads during an incident; google-sre-agentic-ai explicitly has agents investigate over “observability + topology.” Without it, dependency knowledge lives only in engineers’ heads.