• Latency is a critical measure to determine whether our systems are running normally or not.
  • There are many collections libraries available that help you collect latency metrics.
  • Heat maps are useful as they help visualize latency distribution over time.
  • After narrowing down the source of the latency to a service or process, look at the host-specific and in-process reasons why latency occurred in the first place.
  • If the host is behaving normally and networking is not impacted, go and further analyze the in-process sources of latency.
  • Some language runtimes like Go allows us to internally trace runtime events in the lifetime of a request.

Full post here, 6 mins read