Sampling in observability

  • You can use sampling APIs by way of instrumentation libraries that let you set sampling strategies or rates. For ex, Go’s runtime.setCPUProfileRate, which lets you set CPU profiling rate.
  • Subcomponents of a system may need different sampling strategies, and the decision can be quite subjective: for a low-traffic background job, you might sample every task but for a handler with low latency tolerance, you may need to aggressively downsample if traffic is high, or you might sample only when certain conditions are met.
  • Consider making the sampling strategy dynamically configurable, as this can be useful for troubleshooting.
  • If collected data tracks a system end to end and the collection spans more than one process, like distributed traces or events, you might want to propagate the sampling decision from parent to child process through the header passed down.
  • If collecting data is inexpensive but transferring or storage are, you can collect 100% of the data and apply a filter later to minimize while preserving diversity in the sample, retaining edge cases specifically for debugging.
  • Never trust a sampling decision propagated from an external source; it could be a DOS attack.

Full post here, 4 mins read