The latency of an individual transaction or process is best described by a single number: the duration of the process in an appropriate unit. When summarising many latency measurements, a good choice is to capture their distribution, that is, the minimum, the maximum, and a set of percentiles. The most useful percentiles are typically the median (50th percentile) and the first few nines: the 90th, 99th, and 99.9th.

The distribution of trading latencies generally varies significantly over the day, with the greatest spread between minimum and maximum seen around market open and close. As a result, it is often useful to break the trading day into buckets of a few minutes each and capture the evolving latency profile as the distribution of latency within each bucket. This makes it easy to produce a time series of, for example, the 99th percentile over the day.

It is tempting to summarise a distribution with traditional statistics such as the mean, standard deviation, skewness, and kurtosis. However, these statistics are most meaningful when the underlying data follow a normal (Gaussian) distribution, or something close to it. In contrast, even simple latency distributions are often heavily skewed, with a clear minimum, fluctuations, and outliers, and realistic data are often multimodal. As a result, the traditional statistics offer very little value in capturing or describing latency, and raw percentiles are generally much more effective.
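As a rough illustration of the percentile-per-bucket summary described above, here is a minimal Python sketch. The function names, the choice of five-minute buckets, and the input format (pairs of a timestamp in seconds and a latency in any consistent unit) are assumptions made for the example, not part of any particular tool. It uses the nearest-rank definition of a percentile, which is only one of several common conventions; other definitions interpolate between samples and give slightly different values.

```python
import math
from collections import defaultdict
from fractions import Fraction

# Percentiles to report: the median plus the first few "nines".
# Kept as strings so Fraction can represent 99.9 exactly.
PERCENTILES = ("50", "90", "99", "99.9")


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted, non-empty list.

    Returns the smallest sample such that at least p percent of the
    samples are less than or equal to it.
    """
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(Fraction(p) * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarise(latencies):
    """Return count, min, max and the chosen percentiles for one set of latencies."""
    ordered = sorted(latencies)
    summary = {"count": len(ordered), "min": ordered[0], "max": ordered[-1]}
    for p in PERCENTILES:
        summary["p" + p] = percentile(ordered, p)
    return summary


def summarise_by_bucket(samples, bucket_seconds=300):
    """Group (timestamp_seconds, latency) pairs into fixed-width time buckets.

    Returns a dict mapping each bucket's start time to its summary,
    ordered by bucket start. The 300-second default is an assumed
    stand-in for "a few minutes".
    """
    buckets = defaultdict(list)
    for timestamp, latency in samples:
        bucket_start = int(timestamp // bucket_seconds) * bucket_seconds
        buckets[bucket_start].append(latency)
    return {start: summarise(values) for start, values in sorted(buckets.items())}
```

Reading a single key such as `"p99"` from each bucket's summary, in bucket order, gives the kind of time series over the day mentioned above. Buckets with very few samples will report the high percentiles as (or near) the maximum, so the count is kept alongside each summary.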
What is the best way to describe latency?