Common SLIs include latency, throughput, availability, and error rate; others include durability (in storage systems), end-to-end latency (for complex data processing systems, especially pipelines), and correctness.
Service level indicators are the basis of [[Service level objective]]s. A natural structure for SLOs is thus _SLI ≤ target_, or _lower bound ≤ SLI ≤ upper bound_.
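A minimal sketch of that structure (the thresholds and function names are illustrative, not from the book):

```python
def slo_met(sli: float, target: float) -> bool:
    """SLO of the form: SLI <= target (e.g. p99 latency <= 0.3 s)."""
    return sli <= target

def slo_met_bounded(sli: float, lower: float, upper: float) -> bool:
    """SLO of the form: lower bound <= SLI <= upper bound."""
    return lower <= sli <= upper

# Illustrative values only.
print(slo_met(sli=0.27, target=0.30))                        # p99 latency: 270 ms vs. a 300 ms target
print(slo_met_bounded(sli=0.9993, lower=0.999, upper=1.0))   # availability within bounds
```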
Reminds me a bit of [[Indicator of compromise]] (though the two are very different).
> Most services consider _request latency_—how long it takes to return a response to a request—as a key SLI.
> Other common SLIs include the _error rate_, often expressed as a fraction of all requests received, and _system throughput_, typically measured in requests per second. The measurements are often aggregated: i.e., raw data is collected over a measurement window and then turned into a rate, average, or percentile.
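A rough sketch of that aggregation step, turning raw latencies from a measurement window into a rate, an average, and a percentile (the window size and data are made up):

```python
import statistics

# Raw per-request latencies (seconds) collected over a measurement window.
window_seconds = 60
latencies = [0.12, 0.08, 0.31, 0.09, 0.15, 0.22, 0.11, 0.95, 0.10, 0.14]

throughput = len(latencies) / window_seconds        # rate: requests per second
average_latency = statistics.mean(latencies)        # average
p99 = statistics.quantiles(latencies, n=100)[98]    # 99th percentile
print(f"{throughput:.2f} req/s, avg {average_latency * 1000:.0f} ms, p99 {p99 * 1000:.0f} ms")
```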
SLIs are commonly visualized as graphs on dashboards in tools such as [[Grafana]], [[Splunk]], or [[Humio]].
> Another kind of SLI important to SREs is _availability_, or the fraction of the time that a service is usable. It is often defined in terms of the fraction of well-formed requests that succeed, sometimes called _yield_.
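A quick sketch of availability computed as yield, with hypothetical request counters:

```python
def availability(successful_requests: int, well_formed_requests: int) -> float:
    """Yield: the fraction of well-formed requests that succeed."""
    if well_formed_requests == 0:
        return 1.0  # convention: with no requests, nothing has failed
    return successful_requests / well_formed_requests

# Illustrative numbers: 999,930 successes out of 1,000,000 well-formed requests.
print(f"{availability(999_930, 1_000_000):.4%}")  # 99.9930%
```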
Here we almost touch on [[Tracing]]:
> Many indicator metrics are most naturally gathered on the server side, using a monitoring system such as Borgmon (see [Practical Alerting from Time-Series Data](https://sre.google/sre-book/practical-alerting/)) or [[Prometheus]], or with periodic log analysis—for instance, HTTP 500 responses as a fraction of all requests. However, some systems should be instrumented with _client_-side collection, because not measuring behavior at the client can miss a range of problems that affect users but don’t affect server-side metrics. For example, concentrating on the response latency of the Shakespeare search backend might miss poor user latency due to problems with the page’s JavaScript: in this case, measuring how long it takes for a page to become usable in the browser is a better proxy for what the user actually experiences.
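A rough sketch of the log-analysis variant mentioned above, computing HTTP 500 responses as a fraction of all requests (the log format is made up):

```python
# Hypothetical access-log lines: "<timestamp> <method> <path> <status>"
log_lines = [
    "2024-01-01T00:00:01Z GET /search 200",
    "2024-01-01T00:00:02Z GET /search 500",
    "2024-01-01T00:00:03Z GET /search 200",
    "2024-01-01T00:00:04Z POST /lookup 200",
]

statuses = [int(line.rsplit(" ", 1)[1]) for line in log_lines]
error_rate = sum(1 for s in statuses if s == 500) / len(statuses)
print(f"server-side error rate: {error_rate:.1%}")  # 25.0% in this tiny example
```

As the quote points out, a server-side figure like this can look healthy while users still suffer, which is why client-side measurement (e.g. time until the page is usable in the browser) can be the better proxy.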