Querying delta temporality metrics

Chronosphere Observability Platform supports both delta and cumulative temporality metric ingestion and storage.

  • A metrics client using delta temporality counts events that occurred since the last emission or flush of the metric value and emits only that value.
  • A metrics client using cumulative temporality keeps a monotonically increasing sum (one that never decreases) and sends the value at a regular interval. A cumulative metric value represents the sum of all events observed since the process start time.

The following time series examples demonstrate the difference between delta and cumulative time series. The delta temporality client sends the count of events observed during each reporting interval, and the cumulative temporality client sends the running total at each reporting interval.

A delta counter time series emitting sparse values:

                        process
Data point values:       start       12                  7         3
                  ---------|---------|---------|---------|---------|--------- 
             Time:        1:00       1:01     1:02       1:03     1:04

In contrast, a cumulative counter time series emits values on regular intervals:

                        process
Data point values:       start      12        12        19        22
                  ---------|---------|---------|---------|---------|--------- 
             Time:        1:00       1:01     1:02       1:03     1:04

Both of these examples count the same number of events. However, the delta counter reported only the interval's count, while the cumulative counter reported the sum of all events counted up to that point in time.

A delta time series can therefore be sparse because a delta metrics client sends an update only when it observes events during the reporting interval. If the client doesn't count any events during an interval, a delta counter reports no value for that interval.
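
The relationship between the two example series can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how any metrics client is implemented; the function names are hypothetical, and timestamps are given as whole minutes after 1:00.

```python
# Cumulative samples from the example above: (minutes after 1:00, running total).
cumulative = [(1, 12), (2, 12), (3, 19), (4, 22)]


def cumulative_to_delta(samples, start_value=0):
    """Convert running totals into sparse per-interval deltas."""
    deltas = []
    previous = start_value
    for timestamp, total in samples:
        change = total - previous
        previous = total
        if change > 0:  # a delta client emits nothing for an empty interval
            deltas.append((timestamp, change))
    return deltas


def delta_to_cumulative(deltas, timestamps):
    """Rebuild the running total at every reporting interval."""
    by_time = dict(deltas)
    total = 0
    totals = []
    for timestamp in timestamps:
        total += by_time.get(timestamp, 0)
        totals.append((timestamp, total))
    return totals


delta = cumulative_to_delta(cumulative)
print(delta)                                     # [(1, 12), (3, 7), (4, 3)]
print(delta_to_cumulative(delta, [1, 2, 3, 4]))  # [(1, 12), (2, 12), (3, 19), (4, 22)]
```

The delta series has no entry for 1:02 because no events occurred in that interval, which is the sparseness described above.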

Functions for delta temporality use cases

Two query functions address the most common use cases for querying a delta time series:

  • sum_over_time(): Returns the sum of all observations in the specified sliding time window.
  • sum_per_second(): Returns the per-second rate of observations in the specified sliding time window.

sum_per_second() is a convenience function that automatically divides the sum of observations in the specified sliding time window by the sliding time window's duration to calculate the per-second rate. The following queries return the same results:

sum_per_second(http_request_count{}[5m])

sum_over_time(http_request_count{}[5m]) / 5m
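
As a rough model of why these two queries agree, the following Python sketch evaluates both functions over the same window. The sample data is hypothetical, and the window boundaries (open on the left, closed on the right) are an assumption chosen to match the charts later on this page rather than a statement about the query engine.

```python
def sum_over_time(samples, eval_time, window):
    """Sum the values whose timestamps fall in (eval_time - window, eval_time]."""
    return sum(value for ts, value in samples if eval_time - window < ts <= eval_time)


def sum_per_second(samples, eval_time, window):
    """Divide the windowed sum by the window duration in seconds."""
    return sum_over_time(samples, eval_time, window) / window


# Hypothetical delta samples: (timestamp in seconds, request count).
samples = [(60, 120), (180, 30), (300, 150)]
window = 300  # five minutes, expressed in seconds

total = sum_over_time(samples, 300, window)
rate = sum_per_second(samples, 300, window)
print(total, rate)  # 300 1.0
```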

Delta query examples

Get the total count of errors for a given service in 1-minute steps

The following query sums all observations in a one-minute sliding time window for the http_request_count time series where the error_code label equals 5xx and the service label equals my-service. When visualized in a time series chart panel with a one-minute step, the result has a resolution of one minute.

sum_over_time(http_request_count{error_code="5xx", service="my-service"}[1m])

To ensure the chart value at each step represents the sum of observations for each step's start and end time, you must set the query's step size to be equal to the sliding time window value. For more guidance, see Best practices for adding dashboard charts.

Get the per-second error rate for a given service in 1-minute steps

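The following query calculates the per-second average over a five-minute sliding time window for the http_request_count time series where the error_code label equals 5xx and the service label equals my-service. With a one-minute step, each chart value is the average per-second rate over the preceding five minutes.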

sum_per_second(http_request_count{error_code="5xx", service="my-service"}[5m])

Get the P95 for a delta histogram time series

To see the 95th percentile (P95) duration for HTTP requests with a one-minute resolution, use sum_per_second(). Set the sliding time window to 1m and the step size to 1m. The histogram_quantile() function calculates the P95 value based on its observations within the sliding time window.

For Chronosphere delta histograms using an exponential bucket layout:

histogram_quantile(0.95, sum(sum_per_second(http_request_duration{}[1m])))

For fixed-bucket delta histograms:

histogram_quantile(0.95, sum by(le) (sum_per_second(http_request_duration_seconds_bucket{}[1m])))
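
For the fixed-bucket case, the following Python sketch shows the kind of linear interpolation that Prometheus documents for classic histograms. It is a simplified illustration with hypothetical bucket values, not the platform's implementation, and it does not cover exponential bucket layouts. Because a quantile depends only on the ratios between buckets, it makes no difference whether the bucket values are counts or per-second rates.

```python
import math


def bucket_quantile(q, buckets):
    """Estimate a quantile from cumulative fixed buckets.

    buckets: (upper_bound, cumulative_value) pairs sorted by upper bound and
    ending with the +Inf bucket, like series distinguished by the le label.
    """
    if not 0 < q <= 1:
        raise ValueError("this sketch only handles 0 < q <= 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_value = 0.0, 0.0
    for upper_bound, value in buckets:
        if value >= rank:
            if math.isinf(upper_bound):
                # The rank falls in the +Inf bucket: return the highest finite bound.
                return lower_bound
            # Interpolate linearly inside the bucket that contains the rank.
            fraction = (rank - lower_value) / (value - lower_value)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_value = upper_bound, value
    return math.nan


# Hypothetical cumulative bucket values for le = 0.1, 0.5, 1.0, and +Inf.
buckets = [(0.1, 50.0), (0.5, 90.0), (1.0, 98.0), (math.inf, 100.0)]
p95 = bucket_quantile(0.95, buckets)
print(round(p95, 4))  # approximately 0.8125 seconds
```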

Other functions

Users with a PromQL background might notice that Chronosphere recommends sum_over_time() instead of increase(), and sum_per_second() instead of rate(). Both increase() and rate() can operate on delta temporality time series, but they can return different results from the recommended functions because they estimate the expected value for a particular time by extrapolating the slope between the first and last data points in the sliding time window.

Extrapolation requires two or more data points in the sliding time window, which is not guaranteed with delta temporality, because delta metrics clients send data points only for intervals with observations to report.
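
To make the two-point requirement concrete, the following Python sketch counts how many data points of a sparse delta series land in each window. It does not reimplement rate() or increase(); it only checks whether a window holds enough points for any slope-based estimate. The series is the sparse delta example from earlier on this page with timestamps converted to seconds after 1:00, and the two-minute window and the left-open window boundary are assumptions.

```python
def points_in_window(samples, eval_time, window):
    """Return the samples whose timestamps fall in (eval_time - window, eval_time]."""
    return [(ts, value) for ts, value in samples if eval_time - window < ts <= eval_time]


# Sparse delta series: 12 at 1:01, nothing at 1:02, 7 at 1:03, 3 at 1:04.
samples = [(60, 12), (180, 7), (240, 3)]
window = 120  # two minutes, in seconds

results = []
for eval_time in (120, 180, 240):
    points = points_in_window(samples, eval_time, window)
    window_sum = sum(value for _, value in points)
    has_two_points = len(points) >= 2  # minimum needed to extrapolate a slope
    results.append((eval_time, window_sum, has_two_points))

print(results)  # [(120, 12, False), (180, 7, False), (240, 10, True)]
```

In this sketch a plain sum is available at every step, while only the last window contains the two data points that extrapolation needs.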

Best practices for adding dashboard charts

When adding chart panels to dashboards, use the following practices to generate consistent and accurate visualizations.

Align step and time window values using $__interval

A typical use case for a chart visualization is to display the sum of observations in fixed time intervals as a bar chart. To ensure the chart value at each step represents the sum of observations within the step's start and end time, you must set the step size equal to the sliding time window's value.

To ensure the chart step size and sliding time window are equal, use the global variable $__interval as the sliding time window value.

Observability Platform calculates the $__interval value based on the query time range and pixel width of the chart to determine the chart step. Using $__interval therefore guarantees the two values are always equal.

In the dashboard panel's Settings tab:

  1. Change the display option from Line to Bar to view the time series in the bar chart format.
  2. Set Null behavior to Null as zero to represent steps that report no observations as zeroes on the chart.

It can sometimes be appropriate to use different values for a sliding time window and step. For example, a query for an SLA that defines that there should be no more than X errors in a rolling 24-hour period would set the sliding time window to 24h and the step size to 1h for one-hour resolution.
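
The rolling SLA case can be sketched in Python as a 24-hour sum evaluated once per hour. The error counts, the threshold of 15, and the function name are all hypothetical, and the left-open window boundary is an assumption.

```python
HOUR = 3600


def rolling_sum(samples, eval_time, window):
    """Sum the values whose timestamps fall in (eval_time - window, eval_time]."""
    return sum(value for ts, value in samples if eval_time - window < ts <= eval_time)


# Hypothetical sparse delta samples: (timestamp in seconds, error count).
samples = [(1 * HOUR, 4), (5 * HOUR, 10), (20 * HOUR, 3), (27 * HOUR, 6)]

max_errors = 15     # hypothetical value of X
window = 24 * HOUR  # sliding time window: 24h
step = HOUR         # step size: 1h

# Hours at which the rolling 24-hour error count exceeds the threshold.
breaches = [
    eval_time // HOUR
    for eval_time in range(step, 30 * HOUR + 1, step)
    if rolling_sum(samples, eval_time, window) > max_errors
]
print(breaches)  # [20, 21, 22, 23, 24, 27, 28]
```

Here the overlap between consecutive windows is intentional, because each step answers the question "how many errors occurred in the last 24 hours?"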

Example: Equal values for step and sliding time window

The following time series and corresponding query and chart describe the results when the sliding time window and step size are set to the same value. For this example, we can assume the $__interval value is one minute.

At each step, the query returns the sum of all data points in the one-minute range. Because the step is set to 1m, the chart correctly displays the one-minute sums in one-minute steps. Each bar height is the sum of observations within the bar's start and end time. The sum of all bars in the chart (36) equals the sum of all data points in the time series (36).

Time series:

Values:         2         3         5         7         2         5         3         6         3
       ---------|---------|---------|---------|---------|---------|---------|---------|---------|---------
  Time:        1:00:00   1:00:30   1:01:00   1:01:30   1:02:00   1:02:30   1:03:00   1:03:30   1:04:00

Query:

sum_over_time(my_counter{}[$__interval])

The resulting chart:

                                     ___________________                     ___________________
                 ___________________|                   |___________________|                   |
                |                   |                   |                   |                   |
                |                   |                   |                   |                   |
                |                   |                   |                   |                   |
                |         8         |         9         |         8         |         9         |
                |       (3+5)       |       (7+2)       |       (5+3)       |       (6+3)       |
        ________|                   |                   |                   |                   |
       |        |                   |                   |                   |                   |
       |   2    |                   |                   |                   |                   |
       ---------|-------------------|-------------------|-------------------|-------------------|---
  Time:        1:00:00             1:01:00             1:02:00             1:03:00             1:04:00
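
The bar heights in this chart can be reproduced with a short Python sketch. It assumes each window is open on the left and closed on the right, which matches the sums shown in the bars, and it expresses timestamps as seconds after 1:00:00.

```python
def sum_over_time(samples, eval_time, window):
    """Sum the values whose timestamps fall in (eval_time - window, eval_time]."""
    return sum(value for ts, value in samples if eval_time - window < ts <= eval_time)


# One data point every 30 seconds, starting at 1:00:00.
values = [2, 3, 5, 7, 2, 5, 3, 6, 3]
samples = [(index * 30, value) for index, value in enumerate(values)]

interval = 60  # assumed $__interval value, used as both window and step
bars = [sum_over_time(samples, eval_time, interval) for eval_time in range(0, 241, interval)]

print(bars)                    # [2, 8, 9, 8, 9]
print(sum(bars), sum(values))  # 36 36
```

Because the window and step are equal, every data point lands in exactly one bar.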

Example: Different values for step and sliding time window

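The chart visualization changes drastically when the sliding time window and step do not align. In the following example, the step is set to 30s and the sliding time window is set to 1m, so the chart displays a one-minute rolling sum at each 30-second step. Because consecutive windows overlap by 30 seconds, most data points are counted in two bars, and the bars no longer add up to the total of the time series.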

Time series:

Values:         2         3         5         7         2         5         3         6         3
       ---------|---------|---------|---------|---------|---------|---------|---------|---------|---------
  Time:        1:00:00   1:00:30   1:01:00   1:01:30   1:02:00   1:02:30   1:03:00   1:03:30   1:04:00

Query:

sum_over_time(my_counter{}[1m])

Assuming the step is set to 30s, the resulting chart is similar to the following:

                                     _________
                                    |         |
                                    |         |
                                    |         |
                                    |         |_________                     _________ _________
                           _________|         |         |          _________|         |         |
                          |         |         |         |_________|         |         |         |
                          |         |         |         |         |         |         |         |
                 ________ |         |         |         |         |         |         |         |
                |    5    |    8    |   12    |    9    |    7    |    8    |    9    |    9    |
                |  (2+3)  |  (3+5)  |  (5+7)  |  (7+2)  |  (2+5)  |  (5+3)  |  (3+6)  |  (6+3)  |
        ________|         |         |         |         |         |         |         |         |
       |        |         |         |         |         |         |         |         |         |
       |   2    |         |         |         |         |         |         |         |         |
       ---------|-------------------|-------------------|-------------------|-------------------|---
  Time:        1:00:00             1:01:00             1:02:00             1:03:00             1:04:00
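
The following Python sketch evaluates the same series with a 30-second step and a one-minute window to show the effect of the overlap. As before, the left-open window boundary is an assumption that matches the sums in the bars, and the helper names are hypothetical.

```python
def sum_over_time(samples, eval_time, window):
    """Sum the values whose timestamps fall in (eval_time - window, eval_time]."""
    return sum(value for ts, value in samples if eval_time - window < ts <= eval_time)


def chart_bars(samples, end_time, step, window):
    """Evaluate the windowed sum at every step from 0 through end_time."""
    return [sum_over_time(samples, eval_time, window) for eval_time in range(0, end_time + 1, step)]


# One data point every 30 seconds, starting at 1:00:00.
values = [2, 3, 5, 7, 2, 5, 3, 6, 3]
samples = [(index * 30, value) for index, value in enumerate(values)]

bars = chart_bars(samples, end_time=240, step=30, window=60)

print(bars)                    # [2, 5, 8, 12, 9, 7, 8, 9, 9]
print(sum(bars), sum(values))  # 69 36
```

Every data point except the last one appears in two consecutive windows, so the bars total 69 instead of 36. The chart is still a valid rolling sum, but it should not be read as a count of events per step.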