Aggregate records
The aggregate records processing rule transforms incoming logs into computed metrics at periodic intervals.
When the aggregate records rule waits for data to accumulate in your pipeline, it stores that data in memory. Increasing the value of the Time window parameter also increases the memory load on your pipeline. For example, if 100,000 records pass through your pipeline during the specified time period, and those records are 1 kB each, the aggregate records rule will add approximately 100 MB of memory load.
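The memory estimate above is a simple product of record count and record size. As a rough sketch (the function name is hypothetical, and it ignores any per-record overhead the pipeline itself might add), the arithmetic looks like this:

```python
def estimated_window_memory_bytes(records_per_window: int, avg_record_bytes: int) -> int:
    """Rough size of the data buffered during one time window.

    Assumption: memory load is approximated as count * average size,
    with no allowance for internal overhead.
    """
    return records_per_window * avg_record_bytes


# 100,000 records of 1 kB (1,000 bytes) each, as in the example above.
estimate = estimated_window_memory_bytes(100_000, 1_000)
print(f"{estimate / 1_000_000:.0f} MB")
```

Doubling either the time window (at a steady record rate) or the average record size doubles the estimate.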
Configuration parameters
Use the parameters in this section to configure this processing rule. The Telemetry Pipeline web interface uses the items in the Name column to describe these parameters. Pipeline configuration files use the items in the Key column as YAML keys.
Name | Key | Description | Default |
---|---|---|---|
Time window | window | Required. How long to wait (in seconds) for data to accumulate in your pipeline before computing metrics from that data. For each interval that elapses, a new set of metrics is computed. | none |
Select keys | keys | Required. Any logs that contain matching values for all of the specified keys will be grouped together during compute operations. This parameter must be formatted as a JSON array of strings, like ["keyName1","keyName2"]. | none |
Compute keys | compute | Required. The keys and computed metrics to include in your output data. This parameter must be formatted as a JSON object. For more information, see Compute syntax. | none |
Comment | comment | A custom note or description of the rule's function. This text is displayed next to the rule's name in the Actions list in the processing rules interface. | none |
Compute syntax
The aggregate records processing rule offers three compute functions: sum, average, and count. To use these functions, the Compute keys configuration parameter expects a JSON object with the following syntax:
{"newKey1":["count"],"newKey2":["average","KEY_TO_COMPUTE"],"newKey3":["sum","KEY_TO_COMPUTE"]}
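Before saving a rule, it can help to check that the Select keys and Compute keys values are shaped as described. The following is a minimal sketch, not the product's own validation: the function names are hypothetical, and the assumption that count takes no key while average and sum take exactly one is inferred from the syntax shown above.

```python
import json

# Compute function name -> expected length of its JSON array (assumed from the syntax above).
COMPUTE_FUNCTIONS = {"count": 1, "average": 2, "sum": 2}


def check_select_keys(raw: str) -> list:
    """Parse a Select keys value and confirm it is a JSON array of strings."""
    parsed = json.loads(raw)
    if not isinstance(parsed, list) or not all(isinstance(k, str) for k in parsed):
        raise ValueError("Select keys must be a JSON array of strings")
    return parsed


def check_compute_keys(raw: str) -> dict:
    """Parse a Compute keys value and confirm each entry names a known function."""
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError("Compute keys must be a JSON object")
    for new_key, spec in parsed.items():
        if not isinstance(spec, list) or not spec or not isinstance(spec[0], str):
            raise ValueError(f"{new_key}: expected an array starting with a function name")
        if spec[0] not in COMPUTE_FUNCTIONS:
            raise ValueError(f"{new_key}: unknown compute function {spec[0]!r}")
        if len(spec) != COMPUTE_FUNCTIONS[spec[0]]:
            raise ValueError(f"{new_key}: wrong number of arguments for {spec[0]!r}")
    return parsed


select_keys = check_select_keys('["keyName1","keyName2"]')
compute_keys = check_compute_keys(
    '{"newKey1":["count"],"newKey2":["average","KEY_TO_COMPUTE"],"newKey3":["sum","KEY_TO_COMPUTE"]}'
)
print(select_keys, sorted(compute_keys))
```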
The resulting computed metrics are formatted as a JSON object with the following characteristics:
{
"selectKey": "value",
"newKey1": "countMetric",
"newKey2": "averageMetric",
"newKey3": "sumMetric"
}
"selectKey": "value": This key and value match the Select keys parameter, and together represent the log groups for which metrics are computed. If Select keys includes multiple keys, all keys and their corresponding values are included.
countMetric: The number of times that "selectKey": "value" occurred within the specified time window.
averageMetric: The mean value of KEY_TO_COMPUTE within the specified time window and group.
sumMetric: The sum of all KEY_TO_COMPUTE values within the specified time window and group.
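To make the grouping and compute behavior concrete, here is a minimal Python sketch of the logic for a single time window. It is an illustration only, not the rule's actual implementation: the aggregate function, the sample records, and the choice to skip records that lack a select key are all assumptions.

```python
from collections import defaultdict


def aggregate(records, select_keys, compute):
    """Group one window of records by select_keys, then compute count, average, or sum."""
    groups = defaultdict(list)
    for record in records:
        # Assumption: records missing any select key are left out of every group.
        if all(key in record for key in select_keys):
            groups[tuple(record[key] for key in select_keys)].append(record)

    results = []
    for group_values, members in groups.items():
        row = dict(zip(select_keys, group_values))
        for new_key, spec in compute.items():
            function = spec[0]
            if function == "count":
                row[new_key] = len(members)
                continue
            # average and sum operate on the key named in the second array element.
            values = [m[spec[1]] for m in members if spec[1] in m]
            if function == "sum":
                row[new_key] = sum(values)
            elif function == "average":
                row[new_key] = sum(values) / len(values) if values else None
        results.append(row)
    return results


# Hypothetical records that all arrived within the same time window.
records = [
    {"service": "api", "latency_ms": 120},
    {"service": "api", "latency_ms": 80},
    {"service": "db", "latency_ms": 30},
]
compute = {
    "requests": ["count"],
    "meanLatency": ["average", "latency_ms"],
    "totalLatency": ["sum", "latency_ms"],
}
rows = aggregate(records, ["service"], compute)
for row in rows:
    print(row)
```

Each output row carries the select key and its value plus one entry per key defined in the compute object, which mirrors the output shape described above.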
Example
Using the aggregate records rule lets you retain information about the logs that pass through your pipeline without needing to retain each individual log. For example, given this sample JSON data:
{"name": "Sophia", "profession": "designer", "age": 29,"projects": 2}
{"name": "William", "profession": "programmer", "age": 45,"projects": 0}
{"name": "Mia", "profession": "chef", "age": 32,"projects": 5}
{"name": "Benjamin", "profession": "architect", "age": 51,"projects": 1}
{"name": "Ava", "profession": "designer", "age": 27,"projects": 1}
{"name": "Michael", "profession": "programmer", "age": 38,"projects": 3}
{"name": "Abigail", "profession": "designer", "age": 42,"projects": 0}
{"name": "Daniel", "profession": "architect", "age": 35,"projects": 4}
{"name": "Emma", "profession": "programmer", "age": 48,"projects": 1}
{"name": "Jacob", "profession": "chef", "age": 31,"projects": 2}
{"name": "Olivia", "profession": "designer", "age": 24,"projects": 2}
{"name": "Matthew", "profession": "chef", "age": 39,"projects": 0}
{"name": "Isabella", "profession": "programmer", "age": 28,"projects": 3}
{"name": "Ethan", "profession": "architect", "age": 46,"projects": 1}
{"name": "Avery", "profession": "designer", "age": 33,"projects": 1}
A processing rule with the Time window value 60, the Select keys value ["profession"], and the Compute keys value {"headcount":["count"],"averageAge":["average","age"],"totalProjects":["sum","projects"]} returns the following result:
{"averageAge":31,"headcount":5,"profession":"designer","totalProjects":6}
{"averageAge":39.75,"headcount":4,"profession":"programmer","totalProjects":7}
{"averageAge":34,"headcount":3,"profession":"chef","totalProjects":7}
{"averageAge":44,"headcount":3,"profession":"architect","totalProjects":6}
Because the original data set's profession key had four possible values, this rule created four groups (designer, programmer, chef, and architect), and then computed the following values for each group:
averageAge: The mean age value of every person within that group.
headcount: The number of people within that group.
totalProjects: The combined number of projects completed by the people within that group.
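The figures in the result can be checked by hand. As a quick sketch, the following recomputes the designer group from the sample data using plain Python; the variable names are illustrative only.

```python
# The five designer records from the sample data (name, age, and projects only).
designers = [
    {"name": "Sophia", "age": 29, "projects": 2},
    {"name": "Ava", "age": 27, "projects": 1},
    {"name": "Abigail", "age": 42, "projects": 0},
    {"name": "Olivia", "age": 24, "projects": 2},
    {"name": "Avery", "age": 33, "projects": 1},
]

headcount = len(designers)                                   # count
average_age = sum(p["age"] for p in designers) / headcount   # average of age
total_projects = sum(p["projects"] for p in designers)       # sum of projects

print(headcount, average_age, total_projects)
```

The ages total 155 across 5 people, which gives the averageAge of 31 shown in the designer row, alongside a headcount of 5 and a totalProjects of 6.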