Processing rules
Processing rules transform data as it passes through your telemetry pipeline. You can use a variety of rules to perform different operations on data after it leaves its source but before it reaches its destination.
Some example use cases for processing rules include:
- Adding a new field to each log for easier debugging and troubleshooting.
- Redacting sensitive values to preserve user privacy.
- Removing unnecessary fields to improve your data's signal-to-noise ratio.
- Converting data from one format to another.
- Turning unstructured data into structured data.
- Aggregating logs into metrics to reduce data volume while retaining key insights.
How processing rules work
Each built-in processing rule transforms data in a specific way. By using more than one processing rule together, you can create a complex sequence of transformations that's suited for your telemetry data and storage format.
Most processing rules are compatible with most data formats. Processing rules are also designed to skip logs they're incompatible with rather than displaying an error message, which means you can use rules that apply only to certain chunks of data. For example, you can apply a broad processing rule to remove a certain field even if some logs in your pipeline don't contain that field.
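The skip-instead-of-fail behavior described above can be pictured with a minimal Python sketch. This is an illustration only, not the product's implementation: the function name delete_key and the sample records are hypothetical.

```python
from typing import Any

Record = dict[str, Any]


def delete_key(record: Record, key: str) -> Record:
    """Hypothetical rule: drop `key` if present, otherwise pass the record through."""
    if key not in record:
        # Incompatible record: skip it instead of raising an error.
        return record
    return {k: v for k, v in record.items() if k != key}


# Only the first record contains the field being removed.
records = [
    {"message": "login ok", "debug_info": "trace-1"},
    {"message": "heartbeat"},
]
cleaned = [delete_key(r, "debug_info") for r in records]
print(cleaned)
```

Both records come out of the rule; the second one is simply unchanged.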
Format
Processing rules are run one at a time, from top to bottom. If you add multiple processing rules to the same pipeline for the same telemetry type, the output from your first rule becomes the input for your second rule, the output from your second rule becomes the input for your third rule, and so on.
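To make the top-to-bottom ordering concrete, here is a small Python sketch that models each rule as a function and feeds the output of one into the next. The rule names (add_env, uppercase_env) and the run_rules helper are hypothetical and exist only for illustration.

```python
from typing import Any, Callable

Record = dict[str, Any]
Rule = Callable[[Record], Record]


def add_env(record: Record) -> Record:
    """Hypothetical rule: add the same key/value pair to every record."""
    return {**record, "env": "prod"}


def uppercase_env(record: Record) -> Record:
    """Hypothetical rule: uppercase the env value, skipping records without it."""
    if "env" not in record:
        return record
    return {**record, "env": record["env"].upper()}


def run_rules(record: Record, rules: list[Rule]) -> Record:
    """Apply rules in list order; each rule receives the previous rule's output."""
    for rule in rules:
        record = rule(record)
    return record


in_order = run_rules({"msg": "hi"}, [add_env, uppercase_env])
reversed_order = run_rules({"msg": "hi"}, [uppercase_env, add_env])
print(in_order)
print(reversed_order)
```

In the second run, uppercase_env sees the record before env exists, so it has nothing to change. This is why the order of rules can affect the final output.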
Telemetry types
Requires Core Operator version 3.1.0 or later and pipeline version 24.7.3 or later.
Processing rules support logs, metrics, and traces. You can create processing rules for multiple telemetry types within the same pipeline, but each processing rule is applied only to its specified telemetry type. For example, if you create a search/replace value processing rule for metrics, this rule won't affect any logs or traces that pass through your pipeline, even if those logs or traces contain a matching key.
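The scoping of a rule to one telemetry type can be sketched in Python as follows. This is a simplified model under stated assumptions: the TypedRule class, the apply_rule helper, and the type names used as strings are all hypothetical, not part of the product.

```python
from dataclasses import dataclass
from typing import Any, Callable

Record = dict[str, Any]


@dataclass
class TypedRule:
    """Hypothetical pairing of a transformation with the telemetry type it targets."""
    telemetry_type: str  # assumed labels: "logs", "metrics", or "traces"
    transform: Callable[[Record], Record]


def replace_host(record: Record) -> Record:
    """Hypothetical search/replace rule on the host key."""
    if "host" not in record:
        return record
    return {**record, "host": record["host"].replace("staging", "prod")}


def apply_rule(rule: TypedRule, telemetry_type: str, record: Record) -> Record:
    """Run the rule only when the record's telemetry type matches the rule's type."""
    if rule.telemetry_type != telemetry_type:
        return record
    return rule.transform(record)


metrics_rule = TypedRule("metrics", replace_host)
metric_out = apply_rule(metrics_rule, "metrics", {"host": "staging-1"})
log_out = apply_rule(metrics_rule, "logs", {"host": "staging-1"})
print(metric_out, log_out)
```

The log record contains a matching key but is left untouched, because the rule was created for metrics.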
When raw log data passes through at least one processing rule, the data receives a new log field for each event. This log field lets you treat each event as a single unit of data. Structured log data such as JSON doesn't receive a log field because you can already break structured data into discrete events.
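As a rough illustration of this behavior, the following Python sketch wraps raw lines in a log field and leaves JSON objects as they are. The to_event function is hypothetical, and detecting structured data by attempting to parse it as JSON is an assumption made for the example, not a description of how the product decides.

```python
import json
from typing import Any


def to_event(line: str) -> dict[str, Any]:
    """Hypothetical helper: wrap raw text in a log field; keep JSON objects as-is."""
    try:
        parsed = json.loads(line)
    except json.JSONDecodeError:
        # Raw, unstructured text becomes a single event under the log key.
        return {"log": line}
    if isinstance(parsed, dict):
        # Already structured: no log field is added.
        return parsed
    # Valid JSON that isn't an object (for example, a bare number) is treated as raw text.
    return {"log": line}


raw_event = to_event("GET /health 200")
structured_event = to_event('{"level": "info", "msg": "started"}')
print(raw_event, structured_event)
```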
Regex engines
Requires Telemetry Pipeline version 2.9.0 or later.
Some processing rules, like Block records and Rename keys, accept regular expressions. For most of these rules, you can specify one of the following regex engines to parse your rule:
- PCRE2 (default)
- Oniguruma
- POSIX
- GNU
- TRE
Record accessor syntax
If your raw data is in JSON format, you can use record accessor syntax to extract nested fields within a larger JSON object.
To extract a nested field inside a standard object, use the following syntax:
$objectName.fieldName
To extract a nested field inside an array, use the following syntax:
$objectName.arrayName[X]
For example, given the following JSON object:
{
  "log": "1234",
  "kubernetes": {
    "pod_name": "mypod-0",
    "labels": [
      {
        "app": "myapp"
      }
    ]
  }
}
The expression $kubernetes.pod_name resolves to mypod-0, and the expression $kubernetes.labels[0] resolves to the object {"app": "myapp"}.
If the name of a field or parent object contains periods, wrap the name in quotes. For example, to access the field k8s.pod.uid inside the object attributes, use the expression $attributes."k8s.pod.uid".
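To show how these expressions map onto a JSON object, here is a minimal Python sketch of a resolver for the three forms described above: dotted names, array indexes, and quoted names. The resolve function and its parsing approach are hypothetical and cover only the syntax shown in this section; the product's own parser may behave differently in other cases.

```python
import re
from typing import Any

# One path segment: a quoted name or a bare name, followed by zero or more [index] parts.
_SEGMENT = re.compile(r'(?:"([^"]+)"|([^."\[\]]+))((?:\[\d+\])*)')


def resolve(expression: str, record: dict[str, Any]) -> Any:
    """Hypothetical resolver for expressions such as $a.b, $a.list[0], and $a."b.c"."""
    if not expression.startswith("$"):
        raise ValueError("expression must start with '$'")
    pos = 1
    current: Any = record
    while pos < len(expression):
        match = _SEGMENT.match(expression, pos)
        if match is None:
            raise ValueError(f"cannot parse {expression!r} at offset {pos}")
        quoted, bare, indexes = match.groups()
        # Look up the field name, then apply any array indexes that follow it.
        current = current[quoted if quoted is not None else bare]
        for index in re.findall(r"\[(\d+)\]", indexes):
            current = current[int(index)]
        pos = match.end()
        if pos < len(expression):
            if expression[pos] != ".":
                raise ValueError(f"expected '.' in {expression!r} at offset {pos}")
            pos += 1
    return current


record = {
    "log": "1234",
    "kubernetes": {"pod_name": "mypod-0", "labels": [{"app": "myapp"}]},
    "attributes": {"k8s.pod.uid": "abc-123"},  # sample value invented for this sketch
}
print(resolve("$kubernetes.pod_name", record))
print(resolve("$kubernetes.labels[0]", record))
print(resolve('$attributes."k8s.pod.uid"', record))
```

Quoting matters in the last call: without the quotes, the periods in k8s.pod.uid would be read as three separate levels of nesting.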
Processing rules playground
The Telemetry Pipeline web interface has a processing rules playground you can use to test and troubleshoot processing rules, including custom Lua scripts.
For safety reasons, this playground environment is isolated from the internet and has no access to internal Telemetry Pipeline resources. Any processing rules you test here won't affect your pipeline, clusters, or logging data.
Add processing rules to your pipeline
1. Sign in to the Telemetry Pipeline web interface.
2. Navigate to Core Instances, then select the pipeline to which you'd like to add a new processing rule.
3. Click Edit.
4. Click the node in the middle of the configuration diagram.
5. In the dialog that appears, select an option from the Telemetry type tab.
6. Click Add new action to open the processing rules menu.
7. Select a processing rule from the available list.
8. Configure the available settings for that processing rule, and then click Apply.
9. Optional: Repeat steps 5 through 8 to add additional processing rules. If you add multiple rules, you can drag them to change the order in which they run.
10. Optional: Add test input and then click Run actions to preview the output of your processing rules.
11. Click Apply processing rules to finalize your processing rules, and then click Save and deploy to save your pipeline settings.
Use the toggle next to a processing rule to enable or disable that rule.
Available processing rules
Telemetry Pipeline offers the following processing rules:
Processing rule | Description |
---|---|
Add/set key/value | Adds the same key/value pair to every record. |
Aggregate records | Transforms incoming logs into computed metrics at periodic intervals. |
Allow keys | Preserves any keys that match a specified regular expression, and removes all other keys. |
Allow records | Preserves any records that contain a key whose value matches a specified regular expression, and removes all other records. |
Block keys | Removes keys that match a regular expression. |
Block records | Removes records that contain a key whose value matches a specified regular expression. |
Copy keys | Copies the value of a specified source key to the value of a specified destination key. |
Custom Lua | Runs custom Lua scripts that you write to transform your telemetry data. |
Decode CSV | Transforms log data from CSV format to JSON. |
Decode JSON | Transforms an escaped JSON string into a structured JSON object. |
Deduplicate records | Searches for any records that contain identical key/value data within a specified time frame, then removes all but the earliest of those records. |
Delete key | Deletes a specified key and its associated value from all records. |
Encode CSV | Transforms log data from JSON to CSV format. |
Encode JSON | Transforms a JSON object into an escaped string. |
Extract keys/values | Uses a regular expression to search for key/value pairs inside a string, then creates a structured object to store those key/value pairs. |
Flatten subrecord | Uses regular expressions to search for key/value pairs inside a JSON object, then either moves or copies any applicable key/value pairs to the top level of the record. |
Hash key | Copies the value of a specified source key, hashes that value, then stores the hashed copy in a specified destination key. |
Join records | Combines values from multiple records into an array of values within a single record. |
Lift submap | Uses regular expressions to search for key/value pairs inside a JSON object, then either moves or copies any applicable key/value pairs out of the JSON object and into a higher level of the record. |
Multiline join | Combines multiple logs into a single log by looking for repeating patterns in log data. |
Nest keys | Moves the value of a specified source key into an object nested under a specified destination key. |
Parse | Uses a regular expression to search for values inside a string and assign a key to each value, then stores those key/value pairs in a structured object. |
Parse number | Uses a regular expression to transform a key's value from a string to a number. |
Random sampling | Preserves a percentage of records that pass through your pipeline and discards the rest. |
Redact/mask value | Obscures all or part of a specified key's value by replacing the original string with a series of repeated characters. |
Rename keys | Changes the name of a specified key. |
Search/replace value | Uses regular expressions to search for a value inside a string, then replaces that value with a different specified value. |
Split record | Splits an array of JSON objects into a series of standalone records. |
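
As a rough picture of what two of the rules in this table do to a record, the following Python sketch imitates Hash key and Redact/mask value. It is not the product's implementation. The function names, parameters, and the choice of SHA-256 are assumptions made for illustration, since this page doesn't specify the hash algorithm or the available settings.

```python
import hashlib
from typing import Any

Record = dict[str, Any]


def hash_key(record: Record, source: str, destination: str) -> Record:
    """Hypothetical Hash key: store a hash of the source value in the destination key."""
    if source not in record:
        return record
    # SHA-256 is an assumption for this sketch; the actual algorithm isn't stated here.
    digest = hashlib.sha256(str(record[source]).encode("utf-8")).hexdigest()
    return {**record, destination: digest}


def mask_value(record: Record, key: str, mask_char: str = "*", keep_last: int = 0) -> Record:
    """Hypothetical Redact/mask value: replace all or part of a value with repeated characters."""
    if key not in record:
        return record
    value = str(record[key])
    visible = value[-keep_last:] if keep_last > 0 else ""
    masked = mask_char * (len(value) - len(visible)) + visible
    return {**record, key: masked}


hashed = hash_key({"user": "alice"}, "user", "user_hash")
masked = mask_value({"card": "4111111111111111"}, "card", keep_last=4)
print(hashed)
print(masked)
```

Note that the hashing sketch keeps the original user value and adds the hashed copy alongside it, matching the description of Hash key as a copy operation.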