Extract keys/values
The extract keys/values processing rule uses a regular expression to search for key/value pairs inside a string, then creates a structured object to store those key/value pairs.
For a processing rule that performs a similar operation on escaped JSON strings, see decode JSON. For a processing rule that performs a similar operation on data not already formatted as key/value pairs, see parse.
Configuration parameters
Use the parameters in this section to configure this processing rule. The Telemetry Pipeline web interface uses the items in the Name column to describe these parameters. Pipeline configuration files use the items in the Key column as YAML keys.
Name | Key | Description | Default |
---|---|---|---|
Source key | src | Required. The key whose value contains key/value pairs to extract. | none |
Destination key | dst | Required. The key of the object to store your structured key/value pairs. If you specify the name of an existing key, the value of the key is overwritten. | none |
Regex | regex | Required. The regular expression for extracting key/value pairs from the value of Source key. This expression must have two capture groups: the result of the first capture group becomes the name of a key, and the result of the second capture group becomes that key's value. | none |
Regex engine | regexEngine | Required. The engine to parse your regular expression. Accepted values: GNU, Oniguruma, PCRE2, POSIX, TRE. | PCRE2 |
Comment | comment | A custom note or description of the rule's function. This text is displayed next to the rule's name in the Actions list in the processing rules interface. | none |
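The Regex parameter must contain two capture groups. As a rough way to check a pattern before adding it to a rule, the following sketch counts capture groups with Python's built-in re module. The helper name has_two_capture_groups is hypothetical, and Python's regex dialect is not identical to PCRE2 or the other supported engines, so treat this as an approximation rather than the pipeline's own validation.

```python
import re


def has_two_capture_groups(pattern: str) -> bool:
    """Return True if the pattern defines exactly two capturing groups.

    Hypothetical helper for illustration only. Non-capturing groups such
    as (?:...) are not counted by Pattern.groups.
    """
    return re.compile(pattern).groups == 2


# Two capturing groups separated by a literal colon.
print(has_two_capture_groups(r"(\w+):(\w+)"))

# The non-capturing group in the middle does not count as a capture group.
print(has_two_capture_groups(r"(\w+)(?:=|:)(\w+)"))

# No capturing groups at all, so this pattern would not satisfy the rule.
print(has_two_capture_groups(r"\w+:\w+"))
```

A pattern that fails this check has no clear key group and value group, so it is worth correcting before you save the rule.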
Example
The extract keys/values rule lets you extract embedded data from a string and turn it into parsable key/value pairs. You can then use these key/value pairs in other processing rules or for general storage and analysis.
For example, given the following sample website log data:
{"log": "user_id:3,page_id:30,action:purchase"}
{"log": "user_id:4,page_id:10,action:purchase"}
{"log": "user_id:1,page_id:50,action:click"}
{"log": "user_id:5,page_id:40,action:purchase"}
{"log": "user_id:1,page_id:30,action:purchase"}
{"log": "user_id:2,page_id:40,action:click"}
{"log": "user_id:3,page_id:30,action:click"}
{"log": "user_id:1,page_id:20,action:view"}
{"log": "user_id:2,page_id:50,action:purchase"}
{"log": "user_id:2,page_id:10,action:view"}
{"log": "user_id:1,page_id:50,action:view"}
A processing rule with the Source key value log, the Destination key value extracted, the Regex value (\w+)(?:\":"|\":)(\w+), and the Regex engine value PCRE2 returns the following result:
{"log":"user_id:3,page_id:30,action:purchase","extracted":{"action":"purchase","user_id":"3","page_id":"30"}}
{"log":"user_id:4,page_id:10,action:purchase","extracted":{"action":"purchase","user_id":"4","page_id":"10"}}
{"log":"user_id:1,page_id:50,action:click","extracted":{"action":"click","user_id":"1","page_id":"50"}}
{"log":"user_id:5,page_id:40,action:purchase","extracted":{"action":"purchase","user_id":"5","page_id":"40"}}
{"log":"user_id:1,page_id:30,action:purchase","extracted":{"action":"purchase","user_id":"1","page_id":"30"}}
{"log":"user_id:2,page_id:40,action:click","extracted":{"action":"click","user_id":"2","page_id":"40"}}
{"log":"user_id:3,page_id:30,action:click","extracted":{"action":"click","user_id":"3","page_id":"30"}}
{"log":"user_id:1,page_id:20,action:view","extracted":{"action":"view","user_id":"1","page_id":"20"}}
{"log":"user_id:2,page_id:50,action:purchase","extracted":{"action":"purchase","user_id":"2","page_id":"50"}}
{"log":"user_id:2,page_id:10,action:view","extracted":{"action":"view","user_id":"2","page_id":"10"}}
{"log":"user_id:1,page_id:50,action:view","extracted":{"action":"view","user_id":"1","page_id":"50"}}
This rule extracted key/value pairs from the string stored in the log key and stored those key/value pairs in a new structured object named extracted.
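To make the behavior concrete, the following is a minimal Python sketch that mimics the rule on one of the sample records. It is an illustration and not the pipeline's implementation: the function extract_keys_values and its parameter names are hypothetical, it uses Python's re module instead of PCRE2, and it uses a simplified pattern, (\w+):(\w+), chosen as an assumption to match the unescaped sample strings rather than the Regex value from the example. It also assumes the source key is present in every record.

```python
import json
import re


def extract_keys_values(record: dict, src: str, dst: str, regex: str) -> dict:
    """Mimic the extract keys/values rule on a single record.

    Hypothetical helper: the first capture group of each match becomes a
    key and the second becomes that key's value. Any existing value
    stored under dst is overwritten.
    """
    pattern = re.compile(regex)
    if pattern.groups != 2:
        raise ValueError("regex must have exactly two capture groups")

    result = dict(record)  # shallow copy so the input record is not modified
    result[dst] = {
        match.group(1): match.group(2)
        for match in pattern.finditer(str(record[src]))
    }
    return result


record = json.loads('{"log": "user_id:3,page_id:30,action:purchase"}')

# Simplified pattern (an assumption): word characters on both sides of a colon.
out = extract_keys_values(record, src="log", dst="extracted", regex=r"(\w+):(\w+)")
print(json.dumps(out))
```

The original log value is kept alongside the new extracted object, and every extracted value is a string, as in the result shown above. The order of keys inside the extracted object can differ from the pipeline's output.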