Derived labels
Derived labels are a Chronosphere Observability Platform-specific construct.
They’re non-destructive and are specifically designed for efficient operation on
individual time series at scale. For example, you might have a service named
k8s_cluster, and another named kubernetes_cluster, with the intent that they’re
both related to the same thing. Use derived labels to standardize on one name for the
same service, usable across Observability Platform.
Unlike Prometheus relabel rules, which overwrite existing data, derived labels standardize your label names without overwriting them permanently.
Metrics from different sources can use the same labels to mean different things, or
different labels to mean the same thing. For example, MySQL metrics could use the
cluster label for a service it provides, gRPC-based application metrics use
grpc_service, while the rest of your applications use standard-service as a label
for one or more processes. From the end user’s perspective, these all mean service.
Sometimes you need a combination of variables to determine if these are the same
information from different applications. Maybe the staging MySQL cluster hosts many
services, but production has a dedicated cluster per service.
Relabel rules are the language that Prometheus provides to tune scraping, determine
which time series to persist, and modify a time series before persisting it. To
modify a time series, you can use relabel rules to update a metric’s target_label
or to update multiple labels. Relabel rules overwrite existing labels, removing
labels previously associated with a metric.
Chronosphere exposes this construct directly in the Collector. Relabel rules are, by design, metric-centric. To make a change to a particular label on all metrics, edit the relabel rules for every scrape job.
Derived labels are a Chronosphere-specific construct. Unlike relabel rules, they are non-destructive and are specifically designed for efficient operation on time series at scale.
Instead of using regular expressions for matching the time series to operate on, derived labels use the same glob matching expressions used by drop rules, aggregation rules, and traffic shaping pools.
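As a rough illustration of glob-based matching, the following Python sketch checks a time series against a set of label filters. It uses the standard library fnmatch module as a stand-in for glob semantics, which is an assumption: Chronosphere's glob implementation might differ in details. The matches_filters helper and the sample series are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_filters(labels, filters):
    """Return True when every filter's glob matches the series' label value.

    `filters` maps a label name to a glob, mirroring name/value_glob pairs.
    A series that lacks a filtered label never matches.
    """
    return all(
        name in labels and fnmatchcase(labels[name], glob)
        for name, glob in filters.items()
    )


# Hypothetical series used only for this illustration.
series = {"__name__": "grpc_requests_total", "instance": "auth-3g8-kl9m"}

grpc_match = matches_filters(series, {"__name__": "grpc_*"})
gateway_match = matches_filters(series, {"instance": "gateway-*"})
print(grpc_match, gateway_match)
```

A glob describes only the shape of a value. It has no capture groups, which is why extracting part of a value remains a relabel rule task.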
Use derived labels to augment a time series after it has been persisted. Derived labels are non-destructive, meaning they don’t make permanent changes to the underlying time series. If you remove a derived label, the label goes away, and the underlying time series remain. A relabel rule, however, permanently changes the labels applied to a time series, and the change can’t be undone.
There are two types of derived labels:
- A mapping derived label creates a new label name and pulls values from some other label on the time series. Value mapping applies to mapping derived labels. Chronosphere recommends using mapping derived labels.
- A constructed derived label creates a label name and values where one didn’t previously exist. By definition, Chronosphere already creates values for a constructed derived label, so value mapping doesn’t apply to constructed derived labels. Chronosphere recommends using mapping derived labels instead of constructed labels.
Differences between relabeling and deriving
Relabel rules | Derived labels |
---|---|
Use regular expressions (regex): flexible and allows more transformations than derived labels. | Uses globs: more efficient, matches what other Chronosphere entities use. |
Used for dropping metrics based on keep and drop rules. | Not supported, but backend drop rules support the same. |
Distributed across many Collector and service monitor configurations. | One single configuration applying to all metrics. |
Driven by transformation and not the result. | Centered around what the user wants to define. |
Allows extracting values from label values. | Doesn’t support extracting values. |
Overwrites existing labels. | Adds to existing labels. |
When to use relabel versus derived
If you’re not sure whether to use relabel rules or derived labels, use the following guidelines to help you decide.
Derived labels don’t apply to certain Chronosphere-generated metrics, which ensures the system performs as expected.
Relabel rules
- You want to remove existing labels and replace them with one or more new labels.
- You need to drop time series and scrape targets.
- You want to control configuration at the Collector. For example, you want to edit the configuration for a single service using a service monitor.
- You need to control data sent to Chronosphere. For example, dropping data to save network cost.
- You want to do a complex label modification operation, like using arbitrary regular expressions with capture groups.
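To make the capture group case concrete, here is a minimal Python sketch of a regex-based replace on one label set. It's an approximation of a Prometheus-style replace action, not Prometheus or Collector code, and the replace_action helper and its parameters are hypothetical names chosen for the illustration.

```python
import re


def replace_action(labels, source_label, regex, target_label, replacement):
    """Return a copy of `labels` with a regex-based replace applied."""
    # Prometheus anchors relabel regexes at both ends, so use fullmatch.
    match = re.fullmatch(regex, labels.get(source_label, ""))
    if match is None:
        return dict(labels)  # no match: labels pass through unchanged
    updated = dict(labels)
    # r"\1" is Python's spelling of the first capture group ($1 in Prometheus).
    updated[target_label] = match.expand(replacement)
    return updated


series = {"__name__": "http_requests_total", "instance": "auth-1a2-b3c4"}
relabeled = replace_action(series, "instance", r"([a-z]+)-.*", "service", r"\1")
print(relabeled["service"])
```

The value of the new label is extracted from the instance value itself. A glob can test whether a value starts with auth- but can't extract the auth prefix.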
Derived labels
- You want to retroactively change the labels on a previously emitted time series in a non-destructive way.
- Fixing the source or scrape location is difficult. For instance, the data source is in a customer environment, or changing scrape configuration is prohibitively expensive in your environment.
- You want to manage the label configuration in a label-centric way. For example, to add a label to all of your metrics with some value based on the source labels, relabel rules require changing the scrape configuration for every service, while a derived label needs only one definition.
View derived labels
Use Chronoctl to view existing derived labels.
To view all labels:
chronoctl derived-labels list
To list specific labels using their slugs:
chronoctl derived-labels list --slugs slug_name_1,slug_name_2
Manage derived labels
Create, update, and delete derived labels using Terraform, Chronoctl, or the Observability Platform API.
Creating, modifying, or deleting a derived label can cause unexpected behavior in any location that label was used. Adding a derived label is adding an extra label. Rules that expect a specific set of labels might not match when the derived label is present.
Create a derived label
The value_glob is the label pattern being matched. This example matches the patterns:
m3coordinator-read*
m3coordinator-write*
m3coordinator-admin*
These values end with *, which matches any sequence of characters. These patterns display
under a single derived label defined by the label_name, which is tier.
You can provide multiple definitions for a value with different value_glob patterns.
Chronosphere tries them in order of definition.
This example includes a constructed derived label and a mapping derived label. Both follow the same construction rules.
To create a label with Chronoctl:
- Create a YAML file with the desired labels. To generate a templated example resource, run the derived-labels scaffold command:
  chronoctl derived-labels scaffold
  You can redirect the output to a file for editing:
  chronoctl derived-labels scaffold > derived-label.yaml
- Run this command:
  chronoctl apply -f derived-label.yaml
This is an example definition file for Chronoctl:
api_version: v1/config
kind: DerivedLabel
spec:
  name: Test Constructed Label
  slug: test-constructed-label
  label_name: tier
  description: this is a test
  metric_label:
    constructed_label:
      value_definitions:
        - value: read
          filters:
            - name: instance
              value_glob: m3coordinator-read*
        - value: write
          filters:
            - name: instance
              value_glob: m3coordinator-write*
---
api_version: v1/config
kind: DerivedLabel
spec:
  name: Test Mapping Label
  slug: test-mapping-label
  label_name: chronosphere_service
  description: this is a test
  metric_label:
    mapping_label:
      name_mappings:
        - filters:
            - name: __name__
              value_glob: grpc_*
          source_label: grpc_service
        - filters:
            - name: __name__
              value_glob: envoy_*
          source_label: backend_service
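The ordered evaluation of value definitions in the constructed tier label can be sketched in Python as follows. This is a simplified model, not Chronosphere's implementation: it assumes the first definition whose filters all match supplies the value, and it uses fnmatch as a stand-in for glob matching. The constructed_value helper and the sample instance names are hypothetical.

```python
from fnmatch import fnmatchcase

# (value, filters) pairs in definition order, mirroring value_definitions.
VALUE_DEFINITIONS = [
    ("read", {"instance": "m3coordinator-read*"}),
    ("write", {"instance": "m3coordinator-write*"}),
]


def constructed_value(labels, definitions=VALUE_DEFINITIONS):
    """Return the value of the first definition whose filters all match.

    Assumption: definitions are tried in order and the first match wins.
    Returns None when no definition matches, so no label is derived.
    """
    for value, filters in definitions:
        if all(fnmatchcase(labels.get(n, ""), g) for n, g in filters.items()):
            return value
    return None


read_tier = constructed_value({"instance": "m3coordinator-read-0"})
write_tier = constructed_value({"instance": "m3coordinator-write-3"})
no_tier = constructed_value({"instance": "m3dbnode-1"})
print(read_tier, write_tier, no_tier)
```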
Delete a derived label
Delete a derived label with Chronoctl by using the chronoctl derived-labels delete
command, specifying the slug of the derived label to delete.
For example, to delete the derived label with slug slug_name_1:
chronoctl derived-labels delete slug_name_1
Use derived labels
In the following examples, the http_requests_total and grpc_requests_total
metrics both have a label indicating they’re part of a Kubernetes cluster, but they
use different label names. Standardize this label to make it easier for end users to
consume. For example, users don’t need to know what label each metric emits when
there’s only one standardized label. When a user wants to join these metrics on the
cluster label, they can join on the standardized label instead of having to create
a complicated query with label_replace.
Mapping derived labels
The following example creates a mapping derived label called cluster, which gets
its values from the source label kubernetes_cluster if a metric matches the glob
__name__:grpc_*. Similarly, it gets its values from the source label k8s_cluster
if a metric matches the glob __name__:http_*.
This example assumes these time series as a starting point:
http_requests_total{k8s_cluster="production", method="get", instance="auth-1a2-b3c4"}
http_requests_total{k8s_cluster="canary", method="put", instance="gateway-4s5-9f8b"}
grpc_requests_total{kubernetes_cluster="production", method="get", instance="gateway-0h8-6m2f"}
grpc_requests_total{kubernetes_cluster="canary", method="put", instance="auth-3g8-kl9m"}
api_version: v1/config
kind: DerivedLabel
spec:
  name: cluster mapping label
  slug: cluster-mapping-label
  label_name: cluster
  description: this is a mapping label for cluster
  metric_label:
    mapping_label:
      name_mappings:
        - filters:
            - name: __name__
              value_glob: grpc_*
          source_label: kubernetes_cluster
        - filters:
            - name: __name__
              value_glob: http_*
          source_label: k8s_cluster
If you query http_requests_total{cluster="production"}, the resulting time series is:
{__name__="http_requests_total", cluster="production", k8s_cluster="production", method="get", instance="auth-1a2-b3c4"}
If you query grpc_requests_total{cluster="canary"}, the resulting time series is:
{__name__="grpc_requests_total", cluster="canary", kubernetes_cluster="canary", method="put", instance="auth-3g8-kl9m"}
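The query results for the cluster mapping label can be reproduced with a small Python model. This sketch only illustrates the behavior described here, under the assumption that the derived label is computed as a view over the stored labels. The with_derived and query helpers are hypothetical, and fnmatch stands in for glob matching.

```python
from fnmatch import fnmatchcase

# (filters, source_label) pairs mirroring name_mappings for the cluster label.
NAME_MAPPINGS = [
    ({"__name__": "grpc_*"}, "kubernetes_cluster"),
    ({"__name__": "http_*"}, "k8s_cluster"),
]

SERIES = [
    {"__name__": "http_requests_total", "k8s_cluster": "production",
     "method": "get", "instance": "auth-1a2-b3c4"},
    {"__name__": "http_requests_total", "k8s_cluster": "canary",
     "method": "put", "instance": "gateway-4s5-9f8b"},
    {"__name__": "grpc_requests_total", "kubernetes_cluster": "production",
     "method": "get", "instance": "gateway-0h8-6m2f"},
    {"__name__": "grpc_requests_total", "kubernetes_cluster": "canary",
     "method": "put", "instance": "auth-3g8-kl9m"},
]


def with_derived(labels, label_name="cluster"):
    """Return a copy of `labels` plus the derived label, if a mapping applies."""
    view = dict(labels)  # the stored series is never modified
    for filters, source_label in NAME_MAPPINGS:
        matched = all(fnmatchcase(labels.get(n, ""), g) for n, g in filters.items())
        if matched and source_label in labels:
            view[label_name] = labels[source_label]
            break
    return view


def query(metric, **matchers):
    """Select series by metric name and exact label matchers on the view."""
    views = (with_derived(s) for s in SERIES)
    return [v for v in views
            if v["__name__"] == metric
            and all(v.get(k) == val for k, val in matchers.items())]


http_production = query("http_requests_total", cluster="production")
grpc_canary = query("grpc_requests_total", cluster="canary")
print(http_production)
print(grpc_canary)
```

The stored series in SERIES never gain a cluster key. The label exists only in the view, which models the non-destructive behavior.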
Value mapping for a label that’s both physical and derived
There can be situations where a derived label definition includes a label name that already exists on a metric.
If there’s an existing metric label with the same name as the derived label:
- If existing_label_policy = KEEP, the label that already exists on the metric is used instead of the derived label.
- If existing_label_policy = OVERRIDE, the derived label is used instead of the label that already exists on the metric.
- If existing_label_policy isn’t explicitly set, Chronosphere defaults to the KEEP behavior.
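The policy rules above can be sketched as a small Python function. This models only the three rules listed, under the assumption that the derived value has already been computed. The resolve helper and the sample label values are hypothetical.

```python
def resolve(physical_labels, label_name, derived_value, policy="KEEP"):
    """Return the labels a query would see for one series.

    `policy` mirrors existing_label_policy and defaults to KEEP,
    matching the default described in the list above.
    """
    view = dict(physical_labels)  # never mutate the stored labels
    if derived_value is None:
        return view  # nothing was derived for this series
    if label_name in physical_labels and policy == "KEEP":
        return view  # the physical label is used
    view[label_name] = derived_value  # OVERRIDE, or no physical label exists
    return view


physical = {"__name__": "grpc_metric", "service": "labrador"}

kept = resolve(physical, "service", "labrador-retriever")
overridden = resolve(physical, "service", "labrador-retriever", policy="OVERRIDE")
print(kept["service"], overridden["service"])
```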
The following examples start with this set of time series:
{__name__="grpc_metric", service="labrador", instance="blah"}
{__name__="grpc_metric", service="golden", instance="blah"}
{__name__="envoy_metric", microservice="Labrador", instance="blah"}
{__name__="envoy_metric", microservice="GOLDEN", instance="blah"}
Standardize label names
Different teams or services provide source data, some of which might not conform to the intended labeling standard. For cleaner reviews, you can map all of the provided labels into a standardized space. Nothing can be added to the target space without explicit mapping, which preserves and augments the original labeling.
For example:
Source | Target |
---|---|
TS1, service=labrador | TS1, service=labrador, standard-service=labrador-retriever |
TS2, service=golden | TS2, service=golden, standard-service=golden-retriever |
TS3, microservice=Labrador | TS3, microservice=Labrador, standard-service=labrador-retriever |
Top-level value mappings
Top-level value mappings are mappings that apply to all name mappings, and not to a
specific name mapping. This MappingLabel includes top-level value mappings in the
value_mappings section.
label_name: standard_service
mapping_label:
  name_mappings:
    - filters:
        - name: __name__
          value_glob: grpc_*
      source_label: service
    - filters:
        - name: __name__
          value_glob: envoy_*
      source_label: microservice
  value_mappings:
    - source_value_globs:
        - labrador
        - Labrador
      target_value: labrador-retriever
    - source_value_globs:
        - golden
        - GOLDEN
      target_value: golden-retriever
If you use the query grpc_metric{standard_service="labrador-retriever"},
the resulting time series is:
{__name__="grpc_metric", service="labrador", standard_service="labrador-retriever", instance="blah"}
For another query, envoy_metric{standard_service="labrador-retriever"}, the resulting
time series is:
{__name__="envoy_metric", microservice="Labrador", standard_service="labrador-retriever", instance="blah"}
If you query using a non-existent target value such as
grpc_metric{standard_service="labrador"}, it returns this time series:
{__name__="grpc_metric", service="labrador", standard_service="labrador-retriever", instance="blah"}
Value mappings inside a name mapping
Value mappings defined inside a name mapping apply only to that name mapping.
For example:
label_name: standard_service
mapping_label:
  name_mappings:
    - filters:
        - name: __name__
          value_glob: grpc_*
      source_label: service
      value_mappings:
        - source_value_globs:
            - labrador
            - Labrador
          target_value: labrador-retriever
    - filters:
        - name: __name__
          value_glob: envoy_*
      source_label: microservice
If you query grpc_metric{standard_service="labrador-retriever"} with this
configuration, the target value labrador-retriever applies only to
metrics that match __name__:grpc_*.
The resulting time series is:
{__name__="grpc_metric", service="labrador", standard_service="labrador-retriever", instance="blah"}
Consider the query envoy_metric{standard_service="labrador-retriever"}. According
to the configuration, the value mapping for labrador-retriever applies only to
metrics that match __name__:grpc_*.
This query returns no results because:
- standard_service="labrador-retriever" isn’t a physical label-value pair on envoy_metric.
- There is no top-level value mapping where labrador-retriever is the target value. There’s also no name mapping specific value mapping where labrador-retriever is the target value for metrics that match __name__:envoy_*.
Value mappings inside a name mapping combined with top-level value mappings
You can use top-level value mappings and name mapping-level value mappings in the same configuration. For example:
label_name: standard_service
mapping_label:
  name_mappings:
    - filters:
        - name: __name__
          value_glob: grpc_*
      source_label: service
      value_mappings:
        - source_value_globs:
            - labrador
            - Labrador
          target_value: labrador-retriever
    - filters:
        - name: __name__
          value_glob: envoy_*
      source_label: microservice
  value_mappings:
    - source_value_globs:
        - golden
        - GOLDEN
      target_value: golden-retriever
In this example, the target value labrador-retriever applies only to metrics
that match __name__:grpc_* because its value mapping belongs to that specific name
mapping, whereas the target value golden-retriever applies to all name mappings.
The query grpc_metric{standard_service="labrador-retriever"} results in the time
series:
{__name__="grpc_metric", service="labrador", standard_service="labrador-retriever", instance="blah"}
Similarly, for the query envoy_metric{standard_service="golden-retriever"}, the
resulting time series is:
{__name__="envoy_metric", microservice="GOLDEN", standard_service="golden-retriever", instance="blah"}
However, querying envoy_metric{standard_service="labrador-retriever"} returns no
values, because:
- standard_service="labrador-retriever" isn’t a physical label-value pair on envoy_metric.
- There is no top-level value mapping where labrador-retriever is the target value. There’s also no name mapping that has a value mapping where labrador-retriever is the target value for metrics that match __name__:envoy_*.
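The interaction between the two levels of value mappings can be modeled with a short Python sketch. It rests on several assumptions that are labeled in the code: value mappings inside a name mapping are checked before top-level ones, a source value with no matching value mapping yields no derived value, the top-level mapping sends golden and GOLDEN to golden-retriever, and fnmatch approximates glob matching. The standard_service and first_target helpers are hypothetical.

```python
from fnmatch import fnmatchcase

# Name mappings with optional value mappings of their own.
NAME_MAPPINGS = [
    {"name_glob": "grpc_*", "source_label": "service",
     "value_mappings": [(["labrador", "Labrador"], "labrador-retriever")]},
    {"name_glob": "envoy_*", "source_label": "microservice",
     "value_mappings": []},
]
# Assumed top-level value mappings, applied to every name mapping.
TOP_LEVEL = [(["golden", "GOLDEN"], "golden-retriever")]


def first_target(value, value_mappings):
    """Return the target of the first value mapping with a matching glob."""
    for globs, target in value_mappings:
        if any(fnmatchcase(value, g) for g in globs):
            return target
    return None


def standard_service(labels):
    """Derive standard_service for one series, or None if nothing maps."""
    for mapping in NAME_MAPPINGS:
        source = mapping["source_label"]
        if fnmatchcase(labels["__name__"], mapping["name_glob"]) and source in labels:
            value = labels[source]
            # Assumption: name mapping-level value mappings are checked first.
            return (first_target(value, mapping["value_mappings"])
                    or first_target(value, TOP_LEVEL))
    return None


grpc_labrador = standard_service({"__name__": "grpc_metric", "service": "labrador"})
grpc_golden = standard_service({"__name__": "grpc_metric", "service": "golden"})
envoy_labrador = standard_service({"__name__": "envoy_metric", "microservice": "Labrador"})
envoy_golden = standard_service({"__name__": "envoy_metric", "microservice": "GOLDEN"})
print(grpc_labrador, grpc_golden, envoy_labrador, envoy_golden)
```

In this model, the envoy series with microservice="Labrador" gets no standard_service value, which is why a query for labrador-retriever on envoy_metric returns nothing.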
If there is a top-level value mapping with the same target value as a name mapping specific value mapping, like this situation:
label_name: standard_service
mapping_label:
  name_mappings:
    - filters:
        - name: __name__
          value_glob: grpc_*
      source_label: service
      value_mappings:
        - source_value_globs:
            - labrador
            - Labrador
          target_value: golden-retriever
    - filters:
        - name: __name__
          value_glob: envoy_*
      source_label: microservice
  value_mappings:
    - source_value_globs:
        - golden
        - GOLDEN
      target_value: golden-retriever
In this case, both value mappings have the same target value of golden-retriever.
The name mapping specific value mappings take precedence over the top-level value
mappings.
Constructed derived labels
As an example, you might want to query each of these metrics by the service that the HTTP requests and gRPC requests originated from. There’s no service label, but there is an instance label that has the name of the server instance the requests originated from. In Chronosphere’s experience, this situation is rare, and Chronosphere recommends that you use mapping derived labels whenever possible.
This example creates a constructed derived label called service. If a metric matches
the glob instance:auth-*, the service label has the value auth. Similarly, if a
metric matches the glob instance:gateway-*, the service label has the value gateway.
This example assumes the use of these time series as a starting point:
http_requests_total{k8s_cluster="production", method="get", instance="auth-1a2-b3c4"}
http_requests_total{k8s_cluster="canary", method="put", instance="gateway-4s5-9f8b"}
grpc_requests_total{kubernetes_cluster="production", method="get", instance="gateway-0h8-6m2f"}
grpc_requests_total{kubernetes_cluster="canary", method="put", instance="auth-3g8-kl9m"}
api_version: v1/config
kind: DerivedLabel
spec:
  name: Service constructed label
  slug: service-constructed-label
  label_name: service
  description: this is a constructed derived label for service
  metric_label:
    constructed_label:
      value_definitions:
        - value: auth
          filters:
            - name: instance
              value_glob: auth-*
        - value: gateway
          filters:
            - name: instance
              value_glob: gateway-*
Usage
If you query http_requests_total{service="auth"}, the resulting time series is:
{__name__="http_requests_total", k8s_cluster="production", method="get", instance="auth-1a2-b3c4", service="auth"}
If you query grpc_requests_total{service="gateway"}, the resulting time series is:
{__name__="grpc_requests_total", kubernetes_cluster="production", method="get", instance="gateway-0h8-6m2f", service="gateway"}