Reduce cardinality

When you're first using Chronosphere Observability Platform, or a new app or service comes online, you might see cardinality spikes. Cardinality spikes can occur when:

  • A metric or group of metrics has an unexpectedly large number of labels or label values.
  • A process or service creates many similarly named metrics.
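To see why either pattern inflates cardinality, it can help to count time series directly. The following is a minimal sketch, not part of Observability Platform: it assumes each unique combination of metric name and label values is one time series, and the metric and label names (`http_requests_total`, `method`, `status`, `user_id`) are hypothetical.

```python
from itertools import product


def count_series(metric_name, label_values):
    """Count the series produced if every label value combination occurs.

    label_values maps each label key to the list of values it can take.
    This is an upper bound: real workloads rarely emit every combination.
    """
    keys = sorted(label_values)
    combos = product(*(label_values[k] for k in keys))
    return len({(metric_name, combo) for combo in combos})


# Hypothetical metric with two low-cardinality labels: 2 * 3 combinations.
base = {"method": ["GET", "POST"], "status": ["200", "404", "500"]}
base_count = count_series("http_requests_total", base)

# Adding one label with 1,000 distinct values multiplies the series count.
with_user = dict(base, user_id=[str(i) for i in range(1000)])
spike_count = count_series("http_requests_total", with_user)

print(base_count, spike_count)
```

Because the counts multiply, a single label with many distinct values can dominate the cardinality of an otherwise small metric.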

Cardinality spikes can cause storage and licensing issues. To reduce cardinality, or to reduce data storage for less important metrics:

  1. Find a problematic metric or label.
  2. Review that metric or label's usage.
  3. Decide what to do with it, such as dropping it or rolling it up.

Find a metric and inspect the associated labels

The Metric Growth dashboard can help you find metrics and labels that have increased in cardinality during recent time periods.

The Live Telemetry Analyzer provides real-time insight into current incoming metrics. Sort metrics by Unique Values to find metrics with potentially high cardinality.

For example, suppose you want to reduce the cardinality of a metric. The first step of the decision process is understanding the targeted metric and its associated labels.

If you have administrative privileges:

  1. In the navigation menu, click Go to Admin and then select Analyzers > Live Telemetry Analyzer.
  2. The analyzer defaults to __name__. Sort Label values by name, or add a label filter.
  3. In the Labels section, inspect the incoming label keys. The Unique Values column shows how many distinct values are incoming for a given label (cardinality), and Appears In shows how frequently that label is attached to the metric.

When the number of unique values for a label is high, that label contributes significantly to the cardinality of the metric.
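The two columns can be approximated offline from a sample of series. The sketch below is an illustration under stated assumptions, not how the analyzer is implemented: each series is a plain dictionary of labels, "appears in" is taken to mean the fraction of sampled series that carry the label, and the metric, label names, and values are hypothetical.

```python
from collections import defaultdict


def label_stats(series):
    """Return {label: (unique_values, appears_in_fraction)} for a list of label sets."""
    values = defaultdict(set)
    counts = defaultdict(int)
    for labels in series:
        for key, value in labels.items():
            values[key].add(value)
            counts[key] += 1
    total = len(series)
    return {key: (len(values[key]), counts[key] / total) for key in values}


# Hypothetical sample: the last series has no "pod" label.
series = [
    {"__name__": "http_requests_total", "method": "GET", "pod": "api-1"},
    {"__name__": "http_requests_total", "method": "GET", "pod": "api-2"},
    {"__name__": "http_requests_total", "method": "POST", "pod": "api-3"},
    {"__name__": "http_requests_total", "method": "POST"},
]
stats = label_stats(series)

# List labels from most to fewest unique values.
for label, (unique, appears_in) in sorted(stats.items(), key=lambda kv: -kv[1][0]):
    print(f"{label}: unique={unique} appears_in={appears_in:.0%}")
```

In this sample, `pod` has the most unique values, so it is the first candidate to review.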

Review metric and label usage

After identifying a high-cardinality label, you need to understand whether this label is meaningful, or if it can be safely removed.

To verify dropping a label is safe, use one of the following methods:

  • The Telemetry Usage Analyzer displays a Utility score, providing insight into which metrics users find important.

  • Chronoctl includes a search command that filters previously defined configurations for references matching a metric regular expression. Use it to determine whether a metric is referenced within the organization, and where it's used.
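The idea behind a reference search can be sketched without the tool itself. The following does not call Chronoctl or reproduce its behavior. It assumes you have exported dashboard and monitor queries as text, and the configuration names, queries, and the `find_references` helper are all hypothetical.

```python
import re


def find_references(configs, metric_pattern):
    """Return names of configs whose query text references a matching metric."""
    pattern = re.compile(metric_pattern)
    # Metric names in PromQL-style queries are identifier-like tokens.
    token = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")
    hits = []
    for name, query in configs.items():
        if any(pattern.fullmatch(t) for t in token.findall(query)):
            hits.append(name)
    return sorted(hits)


# Hypothetical exported queries, keyed by configuration name.
configs = {
    "dashboard/api-overview": "sum by (method) (rate(http_requests_total[5m]))",
    "monitor/high-latency": "histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))",
    "dashboard/nodes": 'node_cpu_seconds_total{mode="idle"}',
}
refs = find_references(configs, r"http_requests_.*")
print(refs)
```

An empty result suggests, but doesn't prove, that nothing depends on the metric. Ad hoc queries and external consumers won't appear in exported configurations.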

Remove the identified label

If, after identifying a label, you find that label isn't used in any dashboards or alerts, consider reducing or removing the label using drop, mapping, or rollup rules.

Validation

For rollup rules, preview the shaping impact to review and confirm your changes, so that you don't delete metrics and labels that still matter.

Post-validation tasks
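Before relying on a preview, it can help to reason about what a rollup does to a small sample. The sketch below is a simplified stand-in, not the platform's preview feature: it assumes a sum aggregation that removes one label, and the labels and values are hypothetical.

```python
from collections import defaultdict


def rollup_sum(samples, drop_labels):
    """Sum sample values across series after removing the given labels."""
    rolled = defaultdict(float)
    for labels, value in samples:
        # Series that differ only in dropped labels collapse into one key.
        kept = tuple(sorted((k, v) for k, v in labels.items() if k not in drop_labels))
        rolled[kept] += value
    return dict(rolled)


# Hypothetical samples: (labels, value) pairs for one metric.
samples = [
    ({"method": "GET", "pod": "api-1"}, 10.0),
    ({"method": "GET", "pod": "api-2"}, 5.0),
    ({"method": "POST", "pod": "api-1"}, 2.0),
    ({"method": "POST", "pod": "api-3"}, 1.0),
]
rolled = rollup_sum(samples, drop_labels={"pod"})

print(f"series before: {len(samples)}, after: {len(rolled)}")
for labels, value in sorted(rolled.items()):
    print(dict(labels), value)
```

The totals per `method` are preserved while the per-`pod` breakdown is lost, which is the trade-off to confirm before applying a rule.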

Return to the Live Telemetry Analyzer and search for your metric. If you've used a rollup rule, it can take some time before your rolled-up metric appears.

For rolled-up metrics, it often makes sense to drop the raw data if that data isn't needed. This reduces cardinality and data storage requirements.

After you've validated your rule, apply the rule using your selected method.