Metrics dictionary

This dictionary defines metrics that are created by and specific to Chronosphere Observability Platform. These metrics are often included in default dashboards, and you can search for them anywhere you use metrics. Use these curated metrics to track essential information about your Observability Platform instance.

The Chronosphere Health Check dashboard includes links to the Collectors, Usage Dashboard, and Licensing Information dashboards.

Query these metrics as their respective Prometheus type.
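
As a rough illustration of what querying by type can look like, the following Python sketch maps each Prometheus metric type to a common PromQL query shape. The helper name promql_for, the 5m window, and the 0.99 quantile are assumptions made for this example, not Observability Platform defaults. The PromQL functions it emits (rate and histogram_quantile) are standard Prometheus functions.

```python
def promql_for(metric: str, metric_type: str, window: str = "5m",
               quantile: float = 0.99) -> str:
    """Return a typical PromQL expression for a metric of the given type.

    Hypothetical helper for illustration. The window and quantile defaults
    are assumptions, not values taken from the platform.
    """
    kind = metric_type.lower()
    if kind == "gauge":
        # Gauges hold a current value, so query them directly.
        return metric
    if kind == "counter":
        # Counters only increase, so query their per-second rate of change.
        return f"rate({metric}[{window}])"
    if kind == "summary":
        # Summaries expose precomputed quantiles through the quantile label.
        # Which quantile values exist depends on the emitting service.
        return f'{metric}{{quantile="{quantile}"}}'
    if kind == "histogram":
        # Histograms expose _bucket series; estimate a quantile from bucket rates.
        return (f"histogram_quantile({quantile}, "
                f"sum by (le) (rate({metric}_bucket[{window}])))")
    raise ValueError(f"unsupported metric type: {metric_type}")


if __name__ == "__main__":
    print(promql_for("process_cpu_seconds_total", "Counter"))
    print(promql_for("chronocollector_gateway_push_latency", "Summary"))
```

The first call prints rate(process_cpu_seconds_total[5m]), and the second prints a selector that filters chronocollector_gateway_push_latency on its quantile label.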

Capacity limits

Licensing capacity is based on your telemetry types and usage. The following tables describe metrics used for specific sections of your license.

Chronosphere recommends creating alerts using the existing Capacity Limit metrics, which are also used in the Metrics License Consumption dashboard. Use alerts to be notified when you're close to or over 100% of your license limit and therefore at risk of experiencing drops.
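
One way to reason about such an alert is to express consumption as a fraction of the license limit. The following Python sketch does this for sample values. The function name license_status, the 90% warning threshold, and the sample numbers are assumptions for illustration only. In practice, the two inputs would come from querying a _consumed metric and its matching _limit metric.

```python
def license_status(consumed: float, limit: float, warn_ratio: float = 0.9) -> tuple:
    """Return (utilization percent, status) for a consumed value and its limit.

    Hypothetical helper. warn_ratio is an assumed threshold for "close to"
    the limit, not a value defined by the platform.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    utilization = consumed / limit
    if utilization >= 1.0:
        status = "over"      # at or above 100% of the limit: at risk of drops
    elif utilization >= warn_ratio:
        status = "warning"   # close to the limit
    else:
        status = "ok"
    return round(utilization * 100, 1), status


if __name__ == "__main__":
    # Sample values only; real values come from the license metrics.
    for consumed in (250.0, 950.0, 1200.0):
        print(consumed, license_status(consumed, 1000.0))
```

Keeping the threshold as a parameter makes it easy to alert earlier for limits that tend to fill quickly.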

Persisted cardinality

The following metrics apply to persisted cardinality, which is a cumulative measure that calculates the sum of the unique time series of the persisted writes that Observability Platform stores, seen over the last 2.5 hours only.

| Metric name | Description |
| --- | --- |
| chrono_metrics_persisted_cardinality_license_limit | License limit for active persisted time series cardinality. |
| chrono_metrics_persisted_cardinality_license_capacity | Capacity limit for active persisted time series cardinality. |
| chrono_metrics_persisted_cardinality_license_consumed | Consumption of the persisted write cardinality limit by datapoint type. |

Query the following metric to understand if data is actively being dropped:

chrono_metrics_persisted_license_dpps_dropped{limit="persisted_cardinality"}

Persisted writes

The following metrics apply to persisted writes, which are writes to the Observability Platform database.

| Metric name | Description |
| --- | --- |
| chrono_metrics_persisted_writes_license_dpps_limit | License limit for persisted write DPPS by datapoint type. |
| chrono_metrics_persisted_writes_license_dpps_capacity | Capacity limit for persisted write DPPS by datapoint type. |
| chrono_metrics_persisted_writes_license_dpps_consumed | Consumption rate in DPPS of persisted write license by datapoint type. |

For each of these metrics, you can query by datapoint_type, such as histogram or standard. For the _dpps_capacity and _dpps_consumed metrics, you can additionally query by pool_name and priority.

For example, the following query returns the consumption rate in DPPS of your persisted write license for histogram data points on the Auth Services pool. Because priority isn't specified, the query returns a series for each priority:

chrono_metrics_persisted_writes_license_dpps_consumed{datapoint_type="histogram", pool_name="Auth Services"}

To return only high-priority histogram data points for the Auth Services pool, add priority="high" to your query:

chrono_metrics_persisted_writes_license_dpps_consumed{datapoint_type="histogram", pool_name="Auth Services", priority="high"}
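
If you generate such queries programmatically, a small helper can assemble the selector from a metric name and a set of label matchers. The following Python sketch is one possible approach. The function name build_selector is hypothetical, and the sketch supports only exact-match (=) matchers.

```python
def build_selector(metric: str, **labels: str) -> str:
    """Build a PromQL selector from a metric name and exact-match labels.

    Hypothetical helper for illustration; it handles only the = matcher.
    """
    if not labels:
        return metric
    # Escape backslashes and double quotes so label values stay valid
    # PromQL string literals.
    matchers = ", ".join(
        '{}="{}"'.format(name, value.replace("\\", "\\\\").replace('"', '\\"'))
        for name, value in labels.items()
    )
    return f"{metric}{{{matchers}}}"


if __name__ == "__main__":
    metric = "chrono_metrics_persisted_writes_license_dpps_consumed"
    # One series per priority, because priority is not specified.
    print(build_selector(metric, datapoint_type="histogram",
                         pool_name="Auth Services"))
    # Only high priority series.
    print(build_selector(metric, datapoint_type="histogram",
                         pool_name="Auth Services", priority="high"))
```

The two calls reproduce the two queries shown in this section, with labels emitted in the order they are passed.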

Query the following metric to understand if data is actively being dropped:

chrono_metrics_persisted_license_dpps_dropped{limit="persisted_writes"}

Matched writes

The following metrics apply to matched writes, which are the number of writes per second being matched for transformation and reshaping by the Observability Platform aggregation tier.

| Metric name | Description |
| --- | --- |
| chrono_metrics_matched_writes_license_dpps_limit | License limit for matched write DPPS by datapoint type. |
| chrono_metrics_matched_writes_license_dpps_capacity | Capacity limit for matched write DPPS by datapoint type. |
| chrono_metrics_matched_writes_license_dpps_consumed | Consumption rate in DPPS of matched write license by datapoint type. |

Query the following metric to understand if data is actively being dropped:

chrono_metrics_matched_license_dpps_dropped

Legacy licensing metrics

The following table lists legacy metrics that might still be present in your environment. These metrics are being replaced by the persisted writes, persisted cardinality, and matched writes metrics described in the preceding sections.

| Metric name | Metric type | Description | Tags provided during dashboard creation |
| --- | --- | --- | --- |
| limit_service_cardinality_count (replaced by chrono_metrics_persisted_cardinality_license_consumed) | Counter | Current cardinality count across all Collectors. | chronosphere_service |
| limit_service_licensed_cardinality_limit (replaced by chrono_metrics_persisted_cardinality_license_limit) | Counter | Current cardinality limit across all Collectors. | chronosphere_service |
| limit_service_licensed_persist_limit (replaced by chrono_metrics_persisted_writes_license_dpps_limit) | Counter | Current limit for data points persisted in the database across all Collectors, as defined in the contract. | chronosphere_service |
| limit_service_capacity_limit | Counter | Current capacity limit for data points persisted in the database across all Collectors, based on grant by Chronosphere. | chronosphere_service |
| limit_service_licensed_processing_limit | Counter | Current limit for processed data points across all Collectors. | chronosphere_service |
| limit_service_persisted_count (replaced by chrono_metrics_persisted_writes_license_dpps_consumed) | Counter | Total number of data points persisted in database. | chronosphere_service |
| limit_service_processed_count | Counter | Current count of processed data points across all Collectors. | chronosphere_service |
| limit_service_matched_limit (replaced by chrono_metrics_matched_writes_license_dpps_limit) | Counter | Current license limit for matched write DPPS by datapoint type. | chronosphere_service |
| limit_service_capacity_limit (replaced by chrono_metrics_matched_writes_license_dpps_capacity) | Counter | Current capacity limit for matched write DPPS by datapoint type. | chronosphere_service |
| chronosphere_rule_metrics_matched (replaced by chrono_metrics_matched_writes_license_dpps_consumed) | Counter | Consumption rate in DPPS of matched write license by datapoint type. | chronosphere_service |

Collectors

The Collectors dashboard includes the following metrics that Collectors generate. Use this dashboard to monitor the health of your Collectors.

| Metric name | Metric type | Description | Tags provided during dashboard creation |
| --- | --- | --- | --- |
| chronocollector_build_information | Gauge | Metrics relating to current build of Collectors. | branch, build_date, build_version, chronosphere_k8s_cluster, chronosphere_k8s_container_port, chronosphere_k8s_namespace, cluster, go_version, hostname, instance, job, k8s_cluster_id, pod_name, namespace, region, revision, service |
| chronocollector_gateway_push_errors | Counter | Current total number of push errors from Collector. | chronosphere_k8s_cluster, chronosphere_k8s_container_port, chronosphere_k8s_namespace, component, environment, hostname, instance, job, k8s_cluster_id, namespace, region, service |
| chronocollector_gateway_push_latency | Summary | Latency of pushed writes by Collector. | chronosphere_k8s_cluster, chronosphere_k8s_container_port, chronosphere_k8s_namespace, component, environment, instance, job, k8s_cluster_id, namespace, quantile, region, service |
| chronocollector_gateway_push_success | Counter | Total number of metrics successfully pushed to the Chronosphere gateway. | annotationsPrefix, cluster, component, env, environment, instance, job, node, region, service, service_account, team, version |
| chronocollector_gateway_write_success | Counter | Total number of metrics successfully written to the Chronosphere gateway. | annotationsPrefix, cluster, component, env, environment, instance, job, node, region, service, service_account, team, version |
| chronocollector_k8s_gatherer_processor_targets_active | Gauge | Current number of active targets Collector is scraping. | environment, instance, job, k8s_cluster_id, namespace, region, service |
| process_cpu_seconds_total | Counter | Current total number of seconds of CPU processing time. | environment, instance, job, k8s_cluster_id, namespace, node, region, service |

Query overview

The Chronosphere Query Overview dashboard includes the following metrics. Use this dashboard to identify resource-intensive alert or recording groups.

| Metric name | Metric type | Description | Tags provided during dashboard creation |
| --- | --- | --- | --- |
| permits_quota | Counter | Amount of resources used for querying time series. | chronosphere_k8s_namespace, endpoint, instance, job, permit, pod_name, source |
| permits_throttled | Counter | Amount of throttling applied to queries. | chronosphere_k8s_namespace, endpoint, instance, job, permit, pod_name, source |
| permits_wait_total | Counter | Amount of time spent waiting to access querying resources. | chronosphere_k8s_cluster, chronosphere_k8s_namespace, endpoint, instance, job, permit, pod_name, source |
| prometheus_rule_group_last_duration_seconds | Histogram | The total time the group took to complete its last iteration, in seconds. | chronosphere_k8s_cluster, chronosphere_k8s_namespace, instance, job, pod_name, rule_group |

Policy statistics

The following usage metrics apply to policy statistics.

| Metric name | Metric type | Description | Tags provided during dashboard creation |
| --- | --- | --- | --- |
| chrono_policies_count | Counter | Tracks actions for ingestion policies, grouped by the name of the policy. | dropped, policy_name, type |
| chrono_policies_total | Counter | Tracks actions for ingestion policies with any naming policy. | dropped, policy_name, type |

Shaping usage statistics

The following usage metrics apply to shaping statistics.

| Metric name | Metric type | Description | Tags provided during dashboard creation |
| --- | --- | --- | --- |
| chrono_poolstats_count | Counter | Shaping statistics that include pool information. | drop_reason, dropped, type |
| chrono_poolstats_total | Counter | Total shaping statistics without any tag information. | drop_reason, dropped, type |
| chrono_poolstats_sampling | Counter | Emitted only when the number of unique usage statistics values exceeds the configured maximum allowed tags. | node, type |

Usage statistics

The Usage Dashboard includes the following usage statistics metrics. Use this dashboard to identify who is contributing most to your Chronosphere usage and manage your overall usage.

| Metric name | Metric type | Description | Tags provided during dashboard creation |
| --- | --- | --- | --- |
| chrono_usagestats_count | Counter | Usage statistics grouped by tags. | drop_reason, dropped, type |
| chrono_usagestats_total | Counter | Total usage statistics without any grouping. | drop_reason, dropped, type |
| chrono_usagestats_count_sampling | Counter | Emitted only when the number of unique usage statistics values exceeds the configured maximum allowed tags. | node, type |

Other usage statistics metrics count usage by metric name and label name:

| Metric name | Metric type | Description |
| --- | --- | --- |
| chrono_datapoints_by_metric_per_second | Gauge | Contains the metric_name label. Emits the average data points per second over the last two minutes by metric name. |
| chrono_datapoints_by_label_per_second | Gauge | Contains the label_name label. Emits the average data points per second over the last two minutes by label name. |
| chrono_unique_label_values_count | Gauge | Contains the label_name label. Emits the unique values seen over the last two minutes, by label name. |
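
To see which metric names or label names contribute the most data points, the per-second gauges in this table can be ranked with the standard PromQL topk function. The following Python sketch only builds the query string. The helper name top_contributors_expr, the default of 10 results, and the sum by aggregation are assumptions for illustration.

```python
def top_contributors_expr(dimension: str, k: int = 10) -> str:
    """Build a topk query over the per-metric or per-label data point gauges.

    Hypothetical helper. dimension is "metric" or "label"; k is the number
    of series to return.
    """
    # Map each dimension to the gauge and grouping label described above.
    sources = {
        "metric": ("chrono_datapoints_by_metric_per_second", "metric_name"),
        "label": ("chrono_datapoints_by_label_per_second", "label_name"),
    }
    if dimension not in sources:
        raise ValueError("dimension must be 'metric' or 'label'")
    if k < 1:
        raise ValueError("k must be at least 1")
    metric, label = sources[dimension]
    # Sum away any other labels so each name appears once, then rank.
    return f"topk({k}, sum by ({label}) ({metric}))"


if __name__ == "__main__":
    print(top_contributors_expr("metric"))
    print(top_contributors_expr("label", k=5))
```

Because these metrics are gauges that already report a per-second average, the sketch queries them directly and does not wrap them in rate.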

Service token usage

The following metric applies to service account tokens:

| Metric name | Metric type | Description |
| --- | --- | --- |
| chrono_api_token_requests_total | Counter | Monitors the number of requests made with a service account token. Can be inaccurate if more than 1000 service accounts are in use. |

The email label for this metric corresponds to the email field queryable in the service accounts API.

Google Cloud integration

The following metrics apply to Observability Platform's Google Cloud integration:

| Metric name | Metric type | Description |
| --- | --- | --- |
| chrono_gcp_integration_shards_total | Gauge | The number of Google Cloud metric shards successfully ingested. |
| chrono_gcp_integration_active_shards_total | Gauge | The number of active Google Cloud metric shards successfully ingested. |
| chrono_gcp_integration_data_points_total | Counter | The total number of data points ingested. |
| chrono_gcp_integration_metric_descriptors | Gauge | List of all metric descriptors ingested, indicated by value of 1. |