Collector service discovery for Prometheus metrics
The Collector supports several mechanisms to discover which applications to scrape Prometheus metrics from:
- ServiceMonitors (each node)
- Kubernetes annotations (each node)
- Prometheus service discovery (per cluster)
Using ServiceMonitors or Kubernetes annotations (or a combination of both) is recommended for most deployments.
Use push-based collection mechanisms for use cases where jobs can't be scraped automatically, such as AWS Lambda, Google Cloud Functions, or ephemeral batch jobs.
ServiceMonitors
ServiceMonitors are defined by a custom resource definition (CRD) that you can use to specify scrape configurations and options in a separate Kubernetes resource.
Discovery is scoped to the targets on the local node by default, which requires you to deploy the Collector as a DaemonSet for this method of service discovery.
Prerequisites
Run the following command to install the ServiceMonitor CRD from the full Prometheus Operator, using the file in the kube-prometheus-stack Helm chart:
kubectl apply -f https://raw.githubusercontent.com/prometheus-community/helm-charts/e46dc6360b6733299452c8fd65d304004484de79/charts/kube-prometheus-stack/crds/crd-servicemonitors.yaml
Chronosphere supports only fields in version 0.44.1 of the Prometheus Operator.
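Before continuing, you can confirm that the CRD registered successfully. This is an optional sanity check using standard kubectl:

kubectl get crd servicemonitors.monitoring.coreos.com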
Enable ServiceMonitor discovery
Chronosphere Collector v0.104.0 and later uses the Kubernetes EndpointSlices API by default. Upgrading from Chronosphere Collector v0.103.0 or earlier without configuring EndpointSlices can result in failures if you've enabled endpoint discovery but haven't configured the use of EndpointSlices.
To enable ServiceMonitor discovery in the Collector, make the following configuration changes:
- Add the following options to the ClusterRole resource in the manifest under rules:

kind: ClusterRole
rules:
  - apiGroups:
      - monitoring.coreos.com
    resources:
      - servicemonitors
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - discovery.k8s.io
    resources:
      - endpointslices
    verbs:
      - get
      - list
      - watch
- Enable the ServiceMonitors feature of the Collector by setting the following keys to true in the manifest ConfigMap under the discovery > kubernetes key:

discovery:
  kubernetes:
    enabled: true
    serviceMonitorsEnabled: true
    endpointsDiscoveryEnabled: true
    useEndpointSlices: true
    podMatchingStrategy: VALUE
- serviceMonitorsEnabled: Indicates whether to use ServiceMonitors to generate job configurations.
- endpointsDiscoveryEnabled: Determines whether to discover Endpoints. Requires serviceMonitorsEnabled to be set to true.
- useEndpointSlices: Use EndpointSlices instead of Endpoints. Requires serviceMonitorsEnabled and endpointsDiscoveryEnabled to be set to true. EndpointSlices use fewer resources than Endpoints. EndpointSlices are available with Chronosphere Collector v0.85.0 or later and Kubernetes v1.21 or later. Chronosphere Collector v0.104.0 and later exclusively use EndpointSlices and ignore the useEndpointSlices setting.
- podMatchingStrategy: Determines how to use ServiceMonitors and annotations when discovering targets. Accepts the following settings for VALUE:
  - all: Allows any and all scrape jobs to be registered for a single pod.
  - annotations_first: Matches annotations first. If no matches return, then other matching can occur.
  - service_monitors_first: Matches ServiceMonitors first. If no matches return, then other matching can occur.
  - service_monitors_only: Matches ServiceMonitors only.
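Once discovery is enabled, the Collector picks up ServiceMonitor resources that match its selector. The following is a minimal sketch of a ServiceMonitor it could discover; the example-app name, monitoring namespace, and metrics port name are placeholders, not values the Collector requires:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: monitoring
  labels:
    app.kubernetes.io/name: example-app
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: example-app
  endpoints:
    - port: metrics
      interval: 30s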
Pod-based ServiceMonitor discovery
If you use a version of Kubernetes that doesn't support endpoint slices, you can set endpointsDiscoveryEnabled to false to run the Collector in a mode that doesn't discover Kubernetes endpoint slices or service resources. In this mode, the Collector can still discover scrape targets using ServiceMonitors under specific circumstances, depending on the Kubernetes resource configuration. The Collector uses the Pod's labels as the Service's labels. If the Pod's labels match the Service's labels, the Collector discovers targets through a ServiceMonitor that uses targetPort (container port) to indicate the port to scrape.
Because this discovery method can be very resource intensive, do not use this method for most deployments. Instead, contact Chronosphere Support for more information about pod-based ServiceMonitor discovery.
Run as a DaemonSet with ServiceMonitors
If you want to run the Collector as a DaemonSet and scrape kube-state-metrics
through a Collector running as a Deployment, you need to update the manifest for both
Collector instances.
In your DaemonSet, add the serviceMonitor > serviceMonitorSelector key to your manifest and define the following matchExpressions to ensure that your DaemonSet only matches on ServiceMonitors that don't contain kube-state-metrics:
serviceMonitor:
serviceMonitorSelector:
matchAll: false
matchExpressions:
- label: app.kubernetes.io/name
operator: NotIn
values:
- kube-state-metrics
In your Deployment, add the same key and definitions to your manifest, but set the operator value of the matchExpressions attribute to In. This setting ensures that your Deployment only matches on ServiceMonitors that contain kube-state-metrics:
serviceMonitor:
serviceMonitorSelector:
matchAll: false
matchExpressions:
- label: app.kubernetes.io/name
operator: In
values:
- kube-state-metrics
Match specific ServiceMonitors
By default, the Collector ingests metrics from all ServiceMonitor sources. To match specific instances, use a series of AND match rules under the serviceMonitor > serviceMonitorSelector key and set matchAll under the serviceMonitorSelector key to false.
serviceMonitorSelector:
matchAll: false
The available match rules are:
- matchLabelsRegexp: Labels and a regular expression to match a value. For example:

matchLabelsRegexp:
  labelone: '[a-z]+'
- matchLabels: Labels and a matching value. For example:

matchLabels:
  labelone: foo
- matchExpressions: Depending on the operator set, labels that exist or don't exist, or have or don't have specific values. For example:

  - To match ServiceMonitors that have the examplelabel with values a or b, use the In operator:

    matchExpressions:
      - label: examplelabel
        operator: In
        values:
          - a
          - b

  - To match ServiceMonitors that have the examplelabel without values a or b, use the NotIn operator. The NotIn operator also matches any ServiceMonitors without the examplelabel present:

    matchExpressions:
      - label: examplelabel
        operator: NotIn
        values:
          - a
          - b

  - To match ServiceMonitors that have the examplelabel with any value, use the Exists operator:

    matchExpressions:
      - label: examplelabel
        operator: Exists

  - To match ServiceMonitors that don't have the examplelabel, use the DoesNotExist operator:

    matchExpressions:
      - label: examplelabel
        operator: DoesNotExist
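These match rules combine with AND semantics, so you can pair them to narrow discovery further. As a sketch (the team label and its value are illustrative, and this assumes your Collector version accepts multiple rule types in one selector):

serviceMonitorSelector:
  matchAll: false
  matchLabels:
    team: platform
  matchExpressions:
    - label: app.kubernetes.io/name
      operator: NotIn
      values:
        - kube-state-metrics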
Match endpoints without pods using ServiceMonitors
The default Collector configuration isn't suitable if you want to discover endpoints but lack access to Pod information. For example, if you want to:
- Monitor the Kubernetes API server, which doesn't run on the same node as Kubernetes workloads.
- Monitor endpoints that can be running anywhere in the cluster, but without using a Collector running as a DaemonSet.
- Discover and scrape kube-state-metrics, which listens to the Kubernetes API server and generates metrics about deployments, nodes, and pods.
If you're monitoring endpoints but don't have access to Pod information, the
ServiceMonitor
can't use the TargetPort
attribute to target the endpoint and
must instead use the Port
attribute.
In these cases, run the Collector as a Kubernetes Deployment with a single instance, and set the allowSkipPodInfo attribute to true.
serviceMonitor:
allowSkipPodInfo: true
Use this attribute with caution. Setting allowSkipPodInfo
to true
on a DaemonSet
can cause every Collector in the DaemonSet to attempt to scrape every endpoint in the
cluster, or cause duplicate scrapes.
Kubernetes annotations
Discovery is scoped to the targets on the local node by default, which requires you to deploy the Collector as a DaemonSet for this method of service discovery.
For the Collector to start scraping the Pods in a Kubernetes cluster, set the
following annotations
on each Pod in the cluster:
spec:
template:
metadata:
annotations:
prometheus.io/scrape: 'true'
prometheus.io/port: '{port_number}'
The following manifest is an example of using these two annotations for a basic Node Exporter deployment. Based on these annotations, by default, the Collector starts scraping the /metrics endpoint on port 9100.
apiVersion: apps/v1
kind: DaemonSet
metadata:
labels:
app.kubernetes.io/name: node-exporter
app.kubernetes.io/version: v1.0.1
name: node-exporter
namespace: monitoring
spec:
selector:
matchLabels:
app.kubernetes.io/name: node-exporter
template:
metadata:
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9100"
labels:
app.kubernetes.io/name: node-exporter
app.kubernetes.io/version: v1.0.1
spec:
containers:
- image: quay.io/prometheus/node-exporter:v1.0.1
name: node-exporter
...
You can set additional annotations to control other scrape options. For a complete list of supported annotations, read the scrape configuration documentation.
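For example, the relabeling rules later on this page honor a prometheus.io/path annotation, so a Pod that serves metrics on a custom path could be annotated as follows (the port and path values are illustrative; confirm the annotation against the scrape configuration documentation):

spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8080'
        prometheus.io/path: '/custom-metrics'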
You can change the annotation prefix, which defaults to prometheus.io/, from the kubernetes > processor section of the Collector ConfigMap.
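As a rough sketch, a custom prefix might be configured like this; the annotationsPrefix key name is a hypothetical placeholder, so check the configuration reference for your Collector version before using it:

kubernetes:
  processor:
    # Hypothetical key name; verify against your Collector's configuration reference.
    annotationsPrefix: 'example.io/'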
After any changes, send the updated manifest to the cluster with the following command:
kubectl apply -f path/to/manifest.yml
If you modify a Collector manifest, you must update it in the cluster and restart the Collector.
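For example, if your Collector runs as a DaemonSet, you can trigger the restart with kubectl; the chronosphere-collector name and chronosphere namespace are placeholders for your deployment's values:

kubectl rollout restart daemonset/chronosphere-collector -n chronosphere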
Prometheus service discovery
If you use Prometheus service discovery within Kubernetes, deploy a single Collector as a Kubernetes Deployment per cluster. This avoids every Collector instance duplicating scrapes to all endpoints defined in the Prometheus service discovery configuration.
To enable Prometheus service discovery, set discovery.prometheus.enabled to true in the Collector config. Provide the list of scrape configs in the discovery.prometheus.scrape_configs section. The following example uses the kubernetes_sd_config.
discovery:
prometheus:
enabled: true
scrape_configs:
- job_name: kubernetes-pods
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 30s
metrics_path: /metrics
scheme: http
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels:
[__meta_kubernetes_pod_annotation_prometheus_io_scrape]
separator: ;
regex: 'true'
replacement: $1
action: keep
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
separator: ;
regex: (.+)
target_label: __metrics_path__
replacement: $1
action: replace
- source_labels:
[__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
separator: ;
regex: ([^:]+)(?::\d+)?;(\d+)
target_label: __address__
replacement: $1:$2
action: replace
- separator: ;
regex: __meta_kubernetes_pod_label_(.+)
replacement: $1
action: labelmap
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: kubernetes_namespace
replacement: $1
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: kubernetes_pod_name
replacement: $1
action: replace
For details, see the Prometheus scrape configuration documentation. For a complete list of examples, see the examples section in the Prometheus GitHub repository.
Set the Collector to scrape its own metrics
For the Collector to scrape its own metrics, add another job to the
discovery.prometheus.scrape_configs
key:
…
- job_name: 'collector'
scrape_interval: 30s
scrape_timeout: 30s
static_configs:
- targets: ['0.0.0.0:3030']
…
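To spot-check that the job has something to scrape, you can port-forward to a Collector Pod and request the metrics endpoint directly; the Pod name is a placeholder, and port 3030 assumes the listen address shown in the example above:

kubectl port-forward pod/chronosphere-collector-abc12 3030:3030
curl http://localhost:3030/metrics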