OpenTelemetry Collector configuration

You can configure the OpenTelemetry Collector to send metric and tracing data to Chronosphere.

Chronosphere accepts OpenTelemetry Protocol (OTLP) metrics from OpenTelemetry Collectors by using the gRPC OTLP Exporter.

Chronosphere recommends not sending OTLP metrics directly from the OpenTelemetry Client SDK.

All traffic must be sent over HTTPS.

This procedure adds each configuration section separately. For a complete file, which can be modified as needed, see the Full OpenTelemetry configuration example.

  1. Pull the OpenTelemetry Docker image to run the OpenTelemetry Collector in a Docker container. Mount your config.yaml over the image's default configuration at /etc/otelcol-contrib/config.yaml so the Collector loads your settings:

    docker pull otel/opentelemetry-collector-contrib:VERSION
    docker run -v "$(pwd)/config.yaml:/etc/otelcol-contrib/config.yaml" \
      otel/opentelemetry-collector-contrib:VERSION

    Replace VERSION with the version of the OpenTelemetry Collector you want to run, which must be version 0.83 or later.

  2. In the OpenTelemetry Collector config.yaml file, configure the exporters section. Specify an endpoint that points to your Chronosphere tenant, and include the Chronosphere API token you created as an HTTP header.

    You can modify the OpenTelemetry Collector configuration if you want to change the defaults. For example, you can reference a config.yaml in a different location, such as a Kubernetes ConfigMap.

    exporters:
      otlp/chronosphere:
        endpoint: {{ .Values.chronosphereAddress }}
        retry_on_failure:
          enabled: false
        compression: gzip
        headers:
          API-Token: API_TOKEN
    • .Values.chronosphereAddress: Your Chronosphere instance address, which is your company name followed by .chronosphere.io:443. For example, MY_COMPANY.chronosphere.io:443.
    • API_TOKEN: The API token generated from your service account, specified as an HTTP header. Chronosphere recommends storing your API token in a separate file or Kubernetes Secret and reading it through an environment variable, such as $API_TOKEN.
    • retry_on_failure: Set enabled to false, because the Chronosphere OTLP ingest API doesn't signal retryable errors.

    The Chronosphere OpenTelemetry endpoint supports gzip compression only.
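
    If you store the token in an environment variable as recommended, the exporter configuration can reference it directly. The Collector supports ${env:VAR} substitution in configuration values; the following sketch assumes the variable is named API_TOKEN and reuses the placeholder tenant address from above:

    ```yaml
    exporters:
      otlp/chronosphere:
        endpoint: MY_COMPANY.chronosphere.io:443
        compression: gzip
        headers:
          # Substituted with the API_TOKEN environment variable at startup
          API-Token: ${env:API_TOKEN}
    ```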

  3. Configure batch processing. Sending telemetry in batches improves data compression and reduces the number of outgoing connections required to transmit the data.

    For example, the following configuration sends a batch when it reaches 1000 items, waits at most one second before sending a partial batch, and caps batches at 2000 items:

    processors:
      batch:
        timeout: 1s
        send_batch_size: 1000
        send_batch_max_size: 2000

    The timeout, batch size, and batch max size are default recommendations. Monitor the exporter send and enqueueing failure metrics to tune these parameters based on your workload.

  4. Add the OTLP exporter and the batch processor to the metrics and traces pipeline definitions.

     service:
       pipelines:
         metrics:
           receivers: [otlp]
           processors: [batch]
           exporters: [otlp/chronosphere]
         traces:
           receivers: [otlp]
           processors: [batch]
           exporters: [otlp/chronosphere]
  5. Instruct the OpenTelemetry Collector to load the API token from an environment variable. Store the API_TOKEN value in a Kubernetes Secret, and bind the Secret to an environment variable on the Pod, which is a well-supported, secure pattern in Kubernetes.

    ⚠️

    You should never share or store your API token in plain text. Chronosphere recommends using tools like SOPS to securely store this information.

       # In the OpenTelemetry Collector Deployment, bind the Secret to the
       # API_TOKEN environment variable:
         env:
           - name: API_TOKEN
             valueFrom:
               secretKeyRef:
                 name: chronosphere-api-token
                 key: apiToken
     
       # The accompanying Secret. The stringData field accepts a plain text
       # value, which Kubernetes stores base64-encoded in data:
         apiVersion: v1
         kind: Secret
         metadata:
           name: chronosphere-api-token
           namespace: <INSERT YOUR NAMESPACE>
         type: Opaque
         stringData:
           apiToken: <plaintext here>
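
If you populate the Secret's data field instead of stringData, the value must be base64-encoded first. A minimal sketch, using a hypothetical placeholder token:

```shell
# Hypothetical token value for illustration only; never commit real tokens.
API_TOKEN='example-token'

# Secret `data` values must be base64-encoded (stringData accepts plain text).
printf '%s' "$API_TOKEN" | base64
```

Alternatively, kubectl create secret generic chronosphere-api-token --from-literal=apiToken="$API_TOKEN" performs the encoding for you.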

Map resource attributes to Prometheus job and instance

Chronosphere uses the OpenTelemetry service.name and service.instance.id resource attributes as the values for the Prometheus job and instance source labels.

  • service.name -> job - (recommended) Chronosphere sets the job label value to the service.name resource attribute value. Chronosphere sets the job label to unknown when the metric does not include a service.name resource attribute.
  • service.instance.id -> instance - (required) Chronosphere rejects the metric when the service.instance.id resource attribute is missing.
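
As an illustration of this mapping, a metric arriving with the following resource attributes (hypothetical values) produces the labels shown in the comments:

```yaml
# Resource attributes on an incoming OTLP metric (illustrative values):
service.name: checkout-service
service.instance.id: host-1234
# Resulting Prometheus source labels:
#   job="checkout-service"
#   instance="host-1234"
```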

To ensure all metrics have a valid service.instance.id attribute, copy an existing unique resource attribute, such as host.name or pod.id, into service.instance.id.

To define a default service.instance.id resource attribute:

  1. Add a resource detection processor. The resource detection processor detects resource information from the environment and appends that information to, or overrides, the resource values in telemetry data. Configure the processor based on the environment where the Collector is deployed, and review the resource detection processor documentation to select the detectors for your environment. For example, if the Collector is deployed on a host, the system detector is the best option to gather information about the host.

       processors:
         resourcedetection:
           detectors: [env, system]
           timeout: 2s
           override: true
  2. Define a resource attribute processor. In the processors section, define a rule to add service.instance.id. The following example maps the host.name resource attribute to a new service.instance.id resource attribute. Depending on how resource detection is configured, your environment might have other resource attributes available, such as k8s.node.uid, which could serve as the instance identifier.

       resource/service-instance:
         attributes:
           - key: service.instance.id
             from_attribute: host.name
             action: insert
  3. Add the resource detection and resource attribute processors to your metrics pipeline. In the service.pipelines.metrics.processors list, add the processors defined in the previous steps before the batch processor:

    processors: [resourcedetection, resource/service-instance, batch]
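
Taken together, the three steps above yield a metrics pipeline along these lines (a sketch; your pipeline might include additional processors):

```yaml
service:
  pipelines:
    metrics:
      receivers: [otlp]
      # Detect resource attributes, derive service.instance.id, then batch
      processors: [resourcedetection, resource/service-instance, batch]
      exporters: [otlp/chronosphere]
```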

Send operational metrics about the OpenTelemetry Collector

The OpenTelemetry Collector exposes metrics about its operations using a Prometheus scrape endpoint. Chronosphere uses OpenTelemetry Collector metrics in the OpenTelemetry Ingestion & Health dashboard to describe Collector health.

To send operational metrics about your OpenTelemetry Collectors to Chronosphere:

  1. Define a Prometheus receiver that scrapes the Collector's own metrics endpoint:

       receivers:
         prometheus/otel-collector-self-scrape:
           config:
             scrape_configs:
               - job_name: 'otel-collector-self-scrape'
                 scrape_interval: 30s
                 static_configs:
                   - targets: ['0.0.0.0:8888']
  2. Add the Prometheus receiver to the metrics pipeline:

      metrics:
        receivers: [otlp, prometheus/otel-collector-self-scrape]
  3. Expose the Collector's internal metrics endpoint in the service telemetry settings:

       service:
         telemetry:
           metrics:
             address: "0.0.0.0:8888"
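
Combined with a dedicated pipeline, the self-scrape configuration looks like the following sketch. The pipeline name metrics/internal is arbitrary; any metrics/NAME suffix works:

```yaml
service:
  telemetry:
    metrics:
      address: "0.0.0.0:8888"  # expose the Collector's own metrics
  pipelines:
    metrics/internal:
      receivers: [prometheus/otel-collector-self-scrape]
      processors: [batch]
      exporters: [otlp/chronosphere]
```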

Next steps

Chronosphere should begin ingesting data. Verify the Collector is receiving metrics and traces. You can also configure head sampling to determine whether to drop a span or trace as early as possible.

If you encounter issues, see our troubleshooting page.

Full OpenTelemetry configuration example

The following example is the complete configuration file, containing all of the required sections described previously:

receivers:
  # OTLP receiver
  otlp:
    protocols:
      grpc:
      http:
 
  # Prometheus receiver configured to scrape the collector's own metrics
  prometheus/otel-collector-self-scrape:
    config:
      scrape_configs:
      - job_name: 'otel-collector-self-scrape'
        scrape_interval: 30s
        static_configs:
        - targets: ['0.0.0.0:8888']
 
exporters:
  # OTLP exporter configured to send telemetry to your Chronosphere tenant
  # Replace the endpoint and API-Token with your values
  otlp/chronosphere:
    endpoint: "your-tenant.chronosphere.io:443"
    retry_on_failure:
      enabled: false
    compression: gzip
    headers:
      API-Token: "YOUR-API-KEY"
 
processors:
  # Detect environment information to include as telemetry attributes
  # Configure the detector from the list of available detectors:
  # https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/resourcedetectionprocessor
  resourcedetection:
    detectors: [env, system]
    timeout: 2s
    override: true
  # Use environment information to define the `service.instance.id` attribute
  # This example maps `host.name` detected by the `resourcedetection` processor
  # to a new `service.instance.id` resource attribute
  resource/service-instance:
    attributes:
      - key: service.instance.id
        from_attribute: host.name
        action: insert
  # Add the batch processor to efficiently send telemetry
  batch:
    timeout: 1s
    send_batch_size: 1000
    send_batch_max_size: 2000
 
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [resourcedetection, resource/service-instance, batch]
      exporters: [otlp/chronosphere]
    metrics:
      receivers: [otlp]
      processors: [resourcedetection, resource/service-instance, batch]
      exporters: [otlp/chronosphere]
    # Define a separate metrics pipeline to send the collector's own metrics to Chronosphere
    metrics/internal:
      receivers: [prometheus/otel-collector-self-scrape]
      processors: [resourcedetection, batch]
      exporters: [otlp/chronosphere]