Head sampling
You can configure dynamic, remotely managed head sampling to capture only a portion of possible traces from originating services. Head sampling rules limit the traces you collect to a fraction of all possible traces. You define sampling strategies to target specific kinds of traces, focus on specific services, or concentrate on a combination of service and operation.
Head sampling applies only to root services, which are services that appear on the root span of a trace. Identifying root services shows where most requests to a service originate, and helps you decide which services to sample at a higher rate to generate additional trace data.
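To illustrate what a head sampling decision at the root span can look like, the following sketch keeps or drops a trace based only on its trace ID and a sampling rate. It's a minimal illustration of ratio-based sampling, not Observability Platform's implementation. The function name `shouldSample` and the sample trace IDs are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// shouldSample makes a head sampling decision for the root span of a trace.
// It reads the low 8 bytes of the trace ID as a number and keeps the trace
// when that number falls below rate * 2^63, so the same trace ID always
// produces the same decision.
func shouldSample(traceID [16]byte, rate float64) bool {
	if rate >= 1 {
		return true
	}
	if rate <= 0 {
		return false
	}
	threshold := uint64(rate * (1 << 63))
	value := binary.BigEndian.Uint64(traceID[8:16]) >> 1
	return value < threshold
}

func main() {
	var low, high [16]byte
	low[15] = 0x01 // trace ID with a small numeric value
	for i := range high {
		high[i] = 0xff // trace ID with the largest numeric value
	}
	fmt.Println(shouldSample(low, 0.01))
	fmt.Println(shouldSample(high, 0.01))
}
```

Because the decision happens once, at the root service, downstream services only need to honor it. This is why raising the rate on a root service increases trace volume for everything that service calls.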
To instrument head sampling, Chronosphere Observability Platform supports the OpenTelemetry JaegerRemoteSampler head sampling standard. To use head sampling, you must first configure the OpenTelemetry Collector.
View head sampling rules
You can view head sampling rules in Observability Platform to understand each rule's sampling rate, its criteria, and its effect on your trace data volume.
To return a list of defined head sampling rules without additional information, use Chronoctl.
In Observability Platform, each head sampling rule indicates which service or combination of service and operation it impacts. A single rule can impact multiple operations at different sampling rates for a specific service.
You need administrative access to complete this task.
To view head sampling rules:
- In the navigation menu, click Go to Admin and then select Control > Trace Control Plane.
- Click the Head sampling tab.
  - The Status column indicates whether each rule is active or inactive. An active status means that a client fetched the head sampling rule within the last 30 minutes.
  - The Traces Kept column indicates the percentage at which the current rule is sampling your traces for the indicated service. Hold the pointer over the bar in this column to view a description for each rule.
- Expand each of the configured rules to view the rule criteria and impact on your tracing data. You can also view which operations the rule impacts for the selected service.
- If a rule impacts multiple operations, use the dropdown menu to select which operation to display impact data for.
- Use the search box to locate rules impacting a specific service.
Configure head sampling
To instrument head sampling, Observability Platform supports the OpenTelemetry JaegerRemoteSampler head sampling standard. Observability Platform implements the JaegerRemoteSampler configuration API, and serves sampling strategies based on the Terraform resources you define.
Observability Platform is compatible with the Jaeger Remote Sampling extension to enable the OpenTelemetry Collector to act as a proxy between OpenTelemetry agents and the Observability Platform backend.
To configure head sampling:
- Configure the OpenTelemetry Collector.
- Configure your app as an OpenTelemetry agent.
- Create head sampling rules.
Configure the OpenTelemetry Collector
Requires OpenTelemetry Collector version 0.83 or later.
Before configuring OpenTelemetry agents, you must configure the OpenTelemetry Collector to export tracing data to Chronosphere.
- Configure the OpenTelemetry Collector to export tracing data to Observability Platform.
- In the OpenTelemetry Collector config.yaml file, apply the following settings to modify the extensions YAML collection. Specify an endpoint that points to your Observability Platform tenant, and include the Chronosphere API token you created for the OpenTelemetry Collector.

  Chronosphere recommends specifying a reload_interval to control a local cache of your sampling strategy for the remote source (which is Observability Platform). This interval reduces the number of backend API calls the OpenTelemetry Collector makes for your instrumented agents, which helps reduce egress costs.

      extensions:
        zpages:
          endpoint: 0.0.0.0:55679
        health_check: {}
        pprof: {}
        jaegerremotesampling:
          source:
            reload_interval: 30s
            remote:
              endpoint: ADDRESS
              compression: gzip
              headers:
                API-Token: API_TOKEN

  - ADDRESS: Your company name prefixed to your Observability Platform instance that ends in .chronosphere.io:443. For example, MY_COMPANY.chronosphere.io:443.
  - API_TOKEN: The API token generated from your service account. Chronosphere recommends storing your API token in a separate file or Kubernetes Secret and calling it using an environment variable, such as $API_TOKEN.
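To show why a reload interval reduces backend calls, the following sketch models a local strategy cache that contacts the backend only when its cached entry is older than the interval. It's a simplified model under stated assumptions, not the Collector extension's actual code. The types `strategyCache` and `cachedStrategy`, the `backendCalls` helper, and the one-poll-per-second agent are all hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// cachedStrategy is one cache entry: the strategy and when it was fetched.
type cachedStrategy struct {
	strategy  string
	fetchedAt time.Time
}

// strategyCache serves strategies from memory and calls fetch again only
// when an entry is at least reloadInterval old.
type strategyCache struct {
	reloadInterval time.Duration
	fetch          func(service string) string // hypothetical backend call
	entries        map[string]cachedStrategy
}

func (c *strategyCache) get(service string, now time.Time) string {
	if e, ok := c.entries[service]; ok && now.Sub(e.fetchedAt) < c.reloadInterval {
		return e.strategy
	}
	s := c.fetch(service)
	c.entries[service] = cachedStrategy{strategy: s, fetchedAt: now}
	return s
}

// backendCalls simulates one agent poll per second for the given number of
// seconds and returns how many of those polls reached the backend.
func backendCalls(seconds, reloadSeconds int) int {
	calls := 0
	cache := &strategyCache{
		reloadInterval: time.Duration(reloadSeconds) * time.Second,
		fetch: func(string) string {
			calls++
			return "probabilistic 0.01"
		},
		entries: map[string]cachedStrategy{},
	}
	start := time.Unix(0, 0)
	for i := 0; i < seconds; i++ {
		cache.get("billing-service", start.Add(time.Duration(i)*time.Second))
	}
	return calls
}

func main() {
	fmt.Println("calls with a 30s reload interval:", backendCalls(60, 30))
	fmt.Println("calls with no caching:", backendCalls(60, 0))
}
```

In this model, a longer interval means fewer backend calls but a longer delay before agents see an updated rule.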
Configure your app as an OpenTelemetry agent
Configure your app as an OpenTelemetry agent to act as a proxy for the OpenTelemetry Collector, which sends your tracing data to Observability Platform.
Instrument your app with the OpenTelemetry SDK using the OpenTelemetry protocol (OTLP). Your app sends spans to the OpenTelemetry Collector, which pulls head sampling rules from Observability Platform. As you update head sampling rules in Observability Platform, the OpenTelemetry Collector pulls the changes and transmits the updated rules to your app.
The implementation depends on the programming language of your app. For example, the following code from the Go implementation of OpenTelemetry, OpenTelemetry-Go, defines how to implement the JaegerRemoteSampler in Go:
jaegerRemoteSampler := jaegerremote.New(
	"SERVICE_NAME",
	jaegerremote.WithSamplingServerURL("http://HOST_NAME:5778/sampling"),
	jaegerremote.WithSamplingRefreshInterval(10*time.Second),
	jaegerremote.WithInitialSampler(trace.TraceIDRatioBased(0.5)),
)
tp := trace.NewTracerProvider(
	trace.WithSampler(jaegerRemoteSampler),
	...
)
otel.SetTracerProvider(tp)
- SERVICE_NAME: Name of the service you're sampling, which can map to a microservice in your architecture.
- HOST_NAME: Host name where your OpenTelemetry Collector is running.
When configuring the Jaeger remote sampler, you must include all of these properties:
- Endpoint: Endpoint where your OpenTelemetry Collector is running, which includes the Jaeger Remote Sampling extension that points to Observability Platform.
- Polling interval: Interval at which your OpenTelemetry agents sync strategies from your OpenTelemetry Collector.
- Initial sampler: Sampling policy your service applies until it retrieves a strategy from the configured endpoint.
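The following sketch shows how these three properties can interact: the agent uses the initial sampler's rate until a strategy arrives from the endpoint, then switches to the fetched rate. It's an illustration only, and doesn't reproduce the OpenTelemetry-Go sampler. The `remoteSampler` type is hypothetical, and the JSON field names (`strategyType`, `probabilisticSampling`, `samplingRate`) are an assumption based on the Jaeger remote sampling response format.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// strategyResponse holds the subset of the strategy response this sketch
// reads. The field names are assumed, not taken from this page.
type strategyResponse struct {
	ProbabilisticSampling *struct {
		SamplingRate float64 `json:"samplingRate"`
	} `json:"probabilisticSampling"`
}

// remoteSampler groups the three required properties.
type remoteSampler struct {
	endpoint    string   // where the Collector serves strategies
	pollSeconds int      // how often the agent refreshes its strategy
	initialRate float64  // rate used until a strategy is fetched
	fetchedRate *float64 // nil until the first successful refresh
}

// rate returns the fetched rate when one exists, else the initial rate.
func (s *remoteSampler) rate() float64 {
	if s.fetchedRate == nil {
		return s.initialRate
	}
	return *s.fetchedRate
}

// applyResponse parses one polled response body. On any error the sampler
// keeps whatever rate it was already using.
func (s *remoteSampler) applyResponse(body []byte) error {
	var resp strategyResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return err
	}
	if resp.ProbabilisticSampling == nil {
		return errors.New("response has no probabilistic strategy")
	}
	s.fetchedRate = &resp.ProbabilisticSampling.SamplingRate
	return nil
}

func main() {
	s := &remoteSampler{
		endpoint:    "http://HOST_NAME:5778/sampling",
		pollSeconds: 10,
		initialRate: 0.5,
	}
	fmt.Println("before first poll:", s.rate())

	body := []byte(`{"strategyType":"PROBABILISTIC","probabilisticSampling":{"samplingRate":0.01}}`)
	if err := s.applyResponse(body); err != nil {
		fmt.Println("refresh failed:", err)
	}
	fmt.Println("after first poll:", s.rate())
}
```

The initial sampler matters most at startup and during Collector outages, because it's the only policy in effect until a poll succeeds.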
Create head sampling rules
After configuring the OpenTelemetry Collector to export traces, configuring head sampling, and configuring your OpenTelemetry agents, manage your head sampling strategy using the Chronosphere Terraform provider or Chronoctl. This ability means you can push strategy changes to all of your OpenTelemetry agents without modifying the JaegerRemoteSampler strategy directly.
If you don't define a sampling strategy for a service, Observability Platform applies the default sampling rate of 0.001 (0.1%) to the service.
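As a rough illustration of the fallback, the following sketch looks up a service's configured rate and uses 0.001 when none exists, then estimates how many traces that rate keeps. The map of strategies, the `shipping-service` name, and the request count are hypothetical values chosen for the example.

```go
package main

import (
	"fmt"
	"math"
)

// defaultSamplingRate is the rate applied to services without a strategy.
const defaultSamplingRate = 0.001

// rateFor returns the configured rate for a service, or the default rate
// when the service has no strategy.
func rateFor(service string, strategies map[string]float64) float64 {
	if rate, ok := strategies[service]; ok {
		return rate
	}
	return defaultSamplingRate
}

// expectedTraces estimates how many of the given root requests are kept.
func expectedTraces(requests int, rate float64) int {
	return int(math.Round(float64(requests) * rate))
}

func main() {
	strategies := map[string]float64{"billing-service": 0.01}
	for _, svc := range []string{"billing-service", "shipping-service"} {
		rate := rateFor(svc, strategies)
		fmt.Printf("%s: rate %g, about %d traces per 1,000,000 requests\n",
			svc, rate, expectedTraces(1000000, rate))
	}
}
```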
The applied_strategy defines your sampling strategy, and can be one of the following values:
- probabilistic_strategy: Defines a probabilistic strategy, which samples traces from the identified service based on the sampling_rate. This value determines the probability of sampling any trace, and must be in the range of 0 to 1.
- rate_limiting_strategy: Defines a rate-limited strategy that sets the maximum number of traces to sample per second.
- per_operation_strategies: Defines a probabilistic strategy that sets a default sampling rate, plus an upper and lower bound.
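Of the strategies above, rate limiting is the least obvious to picture. The following sketch uses a simple credit-based limiter to cap the number of traces kept per second. It's one common way to implement such a cap, assumed here for illustration, and not necessarily how any particular sampler does it. The `rateLimiter` type and `keptOverSeconds` helper are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// rateLimiter keeps at most maxPerSecond traces per second using credits
// that refill over time.
type rateLimiter struct {
	maxPerSecond float64
	maxCredits   float64
	credits      float64
	last         time.Time
}

func newRateLimiter(maxPerSecond float64, now time.Time) *rateLimiter {
	// Allow at least one credit so rates below 1 can still keep a trace.
	maxCredits := math.Max(maxPerSecond, 1)
	return &rateLimiter{
		maxPerSecond: maxPerSecond,
		maxCredits:   maxCredits,
		credits:      maxCredits,
		last:         now,
	}
}

// allow refills credits for the elapsed time, then spends one if available.
func (r *rateLimiter) allow(now time.Time) bool {
	elapsed := now.Sub(r.last).Seconds()
	r.last = now
	r.credits = math.Min(r.maxCredits, r.credits+elapsed*r.maxPerSecond)
	if r.credits < 1 {
		return false
	}
	r.credits--
	return true
}

// keptOverSeconds sends requestsPerSecond root spans at the start of each
// second and returns how many the limiter keeps in total.
func keptOverSeconds(seconds, requestsPerSecond int, maxPerSecond float64) int {
	start := time.Unix(0, 0)
	limiter := newRateLimiter(maxPerSecond, start)
	kept := 0
	for s := 0; s < seconds; s++ {
		now := start.Add(time.Duration(s) * time.Second)
		for i := 0; i < requestsPerSecond; i++ {
			if limiter.allow(now) {
				kept++
			}
		}
	}
	return kept
}

func main() {
	// 10 requests per second for 3 seconds, limited to 2 traces per second.
	fmt.Println("traces kept:", keptOverSeconds(3, 10, 2))
}
```

Unlike a probabilistic rate, the number of traces kept here doesn't grow with traffic, which makes the volume predictable for high-traffic services.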
Requires Chronoctl version 0.59.0 or later.
When creating your head sampling strategy with Chronoctl, consider the following behaviors of the slug property:
- If you don't provide a slug, Chronoctl generates one based on the name field.
- After creating your head sampling strategy, you can modify the name but can't modify the slug.
- The service_name and slug must match.
To define your head sampling strategy:
- Create a YAML file and define your head sampling strategy.

  Use the chronoctl trace-jaeger-remote-sampling-strategies scaffold command to generate an example resource for a head sampling rule.

  The following head sampling strategy defines a per-operation strategy that sets a default sampling rate with an upper and lower bound, plus a specific sampling rate for the inventory-operation:

      api_version: v1/config
      kind: TraceJaegerRemoteSamplingStrategy
      spec:
        slug: ordering-service-sampling
        name: Ordering service head sampling strategy
        service_name: ordering-service-sampling
        applied_strategy:
          per_operation_strategies:
            default_sampling_rate: 0.01
            default_lower_bound_traces_per_second: 1
            default_upper_bound_traces_per_second: 1000
            per_operation_strategies:
              operation: inventory-operation
              probabilistic_sampling_strategy:
                sampling_rate: 0.1

- Apply your head sampling strategy and send it to Observability Platform:

      chronoctl apply -f FILE_NAME.yml

  FILE_NAME is the name of your head sampling YAML file.
Terraform example
The following example provides a resource definition for three services and defines a distinct sampling strategy for each:
- The first resource samples the billing-service service at a rate of 0.01, which is 1% of traces.
- The second resource samples at most two traces per second, per instance, from the cart-service service. For example, if the cart-service service consists of 17 pods, the expected sample of traces per second is somewhere between 0 and 34.
- The third resource samples traces from the ordering-service service at a rate of 0.01, or 1% of traces. If volumes are low, Observability Platform samples traces at least once per second. If volumes are high, Observability Platform stops sampling traces after reaching 1,000 traces per second.
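The arithmetic behind these descriptions can be sketched as follows. The first function multiplies the per-instance limit by the number of instances. The second treats the lower and upper bounds as a single clamp on the number of traces kept per second, which is a simplifying assumption: how the bounds are enforced across instances and operations isn't described here. The function names and incoming volumes are hypothetical.

```go
package main

import "fmt"

// maxKeptPerSecond is the ceiling for a per-instance rate limit.
func maxKeptPerSecond(instances, maxTracesPerSecond int) int {
	return instances * maxTracesPerSecond
}

// boundedKeptPerSecond estimates traces kept per second for a probabilistic
// rate with a lower and upper bound, given the incoming traces per second.
func boundedKeptPerSecond(incoming, rate, lower, upper float64) float64 {
	kept := incoming * rate
	if kept < lower {
		// The lower bound can't keep more traces than actually arrive.
		kept = lower
		if incoming < lower {
			kept = incoming
		}
	}
	if kept > upper {
		kept = upper
	}
	return kept
}

func main() {
	// cart-service: 17 pods, at most 2 traces per second per pod.
	fmt.Println("cart-service ceiling:", maxKeptPerSecond(17, 2))

	// ordering-service: 1% rate, bounded between 1 and 1,000 traces per second.
	for _, incoming := range []float64{50, 20000, 500000} {
		fmt.Printf("%.0f incoming/s -> about %.0f kept/s\n",
			incoming, boundedKeptPerSecond(incoming, 0.01, 1, 1000))
	}
}
```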
resource "chronosphere_trace_jaeger_remote_sampling_strategy" "billing-service" {
  name         = "billing-service Jaeger Remote Sampling strategy"
  service_name = "billing-service"
  applied_strategy {
    probabilistic_strategy {
      sampling_rate = 0.01
    }
  }
}
resource "chronosphere_trace_jaeger_remote_sampling_strategy" "cart-service" {
  name         = "cart-service Jaeger Remote Sampling strategy"
  service_name = "cart-service"
  applied_strategy {
    rate_limiting_strategy {
      max_traces_per_second = 2
    }
  }
}
resource "chronosphere_trace_jaeger_remote_sampling_strategy" "ordering-service" {
  name         = "ordering-service Jaeger Remote Sampling strategy"
  service_name = "ordering-service"
  applied_strategy {
    per_operation_strategies {
      default_sampling_rate                 = 0.01
      default_lower_bound_traces_per_second = 1
      default_upper_bound_traces_per_second = 1000
      per_operation_strategies {
        operation = "notification-operation"
        probabilistic_strategy {
          sampling_rate = 0.0
        }
      }
      per_operation_strategies {
        operation = "inventory-operation"
        probabilistic_strategy {
          sampling_rate = 0.1
        }
      }
      per_operation_strategies {
        operation = "payment-operation"
        probabilistic_strategy {
          sampling_rate = 1.0
        }
      }
    }
  }
}
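To make the per-operation rates in the ordering-service resource concrete, the following sketch resolves the rate for an operation: a listed operation uses its own rate, and any other operation falls back to the default. The Go type and function names are hypothetical, and `refund-operation` is an invented operation used only to show the fallback.

```go
package main

import "fmt"

// perOperationStrategies mirrors the rates in the ordering-service resource.
type perOperationStrategies struct {
	defaultSamplingRate float64
	operations          map[string]float64
}

// rateFor returns the operation's own rate when one is defined, or the
// default sampling rate otherwise.
func (p perOperationStrategies) rateFor(operation string) float64 {
	if rate, ok := p.operations[operation]; ok {
		return rate
	}
	return p.defaultSamplingRate
}

// orderingService returns the rates from the Terraform example.
func orderingService() perOperationStrategies {
	return perOperationStrategies{
		defaultSamplingRate: 0.01,
		operations: map[string]float64{
			"notification-operation": 0.0, // never sampled
			"inventory-operation":    0.1, // 10% of traces
			"payment-operation":      1.0, // every trace
		},
	}
}

func main() {
	s := orderingService()
	ops := []string{
		"notification-operation",
		"inventory-operation",
		"payment-operation",
		"refund-operation", // hypothetical operation with no specific rate
	}
	for _, op := range ops {
		fmt.Printf("%s: %g\n", op, s.rateFor(op))
	}
}
```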
Edit head sampling rules
Select from the following methods to edit head sampling rules.
To edit head sampling rules using Chronoctl:
- View the head sampling rules Chronoctl YAML.
- Modify its properties and apply the changes with the same process as creating head sampling rules. Chronoctl updates an existing head sampling rule if the definition you apply has the same slug.
You can also use the following process if you already have a definition file:
- Update the head sampling rules definition file.
- Run the following command to submit the changes:

      chronoctl trace-jaeger-remote-sampling-strategies update -f FILE_NAME.yaml

  Replace FILE_NAME with the name of the YAML definition file you want to use.
Delete head sampling rules
Select from the following methods to delete head sampling rules.
To delete a head sampling rule with Chronoctl, use the chronoctl trace-jaeger-remote-sampling-strategies delete command:
chronoctl trace-jaeger-remote-sampling-strategies delete SLUG
Replace SLUG with the slug of the head sampling rule you want to delete.
For example, to delete a head sampling rule with the slug head-sampling-prod:
chronoctl trace-jaeger-remote-sampling-strategies delete head-sampling-prod