View tail sampling rules
In Chronosphere Observability Platform, you can view the tail sampling rules you configured in Terraform to understand how each rule affects your tracing data. The view shows each rule’s sampling rate, its criteria, and its impact on your incoming traces. Observability Platform evaluates each trace against each rule’s trace filter, in order of precedence, until a rule matches. If a rule matches, Observability Platform applies the matched rule’s sampling rate to the trace. If a trace doesn’t match any rules, Observability Platform applies the default sampling rate to the trace. If a default sampling rate isn’t specified, Observability Platform keeps all traces.
You need administrative access to complete this task.
- Web
- Chronoctl
- API
To view tail sampling rules:
- In the navigation menu, click Go to Admin and then select Control > Trace Control Plane.
- Select the Tail sampling tab. The sample rate for each rule displays in the Traces Kept column, in addition to the Created and Updated dates. Hold the pointer over the bar in the Traces Kept column to view a description for each rule.
- Expand each of the configured rules to view the rule criteria and impact on your tracing data.
- Use the search box to locate rules impacting a specific service or operation.
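The evaluation order described at the start of this section can be sketched in a few lines of Python. This is only an illustration of first-match-wins semantics, not Observability Platform's implementation: the `Rule` class, the `resolve_sample_rate` function, and the dictionary-shaped traces are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]  # hypothetical stand-in for a rule's trace filter
    sample_rate: float


def resolve_sample_rate(trace: dict, rules: list[Rule],
                        default_rate: Optional[float] = None) -> float:
    """Return the rate of the first matching rule, else the default rate."""
    for rule in rules:  # rules are checked in order of precedence
        if rule.matches(trace):
            return rule.sample_rate  # stop at the first match
    # With no default configured, every unmatched trace is kept (rate 1).
    return 1.0 if default_rate is None else default_rate


# Illustrative rule list: the broad "drop" rule comes first.
rules = [
    Rule("drop-health-checks", lambda t: t.get("operation") == "/health", 0.0),
    Rule("keep-errors", lambda t: t.get("error") is True, 1.0),
]
```

In this sketch, a failing `/health` trace resolves to a rate of `0` because the health check rule is listed first, which is why rule order matters.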
Create tail sampling rules
You can create tail sampling rules using the Chronosphere Terraform provider or Chronoctl. You create one set of tail sampling rules as an ordered list. Rules are evaluated in the order they’re listed, and evaluation stops at the first match. In your rule definition file, put broader rules at the top, such as a rule that drops any traces with health check data. For each rule you must:
- Assign a human-readable name to identify the tail sampling rule in Observability Platform.
- Assign a system_name, which provides a unique label name for the metric group of traces affected by the rule.
- Define a specific filter such as "error=true".
- Specify a sampling rate. The default sampling rate is 1, which means that Observability Platform stores all traces that don’t match any sampling rules.
The sampling rate is a value between 0 and 1, where a rate of 0 drops all traces, and a rate of 1 keeps all traces matching the defined filter. A sampling rate of 0.5 drops half of all traces matching the filter, and keeps the other half.
For a complete list of supported fields for tail sampling rules, see the
CreateTraceTailSamplingRules endpoint.
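To make the meaning of a sampling rate concrete, the following Python sketch turns a rate into a keep-or-drop decision. It assumes a deterministic, hash-based approach keyed on a trace ID. That technique is a common way to sample and is used here purely for illustration. The source doesn't describe how Observability Platform makes this decision internally, and `keep_trace` is a hypothetical function name.

```python
import hashlib


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Decide whether to keep a trace, given a rate between 0 and 1.

    Assumption: hash the trace ID into a 64-bit bucket and keep the trace
    when the bucket falls below sample_rate * 2**64.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    # Rate 0 keeps nothing, rate 1 keeps everything, 0.5 keeps about half.
    return bucket < sample_rate * 2**64
```

Because the decision depends only on the trace ID and the rate, the same trace always gets the same outcome in this sketch.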
- Chronoctl
- Terraform
- API
Requires Chronoctl version 1.0.0 or later.
You can use the trace-tail-sampling-rules scaffold command to generate an example tail sampling rule, and then copy the resource definition:
- Create a YAML file and define your tail sampling strategy. The following tail sampling rule drops all health check traces from an operation named /health. The sample_rate of 0 drops any traces matching the defined rule.
- Apply your tail sampling strategy and send it to Observability Platform:
Replace FILE_NAME with the name of your tail sampling YAML file.
Edit tail sampling rules
Select from the following methods to edit tail sampling rules.
- Chronoctl
- Terraform
- API
To edit tail sampling rules with Chronoctl:
- View the tail sampling rules Chronoctl YAML.
- Modify its properties and apply the changes with the same process as creating tail sampling rules. Chronoctl updates the existing tail sampling rules if the definition has the same slug.
- Update the tail sampling rules definition file.
- Run the following command to submit the changes:
Replace FILE_NAME with the name of the YAML definition file you want to use.
Delete tail sampling rules
Select from the following methods to delete tail sampling rules.
- Chronoctl
- Terraform
- API
To delete a tail sampling rule with Chronoctl, use this command:
Replace SLUG with the slug of the tail sampling rule you want to delete.
For example, to delete a tail sampling rule with the slug tail-sampling-prod:
Terraform examples
Use the following examples to build your tail sampling strategy in Terraform. Because the tracing backend evaluates rules in order and stops at the first match, put expansive rules at the top of your Terraform file, such as rules that always drop or always keep specific traces.
Default sampling rate
The following example defines a default sample rate of 1, which keeps all traces that don’t match any other rule. A default sample rate of 0 drops all traces that don’t match any other rule.
Drop all health check traces
You might have load balancers that ping your backend servers every few seconds, which can generate a large amount of useless tracing data. In this instance, you can define a rule to drop all health check traces rather than those from a particular service.
In addition to defining the default sample rate, the following rule drops all health check traces from an operation named "/health". The sample_rate of 0 drops any traces matching the defined rule.
Always keep query traces with a minimum duration
Requests to your app can quickly consume your licensed trace capacity. For example, user-initiated requests to a ride sharing app can produce huge traces, especially during peak travel hours. Any time a query executes, it can generate tens or even hundreds of thousands of spans. You might only want to keep traces that exceed a specific duration or result in an error state, rather than storing the entirety of your tracing data.
The following example keeps any trace with a span where the operation is "/hail-ride", and the overall duration of the trace is greater than five seconds. This rule lets you store long-running traces and investigate what’s causing higher latency.
You might also want to keep traces from the "/hail-ride" operation that fail. The following rule matches any trace with at least one call to the "/hail-ride" operation anywhere in the trace, even if it’s only one out of 1,000 spans. Observability Platform then keeps any traces from the "/hail-ride" operation where the error value is true.
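The trace-level matching described in this section can be sketched in Python as follows. This is a simplified model, not the platform's filter engine: the `Span` and `Trace` classes and both predicate functions are hypothetical. The sketch also assumes that the error flag is checked on the same span as the "/hail-ride" operation, which the text doesn't state explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    operation: str
    error: bool = False


@dataclass
class Trace:
    duration_secs: float  # overall duration of the trace
    spans: list[Span] = field(default_factory=list)


def is_slow_hail_ride(trace: Trace, min_duration_secs: float = 5.0) -> bool:
    """True when any span is "/hail-ride" and the whole trace runs longer than the minimum."""
    has_operation = any(s.operation == "/hail-ride" for s in trace.spans)
    return has_operation and trace.duration_secs > min_duration_secs


def is_failed_hail_ride(trace: Trace) -> bool:
    """True when at least one "/hail-ride" span in the trace has an error.

    Assumption: the error flag is evaluated on the same span as the operation.
    """
    return any(s.operation == "/hail-ride" and s.error for s in trace.spans)
```

A single matching span is enough for either predicate to return true, regardless of how many other spans the trace contains.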
Match on services in specific regions
You might want to keep a percentage of traces from particular services that match certain conditions. For example, always keep a sample of traces from the billing-svc service in the us-east or us-west regions that have a specific duration. This ability to hone your sampling rules provides finer control over which tracing data you keep and pay for.
The following example defines a resource with rules that match on two tags:
- Matching a tag where the key is region and the values are either us-east or us-west. The example uses the REGEX operator to match either of the specified values.
- Matching a tag where the key is http.status_code and the value doesn’t match 200. The example uses the NOT_EQUAL comparison operator to achieve this evaluation.
The example applies a sample_rate of 0.6 to any traces matching those key/value pairs and the additional specified criteria, such as duration, error, operation, and service.
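As a rough illustration of how the two tag comparisons combine, the following Python sketch evaluates the REGEX and NOT_EQUAL operators against a trace's tags. The operator names come from the text above. Everything else is an assumption made for the example: the `tag_matches` and `rule_sample_rate` functions, the full-string regex match, and the choice to treat a missing tag as a non-match. The sketch covers only the tag filters and omits the other criteria, such as duration, error, operation, and service.

```python
import re
from typing import Optional


def tag_matches(tags: dict[str, str], key: str, operator: str, value: str) -> bool:
    """Evaluate one tag filter against a trace's tags."""
    actual = tags.get(key)
    if actual is None:
        return False  # assumption: a missing tag never matches
    if operator == "REGEX":
        # assumption: the pattern must match the whole tag value
        return re.fullmatch(value, actual) is not None
    if operator == "NOT_EQUAL":
        return actual != value
    raise ValueError(f"unsupported operator: {operator}")


# The two tag filters from the example, expressed as (key, operator, value).
TAG_FILTERS = [
    ("region", "REGEX", "us-east|us-west"),
    ("http.status_code", "NOT_EQUAL", "200"),
]
SAMPLE_RATE = 0.6


def rule_sample_rate(tags: dict[str, str]) -> Optional[float]:
    """Return 0.6 when every tag filter matches, otherwise None."""
    if all(tag_matches(tags, k, op, v) for k, op, v in TAG_FILTERS):
        return SAMPLE_RATE
    return None
```

Returning `None` for a non-match leaves room for a later rule, or the default sampling rate, to decide what happens to the trace.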
Nested tail sampling rules
You can nest tail sampling rules by adding multiple rules definitions. The following example includes individual rules that match on different tags:
- The Reduce prod to 5 percent rule matches a tag where the key is BillingEnvironment and the value is production. The sample rate is 0.05, which samples five percent of traces matching this rule.
- The Exclude API status traces rule matches a tag where the key is Operation and the value is /api/status. The sample rate is 0, which drops all traces matching this rule.
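To get a feel for what these two sample rates mean in practice, the following Python sketch estimates how many traces each rule keeps. The rule names and rates come from the example above. The trace counts are made-up numbers used only to show the arithmetic, and `expected_kept` is a hypothetical helper.

```python
def expected_kept(matched_counts: dict[str, int],
                  sample_rates: dict[str, float]) -> float:
    """Estimate kept traces as the sum of (matched count x sample rate) per rule."""
    return sum(count * sample_rates[name] for name, count in matched_counts.items())


# Sample rates from the two rules in the example.
sample_rates = {
    "Reduce prod to 5 percent": 0.05,
    "Exclude API status traces": 0.0,
}

# Hypothetical number of traces matched by each rule in some time window.
matched_counts = {
    "Reduce prod to 5 percent": 200_000,
    "Exclude API status traces": 50_000,
}

kept = expected_kept(matched_counts, sample_rates)
```

With these illustrative counts, roughly 10,000 of the 250,000 matched traces are kept, all of them from the production rule.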