Use trace behaviors with datasets

Controlling the tracing data you process and persist is necessary to manage your license consumption.

After creating datasets, you can apply trace behaviors to set sampling rates for your datasets without needing to write fine-grained sampling rules. While datasets are the underpinning of behaviors, the behaviors themselves let you change the sampling rates of one or more datasets either from the Chronosphere Observability Platform app or by using Chronoctl.

Get started with behaviors

Create the right datasets for your organization and then use behaviors to manage the data that those datasets generate. Use the built-in behavior types to control sampling rates and apply them across datasets.

Observability Platform includes an unassigned dataset you can use for experimentation. The baseline behavior is mapped to the unassigned dataset by default, which lets you start sending traces to Observability Platform with pre-configured sampling rules based on best practices. As you learn about how the unassigned dataset generates data, create additional datasets for each of your individual services.

As you learn more about your trace data, you can edit facets of the baseline behavior to modify your sampling strategy and more clearly define which traces to drop and which to keep. In most cases, you want to assertively drop less interesting, lower-value traces and keep more interesting, higher-value traces. In the Observability Platform app, you can quickly modify the baseline sampling strategy and apply it across one or more datasets.

Trace behaviors implement head sampling rules by default, which means you need only set the percentage at which you want to keep processed data. For example, if you set the head sampling rate to 100%, you're keeping 100% of traces. If you don't set a behavior for a dataset, Observability Platform uses defined tail sampling rules. However, you can use trace datasets and behaviors without needing to write tail sampling rules.
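
As a rough illustration of what a head sampling rate means, the following Python sketch keeps a fixed percentage of traces by hashing the trace ID. The `keep_trace` function and its hashing scheme are assumptions made for this example. They don't describe how Observability Platform or the OpenTelemetry SDK implements head sampling.

```python
import hashlib


def keep_trace(trace_id: str, sample_rate_percent: float) -> bool:
    """Return True if a trace should be kept at the given head sampling rate.

    Hypothetical helper: maps the trace ID to a stable bucket in [0, 10000)
    and keeps the trace when its bucket falls below the rate threshold.
    """
    if not 0 <= sample_rate_percent <= 100:
        raise ValueError("sample rate must be between 0 and 100")
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < sample_rate_percent * 100


# Illustrative trace IDs only; real trace IDs come from your instrumentation.
trace_ids = [f"trace-{n}" for n in range(1000)]
kept_at_100 = sum(keep_trace(t, 100) for t in trace_ids)  # every trace is kept
kept_at_0 = sum(keep_trace(t, 0) for t in trace_ids)      # no trace is kept
```

Because the decision depends only on the trace ID, the same trace always gets the same keep or drop decision in this sketch.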

To specify a sample rate for head sampling in the baseline behavior, you must instrument head sampling first. Chronosphere supports the OpenTelemetry JaegerRemoteSampler head sampling standard.

Behaviors are part of the Trace Control Plane, which also includes datasets and head and tail sampling rules. You need administrative access to use the Trace Control Plane.

Trace behavior types

Trace behaviors can be one of the following types:

  • Baseline behavior: Sample data in your datasets using data-driven best practices. Select which facets to apply to your low-value and high-value trace data, and modify the criteria as you learn more about the characteristics of your trace data.

    The baseline behavior is mostly proactive, and helps to identify what trace data to bring into Observability Platform and what data to drop.

  • Allow behavior: Sample your data at 100% to allow all traces.

    The allow behavior can be both proactive and reactive. For example, you might want more high-fidelity data during a deploy to catch any issues (proactive), or allow all traces from a specific service or operation when debugging issues (reactive).

  • Deny behavior: Sample your data at 0% to block all traces.

    The deny behavior can be both proactive and reactive. For example, drop all traces from a service because the data isn't useful (proactive), or stop traces from a dataset that's currently generating too much data so you don't exceed your license limit (reactive).

View behaviors

You can view and filter available trace behaviors using Observability Platform, and return the trace behavior definition using Chronoctl.

To view trace behaviors:

  1. In the navigation menu, click Go to Admin and then select Control > Trace Control Plane.

  2. Select the Behaviors tab to view the baseline, allow, and deny behaviors, including which datasets each behavior type currently applies to. The dataset match criteria display the dataset definition the behavior operates on.

  3. Select the individual behavior you want to view details for:

    • Allow: Displays the assignment history for the datasets where the allow behavior is currently active, and the datasets where this behavior ended.

    • Deny: Displays the assignment history for the datasets where the deny behavior is currently active, and the datasets where this behavior ended.

    • Baseline: Displays head and tail sampling statistics for the baseline behavior when it was active for any datasets. The assignment history displays the datasets where the baseline behavior is currently active, and the datasets where this behavior ended.

      You can edit the baseline behavior to change the tail sampling methodology and modify the facets that define which traces to drop and keep.

Manage assigned behaviors

You can manage the assigned behaviors for a dataset on two levels:

  • Assign a main behavior to define the primary behavior for a dataset.

  • Assign an override behavior to temporarily override the main behavior.

    You can assign only one main behavior and one override behavior to a dataset.

Both the main and override layers can use any of the trace behavior types, which are baseline, allow, and deny. You can also create custom behaviors and assign them to the main or override layers on datasets. When assigning a behavior to the override layer, you can set the behavior to start immediately, or schedule it to start at a future time.
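
The relationship between the two layers can be sketched in Python as follows. This is a simplified model under stated assumptions: the `Override` and `DatasetAssignment` classes, their field names, and the example times are hypothetical and aren't part of any Chronosphere API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Override:
    behavior: str      # for example "allow" or "deny"
    start: datetime    # when the override becomes active
    end: datetime      # when the override stops applying


@dataclass
class DatasetAssignment:
    main: str                            # the primary behavior
    override: Optional[Override] = None  # at most one override per dataset

    def effective_behavior(self, now: datetime) -> str:
        """Return the override while its window is open, else the main behavior."""
        o = self.override
        if o is not None and o.start <= now < o.end:
            return o.behavior
        return self.main


# Hypothetical scenario: allow all traces for two hours during a deploy.
deploy_start = datetime(2024, 1, 1, 12, 0)
assignment = DatasetAssignment(
    main="baseline",
    override=Override("allow", deploy_start, deploy_start + timedelta(hours=2)),
)
during_deploy = assignment.effective_behavior(deploy_start + timedelta(hours=1))
after_deploy = assignment.effective_behavior(deploy_start + timedelta(hours=3))
```

In this model, the main behavior resumes on its own once the override window closes.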

When managing assigned behaviors, you can set the shaping order for overlapping trace datasets. The shaping order determines the priority order to apply behaviors when traces in one dataset overlap with traces in another dataset. For example, if a trace belongs to more than one dataset with an assigned behavior, Observability Platform uses the behavior assigned to the dataset that's first in the shaping order.

The shaping order applies only when the selected behavior is active.
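
A minimal Python sketch of how a shaping order could resolve overlapping datasets is shown here. The `resolve_behavior` function and the dataset names are hypothetical and only model the rule described above.

```python
from typing import Dict, List, Optional, Set


def resolve_behavior(
    trace_datasets: Set[str],
    shaping_order: List[str],
    active_behaviors: Dict[str, str],
) -> Optional[str]:
    """Return the behavior of the first dataset in the shaping order that
    contains the trace and has an active behavior, or None if there is none."""
    for dataset in shaping_order:
        if dataset in trace_datasets and dataset in active_behaviors:
            return active_behaviors[dataset]
    return None


# Hypothetical datasets, listed in decreasing priority.
shaping_order = ["checkout", "payments", "unassigned"]
active_behaviors = {"checkout": "deny", "payments": "allow"}

# This trace belongs to two datasets, so the earlier one in the order wins.
chosen = resolve_behavior({"payments", "checkout"}, shaping_order, active_behaviors)
```

Datasets without an active behavior are skipped, which reflects the rule that the shaping order applies only when the selected behavior is active.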

Select from the following methods to assign behaviors to a dataset.

If a behavior isn't currently assigned to a dataset, then you must assign behaviors to the dataset by selecting a dataset from the Overview tab of Trace Control Plane.

To assign behaviors to a dataset:

  1. In the navigation menu, click Go to Admin and then select Control > Trace Control Plane.

  2. Click the Behaviors tab, and then click the name of the behavior you want to manage.

  3. On the Behavior Details page, locate the dataset you want to manage behaviors for, click the three vertical dots icon, and select Manage behaviors.

    The selected behavior must be assigned to a dataset to manage it from the Behaviors tab of Trace Control Plane.

  4. On the Manage behaviors page, in the Main layer pane, select a main behavior from the dropdown.

  5. Optional: In the Override layer pane, select an override behavior, choose the override start and end time, and select a duration for how long the override remains active.

  6. Select a shaping order for your main behavior. Shaping order is in decreasing priority, so a behavior in position one takes precedence over a behavior in position three.

    The shaping order section indicates the percentage of overlap between datasets so you can better understand the impact on traces in other datasets.

  7. Click Save to save the behavior definition for your dataset.

Customize your sampling strategy

You can edit the facets of the baseline behavior to modify the tail sampling strategy you want to apply to your datasets. There are two main parts of the tail sampling strategy with different facets you can modify: drop less interesting, low-value traces and keep more interesting, high-value traces.

  • Less interesting traces might be ones that denote success. These traces indicate that your app is working as designed, but you likely don't need to keep most of them.
  • More interesting traces might be error traces, which indicate an issue that requires attention. Keeping more interesting traces helps when you're debugging issues and need to identify the source of the problem.

There are several facets you can configure:

  • Failed traces: Traces containing root spans that have a status set to Error, as defined by the OpenTelemetry span status.
  • Small traces: Traces with very few spans that often indicate repeated messages about successful operations or incomplete instrumentation.
  • Large traces: Traces that are difficult to parse because of their large size, and which are rarely used in incident debugging due to a high ratio of noise to signal. These traces also consume a large amount of the persistence budget.
  • Slow traces: Traces that take a long time to complete, which can indicate issues in a related operation or service.
  • Fast traces: Traces that complete very quickly, which can be either repeated messages about successful operations or incomplete traces.

Although each part of the baseline sampling strategy includes a set of facets by default, you can move facets from the low-value section to the high-value section, and the reverse. You can select the number of spans to keep, the duration for slow and fast traces, and the sampling rate for each facet.

This flexibility lets you shape your sampling strategy as you learn more about your trace data and determine what information is valuable to your organization, and what isn't. You can disable some of the facets if you don't want to use them in your baseline strategy.

For example, suppose you don't want to keep most of your successful traces because you know they're successful, and you want to sample only a portion of your very small traces. In the Default section of the sampling strategy, you set the sample rate to 0.1%, and then set the sample rate for small traces to 10%.

In the high-value section of the sampling strategy, you define the criteria for the traces you want to keep. The facets for this part of the strategy might include failed traces, large traces, and slow traces.

When you assign your customized baseline behavior to a dataset, Observability Platform applies the following match criteria in order:

  1. All traces in the dataset get matched against the behavior's low-value trace criteria. If a trace meets one or more of these criteria, the lowest sample rate among the matching criteria applies.
  2. All remaining traces in the dataset get matched against the high-value trace criteria. If a trace meets one or more of these criteria, the highest sample rate among the matching criteria applies.
  3. If a trace doesn't match either of these criteria, the sample rate specified in the Default section applies.
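
The match order above can be sketched in Python as follows. This is an illustrative model, not the platform's implementation. The `Trace` and `Facet` classes, the thresholds, and the rates are assumptions, and the sketch assumes that the lowest or highest rate is taken across the facets a trace matches.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trace:
    span_count: int
    duration_seconds: float
    failed: bool


@dataclass
class Facet:
    name: str
    matches: Callable[[Trace], bool]
    sample_rate_percent: float


def baseline_sample_rate(
    trace: Trace,
    low_value: List[Facet],
    high_value: List[Facet],
    default_rate: float,
) -> float:
    """Apply low-value criteria first, then high-value criteria, then the default."""
    low = [f.sample_rate_percent for f in low_value if f.matches(trace)]
    if low:
        return min(low)   # assumption: lowest rate among matching low-value facets
    high = [f.sample_rate_percent for f in high_value if f.matches(trace)]
    if high:
        return max(high)  # assumption: highest rate among matching high-value facets
    return default_rate


# Hypothetical strategy: sample small traces at 10%, keep failed and slow traces.
low_value = [Facet("small", lambda t: t.span_count <= 2, 10.0)]
high_value = [
    Facet("failed", lambda t: t.failed, 100.0),
    Facet("slow", lambda t: t.duration_seconds >= 5, 80.0),
]

small_failed = baseline_sample_rate(Trace(2, 0.2, True), low_value, high_value, 0.1)
slow_failed = baseline_sample_rate(Trace(40, 7.5, True), low_value, high_value, 0.1)
ordinary = baseline_sample_rate(Trace(40, 0.3, False), low_value, high_value, 0.1)
```

In this model, a small trace that also failed is sampled at the low-value rate, because the low-value criteria are evaluated first.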

Edit the baseline behavior

You can edit facets of the baseline behavior using the Observability Platform app only.

To edit the baseline behavior:

  1. In the navigation menu, click Go to Admin and then select Control > Trace Control Plane.

  2. Click the Behaviors tab, and then click the Baseline behavior.

  3. On the Behavior details panel, click Edit.

  4. On the Edit baseline behavior panel, define your sampling methodology:

    • Head sampling: Specify the sample rate for head sampling to capture only a portion of possible traces from originating services.

      To specify a sample rate for head sampling in the baseline behavior, you must instrument head sampling first.

    • Tail sampling: Enable the facets to include in your tail sampling strategy, and specify a sample rate for each facet to drop low-value traces and keep high-value traces.

      Specific sample rates depend on how your organization configures individual services. The following recommendations are based on best practices observed across many organizations:

      • Large traces: Set the number of spans to sample to 10,000, and set a sample rate percentage of 25 or less. Some organizations might set the number of spans to sample closer to 3,000 and set the sample rate percentage to 5.
      • Slow traces: Set the minimum duration threshold to 5 seconds and set a sample rate between 80 percent and 100 percent.
      • Failed traces: Set the sample rate between 80 percent and 100 percent to ensure that a repository of interesting error information is available for engineers investigating a problem.
      • Small traces: Set the number of spans to sample to 2, and set a sample rate percentage of 50 or less.
      • Fast traces: Set the maximum duration threshold to 0.00001 seconds and set a sample rate percentage of 50 or less.

      To move facets between the low-value and high-value sections of the tail sampling strategy, click the left or right arrow for each facet you enable.

    • Default: Set the sample rate percentage to a mid-range value, such as 50. If a trace doesn't match the low or high-value criteria, then the sample rate specified in the Default section applies.

  5. Click Save to save the changes to your baseline behavior.
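
To compare a planned set of tail sampling rates against the recommended ranges listed in step 4 before entering them in the app, you could use a small check like this Python sketch. The dictionary keys and the `facets_outside_recommendations` function are hypothetical names for this example. They don't reflect a configuration schema used by Observability Platform.

```python
# Recommended sample rate ranges (percent), taken from the guidance in step 4.
RECOMMENDED_RATE_RANGES = {
    "large_traces": (0, 25),
    "slow_traces": (80, 100),
    "failed_traces": (80, 100),
    "small_traces": (0, 50),
    "fast_traces": (0, 50),
}


def facets_outside_recommendations(rates: dict) -> list:
    """Return the facet names whose rate falls outside the recommended range."""
    flagged = []
    for facet, rate in rates.items():
        low, high = RECOMMENDED_RATE_RANGES[facet]
        if not low <= rate <= high:
            flagged.append(facet)
    return sorted(flagged)


# Hypothetical plan: the slow and small trace rates deviate from the guidance.
proposed = {
    "large_traces": 5,
    "slow_traces": 60,
    "failed_traces": 100,
    "small_traces": 75,
}
flagged = facets_outside_recommendations(proposed)
```

A flagged facet isn't necessarily wrong, because specific sample rates depend on how your organization configures individual services.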

Create custom behaviors

Chronosphere provides specific trace behavior types your organization can use to define your sampling strategy for trace datasets. These behaviors let you determine which traces to drop and which to keep across assigned datasets.

As you learn more about your trace data, you can create custom behaviors and set different tail sampling rates for individual datasets. Custom behaviors let you modify behaviors for each dataset, ensuring that you're spending the highest portion of your budget on the most critical and relevant data.

To create a custom behavior, you duplicate the baseline behavior, define the tail sampling rules to apply, and set a default sampling rate for the behavior. You can then assign the custom behavior to a dataset.

After creating a custom behavior, you can duplicate that behavior rather than duplicating the baseline behavior.

To create a custom behavior:

  1. In the navigation menu, click Go to Admin and then select Control > Trace Control Plane.

  2. Click the Behaviors tab, and take one of the following actions to display the Duplicate behavior dialog:

    • In the Chronosphere managed behaviors section, in the row for the Baseline behavior, click Duplicate.
    • Click the Baseline behavior, and then on the Baseline page, click Duplicate.

  3. On the Duplicate behavior panel, enter a name and description for your custom behavior. Observability Platform generates a slug based on the name you enter. You can modify the slug to any alphanumeric combination, but special characters aren't supported.

  4. Modify the tail sampling attributes to reflect the traces you want to drop or keep. See the Edit the baseline behavior section for recommendations based on best practices observed across many organizations.

  5. Set the default sample rate percentage. If a trace doesn't match the low or high-value tail sampling attributes you defined, then the default sample rate applies.

  6. Click Save.

After creating a custom behavior, you can assign the behavior to a dataset.

Modify custom behaviors

After creating a custom behavior, you can modify its attributes. For example, you might want to modify the tail sampling strategy of a behavior as you learn more about the dataset the behavior is assigned to.

To modify a custom behavior:

  1. In the navigation menu, click Go to Admin and then select Control > Trace Control Plane.

  2. Click the Behaviors tab, and take one of the following actions to display the Edit behavior dialog:

    • In the Custom behaviors section, in the row for the behavior you want to modify, click Edit.
    • Click the behavior you want to modify, and then on the selected behavior page, click Edit.

  3. Modify the tail sampling attributes to reflect the traces you want to drop or keep. See the Edit the baseline behavior section for recommendations based on best practices observed across many organizations.

  4. Click Save.

Delete custom behaviors

To delete a behavior that's assigned to one or more datasets, you must first assign all active and scheduled datasets to a different behavior. Observability Platform won't let you delete a behavior that's assigned to a dataset.
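
The precondition for deleting a behavior can be modeled with a short Python sketch. The `Assignment` class, its status values, and the dataset and behavior names are hypothetical. The sketch only illustrates the rule that active and scheduled assignments block deletion.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Assignment:
    dataset: str
    behavior: str
    status: str  # assumed values: "active", "scheduled", or "ended"


def blocking_datasets(behavior: str, assignments: List[Assignment]) -> List[str]:
    """Return datasets that must be reassigned before the behavior can be deleted."""
    return sorted(
        {
            a.dataset
            for a in assignments
            if a.behavior == behavior and a.status in {"active", "scheduled"}
        }
    )


# Hypothetical assignment history for a custom behavior named "strict-sampling".
assignments = [
    Assignment("checkout", "strict-sampling", "active"),
    Assignment("payments", "strict-sampling", "scheduled"),
    Assignment("search", "strict-sampling", "ended"),
    Assignment("search", "baseline", "active"),
]
blockers = blocking_datasets("strict-sampling", assignments)
```

In this model, the behavior can be deleted only when the returned list is empty.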

To delete a custom behavior:

  1. In the navigation menu, click Go to Admin and then select Control > Trace Control Plane.

  2. Click the Behaviors tab, and take one of the following actions to display the Delete behavior dialog:

    • In the Custom behaviors section, in the row for the behavior you want to delete, click Delete.
    • Click the behavior you want to delete, and then on the selected behavior page, click Delete.

  3. On the Delete behavior confirmation dialog, click Delete.

Observability Platform deletes the custom behavior.