Signals
Signals use a unique set of labels to create groups of notifications when a monitor alert triggers or resolves. The groups display information from your monitor queries that aggregation might otherwise remove. Use signals to reduce the number of notifications sent, or to improve visibility by sending an alert for each time series.
Signals use the following defaults:
- Alerts send for the initial and any subsequent triggers for all signal types.
- All monitors default to per monitor when created unless you choose another option.
- Chronosphere Observability Platform sends an additional notification every five minutes when a new time series alert triggers.
- If an alert remains active but no new time series trigger, Observability Platform sends a notification every hour.
- Signals use the notification policies set in Observability Platform.
Use the following options to group alerts, which affects how many notifications Observability Platform sends:
- Per monitor (one alert): Sends one notification containing all time series that meet the conditions. Use this option if you want only one notification that contains all time series.
- Per signal (multiple alerts): Sends one notification for each group in your monitor. Helpful for logically grouping time series into the same alert.
- Per time series (many alerts): Sends one notification for every time series returned by the monitor query. This option sends the most alerts, but is helpful if you want a notification for every time series that triggers your query.
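As a rough sketch of how these options differ, assume a monitor whose query returns four alerting time series. The metric name, labels, and threshold below are hypothetical:
# Hypothetical monitor query
sum by (service, instance) (rate(http_errors_total[5m])) > 10
# Assume it returns four alerting series:
#   {service="checkout", instance="a"}
#   {service="checkout", instance="b"}
#   {service="search", instance="c"}
#   {service="search", instance="d"}
#
# Per monitor:                        1 notification containing all four series.
# Per signal with label key service:  2 notifications, one for checkout and one for search.
# Per time series:                    4 notifications, one per series.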
You can configure a signal on an existing monitor or when creating a new monitor.
Observability Platform reserves specific Prometheus labels such as alertname and severity. Observability Platform also uses the severity label to group alerts, except when a monitor specifies a signal per series.
Refer to the Prometheus metric naming recommendations for additional information.
View signals
Select from the following methods to view signals on monitors.
On the Monitors page, monitors with defined signals display the file tree icon.
To view signals:
- In the navigation menu select Alerts > Monitors.
- Select the monitor to view.
Signals for the monitor display in the Signals section of the monitor definition.
Edit signals
- In the navigation menu select Alerts > Monitors and select the monitor to edit.
- Click the three vertical dots icon and select Edit Monitor.
- Scroll to the Signals section and select one of the following options:
  - Per monitor (one alert): Observability Platform sends a notification using your selected notification policy. It includes all time series triggered by the monitor. This alert can send additional notifications if new time series trigger.
  - Per signal (multiple alerts): In the Label Key field, choose the label to group alerts from this monitor. To add more label keys, click the add icon.
    You can use query aggregation to include or exclude specific labels (a sketch showing label exclusion follows these steps). For example, create a query to group results by only the namespace and instance labels:
    count by (namespace, instance) (up)
  - Per time series (many alerts): The monitor sends a notification for every time series as it triggers. You can change the alert behavior or channel by changing the policy for the monitor or editing the notification policy.
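For the exclusion case mentioned in the per signal option above, a minimal sketch uses PromQL's without aggregation to drop a label from the grouping rather than keep it:
count without (instance) (up)
This aggregates away the instance label while keeping all other labels on the result, so instance no longer contributes to how alerts group.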
Signal examples
Use the following examples to learn how to configure each signal type.
Per monitor
This example query generates a single notification that includes all alerting time series.
- Enter the following query in the Query field, and select 15s as the check interval:
  count by (namespace, job, instance) ({instance!="", namespace=~""})
- In the Signals section, select Per monitor (one alert).
- In the Conditions section, select Critical, choose is > as the operator, and enter 5s in the Sustain field.
When the critical condition matches any time series, Observability Platform sends a single notification that includes all alerting time series.
Per signal
In this example, you configure a query to track outages and use signals to track multiple time series.
Four teams (frontend, backend, database, and search) are working on different components of a project. Each component has a set of services and resources its team monitors for performance and availability.
Observability Platform ingests these metrics:
resource_status{component="frontend", resource_type="availability", service_name="web_app"} 0
resource_status{component="frontend", resource_type="performance", service_name="web_app"} 0
resource_status{component="backend", resource_type="availability", service_name="api"} 0
resource_status{component="database", resource_type="availability", service_name="db_cluster_1"} 0
resource_status{component="database", resource_type="availability", service_name="db_cluster_2"} 0
resource_status{component="database", resource_type="availability", service_name="db_cluster_3"} 0
resource_status{component="search", resource_type="performance", service_name="search_engine"} 92
resource_status{component="search", resource_type="availability", service_name="search_engine"} 100
Each monitored service sends a metric to Observability Platform called resource_status. This metric has the following labels:
- component: The component and team name.
- resource_type: Either availability or performance for a tracked metric resource.
- service_name: The name of the service.
To define the signals for this data:
- Enter the following query in the Query field to alert your teams when a resource isn't working as expected, and select 15s as the check interval:
  resource_status == 0
  This query looks for a status of 0, which indicates a service failure.
- In the Signals field, select Per signal (multiple alerts) and enter the following labels in the Label key field to send the component and resource_type information when an alert triggers:
  component,resource_type
- In the Conditions section, select Critical, choose is > as the operator, and enter 5s in the Sustain field.
When a matching alert triggers:
- The frontend team receives a notification with these metrics:
  resource_status{component="frontend", resource_type="availability", service_name="web_app"} 0
  resource_status{component="frontend", resource_type="performance", service_name="web_app"} 0
- The backend team receives a notification with this metric:
  resource_status{component="backend", resource_type="availability", service_name="api"} 0
- The database team receives a notification with these metrics:
  resource_status{component="database", resource_type="availability", service_name="db_cluster_1"} 0
  resource_status{component="database", resource_type="availability", service_name="db_cluster_2"} 0
  resource_status{component="database", resource_type="availability", service_name="db_cluster_3"} 0
The search team doesn't receive any alerts because their services are functioning.
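To preview how the signal label keys split the alerting series into groups, you can run an ad-hoc aggregation over the same labels before saving the monitor. A minimal sketch using the query and label keys from this example:
count by (component, resource_type) (resource_status == 0)
Each series in the result represents one group of alerting time series for the chosen label keys, and its value shows how many series fall into that group.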
Per time series
This example uses the same query and condition as in the per signal example. The only difference is that in the Signals field, you select Per time series (many alerts).
- Enter the following query in the Query field, and select 15s as the check interval:
  resource_status == 0
- In the Signals field, select Per time series (many alerts) and enter the following labels in the Label key field to send the component and resource_type information when an alert triggers:
  component,resource_type
- In the Conditions section, select Critical, choose is > as the operator, and enter 5s in the Sustain field.
When the critical condition matches, Observability Platform sends a notification for each time series in the monitor query, using your notification policy.
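To estimate how many notifications a per time series signal generates, you can count the series the monitor query returns. A minimal sketch using the query from this example:
count(resource_status == 0)
With the sample resource_status metrics shown earlier, this returns 6, so the monitor would send six separate notifications if all of those series trigger the condition.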