Service pages
A service page takes advantage of consistent metric tagging to automatically produce queries for common compute and RPC metric values, generate data visualization panels based on those interactive queries, and list the status of related monitors.
Service pages are part of the Services feature and might not be available in your app. For information about enabling this feature in your environment, contact Chronosphere Support.
On the Services page, click any service's Name to see that service's detail page.
To edit basic information about a service:
- In the Service information section, click Edit.
- Make any needed changes to the following values:
  - Service name: Change the service name. This value defaults to the name discovered by Chronosphere Observability Platform.
  - Parent team: Set a team to own the service.
  - Select a notification policy: Choose an optional notification policy.
  - Service Description: Add a description of your service to help others understand the service's purpose.
- Click Save.
Service page contents can't be directly modified or customized. Observability Platform maintains these views for you as a streamlined entry point to a service's telemetry data.
If you require more customization than a service page provides, you can create a dashboard, write your own queries, or configure your own data visualization panels. If you need a more customizable home page for a team or collection, see Team and collection home pages.
The service page exposes the queries that it uses for the data visualization panels. You can copy and paste them into Metrics Explorer or a dashboard you've created and modify them as needed.
Service page components
The service page header contains links to the service's related team and a dropdown to select a different service.
The service page also contains these columns:
- The primary column displays data visualization panels for compute metrics and RPC metrics (if available), controls for filtering these panels (including by time span), and a list of all monitors related to the service (along with their statuses).
- The sidebar column collects links to related resources, which can include traces in the Trace Explorer, change events in the Changes Explorer, logs in the Logs Explorer, and dashboards.
Observability Platform includes the following default components:
- Compute (cAdvisor)
- RPC (gRPC)
- Istio
- OTEL HTTP server metrics
To enable links to Logs Explorer from your service pages, contact Chronosphere Support and provide the following information:
- The name of your organization from your Observability Platform URL. For example, MY_COMPANY.chronosphere.io:443.
- The LogScale repository you want to open when clicking the link on a service page.
- The field name in your log data that refers to your service.
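If you're unsure which part of the URL is the organization name, the following minimal Python sketch extracts it from a host string. It assumes the host always takes the form shown in the example above (an organization label followed by `.chronosphere.io` and an optional port). The function name `organization_from_host` is hypothetical and isn't part of any Chronosphere tooling.

```python
def organization_from_host(host: str) -> str:
    """Return the organization label from a host such as ORG.chronosphere.io:443."""
    # Drop an optional ":port" suffix.
    hostname = host.split(":", 1)[0]
    suffix = ".chronosphere.io"
    if not hostname.endswith(suffix):
        raise ValueError(f"unexpected host: {host!r}")
    org = hostname[: -len(suffix)]
    # Assumption: the organization is a single label in front of the suffix.
    if not org or "." in org:
        raise ValueError(f"can't determine the organization from: {host!r}")
    return org


print(organization_from_host("MY_COMPANY.chronosphere.io:443"))  # MY_COMPANY
```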
Visualization panels
Each service displays visualization panels populated with queries for commonly tracked metrics.
- Compute metrics panels visualize CPU Usage, Memory Usage, and Network Usage for all pods with metrics tagged for this service.
- RPC panels visualize Requests per second, Errors per second, and Duration P99 (slowest request in the 99th percentile of requests). Panels with an alert have truncated queries. Panels with a warning use a query which took too long.
Filter metrics
Service pages support pinned scopes. You can filter metrics by labels and their values. To view and set scopes for available labels:
- Click the Pin scope dropdown to display a list of all labels and selectable values for them.
- Click a label value to select it. Selecting a value filters all queries in the service page's metrics visualization panels to display only the metrics with those label values.
Because the dependency map displays services from trace data, selecting scopes doesn't affect that view.
If any filters are active, they appear as a list of labels and values in the Apply global filter field. To remove a filter, click the cancel button for that filter in the list.
Interact with panels
A service page's visualization panels display data as time-series line charts, which are interactive.
- Hold the pointer over lines in a chart to view data at that point.
- Click a line to pin the data view at that point. Click Show All in the view to toggle between showing only the selected line and all lines at that point in time.
- Click and drag a region in the chart to narrow the view to a time range. Selecting a time range modifies all panels simultaneously, which helps you correlate data across the different types of data displayed on the page.
You can also select a time range by clicking the clock icon, which displays either a list of time ranges or the option to define a custom range.
Annotations
Services can integrate with annotations for events. If your service has Kube state metrics, those annotations can display on your service graphs.
Next to the Global filters search box, select the toggle. This toggle is configurable. For example, Kube Deploy Events is an available toggle for annotations.
Activating the toggle updates the service's graphs to display annotation marks along the top of each graph where events occurred. Hold the pointer over an annotation mark to display additional information about the event, such as the occurrence time and the global filters the event matches.
Kube Deploy Events is the only metric configured to use annotations with services. To configure additional metrics, contact Chronosphere Support.
Filter panels
You can filter the panels for a service based on the discovered metric labels. For example, if a service includes a Compute panel with metric labels like container and pod, you can select individual values for those labels to display usage statistics for only those instances.
The default behavior displays All values.
To filter panels:
- In a panel, click the dropdown arrow for the label you want to select values for.
- Select the values you want to display usage statistics for in the panels.
Explore queries
Each panel's visualization uses an automatically generated metrics query. You can't modify the query or visualization type in this view. To edit and inspect a panel's query and examine its results in a larger view, you can open it in the Metrics Explorer.
You can also copy the query from the Metrics Explorer and paste it into other tools, including a visualization panel on a dashboard.
To open a panel's query in the Metrics Explorer:
- Click the three vertical dots icon in the panel you want to explore.
- In the menu that appears, click Open in Metrics Explorer.
Display change events
To enable change events on service pages, contact Chronosphere Support.
You can enable change events to display them on panels for each of your services. This capability lets you visualize changes in your environment relative to a specific service.
To display events on service pages, your organization must meet the following requirements:
- Your organization must have an active license for the Events product, and the product must be active in your tenant.
- Events data must have service names provided on a metric label.
- Service name values for each event must exactly match the service name value used in metrics.
- Observability Platform service discovery must be configured and actively discovering services.
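The exact-match requirement is easy to violate with differences in case or stray whitespace. The following Python sketch shows one way to spot such mismatches before you enable change events. It assumes you've already exported the service names from your events and metrics into two lists. The function `unmatched_event_services` and the sample names are hypothetical and aren't part of Observability Platform.

```python
from typing import Dict, Iterable, Optional


def unmatched_event_services(
    event_names: Iterable[str], metric_names: Iterable[str]
) -> Dict[str, Optional[str]]:
    """Map each event service name with no exact metric match to a near miss, if any."""
    metric_set = set(metric_names)
    # Index metric names by a normalized form to suggest likely intended matches.
    normalized = {name.strip().lower(): name for name in metric_set}
    report: Dict[str, Optional[str]] = {}
    for name in sorted(set(event_names)):
        if name in metric_set:
            continue  # exact match, nothing to report
        report[name] = normalized.get(name.strip().lower())
    return report


events = ["auth", "Checkout", "billing "]
metrics = ["auth", "checkout", "payments"]
print(unmatched_event_services(events, metrics))
# {'Checkout': 'checkout', 'billing ': None}
```

A value of `None` means no metric service name matches even after normalizing case and whitespace, so the event likely refers to a service that isn't discovered from metrics.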
After enabling change events, you can choose the category of events you want to display. You can select one or more categories. Clicking a change event from a panel gives you the option to open the event in Changes Explorer to view the full summary for that event.
Enabling change events on service panels also adds a Change Events table to your service page that contains a list of events related to the selected service. Clicking a change event from the table opens a panel for the specific event that displays the full summary.
To display change events for services:
- On the Services page, click any service's Name to open the service's detail page.
- On the service's detail page, click the Change Events toggle to enable change events for that service.
- In the Category dropdown, select the change event categories you want to display on your service panels.
- In any panel, hold the pointer over a change event to display more details for a particular event.
- Click the event to pin it, and then click Open in Changes Explorer to open the event in Changes Explorer with the full summary for that event.
From Changes Explorer, you can filter specifically on services by including lens_service and the name of the service you want to display. For example, the following filter displays all deploy events for the auth service:
lens_service = "auth" AND category = "deploys"
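If you build these filters programmatically (for example, to generate links or runbook snippets), a small helper can keep the syntax consistent. The following Python sketch reproduces only the filter form shown in the preceding example. The helper name `changes_filter` is hypothetical, and because this page doesn't document how to escape quotation marks inside values, the sketch rejects such values rather than guessing.

```python
from typing import Optional


def changes_filter(service: str, category: Optional[str] = None) -> str:
    """Build a Changes Explorer filter string scoped to one service."""
    for value in (service, category):
        # Escaping rules aren't documented here, so refuse embedded quotes.
        if value is not None and '"' in value:
            raise ValueError(f"unsupported character in value: {value!r}")
    clauses = [f'lens_service = "{service}"']
    if category is not None:
        clauses.append(f'category = "{category}"')
    return " AND ".join(clauses)


print(changes_filter("auth", "deploys"))
# lens_service = "auth" AND category = "deploys"
```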
Change events also display in a table on pages for individual services. The information is similar to the change events list in Changes Explorer, but is filtered to the selected service.
Dependency map
To view dependency maps for trace data in service pages, your organization must purchase distributed tracing and have it enabled in your tenant.
When receiving an alert about your service, you need to identify and fix the issue as quickly as possible. If the issue isn't with your service, you want to inform the responsible team so they can investigate the issue.
Use dependency maps to view a perspective that's scoped to an individual service, including its upstream and downstream dependencies. These maps visualize trends in duration, requests, and errors, so you can understand trends to focus your investigation. See Service exploration for an example of how to use dependency maps to help identify issues.
Understand trends
Each service has its own dependency map that includes trending requests, errors, and latencies for that service and upstream and downstream services.
Arrows on connecting edges indicate increases and decreases in trends for duration, requests, and errors, so you can identify spikes to focus your investigation. Although the icons are the same, each trend uses a different color for the arrow icons to visually indicate the trend type. When you select a trend statistic from the Trends menu, the icons change to match your selection.
Single, double, and triple arrows indicate different percentages of change. An edge with a triple arrow indicates a larger trend than an edge with a single arrow:
Arrow type | Indication |
---|---|
Single | Change of 0% up to 5% |
Double | Change greater than 5%, up to 15% |
Triple | Change greater than 15% |
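The thresholds in the preceding table can be expressed as a short function. This Python sketch is only an illustration of the table, not the platform's implementation. It assumes the thresholds apply to the magnitude of the change, so increases and decreases of the same size get the same number of arrows. The name `arrow_count` is hypothetical.

```python
def arrow_count(percent_change: float) -> int:
    """Return 1, 2, or 3 arrows for a trend change, following the table's thresholds."""
    # Assumption: increases and decreases are classified by magnitude.
    magnitude = abs(percent_change)
    if magnitude <= 5:
        return 1  # single: 0% up to 5%
    if magnitude <= 15:
        return 2  # double: greater than 5%, up to 15%
    return 3      # triple: greater than 15%


for change in (3.0, -5.0, 12.5, -40.0):
    print(change, arrow_count(change))
# 3.0 1
# -5.0 1
# 12.5 2
# -40.0 3
```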
The thickness of an edge correlates with request volume. The thicker the edge, the higher the request volume.
Use trends and edge thickness in combination to gauge trend severity. For example, a thin edge with a triple arrow indicates very few requests, but a high trend increase. This combination might be less insightful than a thick edge, which on its own might indicate many requests to a highly used request path.
Find the root cause
Using the dependency map on its own might be enough to identify where issues are occurring and which services to investigate. Sometimes, you need more information to identify the root cause, such as exploring span details to identify issues with a particular operation in a service.
From a dependency map, you can open a predefined query in Trace Explorer for the selected service. This gives you a starting point for your investigation, so you don't have to create a query without any context. Click the more icon and choose one of these options:
- Explore trace data opens Trace Explorer with a predefined query. You can add additional search criteria as you define your search.
- View full map opens Trace Explorer, scoped to the topology view. The full topology map provides a broader perspective to help identify upstream and downstream issues.
Trend statistics
In the dependency map, hold the pointer over a service to display totals for the selected trend statistic.
To view more detailed trend statistics, click an individual service in the map or one of its edges to display incoming and outgoing data, such as requests, errors, leaf errors, and duration:
- Requests: The count of all spans within the selected group during the selected time range. Ranked in descending order.
- Errors: The number of spans that indicate an error outcome. Ranked in descending order.
- Leaf errors: Error spans that have no failing child spans. These spans are often the potential cause of a trace's failure. Navigating directly to leaf errors helps filter out propagated errors, and provides clearer signals about the source of an error that might be causing the entire trace to fail. Ranked in descending order.
- Median duration P50: Ranks the spans of each group in order of duration, and selects the duration of the span in the middle of the list (fiftieth percentile). Ranks groups in descending order of this duration.
- Tail duration P99: Tail refers to the statistical notion of the upper tail of a distribution. This statistic ranks the spans of each group in order of duration, and selects the duration of the span that's 99% of the way through the list, meaning, a span that typically has a high duration. Ranks service and operation in descending order of this duration.
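To make these definitions concrete, the following Python sketch computes each statistic for a small group of spans. It's an illustration under stated assumptions rather than a description of how Observability Platform computes the values: the `Span` type and its fields are hypothetical, and percentiles use the nearest-rank method, which matches the "rank and select" wording above.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]
    duration_ms: float
    error: bool = False


def percentile(durations: List[float], pct: float) -> float:
    """Nearest-rank percentile: sort, then pick the value pct% of the way through."""
    if not durations:
        raise ValueError("no durations")
    ordered = sorted(durations)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def trend_statistics(spans: List[Span]) -> Dict[str, float]:
    # IDs of spans that have at least one failing child span.
    failing_parents = {s.parent_id for s in spans if s.error and s.parent_id is not None}
    error_spans = [s for s in spans if s.error]
    # Leaf errors: error spans with no failing child spans.
    leaf_errors = [s for s in error_spans if s.span_id not in failing_parents]
    durations = [s.duration_ms for s in spans]
    return {
        "requests": len(spans),
        "errors": len(error_spans),
        "leaf_errors": len(leaf_errors),
        "p50_ms": percentile(durations, 50),
        "p99_ms": percentile(durations, 99),
    }


spans = [
    Span("a", None, 120, error=True),  # failed because its child "b" failed
    Span("b", "a", 80, error=True),    # leaf error: no failing children
    Span("c", "a", 30),
    Span("d", "b", 10),
    Span("e", None, 45),
]
print(trend_statistics(spans))
# {'requests': 5, 'errors': 2, 'leaf_errors': 1, 'p50_ms': 45, 'p99_ms': 120}
```

In this sample, span `a` is an error only because its child `b` failed, so `b` is the single leaf error and the more useful place to start investigating.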
Identify issues behind suspicious trends
The dependency map visualizes trends, but understanding what issues are causing trends is critical to resolving the underlying issue. From a dependency map, you can use differential diagnosis to identify trends and immediately scan through all related tags and values to pinpoint the exact tag:value pairs most closely correlated with suspicious behavior.
When you click a service in a dependency map, trends display for duration, requests, and errors. From this detailed view, click the more icon for the statistic you want to run differential diagnosis on, and then click Differential Diagnosis.
Observability Platform opens the Trace Explorer page and selects the Differential Diagnosis tab. The context you selected in the dependency map moves with you to Trace Explorer.
See more details
If your service has Compute (cAdvisor) or RPC (gRPC) panels, you can click a More details link to open a view with filters scoped to the service page you came from. These views support pinned scopes.
For example, clicking the More details link on a Compute (cAdvisor) panel opens that view with CPU usage statistics scoped to the service page you came from.
Metrics on the Compute (cAdvisor) page are defined in Monitoring cAdvisor with Prometheus. Observability Platform uses these metrics to display CPU, memory, network, and disk usage metrics during the selected time period.
The first group of graphs displays the average and maximum values for that metric. In the CPU, Memory, Network, and Disk details sections, the relevant statistics might be further broken down by pod and node if those statistics are available.
With this information, you can more clearly visualize which pods or nodes have the highest contributions to each statistic.
Use the Compare menu to select a second time period to graph against the current selected period.
Drag in any chart to select a time frame across all charts on the page.
Own dashboards and monitors
This feature isn't available to all Chronosphere Observability Platform users and might not be visible in your app. For information about enabling this feature in your environment, contact Chronosphere Support.
Services can own dashboards and monitors. Monitors owned by a service directly affect the service's status.
The total number of owned and connected items displays next to Dashboards or All monitors. For example, a service with one connected dashboard and one owned dashboard displays (2). Click Manage next to either title to open the management menu. From this menu you can:
- See the number of Owned items. Click to manage.
- See the number of Connections. Click to manage.
- Create a New dashboard or New monitor.
In some instances, you can use the Observability Platform app to create or edit a monitor or dashboard and assign it to a service.
To add a dashboard or monitor to a service, with the service as the owner:
- In the navigation menu, click Go to Admin.
- Click Services, and then select a service.
- Next to Dashboards or All Monitors, click Manage and then New dashboard or New monitor.
- Populate the form and click Save.
The selected service is listed as the owner.
Connected resources
In addition to displaying monitors and dashboards owned by a service, you can connect these resources to a service, even when they're owned by another service. Connecting monitors and dashboards to services can help you more quickly identify issues in upstream or downstream services.
Dashboards and monitors display in tables on the right side of the page. You can connect existing resources to the currently viewed service. Dashboards owned by the selected service prepopulate the list.
Connect a service to a resource
To connect an existing resource:
- Next to Dashboards or All Monitors, click Manage.
- Select Manage connections.
- Search for a resource using one or more of the following methods:
  - Scroll the list until you find the resource.
  - Type the resource name in the Search box.
  - Use the Select a team menu to choose a specific team.
  - Use the Select an owner menu to choose a specific collection or service.
- Select the checkboxes next to the desired resources. You can select multiple resources at once.
- Click Connect [resources].
When resources are connected to a service, they display in the Connected [resources] table.
Edit a connected resource
- Next to All monitors or Dashboards, click Manage.
- Select Manage connections.
- Make changes.
Affect service health
When viewing a connected monitor, toggle Affects service health to include this resource in the service status.
Remove a connection to a dashboard or monitor
- Next to All monitors or Dashboards, click Manage.
- Select Manage connections.
- In the Linked dashboards or Linked monitors section, select the checkboxes next to the items you want to unlink.
- Click Delete [resource] connections.