Service pages

A service page takes advantage of consistent metric tagging to automatically produce queries for common compute and RPC metric values, generate data visualization panels based on those interactive queries, and list the status of related monitors.

On the Services page, click any service's Name to see that service's detail page.

Click Edit to update the following values:

  • Service name: Change the service name. This value defaults to the name discovered by Chronosphere.
  • Parent team: Set a team to own the service.
  • Select a notification policy: Choose an optional notification policy.
  • Service description: Add a description of your service to help others understand the service's purpose.

Service page contents can't be directly modified or customized. Chronosphere maintains these views for you as a streamlined entry point to a service's telemetry data.

If you require more customization than a service page provides, you can create a dashboard, write your own queries, or configure your own data visualization panels. If you need a more customizable home page for a team or collection, see Team and collection home pages.

The service page exposes the queries that it uses for the data visualization panels. You can copy and paste them into Metrics Explorer or a dashboard you've created and modify them as needed.

Service page components

The service page header contains links to the service's related team and a dropdown to select a different service.

The service page also contains these columns:

  • The primary column displays data visualization panels for compute metrics and RPC metrics (if available), controls for filtering these panels (including by time span), and a list of all monitors related to the service (along with their statuses).

  • The sidebar column collects links to related resources, which can include traces in the Trace Explorer, change events in the Changes Explorer, logs in the Logs Explorer, and dashboards.

    To enable links to Logs Explorer from your service pages, contact Chronosphere Support and provide the following information:

    • The name of your organization from your Observability Platform URL. For example,
    • The LogScale repository you want to open when clicking the link on a service page.
    • The field name in your log data that refers to your service.

Visualization panels

Each service displays visualization panels populated with queries for commonly tracked metrics.

  • Compute metrics panels visualize CPU Usage, Memory Usage, and Network Usage for all pods with metrics tagged for this service.

  • RPC panels visualize Requests per second, Errors per second, and Duration P99 (slowest request in the 99th percentile of requests). Panels with an alert have truncated queries. Panels with a warning use a query that took too long to run.

Filter metrics

Service pages support global filters. You can filter metrics by labels and their values. To view and set filters for available labels:

  1. Click the Apply global filter dropdown to display a list of all labels and selectable values for them.

  2. Click a label value to select it. This filters all queries in the service page's metrics visualization panels to display only the metrics with those label values.

    Because the dependency map displays services from trace data, selecting filters doesn't affect that view.
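As a rough illustration of what a global filter does to the data behind the panels, the following Python sketch keeps only the series whose labels match the selected values. The `apply_global_filters` function, the series representation, and the label values are hypothetical. It also assumes that several values selected for one label are alternatives, and that filters on different labels must all match, which this page doesn't confirm.

```python
from typing import Dict, List, Set

# Hypothetical representation: one series is a mapping of label name to label value.
Series = Dict[str, str]


def apply_global_filters(series: List[Series], filters: Dict[str, Set[str]]) -> List[Series]:
    """Keep only the series that match every active filter.

    Assumption: values selected for the same label are alternatives (OR),
    and filters on different labels must all match (AND).
    """
    return [
        s
        for s in series
        if all(s.get(label) in values for label, values in filters.items())
    ]


# Hypothetical series discovered for one service.
all_series = [
    {"pod": "api-1", "container": "server"},
    {"pod": "api-2", "container": "server"},
    {"pod": "api-2", "container": "sidecar"},
]

# Selecting pod=api-2 and container=server leaves a single series.
filtered = apply_global_filters(all_series, {"pod": {"api-2"}, "container": {"server"}})
print(filtered)
```

With no active filters, the function returns every series unchanged.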

If any filters are active, they appear as a list of labels and values in the Apply global filter field. To remove a filter, click the cancel button for that filter in the list.

Interact with panels

A service page's visualization panels display data as time-series line charts, which are interactive.

  • Hold the pointer over lines in a chart to view data at that point.
  • Click a line to pin the data view at that point. Click Show All in the view to toggle between showing only the selected line and all lines at that point in time.
  • Click and drag a region in the chart to narrow the view to a time range. Selecting a time range modifies all panels simultaneously, which helps you correlate data across the different types of data displayed on the page.

You can also select a time range by clicking the calendar icon, which displays either a list of time ranges or the option to define a custom range.

Filter panels

You can filter the panels for a service based on the discovered metric labels. For example, if a service includes a Compute panel that includes metric labels like container and pod, you can select individual values for that label to display usage statistics for only those instances.

The default behavior displays All values.

To filter panels:

  1. In a panel, click the dropdown arrow for the label you want to select values for.
  2. Select the values you want to display usage statistics for in the panels.

Explore queries

Each panel's visualization uses an automatically generated metrics query. You can't modify the query or visualization type in this view. To edit and inspect a panel's query and examine its results in a larger view, you can open it in the Metrics Explorer.

You can also copy the query from the Metrics Explorer and paste it into other tools, including a visualization panel on a dashboard.

To open a panel's query in the Metrics Explorer:

  1. Click the three vertical dots icon in the panel you want to explore.
  2. In the menu that appears, click Open in Metrics Explorer.

Dependency map

Dependency maps require tracing data.

When receiving an alert about your service, you need to identify and fix the issue as quickly as possible. If the issue isn't with your service, you want to inform the responsible team so they can investigate the issue.

Use dependency maps to view a perspective that's scoped to an individual service, including its upstream and downstream dependencies. These maps visualize trends in duration, requests, and errors, so you can understand trends to focus your investigation. See Service exploration for an example of how to use dependency maps to help identify issues.

Understand trends

Each service has its own dependency map that includes trending requests, errors, and latencies for that service and upstream and downstream services.

Arrows on connecting edges indicate increases and decreases in trends for duration, requests, and errors, so you can identify spikes to focus your investigation. Although the icons are the same, each trend uses a different color for the arrow icons to visually indicate the trend type. When you select a trend statistic from the Trends menu, the icons change to match your selection.

Single, double, and triple arrows indicate different percentages of change. An edge with a triple arrow indicates a larger trend than an edge with a single arrow:

Arrow type    Icon                 Indication
Single        Single arrow icon    Change of 0% up to 5%
Double        Double arrow icon    Change greater than 5%, up to 15%
Triple        Triple arrow icon    Change greater than 15%
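The thresholds above can be sketched as a small Python function. The `arrow_type` name is hypothetical, and the sketch assumes the thresholds apply to the magnitude of the change, so that increases and decreases of the same size get the same number of arrows. This page doesn't state how Chronosphere computes the percentage itself.

```python
def arrow_type(percent_change: float) -> str:
    """Map a percentage change to the arrow type shown on an edge.

    Assumption: only the size of the change matters, not its direction.
    """
    magnitude = abs(percent_change)
    if magnitude <= 5:
        return "single"   # change of 0% up to 5%
    if magnitude <= 15:
        return "double"   # change greater than 5%, up to 15%
    return "triple"       # change greater than 15%


for change in (3.0, 5.0, 12.5, -20.0):
    print(change, arrow_type(change))
```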

The thickness of an edge correlates with request volume. The thicker the edge, the higher the request volume.

Use trends and edge thickness in combination to gauge trend severity. For example, a thin edge with a triple arrow indicates very few requests, but a high trend increase. This combination might be less insightful than a thick edge, which on its own might indicate many requests to a highly used request path.

Find the root cause

Using the dependency map on its own might be enough to identify where issues are occurring and which services to investigate. Sometimes, you need more information to identify the root cause, such as exploring span details to identify issues with a particular operation in a service.

From a dependency map, you can open a predefined query in Trace Explorer for the selected service. This gives you a starting point for your investigation, so you don't have to create a query without any context. Click the more icon and choose one of these options:

  • Explore trace data opens Trace Explorer with a predefined query. You can add additional search criteria as you define your search.
  • View full map opens Trace Explorer, scoped to the topology view. The full topology map provides a broader perspective to help identify upstream and downstream issues.

Trend statistics

In the dependency map, hold the pointer over a service to display totals for the selected trend statistic.

To view more detailed trend statistics, click an individual service in the map or one of its edges to display incoming and outgoing data, such as requests, errors, leaf errors, and duration:

  • Requests: The count of all spans in the selected group within the selected time range. Ranked in descending order.
  • Errors: The number of spans that indicate an error outcome. Ranked in descending order.
  • Leaf errors: Error spans that have no failing child spans. These spans are often the potential cause of a trace's failure. Navigating directly to leaf errors helps filter out propagated errors, and provides clearer signals about the source of an error that might be causing the entire trace to fail. Ranked in descending order.
  • Median duration P50: Ranks the spans of each group in order of duration, and selects the duration of the span in the middle of the list (fiftieth percentile). Ranks groups in descending order of this duration.
  • Tail duration P99: Tail refers to the statistical notion of the upper tail of a distribution. This statistic ranks the spans of each group in order of duration, and selects the duration of the span that's 99% of the way through the list, meaning, a span that typically has a high duration. Ranks service and operation in descending order of this duration.
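To make the leaf error and duration definitions above concrete, the following Python sketch computes them over a handful of spans. The `Span` type, the function names, and the sample data are hypothetical. The percentile uses the nearest-rank method as one reasonable reading of "the span that's 99% of the way through the list", which might differ from how Chronosphere calculates these values.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]  # None for a root span
    duration_ms: float
    error: bool = False


def leaf_errors(spans: List[Span]) -> List[Span]:
    """Return error spans that have no failing child spans."""
    # IDs of spans that have at least one failing child.
    failing_parents = {
        s.parent_id for s in spans if s.error and s.parent_id is not None
    }
    return [s for s in spans if s.error and s.span_id not in failing_parents]


def duration_percentile(spans: List[Span], p: int) -> float:
    """Nearest-rank percentile of span durations (assumed method)."""
    if not spans:
        raise ValueError("need at least one span")
    durations = sorted(s.duration_ms for s in spans)
    # Rank is the 1-based position that is p% of the way through the sorted list.
    rank = max(1, math.ceil(p * len(durations) / 100))
    return durations[rank - 1]


# Hypothetical trace: "a" fails only because its child "b" fails.
spans = [
    Span("a", None, 120.0, error=True),
    Span("b", "a", 80.0, error=True),
    Span("c", "b", 30.0),
    Span("d", "a", 10.0),
    Span("e", "a", 50.0),
]

print([s.span_id for s in leaf_errors(spans)])  # leaf error spans
print(duration_percentile(spans, 50))           # median duration
print(duration_percentile(spans, 99))           # tail duration
```

In this sample, span "a" is excluded from the leaf errors because its error propagates from child "b", which is the kind of noise that the leaf error statistic filters out.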

See more details

If your service has Compute(cAdvisor) or RPC(gRPC) panels, you can click a More details link to open a view with filters scoped to the service page you came from. These views support global filters.

For example, clicking the More details link on a Compute(cAdvisor) panel opens that view with CPU usage statistics scoped to the service page you came from.

Metrics on the Compute(cAdvisor) page are defined in Monitoring cAdvisor with Prometheus. Chronosphere uses these metrics to display CPU, memory, network, and disk usage metrics during the selected time period.

The first group of graphs displays the average and maximum values for that metric. In the CPU, Memory, Network, and Disk details sections, the relevant statistics might be further broken down by pod and node if those statistics are available.

With this information, you can more clearly visualize which pods or nodes have the highest contributions to each statistic.

Use the Compare menu to select a second time period to graph against the currently selected period.

Drag in any chart to select a time frame across all charts on the page.