Service exploration

This feature isn't available to all Chronosphere Observability Platform users and might not be visible in your app. For information about enabling this feature in your environment, contact Chronosphere Support.

The Services page provides focused views into your services and suggests ways of exploring your data. From the Services page, you can link directly to tracing data related to a specific service. Chronosphere Observability Platform generates these links automatically to help you monitor services and their connected telemetry.

  1. In the navigation menu, select Services.

  2. In the list of services, notice that the Deployer Service has a currently alerting monitor that's exceeding its defined critical conditions.

  3. Select the Deployer Service to display its individual service page.

  4. In the dependency map, in the Trends dropdown, select Errors.

    You notice that the edge for this service has downstream errors; the first sketch after these steps shows what an edge error rate like this aggregates. From here, you can investigate trace data in Trace Explorer.

  5. Click the outgoing edge for the Deployer Service to display a panel with statistics about this edge.

  6. In the panel containing edge information, next to Requests, click the more icon to display these options.

    When you select one of these options, Observability Platform navigates you to Trace Explorer and applies the context defined in the service page:

    • View in Trace Explorer opens the Query builder in Trace Explorer. You can further refine the query to identify issues.
    • Differential Diagnosis opens the Differential Diagnosis tab in Trace Explorer. Use this information to identify trends and scan through all related tags and values to pinpoint the exact tag:value pairs most closely correlated with suspicious behavior. The second sketch after these steps shows a simplified version of this kind of scoring.
  7. On the Trace Explorer page, in the Statistics section, select Leaf Errors in the Metric dropdown menu to highlight services that include error spans with no failing child spans.

    These errors are the deepest failures within a request flow, and are often the reason why an entire trace fails. The final sketch after these steps shows this definition in code.

  8. In the sparklines chart, you notice spikes in leaf errors for the gatewayauth service.

  9. In the Group by field, enter Operation so that you're grouping results by both Service and Operation.

    In the Statistics table, you see that the auth.Auth/Authenticate operation of the gatewayauth service has the most leaf errors.

  10. Click the gatewayauth service to add it and the auth.Auth/Authenticate operation to your search filter.

  11. At the top of the page, click Copy link to copy a link with the defined filter criteria that you can share with the team that owns the gatewayauth service.
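
These steps rely on a few trace-analysis concepts that are worth seeing in code. The sketches that follow are simplified, self-contained illustrations in Go; the span shapes, field names, and scoring are assumptions made for the examples, not Observability Platform's data model or algorithms.

First, the Errors trend on a dependency map edge (steps 4 and 5) summarizes failures on calls between two services. A minimal sketch of an edge error rate, assuming a flat list of spans with parent pointers:

```go
package main

import "fmt"

// Span is a simplified span record. The fields are assumptions for
// illustration, not Observability Platform's data model.
type Span struct {
	ID       string
	ParentID string // empty for root spans
	Service  string
	Error    bool
}

// edgeErrorRates computes, for each caller -> callee service pair,
// the fraction of calls across that edge that ended in an error.
func edgeErrorRates(spans []Span) map[string]float64 {
	byID := make(map[string]Span, len(spans))
	for _, s := range spans {
		byID[s.ID] = s
	}

	total := make(map[string]int)
	failed := make(map[string]int)
	for _, s := range spans {
		parent, ok := byID[s.ParentID]
		if !ok || parent.Service == s.Service {
			continue // root span, or an internal same-service call
		}
		edge := parent.Service + " -> " + s.Service
		total[edge]++
		if s.Error {
			failed[edge]++
		}
	}

	rates := make(map[string]float64, len(total))
	for edge, n := range total {
		rates[edge] = float64(failed[edge]) / float64(n)
	}
	return rates
}

func main() {
	spans := []Span{
		{ID: "a", Service: "deployer"},
		{ID: "b", ParentID: "a", Service: "gatewayauth", Error: true},
		{ID: "c", ParentID: "a", Service: "gatewayauth"},
	}
	// Prints: deployer -> gatewayauth: 50% errors
	for edge, rate := range edgeErrorRates(spans) {
		fmt.Printf("%s: %.0f%% errors\n", edge, rate*100)
	}
}
```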
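
Second, Differential Diagnosis (step 6) surfaces the tag:value pairs most closely correlated with failing behavior. As a simplified stand-in for that scoring, this sketch ranks each tag:value pair by the difference between its frequency among error spans and its frequency among healthy spans:

```go
package main

import (
	"fmt"
	"sort"
)

// taggedSpan pairs a span's error status with its tags; a simplified
// shape for illustration only.
type taggedSpan struct {
	Error bool
	Tags  map[string]string
}

// rankTags scores each tag:value pair by how much more often it occurs
// on error spans than on healthy spans, and returns the pairs seen on
// error spans, highest score first. This is a crude stand-in for
// correlation scoring, not the product's algorithm.
func rankTags(spans []taggedSpan) []string {
	errCount, okCount := 0, 0
	errFreq := map[string]int{}
	okFreq := map[string]int{}
	for _, s := range spans {
		if s.Error {
			errCount++
		} else {
			okCount++
		}
		for k, v := range s.Tags {
			pair := k + ":" + v
			if s.Error {
				errFreq[pair]++
			} else {
				okFreq[pair]++
			}
		}
	}

	score := func(pair string) float64 {
		e, o := 0.0, 0.0
		if errCount > 0 {
			e = float64(errFreq[pair]) / float64(errCount)
		}
		if okCount > 0 {
			o = float64(okFreq[pair]) / float64(okCount)
		}
		return e - o
	}

	pairs := make([]string, 0, len(errFreq))
	for pair := range errFreq {
		pairs = append(pairs, pair)
	}
	sort.Slice(pairs, func(i, j int) bool { return score(pairs[i]) > score(pairs[j]) })
	return pairs
}

func main() {
	spans := []taggedSpan{
		{Error: true, Tags: map[string]string{"region": "us-east", "version": "v2"}},
		{Error: true, Tags: map[string]string{"region": "us-west", "version": "v2"}},
		{Error: false, Tags: map[string]string{"region": "us-east", "version": "v1"}},
	}
	// version:v2 appears on every error span and no healthy span,
	// so it ranks first.
	fmt.Println(rankTags(spans))
}
```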
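
Finally, a leaf error (step 7) is an error span with no failing child spans: the deepest failure in a request flow. This sketch applies that definition, then groups the results by service and operation as in step 9, again using an assumed span shape:

```go
package main

import "fmt"

// Span is a simplified span record; the fields are assumptions for
// illustration, not Observability Platform's data model.
type Span struct {
	ID        string
	ParentID  string
	Service   string
	Operation string
	Error     bool
}

// leafErrors returns error spans that have no failing child spans:
// the deepest failures within each request flow.
func leafErrors(spans []Span) []Span {
	hasFailingChild := make(map[string]bool)
	for _, s := range spans {
		if s.Error {
			hasFailingChild[s.ParentID] = true
		}
	}
	var leaves []Span
	for _, s := range spans {
		if s.Error && !hasFailingChild[s.ID] {
			leaves = append(leaves, s)
		}
	}
	return leaves
}

func main() {
	spans := []Span{
		{ID: "a", Service: "deployer", Operation: "Deploy", Error: true},
		{ID: "b", ParentID: "a", Service: "gatewayauth", Operation: "auth.Auth/Authenticate", Error: true},
		{ID: "c", ParentID: "a", Service: "gatewayauth", Operation: "auth.Auth/Refresh"},
	}

	// Group leaf errors by Service and Operation, as in step 9.
	// The failing deployer span is excluded because its failure is
	// explained by a failing child.
	counts := make(map[string]int)
	for _, leaf := range leafErrors(spans) {
		counts[leaf.Service+" "+leaf.Operation]++
	}
	fmt.Println(counts) // map[gatewayauth auth.Auth/Authenticate:1]
}
```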

You identified the service and operation with the most leaf errors. By focusing on leaf errors, you located the root issue impacting related traces, and you can now share a contextual link with the team that owns the gatewayauth service.