Examples for tracing
Use Chronosphere tracing when you want to locate a service operation causing latency issues for other services that rely on it. The following examples use data from the OpenTelemetry Astronomy Shop Demo, an open source, microservice-based distributed system that illustrates an OpenTelemetry implementation in a near real-world environment.
The following example highlights annotations, which you can use to link to tracing data from a dashboard. The example assumes that you received a notification from an alert that triggered for a monitor.
You click a link in the notification, which directs you to the Order Service Latency monitor. This monitor tracks requests and errors for the ordering-svc service.
On the Order Service Latency monitor, in the Query Results chart, you notice a continual spike in queries to the ordering-svc service.
In the Annotations section of the monitor, you click a link to a dashboard.
In the Order Service Overview dashboard, you notice a wave of spikes in requests to the /ordering.Ordering/Checkout operation of the ordering-svc service.
On the Requests chart, click any point and then click Query Traces.
The link opens Trace Explorer with a predefined search query that includes the service and operation you want to explore.
On the Trace Explorer page, click the Topology View tab to view a mapping of affected upstream and downstream services.
In the Search Services box, enter ordering-svc to scope the view to that service.
Click the ordering-svc node to display details.
In the Node Details panel, you see 176 errors incoming and 119 errors outgoing connected to the ordering-svc service. As you zoom in on the topology view, you notice that the edge connecting to the billing-svc service is thicker than the others.
In the Node Details panel for billing-svc, you notice a high number of errors on outgoing requests.
In the Node Details panel, click Include to add the billing-svc service to your search query. Your search query now includes both the ordering-svc and billing-svc services.
You determined that the billing-svc service is generating the most errors, which is also impacting the ordering-svc service.
On the Trace Explorer page, click Create Metric to create a trace metric for detecting future issues with the billing-svc service.
Other on-call engineers can use this trace metric to open a predefined query in Trace Explorer and help reduce the time to identify and fix issues with this service.
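The Node Details counts in this example (errors incoming versus errors outgoing for a service) come from the parent/child relationships between spans. As a rough illustration of that idea, the sketch below aggregates error edges from a list of spans. The span shape (dicts with `service`, `parent_service`, and `error` fields) is a hypothetical stand-in, not Chronosphere's actual data model:

```python
from collections import Counter

def edge_error_counts(spans):
    """Count error edges per service, split into incoming and outgoing.

    Each span is a dict with hypothetical fields: 'service',
    'parent_service' (None for root spans), and 'error' (bool).
    A failing call counts as incoming for the callee and outgoing
    for the caller, mirroring how a topology edge connects two nodes.
    """
    incoming = Counter()  # errors on calls a service received
    outgoing = Counter()  # errors on calls a service made
    for span in spans:
        if span["error"] and span["parent_service"] is not None:
            incoming[span["service"]] += 1          # callee side of the edge
            outgoing[span["parent_service"]] += 1   # caller side of the edge
    return incoming, outgoing

# Hypothetical sample data using the service names from this example.
spans = [
    {"service": "billing-svc", "parent_service": "ordering-svc", "error": True},
    {"service": "billing-svc", "parent_service": "ordering-svc", "error": True},
    {"service": "currency-svc", "parent_service": "ordering-svc", "error": False},
]
incoming, outgoing = edge_error_counts(spans)
```

In this sample, billing-svc accumulates incoming errors while ordering-svc accumulates the matching outgoing errors, which is why a single failing downstream service can thicken one edge in the topology view.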
The Services page provides efficient views into your services to help you discover ways of exploring your data. You can link directly to tracing data related to a specific service from the Services page. Chronosphere automatically generates these links to help you monitor services and connected telemetry.
In the navigation menu, select Services.
In the My services table, your Deployer Service has a monitor that's currently alerting because it exceeded its defined critical conditions.
Select the Deployer Service to display its individual service page.
In the Dependency Map, you notice that this service has downstream errors.
In the Related Queries section, click View Traces to open Trace Explorer with the context defined in the service page.
On the Trace Explorer page, in the Statistics section, select Leaf Errors in the Metric dropdown menu to highlight services that include error spans with no failing child spans.
These errors are the deepest errors within a request flow, and are often the reason why an entire trace fails.
In the sparklines chart, you notice spikes in leaf errors for the gatewayauth service.
In the Group by field, enter Operation so that you're grouping results by both Service and Operation.
In the Statistics table, you see that the auth.Auth/Authenticate operation of the gatewayauth service has the most leaf errors.
Click the gatewayauth service to add it and the auth.Auth/Authenticate operation to your search filter.
At the top of the page, click the link icon to copy a link with the defined filter criteria that you can share with the team that owns the gatewayauth service.
You identified the service and operation with the most leaf errors and can send a contextual link to the team responsible for that service. By focusing on leaf errors, you located the root issue impacting related traces and can provide that context to the owning team.
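The leaf error definition used above (an error span with no failing child spans) can be sketched directly: an error span is a leaf error if no other error span names it as a parent. The span shape here (dicts with `id`, `parent_id`, and `error` fields) is a hypothetical simplification for illustration:

```python
def leaf_errors(spans):
    """Return the leaf errors in a set of spans.

    A leaf error is an error span none of whose children is also an
    error span; these are the deepest failures in a request flow.
    Spans are hypothetical dicts with 'id', 'parent_id', and 'error'.
    """
    # Parents of error spans: any span in this set failed at least
    # partly because of a failing child, so it is not a leaf error.
    parents_of_errors = {s["parent_id"] for s in spans if s["error"]}
    return [s for s in spans if s["error"] and s["id"] not in parents_of_errors]

# A chain of failures: a fails because b fails, b fails because c fails.
spans = [
    {"id": "a", "parent_id": None, "error": True},
    {"id": "b", "parent_id": "a", "error": True},
    {"id": "c", "parent_id": "b", "error": True},   # the leaf error
    {"id": "d", "parent_id": "a", "error": False},
]
leaf = leaf_errors(spans)
```

Even though three spans report errors here, only the deepest one is a leaf error, which is why filtering on leaf errors cuts through cascading failures to the likely root cause.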
The following example begins in Trace Explorer. Maybe you navigated here from Trace Metrics, a dashboard, or a monitor, and now you're exploring trace data to identify where issues are occurring.
In the navigation menu, select Exploring > Trace Explorer.
Set the Time Window to Within last and 30 minutes.
From the Showing Error States dropdown, click Only errors.
This search returns too many traces to narrow down the issue. You think the issue relates to the frontend service, but don't know which related operation is the culprit. Modify the search criteria to narrow your search.
In the Search Summary field, enter frontend and then click that service from the search results.
Your search narrows the results and scope to only spans that include the frontend service. On the Statistics tab, you notice that the loadgenerator service has a high error rate.
On the Statistics tab under the Error Percentage column, click loadgenerator, and then click Include in Span Filter in the resulting dialog to add the loadgenerator service to your search query.
You know that the loadgenerator service is contributing to your trace latency, but still aren't sure what the main issue is.
Click the Traces tab to view a list of the most relevant traces for your search.
In the Trace column, click loadgenerator > HTTP GET to display the trace details for that service and operation combination.
You notice errors in operations for two additional services: the GET operations on both of these frontend-related services have high latency.
Click the frontend service, which updates the Span Details panel with information specific to that service and operation combination.
You now have detailed information about the specific services and operations causing latency issues.
In LINKS, click + Add Link to add a templated link to your external logging service, which provides other users access to the logs related to this span.
In PROCESS, you identify k8s.pod.name, which is the Kubernetes pod where the GET request originates. You can begin investigating that specific pod to remediate the issue.
To the right of the value for k8s.pod.name, click the more icon and then click Add to Filter to add the value of that process to your search query.
On the Trace Explorer page, you can click Create Metric to create a trace metric based on your updated search. You can use trace metrics to create dashboards and monitors for key metrics that you want to track and get alerts for.
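Conceptually, the search query you built up in this example combines a service, an operation, and a process attribute. The sketch below shows that kind of span filter applied in code; the span fields and values (including the pod name) are hypothetical illustrations, not Chronosphere's actual query model:

```python
def match(span, service=None, operation=None, attrs=None):
    """Check one span against search-style filter criteria.

    Spans are hypothetical dicts with 'service', 'operation', and an
    'attributes' dict (for example, 'k8s.pod.name'). Any criterion left
    as None is not applied, mirroring how adding filters narrows a search.
    """
    if service is not None and span["service"] != service:
        return False
    if operation is not None and span["operation"] != operation:
        return False
    for key, value in (attrs or {}).items():
        if span["attributes"].get(key) != value:
            return False
    return True

# Hypothetical spans using the names from this example.
spans = [
    {"service": "loadgenerator", "operation": "HTTP GET",
     "attributes": {"k8s.pod.name": "loadgen-7d9f"}},
    {"service": "frontend", "operation": "HTTP GET",
     "attributes": {"k8s.pod.name": "frontend-abc1"}},
]
matched = [s for s in spans
           if match(s, service="loadgenerator", operation="HTTP GET",
                    attrs={"k8s.pod.name": "loadgen-7d9f"})]
```

A trace metric created from a search like this counts only the spans that pass every filter, which is what makes it useful as the basis for a focused dashboard or monitor.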