Distributed traces

Overview of tracing

Distributed tracing is the process of mapping and analyzing requests as they flow through a distributed system. A trace represents a single request flow: it follows a specific transaction or request from its origin, through an app, and back. A single trace can traverse hundreds of independent services that make up an app. Engineers can aggregate a collection of traces to identify issues related to app latency and to diagnose errors.

Each trace comprises one or more segments called spans. Each span represents a single unit of work, or a single operation, within a trace. Spans have a parent/child relationship, and each span can include details like service name, operation name, operation duration, and additional metadata that describes the span. Combining these parent/child relationships with span timestamps makes it possible to assemble individual spans into a trace.
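
As a rough illustration of how parent/child links and timestamps can be combined into a trace, consider the following minimal Python sketch. The `Span` fields and the `assemble_trace` and `render` helpers are hypothetical names chosen for this example; they are not Chronosphere's data model or API, and real tracing formats carry more fields.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """Hypothetical, simplified span record (not a real wire format)."""
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    service: str
    operation: str
    start_ms: int             # start time, in ms relative to the trace start
    duration_ms: int
    children: List["Span"] = field(default_factory=list)


def assemble_trace(spans: List[Span]) -> Span:
    """Link spans by parent ID, order siblings by start time, return the root."""
    by_id = {s.span_id: s for s in spans}
    roots = []
    for s in spans:
        if s.parent_id is None:
            roots.append(s)
        elif s.parent_id in by_id:
            by_id[s.parent_id].children.append(s)
        else:
            raise ValueError(f"span {s.span_id} references unknown parent {s.parent_id}")
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root span, found {len(roots)}")
    for s in spans:
        s.children.sort(key=lambda c: c.start_ms)
    return roots[0]


def render(span: Span, depth: int = 0) -> List[str]:
    """Return one indented line per span, parents before their children."""
    lines = [f"{'  ' * depth}{span.service}:{span.operation} ({span.duration_ms} ms)"]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


# Spans can arrive in any order; the links and timestamps restore the structure.
example_spans = [
    Span("c", "a", "db", "query", start_ms=15, duration_ms=20),
    Span("a", None, "frontend", "GET /checkout", start_ms=0, duration_ms=120),
    Span("b", "a", "auth", "verify", start_ms=5, duration_ms=8),
]
print("\n".join(render(assemble_trace(example_spans))))
# frontend:GET /checkout (120 ms)
#   auth:verify (8 ms)
#   db:query (20 ms)
```

The sketch rejects spans whose parent is missing. That is a simplification for the example: a real system may receive spans late or out of order and has to tolerate incomplete traces.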

Traces provide an essential view of a distributed system based on its incoming and outgoing requests. A distributed architecture can become complex and difficult to navigate as your app grows, which makes issues harder to pinpoint. Distributed tracing provides insight into how and where data flows for individual requests, and where a problem originates.

When an incident occurs, you can use the following Chronosphere tracing tools to locate and remediate issues:

  • Trace Explorer lets you search for traces and spans to help you identify, triage, and understand the root cause of problems. You can view all trace data that's relevant to a particular issue, and compare that trace against a previous time period to better understand where errors are occurring.

  • Trace Analyzer provides a real-time view of incoming traces grouped by tag and their relative frequency. This view helps you understand how often your apps emit traces, troubleshoot spikes in ingest rates, and ensure your Collector is aware of particular kinds of traces.

  • Trace Metrics let you create metrics based on trace data. You can create dashboards and monitors based on these metrics to help track information for collected traces, and create links from those locations to predefined queries in Trace Explorer.

Refer to the examples to learn more about how to use tracing as part of identifying and resolving issues.

Get traces into Chronosphere

To explore tracing data in Chronosphere, you must first install and configure one of the following Collectors to receive trace data from your services:

An added benefit of using the OpenTelemetry Collector is that you can configure head sampling, which is a powerful control mechanism for managing your tracing costs.
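
To make the idea of head sampling concrete: the decision to keep or drop a trace is made up front, before the trace has completed, so it cannot depend on how the request turns out. The Python sketch below shows one common way to make that decision deterministically from the trace ID. It is an illustration only. The `keep_trace` function and its `sample_rate` parameter are hypothetical names, and this is not the OpenTelemetry Collector's implementation or configuration.

```python
import hashlib


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Decide at the start of a trace whether to keep it.

    Hashing the trace ID means every service that sees the same ID reaches
    the same decision, so a trace is kept or dropped as a whole.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")   # uniform in [0, 2**64)
    threshold = int(sample_rate * 2**64)         # keep IDs that fall below this
    return bucket < threshold


# Keep roughly 10% of traces; the exact set depends on the trace IDs.
trace_ids = [f"trace-{i}" for i in range(10_000)]
kept = [t for t in trace_ids if keep_trace(t, sample_rate=0.10)]
print(f"kept {len(kept)} of {len(trace_ids)} traces")
```

Lowering `sample_rate` reduces the volume of trace data that gets stored, which is how head sampling helps manage cost. The tradeoff is that a dropped trace is gone even if the request later turns out to be slow or to fail.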