The Responder reactively triages and mitigates undesired system behavior. Their primary motivation is to treat the symptom and alleviate the pain, now.

Pain points

  • Has some understanding of their part of the system, but lacks a clear picture of what’s upstream, downstream, or underneath (infrastructure).
  • Finds it difficult to assess the real scope and impact of issues and alerts, either because of a poor alert signal-to-noise ratio or because the data lacks helpful context.
  • Time is critical, and encountering slow queries is frustrating.
  • Different telemetry types make it possible to view the system with different granularity and perspectives, but disconnected data means having to restart the search when moving between them.
  • Manually created alert links and runbooks are prone to error.

Common tasks

  • Receives and reviews alert notifications for problems in their system.
  • Triages issues by reviewing alert details to assess severity, blast radius, location of the root cause, and customer impact.
  • Views observability data related to the notifying alert on service pages and dashboards.
  • Compares current observability data to previous time periods using time range controls.
  • Temporarily silences alert notifications using muting rules.
  • Collaborates with other responders using notebooks and comments.
  • Reviews alert data about the current alert, or about other alerts in a similar time range, to identify patterns using differential diagnosis.
  • Documents mitigation measures taken and notifies relevant stakeholders.
  • Adjusts alerting evaluation logic to better reflect issue severity and risk by updating monitors and notification policies.
  • Annotates observability data to explain anomalies, such as dropped data, cardinality spikes, or license violations.

Connected personas

The Responder and the Watcher are closely linked. A single person might move back and forth between the two several times during an incident. The Watcher becomes a Responder when they notice an issue even without being notified, and returns to the Watcher role after a mitigation action is taken.

The line between the Responder and the Investigator is similarly blurry, but is best understood through motivation. The Responder is focused on mitigating symptoms, while the Investigator is focused on treating the underlying root cause. Some severe issues can’t be mitigated without addressing the underlying problem, at which point the Responder becomes an Investigator.