Monitoring
The different components of Chronosphere Telemetry Pipeline generate their own observability metrics that you can monitor in real time. These metrics include data about throughput, volume, and the health of your components.
Scope
Telemetry Pipeline metrics are grouped at the project level. Each project's metrics include the following labels, where applicable:
- Core Instance IDs
- Pipeline IDs
- Fleet IDs
- Agent and replica IDs
Access
There are two main ways to access your Telemetry Pipeline metrics: through Chronosphere Observability Platform, or by importing data into a third-party platform. Both methods are available to all Telemetry Pipeline users at no extra cost.
Observability Platform
You can access your Telemetry Pipeline metrics in an Observability Platform dashboard. These dashboards include visualizations for the following metrics:
- Input (in bytes)
- Input (number of records)
- Output (in bytes)
- Output (number of records)
- Total retries
- Total dropped
- Total filtered
Observability Platform receives these metrics from your Telemetry Pipeline components at 30-second granularity and stores them for 60 days.
To get started with Telemetry Pipeline monitoring in Observability Platform, contact Chronosphere Support.
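As a rough sizing aid, the 30-second granularity and 60-day retention described in this section imply an upper bound on the number of data points held for each metric series. The following is a minimal sketch of that arithmetic. The constant and function names are illustrative and aren't part of any Chronosphere API.

```python
# Values taken from this section: one sample every 30 seconds, kept for 60 days.
GRANULARITY_SECONDS = 30
RETENTION_DAYS = 60

SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_series(granularity_seconds: int, retention_days: int) -> int:
    """Return the maximum number of data points one series holds over the retention window."""
    return (retention_days * SECONDS_PER_DAY) // granularity_seconds


# 60 days * 86,400 seconds per day / 30 seconds per sample
print(samples_per_series(GRANULARITY_SECONDS, RETENTION_DAYS))  # 172800
```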
Third-party platform
You can also access your Telemetry Pipeline metrics in the third-party platform of your choosing. To generate an import token, contact Chronosphere Support.
Metrics list
Each component of Telemetry Pipeline emits a variety of metrics. These metrics flow directly to the Chronosphere backend and don't impact the performance or behavior of Telemetry Pipeline itself.
Metric | Granularity | Type | Unit | Description |
---|---|---|---|---|
fluentbit_input_bytes_total | per input | counter | bytes | The number of bytes that an input instance successfully ingested. |
fluentbit_input_records_total | per input | counter | records | The number of records that an input instance successfully ingested. |
fluentbit_filter_bytes_total | per input | counter | bytes | The number of bytes that a filter instance successfully ingested. |
fluentbit_filter_records_total | per input | counter | records | The number of records that a filter instance successfully ingested. |
fluentbit_filter_added_records_total | per input | counter | records | The number of records that a filter instance added to a pipeline. |
fluentbit_filter_dropped_records_total | per input | counter | records | The number of records that a filter instance removed from a pipeline. |
fluentbit_input_storage_overlimit | per input | gauge | Boolean | Whether the input instance has exceeded the configured value of mem_buf_limit . |
fluentbit_input_storage_memory_bytes | per input | gauge | bytes | The amount of memory that an input instance is consuming to buffer logs in chunks. |
fluentbit_input_storage_chunks | per input | gauge | chunks | The current number of chunks that belong to an input instance. |
fluentbit_input_storage_chunks_up | per input | gauge | chunks | The current number of chunks that are in memory for an input instance. If filesystem storage is available, chunks that are in memory are also available in the filesystem layer. |
fluentbit_input_storage_chunks_down | per input | gauge | chunks | The current number of chunks that are in the filesystem for an input instance. |
fluentbit_input_storage_chunks_busy | per input | gauge | chunks | The current number of chunks that are being processed by an output instance and aren't eligible to have new data appended. |
fluentbit_input_storage_chunks_busy_bytes | per input | gauge | bytes | The total size of all chunks currently being processed. |
fluentbit_output_dropped_records_total | per output | counter | records | The number of records that were dropped by an output instance. This includes records that faced an unrecoverable error and records in a chunk for which retries expired. |
fluentbit_output_errors_total | per output | counter | chunks | The number of chunks that faced recoverable or unrecoverable errors. This value describes the number of times a chunk has failed, and doesn't match the number of error messages generated in a pipeline's log output. |
fluentbit_output_proc_bytes_total | per output | counter | bytes | The amount of data successfully sent by an output instance. This value describes all unique chunks sent. Records that encounter an error don't count towards this total. |
fluentbit_output_proc_records_total | per output | counter | records | The number of records successfully sent by an output instance. This value describes all unique chunks. Records that encounter an error don't count towards this total. |
fluentbit_output_retried_records_total | per output | counter | records | The number of records that experienced a retry. This value is calculated at the chunk level and increments when an entire chunk is marked for retry. An output might perform multiple actions that generate multiple error messages when uploading a single chunk. |
fluentbit_output_retries_failed_total | per output | counter | chunks | The number of times that retries expired for a chunk. This value increments when an output instance exceeds the number of retries specified by Retry_Limit . |
fluentbit_output_retries_total | per output | counter | chunks | The number of times that an output instance requested retries for a chunk. |
fluentbit_output_upstream_total_connections | per output | gauge | connections | The total connection count for an output instance. |
fluentbit_output_upstream_busy_connections | per output | gauge | connections | The total count of busy connections for an output instance. |
fluentbit_uptime | per pipeline | counter | seconds | The number of seconds that a pipeline has been running. |
fluentbit_process_start_time_seconds | per pipeline | gauge | seconds | The timestamp of when a pipeline started running, in Unix epoch format. |
fluentbit_build_info | per pipeline | gauge | seconds | Build version information about a pipeline. This value is set to the Unix epoch timestamp at which the configuration context was initialized. |
fluentbit_hot_reloaded_times | per pipeline | gauge | n/a | The number of times that a pipeline underwent a hot reload. |
fluentbit_input_chunks.storage_chunks | per pipeline | gauge | chunks | The total number of chunks that a pipeline is currently buffering. |
fluentbit_storage_mem_chunk | per pipeline | gauge | chunks | The total number of chunks currently buffered in the memory of a pipeline. A chunk can be in memory and in the filesystem at the same time. |
fluentbit_storage_fs_chunks | per pipeline | gauge | chunks | The total number of chunks in the filesystem of a pipeline. |
fluentbit_storage_fs_chunks_up | per pipeline | gauge | chunks | The total number of chunks both in the filesystem and in the memory of a pipeline. |
fluentbit_storage_fs_chunks_down | per pipeline | gauge | chunks | The total number of chunks that are only in the filesystem of a pipeline. This value excludes chunks that are both in the filesystem and in memory. |
fluentbit_storage_fs_chunks_busy | per pipeline | gauge | chunks | The total number of busy chunks in a pipeline. |
fluentbit_storage_fs_chunks_busy_bytes | per pipeline | gauge | bytes | The total size of busy chunks in a pipeline. |
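
Most of the metrics in the preceding table are counters, which only increase until the component that emits them restarts. To turn a counter such as fluentbit_output_proc_records_total into a throughput figure, you typically compute its increase over a time window and divide by the elapsed time. The following is a minimal sketch of that calculation, assuming you have already retrieved samples as (timestamp, value) pairs. The `Sample` type and the function names are hypothetical helpers, not part of Telemetry Pipeline or Fluent Bit. The drop ratio treats dropped and successfully sent records as disjoint, which is an approximation based on the descriptions in the table.

```python
from typing import Sequence, Tuple

# Hypothetical sample shape: (unix_timestamp_seconds, counter_value)
Sample = Tuple[float, float]


def counter_increase(samples: Sequence[Sample]) -> float:
    """Sum the increase of a counter across consecutive samples.

    A drop in value is treated as a counter reset (for example, a pipeline
    restart), in which case the new value counts as the increase.
    """
    total = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            total += current - previous
        else:
            total += current  # assumed reset: counter restarted from zero
    return total


def per_second_rate(samples: Sequence[Sample]) -> float:
    """Return the average per-second rate of a counter over the sampled window."""
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return 0.0
    return counter_increase(samples) / elapsed


def drop_ratio(dropped: Sequence[Sample], sent: Sequence[Sample]) -> float:
    """Approximate the share of records dropped by an output in the window.

    Assumes `dropped` holds fluentbit_output_dropped_records_total samples and
    `sent` holds fluentbit_output_proc_records_total samples for the same output.
    """
    dropped_count = counter_increase(dropped)
    sent_count = counter_increase(sent)
    attempted = dropped_count + sent_count
    return dropped_count / attempted if attempted else 0.0


# Four samples taken 30 seconds apart, with a reset between the second and third.
sent_samples = [(0, 1000), (30, 4000), (60, 500), (90, 3500)]
dropped_samples = [(0, 0), (30, 100), (60, 0), (90, 400)]

print(counter_increase(sent_samples))             # 6500.0
print(drop_ratio(dropped_samples, sent_samples))  # 500 / 7000
```

Gauges such as fluentbit_input_storage_chunks report a current value, so you read them directly and don't apply this calculation to them.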