# S3 Input (SQS) source plugin

export const entity_0 = "S3 Input (SQS) source plugin"

export const plugin_0 = "S3 Input (SQS) source plugin"

> Requires [pipeline agent](/ingest/pipeline/v2/component-versions) v25.8.1 or later,
> [Core Operator](/ingest/pipeline/v2/component-versions) v3.67.0 or later, and
> [Pipeline CLI](/ingest/pipeline/pipeline-cli) v3.66.0 or later.

The S3 Input (SQS) [source plugin](/ingest/pipeline/plugins/source-plugins) (name: `s3_sqs`)
lets you continuously ingest new data from Amazon S3 buckets into a telemetry pipeline.
This plugin monitors an SQS queue configured to receive notifications
[directly from S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ways-to-add-notification-config-to-bucket.html)
or [through SNS](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-configure-subscribe-queue-sns-topic.html),
then creates logs from the files described in S3 events. This plugin ignores
any SQS notifications that can't be decoded as references to an object in an S3 bucket,
but it doesn't [filter](#filtering) notifications by event type.

This is a [pull-based](/ingest/pipeline/plugins/source-plugins#push-based-and-pull-based-source-plugins) source plugin.

<Note>
  You can't add more than one instance of this plugin to the same pipeline.
</Note>

## Supported telemetry types

The {plugin_0} for Chronosphere Telemetry Pipeline supports these telemetry types:

|                    Logs                    |             Metrics             |              Traces             |
| :----------------------------------------: | :-----------------------------: | :-----------------------------: |
| <Icon icon="circle-check" color="green" /> | <Icon icon="ban" color="red" /> | <Icon icon="ban" color="red" /> |

## Requirements

To use the S3 Input (SQS) plugin, you must meet these requirements:

* Your IAM user or IAM role must have the following permissions for the ARN of your
  SQS queue:
  * [`sqs:ChangeMessageVisibility`](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_ChangeMessageVisibility.html)
  * [`sqs:DeleteMessage`](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_DeleteMessage.html)
  * [`sqs:GetQueueAttributes`](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_GetQueueAttributes.html)
  * [`sqs:GetQueueUrl`](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_GetQueueUrl.html)
  * [`sqs:ReceiveMessage`](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_ReceiveMessage.html)

* Your IAM user or IAM role must have the
  [`s3:GetObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html)
  permission for all buckets configured to notify your SQS queue.

* Your SQS queue must have
  [a redrive policy and a dead letter queue](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-dead-letter-queues.html).
  Additionally, Chronosphere recommends setting the `maxReceiveCount` of your redrive
  policy to a value greater than `1`, which lets SQS redeliver messages after a failure.
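
As an illustration of the permissions listed above, the following sketch builds an identity-based IAM policy document in Python. The queue and bucket ARNs are hypothetical placeholders, and the `build_policy` helper is not part of the plugin or of any AWS SDK. Treat it as a starting point under those assumptions, not as a policy recommended by Chronosphere.

```python
import json

# Hypothetical placeholder ARNs; substitute your own queue and buckets.
QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:example-queue"
BUCKET_ARNS = ["arn:aws:s3:::example-bucket"]

# The SQS actions named in the requirements list.
SQS_ACTIONS = [
    "sqs:ChangeMessageVisibility",
    "sqs:DeleteMessage",
    "sqs:GetQueueAttributes",
    "sqs:GetQueueUrl",
    "sqs:ReceiveMessage",
]


def build_policy(queue_arn, bucket_arns):
    """Return an IAM policy document that grants the listed permissions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": SQS_ACTIONS, "Resource": queue_arn},
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                # s3:GetObject applies to objects, so each resource ends in /*.
                "Resource": [f"{arn}/*" for arn in bucket_arns],
            },
        ],
    }


print(json.dumps(build_policy(QUEUE_ARN, BUCKET_ARNS), indent=2))
```

The printed JSON can be attached to the IAM user or role that the pipeline uses. Add one bucket ARN for each bucket that notifies your SQS queue.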

## Configuration parameters

Use the parameters in this section to configure the {entity_0}. The
Telemetry Pipeline web interface uses the items in the **Name** column to
describe these parameters. [Pipeline configuration files](/ingest/pipeline/v2/configure/config-files)
use the items in the **Key** column as YAML keys.

### Required

| Name                     | Key                | Description                                                                                                             | Default |
| ------------------------ | ------------------ | ----------------------------------------------------------------------------------------------------------------------- | ------- |
| **AWS SQS Queue Name**   | `sqs_queue_name`   | Required. The name of the SQS queue whose notifications you want to monitor.                                            | *none*  |
| **AWS SQS Queue Region** | `sqs_queue_region` | Required if `aws_sqs_endpoint` isn't set. The name of the region where your SQS queue exists. For example, `us-east-1`. | *none*  |

### Advanced

| Name                                | Key                          | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | Default                                                                                                 |
| :---------------------------------- | :--------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------ |
| **Regular Expression Object Match** | `match_regexp`               | The regular expression for matching or excluding object keys from S3. This plugin processes notifications only for objects that match the specified regular expression. If not set, the default value of `.*` matches all possible object keys.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | `.*`                                                                                                    |
| **Delete Message from SQS**         | `delete_messages`            | If `true`, deletes SQS messages after processing the associated S3 data. If `false`, the plugin re-processes each message at an interval specified by your SQS [visibility timeout](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-visibility-timeout.html), and continues to process each message until a [redrive policy](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-dead-letter-queues.html) is triggered or until the message exceeds your SQS message retention period. Chronosphere recommends not modifying this value unless you're testing a pipeline during its initial setup, because deleting SQS messages prevents the plugin from processing the same message multiple times and creating duplicate logs. Accepted values: `true`, `false`. | `true`                                                                                                  |
| **Line Buffer Max Size**            | `max_line_buffer_size`       | The maximum line size the plugin will read from [JSON or plain text files](#supported-data-types).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | `10MiB`                                                                                                 |
| **S3 Assume Role ARN**              | `s3_assume_role_arn`         | The ARN of the IAM role for accessing S3 buckets. This can be an ARN within the same account or [across accounts](#cross-account-access).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | *none*                                                                                                  |
| **S3 Role External ID**             | `s3_role_external_id`        | The [external ID](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_common-scenarios_third-party.html) of the role to assume in S3.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | *none*                                                                                                  |
| **SQS Assume Role ARN**             | `sqs_assume_role_arn`        | The ARN of the IAM role for accessing the SQS queue. This can be an ARN within the same account or [across accounts](#cross-account-access).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | *none*                                                                                                  |
| **SQS Role External ID**            | `sqs_role_external_id`       | The [external ID](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_common-scenarios_third-party.html) of the role to assume in SQS.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | *none*                                                                                                  |
| **SQS Queue Owner Account ID**      | `sqs_queue_owner_account_id` | The AWS account ID of the queue owner for [cross-account access](#cross-account-access).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | *none*                                                                                                  |
| **S3 Read Concurrency**             | `s3_read_concurrency`        | The maximum number of concurrent S3 [`GetObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html) calls that this plugin will make.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | The number of logical CPUs allocated to each pipeline [replica](/ingest/pipeline/v2/configure/scaling). |
| **Memory Buffer Limit**             | `mem_buf_limit`              | For pipelines with the Deployment or DaemonSet [workload](/ingest/pipeline/v2/configure/kubernetes/workloads) type only. Sets a limit for how much buffered data the plugin can write to memory, which affects [backpressure](/ingest/pipeline/v2/configure/backpressure). This value must follow Fluent Bit's rules for [unit sizes](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit#unit-sizes). If unspecified, no limit is enforced.                                                                                                                                                                                                                                                                                                                                                            | *none*                                                                                                  |
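
To preview which object keys a `match_regexp` value might select, you can test the pattern locally before you deploy it. The following sketch uses Python's `re` module and assumes the pattern can match anywhere in the key unless you anchor it. The plugin's regular expression engine and matching rules might differ from Python's, and the object keys shown here are hypothetical.

```python
import re


def matching_keys(match_regexp, object_keys):
    """Return the keys that the pattern matches, in their original order."""
    pattern = re.compile(match_regexp)
    # Assumption: unanchored search semantics. Anchor with ^ and $ to be explicit.
    return [key for key in object_keys if pattern.search(key)]


# Hypothetical object keys for illustration only.
keys = [
    "logs/app/2024/01/events.json.gz",
    "logs/app/readme.txt",
    "tmp/debug.json",
]

# Keep only JSON files, optionally gzip-compressed, under the logs/ prefix.
print(matching_keys(r"^logs/.*\.json(\.gz)?$", keys))

# The default value matches every key.
print(matching_keys(r".*", keys))
```

Anchoring the pattern with `^` and `$` keeps the result the same whether an engine searches within the key or matches the whole key.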

## Authentication methods

The S3 Input (SQS) plugin supports the following authentication methods:

* [EKS Pod Identities](#eks-pod-identities)
* [IMDS](#imds)
* [IRSA](#irsa)
* [Static credentials](#static-credentials)

### EKS Pod Identities

To use EKS Pod Identities for authentication:

1. In AWS, configure [EKS Pod Identities](https://docs.aws.amazon.com/eks/latest/userguide/pod-identities.html).

2. In [Pipeline CLI](/ingest/pipeline/pipeline-cli), add the following flag to a `create pipeline`
   or `update pipeline` command:

   ```shell /VALUE/ theme={null}
   calyptia {create|update} pipeline --service-account VALUE
   ```

   Replace *`VALUE`* with the name of the Kubernetes service account associated
   with your Pods.

### IMDS

To use IMDS for authentication:

* In AWS, configure [IAM roles for your EC2 instance](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html).

### IRSA

To use IRSA for authentication:

1. In AWS, [set up IRSA for your EKS cluster](https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html).

2. [Assign an IAM role to your Kubernetes service account](https://docs.aws.amazon.com/eks/latest/userguide/associate-service-account-role.html).

3. In [Pipeline CLI](/ingest/pipeline/pipeline-cli), add the following flag to a `create pipeline`
   or `update pipeline` command:

   ```shell /VALUE/ theme={null}
   calyptia {create|update} pipeline --service-account VALUE
   ```

   Replace *`VALUE`* with the name of your Kubernetes service account.

### Static credentials

To use static credentials for authentication:

* In Telemetry Pipeline, [create secrets](/ingest/pipeline/v2/configure/secrets#add-a-secret)
  that contain the values of your [IAM access keys](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html).
  These secrets must use the key names `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.

<Note>
  You don't need to add an explicit [reference](/ingest/pipeline/v2/configure/secrets#reference-a-secret)
  to these secrets in your pipeline configuration file. If secrets with the correct
  key names are present, the S3 Input (SQS) plugin automatically detects these values
  and uses them for authentication.
</Note>

## Cross-account access

The S3 Input (SQS) plugin supports cross-account access through the following methods.

* [Using IAM roles](#cross-account-access-using-iam-roles) (recommended)
* [Using resource-based policies](#cross-account-access-using-resource-based-policies)

Cross-account access and authentication are independent. You can use any cross-account
access method with any [authentication method](#authentication-methods).

### Cross-account access using IAM roles

To set up cross-account access using IAM roles, use the following
[configuration parameters](#configuration-parameters):

* `sqs_assume_role_arn`: Required.
* `sqs_role_external_id`: Required if you need to use an external ID to assume
  the SQS role.

The S3 Input (SQS) plugin extracts the necessary account ID from
the value of `sqs_assume_role_arn`, which means that the `sqs_queue_owner_account_id`
parameter isn't required. However, if you do specify a value for `sqs_queue_owner_account_id`,
that value takes precedence over the value extracted from `sqs_assume_role_arn`.
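
The precedence rule described above can be sketched in Python as follows. The function names and the example role ARN are hypothetical, and the plugin's actual parsing logic isn't documented here. The sketch relies only on the standard ARN layout, `arn:partition:service:region:account-id:resource`.

```python
def account_id_from_arn(arn):
    """Extract the account ID, which is the fifth colon-separated ARN field."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or not parts[4]:
        raise ValueError(f"ARN has no account ID: {arn!r}")
    return parts[4]


def resolve_queue_owner(sqs_assume_role_arn, sqs_queue_owner_account_id=None):
    """Prefer an explicit owner account ID; otherwise derive it from the role ARN."""
    if sqs_queue_owner_account_id:
        return sqs_queue_owner_account_id
    return account_id_from_arn(sqs_assume_role_arn)


# Hypothetical role ARN for illustration only.
role_arn = "arn:aws:iam::123456789012:role/pipeline-sqs-reader"

print(resolve_queue_owner(role_arn))
print(resolve_queue_owner(role_arn, "210987654321"))
```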

### Cross-account access using resource-based policies

To set up cross-account access using resource-based policies, use the
following [configuration parameters](#configuration-parameters):

* `sqs_assume_role_arn`: Required if you're using an assumed role.
* `sqs_role_external_id`: Required if you need to use an external ID to assume
  the SQS role.
* `sqs_queue_owner_account_id`: Required if the SQS queue to which your policy
  is attached has a different owner than the account specified in
  `sqs_assume_role_arn`, or if you aren't using assumed roles.

## Supported data types

The S3 Input (SQS) plugin can ingest [JSON objects](#json) and [plain text](#plain-text)
from files stored in S3 buckets, including gzip-compressed files. Additionally,
this plugin can extract and ingest compressed and uncompressed files from
[tar archives](#tar-archives).

### JSON

This plugin can ingest data from JSON files with these file extensions:

* `.json`
* `.jsonl`
* `.ndjson`

If a file contains only a single JSON object, this plugin creates
a new log from that object. If a file contains multiple
[newline-delimited JSON (NDJSON)](https://github.com/ndjson/ndjson-spec)
objects, this plugin creates a new log from each JSON object within that file.
Key/value pairs from JSON objects are stored as key/value pairs in the resulting log.

For JSON files that use gzip compression (with file extensions such as `.json.gzip`
or `.json.gz`), this plugin decompresses each file before processing it accordingly.
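
The following Python sketch approximates the JSON handling described in this section: one log for a single JSON object, one log per line for NDJSON, and gzip decompression first when the object key has a gzip suffix. The function name is hypothetical and this isn't the plugin's implementation. Behavior for inputs this section doesn't describe, such as top-level JSON arrays, is not modeled.

```python
import gzip
import json


def logs_from_json_object(key, body):
    """Turn the bytes of a JSON or NDJSON S3 object into a list of logs."""
    if key.endswith((".gz", ".gzip")):
        body = gzip.decompress(body)
    text = body.decode("utf-8")
    try:
        # A file that holds a single JSON object becomes a single log.
        return [json.loads(text)]
    except json.JSONDecodeError:
        # Otherwise treat the file as NDJSON: one JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]


ndjson = b'{"level": "info", "msg": "start"}\n{"level": "warn", "msg": "slow"}\n'
print(logs_from_json_object("app/events.ndjson", ndjson))
print(logs_from_json_object("app/event.json.gz", gzip.compress(b'{"level": "info"}')))
```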

### Plain text

If a file doesn't use a file extension that identifies it as a JSON file, the
S3 Input (SQS) plugin processes that file as plain text. It creates a new log from
each line of the file and stores the content in a key named `_raw` within the
resulting log.

For non-JSON files that use gzip compression (with file extensions that include
the `.gzip` or `.gz` suffix), this plugin decompresses each file before processing
it accordingly.
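
As a rough illustration of how a file might be classified and read as plain text, the following Python sketch checks the object key's extension and wraps each line in a `_raw` key. The helper names are hypothetical, and details such as the text encoding and the treatment of empty lines are assumptions that this page doesn't specify.

```python
import gzip

JSON_EXTENSIONS = (".json", ".jsonl", ".ndjson")
GZIP_SUFFIXES = (".gz", ".gzip")


def strip_gzip_suffix(key):
    """Remove a trailing .gz or .gzip suffix, if present."""
    for suffix in GZIP_SUFFIXES:
        if key.endswith(suffix):
            return key[: -len(suffix)]
    return key


def is_json_key(key):
    """Report whether the key, ignoring any gzip suffix, has a JSON extension."""
    return strip_gzip_suffix(key).endswith(JSON_EXTENSIONS)


def logs_from_plain_text(key, body):
    """Create one log per line, storing the line under the _raw key."""
    if key.endswith(GZIP_SUFFIXES):
        body = gzip.decompress(body)
    # Assumption: UTF-8 text, with every line (including empty ones) kept.
    lines = body.decode("utf-8", errors="replace").splitlines()
    return [{"_raw": line} for line in lines]


print(is_json_key("logs/app.log.gz"))
print(logs_from_plain_text("logs/app.log", b"first line\nsecond line\n"))
```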

### Tar archives

The plugin can extract and consume files from tar archives with these file extensions:

* `.tar`
* `.tar.gz`
* `.tar.gzip`

After the plugin extracts these files, it processes
any [JSON](#json) and [plain text](#plain-text) data accordingly, but skips directories
and symbolic links.

If files inside a tar archive are gzip-compressed, this plugin decompresses those
files accordingly.
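
The extraction behavior described in this section resembles the following Python sketch, which uses the standard `tarfile` module to read regular files from an in-memory archive while skipping directories and symbolic links. The function name is hypothetical, and the sketch is an approximation of the documented behavior, not the plugin's own code.

```python
import io
import tarfile


def regular_files_in_tar(archive_bytes):
    """Return (entry_name, content) pairs for each regular file in the archive."""
    entries = []
    # Mode "r:*" lets tarfile detect gzip compression of the archive itself.
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as archive:
        for member in archive:
            # isfile() is False for directories and symbolic links.
            if not member.isfile():
                continue
            entries.append((member.name, archive.extractfile(member).read()))
    return entries
```

Each returned pair could then be handled according to the entry's file extension, including decompressing entries whose names end in `.gz` or `.gzip`.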

## Filtering

The S3 Input (SQS) plugin doesn't filter notifications by event type. If a
notification contains a reference to an object in an S3 bucket, the plugin will
ingest data from that object, regardless of its associated event type.

To filter by event type, configure the event notification settings of your S3
bucket in AWS so that only the event types you want are sent to your SQS queue. For more information, see the AWS
[Event notification types and destinations](https://docs.aws.amazon.com/AmazonS3/latest/userguide/notification-how-to-event-types-and-destinations.html)
guide.
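
To make the behavior concrete, the following Python sketch decodes an SQS message body into S3 object references, whether the notification came directly from S3 or was wrapped by SNS, and ignores the event type entirely. The function name is hypothetical and the plugin's decoding logic might differ. The sketch relies only on the published S3 event message structure, in which object keys are URL-encoded.

```python
import json
from urllib.parse import unquote_plus


def s3_object_refs(sqs_body):
    """Return (bucket, key) pairs from an SQS message body, or [] if undecodable."""
    try:
        payload = json.loads(sqs_body)
    except json.JSONDecodeError:
        return []
    # SNS delivers the S3 event as a JSON string in the "Message" field.
    if isinstance(payload, dict) and isinstance(payload.get("Message"), str):
        try:
            payload = json.loads(payload["Message"])
        except json.JSONDecodeError:
            return []
    records = payload.get("Records") if isinstance(payload, dict) else None
    if not isinstance(records, list):
        return []
    refs = []
    for record in records:
        s3 = record.get("s3", {}) if isinstance(record, dict) else {}
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        # The eventName field is never inspected, so no event type is filtered.
        if bucket and key:
            refs.append((bucket, unquote_plus(key)))
    return refs
```

Because a sketch like this returns a reference for any event type, including object removal events, restricting event types at the notification source is the way to avoid unwanted ingestion attempts.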

## Metadata

The S3 Input (SQS) plugin attaches the following metadata to each log:

* `__chrono_bucket`: The name of the S3 bucket that contains the file from which
  the log was created.
* `__chrono_file`: The key of the S3 object from which the log was created.
* `__chrono_tar_file_entry`: For data extracted from [tar archives](#tar-archives)
  only. The name of the tar archive that contained the file from which the log
  was created.

## Get started

To get started with the S3 Input (SQS) plugin, follow these steps.

1. Either [create a new pipeline](/ingest/pipeline/v2/build/create-modify#create-a-pipeline)
   or [modify an existing pipeline](/ingest/pipeline/v2/build/create-modify).

2. For testing purposes, set the pipeline's destination to
   [standard output](/ingest/pipeline/plugins/destination-plugins/standard-output).

3. Set the pipeline's source to S3 Input (SQS), and then add values for all required
   parameters, along with any optional parameters of your choosing.

4. Set up one of the supported [authentication methods](#authentication-methods)
   for the S3 Input (SQS) source plugin.

5. In the Telemetry Pipeline web interface, go to the summary page for that pipeline.

6. In the [**Pipeline Output**](/ingest/pipeline/navigate#pipeline-output-v2-pipelines) section,
   click **Get latest logs**.

7. Review this log output to ensure that you're receiving data from S3. If you don't
   receive any data, or if you encounter connection errors, review your plugin
   configuration settings.

8. After you've confirmed that the S3 Input (SQS) plugin is functioning correctly,
   you can replace the standard output destination with the destination where
   you want to send your telemetry data.
