# Datagen source plugin

export const entity_0 = "Datagen source plugin"

export const plugin_0 = "Datagen source plugin"

The Datagen [source plugin](/ingest/pipeline/plugins/source-plugins)
(name: `datagen`, alias: `apache_common`) generates simulated log data through
[Gofakeit](https://github.com/brianvoe/gofakeit). You can use this simulated
data to test your pipeline configurations.

Although this is a self-contained plugin that doesn't communicate with external
sources, it's still classified as a [pull-based](/ingest/pipeline/plugins/source-plugins#push-based-and-pull-based-source-plugins)
source plugin.

<Note>
  This plugin doesn't support the use of a
  [descriptive metadata name](/ingest/pipeline/plugins#descriptive-names) in the
  Pipeline Builder interface.
</Note>

<Note>
  This plugin doesn't support duplicates of itself within the same pipeline.
</Note>

## Supported telemetry types

The {plugin_0} for Chronosphere Telemetry Pipeline supports these telemetry types:

|                    Logs                    |             Metrics             |              Traces             |
| :----------------------------------------: | :-----------------------------: | :-----------------------------: |
| <Icon icon="circle-check" color="green" /> | <Icon icon="ban" color="red" /> | <Icon icon="ban" color="red" /> |

## Configuration parameters

Use the parameters in this section to configure the {entity_0}. The
Telemetry Pipeline web interface uses the items in the **Name** column to
describe these parameters. [Pipeline configuration files](/ingest/pipeline/v2/configure/config-files)
use the items in the **Key** column as YAML keys.

### General

| Name         | Key        | Description                                                                                                                                                     | Default |
| ------------ | ---------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| **Template** | `template` | Required. A JSON object that specifies how to generate log data. Accepted types: Apache common, Syslog rfc3164, or Syslog rfc5424. See [Templates](#templates). | *none*  |
| *none*       | `rate`     | How often to generate logs, in seconds.                                                                                                                         | `1`     |

### Advanced

| Name                    | Key             | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                     | Default |
| ----------------------- | --------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| **Memory Buffer Limit** | `mem_buf_limit` | For pipelines with the Deployment or DaemonSet [workload](/ingest/pipeline/v2/configure/kubernetes/workloads) type only. Sets a limit for how much buffered data the plugin can write to memory, which affects [backpressure](/ingest/pipeline/v2/configure/backpressure). This value must follow Fluent Bit's rules for [unit sizes](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit#unit-sizes). If unspecified, no limit is enforced. | *none*  |
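
As a minimal sketch of how these parameters might fit together, the following configuration sets `rate` and `mem_buf_limit` alongside a small template. The values `5` and `5MB` and the single-field template are illustrative assumptions, not recommended settings:

```yaml theme={null}
pipeline:
  inputs:
    - name: datagen
      # Illustrative value: generate logs every 5 seconds instead of the default 1.
      rate: 5
      # Illustrative value: cap in-memory buffering at 5 MB (Fluent Bit unit-size syntax).
      mem_buf_limit: 5MB
      # Required: a JSON object that describes each generated log.
      template: |-
        {
            "host": "{{ IPv4Address }}"
        }
```

Because `mem_buf_limit` applies only to pipelines with the Deployment or DaemonSet workload type, omit it for other workload types.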

## Templates

Templates define the structure and content of your simulated log data. When you
configure the Datagen source plugin in a pipeline's YAML configuration file, set
the `template` key to a JSON object. For example:

```yaml theme={null}
pipeline:
  inputs:
    - name: datagen
      template: |-
        {
            "host":            "{{ IPv4Address }}",
            "user": {
              "user-identifier": "{{ Username }}",
              "auth-user-id":    "{{ UUID }}"
            },
            "method":          "{{ HTTPMethod }}",
            "request":         "{{ URLPath }}",
            "protocol":        "{{ HTTPVersion }}",
            "response-code":   "{{ HTTPStatusCode }}",
            "bytes":           "{{ Number 0 30_000 }}"
        }
```

When the Datagen plugin generates logs, each log matches the format of your template,
but any included [functions](#functions) are replaced by simulated data. For best
results, Chronosphere recommends creating a JSON object that uses custom strings
for its keys and functions for its values.

## Functions

The Datagen plugin uses functions to generate randomized values that follow a
consistent pattern. For example, `{{ HTTPMethod }}` can output a string like
`GET`, `POST`, or `PUT`, but not a string like `GREEN` or `SUNDAY`.

To invoke a function in a [template](#templates), surround the name of that
function with double curly braces.

### Suggested functions

You can use any of the available
[Gofakeit functions](https://github.com/brianvoe/gofakeit#functions)
in a template. The following functions are best suited for generating realistic
log data:

* `{{ HTTPMethod }}`: Returns an HTTP method, like `GET`.
* `{{ HTTPStatusCode }}`: Returns an HTTP status code, like `403`.
* `{{ HTTPVersion }}`: Returns an HTTP protocol version, like `HTTP/1.1`.
* `{{ IPv4Address }}`: Returns a random IPv4 address.
* `{{ IPv6Address }}`: Returns a random IPv6 address.
* `{{ Username }}`: Returns a random username, like `Bailey1270`.
* `{{ UUID }}`: Returns a random string in UUID format.
* `{{ Date }}`: Returns a random date and time in UTC.
* `{{ LogLevel syslog }}`: Returns a log type, like `alert`.
* `{{ Number 0 30_000 }}`: Returns a number between 0 and 30,000.
* `{{ URLPath }}`: Returns a string that resembles a URL path, like `/foo/bar`.
  This function is unique to Datagen and isn't included in Gofakeit.
* `{{ URLScheme }}`: Returns either `http` or `https`. This function is unique
  to Datagen and isn't included in Gofakeit.
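
To illustrate how several of these functions can be combined, here is a sketch of a template for a syslog-style event. The key names (`timestamp`, `level`, `client`, `scheme`, and `path`) are hypothetical and chosen only for this example; the functions are taken from the preceding list and invoked with the same syntax shown there:

```yaml theme={null}
pipeline:
  inputs:
    - name: datagen
      template: |-
        {
            "timestamp": "{{ Date }}",
            "level":     "{{ LogLevel syslog }}",
            "client":    "{{ IPv6Address }}",
            "scheme":    "{{ URLScheme }}",
            "path":      "{{ URLPath }}"
        }
```

Each key is a fixed custom string and each value is a single function, which follows the recommendation in the [Templates](#templates) section.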

## Example output

For a pipeline with the following template:

```yaml theme={null}
pipeline:
  inputs:
    - name: datagen
      template: |-
        {
            "response-code":   "{{ HTTPStatusCode }}",
            "bytes":           "{{ Number 0 30_000 }}",
            "host":            "{{ IPv4Address }}"
        }
```

The Datagen source plugin generates simulated log data in the following format:

```shell theme={null}
{"response-code"=>201.000000, "bytes"=>29600.000000, "host"=>"64.15.54.123"}

{"response-code"=>302.000000, "bytes"=>429.000000, "host"=>"214.221.240.201"}

{"response-code"=>205.000000, "bytes"=>3507.000000, "host"=>"27.79.197.250"}

{"response-code"=>504.000000, "bytes"=>758.000000, "host"=>"211.135.114.51"}

{"response-code"=>404.000000, "bytes"=>12190.000000, "host"=>"84.74.177.0"}
```
