Google Cloud BigQuery

Google Cloud BigQuery destination plugin

Google BigQuery is a fully managed, cloud-native data warehouse that allows businesses to store, analyze, and query large datasets in a scalable and cost-effective manner. It is part of the Google Cloud suite of services.

BigQuery provides a serverless architecture, meaning that users do not need to worry about infrastructure provisioning, management, or tuning. It can process terabytes of data in seconds and petabytes of data in minutes, making it suitable for organizations that need to quickly process and analyze large amounts of data.

The Google Cloud BigQuery destination plugin in Calyptia Core lets you configure your pipeline to send your log data and metrics to Google Cloud BigQuery.

Configuration parameters

The Google Cloud BigQuery destination plugin provides these configuration parameters.

General

| Key | Description |
| --- | --- |
| Google Service Credentials Path | The Service Credentials file lets Calyptia Core communicate directly with Google Cloud Services. To set up service credentials, see https://cloud.google.com/logging/docs/agent/logging/authorization#create-service-account. |
| Google Project Id | The project ID containing the BigQuery dataset to stream into. If a service credentials file is provided, the project ID is taken from that file. |
| Existing Data Set ID | The dataset ID of the BigQuery dataset to write into. This dataset must exist in your project. |
| Existing Table ID | The table ID of the BigQuery table to write into. This table must exist in the specified dataset, and its schema must match the output. |
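
The project ID rule in the table above can be sketched in a few lines of Python. This is an illustration only, not the plugin's code: the function name `resolve_project_id` and its parameters are hypothetical, and falling back to the configured value when the file has no `project_id` field is an assumption. The sketch relies on the fact that Google service account key files are JSON documents with a `project_id` field.

```python
import json


def resolve_project_id(credentials_path=None, configured_project_id=None):
    """Return the project ID, preferring the service credentials file.

    Hypothetical helper: mirrors the documented rule that the project ID
    is taken from the credentials file when one is provided.
    """
    if credentials_path:
        with open(credentials_path, encoding="utf-8") as fh:
            key = json.load(fh)
        # Service account key files carry the project in "project_id".
        project_id = key.get("project_id")
        if project_id:
            return project_id
    # Assumption: fall back to the explicitly configured value.
    if configured_project_id:
        return configured_project_id
    raise ValueError("no project ID found in credentials file or configuration")
```

With a credentials file present, the value configured in Google Project Id is effectively overridden by the file, which is worth remembering when the same credentials are reused across projects.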

Advanced

| Key | Description |
| --- | --- |
| Skip Invalid Rows | When on, inserts all valid rows of a request even if invalid rows exist. The default value is false, which causes the entire request to fail if any invalid rows exist. |
| Ignore Unknown Values | Accepts rows that contain values that do not match the schema. The unknown values are ignored. The default value is false, which treats unknown values as errors. |
| Enable Workload Identity Federation | Enables workload identity federation as an alternative authentication method. Cannot be used with a service account credentials file or environment variable. AWS is the only identity provider currently supported. |
| Google Cloud Region for BigQuery | Google Cloud region for BigQuery. |
| Google Cloud Project Number | Google Cloud project number where the identity provider was created. Used to construct the full resource name of the identity provider. |
| Google Cloud Pool Id | Google Cloud workload identity pool where the identity provider was created. Used to construct the full resource name of the identity provider. |
| Google Cloud Provider Id | Google Cloud workload identity provider. Used to construct the full resource name of the identity provider. Currently only AWS accounts are supported. |
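
To make the advanced settings more concrete, the following Python sketch shows how the project number, pool ID, and provider ID combine into a full identity provider resource name, and how the two row-handling options would appear in a BigQuery streaming insert request. The function names are hypothetical. The resource name follows Google Cloud's documented workload identity federation format. The request body assumes the plugin writes through the BigQuery `tabledata.insertAll` REST method, which this page does not confirm.

```python
def identity_provider_resource_name(project_number, pool_id, provider_id):
    """Build the full resource name of a workload identity provider."""
    return (
        f"//iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{pool_id}"
        f"/providers/{provider_id}"
    )


def insert_all_body(rows, skip_invalid_rows=False, ignore_unknown_values=False):
    """Build a tabledata.insertAll request body (assumed transport).

    Both flags default to False, matching the defaults described above.
    """
    return {
        "skipInvalidRows": skip_invalid_rows,
        "ignoreUnknownValues": ignore_unknown_values,
        # Each row is wrapped in a "json" object holding the column values.
        "rows": [{"json": row} for row in rows],
    }
```

For example, `identity_provider_resource_name("123456789", "my-pool", "my-aws-provider")` uses placeholder values for the three settings and yields a name under `//iam.googleapis.com/projects/123456789/`.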

Security and TLS

| Key | Description |
| --- | --- |
| TLS | Enable or disable TLS/SSL support. |
| TLS Certificate Validation | Turn TLS/SSL certificate validation on or off. TLS must be on for this setting to be enabled. |
| TLS Debug Level | Set the TLS debug verbosity level. Accepts these values: 0 (No debug), 1 (Error), 2 (State change), 3 (Informational), 4 (Verbose). |
| CA Certificate File Path | Absolute path to the CA certificate file. |
| Certificate File Path | Absolute path to the certificate file. |
| Private key File Path | Absolute path to the private key file. |
| Private Key Path Password | Optional password for the private key file (tls.key_file). |
| TLS SNI Hostname Extension | Hostname to be used for the TLS SNI extension. |
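
As a rough analogy for how these TLS settings relate to each other, the sketch below maps them onto Python's standard `ssl` module. It is not how Calyptia Core implements TLS. The function `build_tls_context` and its parameter names are hypothetical stand-ins for the settings in the table above.

```python
import ssl


def build_tls_context(verify=True, ca_file=None, cert_file=None,
                      key_file=None, key_password=None):
    """Create a client-side TLS context from settings like those above."""
    # ca_file plays the role of "CA Certificate File Path"; when it is
    # None, the system's default trusted CAs are loaded instead.
    ctx = ssl.create_default_context(cafile=ca_file)
    if not verify:
        # Equivalent of turning "TLS Certificate Validation" off.
        # check_hostname must be disabled before verify_mode is relaxed.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    if cert_file:
        # Client certificate, private key, and optional key password.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file,
                            password=key_password)
    return ctx
```

In this analogy, the SNI hostname would be supplied when wrapping a socket, through the `server_hostname` argument of `ctx.wrap_socket(...)`.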

Advanced networking

| Key | Description |
| --- | --- |
| DNS Mode | Select the primary DNS connection type (TCP or UDP). |
| DNS Resolver | Select the primary DNS resolver type (LEGACY or ASYNC). |
| Prefer IPv4 | Prioritize IPv4 DNS results when trying to establish a connection. |
| Keepalive | Enable or disable Keepalive support. |
| Keepalive Idle Timeout | Set the maximum time allowed for an idle Keepalive connection. |
| Max Connect Timeout | Set the maximum time allowed to establish a connection. This time includes the TLS handshake. |
| Max Connect Timeout Log Error | Specify whether a connection timeout is logged as an error. When disabled, the timeout is logged as a debug message. |
| Max Keepalive Recycle | Set the maximum number of times a keepalive connection can be used before it is retired. |
| Source Address | Specify the network address to bind for data traffic. |
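
The interaction between Keepalive Idle Timeout and Max Keepalive Recycle can be pictured with a small Python model. This is a simplified sketch, not the plugin's connection handling: `PooledConnection` and `can_reuse` are hypothetical names, times are plain seconds, and the recycle limit is assumed to be a positive number.

```python
from dataclasses import dataclass


@dataclass
class PooledConnection:
    last_used: float     # time (in seconds) of the most recent request
    use_count: int = 0   # number of requests already served


def can_reuse(conn, now, idle_timeout, max_recycle):
    """Return True if a kept-alive connection may serve another request."""
    # Idle for longer than the idle timeout: close and reconnect.
    if now - conn.last_used > idle_timeout:
        return False
    # Used the maximum number of times: retire the connection.
    if conn.use_count >= max_recycle:
        return False
    return True
```

Under this model, a connection last used 10 seconds ago with a 30-second idle timeout is reused, while one idle for 40 seconds, or one that has reached its recycle limit, is replaced by a new connection.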