From 714685a887574d4f86478bab4bcdb256be32a143 Mon Sep 17 00:00:00 2001
From: jansen-w
Date: Thu, 21 May 2026 13:53:47 -0400
Subject: [PATCH] Add ClickHouse destination docs

Documents the new Observability Pipelines ClickHouse destination, including the HTTP endpoint, target table/database, insert formats (json_each_row, json_as_object, json_as_string, arrow_stream), skip_unknown_fields, date_time_best_effort, compression, basic auth, TLS, batching, and arrow_stream batch encoding.

Co-Authored-By: Claude Opus 4.7
---
 .../destinations/_index.md                    |   2 +
 .../destinations/clickhouse.md                | 182 ++++++++++++++++++
 .../destination_env_vars/clickhouse.en.md     |   9 +
 3 files changed, 193 insertions(+)
 create mode 100644 content/en/observability_pipelines/destinations/clickhouse.md
 create mode 100644 layouts/shortcodes/observability_pipelines/configure_existing_pipelines/destination_env_vars/clickhouse.en.md

diff --git a/content/en/observability_pipelines/destinations/_index.md b/content/en/observability_pipelines/destinations/_index.md
index 53e4232e644..f1986ca3a02 100644
--- a/content/en/observability_pipelines/destinations/_index.md
+++ b/content/en/observability_pipelines/destinations/_index.md
@@ -24,6 +24,7 @@ These are the available destinations:
 - [Amazon S3][22]
 - [Amazon Security Lake][3]
 - [Azure Storage][4]
+- [ClickHouse][24]
 - [CrowdStrike Next-Gen SIEM][6]
 - [Databricks (Zerobus)][23]
 - [Datadog Archives][2]
@@ -67,6 +68,7 @@ These are the available destinations:
 [21]: /observability_pipelines/destinations/syslog/
 [22]: /observability_pipelines/destinations/amazon_s3/
 [23]: /observability_pipelines/destinations/databricks/
+[24]: /observability_pipelines/destinations/clickhouse/
 
 {{% /tab %}}
 
diff --git a/content/en/observability_pipelines/destinations/clickhouse.md b/content/en/observability_pipelines/destinations/clickhouse.md
new file mode 100644
index 00000000000..93ffa31f21d
--- /dev/null
+++ b/content/en/observability_pipelines/destinations/clickhouse.md
@@ -0,0 +1,182 @@
+---
+title: ClickHouse Destination
+disable_toc: false
+products:
+- name: Logs
+  icon: logs
+  url: /observability_pipelines/configuration/?tab=logs#pipeline-types
+further_reading:
+- link: "/observability_pipelines/"
+  tag: "Documentation"
+  text: "Observability Pipelines"
+---
+
+{{< product-availability >}}
+
+## Overview
+
+Use Observability Pipelines' ClickHouse destination to send logs to a [ClickHouse][1] server over the HTTP interface. The destination supports JSON insert formats for mapping events to columns by name or storing raw payloads, and Apache Arrow IPC streaming for higher-throughput inserts.
+
+## Prerequisites
+
+Before you configure the ClickHouse destination, you must have:
+
+- A running ClickHouse server reachable from the Observability Pipelines Worker over the [HTTP interface][2].
+- A database and target table where events are inserted. The destination does not create the database or table for you.
+- (Optional) Credentials for a ClickHouse user that has `INSERT` permission on the target table. The destination authenticates with HTTP Basic auth.
+- (Optional) TLS material if your ClickHouse server requires HTTPS with client certificates.
+
+## Setup
+
+Configure the ClickHouse destination when you [set up a pipeline][3]. You can set up a pipeline in the [UI][4], using the [API][5], or with [Terraform][6]. The steps in this section are configured in the UI.
+
+Only enter the identifiers for the ClickHouse endpoint URL and, if applicable, the username, password, and TLS key pass. Do not enter the actual values.
+
+After you select the ClickHouse destination in the pipeline UI:
+
+1. Enter the identifier for your ClickHouse HTTP endpoint URL. If you leave it blank, the [default](#secret-defaults) is used.
+1. In the **Table** field, enter the name of the ClickHouse table to insert events into. This field is required.
+1. (Optional) In the **Database** field, enter the name of the ClickHouse database that contains the table. If you leave it blank, the ClickHouse user's default database is used.
+1. In the **Format** dropdown menu, select the insert format for events:
+    - `json_each_row` (default): Inserts each event as a JSON object on its own line. Event fields are mapped to columns of the same name. This maps to ClickHouse's [`JSONEachRow`][7] format.
+    - `json_as_object`: Inserts each event into a single `JSON`-typed column. This maps to ClickHouse's [`JSONAsObject`][8] format.
+    - `json_as_string`: Inserts each event into a single `String`-typed column, storing the raw JSON. This maps to ClickHouse's [`JSONAsString`][9] format.
+    - `arrow_stream`: Batches events using the Apache Arrow IPC streaming format. When you select this format, you must also configure [Batch encoding](#batch-encoding).
+
+### Optional settings
+
+#### Skip unknown fields
+
+Toggle the **Skip unknown fields** switch to drop event fields that are not present in the target table schema instead of returning an insert error. When this setting is left unset, the ClickHouse server's [`input_format_skip_unknown_fields`][10] setting applies.
+
+#### Date time best effort
+
+Toggle the **Date time best effort** switch to enable flexible `DateTime` parsing on the ClickHouse server. When enabled, the server accepts a wider range of date and time string formats. See ClickHouse's [`date_time_input_format`][11] setting for more information.
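To make the format and optional settings above concrete, the following Python sketch builds the kind of request that a `json_each_row` insert corresponds to on ClickHouse's HTTP interface. It is an illustration under stated assumptions, not the Worker's implementation: the function `build_insert_request` and its parameters are hypothetical, and passing `input_format_skip_unknown_fields` and `date_time_input_format=best_effort` as URL parameters is an assumption about how the two toggles reach the server. The sketch only constructs the URL and body; it does not send anything.

```python
import json
from urllib.parse import quote, urlencode


def build_insert_request(endpoint, table, events, database=None,
                         skip_unknown_fields=None, date_time_best_effort=False):
    """Return (url, body) for a JSONEachRow insert (hypothetical helper)."""
    # Qualify the table only when a database is given; otherwise the
    # ClickHouse user's default database would apply.
    target = f"{database}.{table}" if database else table
    params = {"query": f"INSERT INTO {target} FORMAT JSONEachRow"}
    # Assumption: an unset toggle sends nothing, so the server setting applies.
    if skip_unknown_fields is not None:
        params["input_format_skip_unknown_fields"] = "1" if skip_unknown_fields else "0"
    if date_time_best_effort:
        params["date_time_input_format"] = "best_effort"
    # JSONEachRow expects one JSON object per line.
    lines = (json.dumps(event, separators=(",", ":")) for event in events)
    body = ("\n".join(lines) + "\n").encode("utf-8")
    url = f"{endpoint.rstrip('/')}/?{urlencode(params, quote_via=quote)}"
    return url, body


url, body = build_insert_request(
    "https://clickhouse.example.com:8443",
    table="application_logs",
    database="observability",
    events=[{"message": "ok", "status": 200}, {"message": "fail", "status": 500}],
    skip_unknown_fields=True,
    date_time_best_effort=True,
)
print(url)
print(body.decode("utf-8"), end="")
```

Each event becomes one line of the body, which is why fields must match column names unless unknown fields are skipped.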
+
+#### Compression
+
+In the **Compression** dropdown menu, select the algorithm used to compress outbound HTTP requests:
+- `gzip` (default): Compresses requests with gzip.
+- `none`: Sends requests uncompressed.
+
+When you select `gzip`, you can optionally set the **Compression level** to an integer from `1` (fastest) to `9` (best compression). If left unset, the algorithm's default level is used.
+
+#### Enable basic authentication
+
+Toggle the **Enable Basic Auth** switch to authenticate to ClickHouse with HTTP Basic auth.
+
+1. Enter the identifier for your ClickHouse username. If you leave it blank, the [default](#secret-defaults) is used.
+1. Enter the identifier for your ClickHouse password. If you leave it blank, the [default](#secret-defaults) is used.
+
+#### Enable TLS
+
+{{% observability_pipelines/tls_settings %}}
+
+#### Batch encoding
+
+Batch encoding is required when **Format** is set to `arrow_stream`. It is not used with the JSON formats.
+
+1. In the **Codec** dropdown menu, select `arrow_stream`.
+1. (Optional) Toggle **Allow nullable fields** to allow `null` values for non-nullable columns in the target table. When this setting is off (default), missing values for non-nullable columns cause encoding errors.
+
+**Note**: When you use `arrow_stream`, your ClickHouse server must be reachable at the time the pipeline starts because the target table's schema is read from the server to build the Arrow encoder.
+
+#### Batching
+
+Use the **Batching** settings to control how events are grouped into HTTP inserts:
+
+1. (Optional) In the **Max events** field, enter the maximum number of events per batch. Must be `1` or greater.
+1. (Optional) In the **Timeout (secs)** field, enter the maximum time, in seconds, before a partial batch is flushed. Must be between `1` and `65535`. If left unset, the default is 1 second.
+
+See [Event batching](#event-batching) for more information.
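The batching settings above can be read as "flush on whichever limit is hit first." The Python sketch below illustrates that rule with a small in-memory batcher. It is a simplified model, not the Worker's code: the `Batcher` class, its method names, and the injectable `clock` are hypothetical and exist only to make the behavior easy to follow. The range checks mirror the limits stated for **Max events** and **Timeout (secs)**.

```python
import time


class Batcher:
    """Hypothetical batcher: flush on max_events or timeout_secs, whichever comes first."""

    def __init__(self, max_events=None, timeout_secs=1, clock=time.monotonic):
        if max_events is not None and max_events < 1:
            raise ValueError("max_events must be 1 or greater")
        if not 1 <= timeout_secs <= 65535:
            raise ValueError("timeout_secs must be between 1 and 65535")
        self.max_events = max_events
        self.timeout_secs = timeout_secs
        self._clock = clock
        self._events = []
        self._started = None  # time the first event of the current batch arrived

    def add(self, event):
        """Buffer an event; return the full batch if max_events is reached, else None."""
        if not self._events:
            self._started = self._clock()
        self._events.append(event)
        if self.max_events is not None and len(self._events) >= self.max_events:
            return self._flush()
        return None

    def poll(self):
        """Return the partial batch if the timeout has elapsed, else None."""
        if self._events and self._clock() - self._started >= self.timeout_secs:
            return self._flush()
        return None

    def _flush(self):
        batch, self._events = self._events, []
        self._started = None
        return batch


# Usage with a fake clock so the example is deterministic.
now = [0.0]
batcher = Batcher(max_events=3, timeout_secs=5, clock=lambda: now[0])
full = [batcher.add(e) for e in ("a", "b", "c")][-1]  # third add hits max_events
batcher.add("d")
now[0] = 5.0
partial = batcher.poll()  # timeout elapsed, so the partial batch is flushed
print(full, partial)
```

A larger **Max events** value with a short timeout favors latency; a longer timeout favors fewer, larger inserts.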
+
+## Configuration example
+
+When you configure the ClickHouse destination through the [API][5] or [Terraform][6], the destination accepts the following fields. The example below uses basic authentication, gzip compression, and the default `json_each_row` insert format.
+
+```yaml
+type: clickhouse
+endpoint_url_key: DESTINATION_CLICKHOUSE_ENDPOINT_URL
+database: observability
+table: application_logs
+format: json_each_row
+skip_unknown_fields: true
+date_time_best_effort: true
+compression:
+  algorithm: gzip
+  level: 6
+auth:
+  strategy: basic
+  username_key: DESTINATION_CLICKHOUSE_USERNAME
+  password_key: DESTINATION_CLICKHOUSE_PASSWORD
+batch:
+  max_events: 10000
+  timeout_secs: 5
+```
+
+The values for `endpoint_url_key`, `username_key`, and `password_key` are the identifiers for the secret values, not the secret values themselves. The actual values are read from your secrets manager or from environment variables prefixed with `DD_OP_`. See [Secret defaults](#secret-defaults) for more information.
+
+When **Format** is set to `arrow_stream`, you must also configure `batch_encoding`:
+
+```yaml
+type: clickhouse
+endpoint_url_key: DESTINATION_CLICKHOUSE_ENDPOINT_URL
+table: application_logs
+format: arrow_stream
+batch_encoding:
+  codec: arrow_stream
+  allow_nullable_fields: false
+```
+
+## Secret defaults
+
+{{% observability_pipelines/set_secrets_intro %}}
+
+{{< tabs >}}
+{{% tab "Secrets Management" %}}
+
+- ClickHouse HTTP endpoint URL identifier:
+    - References the HTTP interface endpoint of your ClickHouse server.
+    - The default identifier is `DESTINATION_CLICKHOUSE_ENDPOINT_URL`.
+- ClickHouse username identifier (when basic auth is enabled):
+    - The default identifier is `DESTINATION_CLICKHOUSE_USERNAME`.
+- ClickHouse password identifier (when basic auth is enabled):
+    - The default identifier is `DESTINATION_CLICKHOUSE_PASSWORD`.
+- ClickHouse TLS passphrase identifier (when TLS is enabled with an encrypted private key):
+    - The default identifier is `DESTINATION_CLICKHOUSE_KEY_PASS`.
+
+{{% /tab %}}
+
+{{% tab "Environment Variables" %}}
+
+{{% observability_pipelines/configure_existing_pipelines/destination_env_vars/clickhouse %}}
+
+{{% /tab %}}
+{{< /tabs >}}
+
+## How the destination works
+
+### Event batching
+
+A batch of events is flushed when one of the configured parameters is met. See [event batching][12] for more information.
+
+| Maximum Events | Timeout (seconds) |
+|----------------|-------------------|
+| Configurable   | Configurable      |
+
+## Further reading
+
+{{< partial name="whats-next/whats-next.html" >}}
+
+[1]: https://clickhouse.com/docs
+[2]: https://clickhouse.com/docs/interfaces/http
+[3]: /observability_pipelines/configuration/set_up_pipelines/
+[4]: https://app.datadoghq.com/observability-pipelines
+[5]: /api/latest/observability-pipelines/
+[6]: https://registry.terraform.io/providers/datadog/datadog/latest/docs/resources/observability_pipeline
+[7]: https://clickhouse.com/docs/interfaces/formats/JSONEachRow
+[8]: https://clickhouse.com/docs/interfaces/formats/JSONAsObject
+[9]: https://clickhouse.com/docs/interfaces/formats/JSONAsString
+[10]: https://clickhouse.com/docs/operations/settings/formats#input_format_skip_unknown_fields
+[11]: https://clickhouse.com/docs/operations/settings/formats#date_time_input_format
+[12]: /observability_pipelines/destinations/#event-batching
diff --git a/layouts/shortcodes/observability_pipelines/configure_existing_pipelines/destination_env_vars/clickhouse.en.md b/layouts/shortcodes/observability_pipelines/configure_existing_pipelines/destination_env_vars/clickhouse.en.md
new file mode 100644
index 00000000000..68360fd6996
--- /dev/null
+++ b/layouts/shortcodes/observability_pipelines/configure_existing_pipelines/destination_env_vars/clickhouse.en.md
@@ -0,0 +1,9 @@
+- ClickHouse HTTP endpoint URL:
+    - The HTTP interface endpoint your Observability Pipelines Worker sends events to. For example, `https://clickhouse.example.com:8443`.
+    - The default environment variable is `DD_OP_DESTINATION_CLICKHOUSE_ENDPOINT_URL`.
+- ClickHouse authentication username (when basic auth is enabled):
+    - The default environment variable is `DD_OP_DESTINATION_CLICKHOUSE_USERNAME`.
+- ClickHouse authentication password (when basic auth is enabled):
+    - The default environment variable is `DD_OP_DESTINATION_CLICKHOUSE_PASSWORD`.
+- ClickHouse TLS passphrase (when TLS is enabled with an encrypted private key):
+    - The default environment variable is `DD_OP_DESTINATION_CLICKHOUSE_KEY_PASS`.
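Because the default identifiers map to environment variables by adding the `DD_OP_` prefix, a deployment script could check that the expected variables are set before the Worker starts. The Python sketch below shows one way to do that. It is a hypothetical pre-flight helper, not part of Observability Pipelines: the `missing_env_vars` function and the short keys in `DEFAULT_IDENTIFIERS` are made up for illustration, while the identifier names and the prefix come from the defaults listed above.

```python
import os

ENV_PREFIX = "DD_OP_"

# Default identifiers from the docs; the short keys are illustrative only.
DEFAULT_IDENTIFIERS = {
    "endpoint_url": "DESTINATION_CLICKHOUSE_ENDPOINT_URL",
    "username": "DESTINATION_CLICKHOUSE_USERNAME",
    "password": "DESTINATION_CLICKHOUSE_PASSWORD",
    "key_pass": "DESTINATION_CLICKHOUSE_KEY_PASS",
}


def missing_env_vars(required, env=None):
    """Return the DD_OP_-prefixed variable names that are unset or empty."""
    env = os.environ if env is None else env
    names = [ENV_PREFIX + DEFAULT_IDENTIFIERS[key] for key in required]
    return [name for name in names if not env.get(name)]


# Example: basic auth is enabled, TLS key passphrase is not needed.
fake_env = {"DD_OP_DESTINATION_CLICKHOUSE_ENDPOINT_URL": "https://clickhouse.example.com:8443"}
print(missing_env_vars(["endpoint_url", "username", "password"], env=fake_env))
```

Passing a plain dictionary as `env` keeps the check testable without touching the real process environment.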