
Feat/clickhouse replicas #167

Open

avaya09 wants to merge 3 commits into main from feat/clickhouse_replicas
Conversation

avaya09 (Contributor) commented Mar 23, 2026

No description provided.

Copilot AI left a comment

Pull request overview

Adds Helm chart configuration and documentation to support running Portkey against an external replicated and/or sharded ClickHouse cluster, wiring the required flags through Secrets into service environment variables.

Changes:

  • Add new ClickHouse external values (replicationEnabled, shardingEnabled, clusterName) to values.yaml.
  • Persist new ClickHouse replication/sharding settings into the ClickHouse Secret and expose them as env vars for backend/dataservice and gateway.
  • Add documentation for deploying replicated ClickHouse and migrating from the built-in single-node instance.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| charts/portkey-app/values.yaml | Introduces new external ClickHouse replication/sharding/cluster configuration values. |
| charts/portkey-app/templates/clickhouse/secrets.yaml | Stores the new replication/sharding/cluster settings in the ClickHouse Secret (when the chart manages it). |
| charts/portkey-app/templates/_helpers.tpl | Exposes the new settings as env vars for backend/dataservice (CLICKHOUSE_*) and gateway (ANALYTICS_STORE_*). |
| charts/portkey-app/docs/clickhouse-replication.md | New guide for deploying a replicated/sharded ClickHouse externally and configuring Portkey. |
| charts/portkey-app/docs/clickhouse-migration.md | New migration runbook from built-in ClickHouse to an external replicated cluster. |


Copilot AI left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.



Comment on lines +113 to +130

```bash
echo "Copying data..."
clickhouse-client --host "${OLD_CH_HOST}" --port "${OLD_CH_PORT}" \
  --user "${OLD_CH_USER}" --password "${OLD_CH_PASSWORD}" \
  --query "SELECT * FROM ${CH_DATABASE}.${table} FORMAT Native" \
  | clickhouse-client --host "${NEW_CH_HOST}" --port "${NEW_CH_PORT}" \
    --user "${NEW_CH_USER}" --password "${NEW_CH_PASSWORD}" \
    --query "INSERT INTO ${CH_DATABASE}.${table} FORMAT Native"

NEW_COUNT=$(clickhouse-client --host "${NEW_CH_HOST}" --port "${NEW_CH_PORT}" \
  --user "${NEW_CH_USER}" --password "${NEW_CH_PASSWORD}" \
  --query "SELECT count() FROM ${CH_DATABASE}.${table}")

if [ "${NEW_COUNT}" -ge "${OLD_COUNT}" ]; then
  echo "Verification passed: old=${OLD_COUNT} new=${NEW_COUNT}"
else
  echo "WARNING: Row count mismatch: old=${OLD_COUNT} new=${NEW_COUNT}"
  FAILED=1
fi
```
Copilot AI commented Mar 24, 2026
The migration job is not idempotent: it always pipes SELECT * ... FORMAT Native into an INSERT on the new cluster. Because this is a post-upgrade hook (and also retries via backoffLimit), reruns will append duplicate rows. Additionally, the verification uses new_count >= old_count, which can mask duplication. Please make the migration safe to rerun (e.g., truncate/replace target tables before copying, or only insert missing rows with a deterministic key strategy) and tighten verification so duplicates can’t be treated as success.

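One way to address both points, sketched under the assumption that the job may execute more than once: truncate the target table before copying so a rerun starts clean, and require exact row-count equality so duplicated rows fail verification instead of passing `-ge`. The `verify_counts` helper below is hypothetical, not part of the chart:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: exact-equality verification, so a duplicated copy
# (new > old) is reported as a failure rather than a pass.
verify_counts() {
  local old="$1" new="$2"
  if [ "$new" -eq "$old" ]; then
    echo "Verification passed: old=$old new=$new"
  else
    echo "FAILED: row count mismatch: old=$old new=$new"
    return 1
  fi
}

# In the real job, the copy step would be preceded by a truncate of the
# target table so reruns are idempotent, e.g. (sketch, not executed here):
#   clickhouse-client --host "${NEW_CH_HOST}" --port "${NEW_CH_PORT}" \
#     --query "TRUNCATE TABLE ${CH_DATABASE}.${table}"
# followed by the SELECT ... | INSERT pipeline, and then:
verify_counts 100 100
verify_counts 100 120 || echo "duplicate rows detected"
```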
Comment on lines +114 to +120

```bash
clickhouse-client --host "${OLD_CH_HOST}" --port "${OLD_CH_PORT}" \
  --user "${OLD_CH_USER}" --password "${OLD_CH_PASSWORD}" \
  --query "SELECT * FROM ${CH_DATABASE}.${table} FORMAT Native" \
  | clickhouse-client --host "${NEW_CH_HOST}" --port "${NEW_CH_PORT}" \
    --user "${NEW_CH_USER}" --password "${NEW_CH_PASSWORD}" \
    --query "INSERT INTO ${CH_DATABASE}.${table} FORMAT Native"
```
Copilot AI commented Mar 24, 2026

The data copy uses a shell pipeline, but the script doesn’t enable pipefail. That means failures in the first clickhouse-client (reading from the old cluster) may not fail the step if the second command exits successfully, leading to silent partial/empty migrations. Please enable set -o pipefail (and keep -e) so the job reliably fails on copy errors.

Suggested change

```diff
+set -o pipefail
 clickhouse-client --host "${OLD_CH_HOST}" --port "${OLD_CH_PORT}" \
   --user "${OLD_CH_USER}" --password "${OLD_CH_PASSWORD}" \
   --query "SELECT * FROM ${CH_DATABASE}.${table} FORMAT Native" \
   | clickhouse-client --host "${NEW_CH_HOST}" --port "${NEW_CH_PORT}" \
     --user "${NEW_CH_USER}" --password "${NEW_CH_PASSWORD}" \
     --query "INSERT INTO ${CH_DATABASE}.${table} FORMAT Native"
```

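A quick illustration of why this matters; a minimal sketch independent of the chart, showing that without `pipefail` a failing producer is invisible even under `set -e`:

```bash
#!/usr/bin/env bash
# Without pipefail, a pipeline's exit status is that of its last command,
# so the failing `false` on the left is masked by the succeeding `cat`.
bash -c 'set -e; false | cat; echo "pipeline exit status was $?"'

# With pipefail, any failing stage fails the whole pipeline, and set -e
# aborts the script at that point.
bash -c 'set -eo pipefail; false | cat' || echo "pipefail surfaced the failure"
```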

```bash
# Port-forward the built-in ClickHouse native port
kubectl port-forward svc/<release>-app-clickhouse 9000:9000 -n <namespace>
```
Copilot AI commented Mar 24, 2026

The service name in this port-forward example doesn’t match the chart’s actual ClickHouse Service naming. The chart uses {{ include "portkey.fullname" . }}-{{ .Values.clickhouse.name }} (typically <release>-portkey-app-clickhouse), not <release>-app-clickhouse. Please update the example to use the correct pattern so users can follow the guide successfully.

Suggested change

```diff
-kubectl port-forward svc/<release>-app-clickhouse 9000:9000 -n <namespace>
+kubectl port-forward svc/<release>-portkey-app-clickhouse 9000:9000 -n <namespace>
```

Comment on lines +213 to +217

```yaml
  value: {{ .Values.clickhouse.external.replicationEnabled | toString | quote }}
- name: CLICKHOUSE_SHARDING_ENABLED
  value: {{ .Values.clickhouse.external.shardingEnabled | toString | quote }}
- name: CLICKHOUSE_CLUSTER_NAME
  value: {{ .Values.clickhouse.external.clusterName | quote }}
```
Copilot AI commented Mar 24, 2026

CLICKHOUSE_REPLICATION_ENABLED, CLICKHOUSE_SHARDING_ENABLED, and CLICKHOUSE_CLUSTER_NAME are sourced directly from .Values instead of the ClickHouse secret (unlike the other ClickHouse connection settings above). This breaks setups that use clickhouse.external.existingSecretName (and contradicts the docs that instruct putting these keys in the secret), because changing the secret won’t affect these env vars. Please source these values from secretKeyRef keys (e.g., clickhouse_replication_enabled, clickhouse_sharding_enabled, clickhouse_cluster_name) for consistency and to support existing-secret workflows.

Suggested change

```diff
-  value: {{ .Values.clickhouse.external.replicationEnabled | toString | quote }}
+  valueFrom:
+    secretKeyRef:
+      name: {{ include "portkey.clickhouseSecretsName" . }}
+      key: clickhouse_replication_enabled
 - name: CLICKHOUSE_SHARDING_ENABLED
-  value: {{ .Values.clickhouse.external.shardingEnabled | toString | quote }}
+  valueFrom:
+    secretKeyRef:
+      name: {{ include "portkey.clickhouseSecretsName" . }}
+      key: clickhouse_sharding_enabled
 - name: CLICKHOUSE_CLUSTER_NAME
-  value: {{ .Values.clickhouse.external.clusterName | quote }}
+  valueFrom:
+    secretKeyRef:
+      name: {{ include "portkey.clickhouseSecretsName" . }}
+      key: clickhouse_cluster_name
```

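For the `secretKeyRef` lookups to resolve, the chart-managed ClickHouse Secret would also need to carry these keys. A sketch of what that fragment of `templates/clickhouse/secrets.yaml` might look like; the key names follow the review comment above, and the `b64enc` encoding is an assumption based on standard Kubernetes Secret conventions, so verify against the actual template:

```yaml
# Sketch only: keys the chart-managed ClickHouse Secret would need so the
# secretKeyRef env lookups resolve. Encoding/quoting is assumed, not
# taken from the chart.
apiVersion: v1
kind: Secret
metadata:
  name: {{ include "portkey.clickhouseSecretsName" . }}
type: Opaque
data:
  clickhouse_replication_enabled: {{ .Values.clickhouse.external.replicationEnabled | toString | b64enc | quote }}
  clickhouse_sharding_enabled: {{ .Values.clickhouse.external.shardingEnabled | toString | b64enc | quote }}
  clickhouse_cluster_name: {{ .Values.clickhouse.external.clusterName | b64enc | quote }}
```

Setups using `clickhouse.external.existingSecretName` would need to add the same keys to their own Secret.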
Comment on lines +507 to +511

```yaml
  value: {{ .Values.clickhouse.external.replicationEnabled | toString | quote }}
- name: ANALYTICS_STORE_SHARDING_ENABLED
  value: {{ .Values.clickhouse.external.shardingEnabled | toString | quote }}
- name: ANALYTICS_STORE_CLUSTER_NAME
  value: {{ .Values.clickhouse.external.clusterName | quote }}
```
Copilot AI commented Mar 24, 2026

Same issue as above for the gateway env block: ANALYTICS_STORE_REPLICATION_ENABLED, ANALYTICS_STORE_SHARDING_ENABLED, and ANALYTICS_STORE_CLUSTER_NAME are taken from .Values rather than the ClickHouse secret. If clickhouse.external.existingSecretName is used, the secret contents won’t be reflected here. Please read these via secretKeyRef to the corresponding secret keys, consistent with the other analytics store settings.

Suggested change

```diff
-  value: {{ .Values.clickhouse.external.replicationEnabled | toString | quote }}
+  valueFrom:
+    secretKeyRef:
+      name: {{ include "portkey.clickhouseSecretsName" . }}
+      key: clickhouse_replication_enabled
 - name: ANALYTICS_STORE_SHARDING_ENABLED
-  value: {{ .Values.clickhouse.external.shardingEnabled | toString | quote }}
+  valueFrom:
+    secretKeyRef:
+      name: {{ include "portkey.clickhouseSecretsName" . }}
+      key: clickhouse_sharding_enabled
 - name: ANALYTICS_STORE_CLUSTER_NAME
-  value: {{ .Values.clickhouse.external.clusterName | quote }}
+  valueFrom:
+    secretKeyRef:
+      name: {{ include "portkey.clickhouseSecretsName" . }}
+      key: clickhouse_cluster_name
```
