fix(deployment): restart keycloak on every deploy when dev DB is wiped#6432

Closed
theosanderson-agent wants to merge 1 commit into main from fix-keycloak-restart-on-devdb-wipe

Conversation

@theosanderson-agent
Collaborator

theosanderson-agent commented May 13, 2026

Summary

Closes #6431.

Preview/dev deployments reset the Keycloak DB on every Helm sync — the loculus-keycloak-database pod has a timestamp: {{ now }} annotation, and (when developmentDatabasePersistence: false) no PVC, so its data is lost whenever the pod is recreated.

The Keycloak pod itself was only restarted when the docker tag changed (via the LOCULUS_VERSION env var added in #4720). So a redeploy with the same image tag (e.g. a config-only sync, or two argo syncs of the same commit) would wipe the DB without restarting Keycloak. Keycloak then ran on with internal/in-memory state pointing at rows that no longer existed, producing the Unexpected error when handling authentication request to identity provider errors.

Fix

Add a timestamp: {{ now }} annotation to the Keycloak pod template, gated on runDevelopmentKeycloakDatabase && !developmentDatabasePersistence — so the Keycloak Deployment rolls every sync, exactly when its DB is going to be wiped.
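The gating described above might look roughly like the following in the Keycloak Deployment template. This is a minimal sketch, not the PR's actual diff: the Deployment name, the labels, and the assumption that both flags are top-level keys under `.Values` are hypothetical, and `quote` is added here only to keep the annotation value a string.

```yaml
# Hypothetical excerpt of a Keycloak Deployment template (names are illustrative).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: loculus-keycloak  # assumed name, not confirmed by the PR
spec:
  template:
    metadata:
      labels:
        app: loculus-keycloak  # assumed label
      {{- if and .Values.runDevelopmentKeycloakDatabase (not .Values.developmentDatabasePersistence) }}
      annotations:
        # `now` renders a different value on every Helm render, so the pod
        # template changes and the Deployment rolls on every sync.
        timestamp: {{ now | quote }}
      {{- end }}
```

Because the annotation sits under `spec.template.metadata`, any change to it alters the pod template, which is what makes Kubernetes roll the Deployment. When the condition is false, the annotation is not rendered at all, so the pod template stays stable across syncs.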

We deliberately do not restart Keycloak unconditionally — #4326 removed an unconditional timestamp here because it was rolling Keycloak every 24h in production and logging everyone out. The new condition only fires for ephemeral dev/preview DBs, never for prod or for persistent dev DBs.

| `runDevelopmentKeycloakDatabase` | `developmentDatabasePersistence` | Keycloak restart on each sync? |
| --- | --- | --- |
| `false` (prod) | n/a | No (unchanged) |
| `true` | `true` (persistent PVC) | No (data survives) |
| `true` | `false` (ephemeral, default for previews) | Yes (new) |
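As an illustration of the last row, a preview deployment's values might contain something like the fragment below. The two key names come from the description above; placing them at the top level of the values file is an assumption.

```yaml
# Hypothetical values for an ephemeral preview deployment.
# Key placement at the top level of values.yaml is assumed.
runDevelopmentKeycloakDatabase: true   # run the in-cluster dev Keycloak DB
developmentDatabasePersistence: false  # no PVC: DB contents are lost when the pod is recreated
# With this combination, the Keycloak Deployment would roll on every sync.
```

Setting `developmentDatabasePersistence: true` instead corresponds to the middle row, where the data survives and Keycloak is left alone.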

🤖 Generated with Claude Code

Preview/dev deploys reset the keycloak DB on each Helm sync (the
keycloak-database-standin pod has a `timestamp: now` annotation, and
without persistence the pod has no PVC so its data is lost on every
restart). The keycloak pod itself however only restarted when the docker
tag changed (via the `LOCULUS_VERSION` env var). When a redeploy
happened without a version bump the DB was wiped but keycloak kept
running with stale internal state, causing the
"Unexpected error when handling authentication request to identity
provider" errors users were seeing on dev instances (#6431).

Add a `timestamp: now` pod annotation to the keycloak Deployment, gated
on `runDevelopmentKeycloakDatabase` AND NOT `developmentDatabasePersistence`,
so keycloak is recreated on every Helm sync exactly when its DB is.

We deliberately do NOT add the timestamp in prod or in persistent dev
mode — #4326 removed an unconditional timestamp here because it was
logging users out every 24h.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@theosanderson
Member

now unsure about this

@corneliusroemer
Contributor

What's wrong about restarting keycloak? This is necessary whenever we restart the db in previews - otherwise db tables aren't initialized. So irrespective of using now or values hash from #6433 we need some way of restarting keycloak.

Maybe it already restarts almost always due to keycloakify theme commit changing. But there seem to be edge cases where the db renews but keycloak doesn't.

@theosanderson
Member

This made it happen every 24 hrs, and iirc all users were logged out when it was restarted - that's why we removed it

Labels

deployment: Code changes targeting the deployment infrastructure

Development

Successfully merging this pull request may close these issues.

The reason we get Unexpected error when handling authentication request to identity provider on dev instances

3 participants