feat: store hashed tokens instead of plaintext by rubenfiszel · Pull Request #8217 · windmill-labs/windmill

rubenfiszel · 2026-03-04T07:48:26Z

Summary

Store SHA-256 hashed tokens (token_hash) in the token table instead of plaintext, so a DB compromise doesn't expose usable credentials
Add token_prefix (first 10 chars) column for listing/deletion without needing plaintext
Auth cache stays keyed by raw token — zero-cost cache hits, only hash on cache miss
Two-phase migration: (1) add columns + backfill + build indexes (allows concurrent reads), (2) instant PK swap (~2ms exclusive lock)
Plaintext token column is still written alongside the hash until MIN_VERSION_SUPPORTS_TOKEN_HASH (1.650.0) is reached by all workers, then NULL is written instead

TODO for @hugocasademont

Remove `get_token_by_prefix` (native triggers)

get_token_by_prefix retrieves the plaintext token from the DB so it can be embedded in webhook callback URLs. It's used in two places:

handler.rs:237 — trigger update: already handles None gracefully (deletes old token, creates new one)
google/external.rs:327 — renew_channel: hard-fails on None ("Webhook token not found")

Once MIN_VERSION_SUPPORTS_TOKEN_HASH is met and plaintext stops being written, all new webhook tokens will have token = NULL, breaking Google trigger watch renewal.

Recommended fix — always delete + create a new token on renewal:

Add a low-level create_webhook_token_for_trigger(db, trigger) that inserts a token directly using the trigger's email/owner/workspace_id/script_path (no ApiAuthed needed)
Update renew_channel to: delete old token by prefix → create new token → use it in the webhook URL → update webhook_token_prefix on the trigger
Then get_token_by_prefix can be removed entirely — every renewal just creates a fresh token

This matches the pattern handler.rs already uses on trigger update.

Test plan

cargo check passes (CE + EE + native_trigger features)
Migration applies cleanly on fresh and existing DBs
Create token via UI → verify token_hash+token_prefix populated, token column has plaintext (pre-1.650.0) or NULL (post-1.650.0)
Authenticate with Bearer token → hash-based lookup works
List tokens → prefix displays correctly
Delete token by prefix → deletion works
Login/logout cycle → session tokens work
Monitor cleanup of expired tokens uses prefix in logs
Native trigger webhook re-creation works

🤖 Generated with Claude Code

Companion EE PR: https://github.com/windmill-labs/windmill-ee-private/pull/437

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cloudflare-workers-and-pages · 2026-03-04T07:51:07Z

Deploying windmill with Cloudflare Pages

Latest commit:	`cb144ef`
Status:	⚡️ Build in progress...

View logs

claude · 2026-03-04T07:51:27Z

+-- Make token NOT NULL again
+ALTER TABLE token ALTER COLUMN token SET NOT NULL;


Potential down-migration failure: When rolling back step 2 (token_hash_pk_swap.down.sql), this restores token NOT NULL, but new tokens created after the forward migration will have token = NULL (since the code no longer populates it). This means the rollback will fail with ERROR: column "token" of relation "token" contains null values if any new tokens were created after migration.

You'd need to backfill or drop the new null rows before restoring NOT NULL:

-- Delete tokens that were created after the forward migration (no plaintext token) DELETE FROM token WHERE token IS NULL; -- Then restore NOT NULL ALTER TABLE token ALTER COLUMN token SET NOT NULL;

- Update all base.sql fixtures to include token_hash/token_prefix columns - Keep plaintext token for webhook tokens (needed for URL reconstruction) - Restore get_token_by_prefix to query DB for webhook tokens - Fix down migration to delete NULL-token rows before restoring NOT NULL - Update parser fixture standalone schema - Update EE dedicated_worker_ee.rs to use token_hash/token_prefix Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Write to token column alongside token_hash until MIN_VERSION_SUPPORTS_TOKEN_HASH (1.649.0) is reached. This ensures older workers can still authenticate during rolling upgrades. Remove the separate UPDATE in new_webhook_token since create_token_internal now writes plaintext directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Check MIN_VERSION_SUPPORTS_TOKEN_HASH at runtime: write plaintext to token column while old workers exist, switch to NULL once all workers are >= 1.649.0. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

rubenfiszel · 2026-03-04T09:46:51Z

/ai-fast reviewi this PR

Resolve conflicts in monitor.rs and native-triggers/lib.rs, keeping the token_prefix-based queries from the store-hash branch. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…OT NULL Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

hugocasa · 2026-03-13T15:52:32Z

/ai-fast review this PR

claude · 2026-03-13T15:53:35Z

Claude finished @hugocasa's task in 1m 55s —— View job

PR Review: Store hashed tokens instead of plaintext

Read key changed files (auth, migrations, native triggers, users)
Analyze security and correctness
Post review findings

Overall this is a well-designed security improvement. The two-phase migration approach, the MIN_VERSION_SUPPORTS_TOKEN_HASH gating, and the auth cache staying keyed by raw token are all solid choices. Here are the findings:

Bugs / Issues

1. Potential panic on short tokens (auth.rs)

There are 7 instances of token[0..TOKEN_PREFIX_LEN] that will panic if a token shorter than 10 chars is provided. While rd_string(32) always produces 32-char tokens, this could still be triggered by:

A malformed Authorization: Bearer xx header
Short cookie values from corrupted state

The superadmin path at auth.rs:437 already uses the safe pattern: token.get(..TOKEN_PREFIX_LEN).unwrap_or(token). The other 7 sites should too:

auth.rs:284, auth.rs:308, auth.rs:324, auth.rs:373, auth.rs:387, auth.rs:402
users.rs:1651

Similarly, mcp/oauth_server.rs:389 and oauth_server.rs:533 use &access_token[..TOKEN_PREFIX_LEN] which could panic, though these are on internally-generated tokens so the risk is very low.

Fix this →

2. Google watch renewal fails when token is orphaned (google/external.rs:328-334)

renew_channel calls rotate_webhook_token and if it returns None (token deleted or missing), it returns a hard error. This is fine for now since there's always a token row during the transition period, but after MIN_VERSION_SUPPORTS_TOKEN_HASH is met and plaintext stops being stored, the rotate_webhook_token function should still work (it looks up by token_hash, not plaintext), so this path should remain functional. However, the error message at line 332 says "no user context to create a fresh one" — consider adding a fallback that creates a fresh token (as the PR description's TODO recommends) rather than failing the renewal entirely.

Migration Concerns

3. Native trigger migration deletes orphaned triggers (20260313000000_native_trigger_token_hash.up.sql:13)

DELETE FROM native_trigger WHERE webhook_token_hash IS NULL;

This silently deletes any native_trigger row whose webhook_token_prefix doesn't match a token.token_prefix. This could happen if:

A token was deleted but the trigger wasn't cleaned up
Prefix collision caused the JOIN to match a different token

The deletion is the right thing to do for data integrity, but consider logging or counting how many rows are deleted, or adding a DO NOTHING approach (set a sentinel hash value) so the trigger can be detected and re-created rather than silently lost.

4. Native trigger migration backfill uses prefix JOIN (20260313000000_native_trigger_token_hash.up.sql:7-10)

UPDATE native_trigger nt SET webhook_token_hash = t.token_hash
FROM token t WHERE t.token_prefix = nt.webhook_token_prefix;

If two tokens share the same 10-char prefix (unlikely but possible), this JOIN could match the wrong token for a trigger. Since the prefix space is 62^10, the probability is negligible in practice, but it's worth noting. A DISTINCT ON or additional filtering by workspace_id and email would be safer.

Design Observations (Non-blocking)

5. delete_token can delete multiple tokens with same prefix (users.rs:1951-1960)

The DELETE FROM token WHERE email = $1 AND token_prefix = $2 could delete multiple tokens if a user has tokens with colliding 10-char prefixes. The code already returns tokens_deleted.len() in the response message, which is good for visibility. At scale, consider whether a UNIQUE(email, token_prefix) constraint or using token_hash for deletion would be safer.

6. rotate_webhook_token doesn't copy expiration or owner (lib.rs:727-782)

The rotation creates a new token copying label, email, scopes, workspace_id, and super_admin, but doesn't copy expiration or owner. For webhook tokens, having no expiration is probably intentional (they live as long as the trigger), but the missing owner column means the new token won't have an owner set. Verify this is acceptable for token permission checks.

7. token_expiry_notification table stores token_hash as FK (lib.rs:622-631)

INSERT INTO token_expiry_notification (token, expiration) VALUES ($1, $2)

The column is still named token but now stores a hash. This is a naming inconsistency that could confuse future developers. Not blocking, but worth a rename in a follow-up migration (e.g., token_hash).

Summary

The core security goal (removing plaintext tokens from the DB) is well-implemented. The main actionable item is the short token panic (item 1) which could cause a runtime crash on malformed input. Everything else is either already acknowledged in the PR's TODO section or is a low-probability edge case.

hugocasa

/ai-fast

hugocasa · 2026-03-13T16:16:48Z

/ai-fast

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…ner/expiration in rotate Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…fication column Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…ration Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

hugocasa · 2026-03-13T22:34:22Z

/ai-fast review this PR

claude · 2026-03-13T22:35:21Z

Claude finished @hugocasa's task in 1m 48s —— View job

PR Review: Store hashed tokens instead of plaintext

Read key changed files (auth, users, native triggers, migrations)
Check if previous review issues were addressed
Analyze security, correctness, and edge cases
Post review findings

This is a well-designed security improvement. The two-phase migration, auth cache keyed by raw token, MIN_VERSION_SUPPORTS_TOKEN_HASH gating, and rotate_webhook_token approach are all solid. Here are the findings from this review:

Bugs

1. Panic on short tokens — still present (7 sites in auth.rs + 1 in users.rs)

This was flagged in both previous reviews and remains unfixed. token[0..TOKEN_PREFIX_LEN] will panic if a token shorter than 10 characters is provided:

auth.rs:284, auth.rs:308, auth.rs:324, auth.rs:373, auth.rs:387, auth.rs:402 — regular token auth paths
auth.rs:436 — superadmin secret path (the secret is user-configured and could easily be shorter than 10 chars)
users.rs:1651 — login path

While rd_string(32) always produces 32-char tokens, a malformed Authorization: Bearer x header or a short superadmin secret will crash the server.

The fix is straightforward — use token.get(..TOKEN_PREFIX_LEN).unwrap_or(token).to_string() everywhere. This is a one-line change at each site.

Fix this →

Previously Flagged Issues — Now Addressed

2. Native trigger orphan handling (migration) — Fixed

The previous reviews flagged that 20260313000000_native_trigger_token_hash.up.sql deleted orphaned triggers silently. This has been fixed — orphaned triggers now get a placeholder hash and an error message ("Webhook token not found during migration — re-create this trigger to fix") instead of being deleted. Good improvement.

3. Down migration NULL token handling — Fixed

20260304000001_token_hash_pk_swap.down.sql:16 correctly deletes NULL-token rows before restoring NOT NULL.

4. token_expiry_notification column naming — Fixed

Migration 20260313000001_rename_token_expiry_notification_column.up.sql correctly converts existing plaintext values to hashes and renames the column to token_hash.

5. rotate_webhook_token copies owner and expiration — Fixed

lib.rs:763-775 now copies old.owner and old.expiration to the new token, addressing the previous review concern.

Design Observations (Non-blocking)

6. token_expiry_notification migration hashes plaintext values (20260313000001)

UPDATE token_expiry_notification
SET token = encode(sha256(token::bytea), 'hex');

This assumes all existing values in token_expiry_notification.token are plaintext. If any rows were already inserted with a hash (e.g., from the new code running after 20260304000000 but before this migration), they'd be double-hashed. Worth verifying the ordering guarantees — since 20260313000001 runs after 20260304000000, and the new code in register_token_expiry_notification already passes token_hash, any rows created between the two migrations would indeed be double-hashed.

However, looking at the code path: register_token_expiry_notification in lib.rs:622 inserts token_hash into a column named token. So between migrations 20260304000000 (which starts writing hashes) and 20260313000001 (which hashes the column), new rows will contain hashes that get re-hashed. The impact is low — these rows just won't match in check_expiring_tokens's JOIN, so some expiry notifications could be missed during the transition window.

7. Google channel renewal still hard-fails on missing token (google/external.rs:328-335)

As acknowledged in the PR description's TODO, renew_channel returns a hard error when rotate_webhook_token returns None. The handler.rs update path (lines 255-273) already has the correct pattern — it falls back to new_webhook_token. The renew_channel path should get the same treatment before MIN_VERSION_SUPPORTS_TOKEN_HASH is bumped to avoid breaking Google watch renewals. Since this is explicitly called out as a TODO for @hugocasademont, it's tracked.

8. Native trigger migration backfill prefix collision risk (very low)

The prefix JOIN at 20260313000000:7-10 could theoretically match the wrong token if two tokens share a 10-char prefix. Adding AND t.workspace_id = nt.workspace_id would make this more robust, though the 62^10 prefix space makes collisions negligible in practice.

Summary

The core security design is sound and most previous review issues have been addressed. The only remaining actionable bug is the short token panic (item 1) — a malformed Authorization: Bearer x header or a short superadmin secret will crash the server. This should be fixed before merge.

View job

hugocasa · 2026-03-13T22:39:35Z

claude · 2026-03-16T13:24:24Z

Claude finished @hugocasa's task in 5m 20s —— View job

PR Review: Store hashed tokens instead of plaintext

Read key changed files (auth, users, migrations, native triggers, oauth)
Analyze security, correctness, and migration safety
Check previous review findings
Post comprehensive review with testing instructions

Overall this is a well-designed security improvement. The two-phase migration approach, auth cache keyed by raw token, MIN_VERSION_SUPPORTS_TOKEN_HASH gating, and rotate_webhook_token pattern are all solid. Here are the findings:

Bug: Panic on short tokens (still present)

This was flagged in all three previous reviews and remains unfixed. token[0..TOKEN_PREFIX_LEN] will panic on tokens shorter than 10 characters.

Most critical path — auth.rs:436 (superadmin secret):

token_prefix: Some(token[0..TOKEN_PREFIX_LEN].to_string()),

The SUPERADMIN_SECRET env var is user-configured and can easily be shorter than 10 chars. Any instance with a short secret will crash the API server on auth.

Other affected locations in auth.rs: lines 284, 308, 324, 373, 387, 402 — these handle tokens from Authorization: Bearer headers which could theoretically be malformed/short.

users.rs:531 — logout path, token comes from Bearer header. Lines 1651, 1753, 1841 use rd_string(32) so are safe in practice but should still use the defensive pattern.

Fix: Replace all bare token[0..TOKEN_PREFIX_LEN] with token.get(..TOKEN_PREFIX_LEN).unwrap_or(token).to_string(), or extract a helper:

fn safe_token_prefix(token: &str) -> String {
    token.get(..TOKEN_PREFIX_LEN).unwrap_or(token).to_string()
}

Fix this →

Recommendation: `mcp_oauth_refresh_token.access_token` semantic change without migration

oauth_server.rs:431 — The access_token column in mcp_oauth_refresh_token now stores the hash of the access token. There's no migration to hash existing plaintext values in this column. Existing rows from before this PR will have plaintext in access_token, and the DELETE FROM token WHERE token_hash = $1 at line 520 will silently fail to match (non-fatal since the error is caught, but the old access token won't be cleaned up on refresh).

Suggestion: Either add a migration to hash existing values, or rename the column to access_token_hash for clarity, or at minimum add a code comment explaining why existing rows are acceptable.

Recommendation: `rotate_webhook_token` doesn't migrate `token_expiry_notification` rows

lib.rs:729-786 — When rotating a webhook token, the old token is deleted and a new one created. If the old token had a token_expiry_notification row, it becomes orphaned (references a deleted token_hash). The new token gets no notification row.

Suggestion: Either copy the notification to the new hash, or add ON DELETE CASCADE from token_expiry_notification.token_hash to token.token_hash.

Migration Notes (non-blocking)

Orphan sentinel hash (20260316000002:15-18): Uses sha256('') as a placeholder. This works but is a magic value. A clearly-invalid non-hex string like '__orphaned__' would be more explicit.

Prefix JOIN for backfill (20260316000002:7-10): Joins on token_prefix (10 chars). Collision probability is negligible (62^10 space) but adding AND t.workspace_id = nt.workspace_id would be more robust.

Down migration is destructive (20260316000001.down.sql:16): DELETE FROM token WHERE token IS NULL destroys all tokens created after the forward migration. This is correct behavior but could surprise operators — consider adding a -- WARNING: destroys post-migration tokens comment.

Positive Observations

Migration strategy is excellent. Splitting into 4 migrations (add columns + backfill + indexes, instant PK swap, native trigger hash, token_expiry_notification rename) minimizes lock duration and allows partial rollback.
Backward compatibility is thoughtful. The MIN_VERSION_SUPPORTS_TOKEN_HASH gate ensures plaintext is still written for rolling upgrades, and idx_token_plaintext keeps old workers fast.
Cache invalidation transition is clean. The DB trigger now sends token_prefix instead of plaintext, and invalidate_token_from_cache evicts by prefix — no plaintext ever touches the notification channel post-migration.
Token rotation pattern is solid. rotate_webhook_token correctly copies metadata within a transaction, and callers handle None gracefully (or with a visible error on the trigger).
Test coverage is good. 6 integration tests cover hash consistency, CRUD lifecycle, logout, backward compat, and rotation.
Google renewal TODO is well-documented. The PR description clearly outlines the fix needed before MIN_VERSION_SUPPORTS_TOKEN_HASH is bumped.

Testing Instructions

To verify the token hashing changes: Go to Settings > Tokens (user settings page) and create a new token — it should display a token prefix (first 10 chars) in the token list. Copy the full token value shown at creation time and use it in an API call (curl -H "Authorization: Bearer <token>" http://localhost:8000/api/version) to verify hash-based authentication works. Delete the token from the list by clicking the delete button — verify it no longer authenticates. Log out and log back in to verify session tokens work. If you have access to the database, check the token table: token_hash and token_prefix should be populated for all rows; the token column should be NULL if the instance version is >= 1.650.0, or contain the plaintext otherwise. For native triggers, create/update a webhook trigger and verify native_trigger.webhook_token_hash is populated. The superadmin secret (SUPERADMIN_SECRET env var) should also still work for authentication — note the short-token panic bug above means secrets shorter than 10 characters will crash the server.

rotate_webhook_token now atomically inserts the new token and deletes the old one in a single transaction, preventing token leaks. Returns new_token_hash so callers can clean up the new token if their subsequent trigger update fails (which involves external HTTP calls and cannot be in the same DB transaction). - Handler: wraps post-rotation work; deletes new token on failure - Google renewal: deletes new token if service_config update fails - Tests updated to match new atomic semantics Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ilure rotate_webhook_token now keeps the old token alive and returns old_token_hash. Callers delete it only after the trigger row has been successfully updated. If the external service call or DB update fails, the trigger keeps working with the old token. Worst case: if the best-effort delete fails, the old token leaks as an extra DB row — harmless compared to breaking the trigger. Also update summarized_schema.txt for renamed columns. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

This commit updates the EE repository reference after PR #437 was merged in windmill-ee-private. Previous ee-repo-ref: 7aef8b06cb6f54c2bc89dd57b70947deed72553c New ee-repo-ref: 2d0823a471014e2bc2d898c63518323946b7474f Automated by sync-ee-ref workflow.

windmill-internal-app · 2026-03-17T00:52:39Z

🤖 Updated ee-repo-ref.txt to 2d0823a471014e2bc2d898c63518323946b7474f after windmill-ee-private PR #437 was merged.

rubenfiszel · 2026-03-17T00:54:31Z

/ai-fast re-review this PR

claude · 2026-03-17T00:55:30Z

Claude finished @rubenfiszel's task in 2m 55s —— View job

PR Re-Review: Store hashed tokens instead of plaintext

Read key changed files (auth, users, native triggers, migrations, oauth)
Check if previous review issues were addressed
Analyze security, correctness, and edge cases
Post comprehensive review findings

Overall this is a well-designed security improvement. The two-phase migration, auth cache keyed by raw token, MIN_VERSION_SUPPORTS_TOKEN_HASH gating, and rotate_webhook_token approach are all solid.

Status of previously flagged issues

#	Issue	Status
1	Short token panic	Still present (see below)
2	Orphaned triggers deleted silently	Fixed — now sets placeholder hash + error message
3	Down migration NULL handling	Fixed — deletes NULL rows before restoring `NOT NULL`
4	`token_expiry_notification` naming	Fixed — migration `20260316000003` renames to `token_hash`
5	`rotate_webhook_token` missing `owner`/`expiration`	Fixed — `lib.rs:763-775` copies both
6	`mcp_oauth_refresh_token.access_token` naming	Fixed — migration `20260316000004` renames to `access_token_hash`
7	Google renewal hard-fail	Acknowledged — tracked as TODO for @hugocasademont

Bug: Panic on short tokens (4th time flagged)

There are 8 instances of token[0..TOKEN_PREFIX_LEN].to_string() that will panic if a token shorter than 10 characters is provided:

auth.rs — 7 sites on the main auth path (lines 284, 308, 324, 373, 387, 402, 436):

token_prefix: Some(token[0..TOKEN_PREFIX_LEN].to_string()),

users.rs:1651 — login path:

token_prefix: Some(token[0..TOKEN_PREFIX_LEN].to_string()),

Additionally, there are 4 instances of &token[..TOKEN_PREFIX_LEN] that will also panic:

lib.rs:523 (create_token_internal) — safe, uses rd_string(32)
users.rs:531 (logout) — unsafe, token comes from Authorization: Bearer header
users.rs:1753 (create_session_token) — safe, uses rd_string(32)
users.rs:1841 (impersonate) — safe, uses rd_string(32)

Most critical path — auth.rs:436 (superadmin secret): The SUPERADMIN_SECRET env var is user-configured and can easily be set to something shorter than 10 chars (e.g., "changeme"). Any instance with a short secret will crash on first auth attempt.

users.rs:531 (logout): A malformed cookie or Authorization: Bearer x header will crash the server.

Fix: Add a helper and use it everywhere:

fn safe_prefix(token: &str) -> String {
    token.get(..TOKEN_PREFIX_LEN).unwrap_or(token).to_string()
}

Or just use the inline pattern at each site: token.get(..TOKEN_PREFIX_LEN).unwrap_or(token).to_string()

Fix this →

Design observations (non-blocking)

1. token_expiry_notification double-hashing window

Migration 20260316000003 hashes all existing values:

UPDATE token_expiry_notification SET token = encode(sha256(token::bytea), 'hex');

But the new code in register_token_expiry_notification (lib.rs:622) already writes token_hash into the token column. Any notification rows inserted between migration 20260316000000 (new code deploys) and 20260316000003 (this migration runs) will be double-hashed, causing them to not match in check_expiring_tokens. Impact is low — some expiry notifications may be missed during the brief transition window.

2. Google renewal TODO

renew_channel (google/external.rs:330-335) still hard-fails when rotate_webhook_token returns None. The PR description's TODO outlines the right fix (create a fresh token without ApiAuthed). This should be implemented before MIN_VERSION_SUPPORTS_TOKEN_HASH is bumped to avoid breaking Google watch renewals. Since this is explicitly tracked as a TODO for @hugocasademont, just noting it's still pending.

3. Native trigger backfill prefix collision

20260316000002:7-10 joins native_trigger to token on token_prefix alone. Adding AND t.workspace_id = nt.workspace_id would be more robust, though the 62^10 prefix space makes collisions negligible.

Positive observations

Migration strategy is excellent. Four migrations minimize lock duration and allow partial rollback
Backward compatibility is well-thought-out. MIN_VERSION_SUPPORTS_TOKEN_HASH gate + idx_token_plaintext keep old workers fast during rolling upgrades
Cache invalidation transition is clean. DB trigger sends token_prefix instead of plaintext
Token rotation pattern is solid. rotate_webhook_token correctly copies all metadata within a transaction
Test coverage is good. 6 integration tests cover hash consistency, CRUD lifecycle, logout, backward compat, and rotation
handler.rs update path is well-implemented. Falls back to new_webhook_token when rotate_webhook_token returns None (lines 263-279)

Summary

The core design is sound and most previous issues have been addressed. The only remaining actionable bug is the short token panic — a malformed Authorization: Bearer x header or a short superadmin secret will crash the server. This should be fixed before merge.

rubenfiszel · 2026-03-17T01:02:47Z

/ai fix There are 8 instances of token[0..TOKEN_PREFIX_LEN].to_string() that will panic if a token shorter than 10 characters is provided:

auth.rs — 7 sites on the main auth path (lines 284, 308, 324, 373, 387, 402, 436):

token_prefix: Some(token[0..TOKEN_PREFIX_LEN].to_string()),

users.rs:1651 — login path:

token_prefix: Some(token[0..TOKEN_PREFIX_LEN].to_string()),

Additionally, there are 4 instances of &token[..TOKEN_PREFIX_LEN] that will also panic:

lib.rs:523 (create_token_internal) — safe, uses rd_string(32)
users.rs:531 (logout) — unsafe, token comes from Authorization: Bearer header
users.rs:1753 (create_session_token) — safe, uses rd_string(32)
users.rs:1841 (impersonate) — safe, uses rd_string(32)

Most critical path — auth.rs:436 (superadmin secret): The SUPERADMIN_SECRET env var is user-configured and can easily be set to something shorter than 10 chars (e.g., "changeme"). Any instance with a short secret will crash on first auth attempt.

users.rs:531 (logout): A malformed cookie or Authorization: Bearer x header will crash the server.

Fix: Add a helper and use it everywhere:

fn safe_prefix(token: &str) -> String {
token.get(..TOKEN_PREFIX_LEN).unwrap_or(token).to_string()
}

claude · 2026-03-17T01:04:49Z

Add safe_token_prefix() helper that uses .get(..TOKEN_PREFIX_LEN).unwrap_or(token) instead of direct slice indexing, preventing panics when tokens are shorter than 10 characters (e.g., short superadmin secrets or malformed Bearer tokens). Co-authored-by: Ruben Fiszel <rubenfiszel@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat: store hashed tokens in the token table instead of plaintext

c65b4b2

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

rubenfiszel requested review from alpetric and hugocasa as code owners March 4, 2026 07:48

This comment was marked as outdated.

Sign in to view

claude Bot reviewed Mar 4, 2026

View reviewed changes

Comment thread backend/windmill-native-triggers/src/lib.rs Outdated

claude Bot reviewed Mar 4, 2026

View reviewed changes

Comment thread backend/migrations/20260316000000_token_hash.up.sql

claude Bot reviewed Mar 4, 2026

View reviewed changes

Comment thread backend/tests/fixtures/preserve_on_behalf_of.sql

claude Bot reviewed Mar 4, 2026

View reviewed changes

Comment thread backend/windmill-api-auth/src/auth.rs

rubenfiszel and others added 5 commits March 4, 2026 08:01

fix: restore sqlx offline cache (only add new query files)

790a18c

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat: branch on MIN_VERSION to write plaintext token or null

23bc2c4

Check MIN_VERSION_SUPPORTS_TOKEN_HASH at runtime: write plaintext to token column while old workers exist, switch to NULL once all workers are >= 1.649.0. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: set MIN_VERSION_SUPPORTS_TOKEN_HASH to 1.650.0

3153103

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

This comment was marked as outdated.

Sign in to view

hugocasa and others added 9 commits March 12, 2026 10:36

merge: update branch with origin/main

5d549f9

Resolve conflicts in monitor.rs and native-triggers/lib.rs, keeping the token_prefix-based queries from the store-hash branch. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: use token_hash for email lookup and expiry notifications

742fa00

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge remote-tracking branch 'origin/main' into store-hash

9436506

refactor: rotate webhook tokens instead of recovering plaintext from DB

b607c5c

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

refactor: use token_hash for native trigger token lookups and deletes

24fa52b

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

sqlx

98077ca

refactor: drop webhook_token_prefix from native_trigger table

5cbdcbf

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: backward compat for token rotation and make webhook_token_hash N…

3d9d35e

…OT NULL Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: prevent panic on short superadmin secret token prefix

37ec2e5

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

hugocasa requested changes Mar 13, 2026

View reviewed changes

hugocasa and others added 4 commits March 13, 2026 23:21

fix: preserve orphaned triggers with error instead of deleting

0385a97

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: rename token_expiry_notification.token to token_hash and copy ow…

4e2ce09

…ner/expiration in rotate Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: hash existing plaintext values before renaming token_expiry_noti…

30dc5f8

…fication column Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: remove unnecessary length check in token_expiry_notification mig…

978ee56

…ration Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

hugocasa marked this pull request as draft March 13, 2026 22:41

update dates and version

1291e6f

hugocasa marked this pull request as ready for review March 16, 2026 13:24

hugocasa added 2 commits March 16, 2026 17:54

Merge remote-tracking branch 'origin/main' into store-hash

d908650

updat ee ref + sqlx

fdcfb79

hugocasa changed the title ~~feat: store hashed tokens instead of plaintext~~ [ee] feat: store hashed tokens instead of plaintext Mar 16, 2026

hugocasa and others added 5 commits March 16, 2026 19:11

improve mcp migration

ec3d992

higher min version

d0443c0

windmill-internal-app Bot changed the title ~~[ee] feat: store hashed tokens instead of plaintext~~ feat: store hashed tokens instead of plaintext Mar 17, 2026

Merge branch 'main' into store-hash

f32cce5

rubenfiszel merged commit f2be625 into main Mar 17, 2026
9 of 10 checks passed

rubenfiszel deleted the store-hash branch March 17, 2026 01:15

github-actions Bot locked and limited conversation to collaborators Mar 17, 2026

		-- Make token NOT NULL again
		ALTER TABLE token ALTER COLUMN token SET NOT NULL;

Conversation

rubenfiszel commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

TODO for @hugocasademont

Remove get_token_by_prefix (native triggers)

Test plan

Uh oh!

This comment was marked as outdated.

cloudflare-workers-and-pages Bot commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deploying windmill with Cloudflare Pages

Uh oh!

Uh oh!

Uh oh!

claude Bot Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

rubenfiszel commented Mar 4, 2026

Uh oh!

This comment was marked as outdated.

hugocasa commented Mar 13, 2026

Uh oh!

claude Bot commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Store hashed tokens instead of plaintext

Bugs / Issues

Migration Concerns

Design Observations (Non-blocking)

Summary

Uh oh!

hugocasa left a comment

Choose a reason for hiding this comment

Uh oh!

hugocasa commented Mar 13, 2026

Uh oh!

hugocasa commented Mar 13, 2026

Uh oh!

claude Bot commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Store hashed tokens instead of plaintext

Bugs

Previously Flagged Issues — Now Addressed

Design Observations (Non-blocking)

Summary

Uh oh!

hugocasa commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Manual Testing Checklist

Token basics

Backward compatibility

Native triggers (requires Google or Nextcloud integration)

Migration

Edge cases

Uh oh!

claude Bot commented Mar 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Store hashed tokens instead of plaintext

Bug: Panic on short tokens (still present)

Recommendation: mcp_oauth_refresh_token.access_token semantic change without migration

Recommendation: rotate_webhook_token doesn't migrate token_expiry_notification rows

Migration Notes (non-blocking)

Positive Observations

Testing Instructions

Uh oh!

windmill-internal-app Bot commented Mar 17, 2026

Uh oh!

rubenfiszel commented Mar 17, 2026

Uh oh!

claude Bot commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Re-Review: Store hashed tokens instead of plaintext

Status of previously flagged issues

Bug: Panic on short tokens (4th time flagged)

Design observations (non-blocking)

Positive observations

Summary

Uh oh!

rubenfiszel commented Mar 4, 2026 •

edited

Loading

Remove `get_token_by_prefix` (native triggers)

cloudflare-workers-and-pages Bot commented Mar 4, 2026 •

edited

Loading

claude Bot commented Mar 13, 2026 •

edited

Loading

claude Bot commented Mar 13, 2026 •

edited

Loading

hugocasa commented Mar 13, 2026 •

edited

Loading

claude Bot commented Mar 16, 2026 •

edited

Loading

Recommendation: `mcp_oauth_refresh_token.access_token` semantic change without migration

Recommendation: `rotate_webhook_token` doesn't migrate `token_expiry_notification` rows

claude Bot commented Mar 17, 2026 •

edited

Loading

claude Bot commented Mar 17, 2026 •

edited

Loading