fix: deduplicate concurrent server-side token refresh calls#33
Conversation
When multiple server-function requests arrive concurrently with the same expired access token (e.g., TanStack Start firing 3 parallel loaders), each independently calls refreshTokens() with the same single-use refresh token. Only the first succeeds; the rest consume rate-limit quota or fail with TokenRefreshError. Add an in-flight Map to AuthKitCore.refreshTokens() keyed by refreshToken:organizationId. Concurrent callers share one promise; the map entry is cleared in finally on both success and failure. This mirrors the client-side refreshPromise dedup in tokenStore but operates per-process on the server.
Greptile SummaryThis PR fixes a race condition where multiple concurrent server-side requests each calling
Confidence Score: 5/5Safe to merge — the dedup logic is correct, the null-byte key separator is collision-free for well-formed WorkOS tokens, and cleanup is wired symmetrically on both success and failure paths. The change is a focused, well-tested addition to a single method. The JavaScript single-threaded execution model guarantees the map entry is set before any concurrent caller can miss it, and the cleanup fires before any awaiting caller's continuation resumes. The four new tests directly verify the core invariants (one API call, error fan-out, retry after failure, and per-org isolation). No files require special attention. Important Files Changed
Reviews (5): Last reviewed commit: "fix: address review feedback from Garen" | Re-trigger Greptile |
| const promise = (async () => { | ||
| try { | ||
| const result = | ||
| await this.client.userManagement.authenticateWithRefreshToken({ | ||
| refreshToken, | ||
| clientId: this.clientId, | ||
| organizationId, | ||
| }); | ||
|
|
||
| return { | ||
| accessToken: result.accessToken, | ||
| refreshToken: result.refreshToken, | ||
| user: result.user, | ||
| impersonator: result.impersonator, | ||
| }; | ||
| } catch (error) { | ||
| throw new TokenRefreshError('Failed to refresh tokens', error, context); | ||
| } finally { | ||
| this.inflightRefreshes.delete(key); | ||
| } | ||
| })(); | ||
|
|
||
| this.inflightRefreshes.set(key, promise); | ||
| return promise; |
There was a problem hiding this comment.
🟡 Stale rejected promise permanently cached in inflightRefreshes when authenticateWithRefreshToken throws synchronously
The IIFE at src/core/AuthKitCore.ts:233 starts executing synchronously before this.inflightRefreshes.set(key, promise) on line 255. If authenticateWithRefreshToken throws synchronously (i.e., is not an async function), the try/catch/finally inside the IIFE all execute synchronously — meaning this.inflightRefreshes.delete(key) on line 251 runs as a no-op (key not yet in the map), then set(key, promise) stores the already-rejected promise that will never be cleaned up. All subsequent calls with the same refreshToken:organizationId key will return the cached rejected promise, permanently preventing token refresh for that user until the process restarts.
Node.js verification of the execution order
When the awaited expression throws synchronously, finally runs before set:
Order: [ 'delete', 'set' ]
Map has stale entry: true
Map size: 1
When the awaited expression is async (even if it rejects), set correctly runs before finally:
async case: [ 'after-iife', 'finally' ]
| const promise = (async () => { | |
| try { | |
| const result = | |
| await this.client.userManagement.authenticateWithRefreshToken({ | |
| refreshToken, | |
| clientId: this.clientId, | |
| organizationId, | |
| }); | |
| return { | |
| accessToken: result.accessToken, | |
| refreshToken: result.refreshToken, | |
| user: result.user, | |
| impersonator: result.impersonator, | |
| }; | |
| } catch (error) { | |
| throw new TokenRefreshError('Failed to refresh tokens', error, context); | |
| } finally { | |
| this.inflightRefreshes.delete(key); | |
| } | |
| })(); | |
| this.inflightRefreshes.set(key, promise); | |
| return promise; | |
| const promise = (async () => { | |
| try { | |
| const result = | |
| await this.client.userManagement.authenticateWithRefreshToken({ | |
| refreshToken, | |
| clientId: this.clientId, | |
| organizationId, | |
| }); | |
| return { | |
| accessToken: result.accessToken, | |
| refreshToken: result.refreshToken, | |
| user: result.user, | |
| impersonator: result.impersonator, | |
| }; | |
| } catch (error) { | |
| throw new TokenRefreshError('Failed to refresh tokens', error, context); | |
| } | |
| })(); | |
| this.inflightRefreshes.set(key, promise); | |
| promise.finally(() => this.inflightRefreshes.delete(key)); | |
| return promise; |
Was this helpful? React with 👍 or 👎 to provide feedback.
- Extract RefreshResult type alias to deduplicate the Map field and method return type (prevents drift) - Hoist newJwt constant and makeExpiredSession() to file scope - Add makeCountingClient() helper to eliminate 4 near-identical mock client declarations in dedup tests - Remove shadowing newJwt redeclaration in validateAndRefresh block
- Use null byte delimiter in cache key to prevent theoretical collision if a refresh token contains ':' - Move map cleanup from IIFE finally to external .then(cleanup, cleanup) to guarantee delete runs after set, even if the SDK method throws synchronously
| getJwksUrl: () => 'https://api.workos.com/sso/jwks/test-client-id', | ||
| authenticateWithRefreshToken: async () => { | ||
| callCount++; | ||
| await new Promise(r => setTimeout(r, delayMs)); |
There was a problem hiding this comment.
I think this could/should use vi.useFakeTimers() + vi.advanceTimersByTimeAsync()
Co-authored-by: Garen Torikian <gjtorikian@users.noreply.github.com>
- Make bug report template framework-agnostic (remove hardcoded Next.js/TanStack references) - Use vi.useFakeTimers() + vi.advanceTimersByTimeAsync() in dedup tests instead of real setTimeout delays
Summary
MaptoAuthKitCore.refreshTokens()so concurrent calls with the same refresh token share a single WorkOS API call instead of racing${refreshToken}:${organizationId ?? ''}— different orgs get separate API callsfinallyon both success and failureProblem
When a TanStack Start user returns after an idle period, a single page navigation fires multiple concurrent server-function loaders (e.g., 3 for
/account). Each runs auth middleware independently, and each callsrefreshTokens()with the same expired refresh token. Since refresh tokens are single-use, only the first call succeeds — the rest hitRateLimitExceededExceptionorTokenRefreshError.Customer-reported: errors from the same IP had different trace IDs, confirming separate concurrent server-fn requests (not multi-tab).
The client-side
tokenStorealready hasrefreshPromisededup per browser tab, but there was zero server-side coordination.Approach
Mirrors the client-side
refreshPromisepattern:inflightRefreshesmap for an existing promise with the same keyfinallyAuthKitCoreis instantiated once per process (viaonce()infactory.ts), so the map is effectively process-wide.Test coverage
4 new tests added to
AuthKitCore.spec.ts:validateAndRefreshcalls → 1 API call, all resolve with same tokensTokenRefreshErrororganizationId= 2 separate API callsTrade-offs
context(userId/sessionId) is used for all waiters' error messages — acceptable since context is telemetry-onlyTest plan
main(3 API calls observed)pnpm buildpasses (TypeScript compilation)authkit-tanstack-startbuilds cleanly against this change