feat(auth): silent refresh helper + PKCE refreshToken via oauth4webapi#39
feat(auth): silent refresh helper + PKCE refreshToken via oauth4webapi#39scottlovegrove wants to merge 2 commits into
Conversation
PR3 of the refresh-token feature split. Login persists the full `TokenBundle` (refresh token + expiry) and `refreshAccessToken` rotates the access token proactively (skew window) or reactively (`force: true` after a 401), serialised via an `O_EXCL` file lock so concurrent CLI invocations don't issue parallel refresh grants. The PKCE provider implements `refreshToken` via `oauth4webapi`, declared as an optional peer dep — only refresh-capable consumers install it. `invalid_grant` (any status) maps to `AUTH_REFRESH_EXPIRED`; other failures to `AUTH_REFRESH_TRANSIENT`; missing peer / no refresh token / no `activeBundle` to `AUTH_REFRESH_UNAVAILABLE`. `KeyringTokenStore.activeBundle` parallel-reads both slots, honouring the `hasRefreshToken: false` record gate so access-only records skip the refresh-slot IPC. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The whole point of using oauth4webapi was to write less code — the initial PR3 commit didn't fully lean on that. Cuts: - `mapRefreshResponse` + `translateRefreshError` extracted as named functions called exactly once each. Inlined into `refreshToken`; the surrounding code reads top-to-bottom. - `loadOauth4webapi` was distinguishing `ERR_MODULE_NOT_FOUND` from other errors and surfacing the same `AUTH_REFRESH_UNAVAILABLE` code either way. Collapsed to one branch. - Verbose JSDoc on internal `acquireLock` / `translateRefreshError` restated what the code already says. Kept only the WHY-comments (re-read invariant on both acquire and timeout). - Three test-bloat cuts: provider 400-vs-401 + transient-variant it.eaches (oauth4webapi treats them identically — passes by construction); `refresh.test.ts#propagates AUTH_REFRESH_EXPIRED` (helper does no wrapping — `await` propagates by default); `internal.test.ts` redundant slot-read + slot-unavailable cases (covered by `activeBundle` integration tests in token-store). 412 tests still pass; net -128 LOC.
doistbot
left a comment
There was a problem hiding this comment.
This PR successfully implements the silent refresh capability by introducing the refreshAccessToken helper, integrating oauth4webapi for PKCE token rotation, and expanding the token store contract to handle full credential bundles. These additions robustly round out the CLI core auth storage and refresh lifecycle, laying a solid foundation for uninterrupted user sessions. There are a few areas to refine before merging, primarily around strengthening lock file management against stale state and race conditions, deduplicating shared logic, enforcing network timeouts, and ensuring the README and test suite fully reflect the new capabilities and peer dependencies.
| try { | ||
| const exchange = await provider.refreshToken({ | ||
| refreshToken: snapshot.bundle.refreshToken, | ||
| handshake: {}, |
There was a problem hiding this comment.
[P1] refreshAccessToken hardcodes handshake: {}, but the consumer CLI might have parsed runtime flags (e.g., --env) needed by the provider's tokenUrl resolver. Consider exposing handshake?: Record<string, unknown> on RefreshAccessTokenOptions so consumers can forward context down to provider.refreshToken.
| } | ||
|
|
||
| async function tryCreateLockFile(lockPath: string): Promise<boolean> { | ||
| try { |
There was a problem hiding this comment.
[P1] A stale lock file (e.g., left by a CLI crash or SIGINT during refresh) will permanently block future refreshes because acquireLock never steals or clears it. Consider checking stat(lockPath).mtimeMs before returning timeout and unlinking the lock if it's older than a safe threshold (e.g., 5-10s) so the user isn't forced to manually track down and delete the .lock file.
| if (await tryCreateLockFile(lockPath)) return { kind: 'acquired' } | ||
|
|
||
| const deadline = Date.now() + LOCK_WAIT_TIMEOUT_MS | ||
| while (Date.now() < deadline) { |
There was a problem hiding this comment.
[P1] Acquiring the lock and immediately returning acquired still leaves a race between the earlier activeBundle() read and lock creation. If another process refreshes and releases the lock in that window, this caller will POST with a stale refresh token and can spuriously hit AUTH_REFRESH_EXPIRED even though the store already has a fresh bundle. After any successful lock acquisition, re-read the bundle and skip POSTing when it has already changed.
| }, | ||
|
|
||
| async activeBundle(ref) { | ||
| const snapshot: Snapshot = |
There was a problem hiding this comment.
[P2] The record resolution logic (ref === undefined ? ...) and the structured access-outcome error mapping (if (!accessOutcome.ok) { ... throw new CliError(...) }) are identically duplicated from active(). Consider extracting a shared helper for resolution and another for the error mapping to avoid drift.
| const fallback = record.fallbackRefreshToken?.trim() | ||
| if (fallback) return { ok: true, token: fallback } | ||
|
|
||
| try { |
There was a problem hiding this comment.
[P3] The try/catch block that queries the keyring, trims the result, and handles SecureStoreUnavailableError is nearly identical to the one in readAccessTokenForRecord (differing only in the empty-slot detail string). Consider extracting this into a shared helper to remove the boilerplate duplication.
| function refreshProvider() { | ||
| return createPkceProvider<Account>({ | ||
| authorizeUrl: 'https://example.com/oauth/authorize', | ||
| tokenUrl: 'https://example.com/oauth/token', |
There was a problem hiding this comment.
[P2] This new refresh suite only covers the happy path and server/network failures with oauth4webapi present. The most user-visible branch introduced here is the optional-peer case, but there is still no automated check that a missing oauth4webapi import becomes AUTH_REFRESH_UNAVAILABLE with the install hint. Please add a test that forces the lazy import to fail with MODULE_NOT_FOUND/ERR_MODULE_NOT_FOUND and asserts the mapped error.
| }, 5_000) | ||
|
|
||
| it('carries the previous refresh token forward when the server omits it from the response', async () => { | ||
| const { store, state } = fakeStore({ |
There was a problem hiding this comment.
[P2] This error-path test asserts the propagated code but never verifies that the lock file was released after the rejected refresh. Since the helper's concurrency contract depends on finally cleanup, a stale-lock regression would leave later refreshes timing out while this test still passes. Please follow the rejection with a lock-path absence check or a second refresh that proves the lock can be reacquired.
| | `testing` (subpath) | `describeEmptyMachineOutput` | Vitest helpers reusable by consuming CLIs (e.g. parametrised empty-state suite covering `--json` / `--ndjson` / human modes). | | ||
| | Module | Key exports | Purpose | | ||
| | -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | ||
| | `auth` (subpath) | `attachLoginCommand`, `attachLogoutCommand`, `attachStatusCommand`, `attachTokenViewCommand`, `runOAuthFlow`, `refreshAccessToken`, `createPkceProvider`, `createSecureStore`, `createKeyringTokenStore`, `migrateLegacyAuth`, `persistBundle`, `bundleFromExchange`, PKCE helpers, `AuthProvider` / `TokenStore` / `TokenBundle` / `ActiveBundleSnapshot` / `RefreshInput` / `AccountRef` / `SecureStore` / `UserRecordStore` types, `AttachLogoutRevokeContext` | OAuth runtime plus the Commander attachers for `<cli> [auth] login` / `logout` / `status` / `token`. `attachLogoutCommand` accepts an optional `revokeToken` hook for best-effort server-side token revocation. Ships the standard public-client PKCE flow (`createPkceProvider`), a thin cross-platform OS-keyring wrapper (`createSecureStore`), and a multi-account keyring-backed `TokenStore` (`createKeyringTokenStore`) that stores secrets in the OS credential manager and degrades to plaintext in the consumer's config when the keyring is unavailable (WSL/headless Linux/containers). The store contract supports an optional `setBundle(account, bundle)` write method (required on `KeyringTokenStore`) so consumers that need refresh-token persistence can opt in via `TokenBundle`; `active()` stays narrow (access token + account only) so callers that don't need refresh state don't pay extra keyring IPC. `AuthProvider` and `TokenStore` remain the escape hatches for DCR or fully bespoke backends. `logout` / `status` / `token` always attach `--user <ref>` and thread the parsed ref to `store.active(ref)` (and `store.clear(ref)` on `logout`). `commander` (when using the attachers), `open` (browser launch), and `@napi-rs/keyring` (when using `createSecureStore` or the keyring `TokenStore`) are optional peer/optional deps. | |
There was a problem hiding this comment.
[P2] AGENTS.md requires the "What's in it" table to reflect peer-dep changes before the PR lands. The Purpose column for the auth module lists commander, open, and @napi-rs/keyring as optional peer/optional deps, but it should also include oauth4webapi.
| const result = await oauth.processRefreshTokenResponse(as, client, response) | ||
| return { | ||
| accessToken: result.access_token, | ||
| refreshToken: result.refresh_token, |
There was a problem hiding this comment.
[P2] This new refresh-token HTTP path has no timeout/abort guard, so a hung token endpoint can block the CLI indefinitely while holding the refresh lock. Our service-readiness standard requires timeouts on all network calls: https://handbook.doist.com/doc/standard-service-production-readiness-qgNRng8TJx. Please thread an AbortSignal-backed timeout (or another timeout-capable fetch path) into the oauth4webapi request.
| fetchLive?(ctx: { | ||
| account: TAccount | ||
| token: string | ||
| /** |
There was a problem hiding this comment.
[P3] This changes a public callback contract by adding fetchLive({ bundle }), but the README’s attachStatusCommand section still documents fetchLive as receiving only { account, token, view, flags }. Per the repo’s README-maintenance guidance, please update that usage block alongside this API change.
Summary
refreshAccessToken({ store, provider, lockPath, skewMs?, force?, ref? })rotates the access token proactively (default 60s skew) or reactively (force: trueafter a 401).O_EXCLfile lock atlockPathserialises concurrent invocations; waiters re-read the rotated bundle and skip POSTing.createPkceProvidernow implementsrefreshTokenviaoauth4webapi, declared as an optional peer dep (only refresh-capable consumers install it).invalid_grant(any status — proxies vary) →AUTH_REFRESH_EXPIRED; everything else →AUTH_REFRESH_TRANSIENT; missing peer / no refresh token / noactiveBundle/ no provider hook →AUTH_REFRESH_UNAVAILABLE.runOAuthFlowpersists the full bundle viapersistBundle({ promoteDefault: true })so refresh tokens issued at login land on the keyring's refresh slot.TokenStore.activeBundle?(ref)added (required override onKeyringTokenStore). Parallel-reads both slots, honouring thehasRefreshToken: falserecord gate so access-only records skip the refresh-slot IPC.active()stays narrow.status.fetchLivectx now optionally carriesbundle— best-effort populated when the store implementsactiveBundle.Driver: Doist/outline-cli#74. Plan:
.claude/plans/cli-core-pr3-refresh-helper.md.PR1 (#37) established the storage contract; PR2 (#38) hardened legacy-record migration. This PR completes the trio.
Out of scope (deferred)
refreshToken(no consumer needs it yet; theAuthProviderhook is already in place).exchangeCodeontooauth4webapi.oauth4webapion logout.Test plan
npm run checkcleannpm run type-checkcleannpm test— 418 tests pass (27 new)npm run buildcleannpm linkinto outline-cli refresh branch;ol auth loginpopulates both keyring slots; subsequent calls past expiry refresh silentlyAUTH_REFRESH_EXPIREDwith re-login hintolcommands at expired moment → one POST hits the serveroauth4webapiinstalled in consumer: refresh throwsAUTH_REFRESH_UNAVAILABLEwith install hint; login still works🤖 Generated with Claude Code