Skip to content

feat(auth): two-phase migrate write for the bundle-shape contract#38

Merged
scottlovegrove merged 4 commits into
mainfrom
feat/auth-migration-bundle-shape
May 19, 2026
Merged

feat(auth): two-phase migrate write for the bundle-shape contract#38
scottlovegrove merged 4 commits into
mainfrom
feat/auth-migration-bundle-shape

Conversation

@scottlovegrove
Copy link
Copy Markdown
Collaborator

@scottlovegrove scottlovegrove commented May 19, 2026

Summary

PR2 of three. Hardens migrateLegacyAuth against the two ways the storage contract change in #37 can bite a partially-completed migration:

  1. Retry after failure clobbers a freshly-logged-in v2 record. Pre-existing migrate used upsert (replace-not-merge), which would destroy any newer state if a previous run's markMigrated() had failed and the user has since logged in afresh.
  2. A partial migration leaves the user credential-less. Keyring-first writing meant a crash between setSecret and upsert left a keyring orphan + no record — the CLI couldn't read the token until the next migration attempt completed.

Both solved by restructuring the migration write into two phases.

Approach

  • Phase 1 (record-first): write the v2 record with fallbackToken: legacyToken and hasRefreshToken: false. Atomic via the new optional UserRecordStore.tryInsert? hook (introduced by #37) when present; falls back to list-then-upsert otherwise — narrow TOCTOU window, tolerable for a one-time migration. Returns false when a v2 record for the account already exists; Phase 2 is then skipped so the live record stays untouched.
  • Phase 2 (keyring move): only when Phase 1 wrote the record. Delegates to writeRecordWithKeyringFallback (single-source for the keyring-then-clean-upsert flow) which moves the token into the per-user slot and clears the fallbackToken. Best-effort: a failure leaves the Phase 1 fallback in place; reads keep working; a later set() / setBundle() upgrades the credential naturally.

Marker, default promotion, and cleanup ordering all unchanged.

Contract surface

No public API changes. MigrateLegacyAuthOptions, MigrateAuthResult, MigrateSkipReason are byte-for-byte identical. Stores that don't implement the optional tryInsert? get the legacy code path.

Test plan

  • npm run check — clean
  • npm run type-check — clean
  • npm test — 386 pass (382 → +4 new: tryInsert atomic path, pre-existing v2 record skips Phase 2 atomic, pre-existing v2 record skips Phase 2 legacy, Phase 2 keyring failure benign)
  • npm run build — clean
  • Manual: live run from a pre-feat(auth): add TokenBundle storage contract to TokenStore #37 todoist-cli install once cli-core publishes

Note on PR3

PR3 (refresh helper + PKCE.refreshToken via oauth4webapi + flow.ts swap) lands next. Together the three PRs ship the full refresh-token feature.

🤖 Generated with Claude Code

…ck-first record

PR2 of the refresh-token series. Hardens `migrateLegacyAuth` against two
classes of issue exposed by PR1's bundle contract:

1. Retry-after-failure clobbers a freshly-logged-in v2 record.
   Pre-existing behaviour used `upsert` (replace-not-merge), which would
   destroy any newer state if a previous run's `markMigrated()` had
   failed and the user has since logged in afresh between attempts.
2. A partial migration leaves the user without working credentials.
   The keyring-first write meant a crash between `setSecret` and
   `upsert` left a keyring orphan and no record — the CLI couldn't
   read the token until the next migration attempt completed.

Both are solved by restructuring the migration write into two phases:

- **Phase 1 (record-first)**: write the v2 record with
  `fallbackToken: legacyToken` and `hasRefreshToken: false`. Atomic via
  the new `UserRecordStore.tryInsert?` hook (added in PR1) when the
  store implements it; falls back to list-then-upsert otherwise
  (narrow TOCTOU window, tolerable for a one-time migration). Returns
  `false` when a v2 record for the account already exists — Phase 2 is
  then skipped to preserve the live state.
- **Phase 2 (keyring move)**: only when Phase 1 wrote the record.
  Delegates to `writeRecordWithKeyringFallback`, which moves the token
  to the per-user keyring slot and upserts a clean record without
  `fallbackToken`. Best-effort: a failure leaves the Phase 1 fallback
  in place and the CLI continues to work via the plaintext fallback;
  a later `set()` or `setBundle()` upgrades the credential naturally.

Behaviour preserved for all existing callers:
- The contract surface (`migrateLegacyAuth` signature, `MigrateAuthResult`,
  `MigrateSkipReason`) is unchanged.
- Single-token records still end up with `hasRefreshToken: false` (PR1's
  `writeRecordWithKeyringFallback` already writes this).
- Migrated records get `setDefaultId(account.id)` best-effort unless
  another default is already pinned (existing behaviour).
- `markMigrated` is still the one-way gate; cleanup still runs concurrently.

Test surface (382 → 386):
- `tryInsert` path exercised when the store opts in.
- Pre-existing v2 record skips Phase 2 (both atomic and legacy paths).
- Phase 2 keyring failure is benign — Phase 1 fallback survives, marker
  + cleanup still run.
- Existing legacy-keyring / WSL / cleanup tests updated for the new
  two-upsert happy-path call count.
- `buildUserRecords` test harness now accepts `withTryInsert: true` so
  tests can pick the atomic vs legacy code path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@scottlovegrove scottlovegrove self-assigned this May 19, 2026
Copy link
Copy Markdown
Member

@doistbot doistbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR addresses migration data loss and race conditions by restructuring the legacy auth migration into a two-phase process that secures a fallback record before modifying the keyring. The approach effectively mitigates TOCTOU vulnerabilities and crash-related token loss, providing a much safer transition path. There are a few areas to refine, such as ensuring existing v2 records are fully readable before skipping the second phase, tightening error handling during the keyring move, and strengthening the test suite against implementation coupling and repetitive setup.

Share FeedbackReview Logs

Comment thread src/auth/keyring/migrate.ts Outdated
Comment thread src/auth/keyring/migrate.test.ts Outdated
Comment thread src/auth/keyring/migrate.ts Outdated
Comment thread src/auth/keyring/migrate.ts Outdated
Comment thread src/auth/keyring/migrate.ts Outdated
Comment thread src/auth/keyring/migrate.test.ts Outdated
Comment thread src/auth/keyring/migrate.test.ts
Comment thread src/auth/keyring/migrate.test.ts Outdated
scottlovegrove and others added 2 commits May 19, 2026 13:53
8 findings on #38. All resolved.

P1 — correctness
- Verify the existing v2 record is readable before retiring the legacy
  state. A clean record from a prior `set()` / `setBundle()` has no
  `fallbackToken` and depends on the keyring being reachable; if it
  isn't, cleaning up the legacy slot would brick the user. New
  `verifyExistingRecordReadable` short-circuits with
  `skipped(legacy-keyring-unreachable)` so the next retry has a
  working credential to fall back to.

P2 — correctness
- Inline Phase 2 instead of delegating to
  `writeRecordWithKeyringFallback`. The delegated path double-upserted
  the same fallback record on the offline branch (Phase 1 had already
  written it). Phase 2 now calls `setSecret` directly, falls through on
  `SecureStoreUnavailableError`, and only upserts the clean shape when
  the keyring write actually succeeded.
- Narrow the Phase 2 catch. Only `SecureStoreUnavailableError` is
  swallowed; other failures surface as
  `skipped(user-record-write-failed)` so operators see transient
  issues and the marker stays unset for retry.

P2 — refactor / DRY
- Extract `buildSingleTokenRecord(account, fallbackToken?)` into
  `record-write.ts`. Migration Phase 1 and `writeBundleWithKeyringFallback`
  both use it — no more drift between the two record-shape sites.
- `runMigration` test helper takes `withTryInsert?: boolean` and
  threads it through `buildUserRecords`. The three new
  manually-wired-up tests collapse into the standard helper, and
  pre-existing v2 record cases use `it.each([true, false])` to cover
  both paths with one body.

P2 — tests
- Replace the "exact upsert count" assertion with end-state assertions
  (token in keyring, no plaintext residue, legacy slot empty).
- New `listSpy` on `buildUserRecords` so the `tryInsert` test can
  assert `list()` was never consulted — the TOCTOU race the atomic
  path is meant to prevent.
- New regression: existing v2 record + keyring offline returns
  `skipped(legacy-keyring-unreachable)` without touching legacy state.
- New regression: existing v2 record with `fallbackToken` skips the
  readability probe entirely (record is self-sufficient).
- New regression: Phase 2 `SecureStoreUnavailable` falls back
  silently (Phase 1 record survives, marker still set, cleanup runs).
- New regression: Phase 2 non-keyring `setSecret` failure surfaces as
  `skipped(user-record-write-failed)`.

P3 — comment hygiene
- Drop the "Phase 1 + Phase 2 = two upserts" restatement comment.
- Drop the meta-comment justifying end-state assertions.
- JSDoc on `ensureV2Record` now spells out "time-of-check, time-of-use"
  rather than the bare TOCTOU acronym.

386 → 389 tests pass. Check / type-check / build all clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… → phase1Written

No behaviour change. Two passes:

1. JSDoc audit — collapsed restated detail across the main function's
   numbered list, the two helper JSDocs, and the inline comments. The
   main `migrateLegacyAuth` block goes from 32 lines to 19; helper docs
   compress to 6 lines each; inline comments either drop (when they
   duplicated JSDoc) or trim to the WHY only.

2. Naming nit (per Scott on #38): `phase1Wrote` → `phase1Written`. Past
   participle reads cleanly as a boolean flag.

Net -52 LOC.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@scottlovegrove
Copy link
Copy Markdown
Collaborator Author

@doistbot /review

Copy link
Copy Markdown
Member

@doistbot doistbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR successfully implements the two-phase migration strategy, significantly improving the resilience of the authentication migration against partial failures and data loss. The structured approach to fallback handling provides a much safer upgrade path for legacy credentials. There are a few critical edge cases to resolve regarding unhandled storage rejections that could crash the CLI and a retry short-circuit that might leave tokens in plaintext indefinitely. Additionally, there are some opportunities to deduplicate token read logic, optimize configuration reads, and tighten up test assertions and observability logs.

Share FeedbackReview Logs

Comment thread src/auth/keyring/migrate.ts Outdated
Comment thread src/auth/keyring/migrate.ts Outdated
Comment thread src/auth/keyring/record-write.ts Outdated
Comment thread src/auth/keyring/migrate.ts Outdated
Comment thread src/auth/keyring/migrate.ts Outdated
Comment thread src/auth/keyring/migrate.test.ts Outdated
Comment thread src/auth/keyring/migrate.test.ts Outdated
Comment thread src/auth/keyring/migrate.test.ts Outdated
Comment thread src/auth/keyring/migrate.ts Outdated
Comment thread src/auth/keyring/migrate.ts Outdated
10 findings on #38. All resolved.

P1 — correctness
- Phase 2 retry on existing fallback-bearing record. When the existence
  check returns false AND the existing record's `fallbackToken` matches
  our legacy token, treat it as a prior incomplete migration and finish
  Phase 2 (move to keyring, clear fallback). Without this, a transient
  Phase 2 failure stuck the user on plaintext fallback indefinitely;
  future retries skipped Phase 2 because "record exists".
- `userRecords.list()` failures inside the readability branch are now
  caught — the existence/readability path runs through the outer
  `migrateLegacyAuth` try/catch via `ensureV2Record`'s richer return,
  and `runPhase2` similarly bubbles errors to the caller as
  `skipped(user-record-write-failed)`.

P2 — refactor / DRY
- New `keyring/internal.ts` with `readAccessTokenForRecord(record,
  secureStore)` returning a structured `ReadAccessTokenOutcome` union.
  Single source for "fallback first, then keyring slot, then error
  variants". `KeyringTokenStore.active()` and the migration's readability
  probe both call it; consumers map the outcome to their own error
  contract (typed `CliError` vs. `MigrateSkipReason`).
- `ensureV2Record` returns `{ written: true } | { written: false; existing }`
  so the orchestration doesn't pay a second `list()` to re-fetch the
  existing record. Phase 2 retry and the external-record readability
  branch both reuse the returned record.
- Phase 2 extracted into `runPhase2(userRecords, userSlot, account,
  legacyToken)` so the new "initial write" and "retry" paths share one
  implementation.

P2 — observability
- New `user-keyring-unreachable` skip reason. The per-user slot being
  offline is now distinct from the legacy slot being offline; operators
  no longer see "legacy credential unreachable" when the legacy slot
  was just read successfully. Both reasons stay retryable.

P2 — docs
- `buildSingleTokenRecord` JSDoc no longer claims
  `writeBundleWithKeyringFallback` uses it. The bundle path carries
  expiry fields the single-token shape doesn't, so they're intentionally
  separate; the helper captures only the
  `hasRefreshToken: false` + optional `fallbackToken` pair that the
  migration paths share.

P3 — tests
- Dropped the `upsertSpy` count assertion in the Phase 2
  SecureStoreUnavailable test (coupled to non-atomic Phase 1).
- New regression: existing clean v2 record + per-user slot empty →
  `skipped(user-record-write-failed)`, marker stays unset.
- New regression: existing record with external `fallbackToken` skips
  Phase 2 AND never probes the per-user slot (`km.getCalls`
  assertion locks the no-IPC behaviour).
- New regression: existing record carrying our legacy token retries
  Phase 2 cleanly (setSecret + clean upsert; final state matches a
  one-shot migration).
- Dropped the "One handle for both…" restated comment per the repo's
  "comments explain why, not what" guidance.

386 → 391 tests pass. Check / type-check / build all clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@scottlovegrove scottlovegrove merged commit 9cf5e9c into main May 19, 2026
4 checks passed
@scottlovegrove scottlovegrove deleted the feat/auth-migration-bundle-shape branch May 19, 2026 13:41
doist-release-bot Bot added a commit that referenced this pull request May 19, 2026
## [0.18.0](v0.17.0...v0.18.0) (2026-05-19)

### Features

* **auth:** two-phase migrate write for the bundle-shape contract ([#38](#38)) ([9cf5e9c](9cf5e9c))
@doist-release-bot
Copy link
Copy Markdown
Contributor

🎉 This PR is included in version 0.18.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants