hir-94: actually reschedule queue entries when an account is over quota #18

Open
jaredzwick wants to merge 1 commit into pypesdev:main from jaredzwick:hir-94/queue-quota-reschedule

Summary

The quota-exceeded branch in processEmailQueue claimed to "reschedule for quota reset time", but the underlying updateQueueStatus call only touched status and errorMessage; scheduledFor stayed at its original value. Since getNextPendingEmails filters by scheduledFor <= now, the same entry was re-picked on every batch, re-checked for unsubscribe and quota, and skipped again. Under sustained load with one over-quota account, the processor's entire batch fills with the same N entries every iteration and other accounts' work starves.
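
The starvation described above can be reproduced with a small in-memory model. This is only an illustrative sketch, not the repository's code: `QueueEntry`, `nextPending`, `runBatch`, and `makeQueue` are hypothetical names, and the oldest-first ordering and batch size of 3 are assumptions made for the demo. Only the `scheduledFor <= now` filter is taken from the description.

```typescript
// Hypothetical in-memory stand-in for the queue table.
interface QueueEntry {
  id: number;
  accountId: string;
  status: "pending" | "sent";
  scheduledFor: Date;
}

// Mirrors the described filter: pending entries with scheduledFor <= now.
// Oldest-first ordering and the batch cap are assumptions for this sketch.
function nextPending(queue: QueueEntry[], now: Date, batchSize: number): QueueEntry[] {
  return queue
    .filter((e) => e.status === "pending" && e.scheduledFor.getTime() <= now.getTime())
    .sort((a, b) => a.scheduledFor.getTime() - b.scheduledFor.getTime() || a.id - b.id)
    .slice(0, batchSize);
}

// Processes one batch. When rescheduleTo is null, over-quota entries are
// skipped without touching scheduledFor (the old behavior).
function runBatch(
  queue: QueueEntry[],
  now: Date,
  batchSize: number,
  overQuota: Set<string>,
  rescheduleTo: Date | null,
): number[] {
  const picked = nextPending(queue, now, batchSize);
  for (const entry of picked) {
    if (overQuota.has(entry.accountId)) {
      if (rescheduleTo) entry.scheduledFor = new Date(rescheduleTo.getTime());
      continue; // still pending, no send attempted
    }
    entry.status = "sent";
  }
  return picked.map((e) => e.id);
}

// Three due entries on over-quota account A, two due entries on account B.
function makeQueue(now: Date): QueueEntry[] {
  const due = (offsetMs: number) => new Date(now.getTime() - offsetMs);
  return [
    { id: 1, accountId: "A", status: "pending", scheduledFor: due(60_000) },
    { id: 2, accountId: "A", status: "pending", scheduledFor: due(60_000) },
    { id: 3, accountId: "A", status: "pending", scheduledFor: due(60_000) },
    { id: 4, accountId: "B", status: "pending", scheduledFor: due(30_000) },
    { id: 5, accountId: "B", status: "pending", scheduledFor: due(30_000) },
  ];
}

const t0 = new Date("2024-01-01T00:00:00Z");
const overQuota = new Set(["A"]);

const stuck = makeQueue(t0);
console.log(runBatch(stuck, t0, 3, overQuota, null)); // [1, 2, 3]
console.log(runBatch(stuck, t0, 3, overQuota, null)); // [1, 2, 3] again; account B never runs

const fixed = makeQueue(t0);
const later = new Date(t0.getTime() + 3_600_000);
console.log(runBatch(fixed, t0, 3, overQuota, later)); // [1, 2, 3]
console.log(runBatch(fixed, t0, 3, overQuota, later)); // [4, 5]
```

In the model, bumping `scheduledFor` is what removes the over-quota entries from the next batch's candidate set, which frees the slots for account B.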

Changes

  • libs/db/src/queries/emailQueue.ts: new rescheduleQueueEntry(id, newScheduledFor, errorMessage?) helper. Keeps status pending, bumps scheduledFor, sets updatedAt. Does NOT increment attemptCount because no send was attempted.
  • src/lib/queueScheduling.ts (new): pure computeQuotaRescheduleAt(quotaResetAt, now?, minDelayMs?). Honors the account's stored quota reset when it's far enough out; floors at now + minDelay (default 1 minute) when the reset is null, in the past, equal to now, or sooner than the floor. Always returns a Date >= now so we never schedule backwards.
  • src/lib/emailQueueProcessor.ts: quota-exceeded branch calls rescheduleQueueEntry with the computed Date. Also drops the previous "reset is in the past → fall through and try to send" path. A fresh quota window is the account row's responsibility; falling through would attempt a send against an account whose quotaUsedToday is still at the limit. Safer to wait for the reset to be applied.
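
As a reading aid, the rules listed for computeQuotaRescheduleAt can be sketched as follows. This is a minimal reconstruction from the description above, not the file in this PR: the parameter types, the `DEFAULT_MIN_DELAY_MS` constant name, and the handling of an invalid Date are assumptions.

```typescript
// Assumed constant name; the description only states "default 1 minute".
const DEFAULT_MIN_DELAY_MS = 60_000;

// Pure helper: returns the moment at which an over-quota entry should be retried.
function computeQuotaRescheduleAt(
  quotaResetAt: Date | null | undefined,
  now: Date = new Date(),
  minDelayMs: number = DEFAULT_MIN_DELAY_MS,
): Date {
  // A negative delay is clamped to zero so the floor is never before now.
  const delayMs = Math.max(0, minDelayMs);
  const floorMs = now.getTime() + delayMs;

  // Missing (or invalid, by assumption) reset times fall back to the floor.
  const resetMs = quotaResetAt ? quotaResetAt.getTime() : Number.NaN;
  if (Number.isNaN(resetMs)) return new Date(floorMs);

  // Honor the stored reset only when it is at or beyond the floor.
  // Always build a new Date so callers never share the input reference.
  return new Date(Math.max(resetMs, floorMs));
}

const now = new Date("2024-01-01T00:00:00Z");
const inTwoHours = new Date(now.getTime() + 2 * 3_600_000);
const inThirtySeconds = new Date(now.getTime() + 30_000);

console.log(computeQuotaRescheduleAt(inTwoHours, now).toISOString());      // 2024-01-01T02:00:00.000Z
console.log(computeQuotaRescheduleAt(null, now).toISOString());            // 2024-01-01T00:01:00.000Z
console.log(computeQuotaRescheduleAt(inThirtySeconds, now).toISOString()); // 2024-01-01T00:01:00.000Z
```

Taking the maximum of the reset time and the floor covers the past, equal-to-now, and sub-floor cases with a single comparison, which is why the result can never be earlier than `now`.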

Tests

  • tests/int/queueScheduling.int.spec.ts — 12 specs: future reset honored; null/undefined fallback; past-reset floor; equal-to-now floor; sub-floor reset clamped to floor; just-past-floor reset honored; custom minDelayMs; negative minDelayMs clamped to zero; fresh-Date guarantee (output is not the input reference); default-now omission; "output >= now" property across a parametric input set.
  • All 12 new tests pass. Full pnpm test:int: 107/108 pass; the 1 failure is the pre-existing api.int.spec.ts Payload-secret issue on main, unrelated.
  • pnpm lint clean for all changed files.

Regression analysis

  • The processor's other branches (sent, failed, retry, max-attempts, unsubscribed, scheduled-in-future) are untouched.
  • updateQueueStatus is still used everywhere it was before. The new rescheduleQueueEntry is additive.
  • Removing the "reset already in the past → try to send anyway" fall-through is a behavior change, but the previous behavior was buggy: it would attempt a Gmail send against an account whose quotaUsedToday had not been reset, which Gmail would reject and the entry would then count as a real failure (incrementing attemptCount toward the max-attempts cap). The new behavior — wait for the reset — is strictly safer.
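
The attemptCount distinction above can be illustrated on a plain object. This is a hedged sketch, not the database helper from this PR: `QueueRow`, `rescheduleRow`, and `recordFailedSend` are hypothetical names, and the "failed once attemptCount reaches maxAttempts" rule is an assumption about how the cap works.

```typescript
// Hypothetical row shape; field names follow the ones mentioned in this PR.
interface QueueRow {
  id: number;
  status: "pending" | "failed";
  scheduledFor: Date;
  attemptCount: number;
  errorMessage: string | null;
  updatedAt: Date;
}

// Quota skip: stays pending, moves scheduledFor, leaves attemptCount alone.
function rescheduleRow(row: QueueRow, newScheduledFor: Date, now: Date, errorMessage?: string): QueueRow {
  return {
    ...row,
    status: "pending",
    scheduledFor: new Date(newScheduledFor.getTime()),
    errorMessage: errorMessage ?? row.errorMessage,
    updatedAt: now,
  };
}

// Real send failure: counts toward the cap (cap semantics assumed here).
function recordFailedSend(row: QueueRow, maxAttempts: number, now: Date, errorMessage: string): QueueRow {
  const attemptCount = row.attemptCount + 1;
  return {
    ...row,
    attemptCount,
    status: attemptCount >= maxAttempts ? "failed" : "pending",
    errorMessage,
    updatedAt: now,
  };
}

const at = new Date("2024-01-01T00:00:00Z");
const row: QueueRow = {
  id: 1,
  status: "pending",
  scheduledFor: at,
  attemptCount: 2,
  errorMessage: null,
  updatedAt: at,
};

const skipped = rescheduleRow(row, new Date(at.getTime() + 3_600_000), at, "quota exceeded");
const rejected = recordFailedSend(row, 3, at, "send rejected");

console.log(skipped.status, skipped.attemptCount);   // pending 2
console.log(rejected.status, rejected.attemptCount); // failed 3
```

With an entry already at two attempts and an assumed cap of three, a send made against a stale quota would be the attempt that retires the entry, whereas a reschedule leaves it eligible once the quota window opens.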

Test plan

  • Manually exhaust a connected Gmail account's daily quota (set dailyQuota: 1 and quotaUsedToday: 1 in the DB).
  • Create a campaign with 10 recipients on that account, plus 5 recipients on a second connected account.
  • POST /api/email-queue/process and confirm: the 5 entries on the second account send; the 10 entries on the over-quota account get scheduledFor bumped to the account's quotaResetAt, and the next process call no longer re-picks them.
  • Manually set quotaUsedToday: 0 and quotaResetAt: now() + 1 hour, set the queue entry's scheduledFor: now() + 30 seconds, and confirm a process call right now does not re-pick it.

Cumulative HIR-94 progress

  • #8 — Gmail OAuth connect (22 specs)
  • #9 — Campaign create UI + recipient parser (9 specs)
  • #10 — RFC-correct MIME (41 specs)
  • 🟡 #13 — Per-recipient variables + substitution (25 new specs)
  • 🟡 This PR — Queue quota-rescheduling (12 specs)
  • ⏭ Next: CSV recipient upload; daily quota reset job.

🤖 Generated with Claude Code

The quota-exceeded branch in `processEmailQueue` claimed to "reschedule
for quota reset time" but the underlying `updateQueueStatus` call only
touched `status` and `errorMessage`. `scheduledFor` stayed at its
original value. The next batch picked the same entry again — because
`getNextPendingEmails` filters by `scheduledFor <= now` — wasted a slot
re-running the unsubscribe + quota checks, and skipped it again. Under
sustained load with one over-quota account, the processor would fill
its entire batch every iteration with the same N entries and never make
forward progress on other accounts' work.

This change:

- libs/db/src/queries/emailQueue.ts: new `rescheduleQueueEntry(id,
  newScheduledFor, errorMessage?)` helper. Keeps status pending,
  bumps `scheduledFor`, sets `updatedAt`. Does NOT increment
  `attemptCount` because no send was attempted.
- src/lib/queueScheduling.ts (new): pure `computeQuotaRescheduleAt(
  quotaResetAt, now?, minDelayMs?)` — picks the future moment to retry.
  Honors the account's stored quota reset when it's far enough out;
  floors at `now + minDelay` (default 1 minute) when the reset is
  null, in the past, equal to now, or sooner than the floor. Always
  returns a Date >= now so we never schedule into the past.
- src/lib/emailQueueProcessor.ts: quota-exceeded branch now calls
  `rescheduleQueueEntry` with the computed Date instead of
  `updateQueueStatus`. Also drops the previous "reset is in the past
  -> fall through and try to send" path: a fresh quota window resetting
  is the email-account row's responsibility, not the processor's, and
  the previous fall-through would attempt a send against an account
  whose `quotaUsedToday` was still at the limit. Safer to wait for the
  reset to be applied.

Tests: 12 new vitest specs covering future-reset honored, null/undefined
fallback, past-reset floor, equal-to-now floor, sub-floor reset
clamped, just-past-floor reset honored, custom minDelay, negative
minDelay clamped to zero, fresh-Date guarantee, default-now omission,
"never earlier than now" property across a parametric set of inputs.

107/108 tests pass in the full int suite (the 1 failure is the
pre-existing Payload-secret config issue on `main`, unrelated). Lint
clean for changed files.

Co-Authored-By: Paperclip <noreply@paperclip.ing>