Skip to content

feat(web): upgrade rate-limiter to Upstash Redis with in-memory fallback#49

Merged
bolin8017 merged 2 commits into
mainfrom
feat/upstash-rate-limiter
Apr 18, 2026
Merged

feat(web): upgrade rate-limiter to Upstash Redis with in-memory fallback#49
bolin8017 merged 2 commits into
mainfrom
feat/upstash-rate-limiter

Conversation

@bolin8017
Copy link
Copy Markdown
Owner

Summary

Closes the deferred P1 rate-limiter gap. The old webhook rate-limiter was a module-level Map that reset on every Vercel cold start and did not share state across regions.

  • Production path: when `UPSTASH_REDIS_REST_URL` + `UPSTASH_REDIS_REST_TOKEN` are set, every `checkRateLimit()` hits an Upstash sliding-window counter shared across all instances. Provision via Vercel Marketplace → Upstash Redis (auto-populates both env vars).
  • Fallback path: missing env vars OR an Upstash outage during a request transparently falls back to the in-memory Map. Webhook delivery is never blocked by a cache failure.

`checkRateLimit` is now `Promise`. All 3 call sites (Telegram webhook, LINE webhook, auth callback) `await` it.

Deployment

After merge:

  1. In Vercel dashboard → Integrations → Upstash → provision Redis, select `caffecode` project. `UPSTASH_REDIS_REST_URL` / `UPSTASH_REDIS_REST_TOKEN` get auto-populated.
  2. Redeploy (empty `vercel redeploy` or push any commit).

Before step 1, the code falls back to in-memory — no downtime.

Test plan

  • `pnpm --filter @caffecode/web exec tsc --noEmit` — clean
  • `pnpm --filter @caffecode/web exec vitest run` — 571/571 pass (rate-limiter tests exercise the in-memory fallback path; UPSTASH_* unset in test env)

🤖 Generated with Claude Code

The webhook rate-limiter was a module-level Map that reset on every
Vercel cold start and never shared state across regions — a real gap
for bursty webhook traffic, flagged as P1 in the review and deferred
pending external infrastructure.

This commit wires up `@upstash/ratelimit` + `@upstash/redis` with a
strict fallback:

- If `UPSTASH_REDIS_REST_URL` + `UPSTASH_REDIS_REST_TOKEN` are set,
  every `checkRateLimit()` call hits a sliding-window Upstash limit
  shared by every instance. Provisioning is a one-click install from
  the Vercel Marketplace (Upstash Redis integration auto-populates
  both env vars).
- If either env var is missing, or Upstash is unreachable during a
  request, the helper transparently falls back to the in-memory Map
  so webhook delivery is never blocked by a cache outage.

`checkRateLimit` signature changes from sync to `Promise<boolean>`.
All three call sites (Telegram webhook, LINE webhook, auth callback)
now `await` it. The test suite exercises the in-memory path
(UPSTASH_REDIS_REST_URL unset in test env) so behaviour is verified
end-to-end without the remote dependency.

Also adds the two new env vars to `turbo.json` build inputs and to
`.env.example` with provisioning notes, and lists PUSH_SELECT_CONCURRENCY
/ PUSH_DISPATCH_CONCURRENCY (added in PR #44) for cache-key correctness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
caffecode Ready Ready Preview, Comment Apr 18, 2026 3:37am

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 18, 2026

📊 Coverage Report

Package Statements Branches Functions Lines
shared 92.22% ❌ (≥95%) 87.74% ❌ (≥90%) 96.87% ✅ (≥95%) 91.88% ❌ (≥95%)
web 94.19% ✅ (≥90%) 89.37% ✅ (≥85%) 96.47% ✅ (≥90%) 94.73% ✅ (≥90%)

🤖 Generated by CI · thresholds enforced via vitest

Per reviewer-pr49 finding: if UPSTASH_REDIS_REST_URL is set to a
malformed value (typo during `vercel env add`), `Redis.fromEnv()`
throws at module load and crashes every cold start of the webhook
route. Wrap the constructor in try/catch so malformed envs fall
through to the in-memory fallback instead.

Also promotes the silent Upstash-runtime-error fallback to a
`logger.warn` so incidents are visible in Pino output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bolin8017 bolin8017 merged commit cce9293 into main Apr 18, 2026
3 checks passed
@bolin8017 bolin8017 deleted the feat/upstash-rate-limiter branch April 18, 2026 03:37
bolin8017 added a commit that referenced this pull request Apr 18, 2026
Audit pass over every .md after merging 17 PRs (#38-#54) in the
review cycle. Nothing behavioural changed; this PR just retires claims
that no longer match the repo.

Corrections:

- README.md: bumped prerequisite from Node 22+/pnpm 9 to Node 22+/pnpm
  10 (CI and Vercel pinned to Node 24 via .nvmrc). Updated test counts
  185 -> 186 (shared) and 566 -> 571 (web) for a new TS total of 757;
  matches the counts verified at the end of the Cache Components PR.
- CLAUDE.md: same test count update; replaced "Web + Worker" in the
  architecture + deployment tables with "Web (includes cron API route)"
  since apps/worker/ was deleted in PR #33 and the push pipeline now
  runs as /api/cron/push inside the web app. Dropped `worker` from the
  scope enum. Added a short Cache Components note pointing at the rule
  file instead of duplicating the pattern here.
- .claude/rules/git-conventions.md: same scope enum fix, with a note
  explaining why `worker` was retired. Replaced the `fix(worker): ...`
  example with `fix(web): ...`.
- .claude/rules/deployment.md: renamed "Worker Cron" section to "Cron
  Entry" (matches push-pipeline.md terminology), rewrote the deploy
  step and the Web+Worker cloud-services entry, and added Upstash
  Redis as an optional service now that PR #49 uses it for the
  cross-instance webhook rate limiter.
- .claude/rules/notifications.md: removed the stale
  `apps/worker/src/channels/**` glob; clarified that the email
  renderer lives in packages/shared.
- docs/staging-setup.md: every "Web + Worker" phrasing updated to
  reflect the single Vercel deployment that now hosts the cron route.

No production references remain to `apps/worker/`, `pnpm@9.15.0`, or
outdated test counts.
bolin8017 added a commit that referenced this pull request Apr 18, 2026
## Summary

Audit pass over every `.md` after merging 17 PRs (#38#54) in the review
cycle. Nothing behavioural changed; just retire claims that no longer
match the repo.

## Corrections

- **`README.md`**: Node 22 → Node 22+ with CI/Vercel on Node 24 (via
`.nvmrc`); pnpm 9 → pnpm 10.33.0. Test counts: shared 185 → 186, web 566
→ 571, TS total 751 → 757.
- **`CLAUDE.md`**: same test count update; "Web + Worker" → "Web
(includes cron API route)" everywhere (apps/worker/ deleted in PR #33 —
push pipeline now runs as `/api/cron/push` inside the web app). Dropped
`worker` from the commit scope enum. Added a short Cache Components note
pointing at `.claude/rules/web-patterns.md` instead of duplicating the
pattern.
- **`.claude/rules/git-conventions.md`**: same scope enum fix with
explanation. Replaced `fix(worker): ...` example with `fix(web): ...`.
- **`.claude/rules/deployment.md`**: "Worker Cron" section → "Cron
Entry" (matches push-pipeline.md terminology). Added Upstash Redis as an
optional cloud service now that PR #49 uses it for the cross-instance
webhook rate limiter.
- **`.claude/rules/notifications.md`**: removed the stale
`apps/worker/src/channels/**` path glob; clarified that the email
renderer lives in `packages/shared`.
- **`docs/staging-setup.md`**: every "Web + Worker" phrasing updated to
reflect the single Vercel deployment.

## Test plan

- `grep` sweep confirms no remaining references to `apps/worker/`,
`pnpm@9.15.0`, `751 TS`, `185 tests`, `566 tests`, `Node.js 22+`
(without the updated Node 24 context), or "Web + Worker" as a claim
about current state.
- Only remaining mentions of `apps/worker/` are in explanatory sentences
describing the removal (CLAUDE.md, deployment.md, git-conventions.md).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant