feat: add opt-out flags for token negative cache and rate limiting #121
Conversation
Add two new configuration flags (defaulting to enabled for backward compatibility):

- `ENABLE_TOKEN_NEGATIVE_CACHE`: controls the in-memory `_missing_tokens` TTLCache (24h TTL) in `TokenStore`
- `ENABLE_TOKEN_RATE_LIMIT`: controls the middleware that short-circuits requests for missing tokens with 401/429

These flags allow operators to disable these features in multi-replica deployments where the per-process caches cause issues:

1. `_missing_tokens` is local to each replica, so a token cached as missing on one replica will be rejected for up to 24h even after it becomes valid in Redis (e.g. a newly registered user)
2. The rate-limiting middleware uses `request.client.host` for IP tracking, which behind a reverse proxy (e.g. Cloudflare) resolves to the proxy IP, causing all users to share the same rate limit
3. The 401/429 responses are returned before `CORSMiddleware` runs, so browsers report them as CORS errors rather than the actual error

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces new configuration flags to allow operators to disable the in-memory token negative cache and the token rate-limiting middleware. These changes aim to resolve issues encountered in multi-replica, Cloudflare-proxied Kubernetes environments, such as stale negative cache entries across replicas, inaccurate IP-based rate limiting, and CORS errors masking actual failures. By providing opt-out mechanisms, the system becomes more flexible for deployments where infrastructure already handles these concerns.
Code Review
This pull request introduces two optional feature flags, ENABLE_TOKEN_NEGATIVE_CACHE and ENABLE_TOKEN_RATE_LIMIT, to give operators more control over token caching and rate limiting behavior. This is a well-reasoned change to address issues in multi-replica deployments. The implementation is straightforward and correctly uses the new settings to conditionally enable/disable the respective functionalities. I have one suggestion to improve the readability of the negative cache check.
```python
if settings.ENABLE_TOKEN_NEGATIVE_CACHE:
    try:
        if token in self._missing_tokens:
            logger.debug(f"[REDIS] Negative cache hit for missing token {token}")
            return None
    except Exception as e:
        logger.debug(f"Failed to check negative cache for {token}: {e}")
```
To improve readability and reduce nesting, you can combine the feature flag check with the cache lookup. Python's `and` operator uses short-circuit evaluation, so `token in self._missing_tokens` will only be evaluated if `settings.ENABLE_TOKEN_NEGATIVE_CACHE` is `True`, making this change safe.
```python
try:
    if settings.ENABLE_TOKEN_NEGATIVE_CACHE and token in self._missing_tokens:
        logger.debug(f"[REDIS] Negative cache hit for missing token {token}")
        return None
except Exception as e:
    logger.debug(f"Failed to check negative cache for {token}: {e}")
```
Pull request overview
Adds operator-controlled feature flags to disable (a) the in-process “missing token” negative cache and (b) the middleware that short-circuits requests for known-missing tokens, to better support multi-replica/proxied deployments.
Changes:
- Introduces `ENABLE_TOKEN_NEGATIVE_CACHE` and `ENABLE_TOKEN_RATE_LIMIT` settings (both default `True`; a config sketch follows this list).
- Gates `_missing_tokens` negative-cache reads/writes in `TokenStore.get_user_data()` behind `ENABLE_TOKEN_NEGATIVE_CACHE`.
- Adds an early pass-through in `block_missing_token_middleware` when `ENABLE_TOKEN_RATE_LIMIT` is disabled.
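For reference, a minimal sketch of how the two flags might be declared in `app/core/config.py`; pydantic-settings style is assumed here, and the repo's actual settings base class may differ:

```python
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    # Both default to True so existing deployments keep the current behavior;
    # operators opt out via environment variables of the same name.
    ENABLE_TOKEN_NEGATIVE_CACHE: bool = True
    ENABLE_TOKEN_RATE_LIMIT: bool = True

settings = Settings()
```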
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| app/services/token_store.py | Adds a settings gate for negative-cache behavior inside get_user_data(). |
| app/core/config.py | Defines the new settings flags with defaults. |
| app/core/app.py | Adds a settings gate to bypass the missing-token blocking/rate-limit middleware. |
Comments suppressed due to low confidence (1)
app/core/app.py:78
- block_missing_token_middleware only triggers when the token is present in token_store._missing_tokens. But _missing_tokens is only populated when ENABLE_TOKEN_NEGATIVE_CACHE is enabled, so setting ENABLE_TOKEN_NEGATIVE_CACHE=false while leaving ENABLE_TOKEN_RATE_LIMIT=true effectively disables this middleware’s behavior (without making that dependency explicit). Consider either gating this middleware on ENABLE_TOKEN_NEGATIVE_CACHE as well, or changing the middleware to determine “missing token” in a way that doesn’t rely on the negative cache (or at least documenting the coupling).
```python
path = request.url.path.lstrip("/")
seg = path.split("/", 1)[0] if path else ""
try:
    # If token is known-missing, short-circuit and track IP failures
    if seg and seg in token_store._missing_tokens:
```
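One way to make the coupling explicit is to require both flags before the middleware does any work. The following is a minimal sketch, not the PR's code; `settings` and `token_store` are stubbed here to stand in for the repo's objects:

```python
from types import SimpleNamespace

from fastapi import FastAPI, Request

# Stand-ins for the repo's settings and token_store (assumed shapes):
settings = SimpleNamespace(ENABLE_TOKEN_RATE_LIMIT=True, ENABLE_TOKEN_NEGATIVE_CACHE=True)
token_store = SimpleNamespace(_missing_tokens=set())

app = FastAPI()

@app.middleware("http")
async def block_missing_token_middleware(request: Request, call_next):
    # The rate limiter can only see "missing" tokens if the negative cache
    # is populated, so pass through unless both features are enabled.
    if not (settings.ENABLE_TOKEN_RATE_LIMIT and settings.ENABLE_TOKEN_NEGATIVE_CACHE):
        return await call_next(request)
    path = request.url.path.lstrip("/")
    seg = path.split("/", 1)[0] if path else ""
    if seg and seg in token_store._missing_tokens:
        ...  # short-circuit with 401/429 and track IP failures, as in the PR
    return await call_next(request)
```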
| logger.debug(f"Failed to check negative cache for {token}: {e}") | ||
|
|
||
| logger.debug(f"[REDIS] Cache miss. Fetching data from redis for {token}") | ||
| key = self._format_key(token) |
There was a problem hiding this comment.
These debug logs include the raw token value. Tokens appear to be treated as sensitive elsewhere (e.g., redact_token(...) in other log lines), so logging the full token can leak credentials into log aggregation. Consider using redact_token(token) (or removing the token entirely) in these messages.
```python
if token in self._missing_tokens:
    logger.debug(f"[REDIS] Negative cache hit for missing token {token}")
    return None
```
The negative-cache hit log line prints the full token value, which can leak credentials into logs. Please redact the token (e.g., via redact_token(token)) or avoid including it in log messages.
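For illustration, a hypothetical redaction helper in the spirit of both comments; the repo's actual `redact_token` may behave differently:

```python
def redact_token(token: str, keep: int = 4) -> str:
    # Keep a short prefix for log correlation; never emit the full value.
    if len(token) <= keep:
        return "****"
    return f"{token[:keep]}***"

# Applied to the flagged line:
# logger.debug(f"[REDIS] Negative cache hit for missing token {redact_token(token)}")
```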
```diff
 @alru_cache(maxsize=2000, ttl=43200)
 async def get_user_data(self, token: str) -> dict[str, Any] | None:
     # Short-circuit for tokens known to be missing
-    try:
-        if token in self._missing_tokens:
-            logger.debug(f"[REDIS] Negative cache hit for missing token {token}")
-            return None
-    except Exception as e:
-        logger.debug(f"Failed to check negative cache for {token}: {e}")
+    if settings.ENABLE_TOKEN_NEGATIVE_CACHE:
+        try:
```
With @alru_cache on get_user_data, results are cached per token at this layer. That means disabling ENABLE_TOKEN_NEGATIVE_CACHE may still leave a replica returning a cached “missing” (None) response for some period, so it may not achieve the goal of always re-checking Redis for previously-missing tokens in multi-replica deployments. Consider conditionally bypassing the alru_cache when negative caching is disabled, or ensuring that missing-token results are not cached / have a very short TTL.
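One possible approach is to keep the `alru_cache` but evict memoized misses so they are always re-checked. A minimal sketch, assuming async-lru 2.x (whose cached wrapper exposes `cache_invalidate()`); the fetch helper is a hypothetical stand-in for the real Redis lookup:

```python
from typing import Any

from async_lru import alru_cache

async def fetch_user_data_from_redis(token: str) -> dict[str, Any] | None:
    ...  # hypothetical stand-in for the real Redis lookup

@alru_cache(maxsize=2000, ttl=43200)
async def _cached_user_data(token: str) -> dict[str, Any] | None:
    return await fetch_user_data_from_redis(token)

async def get_user_data(token: str) -> dict[str, Any] | None:
    data = await _cached_user_data(token)
    if data is None:
        # Evict the memoized miss so the next call hits Redis again.
        _cached_user_data.cache_invalidate(token)
    return data
```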
Summary
- `ENABLE_TOKEN_NEGATIVE_CACHE` (default `True`) to control the in-memory `_missing_tokens` TTLCache in `TokenStore`
- `ENABLE_TOKEN_RATE_LIMIT` (default `True`) to control the middleware that short-circuits requests for missing tokens with 401/429

Both default to enabled, so existing deployments are unaffected.
Motivation
We deploy Watchly in a Cloudflare-proxied, multi-replica Kubernetes environment and hit several issues with the current token caching and rate-limiting:
1. **Stale negative cache across replicas:** `_missing_tokens` is a per-process `TTLCache` with a 24h TTL. In a multi-replica setup, a token cached as missing on one replica will be rejected for up to 24 hours even after it becomes valid in Redis (e.g. a newly registered user). This causes intermittent 401 errors depending on which replica handles the request.
2. **Incorrect IP detection behind proxies:** The rate-limiting middleware uses `request.client.host`, which behind Cloudflare (or any reverse proxy) resolves to the proxy's IP rather than the real client. This means all users share the same rate-limit counter: 8 failures from anyone triggers 429s for everyone on that replica.
3. **CORS errors masking real errors:** Both the 401 and 429 responses are returned as `HTMLResponse` before `call_next` is invoked, so `CORSMiddleware` never adds CORS headers. Browsers report these as CORS failures rather than showing the actual error, making debugging very difficult.

These flags let operators disable these features where the infrastructure already handles the concerns differently (e.g. Redis for shared state, Cloudflare for rate limiting).
Test plan
- Confirm defaults leave behavior unchanged (`ENABLE_TOKEN_NEGATIVE_CACHE` and `ENABLE_TOKEN_RATE_LIMIT` both default to `True`)
- Set `ENABLE_TOKEN_NEGATIVE_CACHE=false` and confirm tokens are always looked up in Redis
- Set `ENABLE_TOKEN_RATE_LIMIT=false` and confirm the middleware passes all requests through (see the sketch below)

🤖 Generated with Claude Code
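A sketch of how the pass-through check might be automated, using pytest and FastAPI's `TestClient`; the `create_app` factory, import paths, and request path are illustrative, not taken from the PR:

```python
from fastapi.testclient import TestClient

from app.core.app import create_app   # assumed factory; the actual entrypoint may differ
from app.core.config import settings  # assumed import path

def test_middleware_passes_through_when_disabled(monkeypatch):
    monkeypatch.setattr(settings, "ENABLE_TOKEN_RATE_LIMIT", False)
    client = TestClient(create_app())
    # With the flag off, a request carrying an unknown token segment should
    # reach normal routing instead of being short-circuited with 401/429.
    resp = client.get("/definitely-missing-token/feeds")  # illustrative path
    assert resp.status_code not in (401, 429)
```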