
feat: add opt-out flags for token negative cache and rate limiting#121

Merged
TimilsinaBimal merged 1 commit into TimilsinaBimal:main from elfhosted:fix/token-cache-opt-out
Mar 23, 2026
Conversation

@funkypenguin
Contributor

Summary

  • Adds ENABLE_TOKEN_NEGATIVE_CACHE (default True) to control the in-memory _missing_tokens TTLCache in TokenStore
  • Adds ENABLE_TOKEN_RATE_LIMIT (default True) to control the middleware that short-circuits requests for missing tokens with 401/429

Both default to enabled, so existing deployments are unaffected.
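A minimal sketch of how opt-out flags like these are typically parsed from the environment (the `env_flag` helper and its parsing rules are illustrative assumptions, not the PR's actual config code):

```python
import os

def env_flag(name: str, default: bool = True) -> bool:
    """Parse a boolean env var; an unset variable keeps the default (enabled)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")

# Both flags default to True, so deployments that set nothing are unaffected.
ENABLE_TOKEN_NEGATIVE_CACHE = env_flag("ENABLE_TOKEN_NEGATIVE_CACHE")
ENABLE_TOKEN_RATE_LIMIT = env_flag("ENABLE_TOKEN_RATE_LIMIT")
```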

Motivation

We deploy Watchly in a Cloudflare-proxied, multi-replica Kubernetes environment and hit several issues with the current token caching and rate-limiting:

  1. Stale negative cache across replicas: _missing_tokens is a per-process TTLCache with a 24h TTL. In a multi-replica setup, a token cached as missing on one replica will be rejected for up to 24 hours even after it becomes valid in Redis (e.g. a newly registered user). This causes intermittent 401 errors depending on which replica handles the request.

  2. Incorrect IP detection behind proxies: The rate-limiting middleware uses request.client.host, which behind Cloudflare (or any reverse proxy) resolves to the proxy's IP rather than the real client. This means all users share the same rate-limit counter — 8 failures from anyone triggers 429s for everyone on that replica.

  3. CORS errors masking real errors: Both the 401 and 429 responses are returned as HTMLResponse before call_next is invoked, so CORSMiddleware never adds CORS headers. Browsers report these as CORS failures rather than showing the actual error, making debugging very difficult.

These flags let operators disable these features where the infrastructure already handles the concerns differently (e.g. Redis for shared state, Cloudflare for rate limiting).
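For point 2 above, one fix would be a proxy-aware client-IP lookup that prefers the headers Cloudflare sets. A hedged sketch (the header precedence is a common convention, not code from this PR, and is only safe when the proxy is trusted to set and strip these headers):

```python
def client_ip(headers: dict[str, str], peer_ip: str) -> str:
    """Resolve the real client IP behind a trusted reverse proxy.

    Prefers Cloudflare's CF-Connecting-IP, then the first hop in
    X-Forwarded-For, and falls back to the TCP peer address.
    Header keys are assumed lowercased, as ASGI frameworks provide them.
    """
    cf = headers.get("cf-connecting-ip")
    if cf:
        return cf.strip()
    xff = headers.get("x-forwarded-for")
    if xff:
        # The first entry is the original client; later hops are proxies.
        return xff.split(",")[0].strip()
    return peer_ip
```

Without something like this, `request.client.host` is the proxy's address, so every user lands in the same rate-limit bucket.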

Test plan

  • Verify default behavior is unchanged (ENABLE_TOKEN_NEGATIVE_CACHE and ENABLE_TOKEN_RATE_LIMIT both default to True)
  • Set ENABLE_TOKEN_NEGATIVE_CACHE=false and confirm tokens are always looked up in Redis
  • Set ENABLE_TOKEN_RATE_LIMIT=false and confirm the middleware passes all requests through
  • Test in a multi-replica environment that newly created tokens work immediately on all replicas
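Assuming the flags are read from the environment as their names suggest, the opt-out steps above amount to:

```shell
# Defaults (both enabled): no env vars needed.
# Opt out of the in-memory negative cache:
export ENABLE_TOKEN_NEGATIVE_CACHE=false
# Opt out of the missing-token rate-limit middleware:
export ENABLE_TOKEN_RATE_LIMIT=false
```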

🤖 Generated with Claude Code

Add two new configuration flags (defaulting to enabled for backward
compatibility):

- ENABLE_TOKEN_NEGATIVE_CACHE: controls the in-memory _missing_tokens
  TTLCache (24h TTL) in TokenStore
- ENABLE_TOKEN_RATE_LIMIT: controls the middleware that short-circuits
  requests for missing tokens with 401/429

These flags allow operators to disable these features in multi-replica
deployments where the per-process caches cause issues:

1. _missing_tokens is local to each replica, so a token cached as
   missing on one replica will be rejected for up to 24h even after
   it becomes valid in Redis (e.g. newly registered user)
2. The rate-limiting middleware uses request.client.host for IP
   tracking, which behind a reverse proxy (e.g. Cloudflare) resolves
   to the proxy IP, causing all users to share the same rate limit
3. The 401/429 responses are returned before CORSMiddleware runs,
   so browsers report them as CORS errors rather than the actual error

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 23, 2026 00:55
@vercel

vercel bot commented Mar 23, 2026

Someone is attempting to deploy a commit to Bimal Timilsina's projects team on Vercel.

A member of the Team first needs to authorize it.

@gemini-code-assist

Summary of Changes


This pull request introduces new configuration flags to allow operators to disable the in-memory token negative cache and the token rate-limiting middleware. These changes aim to resolve issues encountered in multi-replica, Cloudflare-proxied Kubernetes environments, such as stale negative cache entries across replicas, inaccurate IP-based rate limiting, and CORS errors masking actual failures. By providing opt-out mechanisms, the system becomes more flexible for deployments where infrastructure already handles these concerns.

Highlights

  • Opt-out flags for token negative cache: Introduced ENABLE_TOKEN_NEGATIVE_CACHE (default True) to control the in-memory _missing_tokens TTLCache in TokenStore, addressing stale cache issues in multi-replica environments.
  • Opt-out flags for token rate limiting: Added ENABLE_TOKEN_RATE_LIMIT (default True) to control the middleware that short-circuits requests for missing tokens, mitigating issues with incorrect IP detection behind proxies and CORS errors.


gemini-code-assist bot left a comment

Code Review

This pull request introduces two optional feature flags, ENABLE_TOKEN_NEGATIVE_CACHE and ENABLE_TOKEN_RATE_LIMIT, to give operators more control over token caching and rate limiting behavior. This is a well-reasoned change to address issues in multi-replica deployments. The implementation is straightforward and correctly uses the new settings to conditionally enable/disable the respective functionalities. I have one suggestion to improve the readability of the negative cache check.

Comment on lines +233 to +239
if settings.ENABLE_TOKEN_NEGATIVE_CACHE:
    try:
        if token in self._missing_tokens:
            logger.debug(f"[REDIS] Negative cache hit for missing token {token}")
            return None
    except Exception as e:
        logger.debug(f"Failed to check negative cache for {token}: {e}")


Severity: medium

To improve readability and reduce nesting, you can combine the feature flag check with the cache lookup. Python's and operator uses short-circuit evaluation, so token in self._missing_tokens will only be evaluated if settings.ENABLE_TOKEN_NEGATIVE_CACHE is True, making this change safe.

        try:
            if settings.ENABLE_TOKEN_NEGATIVE_CACHE and token in self._missing_tokens:
                logger.debug(f"[REDIS] Negative cache hit for missing token {token}")
                return None
        except Exception as e:
            logger.debug(f"Failed to check negative cache for {token}: {e}")


Copilot AI left a comment


Pull request overview

Adds operator-controlled feature flags to disable (a) the in-process “missing token” negative cache and (b) the middleware that short-circuits requests for known-missing tokens, to better support multi-replica/proxied deployments.

Changes:

  • Introduces ENABLE_TOKEN_NEGATIVE_CACHE and ENABLE_TOKEN_RATE_LIMIT settings (both default True).
  • Gates _missing_tokens negative-cache reads/writes in TokenStore.get_user_data() behind ENABLE_TOKEN_NEGATIVE_CACHE.
  • Adds an early pass-through in block_missing_token_middleware when ENABLE_TOKEN_RATE_LIMIT is disabled.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

  • app/services/token_store.py: Adds a settings gate for negative-cache behavior inside get_user_data().
  • app/core/config.py: Defines the new settings flags with defaults.
  • app/core/app.py: Adds a settings gate to bypass the missing-token blocking/rate-limit middleware.
Comments suppressed due to low confidence (1)

app/core/app.py:78

  • block_missing_token_middleware only triggers when the token is present in token_store._missing_tokens. But _missing_tokens is only populated when ENABLE_TOKEN_NEGATIVE_CACHE is enabled, so setting ENABLE_TOKEN_NEGATIVE_CACHE=false while leaving ENABLE_TOKEN_RATE_LIMIT=true effectively disables this middleware’s behavior (without making that dependency explicit). Consider either gating this middleware on ENABLE_TOKEN_NEGATIVE_CACHE as well, or changing the middleware to determine “missing token” in a way that doesn’t rely on the negative cache (or at least documenting the coupling).
    path = request.url.path.lstrip("/")
    seg = path.split("/", 1)[0] if path else ""
    try:
        # If token is known-missing, short-circuit and track IP failures
        if seg and seg in token_store._missing_tokens:
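One way to make that coupling explicit, per the reviewer's suggestion, is to require both flags in the short-circuit decision. A hypothetical sketch (the helper below is illustrative, not merged code):

```python
def should_short_circuit(
    seg: str,
    missing_tokens: set[str],
    rate_limit_on: bool,
    negative_cache_on: bool,
) -> bool:
    """Decide whether the middleware should block this request.

    Requiring both flags makes the hidden dependency visible: with the
    negative cache disabled, missing_tokens is never populated, so the
    middleware could never match anyway.
    """
    if not (rate_limit_on and negative_cache_on):
        return False
    return bool(seg) and seg in missing_tokens
```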


Comment on lines +239 to 242
            logger.debug(f"Failed to check negative cache for {token}: {e}")

    logger.debug(f"[REDIS] Cache miss. Fetching data from redis for {token}")
    key = self._format_key(token)

Copilot AI Mar 23, 2026


These debug logs include the raw token value. Tokens appear to be treated as sensitive elsewhere (e.g., redact_token(...) in other log lines), so logging the full token can leak credentials into log aggregation. Consider using redact_token(token) (or removing the token entirely) in these messages.
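A redaction helper along the lines the comment suggests might look like this (the project's actual redact_token may differ; this is an illustrative sketch):

```python
def redact_token(token: str, keep: int = 4) -> str:
    """Keep a short prefix for log correlation; mask the rest of the credential."""
    if not token or len(token) <= keep:
        return "****"
    return f"{token[:keep]}****"
```

Logging only a short prefix keeps log lines correlatable per token without exposing anything usable to an attacker reading aggregated logs.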

Comment on lines +235 to +237
if token in self._missing_tokens:
    logger.debug(f"[REDIS] Negative cache hit for missing token {token}")
    return None

Copilot AI Mar 23, 2026


The negative-cache hit log line prints the full token value, which can leak credentials into logs. Please redact the token (e.g., via redact_token(token)) or avoid including it in log messages.

Comment on lines 230 to +234
 @alru_cache(maxsize=2000, ttl=43200)
 async def get_user_data(self, token: str) -> dict[str, Any] | None:
     # Short-circuit for tokens known to be missing
-    try:
-        if token in self._missing_tokens:
-            logger.debug(f"[REDIS] Negative cache hit for missing token {token}")
-            return None
-    except Exception as e:
-        logger.debug(f"Failed to check negative cache for {token}: {e}")
+    if settings.ENABLE_TOKEN_NEGATIVE_CACHE:
+        try:

Copilot AI Mar 23, 2026


With @alru_cache on get_user_data, results are cached per token at this layer. That means disabling ENABLE_TOKEN_NEGATIVE_CACHE may still leave a replica returning a cached “missing” (None) response for some period, so it may not achieve the goal of always re-checking Redis for previously-missing tokens in multi-replica deployments. Consider conditionally bypassing the alru_cache when negative caching is disabled, or ensuring that missing-token results are not cached / have a very short TTL.
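One way to avoid pinning a "missing" result at this layer, as the comment suggests, is a positive-only cache that never stores None, so misses are always re-checked at the source (e.g. Redis). A sketch under that assumption, stdlib only, not the PR's actual alru_cache wiring:

```python
import time

class PositiveOnlyTTLCache:
    """TTL cache that refuses to store None, so a 'missing' lookup result
    is always re-checked at the backing store on the next call."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires, value = entry
        if time.monotonic() >= expires:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key: str, value) -> None:
        if value is None:  # never cache a miss
            return
        self._store[key] = (time.monotonic() + self.ttl, value)
```

A get-through wrapper would then only populate the cache on hits, leaving previously missing tokens free to become valid in Redis without waiting out a TTL.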

@TimilsinaBimal TimilsinaBimal merged commit 4e1c3f6 into TimilsinaBimal:main Mar 23, 2026
4 of 6 checks passed