⚡ Bolt: optimize quota calculation and Claude detection#338
Conversation
Consolidate ratio resolution and optimize string search in quota.go. Derive tiered completion ratio from eff ratios to avoid manual looping. Optimize containsASCIIFold for faster Claude detection. Co-authored-by: Laisky <4532436+Laisky@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
📝 WalkthroughWalkthroughThis PR optimizes the quota computation hot path by inlining completion ratio resolution and deriving it from effective pricing ratios ( Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsTimed out fetching pipeline failures after 30000ms Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Merging this branch will not change overall coverage
Coverage by fileChanged files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. |
There was a problem hiding this comment.
Code Review
This pull request optimizes the quota calculation hot path in 'relay/quota/quota.go' by streamlining tier resolution and introducing a fast-path byte search in 'containsASCIIFold'. While the changes improve performance, the reviewer identified a bug in the new logic where manual ratio overrides are ignored when the input ratio is zero. A code suggestion has been provided to correctly handle these overrides.
| if eff.InputRatio != 0 { | ||
| usedCompletionRatio = eff.OutputRatio / eff.InputRatio | ||
| } else { | ||
| baseComp = 1.0 | ||
| usedCompletionRatio = 1.0 | ||
| } |
There was a problem hiding this comment.
The optimization of deriving usedCompletionRatio via eff.OutputRatio / eff.InputRatio introduces a bug when the model is free in the configuration (eff.InputRatio == 0) but has a manual ratio override (hasChannelModelRatioOverride is true). In this scenario, the division is skipped and the ratio defaults to 1.0, whereas it should use the completionRatioResolved value (which respects channel-level or global defaults). This leads to incorrect quota calculation for completion tokens on overridden free models.
| if eff.InputRatio != 0 { | |
| usedCompletionRatio = eff.OutputRatio / eff.InputRatio | |
| } else { | |
| baseComp = 1.0 | |
| usedCompletionRatio = 1.0 | |
| } | |
| if eff.InputRatio != 0 { | |
| usedCompletionRatio = eff.OutputRatio / eff.InputRatio | |
| } else if hasChannelModelRatioOverride { | |
| usedCompletionRatio = completionRatioResolved | |
| } else { | |
| usedCompletionRatio = 1.0 | |
| } |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR optimizes the quota computation hot path by reducing redundant completion-ratio lookups and eliminating manual tier iteration, and it speeds up Claude model detection via a faster ASCII-fold substring check.
Changes:
- Consolidate completion ratio resolution in
Computeto reduce repeated lookups. - Derive effective tier completion ratios from
eff.OutputRatio / eff.InputRatioto avoid tier loops. - Optimize
containsASCIIFoldby precomputing the first-byte case variants to reduce per-iteration work.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| relay/quota/quota.go | Optimizes quota ratio resolution and speeds up ASCII-fold substring matching used in model detection. |
| .jules/bolt.md | Documents the performance learnings and actions taken in this optimization pass. |
| // substr is already expected to be lowercase from the caller (isClaudeModelName). | ||
| // We pre-calculate the uppercase variant of the first byte to allow a fast search | ||
| // that avoids calling asciiLower on every character in the model name string. | ||
| firstLower := substr[0] | ||
| var firstUpper byte | ||
| if firstLower >= 'a' && firstLower <= 'z' { | ||
| firstUpper = firstLower - ('a' - 'A') | ||
| } else { | ||
| firstUpper = firstLower | ||
| } | ||
|
|
||
| last := len(s) - len(substr) | ||
| for i := 0; i <= last; i++ { | ||
| if asciiLower(s[i]) != substr[0] { | ||
| // Fast path: match the first byte against both possible cases. | ||
| if s[i] != firstLower && s[i] != firstUpper { | ||
| continue | ||
| } |
There was a problem hiding this comment.
containsASCIIFold implies case-insensitive matching, but the implementation assumes substr is already lowercase (and will fail if substr[0] is uppercase). To make the helper robust and self-contained, normalize the first search byte inside this function (e.g., compute firstLower from the folded/normalized value of substr[0]) and derive firstUpper from that; alternatively, rename the function to reflect the lowercase-needle requirement so callers don’t accidentally misuse it.
| // Compute calculates the quota required for the provided usage snapshot. | ||
| // It mirrors the logic used in controller helper functions so streaming | ||
| // billing and final reconciliation share the same pricing semantics. | ||
| func Compute(input ComputeInput) ComputeResult { |
There was a problem hiding this comment.
The removed doc comment lines explained an important invariant: Compute mirrors controller helper logic so streaming billing and final reconciliation share pricing semantics. Consider restoring that explanation (or an equivalent) since it’s a key behavioral contract for a billing hot path and helps prevent accidental divergence in future changes.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
relay/quota/quota.go (1)
230-247: 🛠️ Refactor suggestion | 🟠 MajorRemove orphaned helper
resolveCompletionRatio—unused after inlining intoCompute.This unexported function (lines 230–247) was inlined into
Computeand is now dead code with no remaining callers. Remove it to keep the package surface tight and avoid confusion.Removal diff
-// resolveCompletionRatio returns the effective completion ratio for modelName. -// Parameters: resolvedModelCfg is the already-loaded model config and hasResolvedModelCfg reports whether it exists. -// Returns: the completion ratio after applying channel overrides, config values, and fallback pricing lookup. -func resolveCompletionRatio( - modelName string, - resolvedModelCfg adaptor.ModelConfig, - hasResolvedModelCfg bool, - channelOverrides map[string]float64, - provider adaptor.Adaptor, -) float64 { - if override, ok := channelOverrides[modelName]; ok { - return override - } - if hasResolvedModelCfg && resolvedModelCfg.CompletionRatio != 0 { - return resolvedModelCfg.CompletionRatio - } - return pricing.GetCompletionRatioWithThreeLayers(modelName, channelOverrides, provider) -} -🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@relay/quota/quota.go` around lines 230 - 247, Delete the now-unused helper function resolveCompletionRatio (the unexported function that takes modelName, resolvedModelCfg adaptor.ModelConfig, hasResolvedModelCfg, channelOverrides, provider) from quota.go; ensure any remaining logic is served by the inlined code in Compute and that there are no lingering references to resolveCompletionRatio or calls to pricing.GetCompletionRatioWithThreeLayers left elsewhere; remove the entire function block to keep the package surface minimal and run a quick build to confirm no missing-symbol errors.
🧹 Nitpick comments (2)
relay/quota/quota.go (1)
96-100: Consider preservingcompletionRatioResolvedin the zero-InputRatio fallback.The derivation
eff.OutputRatio / eff.InputRatiois mathematically equivalent to the previous tier-iteration (sinceeff.OutputRatio = in * compinResolveEffectivePricingFromConfig), so the optimization is correct. However, wheneff.InputRatio == 0you now hard-codeusedCompletionRatio = 1.0, whereas the previous default wascompletionRatioResolved(set at line 86). Billing isn't affected in that branch becauseusedModelRatiois also 0, but the debug log at lines 199–207 will reportcompletion_ratio=1.0instead of the configured ratio, degrading diagnosability.♻️ Proposed fix
- if eff.InputRatio != 0 { - usedCompletionRatio = eff.OutputRatio / eff.InputRatio - } else { - usedCompletionRatio = 1.0 - } + if eff.InputRatio != 0 { + usedCompletionRatio = eff.OutputRatio / eff.InputRatio + } + // else: keep the initial completionRatioResolved for accurate loggingApply the same change to the adaptor-only branch at lines 107–111.
Also applies to: 107-111
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@relay/quota/quota.go` around lines 96 - 100, The zero-InputRatio fallback currently assigns usedCompletionRatio = 1.0 which hides the configured completionRatioResolved value in logs; instead, when eff.InputRatio == 0 set usedCompletionRatio = completionRatioResolved so diagnostics reflect the configured ratio; apply this fix in both places where usedCompletionRatio is computed (the main branch using eff.InputRatio and the adaptor-only branch) and keep the rest of the logic (usedModelRatio and ResolveEffectivePricingFromConfig) unchanged..jules/bolt.md (1)
42-44: Nit: add a blank line after the heading for consistency.Every other entry in this file separates the
##heading from the**Learning:**line with a blank line; this new entry does not.♻️ Proposed fix
## 2026-04-24 - [Streamline Quota Calculation and Fast ASCII Fold] + **Learning:** The quota calculation hot path in `relay/quota/quota.go` was performing manual tier iterations to resolve completion ratios, which is redundant since `eff.OutputRatio / eff.InputRatio` already provides the effective ratio for the matched tier. Additionally, string searches like `containsASCIIFold` (used for Claude detection) can be optimized by pre-calculating uppercase/lowercase variants of the first search byte to avoid expensive normalization calls in the search loop. **Action:** Consolidate ratio resolution by deriving tiered completion ratios from effective pricing results and use fast-path byte matching in string search functions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.jules/bolt.md around lines 42 - 44, Add a single blank line immediately after the "## 2026-04-24 - [Streamline Quota Calculation and Fast ASCII Fold]" heading so the heading is separated from the subsequent "**Learning:**" paragraph, matching the file's established spacing convention for other entries.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@relay/quota/quota.go`:
- Around line 230-247: Delete the now-unused helper function
resolveCompletionRatio (the unexported function that takes modelName,
resolvedModelCfg adaptor.ModelConfig, hasResolvedModelCfg, channelOverrides,
provider) from quota.go; ensure any remaining logic is served by the inlined
code in Compute and that there are no lingering references to
resolveCompletionRatio or calls to pricing.GetCompletionRatioWithThreeLayers
left elsewhere; remove the entire function block to keep the package surface
minimal and run a quick build to confirm no missing-symbol errors.
---
Nitpick comments:
In @.jules/bolt.md:
- Around line 42-44: Add a single blank line immediately after the "##
2026-04-24 - [Streamline Quota Calculation and Fast ASCII Fold]" heading so the
heading is separated from the subsequent "**Learning:**" paragraph, matching the
file's established spacing convention for other entries.
In `@relay/quota/quota.go`:
- Around line 96-100: The zero-InputRatio fallback currently assigns
usedCompletionRatio = 1.0 which hides the configured completionRatioResolved
value in logs; instead, when eff.InputRatio == 0 set usedCompletionRatio =
completionRatioResolved so diagnostics reflect the configured ratio; apply this
fix in both places where usedCompletionRatio is computed (the main branch using
eff.InputRatio and the adaptor-only branch) and keep the rest of the logic
(usedModelRatio and ResolveEffectivePricingFromConfig) unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4b52b8e6-073d-4103-be92-589d341b448c
📒 Files selected for processing (2)
.jules/bolt.mdrelay/quota/quota.go
This PR introduces performance optimizations in the quota calculation hot path:
Computeto avoid redundant lookups.eff.OutputRatio / eff.InputRatio.containsASCIIFold(used for Claude detection) by pre-calculating search character variants, avoiding redundant function calls in the loop.These changes improve efficiency and reduce CPU overhead for core billing operations. Verified with
go test ./relay/quota/....PR created automatically by Jules for task 1781255884396622314 started by @Laisky
Summary by CodeRabbit
Documentation
Refactor