⚡ Bolt: optimize quota calculation and Claude detection by Laisky · Pull Request #338 · Laisky/one-api

Laisky · 2026-04-24T16:14:22Z

This PR introduces performance optimizations in the quota calculation hot path:

Consolidates completion ratio resolution in Compute to avoid redundant lookups.
Eliminates manual tier iteration loops by deriving effective completion ratios from eff.OutputRatio / eff.InputRatio.
Optimizes containsASCIIFold (used for Claude detection) by pre-calculating search character variants, avoiding redundant function calls in the loop.

These changes improve efficiency and reduce CPU overhead for core billing operations. Verified with go test ./relay/quota/....

PR created automatically by Jules for task 1781255884396622314 started by @Laisky

Summary by CodeRabbit

Documentation
- Added documentation for quota performance optimizations.
Refactor
- Optimized quota completion ratio calculation efficiency.
- Improved string search performance through enhanced processing.

Consolidate ratio resolution and optimize string search in quota.go. Derive tiered completion ratio from eff ratios to avoid manual looping. Optimize containsASCIIFold for faster Claude detection. Co-authored-by: Laisky <4532436+Laisky@users.noreply.github.com>

google-labs-jules · 2026-04-24T16:14:23Z

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.

For security, I will only act on instructions from the user who triggered this task.

coderabbitai · 2026-04-24T16:14:35Z

📝 Walkthrough

Walkthrough

This PR optimizes the quota computation hot path by inlining completion ratio resolution and deriving it from effective pricing ratios (eff.OutputRatio / eff.InputRatio) instead of iterating through tiers. It also optimizes ASCII-fold string detection via precomputed uppercase variants and fast byte-matching.

Changes

Cohort / File(s)	Summary
Documentation `.jules/bolt.md`	Added documentation section describing completion ratio optimization and ASCII-fold detection improvements on the quota hot path.
Quota Computation Logic `relay/quota/quota.go`	Refactored `Compute` to inline `completionRatioResolved` derivation with priority fallback (channel ratio → model config → three-layer pricing), replaced tier-iteration completion ratio calculation with direct ratio derivation (`eff.OutputRatio / eff.InputRatio`), and optimized `containsASCIIFold` with precomputed uppercase byte and fast path matching.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

⚡ Bolt: optimize quota calculation hot path #319: Modifies the same relay/quota/quota.go Compute logic to derive completion ratio from effective pricing using the same approach.
⚡ Bolt: optimize global pricing lookups to avoid redundant cloning #305: Related through completion-ratio resolution changes; this PR inlines resolution while that PR modifies the fallback three-layer lookup APIs.
⚡ Bolt: optimize three-layer pricing lookup redundancy #307: Connected via quota computation's use of three-layer completion-ratio logic and its performance optimization.

Poem

🐰 Ratios dance in pricing's gleam,
Where tiers once toiled in loops extreme,
Now effective paths shine bright and fast,
No fold too costly, folds unsurpassed! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main changes: optimizing quota calculation and Claude detection through performance improvements in the hot path.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch bolt-optimize-quota-calculation-1781255884396622314

Warning

Review ran into problems

🔥 Problems

Timed out fetching pipeline failures after 30000ms

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-04-24T16:16:56Z

Merging this branch will not change overall coverage

Impacted Packages	Coverage Δ	🤖
github.com/Laisky/one-api/relay/quota	0.00% (ø)

Coverage by file

Changed files (no unit tests)

Changed File	Coverage Δ	Total	Covered	Missed	🤖
github.com/Laisky/one-api/relay/quota/quota.go	0.00% (ø)	0	0	0

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

gemini-code-assist

Code Review

This pull request optimizes the quota calculation hot path in 'relay/quota/quota.go' by streamlining tier resolution and introducing a fast-path byte search in 'containsASCIIFold'. While the changes improve performance, the reviewer identified a bug in the new logic where manual ratio overrides are ignored when the input ratio is zero. A code suggestion has been provided to correctly handle these overrides.

gemini-code-assist · 2026-04-24T16:17:23Z

+		if eff.InputRatio != 0 {
+			usedCompletionRatio = eff.OutputRatio / eff.InputRatio
 		} else {
-			baseComp = 1.0
+			usedCompletionRatio = 1.0
 		}


The optimization of deriving usedCompletionRatio via eff.OutputRatio / eff.InputRatio introduces a bug when the model is free in the configuration (eff.InputRatio == 0) but has a manual ratio override (hasChannelModelRatioOverride is true). In this scenario, the division is skipped and the ratio defaults to 1.0, whereas it should use the completionRatioResolved value (which respects channel-level or global defaults). This leads to incorrect quota calculation for completion tokens on overridden free models.

Suggested change

if eff.InputRatio != 0 {

usedCompletionRatio = eff.OutputRatio / eff.InputRatio

} else {

baseComp = 1.0

usedCompletionRatio = 1.0

}

if eff.InputRatio != 0 {

usedCompletionRatio = eff.OutputRatio / eff.InputRatio

} else if hasChannelModelRatioOverride {

usedCompletionRatio = completionRatioResolved

} else {

usedCompletionRatio = 1.0

}

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR optimizes the quota computation hot path by reducing redundant completion-ratio lookups and eliminating manual tier iteration, and it speeds up Claude model detection via a faster ASCII-fold substring check.

Changes:

Consolidate completion ratio resolution in Compute to reduce repeated lookups.
Derive effective tier completion ratios from eff.OutputRatio / eff.InputRatio to avoid tier loops.
Optimize containsASCIIFold by precomputing the first-byte case variants to reduce per-iteration work.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
relay/quota/quota.go	Optimizes quota ratio resolution and speeds up ASCII-fold substring matching used in model detection.
.jules/bolt.md	Documents the performance learnings and actions taken in this optimization pass.

Copilot · 2026-04-24T16:18:11Z

+	// substr is already expected to be lowercase from the caller (isClaudeModelName).
+	// We pre-calculate the uppercase variant of the first byte to allow a fast search
+	// that avoids calling asciiLower on every character in the model name string.
+	firstLower := substr[0]
+	var firstUpper byte
+	if firstLower >= 'a' && firstLower <= 'z' {
+		firstUpper = firstLower - ('a' - 'A')
+	} else {
+		firstUpper = firstLower
+	}
+
 	last := len(s) - len(substr)
 	for i := 0; i <= last; i++ {
-		if asciiLower(s[i]) != substr[0] {
+		// Fast path: match the first byte against both possible cases.
+		if s[i] != firstLower && s[i] != firstUpper {
 			continue
 		}


containsASCIIFold implies case-insensitive matching, but the implementation assumes substr is already lowercase (and will fail if substr[0] is uppercase). To make the helper robust and self-contained, normalize the first search byte inside this function (e.g., compute firstLower from the folded/normalized value of substr[0]) and derive firstUpper from that; alternatively, rename the function to reflect the lowercase-needle requirement so callers don’t accidentally misuse it.

Copilot · 2026-04-24T16:18:11Z

 // Compute calculates the quota required for the provided usage snapshot.
-// It mirrors the logic used in controller helper functions so streaming
-// billing and final reconciliation share the same pricing semantics.
 func Compute(input ComputeInput) ComputeResult {


The removed doc comment lines explained an important invariant: Compute mirrors controller helper logic so streaming billing and final reconciliation share pricing semantics. Consider restoring that explanation (or an equivalent) since it’s a key behavioral contract for a billing hot path and helps prevent accidental divergence in future changes.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

relay/quota/quota.go (1)

230-247: 🛠️ Refactor suggestion | 🟠 Major

Remove orphaned helper resolveCompletionRatio—unused after inlining into Compute.

This unexported function (lines 230–247) was inlined into Compute and is now dead code with no remaining callers. Remove it to keep the package surface tight and avoid confusion.

Removal diff

-// resolveCompletionRatio returns the effective completion ratio for modelName.
-// Parameters: resolvedModelCfg is the already-loaded model config and hasResolvedModelCfg reports whether it exists.
-// Returns: the completion ratio after applying channel overrides, config values, and fallback pricing lookup.
-func resolveCompletionRatio(
-	modelName string,
-	resolvedModelCfg adaptor.ModelConfig,
-	hasResolvedModelCfg bool,
-	channelOverrides map[string]float64,
-	provider adaptor.Adaptor,
-) float64 {
-	if override, ok := channelOverrides[modelName]; ok {
-		return override
-	}
-	if hasResolvedModelCfg && resolvedModelCfg.CompletionRatio != 0 {
-		return resolvedModelCfg.CompletionRatio
-	}
-	return pricing.GetCompletionRatioWithThreeLayers(modelName, channelOverrides, provider)
-}
-

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@relay/quota/quota.go` around lines 230 - 247, Delete the now-unused helper
function resolveCompletionRatio (the unexported function that takes modelName,
resolvedModelCfg adaptor.ModelConfig, hasResolvedModelCfg, channelOverrides,
provider) from quota.go; ensure any remaining logic is served by the inlined
code in Compute and that there are no lingering references to
resolveCompletionRatio or calls to pricing.GetCompletionRatioWithThreeLayers
left elsewhere; remove the entire function block to keep the package surface
minimal and run a quick build to confirm no missing-symbol errors.

🧹 Nitpick comments (2)

relay/quota/quota.go (1)
96-100: Consider preserving completionRatioResolved in the zero-InputRatio fallback.

The derivation eff.OutputRatio / eff.InputRatio is mathematically equivalent to the previous tier-iteration (since eff.OutputRatio = in * comp in ResolveEffectivePricingFromConfig), so the optimization is correct. However, when eff.InputRatio == 0 you now hard-code usedCompletionRatio = 1.0, whereas the previous default was completionRatioResolved (set at line 86). Billing isn't affected in that branch because usedModelRatio is also 0, but the debug log at lines 199–207 will report completion_ratio=1.0 instead of the configured ratio, degrading diagnosability.
♻️ Proposed fix
-		if eff.InputRatio != 0 {
-			usedCompletionRatio = eff.OutputRatio / eff.InputRatio
-		} else {
-			usedCompletionRatio = 1.0
-		}
+		if eff.InputRatio != 0 {
+			usedCompletionRatio = eff.OutputRatio / eff.InputRatio
+		}
+		// else: keep the initial completionRatioResolved for accurate logging
Apply the same change to the adaptor-only branch at lines 107–111.
Also applies to: 107-111
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@relay/quota/quota.go` around lines 96 - 100, The zero-InputRatio fallback
currently assigns usedCompletionRatio = 1.0 which hides the configured
completionRatioResolved value in logs; instead, when eff.InputRatio == 0 set
usedCompletionRatio = completionRatioResolved so diagnostics reflect the
configured ratio; apply this fix in both places where usedCompletionRatio is
computed (the main branch using eff.InputRatio and the adaptor-only branch) and
keep the rest of the logic (usedModelRatio and
ResolveEffectivePricingFromConfig) unchanged.
.jules/bolt.md (1)
42-44: Nit: add a blank line after the heading for consistency.

Every other entry in this file separates the ## heading from the **Learning:** line with a blank line; this new entry does not.
♻️ Proposed fix
 ## 2026-04-24 - [Streamline Quota Calculation and Fast ASCII Fold]
+
 **Learning:** The quota calculation hot path in `relay/quota/quota.go` was performing manual tier iterations to resolve completion ratios, which is redundant since `eff.OutputRatio / eff.InputRatio` already provides the effective ratio for the matched tier. Additionally, string searches like `containsASCIIFold` (used for Claude detection) can be optimized by pre-calculating uppercase/lowercase variants of the first search byte to avoid expensive normalization calls in the search loop.
 **Action:** Consolidate ratio resolution by deriving tiered completion ratios from effective pricing results and use fast-path byte matching in string search functions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.jules/bolt.md around lines 42 - 44, Add a single blank line immediately
after the "## 2026-04-24 - [Streamline Quota Calculation and Fast ASCII Fold]"
heading so the heading is separated from the subsequent "**Learning:**"
paragraph, matching the file's established spacing convention for other entries.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@relay/quota/quota.go`:
- Around line 230-247: Delete the now-unused helper function
resolveCompletionRatio (the unexported function that takes modelName,
resolvedModelCfg adaptor.ModelConfig, hasResolvedModelCfg, channelOverrides,
provider) from quota.go; ensure any remaining logic is served by the inlined
code in Compute and that there are no lingering references to
resolveCompletionRatio or calls to pricing.GetCompletionRatioWithThreeLayers
left elsewhere; remove the entire function block to keep the package surface
minimal and run a quick build to confirm no missing-symbol errors.

---

Nitpick comments:
In @.jules/bolt.md:
- Around line 42-44: Add a single blank line immediately after the "##
2026-04-24 - [Streamline Quota Calculation and Fast ASCII Fold]" heading so the
heading is separated from the subsequent "**Learning:**" paragraph, matching the
file's established spacing convention for other entries.

In `@relay/quota/quota.go`:
- Around line 96-100: The zero-InputRatio fallback currently assigns
usedCompletionRatio = 1.0 which hides the configured completionRatioResolved
value in logs; instead, when eff.InputRatio == 0 set usedCompletionRatio =
completionRatioResolved so diagnostics reflect the configured ratio; apply this
fix in both places where usedCompletionRatio is computed (the main branch using
eff.InputRatio and the adaptor-only branch) and keep the rest of the logic
(usedModelRatio and ResolveEffectivePricingFromConfig) unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4b52b8e6-073d-4103-be92-589d341b448c

📥 Commits

Reviewing files that changed from the base of the PR and between fd96426 and 234a147.

📒 Files selected for processing (2)

.jules/bolt.md
relay/quota/quota.go

Copilot AI review requested due to automatic review settings April 24, 2026 16:14

Copilot started reviewing on behalf of Laisky April 24, 2026 16:14 View session

gemini-code-assist Bot reviewed Apr 24, 2026

View reviewed changes

Copilot AI reviewed Apr 24, 2026

View reviewed changes

coderabbitai Bot reviewed Apr 24, 2026

View reviewed changes

Uh oh!

Conversation

Laisky commented Apr 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

google-labs-jules Bot commented Apr 24, 2026

Uh oh!

coderabbitai Bot commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Review ran into problems

Uh oh!

github-actions Bot commented Apr 24, 2026

Merging this branch will not change overall coverage

Changed files (no unit tests)

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Laisky commented Apr 24, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 24, 2026 •

edited

Loading