Add unified cache usage metrics to Usage interface #5657

Open
sobychacko wants to merge 1 commit into spring-projects:main from sobychacko:feature/unified-cache-usage-metrics
@sobychacko (Contributor)

Prompt caching can save up to 90% on input token costs, but today reading cache metrics requires provider-specific code: casting to native SDK types, navigating nested records, or reading from metadata maps, with a different approach for each provider.

Add getCacheReadInputTokens() and getCacheWriteInputTokens() as default methods on the Usage interface, returning null for providers that don't support caching. Update DefaultUsage with new fields, constructor, and JSON serialization. Update UsageCalculator to accumulate cache metrics across streaming chunks and tool-calling loops.
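
The shape described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the `Usage` interface here is a cut-down stand-in, the `Integer` return type for the cache getters is an assumption, and `CachedUsage`, `PlainUsage`, `addNullable`, and `accumulate` are hypothetical names used only to show how `null` ("provider does not report this metric") might be carried through accumulation.

```java
public class CacheUsageSketch {

    /** Cut-down stand-in for the Usage interface (assumption, not the real source). */
    interface Usage {
        Integer getPromptTokens();

        // Defaults return null so existing implementations need no changes.
        default Integer getCacheReadInputTokens() { return null; }

        default Integer getCacheWriteInputTokens() { return null; }
    }

    /** Hypothetical usage for a provider that reports cache metrics. */
    record CachedUsage(Integer promptTokens, Integer cacheRead, Integer cacheWrite) implements Usage {
        @Override public Integer getPromptTokens() { return promptTokens; }
        @Override public Integer getCacheReadInputTokens() { return cacheRead; }
        @Override public Integer getCacheWriteInputTokens() { return cacheWrite; }
    }

    /** Hypothetical usage for a provider without caching; inherits the null defaults. */
    record PlainUsage(Integer promptTokens) implements Usage {
        @Override public Integer getPromptTokens() { return promptTokens; }
    }

    /** Adds two counts, treating null as "not reported" rather than zero. */
    static Integer addNullable(Integer a, Integer b) {
        if (a == null) return b;
        if (b == null) return a;
        return a + b;
    }

    /** Combines two usages, e.g. across streaming chunks or tool-calling rounds. */
    static Usage accumulate(Usage a, Usage b) {
        return new CachedUsage(
                addNullable(a.getPromptTokens(), b.getPromptTokens()),
                addNullable(a.getCacheReadInputTokens(), b.getCacheReadInputTokens()),
                addNullable(a.getCacheWriteInputTokens(), b.getCacheWriteInputTokens()));
    }

    public static void main(String[] args) {
        Usage total = accumulate(new CachedUsage(100, 80, 20), new PlainUsage(50));
        System.out.println(total.getPromptTokens());          // 150
        System.out.println(total.getCacheReadInputTokens());  // 80
        System.out.println(total.getCacheWriteInputTokens()); // 20
    }
}
```

Keeping `null` distinct from `0` in this sketch lets a caller tell "no cache hit" apart from "this provider does not expose cache metrics".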

Wire up cache metrics extraction in four providers:

  • Anthropic: cacheReadInputTokens and cacheCreationInputTokens (sync and streaming)
  • Bedrock: cacheReadInputTokens and cacheWriteInputTokens (sync and streaming)
  • OpenAI SDK: cachedTokens from PromptTokensDetails (read only)
  • Google GenAI: cachedContentTokenCount override (read only)
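
As a rough illustration of the mapping in the list above, the sketch below normalizes two provider-shaped payloads into one read/write pair. The records `AnthropicLikeUsage` and `OpenAiLikePromptDetails` are hypothetical stand-ins whose field names mirror the ones listed; they are not the real SDK classes, and `CacheMetrics` and the `from...` helpers are invented for this example.

```java
public class ProviderCacheMappingSketch {

    /** Provider-neutral pair; null means the provider does not report the value. */
    record CacheMetrics(Integer read, Integer write) {}

    /** Hypothetical stand-in for a provider that reports both reads and writes. */
    record AnthropicLikeUsage(Integer cacheReadInputTokens, Integer cacheCreationInputTokens) {}

    /** Hypothetical stand-in for a provider that reports cached reads only. */
    record OpenAiLikePromptDetails(Integer cachedTokens) {}

    /** Cache creation tokens are surfaced as the unified "write" metric. */
    static CacheMetrics fromAnthropicLike(AnthropicLikeUsage usage) {
        return new CacheMetrics(usage.cacheReadInputTokens(), usage.cacheCreationInputTokens());
    }

    /** Read-only provider: the write metric stays null instead of being guessed. */
    static CacheMetrics fromOpenAiLike(OpenAiLikePromptDetails details) {
        Integer read = (details == null) ? null : details.cachedTokens();
        return new CacheMetrics(read, null);
    }

    public static void main(String[] args) {
        System.out.println(fromAnthropicLike(new AnthropicLikeUsage(80, 20)));
        System.out.println(fromOpenAiLike(new OpenAiLikePromptDetails(64)));
    }
}
```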

Add unit tests for DefaultUsage cache fields including serialization round-trips. Add integration test assertions to Anthropic and Bedrock caching tests verifying Usage interface values match native SDK values.

Update the usage-handling reference documentation with a cache metrics table.

Signed-off-by: Soby Chacko <soby.chacko@broadcom.com>
sobychacko added this to the 2.0.0-M4 milestone on Mar 22, 2026
sobychacko added the enhancement (New feature or request) label on Mar 22, 2026

2 participants