[BOT ISSUE] Google GenAI thinking model spans missing thoughtsTokenCount metric #69
Description
Summary
The Google GenAI instrumentation extracts promptTokenCount, candidatesTokenCount, totalTokenCount, and cachedContentTokenCount from the usageMetadata object, but does not extract thoughtsTokenCount. This field is returned by Gemini thinking models (Gemini 2.5 Flash, Gemini 2.5 Pro, etc.) and represents the number of tokens used for internal reasoning. It is analogous to OpenAI's reasoning_tokens, which this repo already captures for the Responses API.
Braintrust's own Gemini integration docs state that "reasoning tokens are automatically tracked" and "The wrapper captures completion_reasoning_tokens in the metrics", but the Java SDK does not implement this.
What is missing
In BraintrustApiClient.tagSpan() (lines 128–149), the usageMetadata extraction handles four fields:
if (usage.containsKey("promptTokenCount")) {
metrics.put("prompt_tokens", (Number) usage.get("promptTokenCount"));
}
if (usage.containsKey("candidatesTokenCount")) {
metrics.put("completion_tokens", (Number) usage.get("candidatesTokenCount"));
}
if (usage.containsKey("totalTokenCount")) {
metrics.put("tokens", (Number) usage.get("totalTokenCount"));
}
if (usage.containsKey("cachedContentTokenCount")) {
metrics.put("prompt_cached_tokens", (Number) usage.get("cachedContentTokenCount"));
}
The missing extraction would be:
if (usage.containsKey("thoughtsTokenCount")) {
metrics.put("completion_reasoning_tokens", (Number) usage.get("thoughtsTokenCount"));
}
This would align with:
- The OpenAI Responses API handler in InstrumentationSemConv.tagOpenAIResponse(), which already extracts output_tokens_details.reasoning_tokens → completion_reasoning_tokens
- The Braintrust docs, which state completion_reasoning_tokens is captured for Gemini thinking models
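The mapping described above can be sketched as a standalone, table-driven helper. This is only an illustration under stated assumptions: the class name UsageMetricsSketch and the method extractMetrics are hypothetical and do not exist in the SDK; only the field names and metric names are taken from this issue. It is not the actual BraintrustApiClient.tagSpan() code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical standalone sketch; not the actual BraintrustApiClient code. */
public class UsageMetricsSketch {

    // usageMetadata field name -> Braintrust metric name, as listed in this issue.
    private static final Map<String, String> FIELD_TO_METRIC = new LinkedHashMap<>();

    static {
        FIELD_TO_METRIC.put("promptTokenCount", "prompt_tokens");
        FIELD_TO_METRIC.put("candidatesTokenCount", "completion_tokens");
        FIELD_TO_METRIC.put("totalTokenCount", "tokens");
        FIELD_TO_METRIC.put("cachedContentTokenCount", "prompt_cached_tokens");
        // The entry this issue proposes adding:
        FIELD_TO_METRIC.put("thoughtsTokenCount", "completion_reasoning_tokens");
    }

    /** Copies every known numeric usage field into a metrics map; absent fields are skipped. */
    public static Map<String, Number> extractMetrics(Map<String, Object> usage) {
        Map<String, Number> metrics = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : FIELD_TO_METRIC.entrySet()) {
            Object value = usage.get(entry.getKey());
            // instanceof guards against both missing keys and non-numeric values.
            if (value instanceof Number) {
                metrics.put(entry.getValue(), (Number) value);
            }
        }
        return metrics;
    }

    public static void main(String[] args) {
        // Made-up numbers standing in for a thinking-model usageMetadata payload.
        Map<String, Object> usage = new LinkedHashMap<>();
        usage.put("promptTokenCount", 10);
        usage.put("candidatesTokenCount", 20);
        usage.put("thoughtsTokenCount", 15);
        usage.put("totalTokenCount", 45);
        System.out.println(extractMetrics(usage));
    }
}
```

The instanceof check is a deliberate difference from the containsKey-plus-cast pattern quoted above: it skips null or non-numeric values instead of throwing. Responses from non-thinking models, which omit thoughtsTokenCount, simply produce no completion_reasoning_tokens entry.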
Braintrust docs status
- Braintrust Gemini integration docs at https://www.braintrust.dev/docs/integrations/ai-providers/gemini state that reasoning tokens are "automatically tracked" and the wrapper captures completion_reasoning_tokens: supported (documented as working, but not implemented in this Java SDK)
Upstream sources
- Google GenAI thinking docs: https://ai.google.dev/gemini-api/docs/thinking (documents thoughtsTokenCount in usageMetadata as a standard field for thinking models, accessible via response.usage_metadata.thoughts_token_count)
- Google GenAI API reference: https://ai.google.dev/api/generate-content (UsageMetadata includes the thoughtsTokenCount field)
- Gemini thinking models: Gemini 2.5 Flash and Gemini 2.5 Pro return this field by default when thinking is enabled
Local files inspected
- braintrust-sdk/instrumentation/genai_1_18_0/src/main/java/com/google/genai/BraintrustApiClient.java: lines 128–149 (tagSpan usageMetadata extraction; thoughtsTokenCount not handled)
- braintrust-sdk/src/main/java/dev/braintrust/instrumentation/InstrumentationSemConv.java: lines 144–150 (OpenAI output_tokens_details.reasoning_tokens → completion_reasoning_tokens is the established pattern)
- braintrust-sdk/instrumentation/genai_1_18_0/src/test/java/dev/braintrust/instrumentation/genai/v1_18_0/BraintrustGenAITest.java: no test uses a thinking model or asserts completion_reasoning_tokens