[BOT ISSUE] Google GenAI thinking model spans missing thoughtsTokenCount metric #69

@braintrust-bot

Summary

The Google GenAI instrumentation extracts promptTokenCount, candidatesTokenCount, totalTokenCount, and cachedContentTokenCount from the usageMetadata object, but does not extract thoughtsTokenCount. This field is returned by Gemini thinking models (Gemini 2.5 Flash, Gemini 2.5 Pro, etc.) and represents the number of tokens used for internal reasoning. It is analogous to OpenAI's reasoning_tokens, which this repo already captures for the Responses API.

Braintrust's own Gemini integration docs state that "reasoning tokens are automatically tracked" and "The wrapper captures completion_reasoning_tokens in the metrics" — but the Java SDK does not implement this.

What is missing

In BraintrustApiClient.tagSpan() (lines 128–149), the usageMetadata extraction handles four fields:

if (usage.containsKey("promptTokenCount")) {
    metrics.put("prompt_tokens", (Number) usage.get("promptTokenCount"));
}
if (usage.containsKey("candidatesTokenCount")) {
    metrics.put("completion_tokens", (Number) usage.get("candidatesTokenCount"));
}
if (usage.containsKey("totalTokenCount")) {
    metrics.put("tokens", (Number) usage.get("totalTokenCount"));
}
if (usage.containsKey("cachedContentTokenCount")) {
    metrics.put("prompt_cached_tokens", (Number) usage.get("cachedContentTokenCount"));
}

The missing extraction would be:

if (usage.containsKey("thoughtsTokenCount")) {
    metrics.put("completion_reasoning_tokens", (Number) usage.get("thoughtsTokenCount"));
}
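
As a rough illustration of the proposed behavior, the sketch below applies the five mappings (the four existing ones plus thoughtsTokenCount) to a plain Map. The class name UsageMetadataSketch, the method extractMetrics, and the sample numbers are hypothetical and exist only for this example; the real change would go inside BraintrustApiClient.tagSpan(), whose surrounding code is not reproduced here.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone class for illustration; not part of the SDK.
final class UsageMetadataSketch {

    // Gemini usageMetadata field name -> Braintrust metric name.
    private static final Map<String, String> FIELD_TO_METRIC = new LinkedHashMap<>();

    static {
        FIELD_TO_METRIC.put("promptTokenCount", "prompt_tokens");
        FIELD_TO_METRIC.put("candidatesTokenCount", "completion_tokens");
        FIELD_TO_METRIC.put("totalTokenCount", "tokens");
        FIELD_TO_METRIC.put("cachedContentTokenCount", "prompt_cached_tokens");
        // The mapping this issue proposes adding.
        FIELD_TO_METRIC.put("thoughtsTokenCount", "completion_reasoning_tokens");
    }

    // Copies every known numeric usage field into a metrics map.
    static Map<String, Number> extractMetrics(Map<String, Object> usage) {
        Map<String, Number> metrics = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : FIELD_TO_METRIC.entrySet()) {
            Object value = usage.get(entry.getKey());
            // instanceof skips absent, null, and non-numeric values.
            if (value instanceof Number) {
                metrics.put(entry.getValue(), (Number) value);
            }
        }
        return metrics;
    }

    public static void main(String[] args) {
        // Illustrative values only; not taken from a real response.
        Map<String, Object> usage = new HashMap<>();
        usage.put("promptTokenCount", 12);
        usage.put("candidatesTokenCount", 30);
        usage.put("thoughtsTokenCount", 85);
        usage.put("totalTokenCount", 127);
        System.out.println(extractMetrics(usage));
    }
}
```

One deliberate difference from the snippets above: the sketch checks instanceof Number instead of containsKey plus an unchecked cast, so a null or non-numeric value is skipped rather than stored or thrown on. Whether the SDK should adopt that stricter check is a separate design question.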

This would align with:

  • The OpenAI Responses API handler in InstrumentationSemConv.tagOpenAIResponse(), which already extracts output_tokens_details.reasoning_tokens → completion_reasoning_tokens
  • The Braintrust docs which state completion_reasoning_tokens is captured for Gemini thinking models
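
To make the alignment concrete, here is a minimal sketch of how both provider shapes could resolve to the same metric. It assumes the usage payloads are available as plain Maps; the class ReasoningTokenParitySketch and its two helper methods are hypothetical and do not mirror the actual code in InstrumentationSemConv or BraintrustApiClient.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration; not part of the SDK.
final class ReasoningTokenParitySketch {

    // Shared Braintrust metric name for both providers.
    static final String METRIC = "completion_reasoning_tokens";

    // OpenAI Responses API shape: usage.output_tokens_details.reasoning_tokens
    static Optional<Number> fromOpenAIUsage(Map<String, Object> usage) {
        Object details = usage.get("output_tokens_details");
        if (details instanceof Map) {
            Object value = ((Map<?, ?>) details).get("reasoning_tokens");
            if (value instanceof Number) {
                return Optional.of((Number) value);
            }
        }
        return Optional.empty();
    }

    // Gemini shape: usageMetadata.thoughtsTokenCount
    static Optional<Number> fromGeminiUsage(Map<String, Object> usageMetadata) {
        Object value = usageMetadata.get("thoughtsTokenCount");
        return value instanceof Number ? Optional.of((Number) value) : Optional.empty();
    }

    public static void main(String[] args) {
        // Illustrative payloads only.
        Map<String, Object> openai =
                Map.of("output_tokens_details", Map.of("reasoning_tokens", 64));
        Map<String, Object> gemini = Map.of("thoughtsTokenCount", 64);

        fromOpenAIUsage(openai).ifPresent(n -> System.out.println(METRIC + " (OpenAI) = " + n));
        fromGeminiUsage(gemini).ifPresent(n -> System.out.println(METRIC + " (Gemini) = " + n));
    }
}
```

Returning Optional keeps the "field absent" case explicit, which matters here because non-thinking Gemini responses may omit thoughtsTokenCount entirely.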

Braintrust docs status

Upstream sources

  • Google GenAI thinking docs: https://ai.google.dev/gemini-api/docs/thinking — documents thoughtsTokenCount in usageMetadata as a standard field for thinking models, accessible via response.usage_metadata.thoughts_token_count
  • Google GenAI API reference: https://ai.google.dev/api/generate-content (UsageMetadata includes the thoughtsTokenCount field)
  • Gemini thinking models: Gemini 2.5 Flash and Gemini 2.5 Pro return this field by default when thinking is enabled

Local files inspected

  • braintrust-sdk/instrumentation/genai_1_18_0/src/main/java/com/google/genai/BraintrustApiClient.java — lines 128–149 (tagSpan usageMetadata extraction; thoughtsTokenCount not handled)
  • braintrust-sdk/src/main/java/dev/braintrust/instrumentation/InstrumentationSemConv.java — lines 144–150 (OpenAI output_tokens_details.reasoning_tokens → completion_reasoning_tokens is the established pattern)
  • braintrust-sdk/instrumentation/genai_1_18_0/src/test/java/dev/braintrust/instrumentation/genai/v1_18_0/BraintrustGenAITest.java — no test uses a thinking model or asserts completion_reasoning_tokens
