[BOT ISSUE] OpenAI Responses API input_tokens_details.cached_tokens not captured in metrics #70
Description
Summary
The OpenAI Responses API response handler in InstrumentationSemConv.tagOpenAIResponse() extracts output_tokens_details.reasoning_tokens as completion_reasoning_tokens (lines 144–150), but does not extract the sibling field input_tokens_details.cached_tokens. This means prompt caching usage on Responses API calls is invisible in Braintrust metrics, even though the code already demonstrates the pattern for extracting nested token details at this level.
This is distinct from #58 (which covers Chat Completions prompt_tokens_details.cached_tokens and completion_tokens_details.reasoning_tokens) — the Responses API uses different field names (input_tokens_details/output_tokens_details instead of prompt_tokens_details/completion_tokens_details) and a different code path.
What is missing
In InstrumentationSemConv.tagOpenAIResponse() (lines 144–150), only output_tokens_details is checked:
```java
// Reasoning tokens (Responses API)
if (usage.has("output_tokens_details")) {
    JsonNode details = usage.get("output_tokens_details");
    if (details.has("reasoning_tokens")) {
        metrics.put("completion_reasoning_tokens", details.get("reasoning_tokens"));
    }
}
```

The missing extraction:
```java
if (usage.has("input_tokens_details")) {
    JsonNode details = usage.get("input_tokens_details");
    if (details.has("cached_tokens")) {
        metrics.put("prompt_cached_tokens", details.get("cached_tokens"));
    }
}
```

A real Responses API usage object with prompt caching looks like:
```json
{
  "input_tokens": 9708,
  "output_tokens": 167,
  "total_tokens": 9875,
  "input_tokens_details": {
    "cached_tokens": 5578
  },
  "output_tokens_details": {
    "reasoning_tokens": 0
  }
}
```

Today, `reasoning_tokens` is captured but `cached_tokens` is silently dropped.
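To make the effect concrete, here is a standalone sketch that runs both extractions against the sample usage object above. The class name and metrics map type are illustrative (not the SDK's actual method), but the two `if` blocks mirror the existing and proposed code exactly:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.HashMap;
import java.util.Map;

// Standalone demo only: class name and map type are hypothetical,
// but the two if-blocks match the existing and proposed extractions.
public class CachedTokensDemo {
    public static void main(String[] args) throws Exception {
        JsonNode usage = new ObjectMapper().readTree("""
            {
              "input_tokens": 9708,
              "output_tokens": 167,
              "total_tokens": 9875,
              "input_tokens_details": { "cached_tokens": 5578 },
              "output_tokens_details": { "reasoning_tokens": 0 }
            }
            """);
        Map<String, JsonNode> metrics = new HashMap<>();

        // Existing extraction: reasoning tokens
        if (usage.has("output_tokens_details")) {
            JsonNode details = usage.get("output_tokens_details");
            if (details.has("reasoning_tokens")) {
                metrics.put("completion_reasoning_tokens", details.get("reasoning_tokens"));
            }
        }
        // Proposed extraction: cached tokens
        if (usage.has("input_tokens_details")) {
            JsonNode details = usage.get("input_tokens_details");
            if (details.has("cached_tokens")) {
                metrics.put("prompt_cached_tokens", details.get("cached_tokens"));
            }
        }

        // Prints 0 and 5578 respectively
        System.out.println("completion_reasoning_tokens=" + metrics.get("completion_reasoning_tokens"));
        System.out.println("prompt_cached_tokens=" + metrics.get("prompt_cached_tokens"));
    }
}
```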
Braintrust docs status
- The Braintrust OpenAI integration docs at https://www.braintrust.dev/docs/integrations/ai-providers/openai do not mention cached-token metrics for the Responses API
- The Gemini integration docs capture `prompt_cached_tokens` as a named metric, suggesting this is a recognized metric name in the Braintrust ecosystem
Upstream sources
- OpenAI Responses API: the `usage` object includes `input_tokens_details.cached_tokens`, confirmed in community discussion and the OpenAI prompt caching docs
- OpenAI Java SDK: the `Response` object's `Usage` class exposes `inputTokensDetails()` with `cachedTokens()`
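For reference, reading these fields through the Java SDK might look like the sketch below. The accessor names come from the upstream notes above; the package path and the `Optional` wrapping of `usage()` are assumptions to verify against the SDK version the instrumentation targets:

```java
import com.openai.models.responses.Response;

final class UsageProbe {
    // Sketch: accessor names from the upstream notes; the package path
    // and Optional wrapping of usage() are assumptions to verify.
    static void printCachedTokens(Response response) {
        response.usage().ifPresent(usage ->
            System.out.println(
                "cached_tokens=" + usage.inputTokensDetails().cachedTokens()));
    }
}
```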
Local files inspected
- `braintrust-sdk/src/main/java/dev/braintrust/instrumentation/InstrumentationSemConv.java`, lines 135–150 (`tagOpenAIResponse` Responses API usage handling; `output_tokens_details.reasoning_tokens` extracted at line 148, no `input_tokens_details` check)
- `braintrust-sdk/instrumentation/openai_2_8_0/src/test/java/dev/braintrust/instrumentation/openai/v2_8_0/BraintrustOpenAITest.java`: `testWrapOpenAiResponses` does not assert cached token metrics
- `braintrust-sdk/instrumentation/genai_1_18_0/src/main/java/com/google/genai/BraintrustApiClient.java`: line 145 shows `prompt_cached_tokens` is an established metric name (used for Gemini's `cachedContentTokenCount`)
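As a starting point for closing the test gap noted above, a standalone JUnit 5 sketch of the missing assertion follows. It exercises the proposed extraction directly rather than `BraintrustOpenAITest`'s wrapping harness, whose helpers are not shown here; the class and method names are hypothetical:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.junit.jupiter.api.Test;
import java.util.HashMap;
import java.util.Map;
import static org.junit.jupiter.api.Assertions.assertEquals;

// Hypothetical standalone test; the real fix would assert this through
// the instrumentation's existing Responses API test harness instead.
class CachedTokensExtractionTest {
    @Test
    void capturesCachedTokensFromInputTokensDetails() throws Exception {
        JsonNode usage = new ObjectMapper()
            .readTree("{\"input_tokens_details\":{\"cached_tokens\":5578}}");
        Map<String, JsonNode> metrics = new HashMap<>();

        // Proposed extraction under test
        if (usage.has("input_tokens_details")) {
            JsonNode details = usage.get("input_tokens_details");
            if (details.has("cached_tokens")) {
                metrics.put("prompt_cached_tokens", details.get("cached_tokens"));
            }
        }

        assertEquals(5578, metrics.get("prompt_cached_tokens").asInt());
    }
}
```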