[BOT ISSUE] OpenAI Chat Completions missing detailed token metrics (reasoning_tokens, cached_tokens) #58
Description
Summary
The OpenAI Chat Completions response handler does not extract completion_tokens_details or prompt_tokens_details from the usage object. These nested fields (including reasoning_tokens for o-series models and cached_tokens for prompt caching) are present in real API responses and in this repo's own test cassettes, but are silently dropped.
Notably, the Responses API handler in the same function does extract output_tokens_details.reasoning_tokens (lines 145-150), making this an inconsistency within the same file.
What is missing
In InstrumentationSemConv.tagOpenAIResponse() (lines 127-155), the Chat Completions usage extraction only reads top-level fields:
```java
if (usage.has("prompt_tokens")) metrics.put("prompt_tokens", usage.get("prompt_tokens"));
if (usage.has("completion_tokens")) metrics.put("completion_tokens", usage.get("completion_tokens"));
if (usage.has("total_tokens")) metrics.put("tokens", usage.get("total_tokens"));
```
The following nested fields are never extracted for Chat Completions:
- `completion_tokens_details.reasoning_tokens` — reasoning tokens used by o-series models (o1, o3, etc.)
- `completion_tokens_details.accepted_prediction_tokens` — accepted predicted output tokens
- `completion_tokens_details.rejected_prediction_tokens` — rejected predicted output tokens
- `prompt_tokens_details.cached_tokens` — prompt tokens served from cache (important for cost tracking)
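For reference, a minimal sketch of what the missing extraction could look like, mirroring the top-level handling above. It assumes `usage` and the nested detail objects support Jackson-style `has()`/`get()` traversal as in the existing snippet, and the flat metric key names (`completion_reasoning_tokens`, `prompt_cached_tokens`, etc.) are illustrative, not the SDK's actual naming convention:

```java
// Sketch only — nested Chat Completions usage fields, per the OpenAI API docs.
// Metric key names here are illustrative; they should follow the SDK's existing conventions.
if (usage.has("completion_tokens_details")) {
    var details = usage.get("completion_tokens_details");
    if (details.has("reasoning_tokens"))
        metrics.put("completion_reasoning_tokens", details.get("reasoning_tokens"));
    if (details.has("accepted_prediction_tokens"))
        metrics.put("completion_accepted_prediction_tokens", details.get("accepted_prediction_tokens"));
    if (details.has("rejected_prediction_tokens"))
        metrics.put("completion_rejected_prediction_tokens", details.get("rejected_prediction_tokens"));
}
if (usage.has("prompt_tokens_details")) {
    var details = usage.get("prompt_tokens_details");
    if (details.has("cached_tokens"))
        metrics.put("prompt_cached_tokens", details.get("cached_tokens"));
}
```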
The repo's own Chat Completions test cassettes contain these fields — e.g., test-harness/src/testFixtures/resources/cassettes/openai/__files/chat_completions-22dbf888-ee29-4f5e-962b-b3e5eebb9960.json includes both prompt_tokens_details.cached_tokens and completion_tokens_details.reasoning_tokens — but no test asserts they are captured.
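A quick way to confirm the cassette claim is to read the recorded body and check for the nested fields. This sketch assumes the `__files` JSON is the raw Chat Completions response body (if the cassette wraps the body differently, the lookup path would need adjusting) and uses Jackson only for illustration:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;

// Sketch only: verify the recorded cassette contains the nested usage details.
class CassetteUsageCheck {
    public static void main(String[] args) throws Exception {
        JsonNode body = new ObjectMapper().readTree(new File(
            "test-harness/src/testFixtures/resources/cassettes/openai/__files/"
                + "chat_completions-22dbf888-ee29-4f5e-962b-b3e5eebb9960.json"));
        JsonNode usage = body.path("usage"); // assumes the file is the raw response body
        System.out.println("cached_tokens present: "
            + usage.path("prompt_tokens_details").has("cached_tokens"));
        System.out.println("reasoning_tokens present: "
            + usage.path("completion_tokens_details").has("reasoning_tokens"));
    }
}
```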
Braintrust docs status
- Braintrust docs at https://braintrust.dev/docs/integrations/ai-providers/openai.md mention logging `prompt_tokens`, `completion_tokens`, and `tokens` for streaming, but do not detail sub-token breakdowns.
- Status: unclear (top-level metrics documented, detailed breakdowns not explicitly mentioned)
Upstream sources
- OpenAI Chat Completions API docs: https://platform.openai.com/docs/api-reference/chat/object — the `usage` object includes `completion_tokens_details` and `prompt_tokens_details`
- OpenAI reasoning models: https://platform.openai.com/docs/guides/reasoning — o-series models return `reasoning_tokens` in `completion_tokens_details`
- OpenAI prompt caching: https://platform.openai.com/docs/guides/prompt-caching — `cached_tokens` in `prompt_tokens_details`
Local files inspected
- `braintrust-sdk/src/main/java/dev/braintrust/instrumentation/InstrumentationSemConv.java` — lines 111-156 (tagOpenAIResponse); lines 144-150 show the Responses API does extract `output_tokens_details.reasoning_tokens`
- `braintrust-sdk/instrumentation/openai_2_8_0/src/test/java/dev/braintrust/instrumentation/openai/v2_8_0/BraintrustOpenAITest.java`
- `test-harness/src/testFixtures/resources/cassettes/openai/__files/chat_completions-22dbf888-ee29-4f5e-962b-b3e5eebb9960.json` — contains `completion_tokens_details` and `prompt_tokens_details` in the response