[BOT ISSUE] OpenAI Realtime API not instrumented — zero visibility for real-time voice/text sessions #1731
Description
Summary
The OpenAI Node SDK provides a GA Realtime API (client.realtime) for low-latency, bidirectional voice and text interaction with GPT-4o models. This repo has zero instrumentation for any Realtime API surface — no channels, no plugin handlers, no wrapper proxies, and no auto-instrumentation configs. Users building voice-enabled AI applications with the OpenAI Realtime API get no Braintrust spans at all.
What instrumentation is missing
No coverage in any layer:
- Wrapper (`js/src/wrappers/oai.ts`): no proxy for the `realtime` resource. Only `chat.completions`, `embeddings`, `moderations`, and `responses` are wrapped.
- Auto-instrumentation config (`js/src/auto-instrumentations/configs/openai.ts`): no config entry for any `realtime.*` methods.
- Channels (`js/src/instrumentation/plugins/openai-channels.ts`): no channel definitions for the Realtime API.
- Plugin (`js/src/instrumentation/plugins/openai-plugin.ts`): no handler for Realtime calls.
- Vendor types (`js/src/vendor-sdk-types/openai-common.ts`): no type definitions for Realtime resources.
A grep for `RealtimeSession`, `clientSecrets`, and `realtime.connect` across `js/src/` returns zero matches.
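For reference, the HTTP-backed `clientSecrets.create()` call could be wrapped with the same proxy pattern the repo already uses for `chat.completions`. The sketch below uses a stand-in span API; `startSpan`, the `Span` shape, and `wrapCreate` are illustrative names for this issue, not the actual Braintrust internals.

```typescript
// Minimal stand-in for a tracing span; the real Braintrust span API differs.
interface Span {
  name: string;
  input?: unknown;
  output?: unknown;
  end(): void;
}

function startSpan(name: string): Span {
  return { name, end() {} };
}

// Wraps an HTTP-style SDK method (e.g. client.realtime.clientSecrets.create)
// so every call records input/output on a span, mirroring how the existing
// chat.completions wrapper behaves. Hypothetical sketch, not repo code.
function wrapCreate<A, R>(create: (args: A) => Promise<R>) {
  return async (args: A): Promise<R> => {
    const span = startSpan("openai.realtime.clientSecrets.create");
    span.input = args;
    try {
      const result = await create(args);
      span.output = result;
      return result;
    } finally {
      span.end();
    }
  };
}
```

The key point is that because `clientSecrets.create()` is an ordinary request/response call, it needs none of the WebSocket machinery that makes `RealtimeSession` harder to trace.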
Key upstream API surfaces with no Braintrust tracing:
| SDK Method | Description |
|---|---|
| `client.realtime.clientSecrets.create()` | HTTP POST that creates an ephemeral client secret/token for the WebSocket connection. Accepts `model`, `voice`, `tools`, `instructions`, and other session config. |
| `RealtimeSession` class | WebSocket session that emits events including `response.done` (with full usage metrics: input/output audio tokens, text tokens), `conversation.item.created`, tool-call events, and error events. |
The `clientSecrets.create()` call is a standard HTTP request that could be instrumented like any other method to give session-level visibility. The `RealtimeSession` event stream is architecturally different (WebSocket-based), but its `response.done` events contain usage metrics directly analogous to what's captured for `chat.completions.create()` and `responses.create()`.
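Since `response.done` carries a usage block, mapping it onto the same metric keys used for chat completions could look roughly like this. The event shape and metric key names below are assumptions based on the fields listed in this issue, not the SDK's published types.

```typescript
// Assumed usage shape, built from the field names cited in this issue.
interface RealtimeUsage {
  input_tokens: number;
  output_tokens: number;
  input_audio_tokens?: number;
  output_audio_tokens?: number;
}

interface ResponseDoneEvent {
  type: "response.done";
  response: { usage?: RealtimeUsage };
}

// Converts a response.done event into the flat metrics record a span would
// carry, reusing the prompt/completion naming used for chat.completions.
function usageToMetrics(event: ResponseDoneEvent): Record<string, number> {
  const usage = event.response.usage;
  if (!usage) return {};
  const metrics: Record<string, number> = {
    prompt_tokens: usage.input_tokens,
    completion_tokens: usage.output_tokens,
    tokens: usage.input_tokens + usage.output_tokens,
  };
  if (usage.input_audio_tokens !== undefined) {
    metrics.prompt_audio_tokens = usage.input_audio_tokens;
  }
  if (usage.output_audio_tokens !== undefined) {
    metrics.completion_audio_tokens = usage.output_audio_tokens;
  }
  return metrics;
}
```

A plugin handler listening on the session's event emitter could call this once per `response.done` and attach the result to a per-session span.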
Braintrust docs status
`not_found`: the Braintrust docs at https://www.braintrust.dev/docs/instrument/wrap-providers document OpenAI as a supported provider for standard LLM calls and streaming, but make no mention of the Realtime API, real-time voice, or WebSocket-based interactions.
Upstream reference
- OpenAI Realtime API guide: https://platform.openai.com/docs/guides/realtime
- OpenAI Node SDK Realtime resource: `client.realtime` (GA, not under `client.beta`)
- SDK source: https://github.com/openai/openai-node/tree/master/src/resources/realtime
- Supported models: GPT-4o, GPT-4o-mini (real-time audio and text)
- Usage metrics in `response.done` events: `input_tokens`, `output_tokens`, `input_audio_tokens`, `output_audio_tokens`
- Session configuration: model, voice, tools, instructions, temperature, modalities, turn detection
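Pulling the session-configuration fields above together, a `clientSecrets.create()` payload might look like the following. The property names, nesting, and values here are assumptions assembled from this issue's summary of the OpenAI docs; check the upstream API reference before treating any of them as the actual schema.

```typescript
// Hypothetical session config for client.realtime.clientSecrets.create().
// Every field name and value below is an assumption, not verified SDK schema.
const sessionConfig = {
  model: "gpt-4o-realtime-preview", // assumed model id
  voice: "alloy", // assumed voice name
  instructions: "You are a helpful voice assistant.",
  temperature: 0.8,
  modalities: ["audio", "text"],
  turn_detection: { type: "server_vad" }, // assumed server-side voice activity detection
};
```

Whatever the exact shape turns out to be, these are the fields an instrumented `clientSecrets.create()` span would want to record as input, since they define the whole session.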
Local files inspected
- `js/src/wrappers/oai.ts`: no `realtime` property interception
- `js/src/auto-instrumentations/configs/openai.ts`: no Realtime configs
- `js/src/instrumentation/plugins/openai-channels.ts`: no Realtime channels
- `js/src/instrumentation/plugins/openai-plugin.ts`: no Realtime handlers
- `js/src/vendor-sdk-types/openai-common.ts`: no Realtime types
- `e2e/scenarios/openai-instrumentation/`: no Realtime test scenarios