[BOT ISSUE] OpenAI embeddings spans missing output capture and incomplete input extraction #66

@braintrust-bot


Summary

The OpenAI instrumentation recognizes embeddings calls (the span is correctly named "Embeddings"), but the span data is largely empty because the request- and response-parsing logic in InstrumentationSemConv only handles the field shapes of the Chat Completions and Responses APIs.

Specifically:

  • No output_json: the response parser checks for choices (Chat Completions) or output (Responses API), but embeddings responses have data[].embedding — neither branch matches
  • Incomplete input_json: the request parser checks for messages or input (as array), but embeddings input can be a single string, which fails the isArray() check on line 102
  • Partial metrics: prompt_tokens and total_tokens are captured correctly from usage, but completion_tokens is absent (embeddings don't produce completion tokens) — this is fine but worth noting

What is missing

In InstrumentationSemConv.tagOpenAIRequest() (lines 78–108):

```java
if (requestJson.has("messages")) {
    span.setAttribute("braintrust.input_json", toJson(requestJson.get("messages")));
} else if (requestJson.has("input") && requestJson.get("input").isArray()) {
    span.setAttribute("braintrust.input_json", toJson(requestJson.get("input")));
}
```
  • Embeddings requests use input which can be a string ("Hello world"), an array of strings, or an array of token arrays. Single-string inputs fail the isArray() guard and are silently dropped.
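A possible shape for the fix, sketched below. This mirrors the branch structure of the snippet above but accepts every embeddings input shape; `extractEmbeddingsInput` is a hypothetical helper (not an existing method in InstrumentationSemConv), and Jackson's `JsonNode` is assumed to be the JSON type in use:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class EmbeddingsInputSketch {
    // Hypothetical helper: returns the raw JSON of the request input, whether
    // "input" is a single string, an array of strings, or an array of token
    // arrays -- instead of silently dropping non-array inputs.
    static String extractEmbeddingsInput(JsonNode requestJson) {
        if (requestJson.has("messages")) {
            return requestJson.get("messages").toString(); // Chat Completions
        }
        if (requestJson.has("input")) {
            // No isArray() guard: string and array shapes are both valid JSON
            // and both serialize cleanly for braintrust.input_json.
            return requestJson.get("input").toString();
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        JsonNode stringInput = mapper.readTree(
            "{\"model\":\"text-embedding-3-small\",\"input\":\"Hello world\"}");
        // Prints the JSON-encoded string (with quotes), not null.
        System.out.println(extractEmbeddingsInput(stringInput));
    }
}
```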

In InstrumentationSemConv.tagOpenAIResponse() (lines 111–156):

```java
if (responseJson.has("choices")) {
    span.setAttribute("braintrust.output_json", toJson(responseJson.get("choices")));
} else if (responseJson.has("output")) {
    span.setAttribute("braintrust.output_json", toJson(responseJson.get("output")));
}
```
  • Embeddings responses have data (array of embedding objects) and model — neither choices nor output. No output_json is set.
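One way the fix could look is a third branch that falls through to the embeddings data array. This is a sketch only; `extractOutput` is a hypothetical helper, and Jackson's `JsonNode` is again assumed:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class EmbeddingsOutputSketch {
    // Hypothetical extraction mirroring tagOpenAIResponse, with an added
    // branch for the embeddings response shape ("data" array of objects
    // each carrying an "embedding" vector).
    static String extractOutput(JsonNode responseJson) {
        if (responseJson.has("choices")) {
            return responseJson.get("choices").toString(); // Chat Completions
        } else if (responseJson.has("output")) {
            return responseJson.get("output").toString();  // Responses API
        } else if (responseJson.has("data")) {
            return responseJson.get("data").toString();    // Embeddings
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        JsonNode resp = new ObjectMapper().readTree(
            "{\"data\":[{\"index\":0,\"embedding\":[0.1,0.2]}],"
            + "\"model\":\"text-embedding-3-small\"}");
        System.out.println(extractOutput(resp));
    }
}
```

One design question for the real fix: embedding vectors can run to thousands of floats per input, so it may be preferable to truncate or summarize the vectors (e.g., record only index and dimension count) rather than store the full data array in output_json.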

Additionally, the model field from the embeddings request (e.g., text-embedding-3-small) is correctly extracted into metadata, and the usage metrics partially work, so this is a gap in detail extraction rather than a total failure.

Braintrust docs status

  • The Braintrust OpenAI integration docs at braintrust.dev/docs/integrations/ai-providers/openai do not mention embeddings
  • No embeddings instrumentation is documented for any provider

Local files inspected

  • braintrust-sdk/src/main/java/dev/braintrust/instrumentation/InstrumentationSemConv.java — lines 78–108 (tagOpenAIRequest: input string case not handled), lines 111–156 (tagOpenAIResponse: no data field extraction), line 244 (span name correctly maps to "Embeddings")
  • braintrust-sdk/instrumentation/openai_2_8_0/src/main/java/dev/braintrust/instrumentation/openai/v2_8_0/TracingHttpClient.java — HTTP-level wrapping captures the call, delegates to InstrumentationSemConv
  • braintrust-sdk/instrumentation/openai_2_8_0/src/test/java/dev/braintrust/instrumentation/openai/v2_8_0/BraintrustOpenAITest.java — no embeddings test exists
