[BOT ISSUE] Google GenAI embedContent spans lack embedding-specific input, metrics, and span type #65
Description
Summary
When client.models.embedContent() is called through the instrumented Google GenAI client, the call is captured at the HTTP level but the span contains almost no useful embedding-specific detail. The tagSpan() method in BraintrustApiClient only extracts fields relevant to generateContent (like contents, generationConfig, usageMetadata), which are absent from embedding requests and responses.
The span is created with the correct operation name (embed_content) but has:
- Empty input_json: only {"model": "..."}; the actual content being embedded, taskType, title, and outputDimensionality are not extracted
- No metrics: embedding responses use metadata.billableCharacterCount instead of usageMetadata, so no token/character counts are captured
- Incorrect span type: marked as type: "llm" rather than type: "embedding" (or equivalent)
The full response is stored in output_json as a raw dump, so the embedding vector data is technically present but not meaningfully structured.
What is missing
In BraintrustApiClient.tagSpan() (lines 50–161):
Request parsing (lines 97–112):
- Checks for contents (generateContent field) but embedContent uses content (singular)
- Checks for generationConfig but embedContent uses taskType, title, outputDimensionality
- Result: input_json only contains {"model": "..."} for embedding calls
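A possible shape for the missing request handling, as a minimal sketch. It assumes the request body has already been parsed into a Map; the actual tagSpan() may work on a different JSON representation. The class and method names (EmbedContentInput, extract) are hypothetical and not part of the SDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Braintrust SDK. */
final class EmbedContentInput {
    // Request fields this issue lists for embedContent calls.
    private static final String[] EMBED_FIELDS = {
        "content", "taskType", "title", "outputDimensionality"
    };

    /** Builds the span input from an already-parsed embedContent request body. */
    static Map<String, Object> extract(String model, Map<String, Object> requestBody) {
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("model", model);
        for (String field : EMBED_FIELDS) {
            Object value = requestBody.get(field);
            if (value != null) {
                // Copy only the fields that are present; unrelated keys are ignored.
                input.put(field, value);
            }
        }
        return input;
    }
}
```

With this, input_json for an embedding call would carry the embedded content and its config rather than only the model name.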
Response parsing (lines 116–149):
- Checks for usageMetadata with promptTokenCount/candidatesTokenCount but embedContent responses have metadata with billableCharacterCount
- Result: no metrics are captured
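The response side could be handled along these lines. This is a sketch only: it assumes the response body is available as a parsed Map, and the metric key billable_characters is a placeholder, since no metric name for character counts is defined in this issue. EmbedContentMetrics is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Braintrust SDK. */
final class EmbedContentMetrics {
    /** Reads metadata.billableCharacterCount from a parsed embedContent response. */
    static Map<String, Long> extract(Map<String, Object> responseBody) {
        Map<String, Long> metrics = new LinkedHashMap<>();
        Object metadata = responseBody.get("metadata");
        if (metadata instanceof Map) {
            Object count = ((Map<?, ?>) metadata).get("billableCharacterCount");
            if (count instanceof Number) {
                // "billable_characters" is an assumed metric name.
                metrics.put("billable_characters", ((Number) count).longValue());
            }
        }
        // Empty when the response carries no metadata block.
        return metrics;
    }
}
```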
Span attributes (line 156):
- Hardcodes type: "llm" for all calls including embeddings
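One way to avoid the hardcoded type is to derive it from the operation name that getOperation already produces. The sketch below assumes "embedding" is an acceptable span type value, which this issue only proposes ("or equivalent"); SpanTypes is a hypothetical name.

```java
/** Hypothetical helper for illustration; not part of the Braintrust SDK. */
final class SpanTypes {
    /** Maps an operation name to a span type, defaulting to "llm". */
    static String forOperation(String operation) {
        // "embed_content" is the operation name mentioned in this issue;
        // "embedding" as a type value is an assumption.
        return "embed_content".equals(operation) ? "embedding" : "llm";
    }
}
```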
Braintrust docs status
- The Braintrust Gemini integration docs at braintrust.dev/docs/integrations/ai-providers/gemini do not mention embeddings: not_found
- No embeddings instrumentation is documented for any provider in Java
Upstream sources
- Google GenAI embeddings docs: https://ai.google.dev/gemini-api/docs/embeddings documents embedContent as a stable, first-class API with models like gemini-embedding-001
- Google GenAI Java SDK: client.models.embedContent() is available with EmbedContentConfig (taskType, title, outputDimensionality) and returns EmbedContentResponse with embeddings and metadata
- embedContent request format: uses content (singular), taskType, title, outputDimensionality, none of which match the generateContent fields currently extracted
- embedContent response format: returns embedding.values array and metadata.billableCharacterCount, not usageMetadata
Local files inspected
- braintrust-sdk/instrumentation/genai_1_18_0/src/main/java/com/google/genai/BraintrustApiClient.java: lines 50–161 (tagSpan only extracts generateContent-relevant fields), lines 325–333 (getOperation correctly parses embedContent to embed_content)
- braintrust-sdk/instrumentation/genai_1_18_0/src/test/java/dev/braintrust/instrumentation/genai/v1_18_0/BraintrustGenAITest.java: no embedContent test exists
- braintrust-sdk/instrumentation/genai_1_18_0/src/main/java/com/google/genai/BraintrustInstrumentation.java: wraps ApiClient generically, no embedding-specific logic