fix(claude_agent_sdk): log ResultMessage metrices by Mahhheshh · Pull Request #225 · braintrustdata/braintrust-sdk-python

Mahhheshh (Mahhheshh) · 2026-04-08T14:11:28Z

Capture previously dropped ResultMessage fields in _handle_result.

ref 5 (#149)

starfolkai · 2026-04-08T14:21:07Z

Drop into this review session: sfk devbox connect pr-225-braintrust-sdk-python-17a8 --attach

Found 2 issues:

Test guards use hasattr but production code filters with if v is not None -- these have different semantics. When a ResultMessage attribute exists but has value None (e.g., stop_sequence is None when no stop sequence was triggered), hasattr returns True but the production filter excludes it from metadata. The assertion assert task_span.get("metadata", {}).get("stop_sequence") is not None would then fail. All five new guards (lines 175-184) should use getattr(result_message, ..., None) is not None to match the if v is not None filter in _handle_result (line 739).

braintrust-sdk-python/py/src/braintrust/integrations/claude_agent_sdk/test_claude_agent_sdk.py

Lines 174 to 185 in 5c9b3dd

    
               assert task_span.get("metadata", {}).get("session_id") is not None 
        
           if hasattr(result_message, "stop_reason"): 
        
               assert task_span.get("metadata", {}).get("stop_reason") is not None 
        
           if hasattr(result_message, "stop_sequence"): 
        
               assert task_span.get("metadata", {}).get("stop_sequence") is not None 
        
           if hasattr(result_message, "total_cost_usd"): 
        
               assert task_span.get("metadata", {}).get("total_cost_usd") is not None 
        
           if hasattr(result_message, "duration_ms"): 
        
               assert task_span.get("metadata", {}).get("duration_ms") is not None 
        
           if hasattr(result_message, "duration_api_ms"): 
        
               assert task_span.get("metadata", {}).get("duration_api_ms") is not None 
        
           llm_spans = [s for s in spans if s["span_attributes"]["type"] == SpanTypeAttribute.LLM]

Production filter for reference:

braintrust-sdk-python/py/src/braintrust/integrations/claude_agent_sdk/tracing.py

Lines 728 to 740 in 5c9b3dd

    
           result_metadata = { 
        
               k: v 
        
               for k, v in { 
        
                   "num_turns": getattr(message, "num_turns", None), 
        
                   "session_id": getattr(message, "session_id", None), 
        
                   "stop_reason": getattr(message, "stop_reason", None), 
        
                   "stop_sequence": getattr(message, "stop_sequence", None), 
        
                   "total_cost_usd": getattr(message, "total_cost_usd", None), 
        
                   "duration_ms": getattr(message, "duration_ms", None), 
        
                   "duration_api_ms": getattr(message, "duration_api_ms", None), 
        
               }.items() 
        
               if v is not None 
        
           }

The stop_sequence assertion is dead code -- the cassette for test_calculator_with_multiple_operations does not include stop_sequence in the result payload, so the guard is always False and the assertion at line 177-178 never executes. This field has zero test coverage. Consider either updating the cassette to include a result with a non-null stop_sequence, or adding a targeted unit test with a synthetic ResultMessage (the existing ResultMessage stub at line 1360 also needs updating -- it only defines num_turns and session_id).

braintrust-sdk-python/py/src/braintrust/integrations/claude_agent_sdk/test_claude_agent_sdk.py

Lines 176 to 179 in 5c9b3dd

    
               assert task_span.get("metadata", {}).get("stop_reason") is not None 
        
           if hasattr(result_message, "stop_sequence"): 
        
               assert task_span.get("metadata", {}).get("stop_sequence") is not None 
        
           if hasattr(result_message, "total_cost_usd"):

Fake ResultMessage stub that needs the new fields:

braintrust-sdk-python/py/src/braintrust/integrations/claude_agent_sdk/test_claude_agent_sdk.py

Lines 1360 to 1377 in 5c9b3dd

    
           class ResultMessage: 
        
               def __init__( 
        
                   self, 
        
                   *, 
        
                   input_tokens: int = 1, 
        
                   output_tokens: int = 1, 
        
                   cache_creation_input_tokens: int = 0, 
        
                   num_turns: int = 1, 
        
                   session_id: str = "session-123", 
        
               ): 
        
                   self.usage = types.SimpleNamespace( 
        
                       input_tokens=input_tokens, 
        
                       output_tokens=output_tokens, 
        
                       cache_creation_input_tokens=cache_creation_input_tokens, 
        
                   ) 
        
                   self.num_turns = num_turns 
        
                   self.session_id = session_id

Abhijeet Prasad (AbhiPrasad) · 2026-04-08T18:24:54Z

please rebase! CI should be fixed

fix(claude_agent_sdk): log ResultMessage metrices

6338f8a

Mahhheshh (Mahhheshh) force-pushed the claude_agent_sdk_imp branch from 5c9b3dd to 6338f8a Compare April 8, 2026 14:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(claude_agent_sdk): log ResultMessage metrices#225

fix(claude_agent_sdk): log ResultMessage metrices#225
Mahhheshh (Mahhheshh) wants to merge 1 commit intobraintrustdata:mainfrom
Mahhheshh:claude_agent_sdk_imp

Mahhheshh (Mahhheshh) commented Apr 8, 2026

Uh oh!

starfolkai bot commented Apr 8, 2026

Uh oh!

Abhijeet Prasad (AbhiPrasad) commented Apr 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Mahhheshh (Mahhheshh) commented Apr 8, 2026

Uh oh!

starfolkai bot commented Apr 8, 2026

Uh oh!

Abhijeet Prasad (AbhiPrasad) commented Apr 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants