
Llm response token counts #1682

Draft
umaannamalai wants to merge 7 commits into main from llm-response-token-counts

Conversation

@umaannamalai
Contributor

@umaannamalai umaannamalai commented Mar 10, 2026

This PR includes changes to send token counts pulled from LLM response objects directly from the agent.
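Pulling token counts "directly from LLM response objects" typically means reading the provider's usage metadata off the response and normalizing it. A minimal sketch, assuming an OpenAI-style `usage` dict; the helper name `extract_token_counts` is illustrative only, not the PR's actual code:

```python
# Hypothetical sketch: normalize a provider's usage metadata into the
# three token-count values an agent would report on an LLM event.
def extract_token_counts(usage):
    """Map a usage dict to prompt/completion/total token counts.

    Missing values default to 0 so partial usage data is still reported.
    """
    prompt = usage.get("prompt_tokens", 0) or 0
    completion = usage.get("completion_tokens", 0) or 0
    # Fall back to the sum when the provider omits an explicit total.
    total = usage.get("total_tokens", prompt + completion)
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": total,
    }
```

Keeping the normalization in one helper lets each provider hook (OpenAI, Bedrock, Gemini) feed its own response shape through the same reporting path.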

@github-actions

github-actions bot commented Mar 10, 2026

MegaLinter analysis: Success

| Descriptor | Linter       | Files | Fixed | Errors | Warnings | Elapsed time |
|------------|--------------|-------|-------|--------|----------|--------------|
| ✅ ACTION   | actionlint   | 8     | -     | 0      | 0        | 1.05s        |
| ✅ MARKDOWN | markdownlint | 7     | 0     | 0      | 0        | 1.45s        |
| ✅ PYTHON   | ruff         | 1030  | 0     | 0      | 0        | 1.11s        |
| ✅ PYTHON   | ruff-format  | 1030  | 0     | 0      | 0        | 0.41s        |
| ✅ YAML     | prettier     | 19    | 0     | 0      | 0        | 1.66s        |
| ✅ YAML     | v8r          | 19    | -     | 0      | 0        | 6.29s        |
| ✅ YAML     | yamllint     | 19    | -     | 0      | 0        | 0.75s        |

See detailed reports in MegaLinter artifacts


@mergify mergify bot added the tests-failing Tests failing in CI. label Mar 10, 2026
@codecov-commenter

codecov-commenter commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 90.32258% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.92%. Comparing base (7343f58) to head (35c85ad).

Files with missing lines Patch % Lines
newrelic/hooks/mlmodel_gemini.py 67.39% 7 Missing and 8 partials ⚠️
newrelic/hooks/external_botocore.py 97.32% 0 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1682      +/-   ##
==========================================
+ Coverage   81.85%   81.92%   +0.07%     
==========================================
  Files         214      214              
  Lines       25683    25813     +130     
  Branches     4076     4090      +14     
==========================================
+ Hits        21022    21147     +125     
- Misses       3264     3265       +1     
- Partials     1397     1401       +4     

☔ View full report in Codecov by Sentry.


@mergify mergify bot added tests-failing Tests failing in CI. and removed tests-failing Tests failing in CI. labels Mar 11, 2026
umaannamalai and others added 7 commits March 27, 2026 15:58
* Add OpenAI token counts.

* Add token counts to langchain + openai tests.

* Remove unused expected events.

* Linting

* Add OpenAI token counts.

* Add token counts to langchain + openai tests.

* Remove unused expected events.

* [MegaLinter] Apply linters fixes

---------

Co-authored-by: Tim Pansino <timpansino@gmail.com>
* Add bedrock token counting.

* [MegaLinter] Apply linters fixes

* Add bedrock token counting.

* Add safeguards when grabbing token counts.

* Remove extra None defaults.

* Cleanup default None checks.

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* Add response token count logic to Gemini instrumentation.

* Update token counting util functions.

* Linting

* Add response token count logic to Gemini instrumentation.

* Update token counting util functions.

* [MegaLinter] Apply linters fixes

* Bump tests.

---------

Co-authored-by: Tim Pansino <timpansino@gmail.com>
@TimPansino TimPansino force-pushed the llm-response-token-counts branch from a6e4630 to 35c85ad Compare March 27, 2026 22:58
@mergify mergify bot removed the tests-failing Tests failing in CI. label Mar 27, 2026

@JiwaniZakir JiwaniZakir left a comment


In create_chat_completion_message_event, the all_token_counts parameter is added to the signature and checked for truthiness, but its value is never actually used — both the request message (line ~231) and response message (line ~274) unconditionally set "token_count": 0 whenever the flag is truthy. This looks like a bug where the actual per-message token count should be extracted from all_token_counts rather than hardcoded to zero. If all_token_counts is a dict or list keyed by message index, the lookup logic is missing entirely.
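A minimal sketch of the lookup the comment suggests, assuming `all_token_counts` is a dict keyed by message index as the reviewer hypothesizes (the helper name `message_token_count` is hypothetical, not code from the PR):

```python
# Hypothetical fix for the hardcoded-zero bug: resolve each message's
# count from all_token_counts instead of emitting "token_count": 0.
def message_token_count(all_token_counts, index):
    """Return the token count for the message at `index`.

    Returns None when no counts were captured, so the caller can omit
    the token_count attribute entirely rather than report a false 0.
    """
    if not all_token_counts:
        return None
    # Unknown indices fall back to 0 rather than raising KeyError.
    return all_token_counts.get(index, 0)
```

With this shape, the request and response message events would each call `message_token_count(all_token_counts, i)` in place of the unconditional `"token_count": 0`.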

Additionally, extract_bedrock_titan_embedding_model_response sets only response.usage.total_tokens (equal to inputTextTokenCount), while the text model response handler sets all three — prompt_tokens, completion_tokens, and total_tokens. The inconsistency is reasonable for embeddings (no output), but it would be worth a comment clarifying the intentional omission of prompt_tokens and completion_tokens to avoid confusion for future maintainers.
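The clarifying comment the reviewer asks for could look like the following sketch, assuming a Titan embedding response body carrying `inputTextTokenCount` as described above (the function body is illustrative, not the actual hook code):

```python
# Hypothetical sketch of the embedding-response handler with the
# intentional omission documented inline.
def extract_embedding_token_counts(response_body):
    attrs = {}
    input_tokens = response_body.get("inputTextTokenCount")
    if input_tokens is not None:
        # Embedding responses produce no completion, so prompt_tokens and
        # completion_tokens are intentionally omitted; only the total
        # (equal to the input token count) is meaningful here.
        attrs["response.usage.total_tokens"] = input_tokens
    return attrs
```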

In extract_bedrock_titan_text_model_streaming_response, the accumulation pattern (bedrock_attrs.get(..., 0) + value) assumes invocation metrics may appear across multiple chunks, but amazon-bedrock-invocationMetrics is typically only present in the final chunk — so for most chunks prompt_tokens and completion_tokens will be 0, meaning the accumulation adds unnecessary overhead without risk of double-counting. A guard checking whether the metrics key exists before updating would make the intent clearer.
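The guard the reviewer proposes might be sketched like this, assuming the metrics payload uses field names such as `inputTokenCount`/`outputTokenCount` (those names are assumptions; only the `amazon-bedrock-invocationMetrics` key comes from the review):

```python
# Hypothetical sketch: only touch the token-count attributes when a
# chunk actually carries amazon-bedrock-invocationMetrics, making the
# "final chunk only" expectation explicit.
def accumulate_invocation_metrics(bedrock_attrs, chunk):
    metrics = chunk.get("amazon-bedrock-invocationMetrics")
    if metrics is None:
        # Most streaming chunks carry no metrics; skip them entirely
        # instead of adding 0 on every chunk.
        return bedrock_attrs
    bedrock_attrs["prompt_tokens"] = (
        bedrock_attrs.get("prompt_tokens", 0) + metrics.get("inputTokenCount", 0)
    )
    bedrock_attrs["completion_tokens"] = (
        bedrock_attrs.get("completion_tokens", 0) + metrics.get("outputTokenCount", 0)
    )
    return bedrock_attrs
```

The accumulation is kept (rather than plain assignment) so behavior is unchanged even if metrics ever appeared in more than one chunk.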

