Description
** Please make sure you read the contribution guide and file the issues in the right place. **
Contribution guide.
🔴 Required Information
Please ensure all items in this section are completed to allow for efficient
triaging. Requests without complete information may be rejected / deprioritized.
If an item is not applicable to you, please mark it as N/A.
Is your feature request related to a specific problem?
The feature request is related to Tracing / OpenTelemetry.
Describe the Solution You'd Like
The ADK framework currently supports a variety of tools (MCP, native Python functions, third-party custom tools, etc.). When a tool execution fails for any reason, the framework also supports retries by letting the LLM / agent correct its inputs.
For every tool execution attempt (including failures during the retry loop), trace details are exported using OpenTelemetry. Currently, these trace details only contain verbose error messages that downstream observability applications cannot easily parse, aggregate, or interpret.
In addition, the GenAI Semconv field for error type (error.type) is currently unpopulated in ADK.
To solve this problem, I propose standard, error-code-based reporting of errors in addition to the existing error-message-based reporting. This is in line with the OTel Semconv standards.
Impact on your work
How does this feature impact your work and what are you trying to achieve?
By reporting accurate and easily parsable error types, we can facilitate faster debugging and classification of issues.
Willingness to contribute
Are you interested in implementing this feature yourself or submitting a PR?
Yes
🟡 Recommended Information
Describe Alternatives You've Considered
We could parse the error messages / traces exported by tools, but their format varies with the type of tool and with the tool's developer, so any parser would be fragile.
Proposed API / Implementation
To solve this problem, we propose introducing standard, error-code-based reporting (e.g., HTTP status codes like 400 or 500) for tool failures, in addition to the existing error-message-based reporting. This approach aligns with the OpenTelemetry Semantic Conventions (OTel Semconv) and the expectations for the error.type field in OTel.
This error-code-based reporting can be carried by a new error type, ToolExecutionError, which both new and existing tools can use.
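A minimal sketch of what this could look like, using only the standard library. Everything here except the name ToolExecutionError and the error.type attribute key is a hypothetical placeholder for discussion: the ToolErrorCode values, the error_attributes and run_tool helpers, and the plain dict standing in for span attributes are not existing ADK or OpenTelemetry APIs.

```python
from enum import Enum
from typing import Any, Callable, Dict


class ToolErrorCode(str, Enum):
    """Hypothetical code set; the real list would be settled in the PR."""

    INVALID_ARGUMENT = "400"
    INTERNAL = "500"


class ToolExecutionError(Exception):
    """Tool failure that carries a machine-readable code plus a message."""

    def __init__(self, code: ToolErrorCode, message: str):
        super().__init__(message)
        self.code = code


def error_attributes(exc: BaseException) -> Dict[str, str]:
    """Build the attributes to record for a failed tool call.

    Uses the structured code when the tool raised ToolExecutionError,
    otherwise falls back to the exception class name.
    """
    if isinstance(exc, ToolExecutionError):
        error_type = exc.code.value
    else:
        error_type = type(exc).__qualname__
    return {"error.type": error_type}


def run_tool(
    tool: Callable[..., Any],
    args: Dict[str, Any],
    attributes: Dict[str, str],
) -> Any:
    """Call a tool and record error.type on failure.

    `attributes` is a stand-in for the attributes of the active span.
    The exception is re-raised so the existing retry loop still sees it.
    """
    try:
        return tool(**args)
    except Exception as exc:
        attributes.update(error_attributes(exc))
        raise
```

A tool would then raise ToolExecutionError(ToolErrorCode.INVALID_ARGUMENT, "unknown city") instead of a bare exception, and the tracing layer could group failures by error.type without parsing message text. Tools that are not migrated keep working, since any other exception falls back to its class name.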
Additional Context
Add any other context or screenshots about the feature request here.