feat(endpoints): Add OpenAI Responses API endpoint with fixes and integration tests #43
Merged
athewsey merged 10 commits into awslabs:main · Apr 15, 2026
Conversation
athewsey (Collaborator) requested changes · Apr 6, 2026
Also, almost forgot: we should add the relevant module placeholder .md under the docs API reference.
… test suite

- Add ResponseEndpoint and ResponseStreamEndpoint classes for OpenAI Responses API support
- Implement non-streaming and streaming response handling with proper error management
- Add structured output support with response format validation and serialization
- Create comprehensive unit test suite covering response parsing, error handling, format validation, model parameters, payload parsing, properties, and serialization
- Add integration tests for Bedrock response endpoint functionality
- Export new response endpoint classes from endpoints module
- Update integration test configuration with response endpoint fixtures
- Rename `max_tokens` to `max_output_tokens` in `create_payload` (Response API parameter name)
- Fix `_parse_response` to handle `usage=None` (Bedrock Mantle) and use `input_tokens`/`output_tokens` with fallback to `prompt_tokens`/`completion_tokens`
- Rewrite `_parse_stream_response` to process typed events (`response.output_text.delta`, `response.completed`) instead of the old chunk-with-output-array format
- Fix `test_response_bedrock.py` to use `ResponseUsage` attribute names (`input_tokens`/`output_tokens`)
- Add integration tests for `ResponseEndpoint` and `ResponseStreamEndpoint`
- Add example notebook for Response API on Bedrock
- Update all unit test mocks to match new behavior
- Rename `ResponseEndpoint` -> `OpenAIResponseEndpoint` and `ResponseStreamEndpoint` -> `OpenAIResponseStreamEndpoint` for consistency with the `OpenAICompletionEndpoint` naming convention
- Change `logger.error()` to `logger.exception()` for stack trace consistency with `bedrock_invoke.py` and `litellm.py`
- Rewrite `test_response_bedrock.py` to test LLMeter endpoint wrappers instead of the raw OpenAI SDK
- Update serialization test assertions for new class names
- Update example notebook references
- Add `docs/reference/endpoints/openai_response.md` placeholder
- Add `openai_response` to the `mkdocs.yml` nav under endpoints
- Update the connect_endpoints user guide to mention Response API endpoints
- Type `invoke()` payload as `CompletionCreateParams` / `ResponseCreateParams`
- Type `create_payload()` return as SDK TypedDicts using `cast()`
- Replace jmespath with a plain list comprehension in `_parse_payload`
- Rewrite stream parsers using typed `ChatCompletionChunk` / event types, removing all `hasattr`/`getattr` fallbacks and `type: ignore` comments
- Make `OpenAIResponseStreamEndpoint` inherit from `OpenAIResponseEndpoint`, deduplicating `_parse_payload` and `create_payload`
- Use `collections.abc.Sequence` instead of `typing.Sequence`
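The jmespath-to-list-comprehension swap can be illustrated with a minimal sketch. The payload shape below is an assumption for illustration, not LLMeter's actual internal format:

```python
# A Responses-style payload with a list of input messages (shape assumed).
payload = {
    "input": [
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hi there"},
        {"role": "user", "content": "Tell me a joke"},
    ]
}

# Before: jmespath.search("input[?role=='user'].content", payload)
# After: a plain list comprehension, with no extra dependency.
user_messages = [
    msg["content"]
    for msg in payload.get("input", [])
    if msg.get("role") == "user"
]
```

Dropping the jmespath dependency keeps the extraction logic visible at the call site and type-checkable.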
Previous rename from Response{Stream}Endpoint to
OpenAIResponse{Stream}Endpoint had missed some corresponding test
class names and mentions in test docstrings.
Claude had originally written separate files for testing 1/ that OpenAI works with Bedrock Mantle endpoints at all, and 2/ that the LLMeter Endpoint worked with this combination. We'd already adjusted the tests in 1/ since we only want to focus on LLMeter-specific aspects, so one of these files was now redundant.
Use OpenAI SDK entities in payload generation and parsing. Clean up typing, including severing responses endpoints from inheriting from ChatCompletions endpoints. Update test stubbing of OpenAI SDK to reflect this separate import pathway.
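The TypedDict-plus-`cast()` approach to payload typing can be sketched like this. The `ResponseCreateParams` class below is a simplified stand-in so the example avoids importing the OpenAI SDK; the real code uses the SDK's own TypedDict, which has many more fields:

```python
from typing import TypedDict, cast


class ResponseCreateParams(TypedDict, total=False):
    """Stand-in for the OpenAI SDK TypedDict of the same name (abbreviated)."""
    model: str
    input: str
    max_output_tokens: int


def create_payload(user_message: str, max_output_tokens: int = 256) -> ResponseCreateParams:
    # cast() tells the type checker this plain dict satisfies the TypedDict;
    # it has no runtime effect.
    return cast(
        ResponseCreateParams,
        {"input": user_message, "max_output_tokens": max_output_tokens},
    )
```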
Tweak OpenAI Responses intro comments on user guide and example notebook for clarity.
athewsey approved these changes · Apr 15, 2026
Summary
Adds OpenAI Responses API endpoint support to LLMeter, with fixes to align with the actual API behavior.
Changes
Endpoint fixes (`llmeter/endpoints/openai_response.py`)

- Rename `max_tokens` to `max_output_tokens` in `create_payload` (Response API parameter name)
- Fix `_parse_response` to handle `usage=None` (Bedrock Mantle doesn't always return it) and use `input_tokens`/`output_tokens` with fallback to `prompt_tokens`/`completion_tokens`
- Rewrite `_parse_stream_response` to process typed events (`response.output_text.delta`, `response.completed`) instead of the old chunk-with-output-array format

Integration tests

- `tests/integ/test_response_endpoint.py`: integration tests for `ResponseEndpoint` and `ResponseStreamEndpoint` wrappers against Bedrock Mantle
- Fix `tests/integ/test_response_bedrock.py` to use `ResponseUsage` attribute names (`input_tokens`/`output_tokens`)

Unit test updates

- Spec-based usage mocks (`input_tokens`/`output_tokens`) and event-based streaming mocks

Example notebook

- `examples/LLMeter with OpenAI Response API on Bedrock.ipynb` demonstrating non-streaming and streaming usage with Runner and plotting

Testing
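As a minimal, hypothetical sketch of what the event-based stream parsing tests exercise: the unit tests mock the SDK's typed stream events rather than the old chunk-with-output-array format. The event classes and `collect_stream_text` helper below are illustrative stand-ins, not the actual LLMeter code:

```python
from dataclasses import dataclass


# Stand-ins for the OpenAI SDK's typed stream events; the shapes here
# are simplified assumptions for illustration.
@dataclass
class OutputTextDeltaEvent:
    delta: str
    type: str = "response.output_text.delta"


@dataclass
class ResponseCompletedEvent:
    type: str = "response.completed"


def collect_stream_text(events) -> str:
    """Accumulate text deltas until a response.completed event arrives,
    mirroring the typed-event parsing approach described in this PR."""
    chunks: list[str] = []
    for event in events:
        if event.type == "response.output_text.delta":
            chunks.append(event.delta)
        elif event.type == "response.completed":
            break
    return "".join(chunks)


# An event-based mock stream, similar in spirit to the unit test mocks.
events = [
    OutputTextDeltaEvent(delta="Hello, "),
    OutputTextDeltaEvent(delta="world!"),
    ResponseCompletedEvent(),
]
assert collect_stream_text(events) == "Hello, world!"
```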