Add responses_format classifier filter for AI API body-aware routing #372

Open
nerdalert wants to merge 1 commit into praxis-proxy:main from nerdalert:brent-agentic-loop-body-classifier

Conversation

@nerdalert
Member

Summary

This is the first PR in a stacked implementation for Epic #354, “Responses API and Agentic Loop Orchestration.” It establishes the body-aware routing foundation that later PRs will build on for Responses orchestration, state, tool calls, and streaming. The description is long, but the framing is worth including here.

  • Adds responses_format built-in HTTP filter that classifies request bodies as Responses API (input), Chat Completions (messages), unknown JSON, invalid JSON, or non-JSON
  • Promotes classification facts (format, model, stream, store, previous_response_id presence, conversation presence) to internal routing headers
    (x-praxis-ai-*), durable metadata, and filter results
  • Carries body-derived filter_results from pre-read body inspection into request-phase branch evaluation so on_result can route using classifier output
  • Does not mutate the request body: uses ReadOnly access with StreamBuffer
  • Bounds body-derived promoted values and suppresses control-character values before header/result promotion
  • on_invalid defaults to continue for mixed-traffic listeners; configurable to reject for AI-only listeners
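
A minimal sketch of the classification and on_invalid behavior summarized above, not the code in this PR. The `Body`, `Format`, `OnInvalid`, and `Action` types and the `classify` and `decide` functions are hypothetical names used for illustration. The sketch assumes the body has already been buffered and inspected, and that `input` is checked before `messages` when both keys are present (the summary does not state a precedence).

```rust
/// Classification outcomes listed in the summary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Responses,
    ChatCompletions,
    Unknown,
    InvalidJson,
    NonJson,
}

/// Hypothetical stand-in for the result of inspecting a buffered body.
pub enum Body<'a> {
    NonJson,
    InvalidJson,
    /// Valid JSON; `top_level_keys` is empty for non-object values.
    Json { top_level_keys: &'a [&'a str] },
}

/// Listener policy from the `on_invalid` option.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OnInvalid {
    Continue,
    Reject,
}

/// What the filter does with the request after classification.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Continue,
    Reject,
}

/// Classify by top-level key: `input` means Responses, `messages` means
/// Chat Completions. Assumption: `input` wins if both are present.
pub fn classify(body: &Body<'_>) -> Format {
    match body {
        Body::NonJson => Format::NonJson,
        Body::InvalidJson => Format::InvalidJson,
        Body::Json { top_level_keys } => {
            if top_level_keys.contains(&"input") {
                Format::Responses
            } else if top_level_keys.contains(&"messages") {
                Format::ChatCompletions
            } else {
                Format::Unknown
            }
        }
    }
}

/// Recognized formats always continue. Everything else follows the
/// listener's `on_invalid` policy.
pub fn decide(format: Format, policy: OnInvalid) -> Action {
    match (format, policy) {
        (Format::Responses | Format::ChatCompletions, _) => Action::Continue,
        (_, OnInvalid::Continue) => Action::Continue,
        (_, OnInvalid::Reject) => Action::Reject,
    }
}
```

With `OnInvalid::Continue`, a mixed-traffic listener forwards unknown, invalid, and non-JSON bodies untouched. With `OnInvalid::Reject`, only Responses and Chat Completions bodies pass.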

Architectural callouts worth mentioning:

  • This PR lays the pre-upstream classification and policy-routing groundwork for a future agentic Responses orchestrator, while keeping the actual model/tool
    loop out of branch re-entry because that lifecycle needs response-aware orchestration.

    • This PR is the routing/classification foundation, not the agent loop itself. It intentionally stops at request-body classification and promotion so later stacked PRs can build orchestration, state, tool execution, and streaming on top.
    • Body-derived filter_results now cross from the pre-read body phase into request-phase branch evaluation. That is an architectural change to the Pingora request context, not just a filter-local change.
    • The filter uses StreamBuffer + ReadOnly: Praxis buffers enough body to classify it, but does not mutate or rewrite the body. That keeps this PR safe for pass-through routing.
    • on_invalid: continue is the default to support mixed-traffic listeners. on_invalid: reject is intended for AI-only listeners and now rejects valid JSON that is not Responses or Chat Completions.
    • Promotion is deliberately split across three surfaces:
      x-praxis-ai-* internal headers for router matching, durable metadata for later lifecycle phases, and filter_results for branch-chain decisions.
    • Body-derived promoted values are bounded/sanitized before header/result promotion. This avoids turning model names or other request fields into oversized or unsafe synthetic headers.
    • This PR relies on existing internal header hygiene: x-praxis-ai-* headers are usable inside Praxis for routing but stripped before upstream.
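
A rough sketch of the bounding/sanitizing and header-hygiene callouts above, under stated assumptions. The 256-byte limit, the `x-praxis-ai-model` header name, and the `promotable`, `promote_model`, and `strip_reserved` helpers are hypothetical. The PR states only that values are bounded, that control-character values are suppressed, and that `x-praxis-ai-*` headers are stripped before upstream.

```rust
/// Hypothetical bound; the PR does not state the actual limit.
const MAX_PROMOTED_LEN: usize = 256;

/// Reserved internal header prefix named in the PR.
const RESERVED_PREFIX: &str = "x-praxis-ai-";

/// Return the value only if it is safe to promote: within the length
/// bound and free of control characters. Unsafe values are suppressed
/// rather than truncated or escaped.
pub fn promotable(value: &str) -> Option<&str> {
    if value.len() > MAX_PROMOTED_LEN {
        return None;
    }
    if value.chars().any(char::is_control) {
        return None;
    }
    Some(value)
}

/// Promote the body-derived model name to an internal routing header.
/// A missing or unsafe value adds nothing.
pub fn promote_model(headers: &mut Vec<(String, String)>, model: Option<&str>) {
    if let Some(value) = model.and_then(promotable) {
        // Hypothetical header name built from the reserved prefix.
        headers.push((format!("{}model", RESERVED_PREFIX), value.to_string()));
    }
}

/// Drop every reserved internal header before the request goes upstream.
pub fn strip_reserved(headers: &mut Vec<(String, String)>) {
    headers.retain(|(name, _)| !name.to_ascii_lowercase().starts_with(RESERVED_PREFIX));
}
```

Suppressing an unsafe value instead of truncating it means a router never matches on a partial model name.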

    Test plan

    • Unit: Responses with string input classified as responses
    • Unit: Responses with item-array input classified as responses
    • Unit: Chat Completions with messages classified as chat_completions
    • Unit: Unknown JSON classified as unknown
    • Unit/integration: Unknown JSON rejects when on_invalid: reject
    • Unit: Non-JSON continues/rejects according to on_invalid
    • Unit/integration: Invalid JSON rejects when on_invalid: reject
    • Unit: All extracted facts promoted to headers, durable metadata, and filter results
    • Unit: Missing optional facts are not promoted
    • Unit/integration: Oversized and control-character model values are not promoted
    • Integration: Route by classifier result (responses vs chat_completions vs default)
    • Integration: Route by promoted model header
    • Integration: Route by promoted stream header
    • Integration: Route by body-derived filter_results via on_result
    • Integration: Reserved x-praxis-ai-* headers stripped before upstream
    • Integration: Body byte-for-byte unchanged after classification
    • Integration: Large body over 64 KiB classified and forwarded
    • Schema: Example config parses (all_example_configs_parse)
    • Functional: Example config routes Responses, Chat Completions, and unknown traffic
    • Full suites: filter, protocol, integration, schema, and nightly fmt pass

    Issues: Responses API format detection and routing #361; partially addresses OpenAI Responses API: stateless pass-through mode #355; part of Epic: Responses API and Agentic Loop Orchestration #354

@nerdalert nerdalert requested a review from a team May 20, 2026 18:45
@praxis-bot
Collaborator

PR too large

This PR adds 2100 lines (limit: 500).

Large PRs are difficult to review and more likely to introduce subtle bugs. Please break this contribution into smaller, focused PRs that each address a single concern.

See our coding conventions for guidance.

If this PR legitimately requires a large change, a maintainer can add the skip/pr-hygiene label to skip this check.

@praxis-bot praxis-bot closed this May 20, 2026
@nerdalert nerdalert added the skip/pr-hygiene PR size and description bypass label May 20, 2026
@nerdalert nerdalert reopened this May 20, 2026
@nerdalert nerdalert force-pushed the brent-agentic-loop-body-classifier branch 3 times, most recently from a908123 to db27174 May 20, 2026 20:15
@shaneutt shaneutt self-assigned this May 20, 2026
@github-project-automation github-project-automation Bot moved this to Backlog in AI Gateway May 20, 2026
@shaneutt shaneutt moved this from Backlog to Review in AI Gateway May 20, 2026
@shaneutt shaneutt added this to the v0.4.0 milestone May 20, 2026
Comment thread filter/src/builtins/http/ai/responses_format/classify.rs
Comment thread filter/src/builtins/http/ai/responses_format/config.rs Outdated
Comment thread filter/src/builtins/http/ai/responses_format/config.rs
Comment thread filter/src/builtins/http/ai/responses_format/mod.rs
Comment thread filter/src/builtins/http/ai/responses_format/mod.rs Outdated
Comment thread filter/src/builtins/http/ai/responses_format/tests.rs
@nerdalert nerdalert force-pushed the brent-agentic-loop-body-classifier branch from 032a241 to bf97d1e May 21, 2026 21:05
@nerdalert nerdalert requested a review from a team May 21, 2026 21:05
Add responses_format body classifier for AI routing

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>
Labels

skip/pr-hygiene PR size and description bypass

Projects

Status: Review

3 participants