feat: wire plumbing — real query data, ProxyError, streaming body limit#59
Conversation
Wire LocalGatewayQuery to return real data:
- Router.list_routes() exposes all routes as RouteSnapshot with backend
health scores and draining status
- CircuitBreakerManager.snapshot_all() and open_count() iterate the
ArcSwap breaker map
- RateLimiter.snapshot_all() and exhausted_count() iterate the ArcSwap
bucket map, expose TokenBucket.capacity() and refill_rate()
- LocalGatewayQuery.snapshot() now reports real open_circuits and
exhausted_rate_limiters counts
- list_routes/list_circuit_breakers/list_rate_limiters support filtering
and return sorted results
- diagnose() passes real route/CB/RL data to the diagnostics engine
Wire ProxyError through the proxy (eliminates string matching):
- forward_to_backend returns Result<..., ProxyError> instead of String
- Timeout errors use ProxyError::Timeout (not "TIMEOUT:" prefix)
- Body limit errors use ProxyError::BodyTooLarge
- record_request_metrics uses e.status_code()/e.status_str() instead
of e.starts_with("TIMEOUT:")
- error_to_response takes ProxyError, maps via status_code()
- From<String> impl provides backward compat for legacy error sites
Streaming body limit (prevents OOM):
- Use http_body_util::Limited to enforce 10MB cap during streaming
- Body reading stops at the limit instead of collecting everything first
- Returns 413 (ProxyError::BodyTooLarge) without buffering oversized data
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR wires the control-plane “query” surfaces (status/routes/diagnose) to real in-process gateway state, replaces stringly-typed proxy errors with a structured ProxyError, and enforces request body size limits while streaming to prevent buffering large payloads.
Changes:
- Add snapshot/list APIs on
Router,CircuitBreakerManager, andRateLimiterto expose live route/CB/RL state to admin/CLI/MCP queries. - Introduce
ProxyErrorwith HTTP status mapping, and propagate it through request handling/forwarding/metrics. - Enforce a 10MB request body cap during streaming reads using
http_body_util::Limited.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| control/src/proxy/router.rs | Adds list_routes() to expose configured routes as agent-api snapshots including backend draining/health hints. |
| control/src/proxy/request_handler.rs | Switches handler pipeline to ProxyError, updates metrics/error mapping, and keeps retry logic working under typed errors. |
| control/src/proxy/rate_limiter.rs | Exposes token bucket capacity/refill rate and adds snapshot_all() + exhausted_count() for admin queries. |
| control/src/proxy/forwarder.rs | Uses ProxyError and enforces streaming body size limit with http_body_util::Limited. |
| control/src/proxy/circuit_breaker.rs | Adds snapshot_all() + open_count() for admin status/diagnostics. |
| control/src/error.rs | Introduces ProxyError enum with status mapping and legacy From<String> conversion. |
| control/src/admin/local_query.rs | Replaces placeholder query responses with live router/CB/RL data and adds filtering/sorting. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let err_str = e.to_string(); | ||
| if err_str.contains("length limit exceeded") { |
There was a problem hiding this comment.
Body-size limit detection is currently based on e.to_string().contains("length limit exceeded"), which is brittle and may break if the error message changes. Prefer matching on the concrete error type/variant returned by http_body_util::Limited (or using an API like is_limit_exceeded if available) rather than string inspection.
| let err_str = e.to_string(); | |
| if err_str.contains("length limit exceeded") { | |
| // Detect body-size limit violations using the concrete error type | |
| // produced by `http_body_util::Limited`, rather than inspecting | |
| // the error message string. | |
| if e.is::<http_body_util::LengthLimitError>() { |
There was a problem hiding this comment.
Fixed in 589fa15 — now uses e.is::<http_body_util::LengthLimitError>() for type-safe detection instead of string matching.
| "Request body exceeded size limit" | ||
| ); | ||
| return ProxyError::BodyTooLarge { | ||
| size: MAX_BODY_SIZE + 1, |
There was a problem hiding this comment.
ProxyError::BodyTooLarge is constructed with size: MAX_BODY_SIZE + 1, which is not the actual received size and can be misleading in logs/API responses. If the limiting layer doesn’t expose the real size, consider reporting size: MAX_BODY_SIZE (or omitting/setting to 0/None) and keeping max as the authoritative limit.
| size: MAX_BODY_SIZE + 1, | |
| size: MAX_BODY_SIZE, |
There was a problem hiding this comment.
Fixed in 589fa15 — now reports size: 0 since the actual received size is unknown when streaming was aborted at the limit. The max field still shows the configured cap.
| Err(ref e) => { | ||
| let status_code = e.status_code(); | ||
| let status_str = e.status_str(); | ||
|
|
||
| HTTP_REQUESTS_TOTAL |
There was a problem hiding this comment.
With ProxyError now carrying client/proxy-side conditions (e.g., BodyTooLarge = 413), treating every Err(e) uniformly as a backend error can skew metrics/health. Consider branching on the ProxyError variant here so client-side errors don’t get counted the same way as backend failures (and aren’t later attributed to backend health/circuit breaker).
There was a problem hiding this comment.
Fixed in 589fa15 — BodyTooLarge errors now skip record_backend_response and circuit_breaker.record_failure. Only backend-originated errors (Timeout, BackendError) affect health metrics.
| /// Allow `?` on String errors (legacy compatibility during migration) | ||
| impl From<String> for ProxyError { | ||
| fn from(s: String) -> Self { | ||
| ProxyError::BackendError { message: s } |
There was a problem hiding this comment.
From<String> for ProxyError maps all legacy String errors to BackendError (502). This can misclassify internal/proxy errors that still use String with ?, leading to incorrect HTTP status/metrics. Consider removing this blanket conversion once migration is complete, or narrowing it (e.g., only for known backend-connection strings) and updating call sites to construct the correct ProxyError variant explicitly.
| /// Allow `?` on String errors (legacy compatibility during migration) | |
| impl From<String> for ProxyError { | |
| fn from(s: String) -> Self { | |
| ProxyError::BackendError { message: s } | |
| /// Allow `?` on String errors (legacy compatibility during migration). | |
| /// Generic `String` errors are treated as internal/filter failures (500), | |
| /// not as backend connection errors (502), to avoid misclassification. | |
| impl From<String> for ProxyError { | |
| fn from(s: String) -> Self { | |
| ProxyError::FilterError { message: s } |
There was a problem hiding this comment.
Added transitional docs in 589fa15. The From<String> impl is explicitly marked as a migration shim with a note that callers should construct specific variants directly. Will be removed once all error sites are migrated.
| let retry_req = match retry_builder.body(Full::new(Bytes::new())) { | ||
| Ok(r) => r, | ||
| Err(e) => { | ||
| last_result = Err(format!("Failed to build retry request: {}", e)); | ||
| last_result = Err(ProxyError::BackendError { | ||
| message: format!("Failed to build retry request: {}", e), |
There was a problem hiding this comment.
When rebuilding the retry request, ensure the retry URI preserves the original request’s query string. Currently the retry path construction earlier in this block uses path (from req.uri().path()), so retries can silently drop ?query=... parameters and change GET/HEAD semantics.
There was a problem hiding this comment.
Fixed in 589fa15 — retry URI now uses path_and_query (from req.uri().path_and_query()) instead of just path. Query parameters like ?foo=bar are preserved across retries.
- Use LengthLimitError type check instead of string matching for body limit detection (fixes brittle e.to_string().contains() pattern) - Report size: 0 in BodyTooLarge since actual size is unknown when streaming was aborted at the limit - Don't count BodyTooLarge (client error) against backend health/circuit breaker — only backend-originated errors affect health metrics - Preserve query string in retry URIs (was silently dropping ?query=...) - Add transitional docs to From<String> for ProxyError Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
rauta status,rauta routes list,rauta diagnosenow return live gateway state instead of empty placeholderse.starts_with("TIMEOUT:")string matching — errors are now typed with proper HTTP status codeshttp_body_util::Limitedenforces 10MB cap during reading, not after full collection (prevents OOM)What changed
Router — new
list_routes()method returnsVec<RouteSnapshot>with backend health scores and draining statusCircuitBreakerManager — new
snapshot_all()andopen_count()for iterating all breakers via ArcSwapRateLimiter — new
snapshot_all(),exhausted_count(),TokenBucket::capacity(),TokenBucket::refill_rate()LocalGatewayQuery — fully wired:
snapshot()reports realopen_circuitsandexhausted_rate_limiterslist_routes()/list_circuit_breakers()/list_rate_limiters()return real data with filteringdiagnose()feeds real route/CB/RL data to the diagnostics engineProxyError — replaces
Result<..., String>inforward_to_backendandhandle_request:ProxyError::Timeout→ 504ProxyError::BackendError→ 502ProxyError::BodyTooLarge→ 413From<String>impl for backward compatBody streaming —
http_body_util::Limited::new(body, MAX_BODY_SIZE)wraps the body stream, aborting read at 10MB instead of buffering everything firstTest plan
curl localhost:9091/api/v1/statusshows real open_circuits countcurl localhost:9091/api/v1/routesshows configured routes🤖 Generated with Claude Code