Add adaptive concurrency filter configuration to DestinationRule #3661
prashanthjos wants to merge 5 commits into istio:master
Conversation
Adds AdaptiveConcurrency message to the DestinationRule proto, enabling configuration of Envoy's adaptive concurrency filter via TrafficPolicy. The message structure mirrors Envoy's v3 AdaptiveConcurrency proto with GradientControllerConfig, ConcurrencyLimitCalculationParams, and MinimumRTTCalculationParams. Field names and numbers are aligned with the upstream Envoy proto.
cc2f2a8 to aac1d4b
```proto
// Adaptive concurrency settings for dynamically adjusting the allowed number of
// outstanding requests based on sampled latencies. This enables the Envoy
// adaptive concurrency filter for the destination.
AdaptiveConcurrency adaptive_concurrency = 9;
```
- Does this completely override the static `max_connections` circuit breaker threshold? Should we describe that here so that it's more clear what to expect if both are configured?
- Is it also worth noting that new connections that exceed the adaptive concurrency threshold won't be reflected in circuit breaker statistics, the way they would when connections exceed the `max_connections` threshold? (At least that is how I understand it.)
Normally when the circuit breaker threshold is exceeded in Envoy, you also have the x-envoy-overloaded header returned as part of the response. Some users might rely on that header, or maybe the client-side Envoy proxy relies on that in some way.
Should we note that the x-envoy-overloaded header won't be present when the adaptive concurrency threshold is exceeded?
> - Does this completely override the static `max_connections` circuit breaker threshold? Should we describe that here so that it's more clear what to expect if both are configured?
> - Is it also worth noting that new connections that exceed the adaptive concurrency threshold won't be reflected in circuit breaker statistics, the way they would when connections exceed the `max_connections` threshold? (At least that is how I understand it.)
@ericdbishop these are good points. I dug a little deeper into the Envoy code; here is their adaptive concurrency filter. When the adaptive concurrency filter blocks a request, it does so here, and the request never reaches the router at all:
```cpp
if (controller_->forwardingDecision() == Controller::RequestForwardingAction::Block) {
  decoder_callbacks_->sendLocalReply(config_->concurrencyLimitExceededStatus(),
                                     "reached concurrency limit", nullptr, absl::nullopt,
                                     "reached_concurrency_limit");
  return Http::FilterHeadersStatus::StopIteration;
}
```
Whereas circuit breakers are checked much later, inside the connection pool when the router tries to establish an upstream connection:
```cpp
const bool can_create_connection = host_->canCreateConnection(priority_);
if (!can_create_connection) {
  host_->cluster().trafficStats()->upstream_cx_overflow_.inc();
}
if (!host_->cluster().resourceManager(priority_).pendingRequests().canCreate()) {
  ENVOY_LOG(debug, "max pending streams overflow");
  onPoolFailure(nullptr, absl::string_view(), ConnectionPool::PoolFailureReason::Overflow,
                context);
  host_->cluster().trafficStats()->upstream_rq_pending_overflow_.inc();
  return nullptr;
}
```
So, no: adaptive concurrency does not override circuit breakers. If both are configured, a request must pass both checks independently:
- First, the adaptive concurrency filter checks whether `num_rq_outstanding < concurrency_limit`. If not, the request is rejected immediately with a 503.
- If it passes, the request continues through the filter chain to the router, where the circuit breaker thresholds (`max_connections`, `max_pending_requests`, `max_requests`) are checked independently at the connection pool level.

In practice, whichever limit is more restrictive will be the binding constraint. If the adaptive concurrency limit is lower than `max_connections`, most requests will be shed by the filter before the circuit breakers ever see them.
When the adaptive concurrency filter blocks a request, it increments only its own stat, `adaptive_concurrency.gradient_controller.rq_blocked`. Since the request never reaches the router or connection pool, none of the circuit breaker stats are affected:
- No `upstream_cx_overflow_`
- No `upstream_rq_pending_overflow_`
- No `upstream_rq_retry_overflow_`

Someone monitoring only circuit breaker stats would not see that requests are being shed. This makes three key facts explicit:
- They don't override each other; they're layered.
- They limit different things at different points.
- They have separate stats, so operators need to monitor both.
This is part of Envoy's behavior and there is little we can do in Istio. In any case, I have updated the comment.
> Normally when the circuit breaker threshold is exceeded in Envoy, you also have the `x-envoy-overloaded` header returned as part of the response. Some users might rely on that header, or maybe the client-side Envoy proxy relies on that in some way.
>
> Should we note that the `x-envoy-overloaded` header won't be present when the adaptive concurrency threshold is exceeded?
This is true; the header is not added. I have updated the comment!
The notes you added to the comment look good to me, so at least users are aware of the differences. Thanks!
@ramaraochavali @keithmattix can you please help with the review
```proto
message MinimumRTTCalculationParams {
  // The time interval between recalculating the minimum request round-trip time.
  // Has to be positive. If set to zero, dynamic sampling of the minRTT is disabled.
  google.protobuf.Duration interval = 1;
```
What is the default?
`interval` has no default; users must either set `interval` (to enable dynamic minRTT sampling) or set `fixed_value` (to use a static minRTT). If neither is set, config validation throws an exception.
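Concretely, a user picks one of the two modes. A sketch (YAML field names here are illustrative camelCase renderings of the proto, not the final API):

```yaml
# Mode 1: dynamic minRTT sampling (interval set, fixedValue omitted)
minRttCalcParams:
  interval: 60s
  requestCount: 50
---
# Mode 2: static minRTT (dynamic sampling disabled)
minRttCalcParams:
  fixedValue: 50ms
```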
```proto
// The fixed value for the minRTT. This value is used when minRTT is not sampled
// dynamically. If dynamic sampling of the minRTT is disabled (i.e. `interval` is
// set to zero), this field must be set.
google.protobuf.Duration fixed_value = 6;
```
So at least one of `fixed_value` or `interval` must be set? Should it be a oneof?
I think it's a good suggestion, but these fields are not a oneof in the upstream Envoy v3 proto either. I've kept the structure aligned with Envoy's proto to maintain a clear mapping for the control plane translation layer, and updated the field comments instead: if `interval` is set, `fixed_value` is ignored.
Happy to switch to a oneof if we prefer a stricter Istio-native API over Envoy alignment. Let me know your preference.
```proto
// calculation window. This is represented as a percentage of the interval duration.
// Defaults to 15%. This is recommended to prevent all hosts in a cluster from
// being in a minRTT calculation window at the same time.
google.protobuf.DoubleValue jitter = 3;
```
Does this really need to be a double if, as in the example, folks are setting whole-number percentages? Same goes for `buffer`.
Hmm, fair point. Changed `jitter`, `buffer`, and `sample_aggregate_percentile` from `DoubleValue` to `UInt32Value`. These are whole-number percentages (0–100) in practice, and `UInt32Value` still preserves nullability, so unset values correctly fall back to Envoy's defaults (15%, 25%, and p50 respectively).
Note: Envoy's proto uses `type.v3.Percent`, which wraps a double, but since Istio doesn't import Envoy types and fractional percentages aren't practically useful here, `UInt32Value` is a better fit and consistent with other fields in this message (`request_count`, `min_concurrency`). Let me know what you think.
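For illustration, the revised message could look roughly like this (a sketch; the field numbers for `request_count` and `buffer` are assumed here, not stated in this thread):

```proto
message MinimumRTTCalculationParams {
  // Interval between minRTT recalculations; zero disables dynamic sampling.
  google.protobuf.Duration interval = 1;
  // Requests sampled per minRTT measurement window (assumed field number).
  google.protobuf.UInt32Value request_count = 2;
  // Whole-number percent (0-100); defaults to 15.
  google.protobuf.UInt32Value jitter = 3;
  // Concurrency limit held while measuring the minRTT. Defaults to 3.
  google.protobuf.UInt32Value min_concurrency = 4;
  // Whole-number percent padding on the measured minRTT; defaults to 25
  // (assumed field number).
  google.protobuf.UInt32Value buffer = 5;
  // Static minRTT, used when dynamic sampling is disabled.
  google.protobuf.Duration fixed_value = 6;
}
```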
```proto
MinimumRTTCalculationParams min_rtt_calc_params = 3 [(google.api.field_behavior) = REQUIRED];
}
```

```proto
oneof concurrency_controller_config {
```
It's a oneof in the Envoy v3 proto to allow for future concurrency controller types beyond the gradient controller, but from what I can understand, Envoy has only ever had the gradient controller (since the filter was introduced), so in practice there's only one option. We can still drop the oneof, but if we want to align with the Envoy proto we can let it stay.
Let me know your thoughts; I am open to changing it to a plain field.
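For reference, keeping the Envoy-aligned shape would look roughly like this (sketch only):

```proto
message AdaptiveConcurrency {
  // Mirrors Envoy's v3 filter proto: today the gradient controller is the
  // only concrete option, but the oneof leaves room for future controllers.
  oneof concurrency_controller_config {
    GradientControllerConfig gradient_controller_config = 1;
  }
}
```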
```proto
}
```

```proto
// If set to false, the adaptive concurrency filter will operate as a pass-through
// filter. If unspecified, the filter will be enabled.
```
This is going to change every single Envoy deployment to use adaptive concurrency right? That is very likely to throttle people's production traffic
I have another question: does this field even make sense? If I wanted to disable it, wouldn't I just not specify `adaptiveConcurrency` at all?
+1. I have asked the same question in the design doc also
+1, agreed. Removed the `enabled` field. I was closely following Envoy's spec. Digging a little deeper into Envoy, I understood that in Envoy's filter chain model, `enabled` (as a `RuntimeFeatureFlag`) allows toggling the filter at runtime without removing the config. Since Istio doesn't expose Envoy's runtime feature flags, the field adds no value here.
Apologies @ramaraochavali, I missed your comment in the design doc.
```proto
// from sampled request latencies.
message ConcurrencyLimitCalculationParams {
  // The allowed upper-bound on the calculated concurrency limit. Defaults to 1000.
  google.protobuf.UInt32Value max_concurrency_limit = 2;
```
How would service owners arrive at what is the correct value for this?
I updated the comments to the best of my knowledge :). Unfortunately, this isn't well documented in Envoy either. I'd welcome the experts correcting me here, and I would be happy to change the comments.
```proto
google.protobuf.DoubleValue jitter = 3;
```

```proto
// The concurrency limit set while measuring the minRTT. Defaults to 3.
google.protobuf.UInt32Value min_concurrency = 4;
```
What does this indicate, and how would a service owner know what to set?
This is the lower bound for concurrency (i.e. it ensures the adaptive concurrency filter doesn't throttle concurrency below this threshold). I think most of these descriptions are the same as in the Envoy API reference.
The main docs page explains the calculation in more detail, but it is a lot for the user to interpret. Even the defaults for adaptive concurrency used at Lyft were pretty different from the Envoy defaults, and some settings might have to be tuned at the per-service level.
With all of the tuning available, this is overall an advanced feature, maybe this PR should be accompanied by an istio.io docs user guide?
Yes, we can do that. While merging the implementation PR, I will try my best to do justice to the documentation :).
ramaraochavali left a comment
We need to give guidance on how a service owner would set some of these values and what the implications of a specific value are, with proper examples. This is very hard to configure and get right unless we provide proper documentation and guidance.
Sure @ramaraochavali, we can update the documentation by going through the code and the available Envoy documentation. I will do that as part of the implementation PR.
Adds support for configuring Envoy's adaptive concurrency filter through Istio's DestinationRule TrafficPolicy.
The adaptive concurrency filter dynamically adjusts the allowed number of outstanding requests to upstream hosts based on sampled latency measurements, providing automatic concurrency limiting without manual tuning.
Example usage:
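A sketch of what such a configuration might look like (field names are illustrative camelCase renderings of the proposed proto and may differ from the final API):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews-adaptive-concurrency
spec:
  host: reviews.default.svc.cluster.local
  trafficPolicy:
    adaptiveConcurrency:
      gradientControllerConfig:
        sampleAggregatePercentile: 50   # p50 of sampled latencies (default)
        concurrencyLimitParams:
          maxConcurrencyLimit: 1000
        minRttCalcParams:
          interval: 60s
          requestCount: 50
          jitter: 15        # percent
          buffer: 25        # percent
          minConcurrency: 3
```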