
Add adaptive concurrency filter configuration to DestinationRule#3661

Open
prashanthjos wants to merge 5 commits into istio:master from prashanthjos:adaptive-concurrency

Conversation

@prashanthjos
Contributor

Adds support for configuring Envoy's adaptive concurrency filter through Istio's DestinationRule TrafficPolicy.

The adaptive concurrency filter dynamically adjusts the allowed number of outstanding requests to upstream hosts based on sampled latency measurements, providing automatic concurrency limiting without manual tuning.

Example usage:

apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: my-service-adaptive
spec:
  host: my-service.default.svc.cluster.local
  trafficPolicy:
    adaptiveConcurrency:
      enabled: true
      gradientControllerConfig:
        sampleAggregatePercentile: 50
        concurrencyLimitParams:
          maxConcurrencyLimit: 1000
          concurrencyUpdateInterval: 100ms
        minRttCalcParams:
          interval: 60s
          requestCount: 50
          jitter: 10
          minConcurrency: 3
          buffer: 25
      concurrencyLimitExceededStatus: 503

@prashanthjos prashanthjos requested a review from a team as a code owner March 6, 2026 02:22
@istio-testing istio-testing added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 6, 2026
Adds AdaptiveConcurrency message to the DestinationRule proto, enabling configuration of Envoy's adaptive concurrency filter via TrafficPolicy. The message structure mirrors Envoy's v3 AdaptiveConcurrency proto with GradientControllerConfig, ConcurrencyLimitCalculationParams, and MinimumRTTCalculationParams. Field names and numbers are aligned with the upstream Envoy proto.
@prashanthjos prashanthjos force-pushed the adaptive-concurrency branch from cc2f2a8 to aac1d4b Compare March 6, 2026 02:27
@prashanthjos
Contributor Author

This is the original issue, and this is the proposal document.

// Adaptive concurrency settings for dynamically adjusting the allowed number of
// outstanding requests based on sampled latencies. This enables the Envoy
// adaptive concurrency filter for the destination.
AdaptiveConcurrency adaptive_concurrency = 9;
Member

  • Does this completely override the static max_connections circuit breaker threshold? Should we describe that here so that it's clearer what to expect if both are configured?

  • Is it also worth noting that new connections that exceed the adaptive concurrency threshold won't be reflected in circuit breaker statistics, the way they would when connections exceed the max_connections threshold? (At least that is how I understand it).

Member

Normally when the circuit breaker threshold is exceeded in Envoy, you also have the x-envoy-overloaded header returned as part of the response. Some users might rely on that header, or maybe the client-side Envoy proxy relies on that in some way.

Should we note that the x-envoy-overloaded header won't be present when the adaptive concurrency threshold is exceeded?

Contributor Author

@prashanthjos prashanthjos Mar 6, 2026

  • Does this completely override the static max_connections circuit breaker threshold? Should we describe that here so that it's clearer what to expect if both are configured?
  • Is it also worth noting that new connections that exceed the adaptive concurrency threshold won't be reflected in circuit breaker statistics, the way they would when connections exceed the max_connections threshold? (At least that is how I understand it).

@ericdbishop these are good points. I dug a little deeper into the Envoy code; here is their adaptive concurrency filter. When the filter blocks a request, it does so here, and the request never reaches the router at all:

  if (controller_->forwardingDecision() == Controller::RequestForwardingAction::Block) {
    decoder_callbacks_->sendLocalReply(config_->concurrencyLimitExceededStatus(),
                                       "reached concurrency limit", nullptr, absl::nullopt,
                                       "reached_concurrency_limit");
    return Http::FilterHeadersStatus::StopIteration;
  }

Whereas circuit breakers are checked much later, inside the connection pool when the router tries to establish an upstream connection:

  const bool can_create_connection = host_->canCreateConnection(priority_);
  if (!can_create_connection) {
    host_->cluster().trafficStats()->upstream_cx_overflow_.inc();
  }
  if (!host_->cluster().resourceManager(priority_).pendingRequests().canCreate()) {
    ENVOY_LOG(debug, "max pending streams overflow");
    onPoolFailure(nullptr, absl::string_view(), ConnectionPool::PoolFailureReason::Overflow,
                  context);
    host_->cluster().trafficStats()->upstream_rq_pending_overflow_.inc();
    return nullptr;
  }

So no, adaptive concurrency does not override circuit breakers. If both are configured, a request must pass both checks independently:

  • First, the adaptive concurrency filter checks if num_rq_outstanding < concurrency_limit. If not, the request is rejected immediately with a 503.
  • If it passes, the request continues through the filter chain to the router, where circuit breaker thresholds (max_connections, max_pending_requests, max_requests) are checked independently at the connection pool level.

In practice, whichever limit is more restrictive will be the binding constraint. If the adaptive concurrency limit is lower than max_connections, most requests will be shed by the filter before circuit breakers ever see them.
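The layering above can be sketched as a toy admission check. This is purely illustrative Go: the struct, function, and field names are stand-ins for Envoy's internal state, not real Envoy or Istio APIs.

```go
package main

import "fmt"

// upstreamState is a hypothetical snapshot of one upstream; the field names
// are illustrative and do not correspond to real Envoy identifiers.
type upstreamState struct {
	outstanding    int // requests currently in flight through the filter
	adaptiveLimit  int // limit computed by the gradient controller
	activeConns    int // connections currently open to the upstream
	maxConnections int // static circuit breaker threshold
}

// admit mirrors the layering described above: the adaptive concurrency
// filter runs first and sheds with a 503; only forwarded requests reach the
// connection pool, where the circuit breaker is checked independently.
func admit(s upstreamState) (ok bool, reason string) {
	// Stage 1: adaptive concurrency filter (increments only rq_blocked).
	if s.outstanding >= s.adaptiveLimit {
		return false, "reached_concurrency_limit"
	}
	// Stage 2: circuit breaker in the pool (increments upstream_cx_overflow).
	if s.activeConns >= s.maxConnections {
		return false, "circuit_breaker_overflow"
	}
	return true, ""
}

func main() {
	// Adaptive limit lower than max_connections: the filter sheds first and
	// the circuit breaker never sees the request.
	_, reason := admit(upstreamState{outstanding: 10, adaptiveLimit: 10, activeConns: 5, maxConnections: 100})
	fmt.Println(reason) // reached_concurrency_limit

	ok, _ := admit(upstreamState{outstanding: 3, adaptiveLimit: 10, activeConns: 5, maxConnections: 100})
	fmt.Println(ok) // true
}
```

The two checks never consult each other, which is why whichever limit is lower ends up being the binding constraint.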

When the adaptive concurrency filter blocks a request, it increments only its own stat:

  • adaptive_concurrency.gradient_controller.rq_blocked

Since the request never reaches the router or connection pool, none of the circuit breaker stats are affected:

  • No upstream_cx_overflow_
  • No upstream_rq_pending_overflow_
  • No upstream_rq_retry_overflow_

Someone monitoring only circuit breaker stats would not see that requests are being shed. This makes three key facts explicit:

  • They don't override each other, they're layered
  • They limit different things at different points
  • They have separate stats, so operators need to monitor both

This is Envoy behavior and there is little we can do about it from Istio. In any case, I have updated the comment.

Contributor Author

Normally when the circuit breaker threshold is exceeded in Envoy, you also have the x-envoy-overloaded header returned as part of the response. Some users might rely on that header, or maybe the client-side Envoy proxy relies on that in some way.

Should we note that the x-envoy-overloaded header won't be present when the adaptive concurrency threshold is exceeded?

This is true, the header is not added. I have updated the comment!

Member

The notes you added to the comment look good to me, so at least users are aware of the differences. Thanks!

@prashanthjos
Contributor Author

@ramaraochavali @keithmattix can you please help with the review?

message MinimumRTTCalculationParams {
// The time interval between recalculating the minimum request round-trip time.
// Has to be positive. If set to zero, dynamic sampling of the minRTT is disabled.
google.protobuf.Duration interval = 1;
Contributor

@keithmattix keithmattix Mar 11, 2026

What is the default?

Contributor Author

interval has no default; users must either set interval (to enable dynamic minRTT sampling) or set fixed_value (to use a static minRTT). If neither is set, config validation throws an exception.

// The fixed value for the minRTT. This value is used when minRTT is not sampled
// dynamically. If dynamic sampling of the minRTT is disabled (i.e. `interval` is
// set to zero), this field must be set.
google.protobuf.Duration fixed_value = 6;
Contributor

So at least one of fixed_value and or interval must be set? Should it be a oneof?

Contributor Author

@prashanthjos prashanthjos Mar 13, 2026

I think it's a good suggestion, but these fields are not a oneof in the upstream Envoy v3 proto either. I've kept the structure aligned with Envoy's proto to maintain a clear mapping for the control plane translation layer and updated the field comments instead: if interval is set, fixed_value is ignored.

Happy to switch to a oneof if we prefer a stricter Istio-native API over Envoy alignment. Let me know your preference.

// calculation window. This is represented as a percentage of the interval duration.
// Defaults to 15%. This is recommended to prevent all hosts in a cluster from
// being in a minRTT calculation window at the same time.
google.protobuf.DoubleValue jitter = 3;
Contributor

Does this really need to be a double if, as in the example, folks are setting whole number percentages? Same goes for buffer

Contributor Author

Hmm, fair point. I changed jitter, buffer, and sample_aggregate_percentile from DoubleValue to UInt32Value. These are whole-number percentages (0–100) in practice, and UInt32Value still preserves nullability, so unset values correctly fall back to Envoy's defaults (15%, 25%, and p50 respectively).

Note: Envoy's proto uses type.v3.Percent which wraps a double, but since Istio doesn't import Envoy types and fractional percentages aren't practically useful here, UInt32Value is a better fit and consistent with other fields in this message (request_count, min_concurrency). Let me know what you think.
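The translation this implies for the control plane could look roughly like the following Go sketch. `toEnvoyPercent` is a hypothetical helper, not existing Istio code; it only illustrates the UInt32Value-to-double mapping and the unset-means-default behavior described above.

```go
package main

import "fmt"

// toEnvoyPercent maps Istio's UInt32Value whole-number percentage onto the
// double carried by Envoy's type.v3.Percent. A nil input means "unset": no
// Percent is emitted, so Envoy's own default (e.g. 15% for jitter) applies.
func toEnvoyPercent(v *uint32) (float64, bool) {
	if v == nil {
		return 0, false // leave unset; Envoy falls back to its default
	}
	return float64(*v), true
}

func main() {
	jitter := uint32(15)
	if p, set := toEnvoyPercent(&jitter); set {
		fmt.Println(p) // 15
	}
	if _, set := toEnvoyPercent(nil); !set {
		fmt.Println("unset: Envoy default applies")
	}
}
```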

MinimumRTTCalculationParams min_rtt_calc_params = 3 [(google.api.field_behavior) = REQUIRED];
}

oneof concurrency_controller_config {
Contributor

Why is this a oneof?

Contributor Author

@prashanthjos prashanthjos Mar 13, 2026

It's a oneof in the Envoy v3 proto to allow for future concurrency controller types beyond the gradient controller, but from what I can tell, Envoy has only ever had the gradient controller (since the filter was introduced), so in practice there's only one option. We could still drop the oneof, but if we want to stay aligned with the Envoy proto we can let it stay.

Let me know your thoughts; I am open to changing it to a plain field.

}

// If set to false, the adaptive concurrency filter will operate as a pass-through
// filter. If unspecified, the filter will be enabled.
Contributor

This is going to change every single Envoy deployment to use adaptive concurrency right? That is very likely to throttle people's production traffic

Contributor

@sschepens sschepens Mar 12, 2026

I have another question: does this field even make sense? If I wanted to disable it, wouldn't I simply not specify adaptiveConcurrency at all?

Contributor

+1. I have asked the same question in the design doc also

Contributor Author

+1, agreed. Removed the enabled field. I was closely following Envoy's spec. Digging a little deeper into Envoy, I understood that in Envoy's filter chain model, enabled (as a RuntimeFeatureFlag) allows toggling the filter at runtime without removing the config. Since Istio doesn't expose Envoy's runtime feature flags, the field adds no value here.

Apologies @ramaraochavali I missed your comment in the design doc.

// from sampled request latencies.
message ConcurrencyLimitCalculationParams {
// The allowed upper-bound on the calculated concurrency limit. Defaults to 1000.
google.protobuf.UInt32Value max_concurrency_limit = 2;
Contributor

How would service owners arrive at what is the correct value for this?

Contributor Author

@prashanthjos prashanthjos Mar 13, 2026

I updated the comments to the best of my knowledge :). Unfortunately, this isn't well documented in Envoy either. I would welcome corrections from any experts here and would be happy to change the comments.

google.protobuf.DoubleValue jitter = 3;

// The concurrency limit set while measuring the minRTT. Defaults to 3.
google.protobuf.UInt32Value min_concurrency = 4;
Contributor

What does this indicate and how would service owner know what to set?

Member

This is the lower bound for concurrency (i.e. ensure the adaptive concurrency filter doesn't throttle the concurrency below this threshold). I think most of these descriptions are the same as the Envoy API reference.

The main docs page explains the calculation in more detail, but it is a lot for the user to interpret. Even the defaults for adaptive concurrency used at Lyft were pretty different from the Envoy defaults, and some settings might have to be tuned at the per-service level.

With all of the tuning available, this is overall an advanced feature, maybe this PR should be accompanied by an istio.io docs user guide?

Contributor Author

Yes, we can do that. While merging the implementation PR, I will try my best to do justice to the documentation :).

Contributor

@ramaraochavali ramaraochavali left a comment

We need to give guidance on how a service owner would set some of these values and what the implications of specific values are, with proper examples. This is very hard to configure and get right unless we provide proper documentation and guidance.

@prashanthjos
Contributor Author

Sure @ramaraochavali, we can update the documentation by looking at the code and the available Envoy documentation. I will do that as part of the implementation PR.

