feat: support quota mode for BackendTrafficPolicy by yuzisun · Pull Request #7999 · envoyproxy/gateway

yuzisun · 2026-01-20T12:56:52Z

What this PR does / why we need it:
Adds quota mode API support to BackendTrafficPolicy. Quota mode itself is implemented in the Envoy rate limit service in envoyproxy/ratelimit#1045.

Adds a QuotaMode field at the rate limit rule level.
Validation prevents quota mode from being used with local rate limits (only global rate limits are supported).
xDS translation propagates quota_mode to all descriptor types in the rate limit service configuration.
The automatically generated documentation shows the new field.

Release Notes: Yes

netlify · 2026-01-20T12:57:00Z

✅ Deploy Preview for cerulean-figolla-1f9435 ready!

Name	Link
🔨 Latest commit	`0ca3b7d`
🔍 Latest deploy log	https://app.netlify.com/projects/cerulean-figolla-1f9435/deploys/69828ba90aa78d000804fd3a
😎 Deploy Preview	https://deploy-preview-7999--cerulean-figolla-1f9435.netlify.app

zirain · 2026-01-20T13:11:34Z

+	// Only supported for Global Rate Limits.
+	//
+	// +optional
+	QuotaMode *bool `json:"quotaMode,omitempty"`


qq: can you use shadow and quota at the same time?

Yes, see the test at https://github.com/envoyproxy/ratelimit/pull/1045/files#diff-2491de0d8f7753e12d204b647aa71c2e8ab961dd656e76a475966d72e82bd2d4R878.

Global shadow mode overrides the overall response code to OK, while individual descriptor statuses remain accurate (showing which descriptors would be over the limit).

Sorry, what I want to know is: what happens if we set both on one descriptor?

With both modes turned on, only quota violations are recorded and the routing decision is not changed: if the quota is over the limit on backend A, Envoy won't reroute to backend B.

zirain · 2026-01-20T13:12:00Z

 		for mIdx, match := range rule.HeaderMatches {
 			pbDesc := new(rlsconfv3.RateLimitDescriptor)
 			pbDesc.ShadowMode = isRuleShadowMode(rule)
+			pbDesc.QuotaMode = isRuleQuotaMode(rule)


need to bump go-control-plane?

Yes, it looks like go-control-plane needs to be updated first.
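
For readers without the full diff, the hunk above can be sketched in isolation as follows. The `rule` and `descriptor` types are hypothetical stand-ins (the real code uses the Envoy Gateway IR rule and `rlsconfv3.RateLimitDescriptor`, whose `QuotaMode` field depends on the go-control-plane bump discussed here), and the helper signatures are assumed rather than copied from the PR.

```go
package main

import "fmt"

// rule and descriptor are hypothetical, simplified stand-ins for the IR rule
// and the rate limit service descriptor; they are not the real types.
type rule struct {
	ShadowMode *bool
	QuotaMode  *bool
	HeaderKeys []string
}

type descriptor struct {
	Key        string
	ShadowMode bool
	QuotaMode  bool
}

// boolOrFalse treats an unset optional flag as false.
func boolOrFalse(b *bool) bool { return b != nil && *b }

// Assumed shapes for the helpers named in the diff.
func isRuleShadowMode(r rule) bool { return boolOrFalse(r.ShadowMode) }
func isRuleQuotaMode(r rule) bool  { return boolOrFalse(r.QuotaMode) }

// buildDescriptors copies both mode flags onto every descriptor derived
// from the rule, mirroring the two assignments in the hunk.
func buildDescriptors(r rule) []descriptor {
	out := make([]descriptor, 0, len(r.HeaderKeys))
	for _, k := range r.HeaderKeys {
		out = append(out, descriptor{
			Key:        k,
			ShadowMode: isRuleShadowMode(r),
			QuotaMode:  isRuleQuotaMode(r),
		})
	}
	return out
}

func main() {
	on := true
	r := rule{QuotaMode: &on, HeaderKeys: []string{"service_tier", "user_id"}}
	for _, d := range buildDescriptors(r) {
		fmt.Printf("%+v\n", d)
	}
}
```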

arkodg · 2026-01-22T05:46:06Z

 	//
 	// +optional
 	ShadowMode *bool `json:"shadowMode,omitempty"`
+	// QuotaMode indicates whether this rate-limit rule runs in quota mode.


How will this be used in Envoy Gateway? Which metadata will be populated? Does this replace shadow mode?

quotaModeViolations is populated in the metadata with the indices of the descriptors that violated their quotas. It is not going to replace shadow mode: the key difference is that this mode affects routing rather than simply observing, and Envoy is going to use this metadata to do quota-aware routing, which @yanavlasov is implementing on the Envoy side.

Cool, thanks for highlighting this. This feels like a new feature, and piggybacking off the ratelimit API doesn't feel right. Can we rethink what a quota-based routing API would look like?

@arkodg this is the prerequisite: in order to do quota-based routing, we need the rate limit service to populate the dynamic metadata and not reject with 429 when over the limit.
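
The difference between normal mode and quota mode described in this thread can be modeled with a small sketch. This is only an illustration of the behavior as stated in the comments, not the rate limit service's implementation: `descriptorStatus`, `decision`, and `decide` are hypothetical names, and shadow mode is deliberately left out.

```go
package main

import "fmt"

// descriptorStatus is a hypothetical per-descriptor result.
type descriptorStatus struct {
	OverLimit bool
	QuotaMode bool
}

// decision is a hypothetical summary of the overall outcome.
type decision struct {
	Reject              bool  // true means the request would get a 429
	QuotaModeViolations []int // indices of quota-mode descriptors over limit
}

// decide rejects only when a normal-mode descriptor is over its limit.
// An over-limit quota-mode descriptor is recorded by index instead.
func decide(statuses []descriptorStatus) decision {
	var d decision
	for i, s := range statuses {
		if !s.OverLimit {
			continue
		}
		if s.QuotaMode {
			d.QuotaModeViolations = append(d.QuotaModeViolations, i)
			continue
		}
		d.Reject = true
	}
	return d
}

func main() {
	statuses := []descriptorStatus{
		{OverLimit: true, QuotaMode: true}, // recorded, not rejected
		{OverLimit: false},                 // within limit
	}
	fmt.Printf("%+v\n", decide(statuses))
}
```

In this model the recorded indices play the role of the quotaModeViolations metadata that a downstream component could read to make a routing decision.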

Leveraging the ratelimit service to generate the quota decision is an implementation detail. The feature here is quota-based routing, so the APIs in Envoy Gateway should be geared towards that, IMO.

@arkodg see the proposal in envoyproxy/ai-gateway#1813: we use the rate limit dynamic metadata in the response to set the routing header, so that we can route to different endpoint pools when the quota limit is exceeded. So this is something between shadow mode and normal mode that allows extproc or Envoy to make the routing decision by accessing the metadata. It could be a load balancing type on BackendTrafficPolicy in the future, but for now we can use a header to select the endpoint pools. What APIs do you have in mind?

@arkodg are you suggesting making quota configuration independent from the rate limit config?

Reading the doc Dan linked, here's an example of the user-facing API in AI Gateway:

    perModelQuota:
    - modelName: claude-4-sonnet
      costExpression: input_tokens + 3 * output_tokens + 0.1 * cached_input_tokens + 1.25 * cache_creation_input_tokens
      rules:
      - clientSelectors:
        - headers:
          - name: service_tier
            value: reserved
        quotaValue:
          limit: 1M
          duration: 30s
      - clientSelectors:
        - headers:
          - name: service_tier
            value: default
        quotaValue:
          limit: 2M
          duration: 60s
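
To make the `costExpression` in this example concrete, here is a worked sketch of the arithmetic. The `tokenUsage` type and `cost` function are hypothetical; AI Gateway evaluates the expression itself, and this only hard-codes the same weights to show how a single request would be charged against the quota.

```go
package main

import "fmt"

// tokenUsage is a hypothetical container for the four counters that the
// example costExpression references.
type tokenUsage struct {
	InputTokens              int64
	OutputTokens             int64
	CachedInputTokens        int64
	CacheCreationInputTokens int64
}

// cost hard-codes the example expression:
// input_tokens + 3 * output_tokens + 0.1 * cached_input_tokens
//   + 1.25 * cache_creation_input_tokens
func cost(u tokenUsage) float64 {
	return float64(u.InputTokens) +
		3*float64(u.OutputTokens) +
		0.1*float64(u.CachedInputTokens) +
		1.25*float64(u.CacheCreationInputTokens)
}

func main() {
	u := tokenUsage{
		InputTokens:              1000,
		OutputTokens:             200,
		CachedInputTokens:        500,
		CacheCreationInputTokens: 100,
	}
	// 1000 + 3*200 + 0.1*500 + 1.25*100 = 1000 + 600 + 50 + 125
	fmt.Printf("cost: %.2f\n", cost(u))
}
```

With these weights, a request using 1000 input, 200 output, 500 cached input, and 100 cache creation tokens costs 1775 units, which would be counted against the 1M per 30s limit for the `reserved` tier or the 2M per 60s limit for the `default` tier.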

To achieve this, here's the plumbing Envoy AI Gateway needs to do:

Use the EG API to configure its Global RateLimit, which in turn configures Envoy Proxy as well as Envoy RLS, and for this case set quotaMode in the RLS entry (envoyproxy/ratelimit@a28b84d).

Edit the xDS Cluster to enable this feature.

From an Envoy Gateway perspective, Quota Mode is vague because it doesn't provide an end-to-end solution like the one AI Gateway provides; it only sets some fields in the RLS entry, which generates the metadata.

One solution could be to piggyback off ShadowMode, also emit this metadata in that case, and document this.


codecov · 2026-02-01T08:25:28Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.71%. Comparing base (c3f2982) to head (1c28823).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7999      +/-   ##
==========================================
- Coverage   73.71%   73.71%   -0.01%     
==========================================
  Files         241      241              
  Lines       36552    36561       +9     
==========================================
+ Hits        26944    26950       +6     
- Misses       7703     7704       +1     
- Partials     1905     1907       +2


yanavlasov

/wait-any

yanavlasov · 2026-02-04T16:04:59Z

 	//
 	// +optional
 	ShadowMode *bool `json:"shadowMode,omitempty"`
+	// QuotaMode indicates whether this rate-limit rule runs in quota mode.


@arkodg are you suggesting making quota configuration independent from the rate limit config?

github-actions · 2026-03-07T20:02:46Z

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. Please feel free to give a status update now, ping for review, when it's ready. Thank you for your contributions!

yuzisun requested a review from a team as a code owner January 20, 2026 12:56

zirain reviewed Jan 20, 2026


arkodg reviewed Jan 22, 2026


yuzisun added 2 commits February 1, 2026 03:04

feat: support quota mode for BackendTrafficPolicy

90c82dd

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

update go-control-plane

73ede4d

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

yuzisun force-pushed the quota_mode branch from 8f8aa01 to 73ede4d Compare February 1, 2026 08:16

yuzisun added 5 commits February 1, 2026 03:49

fix lint

2031fb5

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

fix codegen

c17dfd7

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Merge branch 'main' into quota_mode

1c28823

Merge branch 'main' into quota_mode

7abcfe5

Merge branch 'main' into quota_mode

0ca3b7d

yanavlasov reviewed Feb 4, 2026


github-actions Bot added the stale label Mar 7, 2026

github-actions Bot closed this Mar 15, 2026


4 participants
