[Experimental-Feature] DeliverySpec.RetryAfter#5813
Conversation
Codecov Report
@@ Coverage Diff @@
## main #5813 +/- ##
==========================================
+ Coverage 82.02% 82.23% +0.21%
==========================================
Files 220 220
Lines 7527 7572 +45
==========================================
+ Hits 6174 6227 +53
+ Misses 918 911 -7
+ Partials 435 434 -1
Continue to review full report at Codecov.
|
| // header value will be used when calculating the next backoff duration. This will | ||
| // only be considered when a 429 (Too Many Requests) or 503 (Service Unavailable) | ||
| // response code is received and Retry is greater than 0. | ||
| Enabled bool `json:"enabled"` |
There was a problem hiding this comment.
shouldn't we always respect the Retry-After?
I am not really sure I like the toggling here.
There was a problem hiding this comment.
This PR is stale - based on the original design which has evolved (is evolving in the associated Issue). Yes, everyone was against the "enabled" flag and in the latest design it is gone. I'm waiting to update this PR for the new design to settle - i need to summarize and push for agreement again...
| // - https://www.iso.org/iso-8601-date-and-time-format.html | ||
| // - https://en.wikipedia.org/wiki/ISO_8601 | ||
| // +optional | ||
| MaxDuration *string `json:"maxDuration,omitempty"` |
There was a problem hiding this comment.
Assuming we do not toggle (just thinking out loud) and have respecting Retry-after as default.
Setting MaxDuration: -1 could disable it.
If nothing (or a value greater than 0) is provided we do as state on the comments
🤷
There was a problem hiding this comment.
Yes the idea is that at least in GA it would be always respected and you could opt out by specifying "PT0S" or something similar.
|
/assign @matzew |
|
PR has been refactored to align with current plan and is ready for review! |
There was a problem hiding this comment.
awesome piece of code.
added minor comments and learned some stuff about testing at eventing, thanks for that.
Could we add to the docs PR (not sure if also to the API) that in case the backoff + retry-after competing for setting the duration, the largest is honored?
eventing/pkg/kncloudevents/message_sender.go
Lines 143 to 148 in 6bb059a
| retryConfig.RequestTimeout, _ = timeout.Duration() | ||
| } | ||
|
|
||
| if spec.RetryAfterMax != nil { |
There was a problem hiding this comment.
Not sure if this is too picky but I think it would be a nice policy to re-check the feature enablement here. If an admin enables a feature, then disable it and restart the controller, I think it would be expected that this value is not used.
This would be important for security related features, probably not for this one. Leaving the comment here for your consideration.
There was a problem hiding this comment.
While this might be nice to do it is a bit problematic. The goal was to do all validation of the experimental-feature in the Webhook in order to avoid requiring downstream implementations from having to load/watch the config-features ConfigMap into their context and then plumb that context through. I suppose we could build a separate ConfigMap lookup inline here, but that feels heavyweight and non-standard with other ConfigMap tracking logic. I'm inclined to not recheck the feature flags here but we can let others weigh in?
There was a problem hiding this comment.
I think you are right, and I didn't thought of other implementations.
On my side, this can be resolved.
There was a problem hiding this comment.
While this might be nice to do it is a bit problematic. The goal was to do all validation of the experimental-feature in the Webhook in order to avoid requiring downstream implementations from having to load/watch the
config-featuresConfigMap into their context and then plumb that context through. I suppose we could build a separate ConfigMap lookup inline here, but that feels heavyweight and non-standard with other ConfigMap tracking logic. I'm inclined to not recheck the feature flags here but we can let others weigh in?
I think we actually must check if the feature is enabled. New experimental API fields will be stored as unknown fields if the feature is disabled and the webhook will ignore them. This means in a "enable-disable" scenario, this code will find a valid value for the RetryAfterMax and will honor it, even thought the feature is disabled.
I believe experimental features are designed to be part of eventing core as:
- API changes that're optionally implemented in alternative implementations.
- A change in the reference implementation. This is to validate the feature, as well as provide a reference implementation for other alternative implementations.
That means, that by design, if an alternative implementation will adopt an experimental feature, it needs to watch the features CM and provide its own implementation. Otherwise, it can wait until the graduation of the feature and becomes "on by default".
|
Well, that's my opinion, if folks feel confident in reviewing big PRs,
go for it.
Lines of code are not a problem, it's the scope.
…On Thu, Nov 18, 2021 at 7:13 PM Ben Moss ***@***.***> wrote:
I don't think we need to split this up
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub, or unsubscribe.
--
Pierangelo Di Pilato
Software Engineer
Red Hat, Inc
https://www.redhat.com/
|
|
@lionelvillard - are you still planning on reviewing? I realize it's a holiday week so it can wait till next week if you're out - just checking ; ) |
|
Can you replace /unhold |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: lionelvillard, odacremolbap, travis-minke-sap The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
yep - good call - thanks! |
|
The following is the coverage report on the affected files.
|
|
/retest |
|
/lgtm |
There was a problem hiding this comment.
This doesn't work for a Channel-based Broker since the filter handler doesn't propagate response headers back to the channel.
eventing/pkg/broker/filter/filter_handler.go
Lines 201 to 307 in 820db20
|
@pierDipi can you open an issue? |
|
Done |
devguyio
left a comment
There was a problem hiding this comment.
I believe we need a followup to check if the feature is enabled. This current behavior will be problematic in case of enabling then disabling the feature.
| retryConfig.RequestTimeout, _ = timeout.Duration() | ||
| } | ||
|
|
||
| if spec.RetryAfterMax != nil { |
There was a problem hiding this comment.
While this might be nice to do it is a bit problematic. The goal was to do all validation of the experimental-feature in the Webhook in order to avoid requiring downstream implementations from having to load/watch the
config-featuresConfigMap into their context and then plumb that context through. I suppose we could build a separate ConfigMap lookup inline here, but that feels heavyweight and non-standard with other ConfigMap tracking logic. I'm inclined to not recheck the feature flags here but we can let others weigh in?
I think we actually must check if the feature is enabled. New experimental API fields will be stored as unknown fields if the feature is disabled and the webhook will ignore them. This means in a "enable-disable" scenario, this code will find a valid value for the RetryAfterMax and will honor it, even thought the feature is disabled.
I believe experimental features are designed to be part of eventing core as:
- API changes that're optionally implemented in alternative implementations.
- A change in the reference implementation. This is to validate the feature, as well as provide a reference implementation for other alternative implementations.
That means, that by design, if an alternative implementation will adopt an experimental feature, it needs to watch the features CM and provide its own implementation. Otherwise, it can wait until the graduation of the feature and becomes "on by default".
This PR addresses parts of #5811 by creating the experimental-feature in the Alpha stage.
TLDR = New experimental feature allowing users to opt-in to respecting the Retry-After header when calculating the appropriate backoff duration for 429 and 503 responses.
Proposed Changes
delivery-retryafterfeature flag.DeliverySpecwith new optionalRetryAfterMaxcomponentDeliverySpec.RetryAfterMaxusage against feature flag in WebHook validationDeliverySpec.RetryAfterMaxto Subscription Controller/Reconciler updating ofChannel.Spec.SubscribersRetryConfigstruct to include RetryAfterMax configuration.RetryConfigFromDeliverySpec()to parseDeliverySpec.RetryAfterMaxintoRetryConfig.SendWithRetries()to consider Retry-After headers andRetryConfigsettings when calculating backoff durations.Open Questions
pkg/reconciler/subscription/subscription_test.goto cover the changes in the Subscription controller. This introduced the need to have the test be aware of experimental-features. I think this a win but wanted to confirm it was acceptable (ie - no need to keep the controller test "pure").v1beta1version of the DeliverySpec without any experimental-feature gates. I wasn't sure of the rationale for doing so and have NOT done so in this PR. If someone could explain why we'd wan to do this I'll be happy to include it.Pre-review Checklist
Release Note
Docs
knative/docs#4361
Holding for
Experimental-Features review process & refactoring approachreview/hold