-
Notifications
You must be signed in to change notification settings - Fork 5.4k
cluster: copy active cluster hosts during warming #6021
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
Show all changes
16 commits
Select commit
Hold shift + click to select a range
b21eb32
copy active cluster hosts
ramaraochavali e8aeaf6
revert unnecessary change
ramaraochavali a17dcb0
Merge branch 'master' into fix/empty_cla
ramaraochavali 2e6f53b
fix compilation after merge
ramaraochavali 2d472ea
address review feedback
ramaraochavali 19fe1f4
add test for tls update verification
ramaraochavali c3bbf68
wip
ramaraochavali f42f1e1
Merge branch 'master' into fix/empty_cla
ramaraochavali 398f37a
handle empty host EDS update
ramaraochavali f0c1da9
simplify condition and docs
ramaraochavali 95dc0c4
simplify condition and docs
ramaraochavali c932fc8
test docs
ramaraochavali c72f8d4
format
ramaraochavali f92ad64
rename method and add more docs
ramaraochavali 899b126
add link to xds protocol clause
ramaraochavali e968017
formatting
ramaraochavali File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ramaraochavali @snowp @htuch you all are clearly fine with this, so I'm not understanding something, but why does an EDS response for any other cluster cause an empty update for this cluster? Isn't that a bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I think I understand, we are looking at this text in the XDS protocol docs: "When a requested resource is missing in a RDS or EDS update, Envoy will retain the last known value for this resource." Right?
I guess this seems OK, though I do wonder if this clause is adding extra complexity that we don't really need, especially with incremental coming. Meaning, it seems like your management server could not accept new connections until it's ready to serve them and we can avoid all of this extra code, potential for bugs, behavior differences, etc. That would be my preference.
If we do decide to keep this, can we spell out the clause we are relying on here in the docs and link to them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@htuch knows better but IIRC it is required for initial loading otherwise Envoy was struck at initialization - I tried to change that behaviour at some point via #4276 but got a regression #4485 and reverted it via #4490
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
I can add the clause in the doc, but my preference would be to fix this on Envoy side - it should not clear hosts when it gets EDS response for some other cluster. For EDS/RDS we are not mandating to send all responses every time (which is the right thing to do). Things may change with incremental but we can deal with it when it comes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ramaraochavali we just discussed this issue briefly on the community call. I think we might need to have a meeting on this. Given that chat, I'm still pretty uncomfortable with this change for 2 reasons:
I know @htuch as some thoughts here so will let him weigh in and then we can maybe do a meeting if needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe one way to simplify this discussion is to come up with a bunch of user stories, e.g. "As an Envoy user, I want to upgrade a service from HTTP to HTTPS.. the recommended steps are ...".
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this and it seems potential inconsistency can not be avoided in all cases.
I think we need to think about a short term solution and longer term solutions like immutable resources etc
For short term we have couple of solutions
Do we have any other ideas?
Longer term, for sure we should come up with bunch of user stories and think about solutions like immutable resources etc as @htuch mentioned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ramaraochavali take a look at a convo I had with @htuch in Envoy slack: https://envoyproxy.slack.com/archives/CEFDKQ3RQ/p1551391835017700
I'm pretty strongly of the opinion that we should not finish init/warming until we explicitly receive a named response for an EDS fetch. I think this is the most clear solution and IMO the behavior that most people would expect.
@htuch brings up the valid point that this may not work in all cases if the hosts have changed, but IMO this is very rare, and we should offer guidance in the XDS docs that this edge case should be handled by a cluster rename and correct sequencing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattklein123 @htuch @snowp Created this doc https://docs.google.com/document/d/1Ca4YX9XNnV9rjTR7U0_xyCxUDKfoSun8pWSUNO9N-tY/edit?usp=sharing so that we can discuss there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. So since we got consensus on finishing warming only on named responses, I am going to close this PR and open another PR with a fix.