Skip to content

api: Add send_all_clusters field to LRS response.#10613

Merged
htuch merged 15 commits intoenvoyproxy:masterfrom
markdroth:lrs_send_all_clusters
May 1, 2020
Merged

api: Add send_all_clusters field to LRS response.#10613
htuch merged 15 commits intoenvoyproxy:masterfrom
markdroth:lrs_send_all_clusters

Conversation

@markdroth
Copy link
Copy Markdown
Contributor

Signed-off-by: Mark D. Roth roth@google.com

Description: Add a send_all_clusters field to LRS response, triggered by a new client capability. This avoids the need for the server to enumerate the full list of clusters if it always wants data for all clusters.
Risk Level: Low
Testing: N/A
Docs Changes: Included in PR.
Release Notes: N/A

As discussed with @htuch and @jaychenatr.

Signed-off-by: Mark D. Roth <roth@google.com>
@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/.

🐱

Caused by: #10613 was opened by markdroth.

see: more, trace.

@htuch htuch self-assigned this Apr 1, 2020
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, needs fix_format.
/wait

// Not populated if send_all_clusters is true.
repeated string clusters = 1 [(validate.rules).repeated = {min_items: 1}];
// If true, the client should send all clusters it knows about.
// The server may only specify this if the client advertises the
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's safe to set this regardless of the capability, you just might not get back the expected result.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworded to indicate that clients will not honor the field unless they advertise that capability.

// [#not-implemented-hide:] Not configuration. TBD how to doc proto APIs.
message LoadStatsResponse {
// Clusters to report stats for.
// Not populated if send_all_clusters is true.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: *send_all_clusters*

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

:ref:`overprovisioning_factor<envoy_api_field_ClusterLoadAssignment.Policy.overprovisioning_factor>`
field. If graceful failover functionality is required, it must be supplied by the management
server.
- **envoy.lrs.supports_send_all_clusters**: This feature indicates that the client supports
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan on implementing this in Envoy? It's not a ton of work.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's something I can do fairly quickly, I can take a stab at it. Can you point me at the code that would need to be changed?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would mean changing the loop logic in https://github.com/envoyproxy/envoy/blob/master/source/common/upstream/load_stats_reporter.cc#L48, instead of looping over known clusters from LRS and looking them up in cluster manager, we would iterate over all the clusters in cluster manager in the wildcard case.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing; have you added the client feature capability announcement yet?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also added a bullet to the release notes about this.

Signed-off-by: Mark D. Roth <roth@google.com>
Signed-off-by: Mark D. Roth <roth@google.com>
@markdroth
Copy link
Copy Markdown
Contributor Author

I don't understand what's causing the docs CI failure. The ref I used looks like the right format -- in fact, it exactly matches the form used in a couple of places in load_report.proto itself. Is it possible that lrs.proto is simply not being loaded into RST when client_features.rst is being formatted?

@htuch
Copy link
Copy Markdown
Member

htuch commented Apr 2, 2020

@markdroth yeah, https://www.envoyproxy.io/docs/envoy/latest/api-v2/service/load_stats/v2/lrs.proto is empty. This is due to a known issue, we don't have 1st class support for generating docs for service APIs, see #3091.

I'd suggest just writing it out in natural language and leaving a [#comment:] referencing this bug.

Signed-off-by: Mark D. Roth <roth@google.com>
@htuch
Copy link
Copy Markdown
Member

htuch commented Apr 7, 2020

Please merge master to pick up #10672. We no longer accept changes to v2 (without explicit exception), so any API modifications should happen in v3. If this PR is adding a new proto, please follow the updated instructions in https://github.com/envoyproxy/envoy/blob/master/api/STYLE.md#adding-an-extension-configuration-to-the-api.

Signed-off-by: Mark D. Roth <roth@google.com>
Signed-off-by: Mark D. Roth <roth@google.com>
…ters

Signed-off-by: Mark D. Roth <roth@google.com>
@stale
Copy link
Copy Markdown

stale Bot commented Apr 25, 2020

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale Bot added the stale stalebot believes this issue/PR has not been touched recently label Apr 25, 2020
Signed-off-by: Mark D. Roth <roth@google.com>
@stale stale Bot removed the stale stalebot believes this issue/PR has not been touched recently label Apr 27, 2020
Signed-off-by: Mark D. Roth <roth@google.com>
@markdroth
Copy link
Copy Markdown
Contributor Author

@htuch I think everything is working here now, but the coverage CI is still failing. It's not obvious to me from the error message what the problem is. Can you please take a look and advise? Thanks!

// If true, the client should send all clusters it knows about.
// Only clients that advertise the "envoy.lrs.supports_send_all_clusters" capability in their
// :ref:`client_features<envoy_api_field_core.Node.client_features>` field will honor this field.
bool send_all_clusters = 4;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be v3-only, but I imagine this is going to cause you folks pain? @envoyproxy/api-shepherds

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, unfortunately, this is something that we need in v2 for product reasons.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to have to recuse myself and ask the other @envoyproxy/api-shepherds to weigh in on whether an exception can be made here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lizan can we get your opinion on this one as well? Thanks.

:ref:`overprovisioning_factor<envoy_api_field_ClusterLoadAssignment.Policy.overprovisioning_factor>`
field. If graceful failover functionality is required, it must be supplied by the management
server.
- **envoy.lrs.supports_send_all_clusters**: This feature indicates that the client supports
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing; have you added the client feature capability announcement yet?

Comment thread source/common/upstream/load_stats_reporter.cc Outdated
Comment thread test/integration/load_stats_integration_test.cc
@htuch
Copy link
Copy Markdown
Member

htuch commented Apr 28, 2020

@markdroth you might want to merge master before/after making the suggested changes above. Odds are CI will be fixed by then.

…ters

Signed-off-by: Mark D. Roth <roth@google.com>
Signed-off-by: Mark D. Roth <roth@google.com>
Signed-off-by: Mark D. Roth <roth@google.com>
Signed-off-by: Mark D. Roth <roth@google.com>
Signed-off-by: Mark D. Roth <roth@google.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small API comment, thank you!

Comment thread api/envoy/service/load_stats/v2/lrs.proto Outdated
Signed-off-by: Mark D. Roth <roth@google.com>
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo v2 modificaiton discussion.

@markdroth
Copy link
Copy Markdown
Contributor Author

@mattklein123 Any objections to making this change in v2 as well? gRPC is working on a design to migrate to v3, but we'll need this before we can make that happen.

@mattklein123
Copy link
Copy Markdown
Member

@mattklein123 Any objections to making this change in v2 as well? gRPC is working on a design to migrate to v3, but we'll need this before we can make that happen.

I'm OK with it.

@markdroth
Copy link
Copy Markdown
Contributor Author

@lizan Can we get your approval on making this change in v2? Thanks!

@htuch htuch merged commit b0f45ed into envoyproxy:master May 1, 2020
@markdroth markdroth deleted the lrs_send_all_clusters branch May 1, 2020 21:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants