Skip to content

Refactor cluster manager init sequence.#33221

Merged
htuch merged 7 commits intoenvoyproxy:mainfrom
yanjunxiang-google:refactor_cluster_manager
Apr 11, 2024
Merged

Refactor cluster manager init sequence.#33221
htuch merged 7 commits intoenvoyproxy:mainfrom
yanjunxiang-google:refactor_cluster_manager

Conversation

@yanjunxiang-google
Copy link
Copy Markdown
Contributor

@yanjunxiang-google yanjunxiang-google commented Mar 29, 2024

This is to address the issue: #33218

Commit Message:
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@yanjunxiang-google yanjunxiang-google marked this pull request as ready for review April 1, 2024 21:23
@yanjunxiang-google
Copy link
Copy Markdown
Contributor Author

/assign @wbpcode @htuch

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The set of dependencies leading to the need to break the reference circle might benefit from some rethinking longer term, but this seems OK to me. @wbpcode you were interested in reviewing this as well.

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Copy link
Copy Markdown
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution. This seems good to me. (Although I cannot ensure it completely resolve our problem, the status is very complex when doing the initialization. But now it at least better then original condition)

Comment thread envoy/upstream/cluster_manager.h Outdated
Comment thread source/server/configuration_impl.cc Outdated
Comment on lines +131 to +134

// clusterManagerFromProto() and init() have to be called consecutively.
cluster_manager_ = cluster_manager_factory.clusterManagerFromProto(bootstrap);
RETURN_IF_NOT_OK(cluster_manager_->init(bootstrap));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this completely resolve our problem. This ensure the server context could always return a valid reference to the cluster manager. But what will happen when the upstream network filter or upstream http filter try to access the actual content of the cluster manager when doing the initialization?

Copy link
Copy Markdown
Contributor Author

@yanjunxiang-google yanjunxiang-google Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments!

  1. When router filter is trying to call processFilters() for upstream filters:

    context.serverFactoryContext().clusterManager(), *upstream_ctx_, prefix);

    it does access context.serverFactoryContext().clusterManager(). IIUC, this is in the much later stage, so the cluster manager instance is already initialized.

  2. In the ClusterInfoImpl c'tor, which is called during cluster_manager_->init()

    factory_context.clusterManager(), upstream_context_, prefix);

factory_.clusterFromProto(cluster, *this, outlier_event_logger_, added_via_api);

It's accessing the cluster_manager_, i.e, *this, before it is initialized. This means ClusterInfoImpl c'tor only needs an instance of cluster manager, and does not require it is initialized.

Unfortunately, for composite filter in upstream chain case, it needs to access the cluster_manager_ from from outside here:

factory_context, server_factory_context.clusterManager(), false, "http", nullptr);

That's the reason we need to make sure the server_factory_context.clusterManager() is setup before it is initialized.

BTW, for composite filter in upstream case, it also only needs an instance of cluster manager, and does not require it is initialized.

@wbpcode
Copy link
Copy Markdown
Member

wbpcode commented Apr 3, 2024

Please check the CI. Then we can try to merge it and see the effect :)

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@wbpcode
Copy link
Copy Markdown
Member

wbpcode commented Apr 8, 2024

LGTM, Thanks for your update. Could you check the CI then we can merge this PR.

@yanjunxiang-google
Copy link
Copy Markdown
Contributor Author

/retest

1 similar comment
@yanjunxiang-google
Copy link
Copy Markdown
Contributor Author

/retest

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@yanjunxiang-google
Copy link
Copy Markdown
Contributor Author

/retest

@yanjunxiang-google
Copy link
Copy Markdown
Contributor Author

CI is complaining below. Should be non-related to this change.

Code coverage for source/common/stats is lower than limit of 96.6 (96.5)

…tor_cluster_manager

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Copy link
Copy Markdown
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. thanks for this contribution. cc @htuch for final pass or merge.

@wbpcode wbpcode assigned htuch and unassigned htuch and wbpcode Apr 10, 2024
@yanjunxiang-google
Copy link
Copy Markdown
Contributor Author

Kind ping!

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@htuch htuch merged commit 42d56bc into envoyproxy:main Apr 11, 2024
mum4k pushed a commit to envoyproxy/nighthawk that referenced this pull request Apr 17, 2024
- sync files
- add `# unique` in .bazelrc to avoid CI failure https://dev.azure.com/cncf/4684fb3d-0389-4e0b-8251-221942316e06/_apis/build/builds/168119/logs/57
- add `cluster_manager_->initialize(bootstrap_);` to fix a CI crash which is introduced by envoyproxy/envoy#33221 

Signed-off-by: Qin Qin <qqin@google.com>
alyssawilk pushed a commit to alyssawilk/envoy that referenced this pull request Apr 29, 2024
This is to address the issue: envoyproxy#33218

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants