10 changes: 10 additions & 0 deletions include/envoy/upstream/upstream.h
@@ -354,6 +354,11 @@ class PrioritySet {
*/
virtual const std::vector<HostSetPtr>& hostSetsPerPriority() const PURE;

/**
* @return true if the priority set does not have any hosts in any priority.
*/
virtual bool empty() const PURE;

/**
* Parameter class for updateHosts.
*/
@@ -771,6 +776,11 @@ class Cluster {
* @return the const PrioritySet for the cluster.
*/
virtual const PrioritySet& prioritySet() const PURE;

/**
* @return true if this cluster was initialized by an empty config update.
*/
virtual bool initializedByEmptyConfig() const PURE;
};

typedef std::shared_ptr<Cluster> ClusterSharedPtr;
43 changes: 40 additions & 3 deletions source/common/upstream/cluster_manager_impl.cc
@@ -473,18 +473,55 @@ bool ClusterManagerImpl::addOrUpdateCluster(const envoy::api::v2::Cluster& clust
cluster_warming_cb(cluster_name, ClusterWarmingState::Starting);
cluster_entry->cluster_->initialize([this, cluster_name, cluster_warming_cb] {
auto warming_it = warming_clusters_.find(cluster_name);
auto& cluster_entry = *warming_it->second;
auto& warming_cluster_entry = *warming_it->second;

// If the cluster is being updated, we need to cancel any pending merged updates.
// Otherwise, applyUpdates() will fire with a dangling cluster reference.
updates_map_.erase(cluster_name);

// If the management server sends an EDS response for any other cluster, grpc_mux_impl calls
Member:
@ramaraochavali @snowp @htuch you all are clearly fine with this, so I'm not understanding something, but why does an EDS response for any other cluster cause an empty update for this cluster? Isn't that a bug?

Member:
OK, I think I understand. We are looking at this text in the XDS protocol docs: "When a requested resource is missing in a RDS or EDS update, Envoy will retain the last known value for this resource." Right?

I guess this seems OK, though I do wonder if this clause is adding extra complexity that we don't really need, especially with incremental coming. Meaning, it seems like your management server could simply not accept new connections until it's ready to serve them, and we could avoid all of this extra code, potential for bugs, behavior differences, etc. That would be my preference.

If we do decide to keep this, can we spell out the clause we are relying on here in the docs and link to them?

Contributor Author:
@htuch knows better, but IIRC it is required for initial loading; otherwise Envoy was stuck at initialization. I tried to change that behaviour at some point via #4276 but got a regression (#4485) and reverted it via #4490.

Contributor Author (@ramaraochavali, Feb 26, 2019):

> we are looking at this text in the XDS protocol docs: "When a requested resource is missing in a RDS or EDS update, Envoy will retain the last known value for this resource." Right?

Yes.

> If we do decide to keep this, can we spell out the clause we are relying on here in the docs and link to them?

I can add the clause in the doc, but my preference would be to fix this on the Envoy side - it should not clear hosts when it gets an EDS response for some other cluster. For EDS/RDS we are not mandating that all resources be sent in every response (which is the right thing to do). Things may change with incremental, but we can deal with that when it comes.

Member:
@ramaraochavali we just discussed this issue briefly on the community call. I think we might need to have a meeting on this. Given that chat, I'm still pretty uncomfortable with this change, for three reasons:

  1. It's not clear to me that it's correct to copy the hosts from an old cluster to a new cluster.
  2. It adds a bunch of complexity that might lead to hard-to-understand edge cases.
  3. It really seems to me like you could fix this in your management server: don't accept connections to a management server (do not report healthy to your LB) until the management server can fully serve connections.

I know @htuch has some thoughts here, so I will let him weigh in, and then we can maybe do a meeting if needed?

Member:
Maybe one way to simplify this discussion is to come up with a bunch of user stories, e.g. "As an Envoy user, I want to upgrade a service from HTTP to HTTPS; the recommended steps are ...".

Contributor Author (@ramaraochavali, Mar 1, 2019):

> The management server can still do an EDS update immediately after CDS, even if we reuse existing EDS resources, we just get a period of potential inconsistency.
>
> I think it comes down to whether a management server can guarantee that an EDS update following a CDS update truly reflects the new CDS state. In some setups this is true, but in others it isn't.

I agree with this, and it seems potential inconsistency cannot be avoided in all cases.

I think we need to think about a short-term solution as well as longer-term solutions like immutable resources.

For the short term we have a couple of options:

  • Reuse existing EDS resources until the management server provides a new set of resources matching the CDS update. (Whether a given CDS update actually requires a new set of EDS resources is another question - right now we are assuming that every CDS update requires an EDS response, which does not look correct.)
  • Do not finish warming until a named response arrives for the updated, warming cluster.

Do we have any other ideas?

Longer term, we should for sure come up with a bunch of user stories and think about solutions like immutable resources, etc., as @htuch mentioned.

Member:
@ramaraochavali take a look at a convo I had with @htuch in Envoy Slack: https://envoyproxy.slack.com/archives/CEFDKQ3RQ/p1551391835017700

I'm pretty strongly of the opinion that we should not finish init/warming until we explicitly receive a named response for an EDS fetch. I think this is the clearest solution and IMO the behavior that most people would expect.

@htuch brings up the valid point that this may not work in all cases if the hosts have changed, but IMO this is very rare, and we should offer guidance in the XDS docs that this edge case should be handled by a cluster rename and correct sequencing.
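A minimal sketch of the named-response approach favored above - this is not the PR's change, and the exact shape of the fix is hypothetical. The idea is that EdsClusterImpl would stop treating an empty update as a reason to complete pre-init, so a warming cluster keeps waiting until a ClusterLoadAssignment actually naming it arrives. onPreInitComplete() is shown with its pre-PR no-argument signature, since the empty_update flag would be unnecessary under this approach.

// Hypothetical sketch, not this PR's code: warming completes only on a named response.
void EdsClusterImpl::onConfigUpdate(const ResourceVector& resources,
                                    const std::string& version_info) {
  UNREFERENCED_PARAMETER(version_info);
  if (resources.empty()) {
    // An EDS response that names only other clusters: record it, but keep warming.
    ENVOY_LOG(debug, "Missing ClusterLoadAssignment for {} in onConfigUpdate()",
              cluster_name_);
    info_->stats().update_empty_.inc();
    return; // Deliberately no onPreInitComplete() here.
  }
  // ... validate and apply the named ClusterLoadAssignment as the existing code does ...

  // Init/warming finishes only once this cluster has been explicitly named.
  onPreInitComplete();
}

One open question, raised by the #4276/#4485 history earlier in this thread, is what happens when the named response never arrives; the sketch leaves that (e.g. a warming timeout) unaddressed.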


Contributor Author:
OK. So since we got consensus on finishing warming only on named responses, I am going to close this PR and open another PR with a fix.

// onConfigUpdate on this cluster with empty resources.
// See
// https://github.com/envoyproxy/envoy/blob/master/source/common/config/grpc_mux_impl.cc#L161
// for more details. If the cluster is in the fully initialized state, that would just
// increment the update_empty stat. However, if the cluster is in the warming state, the
// initialization callback would be triggered and the warming cluster would not have any
// hosts. So if onConfigUpdate was triggered by an EDS update that had no references to this
// cluster and the active cluster has some hosts, copy the active cluster's priority set to
// the warming cluster to prevent the hosts from being cleared after warming.
// See https://github.com/envoyproxy/envoy/issues/5168 for more context.
// This also ensures that we adhere to the clause "When a requested resource is missing in a
// RDS or EDS update, Envoy will retain the last known value for this resource." as documented
// in https://github.com/envoyproxy/data-plane-api/blob/master/XDS_PROTOCOL.md.
const auto active_it = active_clusters_.find(cluster_name);
if (active_it != active_clusters_.end()) {
const auto& active_cluster_entry = *active_it->second;
if (warming_cluster_entry.cluster_->initializedByEmptyConfig() &&
!active_cluster_entry.cluster_->prioritySet().empty()) {
ENVOY_LOG(debug, "copying host set from active cluster {} to warming cluster",
cluster_name);
const auto& active_host_sets =
active_cluster_entry.cluster_->prioritySet().hostSetsPerPriority();
for (size_t priority = 0; priority < active_host_sets.size(); ++priority) {
const auto& active_host_set = active_host_sets[priority];
// TODO(ramaraochavali): Can we skip these copies by exporting out const shared_ptr from
// HostSet?
HostVectorConstSharedPtr hosts_copy(new HostVector(active_host_set->hosts()));
HostsPerLocalityConstSharedPtr hosts_per_locality_copy =
active_host_set->hostsPerLocality().clone();
warming_cluster_entry.cluster_->prioritySet().updateHosts(
priority, HostSetImpl::partitionHosts(hosts_copy, hosts_per_locality_copy),
active_host_set->localityWeights(), {}, {},
active_host_set->overprovisioningFactor());
}
}
}
active_clusters_[cluster_name] = std::move(warming_it->second);
warming_clusters_.erase(warming_it);

ENVOY_LOG(info, "warming cluster {} complete", cluster_name);
createOrUpdateThreadLocalCluster(cluster_entry);
onClusterInit(*cluster_entry.cluster_);
createOrUpdateThreadLocalCluster(warming_cluster_entry);
onClusterInit(*warming_cluster_entry.cluster_);
cluster_warming_cb(cluster_name, ClusterWarmingState::Finished);
updateGauges();
});
6 changes: 3 additions & 3 deletions source/common/upstream/eds.cc
@@ -47,7 +47,7 @@ void EdsClusterImpl::onConfigUpdate(const ResourceVector& resources, const std::
if (resources.empty()) {
ENVOY_LOG(debug, "Missing ClusterLoadAssignment for {} in onConfigUpdate()", cluster_name_);
info_->stats().update_empty_.inc();
onPreInitComplete();
onPreInitComplete(true);
return;
}
if (resources.size() != 1) {
@@ -120,7 +120,7 @@ void EdsClusterImpl::onConfigUpdate(const ResourceVector& resources, const std::

// If we didn't setup to initialize when our first round of health checking is complete, just
// do it now.
onPreInitComplete();
onPreInitComplete(false);
}

bool EdsClusterImpl::updateHostsPerLocality(
@@ -163,7 +163,7 @@ bool EdsClusterImpl::updateHostsPerLocality(
void EdsClusterImpl::onConfigUpdateFailed(const EnvoyException* e) {
UNREFERENCED_PARAMETER(e);
// We need to allow server startup to continue, even if we have a bad config.
onPreInitComplete();
onPreInitComplete(false);
}

} // namespace Upstream
1 change: 1 addition & 0 deletions source/common/upstream/health_discovery_service.h
@@ -55,6 +55,7 @@ class HdsCluster : public Cluster, Logger::Loggable<Logger::Id::upstream> {
Outlier::Detector* outlierDetector() override { return outlier_detector_.get(); }
const Outlier::Detector* outlierDetector() const override { return outlier_detector_.get(); }
void initialize(std::function<void()> callback) override;
bool initializedByEmptyConfig() const override { return false; }

// Creates and starts healthcheckers to its endpoints
void startHealthchecks(AccessLog::AccessLogManager& access_log_manager, Runtime::Loader& runtime,
2 changes: 1 addition & 1 deletion source/common/upstream/logical_dns_cluster.cc
@@ -134,7 +134,7 @@ void LogicalDnsCluster::startResolve() {
}
}

onPreInitComplete();
onPreInitComplete(false);
resolve_timer_->enableTimer(dns_refresh_rate_ms_);
});
}
2 changes: 1 addition & 1 deletion source/common/upstream/original_dst_cluster.h
@@ -109,7 +109,7 @@ class OriginalDstCluster : public ClusterImplBase {
void cleanup();

// ClusterImplBase
void startPreInit() override { onPreInitComplete(); }
void startPreInit() override { onPreInitComplete(false); }

Event::Dispatcher& dispatcher_;
const std::chrono::milliseconds cleanup_interval_ms_;
2 changes: 1 addition & 1 deletion source/common/upstream/subset_lb.h
@@ -68,7 +68,7 @@ class SubsetLoadBalancer : public LoadBalancer, Logger::Loggable<Logger::Id::ups

void update(uint32_t priority, const HostVector& hosts_added, const HostVector& hosts_removed);

bool empty() { return empty_; }
bool empty() const override { return empty_; }

const HostSubsetImpl* getOrCreateHostSubset(uint32_t priority) {
return reinterpret_cast<const HostSubsetImpl*>(&getOrCreateHostSet(priority));
8 changes: 4 additions & 4 deletions source/common/upstream/upstream_impl.cc
@@ -741,13 +741,13 @@ void ClusterImplBase::initialize(std::function<void()> callback) {
startPreInit();
}

void ClusterImplBase::onPreInitComplete() {
void ClusterImplBase::onPreInitComplete(const bool empty_update) {
// Protect against multiple calls.
if (initialization_started_) {
return;
}
initialization_started_ = true;

empty_update_ = empty_update;
ENVOY_LOG(debug, "initializing secondary cluster {} completed", info()->name());
init_manager_.initialize([this]() { onInitDone(); });
}
@@ -1059,7 +1059,7 @@ void StaticClusterImpl::startPreInit() {
}
priority_state_manager_.reset();

onPreInitComplete();
onPreInitComplete(false);
}

bool BaseDynamicClusterImpl::updateDynamicHostList(const HostVector& new_hosts,
@@ -1372,7 +1372,7 @@ void StrictDnsClusterImpl::ResolveTarget::startResolve() {
// multiple DNS names, this will return initialized after a single DNS resolution
// completes. This is not perfect but is easier to code and unclear if the extra
// complexity is needed so will start with this.
parent_.onPreInitComplete();
parent_.onPreInitComplete(false);
resolve_timer_->enableTimer(parent_.dns_refresh_rate_ms_);
});
}
15 changes: 14 additions & 1 deletion source/common/upstream/upstream_impl.h
@@ -413,6 +413,15 @@ class PrioritySetImpl : public PrioritySet {
const HostVector& hosts_removed,
absl::optional<uint32_t> overprovisioning_factor = absl::nullopt) override;

bool empty() const override {
for (auto const& host_set : host_sets_) {
if (!host_set->hosts().empty()) {
return false;
}
}
return true;
}

protected:
// Allows subclasses of PrioritySetImpl to create their own type of HostSetImpl.
virtual HostSetImplPtr createHostSet(uint32_t priority,
@@ -578,6 +587,7 @@ class ClusterImplBase : public Cluster, protected Logger::Loggable<Logger::Id::u
// Upstream::Cluster
PrioritySet& prioritySet() override { return priority_set_; }
const PrioritySet& prioritySet() const override { return priority_set_; }
bool initializedByEmptyConfig() const override { return empty_update_; }

/**
* Optionally set the health checker for the primary cluster. This is done after cluster
@@ -627,8 +637,10 @@ class ClusterImplBase : public Cluster, protected Logger::Loggable<Logger::Id::u
* Called by every concrete cluster when pre-init is complete. At this point,
* shared init starts init_manager_ initialization and determines if there
* is an initial health check pass needed, etc.
*
* @param empty_update indicates that onPreInitComplete was triggered by an empty config update.
*/
void onPreInitComplete();
void onPreInitComplete(const bool empty_update);

/**
* Called by every concrete cluster after all targets registered at init manager are
@@ -654,6 +666,7 @@ class ClusterImplBase : public Cluster, protected Logger::Loggable<Logger::Id::u
bool initialization_started_{};
std::function<void()> initialization_complete_callback_;
uint64_t pending_initialize_health_checks_{};
bool empty_update_{};
};

/**