Skip to content

outlier_detector: accept large base_ejection_time when max_ejection_time not specified#14962

Merged
snowp merged 5 commits intoenvoyproxy:mainfrom
cpakulski:od_max
Feb 22, 2021
Merged

outlier_detector: accept large base_ejection_time when max_ejection_time not specified#14962
snowp merged 5 commits intoenvoyproxy:mainfrom
cpakulski:od_max

Conversation

@cpakulski
Copy link
Copy Markdown
Contributor

Commit Message:
accept large base_ejection_time when max_ejection_time not specified

Additional Description:
PR #14235 broke compatibility with istio. Istio sets large base_ejection_time and configs where max_jection_time was not specified were rejected. That scenario happened during upgrade from istio 1.8 to 1.9. See #14235 (comment).

This PR changes the logic to following:

  • if both base_ejection_time and max_ejection_time are specified in the config, they are compared and config is rejected when max_ejection_time is smaller than base_ejection_time.
  • if max_ejection_time is missing in the config, the default value of 300s is applied. If base_ejection_time is larger than 300s, then max_ejection_time is initialized to base_ejection_time.

Risk Level: Low
Testing: Added unit test to simulate receiving config without max_ejection_time and with large base_ejection_time.
Docs Changes: No
Release Notes: No
Platform Specific Features:

…ecified

and base_ejection_time is larger than max_ejection_time's default.

Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
@ramaraochavali
Copy link
Copy Markdown
Contributor

@lizan Can we backport this to 1.17 once it is merged?

Copy link
Copy Markdown
Contributor

@antoniovicente antoniovicente left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive by review as I look at the status of PRs pending review.

// max_ejection_time was not specified in the config. Apply the default or base_ejection_time
// whatever is larger.
max_ejection_time_ms_ = DEFAULT_MAX_EJECTION_TIME_MS;
max_ejection_time_ms_ = std::max(max_ejection_time_ms_, base_ejection_time_ms_);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

max_ejection_time_ms_ = std::max(DEFAULT_MAX_EJECTION_TIME_MS, base_ejection_time_ms_);

PROTOBUF_GET_WRAPPED_OR_DEFAULT(config, enforcing_local_origin_success_rate,
DEFAULT_ENFORCING_LOCAL_ORIGIN_SUCCESS_RATE))),
max_ejection_time_ms_(static_cast<uint64_t>(
PROTOBUF_GET_MS_OR_DEFAULT(config, max_ejection_time, DEFAULT_MAX_EJECTION_TIME_MS))) {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternate implementation:

PROTOBUF_GET_MS_OR_DEFAULT(
    config, max_ejection_time,
    std::max(DEFAULT_MAX_EJECTION_TIME_MS, base_ejection_time_ms_))

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is good idea. Implemented like that but the change required conversion of DEFAULT_MAX_EJECTION_TIME_MS from const to constexpr.

max_ejection_time_ms_(static_cast<uint64_t>(
PROTOBUF_GET_MS_OR_DEFAULT(config, max_ejection_time, DEFAULT_MAX_EJECTION_TIME_MS))) {}
DEFAULT_ENFORCING_LOCAL_ORIGIN_SUCCESS_RATE))) {
// Check if max_ejection_time is specified.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @envoyproxy/api-shepherds

Please update API comments since default behavior after this change is more complex than currently documented. There were some comments in the original PR about the change being technically breaking behavior; was the bad interaction with configured base ejection time the concern or something else? https://github.com/envoyproxy/envoy/pull/14235/files#r539583484

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the description of the parameter in proto file. Normally outlier detector checks if max_ejection is larger than base_ejection time. If max_ is smaller, the config is rejected. The corner case was when max_ejection was not specified (legacy config was used from before the max_ejection was introduced) and base_ejection time was larger than max_ejection default. The config was rejected and broke upgrade path in istio.

@antoniovicente antoniovicente requested a review from snowp February 9, 2021 02:10
Updated api docs.

Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
API shepherd assignee is @markdroth
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #14962 was synchronize by cpakulski.

see: more, trace.

@cpakulski
Copy link
Copy Markdown
Contributor Author

/retest

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #14962 (comment) was created by @cpakulski.

see: more, trace.

@cpakulski
Copy link
Copy Markdown
Contributor Author

@markdroth Would you mind approving api? Syntax was not changed - just updated the comment. Thanks!

Copy link
Copy Markdown
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good to me, just one nit.

@envoyproxy/api-shepherds ptal

Comment on lines 154 to 156
// for more information. If not specified, the default value or
// :ref:`base_ejection_time<envoy_api_field_config.cluster.v3.OutlierDetection.base_ejection_time>` value is applied, whatever is larger.
// Defaults to 300000ms or 300s.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would inline the default here instead of referring to the default on L154 and defining it on L156

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@snowp snowp added the waiting label Feb 17, 2021
Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Copy link
Copy Markdown
Member

@lizan lizan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm api

Comment on lines 154 to 156
// for more information. If not specified, the default value or
// :ref:`base_ejection_time<envoy_api_field_config.cluster.v3.OutlierDetection.base_ejection_time>` value is applied, whatever is larger.
// Defaults to 300000ms or 300s.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Copy Markdown
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@snowp snowp merged commit 12df7d0 into envoyproxy:main Feb 22, 2021
@Shikugawa Shikugawa removed the backport/review Request to backport to stable releases label Feb 23, 2021
@Shikugawa Shikugawa added the backport/approved Approved backports to stable releases label Feb 23, 2021
cpakulski added a commit to cpakulski/envoy that referenced this pull request Feb 23, 2021
outlier_detector: accept large base_ejection_time when max_ejection_time not specified (envoyproxy#14962)

Changed logic in config verification when max_ejection_time is not specified
and base_ejection_time is larger than max_ejection_time's default.

Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
@cpakulski cpakulski deleted the od_max branch February 23, 2021 16:35
Shikugawa pushed a commit that referenced this pull request Mar 3, 2021
outlier_detector: accept large base_ejection_time when max_ejection_time not specified (#14962)

Changed logic in config verification when max_ejection_time is not specified
and base_ejection_time is larger than max_ejection_time's default.

Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api backport/approved Approved backports to stable releases

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants