Skip to content

access logging: gRPC logger retry to establish underlying stream connection#17469

Merged
htuch merged 18 commits intoenvoyproxy:mainfrom
Shikugawa:cluster-retry
Nov 5, 2021
Merged

access logging: gRPC logger retry to establish underlying stream connection#17469
htuch merged 18 commits intoenvoyproxy:mainfrom
Shikugawa:cluster-retry

Conversation

@Shikugawa
Copy link
Copy Markdown
Member

Signed-off-by: Shikugawa rei@tetrate.io

Commit Message: Add a retry mechanism to the grpc access logger. This retry mechanism currently only supports a simple retry count. Also, at the moment, retries are only fired when the gRPC stream fails to be established, and nothing happens if the stream is successfully established once and a reset is issued.
Additional Description:
Risk Level: Low
Testing: Unit
Docs Changes: Required
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Deprecated:]
[Optional API Considerations:]

Signed-off-by: Shikugawa <rei@tetrate.io>
@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
envoyproxy/api-shepherds assignee is @lizan
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #17469 was opened by Shikugawa.

see: more, trace.

@Shikugawa Shikugawa changed the title access logging: gRPC logger retry stream access logging: gRPC logger retry to establish underlying stream connection Jul 24, 2021
Signed-off-by: Shikugawa <rei@tetrate.io>
Comment thread source/extensions/access_loggers/common/grpc_access_logger.h Outdated
Comment thread source/extensions/access_loggers/common/grpc_access_logger.h Outdated
Signed-off-by: Shikugawa <rei@tetrate.io>
Signed-off-by: Shikugawa <rei@tetrate.io>
@Shikugawa Shikugawa requested a review from lizan July 28, 2021 07:56
Signed-off-by: Shikugawa <rei@tetrate.io>
Signed-off-by: Shikugawa <rei@tetrate.io>
@dmitri-d
Copy link
Copy Markdown
Contributor

lgtm!

Copy link
Copy Markdown
Member

@lizan lizan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/wait

Comment thread source/extensions/access_loggers/common/grpc_access_logger.h Outdated
Comment thread source/extensions/access_loggers/common/grpc_access_logger.h Outdated
Comment thread source/extensions/access_loggers/common/grpc_access_logger.h Outdated
Signed-off-by: Shikugawa <rei@tetrate.io>
Signed-off-by: Shikugawa <rei@tetrate.io>
Comment thread source/extensions/access_loggers/common/grpc_access_logger.h Outdated
Signed-off-by: Shikugawa <rei@tetrate.io>
Comment thread source/extensions/access_loggers/common/grpc_access_logger.h Outdated
@zuercher
Copy link
Copy Markdown
Member

zuercher commented Sep 8, 2021

/wait

…er-retry

Signed-off-by: Shikugawa <rei@tetrate.io>
@htuch
Copy link
Copy Markdown
Member

htuch commented Oct 20, 2021

@lizan friendly ping.

@Shikugawa
Copy link
Copy Markdown
Member Author

/retest

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #17469 (comment) was created by @Shikugawa.

see: more, trace.

lizan
lizan previously approved these changes Oct 27, 2021
Copy link
Copy Markdown
Member

@lizan lizan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@htuch can you do a non-tetrands review?

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just one Q: how does this behave when we are using the Google gRPC async client vs. the in-built Envoy async gRPC client?

@Shikugawa
Copy link
Copy Markdown
Member Author

Looks good, just one Q: how does this behave when we are using the Google gRPC async client vs. the in-built Envoy async gRPC client?

It seems opaque behavior on Google gRPC with this API. Maybe we should configure channel args for retry and buffer on google_grpc configuration.

// TODO(htuch): figure out how to propagate "this request should be buffered for
// retry" bit to Google gRPC library.
void GoogleAsyncStreamImpl::initialize(bool /*buffer_body_for_retry*/) {
parent_.stats_.streams_total_->inc();
gpr_timespec abs_deadline =
options_.timeout
? gpr_time_add(gpr_now(GPR_CLOCK_REALTIME),
gpr_time_from_millis(options_.timeout.value().count(), GPR_TIMESPAN))
: gpr_inf_future(GPR_CLOCK_REALTIME);
ctxt_.set_deadline(abs_deadline);
// Fill service-wide initial metadata.
auto initial_metadata = Http::RequestHeaderMapImpl::create();
parent_.metadata_parser_->evaluateHeaders(*initial_metadata, options_.parent_context.stream_info);
callbacks_.onCreateInitialMetadata(*initial_metadata);
initial_metadata->iterate([this](const Http::HeaderEntry& header) {
ctxt_.AddMetadata(std::string(header.key().getStringView()),
std::string(header.value().getStringView()));
return Http::HeaderMap::Iterate::Continue;
});
// Invoke stub call.
rw_ = parent_.stub_->PrepareCall(&ctxt_, "/" + service_full_name_ + "/" + method_name_,
&parent_.tls_.completionQueue());
if (rw_ == nullptr) {
notifyRemoteClose(Status::WellKnownGrpcStatus::Unavailable, nullptr, EMPTY_STRING);
call_failed_ = true;
return;
}
parent_.tls_.registerStream(this);
rw_->StartCall(&init_tag_);
++inflight_tags_;
}

If we'd like to integrate between Envoy gRPC and Google gRPC for retry and buffer, we should have API migration for envoy's retry policy and Google gRPC's channel args. Is there any plan to work on this?

@htuch
Copy link
Copy Markdown
Member

htuch commented Oct 29, 2021

I don't think there is a solid story for how to get the retry/buffering semantics aligned across the gRPC implementations. This really touches on the fact that we have not made any progress on unifying the gRPC implementations more broadly.

I think for this PR, maybe just clarify in the release notes and comments that this only works with the Envoy gRPC client and we can merge.

…er-retry

Signed-off-by: Shikugawa <rei@tetrate.io>
Signed-off-by: Shikugawa <rei@tetrate.io>
@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @lizan
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

🐱

Caused by: #17469 was synchronize by Shikugawa.

see: more, trace.

@Shikugawa
Copy link
Copy Markdown
Member Author

/retest

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #17469 (comment) was created by @Shikugawa.

see: more, trace.

@Shikugawa
Copy link
Copy Markdown
Member Author

/retest

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #17469 (comment) was created by @Shikugawa.

see: more, trace.

@lizan
Copy link
Copy Markdown
Member

lizan commented Nov 4, 2021

@htuch ping?

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@repokitteh-read-only repokitteh-read-only Bot removed the api label Nov 5, 2021
@htuch htuch merged commit 3a5f795 into envoyproxy:main Nov 5, 2021
@durd07
Copy link
Copy Markdown
Member

durd07 commented Nov 5, 2021

@htuch @Shikugawa the merge bring back the generated_api_shadow into main repo.
can we add a hook to check whether generated_api_shadow added by mistake?
I have seen serval times generated_api_shadow back.

@Shikugawa
Copy link
Copy Markdown
Member Author

Shikugawa commented Nov 5, 2021

@htuch @Shikugawa the merge bring back the generated_api_shadow into main repo. can we add a hook to check whether generated_api_shadow added by mistake? I have seen serval times generated_api_shadow back.

@durd07 Thank you for the notification. This is no longer needed and just my mistake. #18911

@Shikugawa Shikugawa deleted the cluster-retry branch November 5, 2021 07:37
@htuch
Copy link
Copy Markdown
Member

htuch commented Nov 7, 2021

@durd07 what other examples are there? If this is common I can add a hook, otherwise I think we will soon forget about generated_api_shadow/ and this won't sneak back in.

@durd07
Copy link
Copy Markdown
Member

durd07 commented Nov 8, 2021

@htuch , yes, you're right. maybe we will forget about it soon. I met this situation at least twice. but I think it is not necessary to emphasis the alreay not exist codes. please forget it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants