http1 codec: fix possible call after release issue when server connection is destroyed#20629
Conversation
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
|
/assign @yanavlasov @adisuissa |
|
/retest |
|
Retrying Azure Pipelines: |
|
Can you provide some more detailed context about this PR? When are there lifetime issues? |
|
/wait-any |
|
Msan with https://clang.llvm.org/docs/MemorySanitizer.html#use-after-destruction-detection reports the following on multiple tests Here is for envoy/test/integration/integration_test.cc: Important frames: ServerConnectionImpl is subclass of ConnectionImpl Accessing field after lifetime, even trivial, is undefined behaviour is c++. |
There was a problem hiding this comment.
Hi, thanks for your kind reply. This work is great and valueable. 🌷
Could you create a seperated PR for the HTTP1 code updates? Then we can merge it first.
And could you provide some more context about the lifetime issue of access log manager? Thanks. 😄
| // It's messy and complicated to try to tag the final write of an HTTP response for response | ||
| // tracking for flood protection. Instead, write an empty buffer fragment after the response, | ||
| // to allow for tracking. | ||
| // When the response is written out, the fragment will be deleted and the counter will be updated | ||
| // by ServerConnectionImpl::releaseOutboundResponse() | ||
| // It's messy and complicated to try to tag the final write of an HTTP | ||
| // response for response tracking for flood protection. Instead, write an | ||
| // empty buffer fragment after the response, to allow for tracking. When the | ||
| // response is written out, the fragment will be deleted and the counter will | ||
| // be updated by response_buffer_releasor_. |
There was a problem hiding this comment.
Looks these comment changes are uncessary.
There was a problem hiding this comment.
The patch removes ServerConnectionImpl::releaseOutboundResponse mentioned in the comment
There was a problem hiding this comment.
Get it. But looks like there is some re-formatting. Ingore my comment, this is just a nit problem.
There was a problem hiding this comment.
Could you create a seperated PR for the HTTP1 code updates? Then we can merge it first. And could you provide some more context about the lifetime issue of access log manager? Thanks. smile
This explains source/common/access_log/access_log_manager_impl.* changes
and propose a different fix for crash in envoy/test/extensions/filters/http/grpc_json_transcoder/grpc_json_transcoder_integration_test.cc
==7987==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 in Envoy::AccessLog::AccessLogFileImpl::doWrite(Envoy::Buffer::Instance&) source/common/access_log/access_log_manager_impl.cc
#1 in Envoy::AccessLog::AccessLogFileImpl::~AccessLogFileImpl() source/common/access_log/access_log_manager_impl.cc:91:7
#2 in std::__msan::__shared_ptr_emplace<Envoy::AccessLog::AccessLogFileImpl, std::__msan::allocator<Envoy::AccessLog::AccessLogFileImpl> >::__on_zero_shared() include/c++/v1/__memory/shared_ptr.h:315:24
#3 in __release_shared include/c++/v1/__memory/shared_ptr.h:177:9
#4 in __release_shared include/c++/v1/__memory/shared_ptr.h:219:27
#5 in ~shared_ptr include/c++/v1/__memory/shared_ptr.h:706:23
#6 in Envoy::Extensions::AccessLoggers::File::FileAccessLog::~FileAccessLog() source/extensions/access_loggers/common/file_access_log_impl.h:14:7
#7 in std::__msan::__shared_ptr_emplace<Envoy::Extensions::AccessLoggers::File::FileAccessLog, std::__msan::allocator<Envoy::Extensions::AccessLoggers::File::FileAccessLog> >::__on_zero_shared() include/c++/v1/__memory/shared_ptr.h:315:24
#8 in __release_shared include/c++/v1/__memory/shared_ptr.h:177:9
#9 in __release_shared include/c++/v1/__memory/shared_ptr.h:219:27
#10 in ~shared_ptr include/c++/v1/__memory/shared_ptr.h:706:23
#11 in destroy<std::__msan::shared_ptr<Envoy::AccessLog::Instance>, void, void> include/c++/v1/__memory/allocator_traits.h:319:15
#12 in clear include/c++/v1/list:749:13
#13 in std::__msan::__list_imp<std::__msan::shared_ptr<Envoy::AccessLog::Instance>, std::__msan::allocator<std::__msan::shared_ptr<Envoy::AccessLog::Instance> > >::~__list_imp() include/c++/v1/list:728:3
#14 in Envoy::Server::AdminImpl::~AdminImpl() source/server/admin/admin.h:64:7
#15 in Envoy::Server::AdminImpl::~AdminImpl() source/server/admin/admin.h:64:7
#16 in operator() include/c++/v1/__memory/unique_ptr.h:57:5
#17 in ~unique_ptr include/c++/v1/__memory/unique_ptr.h:275:7
#18 in Envoy::Server::InstanceImpl::~InstanceImpl() source/server/server.cc:163:1
#19 in Envoy::IntegrationTestServerImpl::createAndRunEnvoyServer(Envoy::OptionsImpl&, Envoy::Event::TimeSystem&, std::__msan::shared_ptr<Envoy::Network::Address::Instance const>, Envoy::ListenerHooks&, Envoy::Thread::BasicLockable&, Envoy::Server::ComponentFactory&, std::__msan::unique_ptr<Envoy::Random::RandomGenerator, std::__msan::default_delete<Envoy::Random::RandomGenerator> >&&, std::__msan::optional<std::__msan::reference_wrapper<Envoy::ProcessObject> >, std::__msan::shared_ptr<Envoy::Buffer::WatermarkFactory>) test/integration/server.cc:243:3
#20 in Envoy::IntegrationTestServer::threadRoutine(Envoy::Network::Address::IpVersion, std::__msan::optional<unsigned long>, std::__msan::optional<std::__msan::reference_wrapper<Envoy::ProcessObject> >, Envoy::Server::FieldValidationConfig, unsigned int, std::__msan::chrono::duration<long long, std::__msan::ratio<1l, 1l> >, Envoy::Server::DrainStrategy, std::__msan::shared_ptr<Envoy::Buffer::WatermarkFactory>) test/integration/server.cc:201:3
#21 in operator() test/integration/server.cc:108:5
#22 in __invoke<(lambda at test/integration/server.cc:105:47) &> include/c++/v1/type_traits:3493:23
#23 in __call<(lambda at test/integration/server.cc:105:47) &> include/c++/v1/__functional/invoke.h:61:9
#24 in operator() include/c++/v1/__functional/function.h:232:12
#25 in void std::__msan::__function::__policy_invoker<void ()>::__call_impl<std::__msan::__function::__default_alloc_func<Envoy::IntegrationTestServer::start(Envoy::Network::Address::IpVersion, std::__msan::function<void ()>, std::__msan::optional<unsigned long>, bool, std::__msan::optional<std::__msan::reference_wrapper<Envoy::ProcessObject> >, Envoy::Server::FieldValidationConfig, unsigned int, std::__msan::chrono::duration<long long, std::__msan::ratio<1l, 1l> >, Envoy::Server::DrainStrategy, std::__msan::shared_ptr<Envoy::Buffer::WatermarkFactory>)::$_2, void ()> >(std::__msan::__function::__policy_storage const*) include/c++/v1/__functional/function.h:713:16
#26 in operator() include/c++/v1/__functional/function.h:845:16
#27 in operator() include/c++/v1/__functional/function.h:1186:12
#28 in operator() source/common/common/posix/thread_impl.cc:49:11
#29 in Envoy::Thread::ThreadImplPosix::ThreadImplPosix(std::__msan::function<void ()>, std::__msan::optional<Envoy::Thread::Options> const&)::'lambda'(void*)::__invoke(void*) source/common/common/posix/thread_impl.cc:48:9
#30 in start_thread (/usr/grte/v5/lib64/libpthread.so.0)
#31 in clone (/usr/grte/v5/lib64/libc.so.6)
Memory was marked as uninitialized
#0 in __sanitizer_dtor_callback llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:940:5
#1 in Envoy::AccessLog::AccessLogManagerImpl::~AccessLogManagerImpl() source/common/access_log/access_log_manager_impl.cc:22:1
#2 in Envoy::Server::InstanceImpl::~InstanceImpl() source/server/server.cc:163:1
#3 in Envoy::IntegrationTestServerImpl::createAndRunEnvoyServer(Envoy::OptionsImpl&, Envoy::Event::TimeSystem&, std::__msan::shared_ptr<Envoy::Network::Address::Instance const>, Envoy::ListenerHooks&, Envoy::Thread::BasicLockable&, Envoy::Server::ComponentFactory&, std::__msan::unique_ptr<Envoy::Random::RandomGenerator, std::__msan::default_delete<Envoy::Random::RandomGenerator> >&&, std::__msan::optional<std::__msan::reference_wrapper<Envoy::ProcessObject> >, std::__msan::shared_ptr<Envoy::Buffer::WatermarkFactory>) test/integration/server.cc:243:3
#4 in Envoy::IntegrationTestServer::threadRoutine(Envoy::Network::Address::IpVersion, std::__msan::optional<unsigned long>, std::__msan::optional<std::__msan::reference_wrapper<Envoy::ProcessObject> >, Envoy::Server::FieldValidationConfig, unsigned int, std::__msan::chrono::duration<long long, std::__msan::ratio<1l, 1l> >, Envoy::Server::DrainStrategy, std::__msan::shared_ptr<Envoy::Buffer::WatermarkFactory>) test/integration/server.cc:201:3
#5 in operator() test/integration/server.cc:108:5
#6 in __invoke<(lambda at test/integration/server.cc:105:47) &> include/c++/v1/type_traits:3493:23
#7 in __call<(lambda at test/integration/server.cc:105:47) &> include/c++/v1/__functional/invoke.h:61:9
#8 in operator() include/c++/v1/__functional/function.h:232:12
#9 in void std::__msan::__function::__policy_invoker<void ()>::__call_impl<std::__msan::__function::__default_alloc_func<Envoy::IntegrationTestServer::start(Envoy::Network::Address::IpVersion, std::__msan::function<void ()>, std::__msan::optional<unsigned long>, bool, std::__msan::optional<std::__msan::reference_wrapper<Envoy::ProcessObject> >, Envoy::Server::FieldValidationConfig, unsigned int, std::__msan::chrono::duration<long long, std::__msan::ratio<1l, 1l> >, Envoy::Server::DrainStrategy, std::__msan::shared_ptr<Envoy::Buffer::WatermarkFactory>)::$_2, void ()> >(std::__msan::__function::__policy_storage const*) include/c++/v1/__functional/function.h:713:16
#10 in operator() include/c++/v1/__functional/function.h:845:16
#11 in operator() include/c++/v1/__functional/function.h:1186:12
#12 in operator() source/common/common/posix/thread_impl.cc:49:11
#13 in Envoy::Thread::ThreadImplPosix::ThreadImplPosix(std::__msan::function<void ()>, std::__msan::optional<Envoy::Thread::Options> const&)::'lambda'(void*)::__invoke(void*) source/common/common/posix/thread_impl.cc:48:9
#14 in start_thread (/usr/grte/v5/lib64/libpthread.so.0)
Important frames only:
==7987==WARNING: MemorySanitizer: use-of-uninitialized-value
// Destroyes admin_ which calls into https://github.com/envoyproxy/envoy/blob/052f94ef7e42347366b1af31b2b7035d3c98463b/source/common/access_log/access_log_manager_impl.cc#L116.
#0 in Envoy::AccessLog::AccessLogFileImpl::doWrite(Envoy::Buffer::Instance&) source/common/access_log/access_log_manager_impl.cc
#15 in Envoy::Server::AdminImpl::~AdminImpl() source/server/admin/admin.h:64:7
#18 in Envoy::Server::InstanceImpl::~InstanceImpl() source/server/server.cc:163:1
Memory was marked as uninitialized
// Destroyes access_log_manager_.
#1 in Envoy::AccessLog::AccessLogManagerImpl::~AccessLogManagerImpl() source/common/access_log/access_log_manager_impl.cc:22:1
#2 in Envoy::Server::InstanceImpl::~InstanceImpl() source/server/server.cc:163:1
access_log_manager_ member is being destroyed before admin_.
The better fix is just to reorder members:
==== envoy/src/source/server/server.cc ====
--- envoy/src/source/server/server.cc
+++ envoy/src/source/server/server.cc
@@ -82,11 +82,11 @@
process_context ? ProcessContextOptRef(std::ref(*process_context)) : absl::nullopt,
watermark_factory)),
dispatcher_(api_->allocateDispatcher("main_thread")),
+ access_log_manager_(options.fileFlushIntervalMsec(), *api_, *dispatcher_, access_log_lock,
+ store),
singleton_manager_(new Singleton::ManagerImpl(api_->threadFactory())),
handler_(new ConnectionHandlerImpl(*dispatcher_, absl::nullopt)),
listener_component_factory_(*this), worker_factory_(thread_local_, *api_, hooks),
- access_log_manager_(options.fileFlushIntervalMsec(), *api_, *dispatcher_, access_log_lock,
- store),
terminated_(false),
mutex_tracer_(options.mutexTracingEnabled() ? &Envoy::MutexTracerImpl::getOrCreateTracer()
: nullptr),
==== envoy/src/source/server/server.h ====
--- envoy/src/source/server/server.h
+++ envoy/src/source/server/server.h
@@ -359,6 +359,7 @@
envoy::config::bootstrap::v3::Bootstrap bootstrap_;
Api::ApiPtr api_;
Event::DispatcherPtr dispatcher_;
+ AccessLog::AccessLogManagerImpl access_log_manager_;
std::unique_ptr<AdminImpl> admin_;
Singleton::ManagerPtr singleton_manager_;
Network::ConnectionHandlerPtr handler_;
@@ -374,7 +375,6 @@
Network::DnsResolverSharedPtr dns_resolver_;
Event::TimerPtr stat_flush_timer_;
DrainManagerPtr drain_manager_;
- AccessLog::AccessLogManagerImpl access_log_manager_;
std::unique_ptr<Upstream::ClusterManagerFactory> cluster_manager_factory_;
std::unique_ptr<Server::GuardDog> main_thread_guard_dog_;
std::unique_ptr<Server::GuardDog> worker_guard_dog_;
There was a problem hiding this comment.
Thanks @vitalybuka for the detail root cause analysis.
Based on the discussion, I splitted the access log part to a separate PR: #20913.
PTAL
There was a problem hiding this comment.
Great, thanks for your detailed explaination. @vitalybuka 👍 🌷
|
/wait-any |
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
|
/retest |
|
Retrying Azure Pipelines: |
wbpcode
left a comment
There was a problem hiding this comment.
LGTM. Thanks @yanjunxiang-google @vitalybuka
cc @KBaichoo for a second pass or merge.
|
I think we could also fix this by moving up the A better approach might be pushing down the output buffer ownership to the individual Server/Client ConnectionImpl and have the |
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
| [&]() -> void { this->onBelowLowWatermark(); }, | ||
| [&]() -> void { this->onAboveHighWatermark(); }, | ||
| []() -> void { /* TODO(adisuissa): handle overflow watermark */ })) { | ||
| // Inform parent |
There was a problem hiding this comment.
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
|
/retest |
|
Retrying Azure Pipelines: |
…tion is destroyed (envoyproxy#20629) * Fix lifetime issues in Envoy Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Fix lifetime issues in Envoy
Signed-off-by: Yanjun Xiang yanjunxiang@google.com
Commit Message:
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]