shared_pool: regression test for dispatcher-teardown cross-thread delete leak #135
Draft
Copilot wants to merge 2 commits into
…ete leak Agent-Logs-Url: https://github.com/phlax/envoy/sessions/d289657d-320a-49dc-9726-10d554189a29 Co-authored-by: phlax <454682+phlax@users.noreply.github.com>
Copilot (AI) changed the title from "[WIP] Add regression test for dispatcher teardown-drops-callback" to "shared_pool: regression test for dispatcher-teardown cross-thread delete leak" on Apr 23, 2026
Adds a regression test for the leak fixed in the cross-thread deletion path of `ObjectSharedPool`: when the dispatcher is destroyed before running a posted delete callback, the captured `std::unique_ptr<T>` must free the object. Before the fix, a raw `T*` was captured by value and the callback destructor couldn't free it.

Test: `DispatcherTeardownDropsCrossThreadDeleteFreesObject`

- `NiceMock<Event::MockDispatcher>` (no real dispatcher thread): intercepts `post()` and moves each callback into a local `std::vector<Event::PostCb>` without running it.
- `getObject` on the main thread, then drops the last `shared_ptr` from a worker thread to trigger the cross-thread path.
- `join()`, asserts exactly one callback was queued and the object is still alive (held by the `unique_ptr` inside the lambda).
- `posted_cbs.clear()` to simulate dispatcher teardown dropping the callback. Asserts `live == 0`: the `unique_ptr` destructor freed the object.

Warning
Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

- `dl.google.com`
  - Triggering command: `/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/embedded_tools/jdk/bin/java bazel(envoy) --add-opens=java.base/java.lang=ALL-UNNAMED -Xverify:none -Djava.util.logging.config.file=/build/bazel_root/base/javalog.properties -Dcom.google.devtools.build.lib.util.LogHandlerQuerier.class=com.google.devtools.build.lib.util.SimpleLogHandler$HandlerQuerier -XX:-MaxFDLimit -Djava.library.path=/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/embedded_tools/jdk/lib:/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/embedded_tools/jdk/lib/server:/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/ -Dfile.encoding=ISO-8859-1 -Duser.country= -Duser.language= -Duser.variant= -Xmx3g -DBAZEL_TRACK_SOURCE_DIRECTORIES=1 -Djavax.net.ssl.trustStore=/tmp/custom-cacerts -Djavax.net.ssl.trustStorePassword=changeit -jar /build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/A-server.jar --max_idle_secs=10800 --noshutdown_on_low_sys_mem --connect_timeout_secs=30` (dns block)

If you need me to access, download, or install something from one of these locations, you can either:
Original prompt
Add a single regression test to `test/common/shared_pool/shared_pool_test.cc` for the leak fix in PR envoyproxy#44570. The PR changes `source/common/shared_pool/shared_pool.h` so that when the last `shared_ptr` to a pooled object is released on a non-main thread, the cross-thread deletion posted to `dispatcher_` carries ownership via a captured `std::unique_ptr<T>`. If the dispatcher is destroyed before running the posted callback, the captured `unique_ptr`'s destructor frees the object. Before the fix, that path leaked.

Reviewer @agrawroh asked for a regression test exercising "the dispatcher teardown-drops-callback path".
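The ownership difference described above can be illustrated with a minimal standalone sketch. This is not Envoy code: `Tracked`, `Callback`, `makeRawDelete`, and `makeOwningDelete` are hypothetical names, and `std::packaged_task<void()>` is used only as a convenient move-only stand-in for the dispatcher's posted-callback type.

```cpp
#include <atomic>
#include <cassert>
#include <future>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pooled object; counts live instances.
struct Tracked {
  static std::atomic<int> live;
  Tracked() { ++live; }
  ~Tracked() { --live; }
  Tracked(const Tracked&) = delete;
  Tracked& operator=(const Tracked&) = delete;
};
std::atomic<int> Tracked::live{0};

// Move-only callback type (assumption: a stand-in, not the real PostCb).
using Callback = std::packaged_task<void()>;

// Pre-fix shape: a raw pointer captured by value. Destroying the callback
// without running it destroys only the pointer, never the object.
Callback makeRawDelete(Tracked* raw) {
  return Callback([raw]() { delete raw; });
}

// Post-fix shape: the callback owns the object. Destroying the callback
// without running it runs the unique_ptr destructor, which frees the object.
Callback makeOwningDelete(std::unique_ptr<Tracked> owned) {
  return Callback([owned = std::move(owned)]() mutable { owned.reset(); });
}

// Queue an owning callback, drop the queue unrun, report live objects.
int liveAfterDroppingOwningCallback() {
  std::vector<Callback> queue;
  queue.push_back(makeOwningDelete(std::make_unique<Tracked>()));
  queue.clear();  // simulated teardown: callbacks destroyed, never invoked
  return Tracked::live.load();
}

// Same flow with the raw-pointer shape; the object outlives the queue.
int liveAfterDroppingRawCallback() {
  Tracked* raw = new Tracked();
  std::vector<Callback> queue;
  queue.push_back(makeRawDelete(raw));
  queue.clear();  // simulated teardown
  const int live = Tracked::live.load();
  delete raw;  // manual cleanup so the sketch itself does not leak
  return live;
}
```

Under these assumptions the owning variant reports zero live objects after the queue is dropped, while the raw-pointer variant still reports one, which is the leak the requested regression test is meant to catch.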
Push as a single commit on top of branch `flake-dfp-no` in fork `phlax/envoy`. Do not open a new PR. Do not touch any other files.

Context: existing test file
Path: `test/common/shared_pool/shared_pool_test.cc`

Key existing patterns in that file:

- `SharedPoolTest` fixture spins up a real dispatcher on a background thread (`dispatcher_thread_`), used for tests that need a real dispatcher.
- `Event::MockDispatcher dispatcher;` for in-thread tests (see `TEST_F(SharedPoolTest, Basic)` and `TEST_F(SharedPoolTest, HashCollision)`).
- `TEST_F(SharedPoolTest, ThreadSafeForDeleteObject)`, second block, where `thread_factory.createThread` is used and the thread calls `deleteObject(pool, an_int)` so the deleter takes the cross-thread branch.
- `sync_` hook `DeleteObjectOnMainThread` used to block between the cross-thread branch and the actual `dispatcher_.post(...)` (see `TEST_F(SharedPoolTest, GetObjectAndDeleteObjectRaceForSameHashValue)`).

The fix under test (already on `flake-dfp-no`)

In `source/common/shared_pool/shared_pool.h` around line 89 the cross-thread deleter now does:

What to add
Add one new test, `TEST_F(SharedPoolTest, DispatcherTeardownDropsCrossThreadDeleteFreesObject)`, that:

- Uses a local `NiceMock<Event::MockDispatcher>` (not the fixture's real dispatcher), following the `Basic`/`HashCollision` test pattern. This lets us intercept `post()` without running the callback.
- Captures every posted callback into a local container instead of running them. `Event::MockDispatcher::post()` takes a `PostCb` (see `test/mocks/event/mocks.h`). Use `EXPECT_CALL(dispatcher, post(_)).WillRepeatedly(Invoke([&](Event::PostCb cb) { posted_cbs.push_back(std::move(cb)); }));` or equivalent using `SaveArg`. (Check the exact `PostCb` signature in `test/mocks/event/mocks.h`: if `post()` is declared `void post(PostCb callback)` with move semantics, use `Invoke` so you can move the callback into the vector.)
- Uses a counting `T`:

  Define these at namespace scope or inside the test, whichever is cleaner and matches the file's style. The pool type is `ObjectSharedPool<Counted, CountedHash>` (look at the `HashCollision` test for the shape).
- Creates the pool on the test (main) thread using the mock dispatcher. Acquires one object: `auto obj = pool->getObject(Counted{7});`, then asserts `Counted::live == 1` and `posted_cbs.empty()`.
- Drops the last `shared_ptr` from a different thread using `Thread::threadFactoryForTest()` (same pattern as the `ThreadSafeForDeleteObject` second block):
- After `join()`:
  - `ASSERT_EQ(1, posted_cbs.size());`
  - The object is still alive (held by the `unique_ptr` inside the posted lambda): `EXPECT_EQ(1, Counted::live.load());`
- Simulate dispatcher teardown dropping the queued callback without running it:

  Then assert the object is freed: `EXPECT_EQ(0, Counted::live.load());`

  This is the regression assertion. Prior to the fix the raw `T*` was captured by value and the callback destructor couldn't free it, which caused the leak.
- Also release the pool (`pool.reset();`) before the test ends so nothing dangles. Since the object is already destroyed, this should be a no-op for the counter.

Things to watch out for
- `getObject` returns `std::shared_ptr<T>`. The value type the pool holds is `T`, so `getObject(Counted{7})` will look up / construct a `Counted` in the pool.
- `Counted::live` must be reset to 0 ...

This pull request was created from Copilot chat.
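The flow the prompt asks for can be sketched outside Envoy roughly as follows. Everything here is an assumption made for illustration: `RecordingDispatcher` is a hypothetical stand-in for the mock dispatcher, this `Counted` is a guess at the shape of the counting type, the thread-id check stands in for the pool's own main-thread test, and plain `std::thread` replaces `Thread::threadFactoryForTest()`. Resetting the counter at the start is one reading of the truncated note above.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Counting type (assumed shape): tracks how many instances are alive.
struct Counted {
  static std::atomic<int> live;
  int value;
  explicit Counted(int v) : value(v) { ++live; }
  ~Counted() { --live; }
  Counted(const Counted&) = delete;
  Counted& operator=(const Counted&) = delete;
};
std::atomic<int> Counted::live{0};

// Hypothetical dispatcher stand-in: post() records callbacks, never runs them.
class RecordingDispatcher {
public:
  void post(std::packaged_task<void()> cb) {
    std::lock_guard<std::mutex> lock(mu_);
    posted_.push_back(std::move(cb));
  }
  std::size_t pending() {
    std::lock_guard<std::mutex> lock(mu_);
    return posted_.size();
  }
  // Simulates teardown: queued callbacks are destroyed without being invoked.
  void dropAll() {
    std::lock_guard<std::mutex> lock(mu_);
    posted_.clear();
  }

private:
  std::mutex mu_;
  std::vector<std::packaged_task<void()>> posted_;
};

struct Observed {
  std::size_t queued;
  int live_before_drop;
  int live_after_drop;
};

Observed runTeardownScenario() {
  Counted::live = 0;  // start from a known count
  RecordingDispatcher dispatcher;
  const std::thread::id main_id = std::this_thread::get_id();

  // Deleter: off the main thread, ownership moves into the posted callback.
  auto deleter = [&dispatcher, main_id](Counted* raw) {
    std::unique_ptr<Counted> owned(raw);
    if (std::this_thread::get_id() == main_id) {
      return;  // same-thread path: owned frees the object on scope exit
    }
    dispatcher.post(std::packaged_task<void()>(
        [owned = std::move(owned)]() mutable { owned.reset(); }));
  };

  std::shared_ptr<Counted> obj(new Counted(7), deleter);

  // Drop the last reference on a worker thread to take the cross-thread path.
  std::thread worker([held = std::move(obj)]() mutable { held.reset(); });
  worker.join();

  Observed result{};
  result.queued = dispatcher.pending();
  result.live_before_drop = Counted::live.load();
  dispatcher.dropAll();
  result.live_after_drop = Counted::live.load();
  return result;
}
```

In this sketch one callback is queued and the object stays alive until `dropAll()`, after which the count returns to zero. The real test would make the same three observations through `posted_cbs` and `Counted::live`.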