
shared_pool: regression test for dispatcher-teardown cross-thread delete leak #135

Draft

Copilot wants to merge 2 commits into flake-dfp-no from copilot/flake-dfp-no-add-regression-test

Conversation


Copilot AI commented Apr 23, 2026

Adds a regression test for the leak fixed in the cross-thread deletion path of ObjectSharedPool: when the dispatcher is destroyed before running a posted delete callback, the captured std::unique_ptr<T> must free the object. Before the fix, a raw T* was captured by value, so destroying the un-run callback did not free the object and it leaked.
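
For context, the pre-fix deleter shape is sketched below for illustration only (a reconstruction from the description; the exact original code may have differed, and the fixed version is quoted under "The fix under test" further down):

// Sketch of the pre-fix shape (reconstruction, not the exact original code):
// the raw T* is captured by value, so if the dispatcher is destroyed before the
// callback runs, destroying the un-run callback does not delete ptr and it leaks.
dispatcher_.post([ptr, this_shared_ptr]() { this_shared_ptr->deleteObjectOnMainThread(ptr); });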

Test: DispatcherTeardownDropsCrossThreadDeleteFreesObject

  • Uses a NiceMock<Event::MockDispatcher> (no real dispatcher thread) — intercepts post() and moves each callback into a local std::vector<Event::PostCb> without running it (a sketch of this setup follows the excerpt below).
  • Acquires one pooled object via getObject on the main thread, then drops the last shared_ptr from a worker thread to trigger the cross-thread path.
  • After join(), asserts exactly one callback was queued and the object is still alive (held by the unique_ptr inside the lambda).
  • Calls posted_cbs.clear() to simulate dispatcher teardown dropping the callback. Asserts live == 0 — the unique_ptr destructor freed the object.
// Simulate dispatcher teardown dropping the queued callback without running it.
posted_cbs.clear();

// Regression: before the fix, raw T* was captured — live would still be 1 here.
EXPECT_EQ(0, Counted::live.load());
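
A minimal sketch of the interception setup at the top of the new test body, assuming Event::MockDispatcher::post() takes an Event::PostCb by value as described in the prompt below (the exact signature should be checked against test/mocks/event/mocks.h, and the usual testing::NiceMock / testing::Invoke / testing::_ helpers are assumed to be available in the file):

NiceMock<Event::MockDispatcher> dispatcher;
std::vector<Event::PostCb> posted_cbs;
// Capture every posted callback instead of running it, so the test can later drop it
// to simulate the dispatcher being torn down with the delete callback still queued.
EXPECT_CALL(dispatcher, post(_))
    .WillRepeatedly(Invoke([&](Event::PostCb cb) { posted_cbs.push_back(std::move(cb)); }));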

Original prompt

Add a single regression test to test/common/shared_pool/shared_pool_test.cc for the leak fix in PR envoyproxy#44570. The PR changes source/common/shared_pool/shared_pool.h so that when the last shared_ptr to a pooled object is released on a non-main thread, the cross-thread deletion posted to dispatcher_ carries ownership via a captured std::unique_ptr<T>. If the dispatcher is destroyed before running the posted callback, the captured unique_ptr destructor frees the object. Before the fix, that path leaked.

Reviewer @agrawroh asked for a regression test exercising "the dispatcher teardown-drops-callback path".

Push as a single commit on top of branch flake-dfp-no in fork phlax/envoy. Do not open a new PR. Do not touch any other files.

Context — existing test file

Path: test/common/shared_pool/shared_pool_test.cc

Key existing patterns in that file:

  • SharedPoolTest fixture spins up a real dispatcher on a background thread (dispatcher_thread_), used for tests that need a real dispatcher.
  • Several tests instead construct a local Event::MockDispatcher dispatcher for in-thread tests (see TEST_F(SharedPoolTest, Basic) and TEST_F(SharedPoolTest, HashCollision)).
  • Cross-thread deletion is exercised in TEST_F(SharedPoolTest, ThreadSafeForDeleteObject) — second block, where thread_factory.createThread is used and the thread calls deleteObject(pool, an_int) so the deleter takes the cross-thread branch.
  • There is a sync_ hook DeleteObjectOnMainThread used to block between the cross-thread branch and the actual dispatcher_.post(...) — see TEST_F(SharedPoolTest, GetObjectAndDeleteObjectRaceForSameHashValue).

The fix under test (already on flake-dfp-no)

In source/common/shared_pool/shared_pool.h around line 89 the cross-thread deleter now does:

dispatcher_.post([owned_ptr = std::unique_ptr<T>(ptr), this_shared_ptr]() mutable {
  this_shared_ptr->deleteObjectOnMainThread(owned_ptr.release());
});

What to add

Add one new test, TEST_F(SharedPoolTest, DispatcherTeardownDropsCrossThreadDeleteFreesObject), that:

  1. Uses a local NiceMock<Event::MockDispatcher> (not the fixture's real dispatcher) — follow the Basic / HashCollision test pattern. This lets us intercept post() without running the callback.

  2. Captures every posted callback into a local container instead of running them. Event::MockDispatcher::post() takes a PostCb (see test/mocks/event/mocks.h). Use EXPECT_CALL(dispatcher, post(_)).WillRepeatedly(Invoke([&](Event::PostCb cb) { posted_cbs.push_back(std::move(cb)); })); or equivalent using SaveArg. (Check the exact PostCb signature in test/mocks/event/mocks.h — if post() is declared void post(PostCb callback) with move semantics, use Invoke so you can move the callback into the vector.)

  3. Uses a counting T:

    struct Counted {
      explicit Counted(int v) : v_(v) { ++live; }
      // Count copies too: getObject takes const T& and stores a copy in the pool,
      // so without this the live counter would under-count and the assertions below would fail.
      Counted(const Counted& other) : v_(other.v_) { ++live; }
      ~Counted() { --live; }
      int v_;
      static std::atomic<int> live;
      // Equality so it can be used in the pool; hashing is provided by CountedHash below.
      bool operator==(const Counted& o) const { return v_ == o.v_; }
    };
    std::atomic<int> Counted::live{0};
    struct CountedHash { size_t operator()(const Counted& c) const { return static_cast<size_t>(c.v_); } };

    Define these at namespace scope in the test file (a local struct defined inside the test body cannot have the static live member), matching the file's style. The pool type is ObjectSharedPool<Counted, CountedHash> (look at the HashCollision test for the shape).

  4. Creates the pool on the test (main) thread using the mock dispatcher. Acquires one object: auto obj = pool->getObject(Counted{7}); — assert Counted::live == 1 and posted_cbs.empty().

  5. Drops the last shared_ptr from a different thread using Thread::threadFactoryForTest() (same pattern as ThreadSafeForDeleteObject second block):

    Thread::ThreadFactory& thread_factory = Thread::threadFactoryForTest();
    auto thread = thread_factory.createThread([&]() { obj.reset(); });
    thread->join();

    After join():

    • Assert exactly one callback was posted: ASSERT_EQ(1, posted_cbs.size());
    • Assert the object is still alive (held by the captured unique_ptr inside the posted lambda): EXPECT_EQ(1, Counted::live.load());
  6. Simulate dispatcher teardown dropping the queued callback without running it:

    posted_cbs.clear();

    Then assert the object is freed: EXPECT_EQ(0, Counted::live.load());

    This is the regression assertion. Prior to the fix the raw T* was captured by value and the callback destructor couldn't free it → leak.

  7. Also release the pool (pool.reset();) before the test ends so nothing dangles. Since the object is already destroyed, this should be a no-op for the counter. A consolidated sketch of steps 4-7 follows this list.
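
Assuming the step-2 interception (dispatcher, posted_cbs) and the step-3 Counted/CountedHash types are in place, and that ObjectSharedPool's constructor takes the dispatcher by reference as in the existing tests, steps 4-7 might look roughly like this (a sketch to be checked against the actual headers, not final code):

// Step 4: create the pool on the main (test) thread and acquire one pooled object.
auto pool = std::make_shared<ObjectSharedPool<Counted, CountedHash>>(dispatcher);
auto obj = pool->getObject(Counted{7});
ASSERT_EQ(1, Counted::live.load());
ASSERT_TRUE(posted_cbs.empty());

// Step 5: drop the last shared_ptr from a worker thread so the deleter takes the cross-thread branch.
Thread::ThreadFactory& thread_factory = Thread::threadFactoryForTest();
auto thread = thread_factory.createThread([&]() { obj.reset(); });
thread->join();
ASSERT_EQ(1, posted_cbs.size());
EXPECT_EQ(1, Counted::live.load()); // still owned by the unique_ptr captured in the posted lambda

// Step 6: simulate dispatcher teardown dropping the queued callback without running it.
posted_cbs.clear();
EXPECT_EQ(0, Counted::live.load()); // regression assertion: the captured unique_ptr freed the object

// Step 7: release the pool; the object is already gone, so the counter should not change.
pool.reset();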

Things to watch out for

  • getObject returns std::shared_ptr<T>. The value type the pool holds is T, so getObject(Counted{7}) will look up / construct a Counted in the pool.
  • Counted::live must be reset to 0 ...

This pull request was created from Copilot chat.

Copilot AI changed the title from "[WIP] Add regression test for dispatcher teardown-drops-callback" to "shared_pool: regression test for dispatcher-teardown cross-thread delete leak" on Apr 23, 2026
Copilot AI requested a review from phlax April 23, 2026 09:12