dns_cache: unconditionally floor failure backoff at min_refresh_interval_ #134
Draft
Copilot wants to merge 2 commits into
…_interval_ floor
Agent-Logs-Url: https://github.com/phlax/envoy/sessions/d98c4156-7c80-4fbc-9837-8fc816054e8c
Co-authored-by: phlax <454682+phlax@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Address review feedback on shared pool leak fix" to "dns_cache: unconditionally floor failure backoff at min_refresh_interval_" on Apr 23, 2026
Applies review feedback from @agrawroh on envoyproxy#44570: simplify the failure backoff capping logic in `finishResolve` by removing the conditional guard around the `min_refresh_interval_` floor.
Changes
`dns_cache_impl.cc`: Drop the `uncapped_refresh_interval` local variable and the `if (refresh_interval < uncapped_refresh_interval)` guard; replace with a single unconditional `std::max` floor.
Original prompt
Address review feedback from @agrawroh on PR envoyproxy#44570 ("shared_pool: fix leak when last reference drops on worker thread during dispatcher teardown"). The branch to work on is `flake-dfp-no` on this fork (phlax/envoy). Push commits directly to this branch so the upstream PR updates; do not open a new PR.
The files currently touched by the PR are:
- `source/common/shared_pool/shared_pool.h`
- `source/extensions/common/dynamic_forward_proxy/dns_cache_impl.cc`
- `test/extensions/common/dynamic_forward_proxy/dns_cache_impl_test.cc`
- `changelogs/current.yaml`
Please apply the following changes. Do not change unrelated code.
1.
`source/extensions/common/dynamic_forward_proxy/dns_cache_impl.cc`: simplify and fix invariant
The current diff in `finishResolve` (around line 570) looks like:
Reviewer feedback (agrawroh):
- `uncapped_refresh_interval` → `raw_backoff_ms`.
- `min_refresh_interval_`. Replace the whole conditional floor block with a single line: `refresh_interval = std::max(refresh_interval, std::chrono::duration_cast<std::chrono::milliseconds>(min_refresh_interval_));`
- `min_refresh_interval_` is already `std::chrono::milliseconds`, drop the cast.
- #2): change `>` to `>=` in `onReResolveAlarm` in the same file. Find the branch that compares elapsed time against `host_ttl_` for eviction and flip the strict `>` to `>=` so a re-arm exactly at the TTL boundary still evicts. Do NOT restore the `+1ms` magic offsets. Keep `refresh_interval = std::chrono::milliseconds(0);` in the past-TTL branch and keep `until_eviction` with no `+1ms`.
- `dns_min_refresh_rate` alarms (including the `nextBackOffMs() == 0` case) which could kick rapid-fire resolves and race with dispatcher/resolver teardown.
2.
`test/extensions/common/dynamic_forward_proxy/dns_cache_impl_test.cc`: add tests
a. A test with a seeded random generator that forces `nextBackOffMs() == 0` and asserts the refresh alarm is armed at `min_refresh_interval_` (the existing fixture around `ResolveFailureBackoffCappedByHostTtl` uses `dns_min_refresh_rate` = 1s). Wire in a `NiceMock<Random::MockRandomGenerator>` that feeds the failure backoff strategy if one isn't already there; set `ON_CALL(random_, random()).WillByDefault(Return(0));` so the jittered exponential backoff returns 0. The test should configure the same refresh/ttl/min as the existing test, trigger a DNS failure, and assert `enableTimer` is called with 1000ms (the `min_refresh_interval_`), NOT 0. Name it `ResolveFailureBackoffZeroFlooredByMinRefresh`.
b. A test for the past-TTL branch (`elapsed >= host_ttl_`). Advance sim time past `host_ttl` before the failure; assert `enableTimer` is called with `min_refresh_interval_` (not 1ms, not 0). Name it `ResolveFailurePastHostTtlFlooredByMinRefresh`. Keep style consistent with `ResolveFailureBackoffCappedByHostTtl`.
Also update the existing `ResolveFailureBackoffCappedByHostTtl` test's comment to match the new unconditional-floor semantics if it's still slightly stale.
3.
`source/common/shared_pool/shared_pool.h` and a new test for the leak fix
Reviewer: "Could we please also add a regression test for the leak fix exercising the dispatcher teardown-drops-callback path?"
Find the existing shared_pool unit tests under `test/common/shared_pool/` (likely `shared_pool_test.cc`). Add a new test exercisin...
This pull request was created from Copilot chat.
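The two timing rules requested for `dns_cache_impl.cc` (an unconditional floor at the minimum refresh interval, and an inclusive `>=` comparison at the TTL boundary) can be sketched in isolation as below. This is a minimal illustration, not Envoy's code: `failureRefreshInterval` and `shouldEvict` are hypothetical helper names, and the assumption that the backoff is capped by the time remaining until eviction is inferred from the test name `ResolveFailureBackoffCappedByHostTtl` rather than taken from the actual diff.

```cpp
#include <algorithm>
#include <chrono>

using std::chrono::milliseconds;

// Hypothetical stand-in for the failure path of finishResolve.
// Assumption: the raw backoff is capped by the time left until eviction,
// and the past-TTL branch resets the interval to 0 (no +1ms offset).
milliseconds failureRefreshInterval(milliseconds raw_backoff, milliseconds elapsed,
                                    milliseconds host_ttl,
                                    milliseconds min_refresh_interval) {
  milliseconds refresh_interval = raw_backoff;
  if (elapsed >= host_ttl) {
    // Past-TTL branch: nothing left to wait for.
    refresh_interval = milliseconds(0);
  } else {
    const milliseconds until_eviction = host_ttl - elapsed;
    refresh_interval = std::min(refresh_interval, until_eviction);
  }
  // Unconditional floor: applied last, so it also covers a zero backoff
  // and the past-TTL branch. No duration_cast is needed because both
  // operands are already milliseconds.
  return std::max(refresh_interval, min_refresh_interval);
}

// Hypothetical stand-in for the eviction check in onReResolveAlarm.
// The inclusive comparison evicts when the alarm fires exactly at the TTL.
bool shouldEvict(milliseconds elapsed, milliseconds host_ttl) {
  return elapsed >= host_ttl;
}
```

With a 1000ms minimum and a 60s TTL, a zero backoff and a past-TTL failure both yield 1000ms in this sketch, which is the behavior the two requested tests are meant to pin down.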