dns_cache: add test for past-TTL failure-backoff branch floored at dns_min_refresh_rate #137
Draft
Copilot wants to merge 2 commits into
…_rate
Agent-Logs-Url: https://github.com/phlax/envoy/sessions/798a1ee4-459e-4f76-8dda-8bbeef41e639
Co-authored-by: phlax <454682+phlax@users.noreply.github.com>
Copilot (AI) changed the title from "[WIP] Add test for past-TTL branch in dns_cache_impl_test" to "dns_cache: add test for past-TTL failure-backoff branch floored at dns_min_refresh_rate" on Apr 23, 2026
Covers the `if (elapsed >= host_ttl_)` branch in `DnsCacheImpl::finishResolve` (PR envoyproxy#44570), which was explicitly flagged by reviewer @agrawroh as untested. When a DNS failure is observed after the host TTL has elapsed, `refresh_interval` is set to 0 and then unconditionally raised to `min_refresh_interval_`; this test pins that behaviour.

New test: `ResolveFailurePastHostTtlFlooredByMinRefresh`

Placed immediately after `ResolveFailureBackoffZeroFlooredByMinRefresh`, mirroring `ResolveFailureBackoffCappedByHostTtl` in structure.

Key timing design: sim time is advanced in two stages to thread the needle between `onReResolveAlarm` (evicts when `now - last_used >= host_ttl`) and `finishResolve` (takes the past-TTL branch under the same condition):

- `onReResolveAlarm` → `startResolve` (not evict)
- `resolve_cb(Failure)` called: `refresh_interval = 0` → floor to 1000ms

The critical assertion: the resolve timer is armed with the floored 1000ms interval, not 0.
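The two-stage timing can be illustrated with a toy model. This is a sketch only: the function names, the `last_used = 0` starting point, and the 900ms/200ms stage durations are illustrative assumptions, not values taken from the test or from Envoy's `DnsCacheImpl`. It shows why the alarm has to fire before the TTL boundary while the failure has to arrive after it, given that both checks use `now - last_used >= host_ttl`.

```cpp
#include <cassert>
#include <chrono>

using Ms = std::chrono::milliseconds;

// Hypothetical stand-ins for the two decision points; not Envoy code.
enum class AlarmAction { Evict, StartResolve };

// Models the eviction check described for onReResolveAlarm.
AlarmAction onReResolveAlarmModel(Ms now, Ms last_used, Ms host_ttl) {
  return (now - last_used) >= host_ttl ? AlarmAction::Evict : AlarmAction::StartResolve;
}

// Models the past-TTL check described for finishResolve (same condition).
bool failureIsPastTtlModel(Ms now, Ms last_used, Ms host_ttl) {
  return (now - last_used) >= host_ttl;
}

// Advances time by stage1, fires the alarm, advances by stage2, then fails.
// Returns true only if the alarm re-resolves AND the failure lands past TTL.
bool twoStageReachesPastTtlBranch(Ms stage1, Ms stage2, Ms host_ttl) {
  const Ms last_used{0}; // assumption: host last used at time zero
  Ms now = stage1;
  if (onReResolveAlarmModel(now, last_used, host_ttl) != AlarmAction::StartResolve) {
    return false; // host was evicted before any failure could be observed
  }
  now += stage2;
  return failureIsPastTtlModel(now, last_used, host_ttl);
}

void checkTimelines() {
  const Ms host_ttl{1000};
  // In this model, one 1500ms advance before the alarm hits the eviction path.
  assert(onReResolveAlarmModel(Ms{1500}, Ms{0}, host_ttl) == AlarmAction::Evict);
  // Two stages (illustrative durations): alarm at 900ms, failure at 1100ms.
  assert(twoStageReachesPastTtlBranch(Ms{900}, Ms{200}, host_ttl));
}
```

In this model the first stage must stay strictly below `host_ttl` and the two stages together must reach or exceed it; any other split either evicts early or stays on the capped (pre-TTL) branch.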
Warning
Firewall rules blocked me from connecting to one or more addresses.
I tried to connect to the following addresses, but was blocked by firewall rules:

- `dl.google.com`, command: `/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/embedded_tools/jdk/bin/java bazel(envoy) --add-opens=java.base/java.lang=ALL-UNNAMED -Xverify:none -Djava.util.logging.config.file=/build/bazel_root/base/javalog.properties -Dcom.google.devtools.build.lib.util.LogHandlerQuerier.class=com.google.devtools.build.lib.util.SimpleLogHandler$HandlerQuerier -XX:-MaxFDLimit -Djava.library.path=/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/embedded_tools/jdk/lib:/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/embedded_tools/jdk/lib/server:/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/ -Dfile.encoding=ISO-8859-1 -Duser.country= -Duser.language= -Duser.variant= -Xmx3g -DBAZEL_TRACK_SOURCE_DIRECTORIES=1 -Djavax.net.ssl.trustStore=/tmp/custom-cacerts -Djavax.net.ssl.trustStorePassword=changeit -jar /build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/A-server.jar --max_idle_secs=10800 --noshutdown_on_low_sys_mem --connect_timeout_secs=30` (dns block)

If you need me to access, download, or install something from one of these locations, you can either:
Original prompt
Add one new test to `test/extensions/common/dynamic_forward_proxy/dns_cache_impl_test.cc` covering the past-TTL branch of the failure-backoff logic in PR envoyproxy#44570. Reviewer @agrawroh explicitly asked for a test of this case (the `if (elapsed >= host_ttl_)` branch in `finishResolve`).

Push as a single commit on top of branch `flake-dfp-no` in fork `phlax/envoy`. Do not open a new PR. Do not touch any other files.

What the PR already does (on `flake-dfp-no`)

In `source/extensions/common/dynamic_forward_proxy/dns_cache_impl.cc` `finishResolve`, the failure branch:

- Takes the raw backoff from `primary_host_info->failure_backoff_strategy_->nextBackOffMs()`.
- If `elapsed >= host_ttl_` (past TTL), sets `refresh_interval = 0`.
- Otherwise caps it at `host_ttl - elapsed`.
- Floors the result at `min_refresh_interval_` (`dns_min_refresh_rate`).

And in `onReResolveAlarm`, `(now - last_used) >= host_ttl_` (note: `>=`, not `>`) evicts the host.

So after a failure that happens past the TTL, the armed interval should equal `min_refresh_interval_` (not 0, not 1ms). When that alarm fires the host is evicted because elapsed ≥ host_ttl.

Context
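As a reference for the failure-branch rules described in this prompt, the interval calculation can be sketched as a small standalone function. This is a simplified model under stated assumptions: the function and parameter names are hypothetical, the 100ms `elapsed` value is chosen only so that `host_ttl - elapsed` comes out at the 900ms quoted for the capped case, and none of this is the code in `dns_cache_impl.cc`.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using Ms = std::chrono::milliseconds;

// Toy model of the failure branch: past-TTL => 0, otherwise cap at the time
// left until eviction, then apply the unconditional floor. Names are
// illustrative, not Envoy's.
Ms failureRefreshIntervalModel(Ms raw_backoff, Ms elapsed, Ms host_ttl, Ms min_refresh) {
  Ms interval = raw_backoff;
  if (elapsed >= host_ttl) {
    // Past TTL: the raw backoff is discarded.
    interval = Ms{0};
  } else {
    // Before TTL: never schedule later than the eviction deadline.
    interval = std::min(interval, host_ttl - elapsed);
  }
  // Unconditional floor (dns_min_refresh_rate in the prompt's terms).
  return std::max(interval, min_refresh);
}

void checkCases() {
  const Ms host_ttl{1000};
  const Ms min_refresh{1000};
  // Capped case: 60s raw backoff, 900ms until eviction, floor lifts to 1000ms.
  assert(failureRefreshIntervalModel(Ms{60000}, Ms{100}, host_ttl, min_refresh) == Ms{1000});
  // Zero-backoff case: cap does not kick in, floor lifts 0 to 1000ms.
  assert(failureRefreshIntervalModel(Ms{0}, Ms{100}, host_ttl, min_refresh) == Ms{1000});
  // Past-TTL case: interval becomes 0, floor lifts it to 1000ms.
  assert(failureRefreshIntervalModel(Ms{60000}, Ms{1500}, host_ttl, min_refresh) == Ms{1000});
}
```

With a 1000ms floor all three cases arm the same 1000ms interval, so the tests differ only in which branch produces the pre-floor value (900ms, 0 via the raw backoff, or 0 via the past-TTL branch).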
Two tests already exist nearby that together define the reference shape:

- `ResolveFailureBackoffCappedByHostTtl`: cap kicks in because raw_backoff (60s) > until_eviction (900ms), then floor lifts 900ms → 1000ms.
- `ResolveFailureBackoffZeroFlooredByMinRefresh`: raw_backoff forced to 0 via `random()==0`, cap doesn't kick in, floor lifts 0 → 1000ms.

The new test covers: elapsed past host_ttl at the moment of DNS failure, which takes the `refresh_interval = 0` branch, then is floored to `min_refresh_interval_`.

What to add
A new `TEST_F(DnsCacheImplTest, ResolveFailurePastHostTtlFlooredByMinRefresh)` placed immediately after `ResolveFailureBackoffZeroFlooredByMinRefresh`.

Structure should mirror `ResolveFailureBackoffCappedByHostTtl` closely.

Steps:
1. Configure:
2. Standard first resolve (copy from `ResolveFailureBackoffCappedByHostTtl`): `loadDnsCacheEntry`, `resolver_->resolve`, completion callback with `"10.0.0.1"`, asserts for `onDnsHostAddOrUpdate`, `onLoadDnsCacheComplete`, `onDnsResolutionComplete(Completed)`, and `enableTimer(dns_ttl_, _)` for the success refresh.
3. Advance sim time past `host_ttl`, e.g. `simTime().advanceTimeWait(std::chrono::milliseconds(1500));` (1500ms > host_ttl=1000ms so `elapsed >= host_ttl_` is true when the failure handler runs).
4. Invoke `resolve_timer->invokeCallback()` to trigger the re-resolve. `EXPECT_CALL` on `*timeout_timer, enableTimer(5000ms, nullptr)` and on `resolve("foo.com", _, _)` to capture the new resolve_cb.
5. Trigger the DNS failure:
   - `EXPECT_CALL(*timeout_timer, disableTimer());`
   - `EXPECT_CALL(update_callbacks_, onDnsHostAddOrUpdate(_, _)).Times(0);`
   - `EXPECT_CALL(callbacks, onLoadDnsCacheComplete(_)).Times(0);`
   - `EXPECT_CALL(update_callbacks_, onDnsResolutionComplete("foo.com:80", DnsHostInfoEquals("10.0.0.1:80", "foo.com", false), Network::DnsResolver::ResolutionStatus::Failure));`
   - `EXPECT_CALL(*resolve_timer, enableTimer(std::chrono::milliseconds(1000), _));`: 0 (past-TTL branch) floored to 1000ms (`dns_min_refresh_rate`). Not 0. Not 1ms.
   - `resolve_cb(Network::DnsResolver::ResolutionStatus::Failure, "", TestUtility::makeDnsResponse({}));`
   - `checkStats(2 attempt, 1 success, 1 failure, 1 address_changed, 1 added, 0 removed, 1 num_hosts);`
6. Eviction follow-up (mirror `ResolveFailureBackoffCappedByHostTtl`'s tail):
   - `simTime().advanceTimeWait(std::chrono::milliseconds(1000));`: advance by the floored interval
   - `EXPECT_CALL(update_callbacks_, onDnsHostRemove("foo.com:80"));`
   - `resolve_timer->invokeCallback();`: should evict because `(now - last_used) >= host_ttl_` holds (it holds well past the boundary since we advanced 1500ms + 1000ms = 2500ms of elapsed time vs 1000ms TTL).
   - `checkStats(2 attempt, 1 success, 1 failure, 1 address_changed, 1 added, 1 removed, 0 num_hosts);`
7. Add a short comment on the key assertion explaining: "Past-TTL branch: raw backoff is discarded and refresh_interval becomes 0; the unconditional floor lifts it to `dns_min_refresh_rate` (1000ms)."

Do NOT
- Touch `dns_cache_impl.cc`, `dns_cache_impl.h`, the fixture class, or `shared_pool*`.
- … `flake-dfp-no`.

Build / test

`//test/extensions/common/dynamic_forward_proxy:dns_cache_impl_test` (if bazel unavailable, note that in the PR bo...

This pull request was created from Copilot chat.