[improve][client/broker] Add DnsResolverGroup to share DNS cache across multiple PulsarClient instances #24784
Motivation
There is "PIP-234: Support using shared thread pool across multiple Pulsar client instance", #19074, which never went forward. The intention is that multiple Pulsar clients could share resources. This is needed not only for thread pools, but also for the Netty DNS resolver cache, which is represented by io.netty.resolver.dns.DnsAddressResolverGroup in Netty.
PIP-234 has been discussed in the past in threads
Since we don't want to expose Netty internals on the public API, a PulsarClientGroup API has been discussed earlier to abstract this.
Before PIP-234 becomes a reality, it's useful to have an internal API for sharing Netty's DnsAddressResolverGroup across multiple PulsarClient instances.
There's already an internal API for sharing instances: the Lombok-generated builder of PulsarClientImpl:
pulsar/pulsar-client/src/main/java/org/apache/pulsar/client/impl/PulsarClientImpl.java
Lines 198 to 203 in a66e806
This PR adds a class org.apache.pulsar.client.impl.DnsResolverGroupImpl, which could later be abstracted by an interface whenever we get to implement PIP-234's PulsarClientGroup abstraction.
In addition, this PR contains changes to use a shared DnsResolverGroup for Pulsar broker clients.
There were changes in the past where resource sharing was incrementally added in #12037, #13836, and #13839.
The problem that this could solve is a heavy load on the DNS server when a DNS entry expires and many clients access the same entry. This was something that was already addressed in the past for Pulsar Proxy, #15403 .
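The effect of sharing a resolver cache can be sketched with a toy model in plain Java. Everything here is hypothetical and only for illustration: `CachingResolver` is not a Pulsar or Netty class, the cache has no TTL handling, and the host name and address are placeholders. It only shows that N clients with private caches issue N upstream queries for the same name, while N clients sharing one cache issue a single query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class SharedResolverSketch {

    /** Hypothetical stand-in for a caching resolver; not the Pulsar or Netty API. */
    static final class CachingResolver {
        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final Function<String, String> upstream;

        CachingResolver(Function<String, String> upstream) {
            this.upstream = upstream;
        }

        String resolve(String host) {
            // Only a cache miss reaches the upstream DNS server.
            return cache.computeIfAbsent(host, upstream);
        }
    }

    /** Counts upstream queries when each of the clients resolves the same host once. */
    static int upstreamQueries(int clients, boolean shared) {
        AtomicInteger queries = new AtomicInteger();
        Function<String, String> upstream = host -> {
            queries.incrementAndGet();
            return "192.0.2.10"; // placeholder address from the documentation range
        };
        CachingResolver sharedResolver = new CachingResolver(upstream);
        for (int i = 0; i < clients; i++) {
            // Either every client reuses one resolver, or each gets a private one.
            CachingResolver resolver = shared ? sharedResolver : new CachingResolver(upstream);
            resolver.resolve("broker.example.com");
        }
        return queries.get();
    }

    public static void main(String[] args) {
        System.out.println("private caches: " + upstreamQueries(100, false)); // 100
        System.out.println("shared cache:   " + upstreamQueries(100, true));  // 1
    }
}
```

In a real deployment the same reasoning applies each time a cached entry expires, which is when the load spike on the DNS server occurs.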
Similar problems are present in Flink Pulsar use cases, where each Pulsar sink and source creates its own Pulsar client instance. It would be useful to have PIP-234 available for addressing that with a public Pulsar client API. However, in the meantime, the internal API in this PR could be used as a workaround.
It should also be noted that the default Kubernetes ndots 5 configuration adds heavy load on the DNS server.
This article explains the ndots 5 issue. The way to address it for the service URL is to add an extra trailing dot to make the DNS name an absolute FQDN. There's also a Pulsar discussion at #24030 with more information about unnecessary DNS lookups. Changing the service URL isn't sufficient. There isn't a direct feature to make Pulsar and the broker return the address in absolute FQDN DNS name format for Pulsar topic lookups. A similar problem exists for the Pulsar Proxy. I'll create a separate PR to address that.
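To make the ndots 5 effect concrete, here is a minimal sketch of how a resolv.conf-style resolver expands a name through the search list. It assumes the usual rule: a name ending in a dot is absolute and is queried as-is; a name with fewer dots than ndots is tried against each search domain first and as-is last; otherwise it is tried as-is first. The class, the host name, and the search domains are illustrative assumptions (the domains mimic what a pod in a namespace called `pulsar` would typically see), not something taken from this PR.

```java
import java.util.ArrayList;
import java.util.List;

final class NdotsDemo {

    /** Returns the absolute names a stub resolver would query, in order. */
    static List<String> candidates(String name, List<String> searchDomains, int ndots) {
        List<String> result = new ArrayList<>();
        if (name.endsWith(".")) {
            // Absolute FQDN: the search list is skipped entirely.
            result.add(name);
            return result;
        }
        long dots = name.chars().filter(c -> c == '.').count();
        if (dots >= ndots) {
            result.add(name + "."); // enough dots: try the name as-is first
        }
        for (String domain : searchDomains) {
            result.add(name + "." + domain + ".");
        }
        if (dots < ndots) {
            result.add(name + "."); // too few dots: as-is is the last attempt
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> search = List.of("pulsar.svc.cluster.local", "svc.cluster.local", "cluster.local");
        // 4 dots < ndots 5, so three search-domain lookups happen before the real one.
        candidates("pulsar-broker.pulsar.svc.cluster.local", search, 5).forEach(System.out::println);
        // Trailing dot: a single lookup.
        candidates("pulsar-broker.pulsar.svc.cluster.local.", search, 5).forEach(System.out::println);
    }
}
```

With these assumed inputs, the relative name produces four queries and the name with a trailing dot produces one, which is why a service URL such as `pulsar://pulsar-broker.pulsar.svc.cluster.local.:6650` reduces DNS traffic for the initial connection.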
Modifications
Documentation
doc
doc-required
doc-not-needed
doc-complete