
KAFKA-15425: Fail fast in Admin::listOffsets when topic (but not partition) metadata is not found #14314

Merged: jolshan merged 3 commits into apache:trunk from C0urante:kafka-12879 on Sep 7, 2023

Conversation

C0urante (Contributor) commented Aug 30, 2023

Jira: KAFKA-15425

This restores previous behavior for Admin::listOffsets, which was to fail immediately if topic metadata could not be found, and only retry if metadata for one or more specific partitions could not be found.

There is a subtle difference here: prior to #13432, the operation would be retried if any metadata error was reported for any individual topic partition, even if an error was also reported for the entire topic. With this change, the operation always fails if an error is reported for the entire topic, even if an error is also reported for one or more individual topic partitions.

I am not aware of any cases where brokers might return both topic- and topic partition-level errors for a metadata request, and if there are none, then this change should be safe. However, if there are such cases, we may need to refine this PR to remove the discrepancy in behavior.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@C0urante C0urante added the admin AdminClient label Aug 30, 2023
@C0urante C0urante changed the title KAFKA-12879: Fail fast in Admin::listOffsets when topic (but not partition) metadata is not found KAFKA-15425: Fail fast in Admin::listOffsets when topic (but not partition) metadata is not found Aug 31, 2023
jolshan (Member) commented Aug 31, 2023

What was the original rationale for https://issues.apache.org/jira/browse/KAFKA-12339? My understanding is that when a topic is first created, there is sometimes a delay to the metadata propagation. It seems like connect was encountering this issue and rather than adding code to retry they wanted it to be done automatically.

I can see the issue with the compatibility break, but I also see the issue with the metadata propagation. Is there any way to configure whether we prefer to retry on such an error? I see tolerateUnknownTopics, but I'm not sure exactly how it works. Or is the argument that this should always be left to the client to handle the manual retries?

C0urante (Contributor, author) commented Sep 1, 2023

Thanks @jolshan, great questions!

What was the original rationale for https://issues.apache.org/jira/browse/KAFKA-12339?

That ticket was filed in response to a change in behavior caused by #9780, which altered some parts of Kafka Connect to use an admin client instead of a consumer to get the end offsets of internal topics. Kafka Connect already had support for automatically creating these topics on startup, so the switch to using an admin client to list end offsets for internal topics caused some workers to fail on startup after trying to list end offsets for a topic they'd just created. This is because consumers retried when listing end offsets for topics that were unknown to the broker, whereas admin clients did not.

My understanding is that when a topic is first created, there is sometimes a delay to the metadata propagation. It seems like connect was encountering this issue and rather than adding code to retry they wanted it to be done automatically.

This is correct, although in #11797, we reverted the automatic retry logic for Admin::listOffsets and added it to TopicAdmin::retryEndOffsets (a Connect-specific class and method) instead.

I can see the issue with the compatibility break, but I also see the issue with the metadata propagation. Is there anyway to configure whether we prefer to retry on such an error?

There's no user-facing way to handle this scenario in isolation. Users can obviously tweak generic retry settings for their Kafka clients, but that has other, much-wider implications. The tolerateUnknownTopics field is limited to the internal API we use for the admin client and is just a bit of a hack to allow us to fail fast in a specific scenario (topic is unknown to the broker) in some contexts (trying to list topic offsets using an admin client) but retry in others (everything else using the PartitionLeaderStrategy).

Or is the argument this should always be left to the client to handle the manual retries?

Yes, that's the idea--at least for Admin::listOffsets. Every other API remains unaffected, so, e.g., Producer::send will still retry on attempts to produce to an unknown topic partition (which is its own source of headaches, but that's out of scope for this PR).

To summarize: if a topic has just been created, it's possible that Admin::listOffsets will throw a spurious UnknownTopicOrPartitionException with this change, which is less desirable than an automatic retry. However, if the topic genuinely doesn't exist, Admin::listOffsets throwing that exception immediately instead of retrying until the retry timeout expires is better behavior. And, if the topic has just been created, it's debatable whether it's fair to retry for the full retry timeout anyways, instead of a shorter period. This was one of the points in the final rationale for reverting this behavioral change originally:

For example, a likely root cause of KAFKA-12339 was a Connect worker instantiates its KafkaConfigBackingStore (and other internal topic stores), which creates a KafkaBasedLog that as part of start() creates the topic if it doesn't exist and then immediately tries to read the offsets. That reading of offsets can fail if the metadata for the newly created topic hasn't been propagated to all of the brokers. We can solve this particular root cause easily by retrying the reading of offsets within the KafkaBasedLog's start() method, and since topic metadata should be propagated relatively quickly, we don't need to retry for that long – and most of the time we'd probably successfully retry within a few retries.
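The caller-side retry described in that rationale can be sketched generically. This is an illustrative helper, not the actual TopicAdmin::retryEndOffsets or KafkaBasedLog code; the method name, attempt counts, and exception types are all assumptions chosen for the sketch:

```java
import java.util.function.Supplier;

public class BoundedRetry {

    /**
     * Invokes the action, retrying only when it throws the given retriable
     * exception type, for at most maxAttempts attempts with a fixed backoff.
     * Any other exception propagates immediately, mirroring the fail-fast
     * contract: the caller, not the client, owns the retry policy.
     */
    public static <T> T retryOn(Class<? extends RuntimeException> retriable,
                                int maxAttempts,
                                long backoffMs,
                                Supplier<T> action) throws InterruptedException {
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (!retriable.isInstance(e)) {
                    throw e; // fail fast on unexpected errors
                }
                last = e;
                Thread.sleep(backoffMs);
            }
        }
        throw last;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulates topic metadata that only propagates after two failed lookups.
        int[] calls = {0};
        long offset = retryOn(IllegalStateException.class, 5, 1L, () -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("metadata not yet propagated");
            }
            return 42L;
        });
        System.out.println("end offset " + offset + " after " + calls[0] + " attempts");
    }
}
```

Because topic metadata usually propagates quickly, a short bounded loop like this recovers the common just-created-topic case without masking a genuinely missing topic for the full retry timeout.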

jolshan (Member) commented Sep 5, 2023

Sorry for the delay @C0urante and thanks for the explanation. It makes sense.

Given that we moved the retries to TopicAdmin::retryEndOffsets, does this mean there isn't any known wide usage of, or need for, this retry?

I guess that was sort of the question in the final part of the PR description.

    ) {
        switch (topicError) {
            case UNKNOWN_TOPIC_OR_PARTITION:
                if (!tolerateUnknownTopics) {

jolshan (Member) commented on this code:

the operation would be retried if any metadata error was reported for any individual topic partition, even if an error was also reported for the entire topic. With this change, the operation always fails if an error is reported for the entire topic, even if an error is also reported for one or more individual topic partitions.

This change only affects the behavior for unknown topic or partition errors, right? Leader/broker-not-available errors continue to be retriable, and topic authorization, invalid topic, and other errors continue to fail the request?

jolshan (Member):

I am not aware of any cases where brokers might return both topic- and topic partition-level errors for a metadata request, and if there are none, then this change should be safe.

Asking because my understanding is the only case where we would be concerned about topic and partition level errors is when the topic level error is unknown topic or partition but somehow the partitions for that topic have a different error?

C0urante (Contributor, author):

This change only changes the behavior for unknown topic or partition errors right? Leader/broker not available continue to be retriable, topic auth, invalid topic, and other errors continue to fail the request?

Exactly, and we have unit testing coverage to demonstrate this.

Asking because my understanding is the only case where we would be concerned about topic and partition level errors is when the topic level error is unknown topic or partition but somehow the partitions for that topic have a different error?

This is correct, and I'm optimistic that no such case exists, but wanted to note the potential change in behavior just to be safe.
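The error handling agreed on in this exchange can be modeled in a few lines. This is a minimal sketch of the semantics described above, not the actual PartitionLeaderStrategy code; the enum names mirror the error names discussed in the thread but are local to the example:

```java
public class TopicErrorPolicy {

    // Error names mirror those discussed in the thread; not the real Errors enum.
    enum TopicError { NONE, UNKNOWN_TOPIC_OR_PARTITION, LEADER_NOT_AVAILABLE,
                      TOPIC_AUTHORIZATION_FAILED, INVALID_TOPIC_EXCEPTION }

    enum Outcome { PROCEED, RETRY, FAIL }

    /**
     * Topic-level decision as described in this PR: an unknown topic is fatal
     * unless tolerateUnknownTopics is set, leader/broker unavailability stays
     * retriable, and authorization or invalid-topic errors fail the request.
     */
    static Outcome handleTopicError(TopicError error, boolean tolerateUnknownTopics) {
        switch (error) {
            case NONE:
                return Outcome.PROCEED;
            case UNKNOWN_TOPIC_OR_PARTITION:
                return tolerateUnknownTopics ? Outcome.RETRY : Outcome.FAIL;
            case LEADER_NOT_AVAILABLE:
                return Outcome.RETRY;
            default:
                return Outcome.FAIL;
        }
    }

    public static void main(String[] args) {
        System.out.println(handleTopicError(TopicError.UNKNOWN_TOPIC_OR_PARTITION, false)); // FAIL
        System.out.println(handleTopicError(TopicError.UNKNOWN_TOPIC_OR_PARTITION, true));  // RETRY
    }
}
```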


    try (AdminClientUnitTestEnv env = new AdminClientUnitTestEnv(cluster)) {
        env.kafkaClient().setNodeApiVersions(NodeApiVersions.create());
        Map<MetadataResponse, Class<? extends Throwable>> responsesAndFailures = new HashMap<>();

jolshan (Member) commented on this test code:

Extremely minor nit (we could do this in a follow-up; it's not a blocker): could we parameterize this?

C0urante (Contributor, author):

Leave it cleaner than you found it!

Pushed a commit that adds parameterization; if you'd prefer, I can revert and publish it as a follow-up PR after we merge this one.

C0urante (Contributor, author) commented Sep 7, 2023

No worries about the delay, thanks for continuing to work on this @jolshan!

Given that we moved the retries to TopicAdmin::retryEndOffsets does this mean there is not any known wide usage or need for this retry?

Not that I'm aware of, though if anyone believes otherwise, we can discuss later. Considering the change in behavior that we're targeting here seems to have been made accidentally, I think we should err on the side of preserving existing behavior until we have a rationale for changing it and do so intentionally.

I guess that was sort of the question in the final part of the PR description.

(Discussed further in inline comments)

    Arguments.of(
        // We fail fast when the entire topic is unknown
        Errors.UNKNOWN_TOPIC_OR_PARTITION,
        Errors.NONE,

jolshan (Member) commented on this test code:

For my understanding: if the partition error were also unknown topic or partition, we would fail fast there too.

With this change, the operation always fails if an error is reported for the entire topic, even if an error is also reported for one or more individual topic partitions.

Would we want to add this test case too just so the behavior is also shown in tests?

C0urante (Contributor, author):

Sure, done. I've tried to clarify in the comments that these are unusual cases we're testing for lest anyone think that this is normal broker behavior.
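The precedence rule these test cases pin down (a topic-level error always wins over partition-level errors) can be shown in a small table-driven sketch. All names here are illustrative, not the actual KafkaAdminClientTest code:

```java
public class ErrorPrecedence {

    enum Err { NONE, UNKNOWN_TOPIC_OR_PARTITION, LEADER_NOT_AVAILABLE }
    enum Outcome { SUCCEED, RETRY, FAIL_FAST }

    /**
     * The topic-level error is evaluated first and wins; partition-level
     * errors are only consulted when the topic itself reports no error.
     */
    static Outcome resolve(Err topicError, Err partitionError) {
        if (topicError == Err.UNKNOWN_TOPIC_OR_PARTITION) return Outcome.FAIL_FAST;
        if (topicError == Err.LEADER_NOT_AVAILABLE) return Outcome.RETRY;
        if (partitionError != Err.NONE) return Outcome.RETRY;
        return Outcome.SUCCEED;
    }

    public static void main(String[] args) {
        // Table-driven check of the cases discussed above, including the
        // unusual combined topic-plus-partition error.
        Err[][] cases = {
            { Err.UNKNOWN_TOPIC_OR_PARTITION, Err.NONE },
            { Err.UNKNOWN_TOPIC_OR_PARTITION, Err.LEADER_NOT_AVAILABLE },
            { Err.NONE, Err.UNKNOWN_TOPIC_OR_PARTITION },
            { Err.NONE, Err.NONE },
        };
        for (Err[] c : cases) {
            System.out.println(c[0] + " / " + c[1] + " -> " + resolve(c[0], c[1]));
        }
    }
}
```

The second case is the one flagged as unusual in the review: even with a retriable partition-level error present, the unknown-topic error at the topic level fails the operation immediately.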

jolshan (Member) left a review:

Thanks Chris!

jolshan (Member) commented Sep 7, 2023

Also -- I assume we want to pick this to 3.6 as well. Is that the only branch @C0urante?

jolshan (Member) commented Sep 7, 2023

Tests look unrelated and like flakes I've seen before, so I will go ahead and merge + pick to 3.6

@jolshan jolshan merged commit 77a91be into apache:trunk Sep 7, 2023
jolshan pushed a commit that referenced this pull request Sep 7, 2023
…ition) metadata is not found (#14314)


Reviewers: Justine Olshan <jolshan@confluent.io>
C0urante (Contributor, author) commented Sep 7, 2023

Sorry! Yes, that's correct--3.6 is all we need to backport to. Thanks @jolshan!

@C0urante C0urante deleted the kafka-12879 branch September 11, 2023 14:06
mjsax pushed a commit to confluentinc/kafka that referenced this pull request Nov 22, 2023
…ition) metadata is not found (apache#14314)


Reviewers: Justine Olshan <jolshan@confluent.io>