KAFKA-12257: Consumer mishandles topics deleted and recreated with the same name #10952
hachikuji merged 7 commits into apache:3.0 from
    return Optional.ofNullable(metadataByPartition.get(topicPartition));
}

Map<String, Uuid> topicIds() {
Do we need to expose the map or could we just have lookup methods:
Uuid topicId(String topicName);
String topicName(Uuid topicId);
Hmm. I suppose we could have lookup methods. This has implications for the Fetch PR though.
In the Fetch PR, the map is used in Fetcher to collect all the topic IDs that go into the fetch request/session. Maybe it is OK to call a lookup method multiple times there. I also use the map in tests, but maybe we could change that usage.
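The lookup-method alternative discussed above could look roughly like the following. This is only a sketch: `TopicIdLookupSketch` and its `put` method are hypothetical names, and `java.util.UUID` stands in for Kafka's `Uuid` so the example compiles without the Kafka client jar. It is not the code in this PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for the consumer's metadata cache (not the PR's class).
// java.util.UUID replaces Kafka's Uuid to keep the sketch dependency-free.
final class TopicIdLookupSketch {
    private final Map<String, UUID> idsByName = new HashMap<>();
    private final Map<UUID, String> namesById = new HashMap<>();

    // Record the ID reported for a topic, dropping any older ID for that name.
    void put(String topicName, UUID topicId) {
        UUID previous = idsByName.put(topicName, topicId);
        if (previous != null)
            namesById.remove(previous);
        namesById.put(topicId, topicName);
    }

    // Lookup by name; returns null when no topic ID is known.
    UUID topicId(String topicName) {
        return idsByName.get(topicName);
    }

    // Lookup by ID; returns null when the ID is unknown or was replaced.
    String topicName(UUID topicId) {
        return namesById.get(topicId);
    }
}
```

With this shape the maps stay private, at the cost of one lookup call per topic when a caller such as the fetch path needs every ID.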
if (id != null)
    newTopicIds.put(partition.topic(), id);
else
    // Remove if the latest metadata does not have a topic ID
What is the rationale to discard topicId information? Is this to deal with downgrades?
Yes, for the fetch path, we want to know when topic IDs are removed as quickly as possible so we can switch over to the older fetch version that uses topic names.
Does this still make sense in the context of 3.0, which does not have topicId fetch logic?
I suppose it is not needed, but I'm not sure if it helps a lot to remove it.
We can leave it as is I guess since I can't think of a strong case to remove it. It is a rare situation that we would hit this case and the consequence of losing the topic ID is probably not too bad. Worst case, we might miss a recreation which occurred while the cluster was rolling to upgrade or downgrade. On the other hand, it could lead to other kinds of problems if we allow updates to the epoch information tied to a topic ID without being able to validate that the topic ID is correct, so maybe this logic is for the best.
Yeah. That was my reasoning. I thought the upgrade/downgrade case would be rare and the guarantees harder to reason about there.
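To make the behavior under discussion concrete, here is a minimal sketch of refreshing a topic ID cache from the latest metadata, assuming a missing ID is represented as `null`. The class and method names are hypothetical, `java.util.UUID` stands in for Kafka's `Uuid`, and the real cache-merge code may be structured differently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical helper, not the PR's code: shows "always take the ID from the
// latest metadata, and forget the cached ID when the response has none".
final class TopicIdRefreshSketch {
    // `latest` maps each topic in the response to its ID, or to null when the
    // response carried no topic ID (for example during a downgrade).
    static Map<String, UUID> refresh(Map<String, UUID> cached, Map<String, UUID> latest) {
        Map<String, UUID> next = new HashMap<>(cached);
        for (Map.Entry<String, UUID> entry : latest.entrySet()) {
            if (entry.getValue() != null)
                next.put(entry.getKey(), entry.getValue());
            else
                // Remove if the latest metadata does not have a topic ID
                next.remove(entry.getKey());
        }
        return next;
    }
}
```

Topics absent from the response keep their cached ID in this sketch; only topics that were returned without an ID lose it.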
Also, please remember to rebase with the latest
Yup. This merge conflict was caused by my other PR 😅
log.debug("Topic ID changed, so this topic must have been recreated. " +
        "Removing last seen epoch {} for the old partition {} and adding epoch {} from new metadata", currentEpoch, tp, newEpoch);
lastSeenLeaderEpochs.put(tp, newEpoch);
return Optional.of(partitionMetadata);
Leaving this comment here for lack of an alternative location. This patch takes a good first step in improving consumer behavior for the topic recreation case. At least we are able to detect and discard the old epoch state. In fact, it does a little more than that since, combined with the fetch validation logic, we are likely to detect that the old fetch position is no longer valid. Most likely this case would get raised to the user as a LogTruncationException, which might not be ideal, but at least is justifiable. However, it doesn't quite close the door on reuse of the fetch position since it may remain valid on the recreated topic. For the full solution, we probably need to track topicId in SubscriptionState as well so that we can force an offset reset whenever the topicId changes. I think it makes sense to do this in https://issues.apache.org/jira/browse/KAFKA-12975.
I see. I think the main issue here was that we would ignore metadata updates when we were simply looking at the epoch. I believe that this PR solves the problem, but we can continue to improve beyond this.
Yes, I was just pointing out that there is still a gap.
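The remaining gap could be closed along the lines suggested above, by remembering the topic ID next to the fetch position and dropping the position when the ID changes. The sketch below is purely illustrative: `PositionResetSketch` and its methods are hypothetical, it keys by topic name rather than by partition for brevity, and it does not reflect how `SubscriptionState` is actually implemented or how KAFKA-12975 addresses this.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical illustration only; java.util.UUID stands in for Kafka's Uuid.
final class PositionResetSketch {
    private final Map<String, UUID> knownTopicIds = new HashMap<>();
    private final Map<String, Long> positions = new HashMap<>();

    // Remember a fetch position together with the topic ID it belongs to.
    void seek(String topic, UUID topicId, long offset) {
        knownTopicIds.put(topic, topicId);
        positions.put(topic, offset);
    }

    // Returns true when the position was discarded because the topic ID changed,
    // meaning the caller would need to reset the offset.
    boolean onMetadata(String topic, UUID latestTopicId) {
        UUID known = knownTopicIds.get(topic);
        if (known != null && latestTopicId != null && !known.equals(latestTopicId)) {
            positions.remove(topic);
            knownTopicIds.put(topic, latestTopicId);
            return true;
        }
        return false;
    }

    // Current position, or null when it has been discarded or never set.
    Long position(String topic) {
        return positions.get(topic);
    }
}
```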
    lastSeenLeaderEpochs.put(tp, newEpoch);
    return Optional.of(partitionMetadata);
// If both topic IDs were valid and the topic ID changed, update the metadata
} else if (!topicId.equals(Uuid.ZERO_UUID) && oldTopicId != null && !topicId.equals(oldTopicId)) {
Hmm, shouldn't this check come before the epoch check? Admittedly, it's unlikely that a recreated topic would have a higher epoch, but we may as well handle that case.
By the way, it's a little inconsistent that this check uses both null and Uuid.ZERO_UUID to represent a missing value. Maybe we can use null consistently?
This bugged me a bit too. The issue is that the request itself uses Uuid.ZERO_UUID, so we'd just have to convert that to null. We can do that if it is clearer to read.
Also, these checks result in the same thing. The only difference is the log debug line. If it makes more sense to log the topic ID change, I can switch the order.
Yes, the logging is what I had in mind. The log message is misleading otherwise.
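The ordering agreed on in this thread can be sketched as a small decision function: check for a changed topic ID first, then fall back to the epoch comparison. The names below (`MetadataUpdateSketch`, `Decision`, `decide`) are hypothetical, `null` is used consistently for a missing ID as suggested, and `java.util.UUID` stands in for Kafka's `Uuid`. The epoch rule (accept when the new epoch is not lower than the last seen one) is an assumption of this sketch.

```java
import java.util.UUID;

// Hypothetical sketch of the check ordering; not the PR's actual method.
final class MetadataUpdateSketch {
    enum Decision { ACCEPT_RECREATED, ACCEPT_EPOCH, IGNORE_STALE }

    // oldTopicId / newTopicId are null when no topic ID is known;
    // lastSeenEpoch is null when no epoch has been recorded yet.
    static Decision decide(UUID oldTopicId, UUID newTopicId, Integer lastSeenEpoch, int newEpoch) {
        // Topic ID check comes first, so a recreated topic is reported as such
        // even if its epoch happens to be higher than the old one.
        if (newTopicId != null && oldTopicId != null && !newTopicId.equals(oldTopicId))
            return Decision.ACCEPT_RECREATED;
        // Otherwise only the epoch decides (assumed rule: not lower than last seen).
        if (lastSeenEpoch == null || newEpoch >= lastSeenEpoch)
            return Decision.ACCEPT_EPOCH;
        return Decision.IGNORE_STALE;
    }
}
```

Both accepting branches lead to the same metadata update; the distinction only matters for which debug message gets logged. Going from no ID to a valid ID falls through to the epoch check, so the topic is not treated as recreated in that case.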
hachikuji left a comment
LGTM. Thanks for the patch!
…e same name (#10952)
Store topic ID info in consumer metadata. We will always take the topic ID from the latest metadata response and remove any topic IDs from the cache if the metadata response did not return a topic ID for the topic. The benefit of this is that it lets us detect topic recreations. This allows the client to update metadata even if the leader epoch is lower than what was seen previously.
Reviewers: Luke Chen <showuon@gmail.com>, Jason Gustafson <jason@confluent.io>
Store topic ID info in consumer metadata. We will always take the topic ID from the latest metadata response and remove any topic IDs from the cache if the metadata response did not return a topic ID for the topic.
With the addition of topic IDs, when we encounter a new topic ID (a recreated topic), we can choose to take the topic's metadata even if its leader epoch is lower than the last epoch seen for the deleted topic.
The idea is that when we go from having no topic ID to having one, we will not count the topic as new (it could be the same topic, now with an ID). We will only take the update on this basis if the topic ID actually changed.
Added tests for this scenario as well as some tests for storing the topic IDs. Also added tests for topic IDs in metadata cache.