KAFKA-10223; Use NOT_LEADER_OR_FOLLOWER instead of non-retriable REPLICA_NOT_AVAILABLE for consumers (#8979)
Conversation
I think it makes sense to do this, but we also need to figure out a way to:
I'm not sure the REPLICA_NOT_AVAILABLE error is super helpful as distinguished from NOT_LEADER_FOR_PARTITION. One option here is just to use the latter in all cases.
I think the confusing part is returning NOT_LEADER_FOR_PARTITION if you are trying to fetch from a follower.
Maybe we could rename
I agree it's a little weird. Renaming NOT_LEADER is not a bad idea. I think what we want the error to convey is that the recipient is not a valid request target for the given partition. Maybe
@ijuma @hachikuji Thank you, I guess we need a KIP anyway even for renaming
Renaming the error code name has no compatibility implications, so in theory it could be done without a KIP. We could make this really boring and safe and call it
If you both agree that the name above is reasonable, I would suggest sending a follow up to the KIP-320 thread covering what we've learned and how we're addressing it. It would be good to update the KIP text too so that we have something to refer to.
One issue is that the exception
@ijuma Sounds good to me. Will see what @hachikuji thinks as well.
Also cc @edenhill, as this affects non-Java clients too.
We may also want to deprecate the error and exception for ReplicaNotAvailable since it would not serve any purpose (apart from clients retaining compatibility with some brokers), I think.
Renaming to NOT_LEADER_OR_FOLLOWER and reusing that error code in place of REPLICA_NOT_AVAILABLE sounds like a good idea. 👍
KIP-392 sounds good.
@ijuma @hachikuji @edenhill Thanks for the responses. I will follow up on the KIP-392 discussion thread and update the PR.
@ijuma @hachikuji Updated PR as discussed.
ijuma left a comment
Thanks for the PR. Some quick comments.
Would we want to change this for new IBPs?
Updated to change to NOT_LEADER_OR_FOLLOWER from 2.7 onwards. Also added a note to the upgrade docs.
@ijuma Thanks for the review. Another usage of REPLICA_NOT_AVAILABLE is in MetadataCache: https://github.com/rajinisivaram/kafka/blob/eaea63ec201fefd32256d75ed308aca311b42d1f/core/src/main/scala/kafka/server/MetadataCache.scala#L105 and a couple of lines below that one. These get returned for partition metadata in metadata responses, and NOT_LEADER_OR_FOLLOWER doesn't look suitable. What do you think?
 * This could be a transient exception during reassignments.
 */
@SuppressWarnings("deprecation")
public class NotLeaderOrFollowerException extends NotLeaderForPartitionException {
One nitpick on the naming. It seems a little strange that followers might return NOT_LEADER_OR_FOLLOWER for requests which are intended for the leader. I think my suggestion to frame the name around the broker not being the intended target of the request was a little more accurate, though I admit the name was clumsier. Perhaps this class would be a good place to document the semantics? Maybe worth mentioning the Produce/Fetch behavior explicitly.
Updated javadoc.
    LeaderNotAvailableException::new),
NOT_LEADER_FOR_PARTITION(6, "This server is not the leader for that topic-partition.",
    NotLeaderForPartitionException::new),
NOT_LEADER_OR_FOLLOWER(6, "This server is not the leader or follower for that topic-partition.",
This would also be a good place to be clearer about the interpretation of this error. Maybe something like this?
For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.
case e: NotLeaderOrFollowerException =>
  warn(s"Unable to alter log dirs for $topicPartition", e)
  // Retaining REPLICA_NOT_AVAILABLE exception for ALTER_REPLICA_LOG_DIRS for older versions for compatibility
  (topicPartition, if (config.interBrokerProtocolVersion >= KAFKA_2_7_IV0) Errors.NOT_LEADER_OR_FOLLOWER else Errors.REPLICA_NOT_AVAILABLE)
Hmm.. Wouldn't it make more sense to tie this to a version bump to this protocol rather than the IBP?
Yeah, this was a bad suggestion on my part as I didn't realize that we were using this from clients/tools versus for inter-broker protocols. Maybe we should revert to using REPLICA_NOT_AVAILABLE here.
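The compatibility pattern being debated above (gate the new error code on a version check, so peers on older protocol versions keep seeing the error they already understand) can be sketched as follows. The enum and method names here are hypothetical illustrations, not the actual broker code:

```java
// Hypothetical sketch of version-gated error selection; the real logic lives
// in the broker's ReplicaManager and is keyed off the inter-broker protocol
// version (or, as suggested above, a request version bump).
enum Errors { NOT_LEADER_OR_FOLLOWER, REPLICA_NOT_AVAILABLE }

final class AlterLogDirsErrors {
    // Return the renamed error only when the peer is known to understand it;
    // otherwise fall back to the legacy error for compatibility.
    static Errors forReplicaNotAvailable(boolean supportsNotLeaderOrFollower) {
        return supportsNotLeaderOrFollower
            ? Errors.NOT_LEADER_OR_FOLLOWER
            : Errors.REPLICA_NOT_AVAILABLE;
    }
}
```

The design question raised above is which signal to gate on: the IBP only tracks broker versions, so it is the wrong gate for an API that clients and tools call directly.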
@hachikuji @ijuma Thanks for the reviews, have addressed the comments so far.
hachikuji left a comment
LGTM. Just one minor comment.
 * - {@link Errors#TOPIC_AUTHORIZATION_FAILED} If the user does not have READ access to a requested topic
 * - {@link Errors#REPLICA_NOT_AVAILABLE} If the request is received by a broker which is not a replica
 * - {@link Errors#NOT_LEADER_FOR_PARTITION} If the broker is not a leader and either the provided leader epoch
 * - {@link Errors#REPLICA_NOT_AVAILABLE} If the request is received by a broker with version 2.4 to 2.6 which is not a replica
Is the note about the range from 2.4 to 2.6 correct? I think this error has always been possible in the case of a reassignment.
Updated to say < 2.6.
BROKER_NOT_AVAILABLE(8, "The broker is not available.",
    BrokerNotAvailableException::new),
REPLICA_NOT_AVAILABLE(9, "The replica is not available for the requested topic-partition.",
REPLICA_NOT_AVAILABLE(9, "The replica is not available for the requested topic-partition. This is used for backward compatibility for " +
@hachikuji @rajinisivaram Since this is still used outside of produce/fetch, maybe the backwards compatibility message needs to be qualified?
Updated the doc here and added javadoc for ReplicaNotAvailableException with more detail.
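To make the hierarchy change discussed here concrete, a simplified sketch of where the classes end up (the real classes live in org.apache.kafka.common.errors and carry constructors, serialVersionUID, and javadoc omitted here):

```java
// Simplified sketch of the exception hierarchy after this PR.
// InvalidMetadataException subclasses are retriable and trigger a
// metadata refresh in the Java clients.
class ApiException extends RuntimeException { }
class InvalidMetadataException extends ApiException { }
// Now a subclass of InvalidMetadataException, so consumers retry it
// without any error-code special-casing:
class ReplicaNotAvailableException extends InvalidMetadataException { }
// Deprecated name retained for compatibility:
class NotLeaderForPartitionException extends InvalidMetadataException { }
// New name; extends the old class so existing catch blocks keep working:
class NotLeaderOrFollowerException extends NotLeaderForPartitionException { }
```

Making the new exception extend the deprecated one is what lets code written against `NotLeaderForPartitionException` continue to catch the renamed error unchanged.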
@rajinisivaram With regards to MetadataResponse, it's currently documented as: I don't think we should change it, but it does raise the question of whether we should make
I think it would make sense to change
@hachikuji @ijuma Thanks for the reviews. I have updated the docs and made ReplicaNotAvailableException a subclass of InvalidMetadataException. Also moved the upgrade doc under 2.6. If this doesn't go into 2.6, I will change all the javadocs and the upgrade note to say 2.7.
/**
 * The replica is not available for the requested topic partition. This may be
 * a transient exception during reassignments. From version 2.6 onwards, Fetch requests
 * and other requests intended only for the leader of the topic partition return
Should it be "leader or follower" here and in the ReplicaNotAvailable message?
@ijuma @hachikuji @bob-barrett Thanks for the reviews, merging to trunk and 2.6.
KAFKA-10223; Use NOT_LEADER_OR_FOLLOWER instead of non-retriable REPLICA_NOT_AVAILABLE for consumers (#8979)

Brokers currently return NOT_LEADER_FOR_PARTITION to producers and REPLICA_NOT_AVAILABLE to consumers if a replica is not available on the broker during reassignments. Non-Java clients treat REPLICA_NOT_AVAILABLE as a non-retriable exception; Java consumers handle this error by explicitly matching the error code even though it is not an InvalidMetadataException. This PR renames NOT_LEADER_FOR_PARTITION to NOT_LEADER_OR_FOLLOWER and uses the same error for producers and consumers. This is compatible with both Java and non-Java clients since all clients handle this error code (6) as a retriable exception. The PR also makes ReplicaNotAvailableException a subclass of InvalidMetadataException.

- ALTER_REPLICA_LOG_DIRS continues to return REPLICA_NOT_AVAILABLE. Retained this for compatibility since this request never returned NOT_LEADER_FOR_PARTITION earlier.
- MetadataRequest version 0 also returns REPLICA_NOT_AVAILABLE as the topic-level error code for compatibility. Newer versions filter these out and return Errors.NONE, so didn't change this.
- Partition responses in MetadataRequest return REPLICA_NOT_AVAILABLE to indicate that one of the replicas is not available. Did not change this since NOT_LEADER_FOR_PARTITION is not suitable in this case.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>, Bob Barrett <bob.barrett@confluent.io>
I was recently made aware of this change, and am looking to make the corresponding change in the NodeJS client tulios/kafkajs#1155. One thing I don't really understand about this change is how REPLICA_NOT_AVAILABLE could change to being a retriable error. What does this mean for backwards compatibility with clusters pre-2.6? Was it always supposed to be retriable, and this just fixed an oversight, or does it mean that we will now end up retrying an operation that actually isn't retriable on clusters from before 2.6?
@Nevon REPLICA_NOT_AVAILABLE was a transient error during reassignments with older clusters as well and should have been marked retriable. But it was not handled as a retriable error by clients, apart from Java consumers, which special-cased it as retriable. With the changes in this PR, we don't return REPLICA_NOT_AVAILABLE to clients except in the cases listed in the PR description. This ensures that both Java and existing non-Java consumers, without any changes, retry transient errors during reassignment (now NOT_LEADER_OR_FOLLOWER).
That clears it right up. Thanks for the info!
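The compatibility argument in the answer above can be sketched in miniature. The table and names here are hypothetical (real clients keep their own per-error retriable flags); the point is that every existing client already maps error code 6 to a retriable error, so a broker that returns code 6 where it used to return code 9 is retried by old and new clients alike:

```java
import java.util.Map;

// Hypothetical miniature of a client's error-code table. Code 6 was
// NOT_LEADER_FOR_PARTITION and is now NOT_LEADER_OR_FOLLOWER; either way,
// existing clients already retry it. Code 9 (REPLICA_NOT_AVAILABLE) was
// treated as non-retriable by many non-Java clients, which is the bug
// this PR works around by no longer returning it on the fetch path.
final class RetriableCodes {
    static final Map<Integer, Boolean> RETRIABLE = Map.of(
        6, true,   // NOT_LEADER_FOR_PARTITION / NOT_LEADER_OR_FOLLOWER
        9, false   // REPLICA_NOT_AVAILABLE in many pre-existing clients
    );

    static boolean shouldRetry(int errorCode) {
        return RETRIABLE.getOrDefault(errorCode, false);
    }
}
```

This is why reusing error code 6 fixes the retry behavior of unmodified older clients, while renaming the code is purely cosmetic on the wire.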