KAFKA-6299: Fix AdminClient error handling when metadata changes #4295
hachikuji merged 5 commits into apache:trunk
Conversation
Should this if block be placed above the if block on line 729?
Hmm, I don't think so. We don't want to return the exception from the stale metadata if new metadata has been requested.
Probably add a comment on why synchronization on callsToSend is not needed.
I will add a comment to the declaration of callsToSend, since this is used in many other places.
Force-pushed from ccadd98 to 03be633
Test failure was
@cmccabe The test failure does seem caused by this patch. I reproduced it locally on this branch. Can you check again?
hachikuji left a comment
Thanks for the patch. Left a few comments/questions.
This is kind of a weird contract. Maybe the name should be checkMetadataFetchError or something like that?
checkMetadataError might be a better name
Is this necessary? It seems like the only way it's possible for the if condition to be true is if we already updated lastSeenMetadata below.
The point of this if statement is that we shouldn't keep calling Metadata#requestUpdate over and over. We should call it once, wait for an update, and then call it again after that if needed.
I was not questioning the need for the if, but the reassignment to lastSeenMetadataVersion, which seemed redundant given how it is updated below. I may have answered my own question, however. The problem is that the version in Metadata is updated in two additional cases: 1) on initial bootstrapping, and 2) after a periodic refresh. I think we handle the first case because of this reassignment and the fact that the two fields are initialized to 0.
I'm not sure about the second case, however. Say, for example, that lowestValidMetadataVersion and lastSeenMetadataVersion are both 5 and no update has been requested. After the metadata max age expires, we'll automatically trigger an update and bump lastSeenMetadataVersion to 6. Now if we request a metadata update, nothing will happen because we'll never reach equality again. We may be able to fix it by changing to an inequality, but I'm not sure. Some testing would be helpful.
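To make the suspected race concrete, here is a toy sketch (illustrative only; the field names mirror the patch but this is not the actual AdminClient code) showing why an equality check can be skipped over by a periodic refresh, while an inequality check still fires:

```java
// Hypothetical sketch of the version race discussed above. The names
// lastSeenMetadataVersion and lowestValidMetadataVersion are taken from the
// discussion; the logic here is a simplification, not the real code.
public class MetadataVersionRace {
    static int lastSeenMetadataVersion = 5;
    static int lowestValidMetadataVersion = 5;

    // A periodic refresh bumps the seen version without any explicit request.
    static void periodicRefresh() {
        lastSeenMetadataVersion++;
    }

    // Equality check: can never fire once a refresh has skipped past the target.
    static boolean updateSatisfiedEq() {
        return lastSeenMetadataVersion == lowestValidMetadataVersion + 1;
    }

    // Inequality check: still fires after extra refreshes.
    static boolean updateSatisfiedGe() {
        return lastSeenMetadataVersion >= lowestValidMetadataVersion + 1;
    }

    public static void main(String[] args) {
        periodicRefresh(); // version 5 -> 6, no update requested
        periodicRefresh(); // version 6 -> 7: equality with 6 is now unreachable
        System.out.println("eq=" + updateSatisfiedEq() + " ge=" + updateSatisfiedGe());
    }
}
```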
nit: not that big of a deal, but since we do the same thing in several places, maybe we could add a requestMetadataUpdate method to the runnable.
Is the NOT_CONTROLLER error possible for Metadata requests? Also, are there any other errors we care about at this level (e.g. NOT_LEADER errors)?
No, it's not. Good catch.
"Not leader for partition" is something we care about for certain calls, but handling it will be harder. I want to do that in a follow-on change.
I guess I would have expected that these transient errors would get retried internally. Is the intent to do this separately?
Good question. Unfortunately, they can't be retried internally because we have no generic top-level error code. So if createTopics fails, you get back a response like {foo: NOT_CONTROLLER, bar: NOT_CONTROLLER, baz: NOT_CONTROLLER, ...}. You can't parse this without knowing the subtype, which means it has to be done here.
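A rough sketch of the point being made: with no top-level error code, retry logic has to walk the per-topic error map. The enum and helper below are illustrative stand-ins, not the real Kafka classes:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: decide whether a CreateTopics-style response, which
// only carries per-topic errors, can be retried after a metadata refresh.
public class PerTopicErrors {
    enum Errors { NONE, NOT_CONTROLLER, TOPIC_ALREADY_EXISTS }

    // Retry only if every failure is the transient NOT_CONTROLLER error.
    static boolean shouldRetryWithNewMetadata(Map<String, Errors> topicErrors) {
        boolean sawNotController = false;
        for (Errors e : topicErrors.values()) {
            if (e == Errors.NOT_CONTROLLER) sawNotController = true;
            else if (e != Errors.NONE) return false; // a real, non-transient failure
        }
        return sawNotController;
    }

    public static void main(String[] args) {
        Map<String, Errors> transientOnly = new LinkedHashMap<>();
        transientOnly.put("foo", Errors.NOT_CONTROLLER);
        transientOnly.put("bar", Errors.NOT_CONTROLLER);

        Map<String, Errors> mixed = new LinkedHashMap<>(transientOnly);
        mixed.put("baz", Errors.TOPIC_ALREADY_EXISTS);

        System.out.println(shouldRetryWithNewMetadata(transientOnly) + " "
            + shouldRetryWithNewMetadata(mixed));
    }
}
```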
To clarify my question, does the user request fail because of a transient NOT_CONTROLLER error or do we retry the request? I had expected we would retry, but perhaps we are leaving that for future work?
Have you verified that this works for all uses of version at the moment? It seems like a minor change in behavior since we use failedUpdate in DefaultMetadataUpdater if there are no nodes in the response. Currently this would cause uses such as ConsumerNetworkClient.awaitMetadataUpdate() to retry before returning.
I don't think retrying is what we want, though. If you get an authentication exception from any broker, your auth is bad and you should fail. It's not a retryable exception, conceptually or in terms of the code.
The case I am referring to is in DefaultMetadataUpdater.handleCompletedMetadataResponse, which is different from the authentication failure path. We have a check to ensure that the metadata response contains at least one broker. With this change, we will update the metadata version even in the case that it is empty, which will cause methods like ConsumerNetworkClient.awaitMetadataUpdate() to return earlier than they currently do.
Hmm. Maybe I'm misinterpreting, but are you suggesting that ConsumerNetworkClient#awaitMetadataUpdate() should wait forever (or until the timeout hits) when there is an authentication error fetching the metadata? That doesn't seem right.
I am definitely not suggesting that. I am not talking about authentication failures at all. I am referring to the call to Metadata.failedUpdate that is in DefaultMetadataUpdater.handleCompletedMetadataResponse. With this change, the version will be incremented following an "empty" metadata update. This will cause us to exit the loop in ConsumerNetworkClient#awaitMetadataUpdate() even though we have not received a valid update.
My point more generally is that we are changing the meaning of the version field inside Metadata. It seems we were intentionally using this before to indicate when we had received a valid new version of the metadata. But now we bump it even when there's a failure which means we can no longer use it to tell when we've seen a successful update. As far as I can tell, the impact may be minor, but we should consider all of the usages to be sure of it.
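To make the concern concrete, here is a toy version (not the real ConsumerNetworkClient code; names and structure are simplified) of the snapshot-and-wait pattern: if a failed update also bumps the version, this loop exits without a valid update having arrived.

```java
// Simplified sketch of an awaitMetadataUpdate-style loop: snapshot the
// version, then wait until it changes or the timeout expires.
public class AwaitMetadataUpdate {
    static volatile int version = 5;

    static void awaitMetadataUpdate(long timeoutMs) throws InterruptedException {
        int snapshot = version;
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (version == snapshot && System.currentTimeMillis() < deadline) {
            Thread.sleep(1);
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulate failedUpdate() incrementing the version: the loop below
        // returns even though no valid metadata was received.
        new Thread(() -> version++).start();
        awaitMetadataUpdate(5000);
        System.out.println("returned with version=" + version);
    }
}
```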
Seems this was not really necessary since we just needed a sentinel value? Note that common/errors is a public package.
I can use NetworkException instead (which is a subclass of InvalidMetadataException)
Force-pushed from 10ed8a5 to 66bbf66
```java
env.kafkaClient().setNode(env.cluster().nodeById(0));
env.kafkaClient().prepareResponse(new CreateTopicsResponse(
    Collections.singletonMap("myTopic", new ApiError(Errors.NONE, ""))));
KafkaFuture<Void> future = env.adminClient().createTopics(
    Collections.singleton(new NewTopic("myTopic",
        Collections.singletonMap(Integer.valueOf(0), asList(new Integer[]{0, 1, 2})))),
```
nit: this can be simplified to `Collections.singletonMap(0, asList(0, 1, 2))`. There are a few more of these in this file.
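For reference, the simplification relies on autoboxing and the `asList` varargs overload; the two forms build equal maps (a standalone illustration, not the actual test code):

```java
import static java.util.Arrays.asList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class SimplifyNit {
    public static void main(String[] args) {
        // Verbose style with explicit boxing and an array:
        Map<Integer, List<Integer>> verbose = Collections.singletonMap(
            Integer.valueOf(0), asList(new Integer[]{0, 1, 2}));
        // Simplified equivalent using autoboxing and varargs:
        Map<Integer, List<Integer>> simple =
            Collections.singletonMap(0, asList(0, 1, 2));
        System.out.println(verbose.equals(simple));
    }
}
```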
Jenkins is flaking again due to out-of-memory errors launching git.
retest this please
Force-pushed from 530d59b to 41b3f7a
hachikuji left a comment
Thanks, the approach using the custom metadata updater seems promising. Left some comments/questions.
This is a public package, which conventionally has implied that all of the classes in it are public as well. Should we have an internals package for stuff like this? This is the pattern we use for the consumer and producer.
OK, I'll create an internals package, to follow the convention.
nit: comments like this seem like overkill
Would it make sense to extend ManualMetadataUpdater? It already has most of the no-op functionality we want.
I guess my thought process here is that the broker is using ManualMetadataUpdater, and I don't want changes to ManualMetadataUpdater to change AdminClient. Since the (lack of?) functionality here is minimal, seems better just to create another class.
nit: use the other constructor?
It's a little unclear if we need this. Below when we see a disconnect, we check for authentication errors explicitly. Do we need anything else? It would be a little clearer if we only have one path for surfacing authentication errors.
When NetworkClient gets a request, it sometimes has to make additional internal requests to fulfill it. For example, if NC gets asked to make an AlterConfigsRequest, it may first have to make an API versions request. However, if this API versions request fails, there is nothing which ties the failure back to the AlterConfigsRequest. Since it's an "internal" request, the failure disappears without a trace and never makes its way into the list of responses.
I would argue that this is a bug in NetworkClient. We are currently hacking around it by things like having the event loop manually iterate through each node in ClusterConnectionStates to see whether any of them ended up in ConnectionState.AUTHENTICATION_FAILED. The metadata updater is another hack which gets triggered even when the Response gets dropped, in NetworkClient#processDisconnection.
If we want to be a little braver, we could say that when handling a disconnect for an internal ApiVersionsRequest, we could also send a disconnection to the next non-internal request queued for that node.
Really the whole concept of an internal request is evil. We should just have a wrapper class around NetworkClient that translates one stream of requests into another. That would help us keep this straight. But that is too big to do right now.
Is this not a concurrent modification?
Technically, the iterator removes the entry from pendingAuthenticationErrors via Iterator#remove, and then the call to pendingAuthenticationErrors#remove has no effect.
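A minimal standalone demonstration of this point (the map name follows the discussion; the rest is illustrative): `Iterator#remove` is the safe way to remove during iteration, and the later explicit `Map#remove` of the same key is simply a no-op.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class IteratorRemoveDemo {
    public static void main(String[] args) {
        Map<String, Exception> pendingAuthenticationErrors = new HashMap<>();
        pendingAuthenticationErrors.put("node-0", new Exception("auth failed"));

        Iterator<Map.Entry<String, Exception>> it =
            pendingAuthenticationErrors.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Exception> entry = it.next();
            it.remove(); // safe: no ConcurrentModificationException
            // The key is already gone, so this explicit remove returns null.
            Exception removed = pendingAuthenticationErrors.remove(entry.getKey());
            System.out.println("explicit remove returned: " + removed);
        }
        System.out.println("size=" + pendingAuthenticationErrors.size());
    }
}
```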
The name timeoutMs seems misleading. It's really the blackout period following the authentication error?
Good point. I will change it to blackoutMs.
What is the expected behavior if the user ignores the auth error and continues to use the AdminClient?
It will continue to throw AuthenticationException.
After enough time, another metadata request may be made, which may succeed and allow future requests to go through. But we don't spam metadata requests or anything; if the auth exception is cleared, it will be because of a timeout.
Should the poll timeout take into account the time to the next metadata refresh? What happens if we are in the middle of the metadata backoff when a call is made?
Yes, it should. Will fix.
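One plausible shape of that fix (a sketch under assumptions; the method and parameter names are illustrative, not the actual AdminClient fields): cap the poll timeout by the time remaining until the next metadata update attempt, including any backoff.

```java
public class PollTimeout {
    // Returns how long poll() may block: never past the next metadata attempt.
    static long pollTimeout(long defaultTimeoutMs,
                            long nowMs,
                            long nextMetadataAttemptMs) {
        long untilMetadata = Math.max(0, nextMetadataAttemptMs - nowMs);
        return Math.min(defaultTimeoutMs, untilMetadata);
    }

    public static void main(String[] args) {
        // Mid-backoff: wake when the backoff ends, not after the full timeout.
        System.out.println(pollTimeout(30000, 1000, 1500));
        // Metadata attempt is overdue: poll should return immediately.
        System.out.println(pollTimeout(30000, 2000, 1500));
    }
}
```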
retest this please
Is the synchronization needed here since pendingCalls is only accessed by this thread?
It's not needed. Good catch.
We don't need the second {} since the logging treats the exception specially.
```java
public boolean isReady() {
    if (authException != null) {
        log.trace("Metadata is ready: got authentication exception.");
```
Should this be "Metadata is not ready"?
hachikuji left a comment
Thanks for the patch, LGTM. Will merge after the builds complete.
retest this please
AdminClient should only call Metadata#requestUpdate when needed.
When AdminClient gets a NOT_CONTROLLER error, it should refresh its metadata and retry the request, rather than making the end-user deal with NotControllerException.
Move AdminClient's metadata management outside of NetworkClient and into AdminMetadataManager. This will make it easier to do more sophisticated metadata management in the future, such as implementing a NodeProvider which fetches the leaders for topics.
Rather than manipulating newCalls directly, the AdminClient service thread now drains it directly into pendingCalls. This minimizes the amount of locking we have to do, since pendingCalls is only accessed from the service thread.
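The drain pattern described above can be sketched roughly as follows (names follow the description; the rest is a standalone illustration, not the actual AdminClient code): callers enqueue under a lock, but the service thread swaps everything out in one short critical section and then works on pendingCalls with no locking at all.

```java
import java.util.ArrayList;
import java.util.List;

public class CallDrain {
    private final List<String> newCalls = new ArrayList<>();     // shared, guarded by its own lock
    private final List<String> pendingCalls = new ArrayList<>(); // service thread only

    // Called from user threads.
    void enqueue(String call) {
        synchronized (newCalls) {
            newCalls.add(call);
        }
    }

    // Called from the service thread: one brief lock to drain, then lock-free work.
    void drainNewCalls() {
        synchronized (newCalls) {
            pendingCalls.addAll(newCalls);
            newCalls.clear();
        }
    }

    public static void main(String[] args) {
        CallDrain d = new CallDrain();
        d.enqueue("createTopics");
        d.enqueue("describeCluster");
        d.drainNewCalls();
        System.out.println(d.pendingCalls.size() + " " + d.newCalls.size());
    }
}
```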