Addressing 2 Coordinators Elected As Leader (#16411)#16528
Addressing 2 Coordinators Elected As Leader (#16411)#16528kfaraz merged 9 commits intoapache:masterfrom
Conversation
Added listener method that tracks ZK leader state
|
@kfaraz I took an attempt at resolving this if you can take a look and provide feedback it would be appreciated! Thank you! |
|
Thanks a lot, @razinbouzar ! You beat me to it. 🙂 I have been doing some investigation on this. Overall, your solution makes sense to me. But I am still going to try it out in my local setup that I have been using for testing. One important thing that I have noticed is that the I will share more details here in a bit. |
kfaraz
left a comment
There was a problem hiding this comment.
I have left some suggestions. Overall approach makes sense to me.
I still need to test out the changes in my local setup.
An alternative solution I was thinking of was just to recreate the latch in notLeader() method. We would need to evaluate if one is better than the other.
For more information, I encountered this in my local testing when I was trying to recreate the latch in Timeline:
An example sequence of events that I observed Just for clarification, the scenario described above is slightly different from the one we are trying to solve. |
This reverts commit 90ea5e8.
- Cleanup file formatting and comments - Reduce complexity of the first go by calling the recreateLeaderLatch in the notLeader() method
…rDruidLeaderSelector.java Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
kfaraz
left a comment
There was a problem hiding this comment.
Please add a unit test that verifies the new behaviour.
Also the case state = CLOSED still needs to be handled in the isLeader() method of the listener.
- Remove handleConnectionStateChagned method - Remove duplicate code and use recreate leader latch method - Handle LeaderLatch.State.CLOSED in the isLeader() function, log a warning.
Remove unused import
kfaraz
left a comment
There was a problem hiding this comment.
I have done some basic testing changes with these changes and they look good.
I have a follow up PR #16544 to this where I intend to add tests to verify the behaviour and fix other race conditions.
Thanks a lot for the changes, @razinbouzar !
Fixes #16411 .
Description
Introduces a new private method handleConnectionStateChanged to handle connection state changes, which recreates the leader latch if the connection state is SUSPENDED or LOST.
Fixed the bug ...
Renamed the class ...
Added a forbidden-apis entry ...
Release note
Key changed/added classes in this PR
MyFooOurBarTheirBazThis PR has: