Coordinator primary segment assignment fix#5532
Conversation
|
Fix looks good to me, would be nice if we had tests but looks like that's pretty difficult/complicated as of now |
jihoonson
left a comment
There was a problem hiding this comment.
LGTM overall. Can we add a unit test to verify the load rule correctly considers segment loading?
| public class SegmentReplicantLookup | ||
| { | ||
|
|
||
|
|
…er while loading is in progress
|
Saw a failure in this one: Being optimistic, I restarted the tests. Although this test does seem potentially related to this patch. |
|
@gianm the test was added in #5528, and it also failed in a definitely unrelated PR where i increased the timeout so hopefully it will happen less often. I don't think it's because of a race condition with the curator test server used in the test because I ran it in a loop where it successfully passed ~720 times, with something like this but I suppose it's possible it's still racy somewhere. Also possible is that travis just takes a bit longer than my local debug machine took to run the tests, which is why the increased timeout in the other PR. |
|
The test passed after being restarted, so let's commit the patch. |
* fix issue where assign primary assigns segments to all historical servers in cluster * fix test * add test to ensure primary assignment will not assign to another server while loading is in progress
* fix issue where assign primary assigns segments to all historical servers in cluster * fix test * add test to ensure primary assignment will not assign to another server while loading is in progress
* fix issue where assign primary assigns segments to all historical servers in cluster * fix test * add test to ensure primary assignment will not assign to another server while loading is in progress
* fix issue where assign primary assigns segments to all historical servers in cluster * fix test * add test to ensure primary assignment will not assign to another server while loading is in progress
Fixes #5531 by taking segment loading into account when deciding how to place segments, avoiding a bug where primary assignment would assign the segment to all historicals until it is loaded.
Reverts some code removed in #4921 for tracking currently loading segments in
SegmentReplicantLookup.