Skip to content

Coordinator balancer move then drop fix#5528

Merged
jihoonson merged 4 commits intoapache:masterfrom
clintropolis:coordinator-move-drop-fix
Mar 29, 2018
Merged

Coordinator balancer move then drop fix#5528
jihoonson merged 4 commits intoapache:masterfrom
clintropolis:coordinator-move-drop-fix

Conversation

@clintropolis
Copy link
Copy Markdown
Member

Fixes issue where coordinator balancer will move a segment to the destination server but never actually drop the segment from the source server, one of the issues described in #5521.

With this bug in place, the effective behavior of segment movement was that the balancer would copy the segment to where it thought it needed to be, and the load rule would then decide from where to drop. Because the drop here was never previously working, this may be worth discussion of any implications.

@jihoonson
Copy link
Copy Markdown
Contributor

@clintropolis thanks for working on this. This PR contains some changes to use lambda and I think it makes me difficult to figure out which parts are bug fixes. I would say it's better to split this PR into two sub PRs for bug fix only and remaining.

@jon-wei jon-wei added the Bug label Mar 27, 2018
@clintropolis
Copy link
Copy Markdown
Member Author

Sorry for mixing stuff, this was already part of a split up chain of commits that originally also included the stuff in #5529. Main bug fix is here and additions to test to test the original bug are some additions here and the test itself

I can split this stuff out if you prefer, I was just lazily updating stuff as I came across it.

Copy link
Copy Markdown
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @clintropolis. Left some comments.

sourceLoadQueueChildrenCache.start();
destinationLoadQueueChildrenCache.start();

Thread.sleep(100);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might cause test failures in Travis. If this is to wait for childrenCaches to be ready, some kind of real check (by polling or using a latch) should be used instead. Same for other sleeps.

tearDownServerAndCurator();
}

@Test
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a timeout to avoid too long waits even when this test becomes broken in the future.

loadManagementPeons.put("localhost:1", loadQueuePeon);
loadManagementPeons.put("localhost:2", loadQueuePeon2);


Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary line. Please remove other unnecessary lines too.

@jihoonson jihoonson merged commit 30fc4d3 into apache:master Mar 29, 2018
@clintropolis clintropolis deleted the coordinator-move-drop-fix branch March 30, 2018 20:19
jon-wei pushed a commit to implydata/druid-public that referenced this pull request Apr 4, 2018
* apache#5521 part 1

* formatting

* oops

* less magic tests
gianm pushed a commit to implydata/druid-public that referenced this pull request May 16, 2018
* apache#5521 part 1

* formatting

* oops

* less magic tests
@jihoonson jihoonson added this to the 0.12.2 milestone Jul 5, 2018
jihoonson pushed a commit to jihoonson/druid that referenced this pull request Jul 5, 2018
* apache#5521 part 1

* formatting

* oops

* less magic tests
fjy pushed a commit that referenced this pull request Jul 6, 2018
* #5521 part 1

* formatting

* oops

* less magic tests
leventov pushed a commit to metamx/druid that referenced this pull request Jul 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants