Balancer can work incorrectly in case of slow historicals or large number of segments to move #10193

@a2l007

Description

When a cluster has thousands of segments to load, historicals may take longer to process all of the requests showing up in their load queues. Consequently, the coordinator timeout druid.coordinator.load.timeout expires, after which the segment request is removed from the CuratorLoadQueuePeon object for that historical and the balancer attempts to balance this segment again.
However, since the ZooKeeper path for this segment still exists for the earlier historical, that historical will eventually load the segment even though the coordinator has decided to place it elsewhere.
This behavior can cause the balancer to make incorrect segment placement decisions, since ServerHolder.getSizeUsed() will now differ from the actual used size on that historical. This is most evident with the diskNormalized balancer strategy, where the usageRatio plays an important role.
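To make the drift concrete, here is a simplified, hypothetical model of the accounting described above. None of these names are real Druid APIs (only ServerHolder.getSizeUsed() is referenced from the issue); the sketch just illustrates how the coordinator's view ends up underestimating the historical by exactly one segment's size.

```java
// Hypothetical sketch of the accounting drift; not Druid code.
public class LoadTimeoutDrift {
    public static void main(String[] args) {
        long segmentSize = 500L;          // size of the segment being loaded
        long actualUsed = 10_000L;        // bytes really used on the historical
        long queuedInPeon = segmentSize;  // the peon still tracks the pending load

        long reportedUsed = actualUsed;   // coordinator's last known server size

        // Load timeout fires: the peon drops the entry...
        queuedInPeon = 0L;
        // ...but the ZK path survives, so the historical loads it anyway.
        actualUsed += segmentSize;

        // ServerHolder.getSizeUsed()-style accounting now underestimates
        // the server by exactly segmentSize.
        long coordinatorView = reportedUsed + queuedInPeon;
        System.out.println("actual=" + actualUsed
                + " coordinatorView=" + coordinatorView);
    }
}
```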

A couple of ways to mitigate this problem would be to increase the druid.coordinator.load.timeout property, or to lower maxSegmentsInNodeLoadingQueue so that the historicals aren't flooded with load queue entries when there are a large number of segments to move.
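For reference, the two knobs live in different places: the timeout is a coordinator runtime property, while maxSegmentsInNodeLoadingQueue is part of the coordinator dynamic configuration (set via the coordinator's dynamic config API, not runtime.properties). A sketch of the first mitigation, with an illustrative value rather than a recommendation:

```properties
# Coordinator runtime.properties: allow more time before a queued
# segment load times out (illustrative value; the default is PT15M)
druid.coordinator.load.timeout=PT30M
```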

Investigating this issue, it looks like CuratorLoadQueuePeon shouldn't be deleting load queue entries in the event of a load timeout, so that the balancer stays aware of the still-pending load.
