[Enhancement] Optimize the algorithm of selecting host for a bucket scan task when a backend not alive #5133
    if (!buckendIdToBucketCountMap.containsKey(backendIdRef.getRef())) {
        buckendIdToBucketCountMap.put(backendIdRef.getRef(), 1);
    } else {
        buckendIdToBucketCountMap.put(backendIdRef.getRef(), buckendIdToBucketCountMap.get(backendIdRef.getRef())+1);
Suggested change:
-        buckendIdToBucketCountMap.put(backendIdRef.getRef(), buckendIdToBucketCountMap.get(backendIdRef.getRef())+1);
+        buckendIdToBucketCountMap.put(backendIdRef.getRef(), buckendIdToBucketCountMap.get(backendIdRef.getRef()) + 1);
caiconghui left a comment
LGTM
@xinghuayu007 there are conflicts; please resolve them first.
233aee7 to 409a359
409a359 to 95c3421
Thank you for your code review. The conflict has been resolved.
morningman left a comment
LGTM
kangkaisen left a comment
+1
…can task when a backend not alive (apache#5133)
Proposed changes
When a query uses bucket join, the function `Coordinator#BucketShuffleJoinController#getExecHostPortForFragmentIDAndBucketSeq` is responsible for making sure each host gets an even share of the buckets to scan. For example, if there are 10 buckets to scan and 5 hosts, the strategy distributes 2 buckets to each host. The algorithm works like this:
a. use the data structure `buckendIdToBucketCountMap` to record how many buckets have been distributed to each backend;
b. traverse every backend and find the one that owns the fewest buckets. We call it mini_backend;
c. distribute the bucket to the mini_backend;
d. update `buckendIdToBucketCountMap` for mini_backend;

When all backends are alive, the algorithm works as intended. But when mini_backend is not alive, a replica host is chosen at random as the final host and `buckendIdToBucketCountMap` is not updated. As a result, the bucket scan tasks are not load balanced.

This patch optimizes the algorithm: when the mini_backend is not alive, `buckendIdToBucketCountMap` is updated as well.

Related Issue: #5132
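
Steps a to d and the fallback case can be sketched roughly as follows. This is a simplified illustration, not the actual Coordinator code: the class `BucketAssignmentSketch`, the method `assign`, and the `replicas` and `alive` parameters are hypothetical names. It assumes the count is recorded against the host that finally receives the bucket, and it replaces the random replica choice with "least-loaded alive replica" so the result is deterministic.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; only buckendIdToBucketCountMap is a name taken from the PR.
class BucketAssignmentSketch {
    // Step a: backend id -> number of buckets already assigned to it.
    final Map<Long, Integer> buckendIdToBucketCountMap = new HashMap<>();

    // Chooses a backend for one bucket.
    // replicas: backend ids that hold a replica of the bucket (assumed input)
    // alive:    backend ids that are currently alive (assumed input)
    long assign(List<Long> replicas, Set<Long> alive) {
        // Step b: find the backend with the fewest assigned buckets (mini_backend).
        long chosen = leastLoaded(replicas, null);
        if (!alive.contains(chosen)) {
            // mini_backend is down. The PR describes a random replica choice here;
            // this sketch picks the least-loaded alive replica instead.
            chosen = leastLoaded(replicas, alive);
        }
        // Steps c and d, including the fix: always count the bucket against
        // the host that actually receives it, even on the fallback path.
        buckendIdToBucketCountMap.merge(chosen, 1, Integer::sum);
        return chosen;
    }

    // Returns the candidate with the smallest count; if filter is non-null,
    // only candidates contained in filter are considered.
    private long leastLoaded(List<Long> candidates, Set<Long> filter) {
        long best = -1;
        int min = Integer.MAX_VALUE;
        for (long id : candidates) {
            if (filter != null && !filter.contains(id)) {
                continue;
            }
            int count = buckendIdToBucketCountMap.getOrDefault(id, 0);
            if (count < min) {
                min = count;
                best = id;
            }
        }
        if (best == -1) {
            throw new IllegalStateException("no usable backend for bucket");
        }
        return best;
    }
}
```

Without the update on the fallback path, the dead mini_backend would keep a count of zero and the hosts that really do the work would never be accounted for, which is the imbalance described above.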