-
Notifications
You must be signed in to change notification settings - Fork 3.8k
Normalized Cost Balancer #3632
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Normalized Cost Balancer #3632
Changes from all commits
ca4e35e
35db214
010d640
d08ecbc
3024aa0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,59 @@ | ||
| /* | ||
| * Licensed to Metamarkets Group Inc. (Metamarkets) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. Metamarkets licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
|
|
||
| package io.druid.server.coordinator; | ||
|
|
||
| import com.google.common.util.concurrent.ListeningExecutorService; | ||
| import io.druid.timeline.DataSegment; | ||
|
|
||
| public class DiskNormalizedCostBalancerStrategy extends CostBalancerStrategy | ||
| { | ||
| public DiskNormalizedCostBalancerStrategy(ListeningExecutorService exec) | ||
| { | ||
| super(exec); | ||
| } | ||
|
|
||
| /** | ||
| * Averages the cost obtained from CostBalancerStrategy. Also the costs are weighted according to their usage ratios. | ||
| * This ensures that all the hosts will have the same % disk utilization. | ||
| */ | ||
| @Override | ||
| protected double computeCost( | ||
| final DataSegment proposalSegment, final ServerHolder server, final boolean includeCurrentServer | ||
| ) | ||
| { | ||
| double cost = super.computeCost(proposalSegment, server, includeCurrentServer); | ||
|
|
||
| if(cost == Double.POSITIVE_INFINITY){ | ||
| return cost; | ||
| } | ||
|
|
||
| int nSegments = 1; | ||
| if(server.getServer().getSegments().size() > 0) | ||
| { | ||
| nSegments = server.getServer().getSegments().size(); | ||
| } | ||
|
|
||
| double normalizedCost = cost/nSegments; | ||
| double usageRatio = (double)server.getServer().getCurrSize()/(double)server.getServer().getMaxSize(); | ||
|
|
||
| return normalizedCost*usageRatio; | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. did this change from the initial version of this PR? I don't remember anything being tied to number of segments.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, it did change.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. any reason why this is better than the previous one? I guess I'm not sure I fully understand normalization, a comment might be worthwhile.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. bump, @niketh can you explain why
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @xvrl @cheddar The above cost function is better that cost*usage ratio because the cost is directly proportional to number of segments. If the number of segments is more, the cost will be more. Disk Normalized cost ensures that the cost does not increase just because there are more segments. Further Normalized cost * usage ratio will ensure that the host which has a higher percentage of free space is preferred over host with lesser percentage .I didn't add a reply here since i added a comment in the code.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see, technically cost is not proportional to the number of segments, but if your segments all cover the same interval length, using that approximation (essentially multiplying your cost by the average segment size) might work. I've thought about it a little bit more and I was thinking a more consistent approach could be to add a "weight" factor to the existing cost balancer, where we assigns each segment a weight proportional to the size of the segment. Essentially if we just added two parameters I don't want to hold up this PR any longer, but how hard would it be to try this out and see if you get similar results in your setup? As long as we can commit to trying it out, I think we can merge what we have right now, so we can get the refactoring in.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @xvrl We can merge this now and I will try your suggestion this week and will share the results that we see on our cluster
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @niketh 👍 |
||
| } | ||
| } | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| /* | ||
| * Licensed to Metamarkets Group Inc. (Metamarkets) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. Metamarkets licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
| package io.druid.server.coordinator; | ||
|
|
||
| import com.google.common.util.concurrent.ListeningExecutorService; | ||
|
|
||
| public class DiskNormalizedCostBalancerStrategyFactory implements BalancerStrategyFactory | ||
| { | ||
| @Override | ||
| public BalancerStrategy createBalancerStrategy(ListeningExecutorService exec) | ||
| { | ||
| return new DiskNormalizedCostBalancerStrategy(exec); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the server's
currSizeis 0, this will zero out the cost, likely not what we want.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currSize = 0 , will ensure that the host has the lowest cost and is assigned the segments