-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-28513 The StochasticLoadBalancer should support discrete evaluations #6651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-28513 The StochasticLoadBalancer should support discrete evaluations #6651
Conversation
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/BalanceAction.java
Show resolved
Hide resolved
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/BalancerClusterState.java
Outdated
Show resolved
Hide resolved
| return newRegions; | ||
| } | ||
|
|
||
| int[] removeRegions(int[] regions, Set<Integer> regionIndicesToRemove) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method, and the below addRegions, are just a nicer way to add/remove regions from the BCS arrays in bulk
| * from finding a solution. | ||
| */ | ||
| @InterfaceAudience.Private | ||
| final class BalancerConditionals implements Configurable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This acts as a unified interface for interacting with whatever conditional(s) one might have enabled
| public enum ValidationLevel { | ||
| SERVER, // Just check server | ||
| HOST, // Check host and server | ||
| RACK // Check rack, host, and server | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All 3 checks will be necessary for replica balancing, but for most conditionals (I envision a system table isolation conditional, for example) my guess is that only server validation will be necessary. So I figured that this is a nice way to let implementations be as simple as possible
...ncer/src/main/java/org/apache/hadoop/hbase/master/balancer/SlopFixingCandidateGenerator.java
Show resolved
Hide resolved
...e-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
Outdated
Show resolved
Hide resolved
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
c03c88c to
99f0c2c
Compare
|
Rebased |
This comment has been minimized.
This comment has been minimized.
99f0c2c to
561d2e4
Compare
| import org.apache.hbase.thirdparty.com.google.common.cache.CacheLoader; | ||
| import org.apache.hbase.thirdparty.com.google.common.cache.LoadingCache; | ||
|
|
||
| public class ReplicaKeyCache implements Configurable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I were reviewing this PR then I'd probably ask whether this is an early/unnecessary optimization. But object creation isn't inexpensive, and replica distribution will need to check many of these ReplicaKeys. This cache meaningfully dropped the runtime of our large cluster test (several minutes, about 50%)
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
561d2e4 to
c963fc9
Compare
This comment has been minimized.
This comment has been minimized.
adb82d1 to
564fd27
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/RegionPlanConditional.java
Outdated
Show resolved
Hide resolved
f5e7773 to
be42fcd
Compare
This comment has been minimized.
This comment has been minimized.
be42fcd to
a7fb623
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
ndimiduk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A tour de force!
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/BalanceAction.java
Show resolved
Hide resolved
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/BalanceAction.java
Show resolved
Hide resolved
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/BalancerClusterState.java
Outdated
Show resolved
Hide resolved
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/BalancerClusterState.java
Show resolved
Hide resolved
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/BalancerClusterState.java
Show resolved
Hide resolved
...lancer/src/test/java/org/apache/hadoop/hbase/master/balancer/CandidateGeneratorTestUtil.java
Show resolved
Hide resolved
...balancer/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBalancerConditionals.java
Outdated
Show resolved
Hide resolved
...balancer/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBalancerConditionals.java
Outdated
Show resolved
Hide resolved
...he/hadoop/hbase/master/balancer/TestLargeClusterBalancingConditionalReplicaDistribution.java
Outdated
Show resolved
Hide resolved
...rver/src/test/java/org/apache/hadoop/hbase/master/balancer/BalancerConditionalsTestUtil.java
Show resolved
Hide resolved
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/BalancerConditionals.java
Show resolved
Hide resolved
...he/hadoop/hbase/master/balancer/TestLargeClusterBalancingConditionalReplicaDistribution.java
Outdated
Show resolved
Hide resolved
...he/hadoop/hbase/master/balancer/TestLargeClusterBalancingConditionalReplicaDistribution.java
Outdated
Show resolved
Hide resolved
ndimiduk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing all my concerns, @rmdmattingly . I think that everything left is in the realm of your best judgement.
I'm generally uncomfortable with our object lifecycle on these various classes -- static stateful objects hanging around. It's similar to what we've done in the RegionServer and it all of course pre-dates you. I think you've done a pretty good job given what you have to work with.
I'd still appreciate someone else having a look over it, someone more familiar than I with the balancer. Maybe give it the weekend for any other community members to come along? You could post a last-call reply on your mailing list post as a kindness to the other maintainers.
Good on you.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
…tions (#6651) HBASE-28513 The StochasticLoadBalancer should support discrete evaluations Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…tions (#6651) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…tions (#6651) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…tions (#6651) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…tions (#6651) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…tions (#6651) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…tions (#6651) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
…tions (apache#6651) (apache#6720) Signed-off-by: Nick Dimiduk <ndimiduk@apache.org> Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
…rt discrete evaluations (apache#6651) (apache#6720) (will be in 2.7) Signed-off-by: Nick Dimiduk <ndimiduk@apache.org> Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
…rt discrete evaluations (apache#6651) (apache#6720) (will be in 2.7) Signed-off-by: Nick Dimiduk <ndimiduk@apache.org> Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
…rt discrete evaluations (apache#6651) (apache#6720) (will be in 2.7) Signed-off-by: Nick Dimiduk <ndimiduk@apache.org> Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
…rt discrete evaluations (apache#6651) (apache#6720) (will be in 2.7) Signed-off-by: Nick Dimiduk <ndimiduk@apache.org> Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
…tions (apache#6651) (apache#6720) Signed-off-by: Nick Dimiduk <ndimiduk@apache.org> Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
Finally, a big PR here. This adds the balancer conditional framework and our first conditional implementation: replica distribution. This is an improvement on existing cost-based replica distribution for reasons that I'll dig into further. See my design doc here.
You can enable conditional replica distribution via
hbase.master.balancer.stochastic.conditionals.distributeReplicas: set this to true to enable the feature.Improvements on Replica Balancing
Primary replica balancing squashes all other considerations. The default weight for one of the several cost functions that factor into primary replica balancing is 100,000. Meanwhile the default read request cost is 5. The result is that the load balancer, OOTB, basically doesn't care about balancing actual load. To solve this, you can either set primary replica balancing costs to zero, which is fine if you don't use read replicas, or — if you do use read replicas — maybe you can produce a magic incantation of configurations that work just right, until your needs change. Conditionals provide an alternative which works much more cleanly in relation to all of the other considerations that you would like your balancer to have.
Replica cost functions don't balance secondary replicas effectively. While they'll calculate imbalance costs necessary to balance primary replicas away from secondary replicas, there is no sufficient mechanism in the existing cost functions to distribute secondary replicas appropriately. So using >2 replicas on a table has a pretty dubious value proposition. On the other hand, this conditional implementation will ensure that secondary replicas are distributed to the greatest extent that the cluster allows. Even on a relatively tiny minicluster test I was unable to demonstrate that cost-based replica balancing could distribute a 3 replica table perfectly:


….omitting the meaningless snapshots between 4 and 27…
Meanwhile, conditional based replica balancing solved this imbalance effectively:





Testing
I've written a minicluster test to validate that conditional replica balancing works on a small cluster locally, and I've written a larger scale test that mocks the StochasticLoadBalancer in hbase-balancer. This test validates that conditional balancing performance is acceptable; even at a huge scale, even with default balancer costs (which other large scale cost-based replica balancing tests have had to compromise), and even with strict consideration for secondary replicas
cc @ndimiduk