@rmdmattingly
Contributor

Finally, a big PR here. This adds the balancer conditional framework and our first conditional implementation: replica distribution. This is an improvement on existing cost-based replica distribution for reasons that I'll dig into further. See my design doc here.

To enable conditional replica distribution, set hbase.master.balancer.stochastic.conditionals.distributeReplicas to true.
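
As a sketch, assuming the property is picked up from the master's hbase-site.xml like most HBase settings (the file location is an assumption; only the property name comes from this PR), the entry might look like:

```xml
<property>
  <name>hbase.master.balancer.stochastic.conditionals.distributeReplicas</name>
  <value>true</value>
</property>
```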

Improvements on Replica Balancing

Primary replica balancing squashes all other considerations. The default weight for one of the several cost functions that factor into primary replica balancing is 100,000. Meanwhile, the default read request cost is 5. The result is that the load balancer, out of the box, basically doesn't care about balancing actual load. To solve this, you can set primary replica balancing costs to zero, which is fine if you don't use read replicas. If you do use read replicas, maybe you can produce a magic incantation of configurations that works just right, until your needs change. Conditionals provide an alternative that works much more cleanly alongside all of the other considerations that you would like your balancer to have.
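
To make the scale mismatch concrete, here is a rough, simplified illustration. It assumes only two cost functions, each reporting the same raw cost, so that a function's influence is simply its weight divided by the sum of the weights. The class and method names are hypothetical, and the real balancer combines many more functions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: shows how one very large weight dominates a weighted cost sum. */
public class WeightedCostDemo {

  /**
   * Returns the fraction of the total weight held by the named function. Under the
   * simplifying assumption that every function reports the same raw cost, this is
   * also its share of the total weighted cost.
   */
  static double shareOfTotal(Map<String, Double> weights, String name) {
    double total = 0.0;
    for (double w : weights.values()) {
      total += w;
    }
    return weights.get(name) / total;
  }

  public static void main(String[] args) {
    Map<String, Double> weights = new LinkedHashMap<>();
    weights.put("primaryReplica", 100_000.0); // default cited in the description
    weights.put("readRequest", 5.0); // default cited in the description
    System.out.printf("primaryReplica share: %.6f%n", shareOfTotal(weights, "primaryReplica"));
    System.out.printf("readRequest share:    %.6f%n", shareOfTotal(weights, "readRequest"));
  }
}
```

Under these assumptions the read request function contributes 5 / 100,005 of the total, or roughly 0.005%, which is why it has almost no say in the balancer's decisions.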

Replica cost functions don't balance secondary replicas effectively. While they'll calculate imbalance costs necessary to balance primary replicas away from secondary replicas, there is no sufficient mechanism in the existing cost functions to distribute secondary replicas appropriately. So using >2 replicas on a table has a pretty dubious value proposition. On the other hand, this conditional implementation will ensure that secondary replicas are distributed to the greatest extent that the cluster allows. Even on a relatively tiny minicluster test I was unable to demonstrate that cost-based replica balancing could distribute a 3 replica table perfectly:
[Image: cf1]
[Image: cf2]

…omitting the meaningless snapshots between 4 and 27…

[Image: cf28]

Meanwhile, conditional based replica balancing solved this imbalance effectively:
[Images: bc1, bc2, bc3, bc4, bc5]
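
The difference between a weighted cost and a conditional can be sketched as a plain yes/no check. The following is a minimal, hypothetical sketch, not the PR's implementation: it assumes replicas of the same region share an integer id, and it only checks at the server level. All class and method names here are made up for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a replica-distribution conditional as a discrete check. */
public class ReplicaConditionalSketch {
  // Server name -> ids of regions for which the server already hosts a replica.
  // Using one id per region (shared by all of its replicas) is an assumption.
  private final Map<String, Set<Integer>> regionsByServer = new HashMap<>();

  /** Records that the server hosts some replica of the given region. */
  void assign(String server, int regionId) {
    regionsByServer.computeIfAbsent(server, s -> new HashSet<>()).add(regionId);
  }

  /** A move violates the conditional if the destination already hosts a replica of the region. */
  boolean wouldViolate(String destinationServer, int regionId) {
    return regionsByServer
      .getOrDefault(destinationServer, Collections.emptySet())
      .contains(regionId);
  }

  public static void main(String[] args) {
    ReplicaConditionalSketch sketch = new ReplicaConditionalSketch();
    sketch.assign("rs1", 7);
    System.out.println("move replica of region 7 to rs1 violates: " + sketch.wouldViolate("rs1", 7));
    System.out.println("move replica of region 7 to rs2 violates: " + sketch.wouldViolate("rs2", 7));
  }
}
```

Unlike a cost function, a check like this cannot be outvoted by other weights: a candidate move either keeps replicas apart or it does not.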

Testing

I've written a minicluster test to validate that conditional replica balancing works on a small cluster locally, and I've written a larger scale test that mocks the StochasticLoadBalancer in hbase-balancer. This test validates that conditional balancing performance is acceptable even at a huge scale, even with default balancer costs (which other large-scale cost-based replica balancing tests have had to compromise on), and even with strict consideration for secondary replicas.

cc @ndimiduk

@rmdmattingly rmdmattingly requested a review from ndimiduk January 28, 2025 20:41
return newRegions;
}

int[] removeRegions(int[] regions, Set<Integer> regionIndicesToRemove) {
Contributor Author

This method, and addRegions below, are just a nicer way to add or remove regions from the BalancerClusterState (BCS) arrays in bulk.
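
For readers without the diff open, bulk helpers of this kind might look roughly like the sketch below. Only the removeRegions signature is taken from the snippet above. The method bodies, the addRegions signature, and the enclosing class name are assumptions for illustration, not the PR's code.

```java
import java.util.Arrays;
import java.util.Set;

/** Hypothetical sketch of bulk add/remove helpers over an int[] of region indices. */
public class RegionArraySketch {

  /** Returns a new array holding every entry of regions that is not in regionIndicesToRemove. */
  static int[] removeRegions(int[] regions, Set<Integer> regionIndicesToRemove) {
    int[] newRegions = new int[regions.length];
    int count = 0;
    for (int region : regions) {
      if (!regionIndicesToRemove.contains(region)) {
        newRegions[count++] = region;
      }
    }
    // Trim the unused tail left behind by the removed entries.
    return Arrays.copyOf(newRegions, count);
  }

  /** Returns a new array with every index in regionIndicesToAdd appended (assumed signature). */
  static int[] addRegions(int[] regions, Set<Integer> regionIndicesToAdd) {
    int[] newRegions = Arrays.copyOf(regions, regions.length + regionIndicesToAdd.size());
    int next = regions.length;
    for (int region : regionIndicesToAdd) {
      newRegions[next++] = region;
    }
    return newRegions;
  }

  public static void main(String[] args) {
    int[] regions = { 1, 2, 3, 4 };
    System.out.println(Arrays.toString(removeRegions(regions, Set.of(2, 4))));
    System.out.println(Arrays.toString(addRegions(regions, Set.of(9))));
  }
}
```

Doing the work in one pass means a batch of moves costs a single array copy rather than one copy per region.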

* from finding a solution.
*/
@InterfaceAudience.Private
final class BalancerConditionals implements Configurable {
Contributor Author

This acts as a unified interface for interacting with whatever conditional(s) one might have enabled.

Comment on lines 41 to 46
public enum ValidationLevel {
SERVER, // Just check server
HOST, // Check host and server
RACK // Check rack, host, and server
}
Contributor Author

All 3 checks will be necessary for replica balancing, but for most conditionals (I envision a system table isolation conditional, for example) my guess is that only server validation will be necessary. So I figured that this is a nice way to let implementations be as simple as possible.
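
One way to read the enum comments is that each level includes the checks of the levels before it. The sketch below shows that cumulative behavior under that assumption. The isViolating helper and the BooleanSupplier parameters are hypothetical and are not taken from the PR.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of cumulative validation levels: SERVER, then HOST, then RACK. */
public class ValidationLevelSketch {
  enum ValidationLevel {
    SERVER, // Just check server
    HOST, // Check host and server
    RACK // Check rack, host, and server
  }

  /** Runs only the checks the level asks for and stops at the first violation found. */
  static boolean isViolating(ValidationLevel level, BooleanSupplier serverCheck,
    BooleanSupplier hostCheck, BooleanSupplier rackCheck) {
    if (serverCheck.getAsBoolean()) {
      return true;
    }
    if (level == ValidationLevel.SERVER) {
      return false;
    }
    if (hostCheck.getAsBoolean()) {
      return true;
    }
    if (level == ValidationLevel.HOST) {
      return false;
    }
    return rackCheck.getAsBoolean();
  }

  public static void main(String[] args) {
    // A host-level conflict is invisible to a SERVER-level conditional.
    System.out.println(isViolating(ValidationLevel.SERVER, () -> false, () -> true, () -> false));
    System.out.println(isViolating(ValidationLevel.HOST, () -> false, () -> true, () -> false));
  }
}
```

A conditional that only needs server validation would then never pay for the host or rack lookups.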

@rmdmattingly rmdmattingly force-pushed the HBASE-28513-only-replicas branch from c03c88c to 99f0c2c Compare January 29, 2025 13:26
@rmdmattingly
Contributor Author

Rebased

@rmdmattingly rmdmattingly force-pushed the HBASE-28513-only-replicas branch from 99f0c2c to 561d2e4 Compare January 29, 2025 15:07
import org.apache.hbase.thirdparty.com.google.common.cache.CacheLoader;
import org.apache.hbase.thirdparty.com.google.common.cache.LoadingCache;

public class ReplicaKeyCache implements Configurable {
Contributor Author

If I were reviewing this PR then I'd probably ask whether this is an early/unnecessary optimization. But object creation isn't free, and replica distribution will need to check many of these ReplicaKeys. This cache meaningfully cut the runtime of our large cluster test: by several minutes, or about 50%.
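
The general idea of reusing derived key objects instead of rebuilding them can be sketched with a plain memoizing map. The PR's class is built on a Guava LoadingCache, per the imports above. The sketch below swaps in ConcurrentHashMap.computeIfAbsent to stay self-contained, has no eviction, and uses hypothetical names throughout.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical memoizing cache: builds each derived key once, then reuses the same object. */
public class KeyCacheSketch<I, K> {
  private final Map<I, K> cache = new ConcurrentHashMap<>();
  private final Function<I, K> loader;
  private final AtomicInteger loads = new AtomicInteger();

  KeyCacheSketch(Function<I, K> loader) {
    this.loader = loader;
  }

  /** Returns the cached key for the input, invoking the loader only on the first request. */
  K get(I input) {
    return cache.computeIfAbsent(input, in -> {
      loads.incrementAndGet();
      return loader.apply(in);
    });
  }

  /** Number of times the loader actually ran, which is useful for seeing the savings. */
  int loadCount() {
    return loads.get();
  }

  public static void main(String[] args) {
    // The "key" here is just an uppercased string standing in for a real derived key.
    KeyCacheSketch<String, String> cache = new KeyCacheSketch<>(String::toUpperCase);
    for (int i = 0; i < 1000; i++) {
      cache.get("region-a");
    }
    System.out.println("lookups: 1000, loads: " + cache.loadCount());
  }
}
```

When the same regions are checked many times during a balancer run, the loader cost is paid once per distinct region rather than once per check.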

@rmdmattingly rmdmattingly force-pushed the HBASE-28513-only-replicas branch from 561d2e4 to c963fc9 Compare January 29, 2025 18:20
@rmdmattingly rmdmattingly force-pushed the HBASE-28513-only-replicas branch 2 times, most recently from adb82d1 to 564fd27 Compare January 29, 2025 21:00
@rmdmattingly rmdmattingly requested a review from Apache9 February 4, 2025 18:04
@rmdmattingly rmdmattingly force-pushed the HBASE-28513-only-replicas branch 2 times, most recently from f5e7773 to be42fcd Compare February 7, 2025 14:17
@rmdmattingly rmdmattingly force-pushed the HBASE-28513-only-replicas branch from be42fcd to a7fb623 Compare February 7, 2025 15:28
Member

@ndimiduk ndimiduk left a comment

A tour de force!

Member

@ndimiduk ndimiduk left a comment

Thanks for addressing all my concerns, @rmdmattingly. I think that everything left is in the realm of your best judgement.

I'm generally uncomfortable with our object lifecycle on these various classes -- static stateful objects hanging around. It's similar to what we've done in the RegionServer and it all of course pre-dates you. I think you've done a pretty good job given what you have to work with.

I'd still appreciate someone else having a look over it, someone more familiar than I with the balancer. Maybe give it the weekend for any other community members to come along? You could post a last-call reply on your mailing list post as a kindness to the other maintainers.

Good on you.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 🆗 | reexec | 0m 28s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
| | | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 29s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 3m 17s | | master passed |
| +1 💚 | compile | 3m 30s | | master passed |
| +1 💚 | checkstyle | 0m 44s | | master passed |
| +1 💚 | spotbugs | 1m 55s | | master passed |
| +1 💚 | spotless | 0m 43s | | branch has no errors when running spotless:check. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 12s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 3m 5s | | the patch passed |
| +1 💚 | compile | 3m 33s | | the patch passed |
| -0 ⚠️ | javac | 0m 24s | /results-compile-javac-hbase-balancer.txt | hbase-balancer generated 1 new + 57 unchanged - 0 fixed = 58 total (was 57) |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 ⚠️ | checkstyle | 0m 7s | /buildtool-patch-checkstyle-hbase-balancer.txt | The patch fails to run checkstyle in hbase-balancer |
| +1 💚 | spotbugs | 2m 7s | | the patch passed |
| +1 💚 | hadoopcheck | 11m 40s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. |
| +1 💚 | spotless | 0m 43s | | patch has no errors when running spotless:check. |
| | | | | _ Other Tests _ |
| +1 💚 | asflicense | 0m 17s | | The patch does not generate ASF License warnings. |
| | | 41m 4s | | |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6651/14/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #6651 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 5f9db8bce7c7 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 3e37a94 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 86 (vs. ulimit of 30000) |
| modules | C: hbase-balancer hbase-server U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6651/14/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 🆗 | reexec | 0m 36s | | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 3s | | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
| | | | | _ Prechecks _ |
| | | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 11s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 3m 33s | | master passed |
| +1 💚 | compile | 1m 30s | | master passed |
| +1 💚 | javadoc | 0m 53s | | master passed |
| +1 💚 | shadedjars | 6m 27s | | branch has no errors when building our shaded downstream artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 13s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 3m 38s | | the patch passed |
| +1 💚 | compile | 1m 21s | | the patch passed |
| +1 💚 | javac | 1m 21s | | the patch passed |
| +1 💚 | javadoc | 0m 49s | | the patch passed |
| +1 💚 | shadedjars | 6m 29s | | patch has no errors when building our shaded downstream artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 7m 25s | | hbase-balancer in the patch passed. |
| +1 💚 | unit | 250m 28s | | hbase-server in the patch passed. |
| | | 289m 7s | | |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6651/14/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | #6651 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux 5847ad4ef1cb 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 3e37a94 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6651/14/testReport/ |
| Max. process+thread count | 4510 (vs. ulimit of 30000) |
| modules | C: hbase-balancer hbase-server U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6651/14/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@rmdmattingly rmdmattingly merged commit d24a0ed into apache:master Feb 24, 2025
1 check passed
@rmdmattingly rmdmattingly deleted the HBASE-28513-only-replicas branch February 24, 2025 19:29
rmdmattingly added a commit that referenced this pull request Feb 24, 2025
…tions (#6651)

HBASE-28513 The StochasticLoadBalancer should support discrete evaluations

Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
rmdmattingly added a commit that referenced this pull request Feb 24, 2025
…tions (#6651)

Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
rmdmattingly added a commit that referenced this pull request Feb 25, 2025
…tions (#6651)

Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
rmdmattingly added a commit that referenced this pull request Feb 25, 2025
…tions (#6651) (#6719)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
rmdmattingly added a commit that referenced this pull request Feb 25, 2025
…tions (#6651)

Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
rmdmattingly added a commit that referenced this pull request Feb 27, 2025
…tions (#6651)

Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
rmdmattingly added a commit that referenced this pull request Feb 27, 2025
…tions (#6651)

Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
rmdmattingly added a commit that referenced this pull request Feb 28, 2025
…tions (#6651)

Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
rmdmattingly added a commit that referenced this pull request Feb 28, 2025
…tions (#6651) (#6720)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
rmdmattingly added a commit to HubSpot/hbase that referenced this pull request Mar 7, 2025
…tions (apache#6651) (apache#6720)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
rmdmattingly added a commit to HubSpot/hbase that referenced this pull request Mar 7, 2025
…rt discrete evaluations (apache#6651) (apache#6720) (will be in 2.7)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
charlesconnell pushed a commit to HubSpot/hbase that referenced this pull request Mar 7, 2025
…rt discrete evaluations (apache#6651) (apache#6720) (will be in 2.7)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
charlesconnell pushed a commit to HubSpot/hbase that referenced this pull request Jun 25, 2025
…rt discrete evaluations (apache#6651) (apache#6720) (will be in 2.7)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
charlesconnell pushed a commit to HubSpot/hbase that referenced this pull request Jul 1, 2025
…rt discrete evaluations (apache#6651) (apache#6720) (will be in 2.7)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
mokai87 pushed a commit to mokai87/hbase that referenced this pull request Aug 7, 2025
…tions (apache#6651) (apache#6720)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>