@rmdmattingly
Contributor

In HBASE-28513 we added support for balancer conditionals, which give the balancer discrete rules to follow when generating plans. This PR adds support for meta table isolation that works in a more flexible, lightweight, and cost-effective way than RegionServer groups.

In a subsequent PR I'll solve HBASE-29075 by extending this framework to support generic system table isolation as well. The triad of 28513, 29074, and 29075 will complete my team's initial vision for balancer conditionals, unlocking support for system table isolation, meta table isolation, and improved secondary replica distribution.

@rmdmattingly force-pushed the rmattingly-HBASE-29074 branch from d3d6f4d to d587239 on February 25, 2025 15:15
}

if (LOG.isDebugEnabled()) {
if (LOG.isTraceEnabled()) {
Contributor Author

These logs are really noisy in tests, and low value imo, so I've demoted them to trace

Member

👍

Comment on lines 101 to 111
boolean isServerHostingIsolatedTables(BalancerClusterState cluster, int serverIdx) {
Set<TableIsolationConditional> tableIsolationConditionals =
conditionals.stream().filter(TableIsolationConditional.class::isInstance)
.map(TableIsolationConditional.class::cast).collect(Collectors.toSet());
for (TableIsolationConditional tableIsolationConditional : tableIsolationConditionals) {
if (tableIsolationConditional.isServerHostingIsolatedTables(cluster, serverIdx)) {
return true;
}
}
return false;
}
Contributor Author

This helps us power sloppy server evaluation that's compatible with table isolation
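
For illustration, a minimal sketch of the same "filter by type, then ask each conditional" check, collapsed into a single `anyMatch`. The types here (`Conditional`, `TableIsolation`, `OtherConditional`, `IsolationCheck`) are simplified, hypothetical stand-ins and not the HBase classes; the cluster state is reduced to a set of server indexes.

```java
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for the balancer's conditional types; not the HBase API.
interface Conditional {}

final class TableIsolation implements Conditional {
  private final Set<Integer> isolatedServers;

  TableIsolation(Set<Integer> isolatedServers) {
    this.isolatedServers = isolatedServers;
  }

  boolean isServerHostingIsolatedTables(int serverIdx) {
    return isolatedServers.contains(serverIdx);
  }
}

final class OtherConditional implements Conditional {}

final class IsolationCheck {
  // True if any table-isolation conditional reports the server as hosting isolated tables.
  static boolean isServerHostingIsolatedTables(List<Conditional> conditionals, int serverIdx) {
    return conditionals.stream()
      .filter(TableIsolation.class::isInstance)
      .map(TableIsolation.class::cast)
      .anyMatch(c -> c.isServerHostingIsolatedTables(serverIdx));
  }
}
```

Conditionals of other types are ignored by the filter, so a server is only treated as special when an isolation conditional says so.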

* If enabled, this class will help the balancer ensure that the meta table lives on its own
* RegionServer. Configure this via {@link BalancerConditionals#ISOLATE_META_TABLE_KEY}
*/
class MetaTableIsolationConditional extends TableIsolationConditional {
Contributor Author

This implementation, and the MetaTableIsolationCandidateGenerator, are so lightweight because I anticipate adding a SystemTableIsolationConditional/CandidateGenerator soon

Comment on lines +67 to +71
BalanceAction batchMovesAndResetClusterState(BalancerClusterState cluster,
List<MoveRegionAction> moves) {
if (moves.isEmpty()) {
return BalanceAction.NULL_ACTION;
}
Contributor Author

Returning a null action is better here, because it signifies that there's no work to be done by the generator — whereas a batch, even an empty one, is easily misinterpreted as actionable work
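
A small sketch of the sentinel idea described above, under simplified assumptions: `Action`, `Move`, `Batch`, and `Batcher` are hypothetical types invented for this example and do not mirror the real balancer classes. The point is only that an empty input maps to one shared "nothing to do" object rather than to an empty batch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified action types; illustrative names, not the HBase classes.
interface Action {
  // Shared sentinel meaning "there is no work to do".
  Action NULL_ACTION = new Action() {
    @Override
    public String toString() {
      return "NULL_ACTION";
    }
  };
}

final class Move implements Action {
  final int region;
  final int toServer;

  Move(int region, int toServer) {
    this.region = region;
    this.toServer = toServer;
  }
}

final class Batch implements Action {
  final List<Move> moves;

  Batch(List<Move> moves) {
    this.moves = new ArrayList<>(moves);
  }
}

final class Batcher {
  // An empty move list yields the sentinel, so callers never receive an empty batch.
  static Action batchMoves(List<Move> moves) {
    if (moves.isEmpty()) {
      return Action.NULL_ACTION;
    }
    return new Batch(moves);
  }
}
```

Callers can then compare against `Action.NULL_ACTION` by identity instead of inspecting a batch's size.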

Comment on lines 437 to 441
if (
!balancerConditionals.isTableIsolationEnabled() // table isolation is inherently incompatible
// with naive "sloppy server" checks
&& sloppyRegionServerExist(cs)
) {
Contributor Author

In reality, I still think we will fix "sloppy" servers more effectively than a non-conditional balancer implementation by way of the SlopFixingCandidateGenerator (which is table isolation compatible, and performs better than the cost function based approach in my testing)

return generateCandidate(cluster, false);
}

BalanceAction generateCandidate(BalancerClusterState cluster, boolean isWeighing) {
Contributor Author

This PR is pretty straightforward in my opinion; if there's any head scratching complexity, then it's in this method. I'm happy to explore breaking up this method, which is admittedly probably too long, but doing so would probably introduce a much larger diff because we'd need more objects to represent the state that should be passed around across the goals of isolation + colocation

Member

@ndimiduk left a comment

Mostly just comments and questions. Really nice to see how these play out in test.

}

if (LOG.isDebugEnabled()) {
if (LOG.isTraceEnabled()) {
Member

👍

boolean shouldSkipSloppyServerEvaluation() {
return isConditionalBalancingEnabled();
boolean isTableIsolationEnabled() {
return conditionalClasses.contains(MetaTableIsolationConditional.class);
Member

This strategy of feature enablement by class presence is interesting. I wonder how we'll have to change this approach if we introduce user-provided implementations in the future. Just a thought.

Contributor Author

When I introduce SystemTableIsolation I anticipate making this an isAssignableFrom check much like isReplicaDistributionEnabled
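
To make the difference concrete, here is a hedged sketch contrasting an exact-class `contains` check with an `isAssignableFrom` check. The class hierarchy is a hypothetical reconstruction from the names in this thread (`SystemTableIsolationConditional` does not exist in this PR), and `FeatureFlags` is invented for the example.

```java
import java.util.Set;

// Hypothetical class hierarchy using the names from this thread; not the real classes.
abstract class TableIsolationConditional {}

class MetaTableIsolationConditional extends TableIsolationConditional {}

class SystemTableIsolationConditional extends TableIsolationConditional {}

class UnrelatedConditional {}

final class FeatureFlags {
  // Exact-class check: only recognizes the one named implementation.
  static boolean enabledByExactClass(Set<Class<?>> conditionalClasses) {
    return conditionalClasses.contains(MetaTableIsolationConditional.class);
  }

  // Subtype check: recognizes any class that extends TableIsolationConditional.
  static boolean enabledBySubtype(Set<Class<?>> conditionalClasses) {
    return conditionalClasses.stream()
      .anyMatch(TableIsolationConditional.class::isAssignableFrom);
  }
}
```

With only `SystemTableIsolationConditional` configured, the exact-class check reports the feature as disabled while the subtype check reports it as enabled.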


@Override
boolean isRegionToIsolate(RegionInfo regionInfo) {
return regionInfo.isMetaRegion();
Member

Is it kind of a code smell that both the CandidateGenerator and the Conditional implementation have repeated logic? As you land your additional implementations, I wonder if this will suggest a change to the interfaces.

Contributor Author

Probably yes. Let me think about reusability, though I might chalk this up as a relatively small imperfection.

Contributor Author

I'm going to leave this alone for now, I think it's a disappointing but small reality of the logical divide between conditionals and candidate generators. Will keep thinking on this as I introduce the system table isolation conditional though
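
One possible way to reduce the duplication discussed here, offered only as a sketch and not as the approach this PR or its follow-ups take: define the "which regions to isolate" rule once and hand it to both sides. All types below (`RegionStub`, `IsolationRules`, `ConditionalSketch`, `GeneratorSketch`) are hypothetical.

```java
import java.util.function.Predicate;

// Hypothetical stand-in for RegionInfo; it carries only the flag needed here.
final class RegionStub {
  final boolean metaRegion;

  RegionStub(boolean metaRegion) {
    this.metaRegion = metaRegion;
  }
}

// Single definition of the isolation rule, shared by both consumers.
final class IsolationRules {
  static final Predicate<RegionStub> META = r -> r.metaRegion;
}

final class ConditionalSketch {
  private final Predicate<RegionStub> rule;

  ConditionalSketch(Predicate<RegionStub> rule) {
    this.rule = rule;
  }

  boolean isRegionToIsolate(RegionStub region) {
    return rule.test(region);
  }
}

final class GeneratorSketch {
  private final Predicate<RegionStub> rule;

  GeneratorSketch(Predicate<RegionStub> rule) {
    this.rule = rule;
  }

  boolean isRegionToIsolate(RegionStub region) {
    return rule.test(region);
  }
}
```

The trade-off is one more object to pass around, which may or may not be worth it for a one-line predicate.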

private static void validateRegionLocationsWithRetry(Connection connection,
Set<TableName> tableNames, TableName productTableName, boolean areDistributed,
boolean runBalancerOnFailure) throws InterruptedException, IOException {
for (int i = 0; i < 100; i++) {
Member

nit: use a Waiter instead?

Contributor Author

The same limitations apply that make me hesitant to use a waiter: I don't see how callbacks are supported to the degree that I need. Am I looking in the wrong place with Waiter#waitFor(...Predicate...)?

Member

Nope, that's the waiter I mean.
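
To illustrate the shape of the trade-off, here is a minimal polling helper with a per-retry callback. It is a hypothetical stand-in written for this example (`Poll` and its `waitFor` signature are assumptions), not HBase's `Waiter`, whose exact API is not reproduced here.

```java
import java.util.function.BooleanSupplier;

// Hypothetical polling helper; a stand-in for illustration, not HBase's Waiter.
final class Poll {
  // Polls the condition until it holds or timeoutMs elapses.
  // Runs onRetry after each failed check, before sleeping and checking again.
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition,
    Runnable onRetry) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() >= deadline) {
        return false;
      }
      onRetry.run(); // e.g. trigger another balancer run before re-checking
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
  }
}
```

A predicate-only waiter can emulate the callback by performing the side effect inside the predicate itself when the check fails, at the cost of a less pure predicate.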

@rmdmattingly force-pushed the rmattingly-HBASE-29074 branch from 797f85f to 0cd9bd1 on February 26, 2025 16:29

if (this == o) {
return true;
}
if (!(o instanceof ReplicaKey other)) {
Contributor Author

This change adheres to our restrictions around new language features, please see #6729
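
As a hedged illustration of what avoiding the newer syntax can look like, the sketch below writes `equals` with a classic `instanceof` check followed by an explicit cast, rather than an `instanceof` pattern variable. `ExampleKey` and its fields are hypothetical and are not the real `ReplicaKey`.

```java
import java.util.Objects;

// Hypothetical key class; the fields are illustrative, not those of the real ReplicaKey.
final class ExampleKey {
  private final String table;
  private final int replicaId;

  ExampleKey(String table, int replicaId) {
    this.table = Objects.requireNonNull(table);
    this.replicaId = replicaId;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof ExampleKey)) {
      return false;
    }
    // Explicit cast instead of an instanceof pattern variable.
    ExampleKey other = (ExampleKey) o;
    return replicaId == other.replicaId && table.equals(other.table);
  }

  @Override
  public int hashCode() {
    return Objects.hash(table, replicaId);
  }
}
```

This form compiles on older language levels, which matters when the same code is backported to branches built with earlier JDKs.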

@rmdmattingly force-pushed the rmattingly-HBASE-29074 branch from 0cd9bd1 to c1314d7 on February 27, 2025 15:10

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|------|-----------|---------|---------|---------|
| +0 🆗 | reexec | 0m 29s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
| | | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 11s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 3m 2s | | master passed |
| +1 💚 | compile | 3m 26s | | master passed |
| -0 ⚠️ | checkstyle | 0m 9s | /buildtool-branch-checkstyle-hbase-balancer.txt | The patch fails to run checkstyle in hbase-balancer |
| +1 💚 | spotbugs | 1m 56s | | master passed |
| +1 💚 | spotless | 0m 45s | | branch has no errors when running spotless:check. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 11s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 3m 2s | | the patch passed |
| +1 💚 | compile | 3m 24s | | the patch passed |
| -0 ⚠️ | javac | 0m 22s | /results-compile-javac-hbase-balancer.txt | hbase-balancer generated 2 new + 58 unchanged - 0 fixed = 60 total (was 58) |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 ⚠️ | checkstyle | 0m 9s | /results-checkstyle-hbase-balancer.txt | hbase-balancer: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
| +1 💚 | spotbugs | 2m 6s | | the patch passed |
| +1 💚 | hadoopcheck | 11m 42s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. |
| +1 💚 | spotless | 0m 43s | | patch has no errors when running spotless:check. |
| | | | | _ Other Tests _ |
| +1 💚 | asflicense | 0m 16s | | The patch does not generate ASF License warnings. |
| | | 40m 19s | | |

| Subsystem | Report/Notes |
|-----------|--------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6722/5/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #6722 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 104f92c2ba33 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / c1314d7 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 84 (vs. ulimit of 30000) |
| modules | C: hbase-balancer hbase-server U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6722/5/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|------|-----------|---------|---------|---------|
| +0 🆗 | reexec | 0m 26s | | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 2s | | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
| | | | | _ Prechecks _ |
| | | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 10s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 3m 13s | | master passed |
| +1 💚 | compile | 1m 10s | | master passed |
| +1 💚 | javadoc | 0m 39s | | master passed |
| +1 💚 | shadedjars | 5m 55s | | branch has no errors when building our shaded downstream artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 14s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 3m 3s | | the patch passed |
| +1 💚 | compile | 1m 12s | | the patch passed |
| +1 💚 | javac | 1m 12s | | the patch passed |
| +1 💚 | javadoc | 0m 41s | | the patch passed |
| +1 💚 | shadedjars | 5m 55s | | patch has no errors when building our shaded downstream artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 6m 59s | | hbase-balancer in the patch passed. |
| +1 💚 | unit | 214m 28s | | hbase-server in the patch passed. |
| | | 249m 10s | | |

| Subsystem | Report/Notes |
|-----------|--------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6722/5/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | #6722 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux 0928d25c0126 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / c1314d7 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6722/5/testReport/ |
| Max. process+thread count | 5307 (vs. ulimit of 30000) |
| modules | C: hbase-balancer hbase-server U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6722/5/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@rmdmattingly rmdmattingly merged commit 189c513 into apache:master Feb 27, 2025
1 check passed
@rmdmattingly rmdmattingly deleted the rmattingly-HBASE-29074 branch February 27, 2025 22:12
rmdmattingly added a commit that referenced this pull request Feb 27, 2025
…#6722)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
rmdmattingly added a commit that referenced this pull request Feb 28, 2025
…#6722) (#6735)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
rmdmattingly added a commit that referenced this pull request Feb 28, 2025
…#6722)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
rmdmattingly added a commit that referenced this pull request Feb 28, 2025
…#6722)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
rmdmattingly added a commit that referenced this pull request Mar 4, 2025
…#6722)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
rmdmattingly added a commit that referenced this pull request Mar 5, 2025
…#6722)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
rmdmattingly added a commit that referenced this pull request Mar 6, 2025
…#6722) (#6737)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
rmdmattingly added a commit to HubSpot/hbase that referenced this pull request Mar 7, 2025
…apache#6722) (apache#6737)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
rmdmattingly added a commit to HubSpot/hbase that referenced this pull request Mar 7, 2025
…ta table isolation (apache#6722) (apache#6737) (will be in 2.7)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
charlesconnell pushed a commit to HubSpot/hbase that referenced this pull request Mar 7, 2025
…ta table isolation (apache#6722) (apache#6737) (will be in 2.7)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
charlesconnell pushed a commit to HubSpot/hbase that referenced this pull request Jun 25, 2025
…ta table isolation (apache#6722) (apache#6737) (will be in 2.7)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
charlesconnell pushed a commit to HubSpot/hbase that referenced this pull request Jul 1, 2025
…ta table isolation (apache#6722) (apache#6737) (will be in 2.7)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>
mokai87 pushed a commit to mokai87/hbase that referenced this pull request Aug 7, 2025
…apache#6722) (apache#6737)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Co-authored-by: Ray Mattingly <rmattingly@hubspot.com>