Skip to content

[fix][broker] Fix loadbalance score calculation problem#19420

Merged
Technoboy- merged 2 commits intoapache:masterfrom
gaozhangmin:loadbalance-rate-problem
Feb 15, 2023
Merged

[fix][broker] Fix loadbalance score calculation problem#19420
Technoboy- merged 2 commits intoapache:masterfrom
gaozhangmin:loadbalance-rate-problem

Conversation

@gaozhangmin
Copy link
Copy Markdown
Contributor

@gaozhangmin gaozhangmin commented Feb 3, 2023

Motivation

There is a problem in my environment, a bundle cannot be loaded to an empty broker(no bundles on it).
I found that this empty broker still has preallocatedBundleData on it which caused the score is higher than others.

private static double getScore(final BrokerData brokerData, final ServiceConfiguration conf) {
final double overloadThreshold = conf.getLoadBalancerBrokerOverloadedThresholdPercentage() / 100.0;
final double maxUsage = brokerData.getLocalData().getMaxResourceUsage();
if (maxUsage > overloadThreshold) {
log.warn("Broker {} is overloaded: max usage={}", brokerData.getLocalData().getWebServiceUrl(), maxUsage);
return Double.POSITIVE_INFINITY;
}
double totalMessageRate = 0;
for (BundleData bundleData : brokerData.getPreallocatedBundleData().values()) {
final TimeAverageMessageData longTermData = bundleData.getLongTermData();
totalMessageRate += longTermData.getMsgRateIn() + longTermData.getMsgRateOut();
}
// calculate estimated score
final TimeAverageBrokerData timeAverageData = brokerData.getTimeAverageData();
final double timeAverageLongTermMessageRate = timeAverageData.getLongTermMsgRateIn()
+ timeAverageData.getLongTermMsgRateOut();
final double totalMessageRateEstimate = totalMessageRate + timeAverageLongTermMessageRate;
if (log.isDebugEnabled()) {
log.debug("Broker {} has long term message rate {}",
brokerData.getLocalData().getWebServiceUrl(), totalMessageRateEstimate);
}
return totalMessageRateEstimate;
}

This issue is caused by the preallocatedBundleData is not cleared from brokerData. LINE 573 and LINE 580 is the condition of clearing preallocatedBundleData.

There is a corner case, which bundle is unload before broker data reportted to Zookeeper. the preallocatedBundleData would never have the chance of clearing from loadData of leader.

final Map<String, BundleData> preallocatedBundleData = brokerData.getPreallocatedBundleData();
synchronized (preallocatedBundleData) {
for (String preallocatedBundleName : brokerData.getPreallocatedBundleData().keySet()) {
if (brokerData.getLocalData().getBundles().contains(preallocatedBundleName)) {
final Iterator<Map.Entry<String, BundleData>> preallocatedIterator =
preallocatedBundleData.entrySet()
.iterator();
while (preallocatedIterator.hasNext()) {
final String bundle = preallocatedIterator.next().getKey();
if (bundleData.containsKey(bundle)) {
preallocatedIterator.remove();
preallocatedBundleToBroker.remove(bundle);
}
}
}
// This is needed too in case a broker which was assigned a bundle dies and comes back up.
preallocatedBundleToBroker.remove(preallocatedBundleName);
}
}

Modifications

If current broker doesn't own the bundle, clearing the preallocatedBundleData

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

gaozhangmin#10

@gaozhangmin gaozhangmin self-assigned this Feb 3, 2023
@github-actions github-actions Bot added the doc-not-needed Your PR changes do not impact docs label Feb 3, 2023
@gaozhangmin gaozhangmin changed the title [Fix][broker] Fix loadbalance score caculation problem [fix][broker] Fix loadbalance score caculation problem Feb 3, 2023
@gaozhangmin gaozhangmin added this to the 2.11.0 milestone Feb 3, 2023
Copy link
Copy Markdown
Contributor

@HQebupt HQebupt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gaozhangmin gaozhangmin added ready-to-test doc-not-needed Your PR changes do not impact docs and removed ready-to-test doc-not-needed Your PR changes do not impact docs labels Feb 7, 2023
@Technoboy- Technoboy- merged commit 456d112 into apache:master Feb 15, 2023
@lhotari
Copy link
Copy Markdown
Member

lhotari commented Feb 15, 2023

If there was a bug, why didn't we try to add a unit test to reproduce the issue and prevent further regressions?

liangyepianzhou pushed a commit that referenced this pull request Feb 16, 2023
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Feb 28, 2023
(cherry picked from commit 456d112)
(cherry picked from commit b4b664c)
@momo-jun momo-jun changed the title [fix][broker] Fix loadbalance score caculation problem [fix][broker] Fix loadbalance score calculation problem Apr 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants