Optimize ThresholdShedder strategy: the low-load Broker cannot be fully utilized #12471

lordcheng10 · 2021-10-23T10:32:20Z

Motivation

Consider a large cluster with many brokers. If most brokers are roughly balanced and only a few have extremely low load, the current logic may never trigger a balancing action. For example:
broker1 brokerAvgResourceUsage 80
......
broker100 brokerAvgResourceUsage 80

broker101 brokerAvgResourceUsage 10

Here avgUsage = (80*100 + 10)/101 ≈ 79. With the threshold at its default value of 10, every broker satisfies currentUsage < avgUsage + threshold, so the balancing operation is never triggered and the low-load machines cannot be fully utilized.
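
The arithmetic above can be checked with a small standalone sketch. It is not the ThresholdShedder code: the class and method names are hypothetical, and usage is expressed in percent as in the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the scenario above; usages are percentages.
class LowLoadBrokerExample {

    // Simple (unweighted) mean of the per-broker usage.
    static double averageUsage(Map<String, Double> usageByBroker) {
        return usageByBroker.values().stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(0.0);
    }

    // Counts brokers that fail the "currentUsage < avgUsage + threshold" check,
    // i.e. the only brokers that condition would select for shedding.
    static long countOverloaded(Map<String, Double> usageByBroker, double threshold) {
        double avgUsage = averageUsage(usageByBroker);
        return usageByBroker.values().stream()
                .filter(currentUsage -> currentUsage >= avgUsage + threshold)
                .count();
    }

    // 100 brokers at 80 and one broker at 10.
    static Map<String, Double> exampleCluster() {
        Map<String, Double> usage = new LinkedHashMap<>();
        for (int i = 1; i <= 100; i++) {
            usage.put("broker" + i, 80.0);
        }
        usage.put("broker101", 10.0);
        return usage;
    }

    public static void main(String[] args) {
        Map<String, Double> usage = exampleCluster();
        System.out.println("avgUsage=" + averageUsage(usage)
                + " overloaded=" + countOverloaded(usage, 10.0));
    }
}
```

With a threshold of 10 no broker is counted as overloaded, so nothing would be moved toward broker101.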

Modifications

Here I define the equilibrium state as:
avgUsage - threshold < currentUsage < avgUsage + threshold
With this definition, a broker with extremely low load also counts as out of balance, so such nodes should no longer occur.
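
A minimal sketch of how this two-sided condition would classify the brokers from the Motivation example. It only covers the classification; how bundles would then be moved is not shown, and all names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch of the proposed equilibrium band; usages are percentages.
class EquilibriumBandExample {

    static double averageUsage(Map<String, Double> usageByBroker) {
        return usageByBroker.values().stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(0.0);
    }

    // Brokers outside the open interval (avgUsage - threshold, avgUsage + threshold).
    static List<String> outOfBalance(Map<String, Double> usageByBroker, double threshold) {
        double avgUsage = averageUsage(usageByBroker);
        return usageByBroker.entrySet().stream()
                .filter(e -> e.getValue() <= avgUsage - threshold
                        || e.getValue() >= avgUsage + threshold)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // 100 brokers at 80 and one broker at 10.
    static Map<String, Double> exampleCluster() {
        Map<String, Double> usage = new TreeMap<>();
        for (int i = 1; i <= 100; i++) {
            usage.put("broker" + i, 80.0);
        }
        usage.put("broker101", 10.0);
        return usage;
    }

    public static void main(String[] args) {
        // Only broker101 falls outside the band when the threshold is 10.
        System.out.println(outOfBalance(exampleCluster(), 10.0));
    }
}
```

Under the one-sided check the same cluster is reported as balanced; under the two-sided check broker101 is flagged because its usage is below avgUsage - threshold.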

eolivelli · 2021-10-23T10:32:47Z

@lordcheng10:Thanks for your contribution. For this PR, do we need to update docs?
(The PR template contains info about doc, which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)

lordcheng10 · 2021-10-23T10:36:34Z

> @lordcheng10:Thanks for your contribution. For this PR, do we need to update docs? (The PR template contains info about doc, which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)

No need to add documentation

lordcheng10 · 2021-10-23T13:41:35Z

/pulsarbot run-failure-checks

lordcheng10 · 2021-10-23T16:28:23Z

/pulsarbot run-failure-checks

lordcheng10 · 2021-10-23T17:02:35Z

/pulsarbot run-failure-checks

lordcheng10 · 2021-10-24T04:09:59Z

/pulsarbot run-failure-checks

pulsar-broker/src/main/java/org/apache/pulsar/broker/loadbalance/impl/ThresholdShedder.java

lordcheng10 · 2021-10-28T16:57:53Z

I made some changes. PTAL, thanks! @hangc0276

michaeljmarshall

I wonder if it would make more sense to implement a new LoadSheddingStrategy to achieve your goals instead of updating the ThresholdShedder class. This class is only meant to unload bundles from a broker with above-average (plus some threshold) usage.

Based on your PR's description, you'd like the ThresholdShedder algorithm to take underutilized brokers into special consideration. As @hangc0276 pointed out, it is not straightforward to consider these brokers because the only mechanism we have available in this class is to "unload" a bundle from a broker. Your proposed changes require unloading bundles from brokers that are not above the average (plus some threshold) usage, which seems like a new strategy to me.

Note that because we're using a simple average, clusters with many brokers might have a few underutilized brokers without any one broker being much above the average utilization. An alternative strategy could give underutilized brokers more weight.

> Here avgUsage = (80*100 + 10)/101 ≈ 79. With the threshold at its default value of 10, every broker satisfies currentUsage < avgUsage + threshold, so the balancing operation is never triggered and the low-load machines cannot be fully utilized.

Based on your example, an alternative solution is to configure a smaller threshold. Decreasing the threshold will lead to better distribution, but the trade-off is that it will also lead to more frequent unloading, which has its own costs.
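
To make the trade-off concrete, the sketch below counts how many brokers in the quoted example would be selected for unloading at several threshold values. It is a standalone illustration with hypothetical names, not Pulsar code, and it uses percentages as in the example.

```java
import java.util.Arrays;

// Hypothetical sketch: effect of the threshold on the number of brokers selected.
class ThresholdTradeOffExample {

    // Number of brokers whose usage is at or above avgUsage + threshold.
    static long brokersAboveLimit(double[] usages, double threshold) {
        double avgUsage = Arrays.stream(usages).average().orElse(0.0);
        return Arrays.stream(usages)
                .filter(currentUsage -> currentUsage >= avgUsage + threshold)
                .count();
    }

    // 100 brokers at 80 and one broker at 10, as in the quoted example.
    static double[] exampleCluster() {
        double[] usages = new double[101];
        Arrays.fill(usages, 0, 100, 80.0);
        usages[100] = 10.0;
        return usages;
    }

    public static void main(String[] args) {
        double[] usages = exampleCluster();
        for (double threshold : new double[] {10.0, 5.0, 1.0, 0.5}) {
            System.out.println("threshold=" + threshold
                    + " brokersAboveLimit=" + brokersAboveLimit(usages, threshold));
        }
    }
}
```

In this particular cluster the count stays at 0 for thresholds of 10, 5, and 1, and jumps to 100 at 0.5, because every broker at 80 sits only about 0.7 points above the average. A threshold small enough to react to the idle broker therefore selects all of the busy brokers at once.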

If we do decide to update this class's strategy, we will need to document the new details in the Javadoc as well as add tests to ensure that the algorithm works correctly.

michaeljmarshall · 2021-10-28T19:29:52Z

pulsar-broker/src/main/java/org/apache/pulsar/broker/loadbalance/impl/ThresholdShedder.java

            final double currentUsage = brokerAvgResourceUsage.getOrDefault(broker, 0.0);

-            if (currentUsage < avgUsage + threshold) {
+            if (currentUsage > avgUsage - threshold && currentUsage < avgUsage + threshold) {


Based on the new changes (the brokers being iterated over are all over average), there is no longer a need to update the logic for this conditional.

michaeljmarshall · 2021-10-28T19:34:17Z

pulsar-broker/src/main/java/org/apache/pulsar/broker/loadbalance/impl/ThresholdShedder.java

+        }
+
+        //4. Calculate the percentage of traffic to be migrated by each broker in overAvgUsageBrokers;
+        double percentOfTrafficToOffload = ADDITIONAL_THRESHOLD_PERCENT_MARGIN


Why do we calculate this as a generic average for all brokers? The original calculation of percentOfTrafficToOffload is for a specific broker because it is used to determine which bundles to unload to decrease that broker's load to beneath the average. By using an averaged value from all brokers, we could unload too many bundles from some brokers and too few bundles from other brokers.
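
The concern can be illustrated with a small sketch. It assumes, for illustration only, that the amount to unload from a broker is its usage in excess of avgUsage + threshold, ignoring any extra margin, and it compares that with one hypothetical way of sharing a single value: the mean of those excesses. It is not the calculation from the PR, and all names are made up.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: per-broker excess versus one shared value.
class OffloadAmountExample {

    // Usage a broker must shed to drop to avgUsage + threshold (0 if already below).
    static double excessOver(double currentUsage, double avgUsage, double threshold) {
        return Math.max(0.0, currentUsage - (avgUsage + threshold));
    }

    // Excess for every broker that is above avgUsage + threshold.
    static Map<String, Double> perBrokerExcess(Map<String, Double> usageByBroker,
                                               double threshold) {
        double avgUsage = usageByBroker.values().stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(0.0);
        Map<String, Double> result = new TreeMap<>();
        usageByBroker.forEach((broker, usage) -> {
            double excess = excessOver(usage, avgUsage, threshold);
            if (excess > 0) {
                result.put(broker, excess);
            }
        });
        return result;
    }

    // One shared amount: the mean of the per-broker excesses.
    static double sharedExcess(Map<String, Double> perBroker) {
        return perBroker.values().stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(0.0);
    }

    public static void main(String[] args) {
        Map<String, Double> usage = new TreeMap<>();
        usage.put("brokerA", 95.0);
        usage.put("brokerB", 85.0);
        usage.put("brokerC", 30.0);
        Map<String, Double> perBroker = perBrokerExcess(usage, 10.0);
        System.out.println("perBroker=" + perBroker + " shared=" + sharedExcess(perBroker));
    }
}
```

With usages of 95, 85, and 30 and a threshold of 10, the average is 70 and the limit is 80, so brokerA needs to shed 15 points and brokerB only 5. A shared value of 10 would leave brokerA above the limit and take more than necessary from brokerB.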

lordcheng10 · 2021-11-04

> Why do we calculate this as a generic average for all brokers?

In the example I mentioned above, the load utilization rate of all brokers is lower than the average utilization rate. At this time, according to the original calculation method, there is no way to calculate percentOfTrafficToOffload.

Regarding this new change, I need to explain the following points:

First, I define the equilibrium state as:

avgUsage - threshold < currentUsage < avgUsage + threshold

The total traffic that the nodes below avgUsage can still receive is the total traffic that we should unload from the nodes above avgUsage. Therefore, we only need to calculate the total traffic of all brokers above avgUsage (overbrokertrafficsum). Then sumbelowoavgtraffic / overbrokertrafficsum gives the proportion of traffic to unload from each broker that exceeds avgUsage.

@michaeljmarshall

michaeljmarshall · 2021-11-18

> the load utilization rate of all brokers is lower than the average utilization rate.

Can you clarify this point? The average usage cannot be higher than all of the individual usages.

> At this time, according to the original calculation method, there is no way to calculate percentOfTrafficToOffload.

The current algorithm isn't focused on calculating this value. It only unloads bundles from brokers whose utilization is above a certain threshold. Moving bundles is not free, so some operators might be fine with a slightly underloaded broker as long as no one broker is too far above the average utilization.

> The total traffic that the nodes below avgUsage can still receive is the total traffic that we should unload from the nodes above avgUsage. Therefore, we only need to calculate the total traffic of all brokers above avgUsage (overbrokertrafficsum). Then sumbelowoavgtraffic / overbrokertrafficsum gives the proportion of traffic to unload from each broker that exceeds avgUsage.

I don't think this is correct. Since brokers in the overAvgUsageBrokers collection can have different usages, we need to calculate an amount that is right for each broker itself. Your algorithm might work for your example because all brokers above the average have the same usage.

I think the most productive way to show that your algorithm works as you describe is to add tests that show how it will react to specific scenarios.

lordcheng10 · 2021-10-29T00:52:31Z

/pulsarbot run-failure-checks

lordcheng10 · 2021-11-18T19:38:27Z

To make this easier to understand, I have illustrated the ideas behind the code with diagrams. PTAL, thanks! @hangc0276 @michaeljmarshall

michaeljmarshall · 2021-11-18T20:07:33Z

@lordcheng10 - instead of drawing a diagram, I think it would be more productive to write tests that show how it will work and that also verify your algorithm works the way you say it will.

I am still of the opinion that this is a new algorithm. That does not mean you shouldn't implement it, it just means that updating the ThresholdShedder might be the wrong direction. I am interested to know what others think, too.

rdhabalia · 2021-11-18T20:28:06Z

This solution would not work on heterogeneous hardware, and it is not correct to make it the default behavior. I have done some work to address this issue and need to complete the PR.

rdhabalia · 2021-11-18T20:32:50Z

Also, don't change the existing implementation, because there are two different approaches for the load balancer: (a) utilize brokers fully, and (b) distribute load uniformly across brokers. So we should not change the existing behavior, and this behavior should be configurable.

rdhabalia

The current implementation is, by design, meant to fully utilize the broker. So, if needed, we should introduce a new policy that is configurable.

lordcheng10 · 2021-11-19T01:59:01Z

> @lordcheng10 - instead of drawing a diagram, I think it would be more productive to write tests that show how it will work and that also verify your algorithm works the way you say it will.
>
> I am still of the opinion that this is a new algorithm. That does not mean you shouldn't implement it, it just means that updating the ThresholdShedder might be the wrong direction. I am interested to know what others think, too.

You are right. I will write a separate balancing strategy class!

lordcheng10 · 2021-11-19T01:59:41Z

> The current implementation is, by design, meant to fully utilize the broker. So, if needed, we should introduce a new policy that is configurable.

You are right. I will write a separate balancing strategy class.

rdhabalia · 2021-11-20T06:09:44Z

@lordcheng10 can you please check whether this PR addresses your concern: #12902

michaeljmarshall · 2022-02-11T18:39:56Z

Removing the release/2.8.3 and release/2.9.3 labels as this change will target master and probably won't be cherry picked back to stable branches because it is an optimization.

github-actions · 2022-03-19T01:57:20Z

The pr had no activity for 30 days, mark with Stale label.

Optimize ThresholdShedder strategy

00a1492

eolivelli assigned lordcheng10 Oct 23, 2021

eolivelli added the doc-label-missing label Oct 23, 2021

codelipenghui requested review from 315157973, eolivelli, hangc0276 and merlimat October 24, 2021 04:25

codelipenghui added this to the 2.10.0 milestone Oct 24, 2021

codelipenghui added doc-not-needed release/2.8.2 release/2.9.1 type/enhancement and removed doc-label-missing labels Oct 24, 2021

codelipenghui previously approved these changes Oct 24, 2021

tomscut previously approved these changes Oct 24, 2021

hezhangjian previously approved these changes Oct 25, 2021

hangc0276 reviewed Oct 25, 2021

Optimize equilibrium strategy

0f4b980

lordcheng10 changed the title ~~Optimize ThresholdShedder strategy~~ Optimize ThresholdShedder strategy: the low-load Broker cannot be fully utilized Oct 28, 2021

lordcheng10 added 2 commits October 29, 2021 01:09

check style

0cb4bba

check style

11f4a81

michaeljmarshall requested changes Oct 28, 2021

rdhabalia requested changes Nov 19, 2021

codelipenghui added release/2.8.3 and removed release/2.8.2 labels Nov 21, 2021

eolivelli added release/2.9.2 and removed release/2.9.1 labels Dec 13, 2021

codelipenghui added release/2.9.3 and removed release/2.9.2 labels Jan 7, 2022

codelipenghui modified the milestones: 2.10.0, 2.11.0 Jan 18, 2022

michaeljmarshall removed release/2.8.3 release/2.9.3 labels Feb 11, 2022

congbobo184 dismissed stale reviews from hezhangjian, tomscut, and codelipenghui via 11f4a81 February 16, 2022 02:58

github-actions bot added the lifecycle/stale label Mar 19, 2022

codelipenghui modified the milestones: 2.11.0, 2.12.0 Jul 26, 2022

lordcheng10 closed this Jul 29, 2022

codelipenghui modified the milestones: 2.12.0, 2.11.0 Aug 1, 2022

9 participants
