KAFKA-9700:Fix negative estimatedCompressionRatio issue by jiameixie · Pull Request #8285 · apache/kafka

jiameixie · 2020-03-12T07:12:08Z

There are cases where currentEstimation is smaller than
COMPRESSION_RATIO_IMPROVING_STEP, which makes
estimatedCompressionRatio negative. This leads to a misjudgment
about whether there is room left, and MESSAGE_TOO_LARGE might occur.

Change-Id: I0932a2a6ca669f673ab5d862d3fe7b2bb6d96ff6
Signed-off-by: Jiamei Xie <jiamei.xie@arm.com>
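
To make the failure mode concrete, here is a minimal sketch of the arithmetic. It is not the actual Kafka code: the step values and the method names `unclampedUpdate` and `clampedUpdate` are assumptions chosen for illustration, and the real constants and update logic live in `CompressionRatioEstimator`. Clamping at the observed ratio is shown as one plausible shape for the fix.

```java
class NegativeEstimateSketch {
    // Assumed values for illustration; the real constants are defined in Kafka.
    static final float IMPROVING_STEP = 0.005f;
    static final float DETERIORATE_STEP = 0.05f;

    // Steps the estimate down unconditionally, so it can drop below zero
    // whenever currentEstimation is smaller than IMPROVING_STEP.
    static float unclampedUpdate(float currentEstimation, float observedRatio) {
        if (observedRatio > currentEstimation)
            return Math.max(currentEstimation + DETERIORATE_STEP, observedRatio);
        if (observedRatio < currentEstimation)
            return currentEstimation - IMPROVING_STEP;
        return currentEstimation;
    }

    // Same update, but the estimate never falls below the observed ratio.
    static float clampedUpdate(float currentEstimation, float observedRatio) {
        if (observedRatio > currentEstimation)
            return Math.max(currentEstimation + DETERIORATE_STEP, observedRatio);
        if (observedRatio < currentEstimation)
            return Math.max(currentEstimation - IMPROVING_STEP, observedRatio);
        return currentEstimation;
    }

    public static void main(String[] args) {
        float current = 0.004f;
        float observed = 0.001f;
        System.out.println("unclamped is negative: "
            + (unclampedUpdate(current, observed) < 0f));
        System.out.println("clamped >= observed: "
            + (clampedUpdate(current, observed) >= observed));
    }
}
```

With a current estimation of 0.004 and an assumed step of 0.005, the unclamped variant returns a value below zero, while the clamped variant stops at the observed ratio.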

ijuma

Thanks for the PR. Can we please add a test?

jiameixie · 2020-03-12T09:35:16Z

@ijuma I tracked down this issue after seeing MESSAGE_TOO_LARGE when running the following command:
bin/kafka-producer-perf-test.sh --topic test --num-records 50000000 --throughput -1 --record-size 5000 --producer-props bootstrap.servers=server04:9092 acks=1 buffer.memory=67108864 batch.size=65536 compression.type=zstd
However, MESSAGE_TOO_LARGE doesn't occur every time. Could you please give me some suggestions on how to write the test? Thanks.

jiameixie · 2020-03-13T09:58:11Z

@ijuma A test has been added. Please review it. Thanks.

ijuma · 2020-03-13T10:50:16Z

ok to test

ijuma

Thanks for the test. A few comments below.

ijuma · 2020-03-17T16:33:30Z

+        // 2. currentEstimation < observedRatio && (currentEstimation + COMPRESSION_RATIO_DETERIORATE_STEP) > observedRatio
+        // 3. currentEstimation > observedRatio && (currentEstimation - COMPRESSION_RATIO_IMPROVING_STEP) > observedRatio
+        // 4. currentEstimation > observedRatio && (currentEstimation - COMPRESSION_RATIO_IMPROVING_STEP) < observedRatio
+        // In all cases, updatedCompressionRatio shouldn't smaller than observedRatio


It seems like we are duplicating a lot of the logic of the code in this comment. This is likely to get stale over time. Can we make the comment more concise and refer to the non-test code for more detail?

ijuma · 2020-03-17T16:34:16Z

+            new EstimationsObservedRatios(0.8f, 0.84f, 0.84f),
+            new EstimationsObservedRatios(0.6f, 0.7f, 0.7f),
+            new EstimationsObservedRatios(0.6f, 0.4f, 0.4f),
+            new EstimationsObservedRatios(0.004f, 0.001f, 0.001f)


It seems that expected is always the same as observed? If so, why do we have 3 parameters instead of 2?

ijuma · 2020-03-17T16:35:01Z

+        // 3. currentEstimation > observedRatio && (currentEstimation - COMPRESSION_RATIO_IMPROVING_STEP) > observedRatio
+        // 4. currentEstimation > observedRatio && (currentEstimation - COMPRESSION_RATIO_IMPROVING_STEP) < observedRatio
+        // In all cases, updatedCompressionRatio shouldn't smaller than observedRatio
+        EstimationsObservedRatios[] currentEstimationsObservedRatios = new EstimationsObservedRatios[] {


If we use a List, we should be able to simplify the logic below by using an enhanced for loop.
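
For illustration, a minimal sketch of what a `List` plus an enhanced for loop could look like. The `RatioCase` holder, the `update` stand-in, and its step values are hypothetical and are not the PR's actual test code; the stand-in only exists so the loop has something to check.

```java
import java.util.ArrayList;
import java.util.List;

class EnhancedForSketch {
    // Hypothetical holder for one (current estimation, observed ratio) pair.
    static final class RatioCase {
        final float currentEstimation;
        final float observedRatio;

        RatioCase(float currentEstimation, float observedRatio) {
            this.currentEstimation = currentEstimation;
            this.observedRatio = observedRatio;
        }
    }

    // Stand-in for the estimator call under test, with assumed step sizes.
    static float update(float currentEstimation, float observedRatio) {
        float stepped = observedRatio > currentEstimation
            ? currentEstimation + 0.05f
            : currentEstimation - 0.005f;
        return Math.max(stepped, observedRatio);
    }

    // The enhanced for loop removes the index bookkeeping an array loop needs.
    static boolean neverBelowObserved(List<RatioCase> cases) {
        for (RatioCase ratioCase : cases) {
            float updated = update(ratioCase.currentEstimation, ratioCase.observedRatio);
            if (updated < ratioCase.observedRatio)
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<RatioCase> cases = new ArrayList<>();
        cases.add(new RatioCase(0.8f, 0.84f));
        cases.add(new RatioCase(0.004f, 0.001f));
        System.out.println(neverBelowObserved(cases));
    }
}
```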

ijuma

Thanks for the update. A few minor comments below.

ijuma · 2020-03-19T15:23:57Z

+        estimationsObservedRatios.add(new EstimationsObservedRatios(0.8f, 0.84f));
+        estimationsObservedRatios.add(new EstimationsObservedRatios(0.6f, 0.7f));
+        estimationsObservedRatios.add(new EstimationsObservedRatios(0.6f, 0.4f));
+        estimationsObservedRatios.add(new EstimationsObservedRatios(0.004f, 0.001f));


Nit: this can be written more concisely by using Arrays.asList.
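
A small sketch of the `Arrays.asList` form, using plain `float[]` pairs as a hypothetical stand-in for the PR's `EstimationsObservedRatios` class so the example stays self-contained:

```java
import java.util.Arrays;
import java.util.List;

class AsListSketch {
    // Each float[] is a hypothetical {currentEstimation, observedRatio} pair.
    static List<float[]> cases() {
        return Arrays.asList(
            new float[] {0.8f, 0.84f},
            new float[] {0.6f, 0.7f},
            new float[] {0.6f, 0.4f},
            new float[] {0.004f, 0.001f}
        );
    }

    public static void main(String[] args) {
        for (float[] pair : cases())
            System.out.println(pair[0] + " -> " + pair[1]);
    }
}
```

Note that `Arrays.asList` returns a fixed-size list, so later calls to `add` throw `UnsupportedOperationException`. That is fine for a static set of test inputs.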

ijuma · 2020-03-19T15:24:10Z

+        estimationsObservedRatios.add(new EstimationsObservedRatios(0.004f, 0.001f));
+        float updatedCompressionRatio;
+        for(EstimationsObservedRatios estimationsObservedRatio:estimationsObservedRatios)
+        {


Nit: this brace should be on the previous line.

ijuma · 2020-03-20T12:46:38Z

ok to test

ijuma · 2020-03-21T16:10:29Z

retest this please

ijuma · 2020-03-21T16:14:14Z

ok to test

ijuma · 2020-03-23T13:13:27Z

ok to test

ijuma · 2020-03-23T13:13:52Z

retest this please

ijuma · 2020-03-23T18:55:50Z

Hmm, looks like org.apache.kafka.common.record.CompressionRatioEstimatorTest.testUpdateEstimation is failing. @jiameixie, can you please take a look?

jiameixie · 2020-03-24T02:02:01Z

@ijuma In my second commit, it was ">=". I am really sorry for my carelessness.

ijuma · 2020-03-24T14:27:50Z

retest this please

ijuma · 2020-03-24T14:32:59Z

retest this please

ijuma · 2020-03-25T03:46:29Z

One job passed and the other had a flaky test failure:

kafka.api.PlaintextConsumerTest.testLowMaxFetchSizeForRequestAndPartition

ijuma · 2020-03-25T03:49:44Z

Merged to trunk and cherry picked to 2.5.

There are cases where `currentEstimation` is less than `COMPRESSION_RATIO_IMPROVING_STEP` causing `estimatedCompressionRatio` to be negative. This, in turn, may result in `MESSAGE_TOO_LARGE`.

Reviewers: Ismael Juma <ismael@juma.me.uk>

ijuma · 2020-04-02T16:55:15Z

@jiameixie Quick question, were you seeing the MESSAGE_TOO_LARGE in the broker?

jiameixie · 2020-04-03T01:21:18Z

@ijuma I saw MESSAGE_TOO_LARGE in the producer client.

ijuma reviewed Mar 12, 2020

jiameixie added 3 commits March 13, 2020 09:42

Merge branch 'trunk' of https://github.com/apache/kafka into negative

f998cdf

Merge remote-tracking branch 'origin/negative' into negative

0153dc7

jiameixie requested a review from ijuma March 13, 2020 09:58

ijuma reviewed Mar 17, 2020

Add test for CompressionRatioEstimator

7a34d39

Signed-off-by: Jiamei Xie <jiamei.xie@arm.com>

jiameixie mentioned this pull request Mar 18, 2020

KAFKA-9703:Free up resources when splitting huge batches #8286

Closed

3 tasks

ijuma reviewed Mar 19, 2020

Modify CompressionRatioEstimatorTest.java

e50dfdc

Signed-off-by: Jiamei Xie <jiamei.xie@arm.com>

jiameixie force-pushed the negative branch from 943bc01 to e50dfdc Compare March 24, 2020 01:55

Formatting fixes

537c402

ijuma merged commit b5409b9 into apache:trunk Mar 25, 2020

jiameixie deleted the negative branch March 27, 2020 01:22

MaximGonnissen added a commit to MaximGonnissen/kafka that referenced this pull request May 28, 2022

Integrate "KAFKA-9700:Fix negative estimatedCompressionRatio issue"

7713e87

Integrated PR from apache/kafka: apache#8285
