KAFKA-7528: Standardize on Min/Avg/Max Kafka metrics' default value - NaN #5908

Merged
guozhangwang merged 5 commits into apache:trunk from stanislavkozlovski:KAFKA-7528-strandardize-min-avg-max-kafka-metrics
Nov 20, 2018

Conversation

Contributor

@stanislavkozlovski stanislavkozlovski commented Nov 12, 2018

While it makes sense for metrics like Min, Avg, and Max to use Double.MAX_VALUE, 0.0, and Double.MIN_VALUE respectively as default values to ease their computation logic, exposing those values makes reading them misleading. For instance, how would you tell whether your -avg metric has a value of 0 because it was given samples of 0, or because no samples were fed to it at all?

It makes sense to standardize the output of these metrics on something that clearly denotes that no values have been recorded.
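To make the ambiguity concrete, here is a minimal hypothetical sketch (not Kafka's actual SampledStat code; the class and method names are illustrative): with a 0.0 default, an empty metric and an all-zero metric read identically, while a NaN default makes "nothing recorded" unambiguous.

```java
// Hypothetical sketch, not Kafka's actual implementation.
class AvgSketch {
    private double total = 0.0;
    private long count = 0;

    void record(double value) {
        total += value;
        count++;
    }

    // 0.0 default: "no samples" and "only zero-valued samples" both read as 0.0.
    double measureWithZeroDefault() {
        return count == 0 ? 0.0 : total / count;
    }

    // NaN default: an empty metric clearly reads as "nothing recorded".
    double measureWithNaNDefault() {
        return count == 0 ? Double.NaN : total / count;
    }
}
```

With the zero default, `measureWithZeroDefault()` returns 0.0 for both an untouched metric and one that recorded only zeros; only the NaN variant lets a reader distinguish the two cases.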

Contributor

@guozhangwang guozhangwang left a comment


Made a quick pass over the PR, LGTM.

It just occurred to me that for Streams we do not seem to have unit tests verifying the initial values of our metrics (otherwise they would be broken and would need updating here). cc @vvcephei .

Member

@bbejeck bbejeck left a comment


I took a pass over the PR and it LGTM from the Streams perspective.

      count += s.eventCount;
  }
- return count == 0 ? 0 : total / count;
+ return count == 0 ? total : total / count;
Contributor


Hi @stanislavkozlovski ,

Thanks for this PR. If I'm reading the code of SampledStat right, I think you could achieve your objective simply by replacing this line with:

return count == 0 ? Double.NaN : total / count;

And then, you wouldn't need any of the other changes in this class. Does that seem right to you?

If so, I think it would apply to the rest of the changed stats as well.

Thanks,
-John

Contributor Author


Yes, that would be the same functionality. Do you think that makes the code clearer?

Contributor


Yeah, personally, I think so.

As long as we can report NaN externally for un-initialized metrics, it really doesn't matter how we initialize each sample internally. IMHO, each metric should use an initial sample value that makes its internal math simple, since those samples aren't directly visible externally. (unless I've misunderstood the system)

Contributor Author


That makes a lot of sense. Can you take another look, @vvcephei ?

  for (Sample sample : samples)
      min = Math.min(min, sample.value);
- return min;
+ return Math.abs(min - Double.MAX_VALUE) < 0.001 ? Double.NaN : min;
Contributor Author


Spotbugs was complaining about comparing doubles (FE_FLOATING_POINT_EQUALITY)

Contributor


Yeah, it's a tricky business... I think the suggestion I had in Max would also apply here, and you wouldn't have to compare them at all.
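The trade-off being discussed can be sketched as follows (a hypothetical simplification, not Kafka's actual Min stat): the sentinel-comparison style needs a fuzzy floating-point comparison to detect "no data", while checking emptiness directly avoids comparing doubles altogether.

```java
import java.util.List;

// Hypothetical sketch of the two approaches discussed; names are illustrative.
class MinSketch {
    // Sentinel-comparison style: detects "no data" by fuzzily comparing the
    // result against the initial value, which is fragile (a genuine value near
    // Double.MAX_VALUE would be misreported as NaN).
    static double minBySentinel(List<Double> values) {
        double min = Double.MAX_VALUE;
        for (double v : values) {
            min = Math.min(min, v);
        }
        return Math.abs(min - Double.MAX_VALUE) < 0.001 ? Double.NaN : min;
    }

    // Emptiness-check style: no floating-point comparison needed at all.
    static double minByEmptinessCheck(List<Double> values) {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        double min = Double.MAX_VALUE;
        for (double v : values) {
            min = Math.min(min, v);
        }
        return min;
    }
}
```

The emptiness check sidesteps the SpotBugs concern entirely, since it never has to decide whether two doubles are "equal enough".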

      count += s.eventCount;
  }
- return count == 0 ? 0 : total / count;
+ return count == 0 ? Double.NaN : total;
Contributor


Suggested change
- return count == 0 ? Double.NaN : total;
+ return count == 0 ? Double.NaN : total / count;

I think you accidentally lost the average computation in the course of the changes.

Contributor Author


Nice catch!

      max = Math.max(max, sample.value);
- return max;
  }
+ return Math.abs(max - Double.MIN_VALUE) < 0.001 ? Double.NaN : max;
Contributor

@vvcephei vvcephei Nov 15, 2018


Sorry if this basically seems like code-golfing, but I'm wondering if this would be equivalent and a little more robust?

Suggested change
- return Math.abs(max - Double.MIN_VALUE) < 0.001 ? Double.NaN : max;
+ return samples.isEmpty() ? Double.NaN : max;

Then, we could leave the initial values at negative infinity (not sure if it matters).

Contributor Author


No, it doesn't seem like code-golfing. Thanks for the review; I think we made the implementation substantially easier to read.

Contributor

@vvcephei vvcephei left a comment


LGTM! Thanks for humoring my review, @stanislavkozlovski .

Note: there are a couple of failing tests that look related.

@stanislavkozlovski
Contributor Author

@vvcephei We cannot rely on samples.isEmpty() in the combine() methods, because purgeObsoleteSamples() resets an expired sample's lastWindowMs and value back to the initial value.

I think the best way to detect metrics that still hold only initial values is to sum up the event counts and return NaN if no events exist. Every metric sample has an eventCount > 0 once its value has been updated.

Also split the single testSampledStatInitialValue test into two clearer tests.
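The eventCount-based approach described above can be sketched like this (a hypothetical simplification of a sampled stat, not the actual Kafka code): a sample may exist yet hold no events after being purged and reset, so summing eventCount is the reliable "no data" signal, not samples.isEmpty().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simplification of a sampled average stat; names are illustrative.
class SampledAvgSketch {
    static class Sample {
        double value = 0.0;   // reset to the initial value when the sample expires
        long eventCount = 0;  // reset to 0 when the sample expires

        void record(double v) {
            value += v;
            eventCount++;
        }
    }

    final List<Sample> samples = new ArrayList<>();

    // A sample may exist yet hold no events (e.g. after being purged/reset),
    // so samples.isEmpty() is not a reliable "no data" check; eventCount is.
    double combine() {
        double total = 0.0;
        long count = 0;
        for (Sample s : samples) {
            total += s.value;
            count += s.eventCount;
        }
        return count == 0 ? Double.NaN : total / count;
    }
}
```

Here a freshly reset sample sits in the list but contributes zero events, so combine() still reports NaN until a value is actually recorded.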
@vvcephei
Contributor

@stanislavkozlovski Good catch! It still looks good to me.

@stanislavkozlovski
Contributor Author

The failed test looks unrelated - kafka.log.LogCleanerParameterizedIntegrationTest.testCleansCombinedCompactAndDeleteTopic[3]

java.lang.AssertionError: Contents of the map shouldn't change expected:<Map(0 -> (340,340), 5 -> (345,345), 10 -> (350,350), 14 -> (354,354), 1 -> (341,341), 6 -> (346,346), 9 -> (349,349), 13 -> (353,353), 2 -> (342,342), 17 -> (357,357), 12 -> (352,352), 7 -> (347,347), 3 -> (343,343), 18 -> (358,358), 16 -> (356,356), 11 -> (351,351), 8 -> (348,348), 19 -> (359,359), 4 -> (344,344), 15 -> (355,355))> but was:<Map(0 -> (340,340), 5 -> (345,345), 10 -> (350,350), 14 -> (354,354), 1 -> (341,341), 6 -> (346,346), 9 -> (349,349), 13 -> (353,353), 2 -> (342,342), 17 -> (357,357), 12 -> (352,352), 7 -> (347,347), 3 -> (343,343), 18 -> (358,358), 16 -> (356,356), 11 -> (351,351), 99 -> (299,299), 8 -> (348,348), 19 -> (359,359), 4 -> (344,344), 15 -> (355,355))>
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:834)
	at org.junit.Assert.assertEquals(Assert.java:118)
	at kafka.log.LogCleanerParameterizedIntegrationTest.testCleansCo

@guozhangwang guozhangwang merged commit 068ab9c into apache:trunk Nov 20, 2018
@guozhangwang
Contributor

Merged to trunk, thanks a lot @stanislavkozlovski !!

pengxiaolong pushed a commit to pengxiaolong/kafka that referenced this pull request Jun 14, 2019
… NaN (apache#5908)

Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com>