KAFKA-12272: Fix commit-interval metrics#10102
Conversation
|
|
||
| log.debug("{} punctuators ran.", punctuated); | ||
|
|
||
| final long beforeCommitTs = now; |
There was a problem hiding this comment.
such a small nit that I don't think it's worth pushing a fix and restarting the build, but should this be beforeCommitMs rather than Ts?
| final int committed = maybeCommit(); | ||
| totalCommittedSinceLastSummary += committed; | ||
| final long commitLatency = advanceNowAndComputeLatency(); | ||
| final long commitLatency = Math.max(now - beforeCommitTs, 0); |
There was a problem hiding this comment.
Also kind of a nit, since technically this does work, but wouldn't it make more sense to just remove the advanceNowAndComputeLatency call in maybeCommit, and then just call advancedNowAndComputeLatency here as before? Otherwise we're just computing the latency inside maybeCommit for no reason, and throwing out the result.
There was a problem hiding this comment.
Well, we need to advance time within maybeCommit() to make sure we compute lastCommitMs correctly (what was exactly the fix that introduced the commit-latency metric regression).
There was a problem hiding this comment.
We could replace advancedNowAndComputeLatency by now = time.milliseconds() in maybeCommit() though. -- In the original PR, we just called advancedNowAndComputeLatency to keep the patter of a single place in the code that does call time.milliseconds
There was a problem hiding this comment.
But currently we call advanceNowAndComputeLatency pretty much at the end of the maybeCommit method, so there shouldn't be a noticeable difference between advancing now at the end of maybeCommit vs advancing it immediately after maybeCommit returns, right?
I'm also ok with just advancing now inside maybeCommit. The main thing that felt off was just that we compute the latency for no reason inside maybeCommit, and ignore the result
There was a problem hiding this comment.
(and similarly move the lastCommitMs = now to after maybeCommit returns)
There was a problem hiding this comment.
so there shouldn't be a noticeable difference between advancing now at the end of maybeCommit vs advancing it immediately after maybeCommit returns, right?
Well, not from a runtime point of view.
(and similarly move the lastCommitMs = now to after maybeCommit returns)
Exactly: we need to first advance now and than update lastCommitMs -- but I would rather not update lastCommitMs outside of maybeCommit, because it breaks encapsulation?
I'm also ok with just advancing now inside maybeCommit. The main thing that felt off was just that we compute the latency for no reason inside maybeCommit, and ignore the result
Yes, that is a little odd -- as I said, happy to change advancedNowAndComputeLatency to now = time.milliseconds() within maybeCommit().
There was a problem hiding this comment.
Yes, that is a little odd -- as I said, happy to change advancedNowAndComputeLatency to now = time.milliseconds() within maybeCommit().
Sounds good to me, let's do it
ableegoldman
left a comment
There was a problem hiding this comment.
This looks good enough to me, but I think it could be cleaned up just a bit to avoid calling advanceNowAndComputeLatency just to advance now, but then throwing out the latency. But I'm ok with it as is for now, especially since we want to get this merged quickly
|
Thanks @ableegoldman -- updated the PR. If there are no further concerns raised, I'll merge after Jenkins passed and cherry-pick to |
|
Merged to |
…e-allocations-lz4 * apache-github/trunk: (118 commits) KAFKA-12327: Remove MethodHandle usage in CompressionType (apache#10123) KAFKA-12297: Make MockProducer return RecordMetadata with values as per contract MINOR: Update zstd and use classes with no finalizers (apache#10120) KAFKA-12326: Corrected regresion in MirrorMaker 2 executable introduced with KAFKA-10021 (apache#10122) KAFKA-12321 the comparison function for uuid type should be 'equals' rather than '==' (apache#10098) MINOR: Add FetchSnapshot API doc in KafkaRaftClient (apache#10097) MINOR: KIP-631 KafkaConfig fixes and improvements (apache#10114) KAFKA-12272: Fix commit-interval metrics (apache#10102) MINOR: Improve confusing admin client shutdown logging (apache#10107) MINOR: Add BrokerMetadataListener (apache#10111) MINOR: Support Raft-based metadata quorums in system tests (apache#10093) MINOR: add the MetaLogListener, LocalLogManager, and Controller interface. (apache#10106) MINOR: Introduce the KIP-500 Broker lifecycle manager (apache#10095) MINOR: Remove always-passing validation in TestRecordTest#testProducerRecord (apache#9930) KAFKA-5235: GetOffsetShell: Support for multiple topics and consumer configuration override (KIP-635) (apache#9430) MINOR: Prevent creating partition.metadata until ID can be written (apache#10041) MINOR: Add RaftReplicaManager (apache#10069) MINOR: Add ClientQuotaMetadataManager for processing QuotaRecord (apache#10101) MINOR: Rename DecommissionBrokers to UnregisterBrokers (apache#10084) MINOR: KafkaBroker.brokerState should be volatile instead of AtomicReference (apache#10080) ... clients/src/main/java/org/apache/kafka/common/record/CompressionType.java core/src/test/scala/unit/kafka/coordinator/group/GroupMetadataManagerTest.scala
Reviewer: A. Sophie Blee-Goldman <sophie@confluent.io>
Call for review @ableegoldman
Fixes a regression bug. Should be cherry-picked back to 2.6 branch.