MINOR: further reduce StreamThread loop INFO logging to 2min summary#9941
Conversation
|
call for review @lct45 @wcarlson5 @cadonna |
|
|
||
| private void commitOffsetsOrTransaction(final Map<Task, Map<TopicPartition, OffsetAndMetadata>> offsetsPerTask) { | ||
| log.debug("Committing task offsets {}", offsetsPerTask); | ||
| log.debug("Committing task offsets {}", offsetsPerTask.entrySet().stream().collect(Collectors.toMap(t -> t.getKey().id(), Entry::getValue))); // avoid logging actual Task objects |
There was a problem hiding this comment.
An unrelated but equally annoying thing I noticed in the logs: we should never log a full Task object because it prints literally everything about the task, including for example the topology description which is not that useful but sometimes VERY long
| referenceContainer.nextScheduledRebalanceMs, | ||
| shutdownErrorHook, | ||
| streamsUncaughtExceptionHandler, | ||
| cacheSize -> cache.resize(cacheSize) |
There was a problem hiding this comment.
My ide tried to optimize this as well. At the time not passing in cacheSize caused some expections. I would be careful about making this change without need
There was a problem hiding this comment.
Interesting. It should be exactly the same, but of course who knows with Java. Did it cause a test to fail or was it something more subtle?
There was a problem hiding this comment.
I think it caused a test to fail but not everytime. It also could have been fixed since then as changes have been made. If all the tests pass it's probably fine
|
Overall LGTM. I am not sure about the cache change but the changes to the log makes a lot of sense |
|
LGTM. |
|
All tests passed except the broken |
Remove all INFO-level logging from the main StreamThread loop in favor of a summary with a 2min interval Reviewers: Walker Carlson <carlson@confluent.io>, Guozhang Wang <guozhang@confluent.io>
|
Cherrypicked to 2.7 |
…e-allocations-lz4 * apache-github/trunk: (562 commits) MINOR: remove unused code from MessageTest (apache#9961) MINOR: Fix visibility of Log.{unflushedMessages, addSegment} methods (apache#9966) KAFKA-12229: Restore original class loader in integration tests using EmbeddedConnectCluster during shutdown (apache#9942) KAFKA-12190: Fix setting of file permissions on non-POSIX filesystems (apache#9947) MINOR: Remove `toStruct` and `fromStruct` methods from generated protocol classes (apache#9960) MINOR: Fix typo in Utils#toPositive (apache#9943) MINOR: MessageUtil: remove some deadcode (apache#9931) MINOR: Update zstd-jni to 1.4.8-2 (apache#9957) MINOR: Revert assertion in MockProducerTest (apache#9956) MINOR: Optimize assertions in unit tests (apache#9955) MINOR: Tag `RaftEventSimulationTest` as `integration` and tweak it (apache#9925) MINOR: Update to Gradle 6.8.1 (apache#9953) MINOR: A few small group coordinator cleanups (apache#9952) MINOR: Upgrade ducktape to version 0.8.1 (apache#9933) MINOR: fix record time in test shouldWipeOutStandbyStateDirectoryIfCheckpointIsMissing (apache#9948) MINOR: Restore interrupt status when closing (apache#9863) KAFKA-10357: Extract setup of repartition topics from Streams partition assignor (apache#9848) KAFKA-12212; Bump Metadata API version to remove `ClusterAuthorizedOperations` fields (KIP-700) (apache#9945) MINOR: log 2min processing summary of StreamThread loop (apache#9941) MINOR: Drop enable.metadata.quorum config (apache#9934) ...
Remove all INFO-level logging from the main StreamThread loop in favor of a summary with a 2min interval Reviewers: Walker Carlson <carlson@confluent.io>, Guozhang Wang <guozhang@confluent.io>
…#498) Remove all INFO-level logging from the main StreamThread loop in favor of a summary with a 2min interval Reviewers: Walker Carlson <carlson@confluent.io>, Guozhang Wang <guozhang@confluent.io> Co-authored-by: A. Sophie Blee-Goldman <sophie@confluent.io>
One more try to get the logging levels right..the only way to log something within the main StreamThread loop without absolutely flooding the logs at a level that isn't appropriate for INFO is to just set some kind of interval to stick to. I chose to log the summary every 2 min, since this is long enough to prevent log spam but short enough to fit at least 2 summaries within the (default) poll interval of 5 min