
KAFKA-13152: Replace "buffered.records.per.partition" with "input.buffer.max.bytes" #11796

Merged: guozhangwang merged 25 commits into apache:trunk from vamossagar12:kip-770 on Mar 22, 2022

Conversation

@vamossagar12 (Contributor) commented Feb 22, 2022

Implements KIP-770.
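For readers skimming the thread, the change this PR implements can be pictured as a small config migration. The property names come from the PR title; the values below are purely illustrative:

```properties
# Before (deprecated by KIP-770): a per-partition bound on buffered record count
buffered.records.per.partition=1000

# After: a single bytes-based bound on the total input buffer across all partitions
input.buffer.max.bytes=536870912
```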

@vamossagar12 (Contributor, Author) commented Feb 23, 2022

@ableegoldman, @guozhangwang, @mjsax this is the new PR for KIP-770 (follow-up from #11424).

The last 3-4 commits hold the new set of changes. I haven't added the StreamsConfigUtils class; I was thinking I could do it in a separate ticket. WDYT? Also, the Javadoc is still pending. I will add it once we are OK with the changes here.

@vamossagar12 (Contributor, Author) commented

@guozhangwang, this PR is showing a failure in streams:compileTestJava, but it seems to work on my local machine.

@guozhangwang (Contributor) commented

Hello @vamossagar12, I checked out your branch and ran streams:compileTestJava on my local machine; it also fails with:

> Task :streams:compileTestJava
/Users/guozhang/Workspace/github/guozhangwang/kafka-work/streams/src/test/java/org/apache/kafka/streams/integration/ErrorHandlingIntegrationTest.java:95: warning: [deprecation] CACHE_MAX_BYTES_BUFFERING_CONFIG in org.apache.kafka.streams.StreamsConfig has been deprecated
                mkEntry(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0),
                                     ^
error: warnings found and -Werror specified
1 error
1 warning

> Task :streams:compileTestJava FAILED

I guess your branch was not rebased on top of trunk and maybe that's why you did not see the failure?
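As background for the build failure above: when the build passes `-Werror` to javac, a deprecation warning like the one shown becomes a compile error, so any test that must keep referencing a deprecated config constant has to suppress the warning. A minimal, self-contained illustration (these are hypothetical stand-in classes, not the actual Kafka test code, though the config string matches the deprecated constant's documented value):

```java
// Stand-in for a config class that deprecates an old constant.
class LegacyConfig {
    @Deprecated
    static final String CACHE_MAX_BYTES_BUFFERING_CONFIG = "cache.max.bytes.buffering";
}

public class DeprecationExample {
    // Typical fix in test code that must keep exercising the old config:
    // scope @SuppressWarnings("deprecation") to the smallest possible element,
    // so -Werror no longer turns the warning into a build failure.
    @SuppressWarnings("deprecation")
    static String legacyKey() {
        return LegacyConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG;
    }

    public static void main(String[] args) {
        System.out.println(legacyKey());
    }
}
```

In this PR's case the immediate remedy was simply rebasing on trunk and fixing the offending test reference.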

@vamossagar12 (Contributor, Author) commented

@guozhangwang I had done a rebase before pushing but hadn't noticed the issue. Not sure what went wrong there (maybe an oversight on my part). Anyway, I did another rebase and could then see the error. I have fixed it and pushed again.

@vamossagar12 (Contributor, Author) commented

@guozhangwang, now I see failures in the Raft/ZK tests; those seem unrelated to this PR.

@guozhangwang (Contributor) left a comment

I only reviewed the code snippet around the deprecated config logic; I have only nit comments regarding the log lines.

@ableegoldman could you also take another look?

Comment thread on streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java (outdated)
@guozhangwang (Contributor) commented

overriding {} to {} here could also be a bit confusing, since it could be interpreted by someone as the second overriding the first. I'd suggest we just say: Both deprecated config {} and new config {} are set, hence {} is ignored and the new config {} (value {}) is used.

@vamossagar12 (Contributor, Author) commented

Yeah, makes sense. I have updated it to include the topology name along with the rest of the log message. I guess that should be fine?

@guozhangwang (Contributor) commented

Ditto here: we just say "Only deprecated config {} is set and hence its value {} would be used; we suggest setting the new config {} instead since the deprecated config {} would be removed in the future."

@vamossagar12 (Contributor, Author) commented

Same as above.
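Putting the two suggested log messages together, the precedence logic discussed in this thread could look roughly like the sketch below. This is illustrative plain Java, not the actual StreamsConfig implementation, and the 512 MiB default is an assumption:

```java
import java.util.Map;

public class ConfigPrecedence {
    static final String DEPRECATED_KEY = "buffered.records.per.partition";
    static final String NEW_KEY = "input.buffer.max.bytes";

    // Resolve the effective value: new config wins when both are set;
    // the deprecated config is honored (with a migration hint) when it is the only one set.
    static long resolve(Map<String, Long> configs, StringBuilder log) {
        boolean hasOld = configs.containsKey(DEPRECATED_KEY);
        boolean hasNew = configs.containsKey(NEW_KEY);
        if (hasOld && hasNew) {
            long value = configs.get(NEW_KEY);
            log.append(String.format(
                "Both deprecated config %s and new config %s are set, hence %s is ignored and the new config %s (value %d) is used.",
                DEPRECATED_KEY, NEW_KEY, DEPRECATED_KEY, NEW_KEY, value));
            return value;
        } else if (hasOld) {
            long value = configs.get(DEPRECATED_KEY);
            log.append(String.format(
                "Only deprecated config %s is set and hence its value %d would be used; we suggest setting the new config %s instead since the deprecated config %s would be removed in the future.",
                DEPRECATED_KEY, value, NEW_KEY, DEPRECATED_KEY));
            return value;
        }
        return configs.getOrDefault(NEW_KEY, 512L * 1024 * 1024); // assumed default
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        long v = resolve(Map.of(DEPRECATED_KEY, 1000L, NEW_KEY, 1048576L), log);
        System.out.println(v + " | " + log);
    }
}
```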

@mjsax changed the title from Kip 770 to KAFKA-13152: Replace "buffered.records.per.partition" with "input.buffer.max.bytes" (Mar 7, 2022)
@mjsax added the kip (Requires or implements a KIP) and streams labels (Mar 7, 2022)
@guozhangwang (Contributor) commented

Re-triggered Jenkins.

@guozhangwang (Contributor) commented

cc @ableegoldman

@guozhangwang (Contributor) commented

@vamossagar12 could you resolve the conflicts before I re-trigger jenkins again?

@vamossagar12 (Contributor, Author) commented

> @vamossagar12 could you resolve the conflicts before I re-trigger jenkins again?

@guozhangwang done. On my local machine, only one streams test failed: org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest#shouldRestoreState

@guozhangwang guozhangwang merged commit 0924fd3 into apache:trunk Mar 22, 2022
@guozhangwang (Contributor) commented

Thanks @vamossagar12. I've merged the PR; please go ahead and mark the ticket / KIP for 3.3.0.

@vamossagar12 (Contributor, Author) commented

Thanks @guozhangwang !

@ableegoldman (Member) left a comment

Thanks for the new PR @vamossagar12 ! Sorry I wasn't able to take a look earlier but I just gave it a quick pass now. I took the liberty of moving the #getTotalCacheSize method to the StreamsConfigUtils class myself since I'm doing a quick warning log PR in that part of the code anyways.

-            createAndAddStreamThread(cacheSizePerThread, i);
+            createAndAddStreamThread(0L, 0L, i);
         }
+        // Initially, all Stream Threads are created with 0 cache size and max buffer size and then resized here.
@ableegoldman (Member) commented

Why do we do it this way, i.e. rather than just computing the size upfront and creating the threads with that? I find this a bit confusing, mainly because I can't tell if there is a technical reason for doing it this way or it was just a design choice.
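For what it's worth, the pattern being questioned here can be sketched in isolation. The CacheHolder type below is a hypothetical stand-in for the real StreamThread, and the sizes are made up; the point is just that threads start at zero and a single resize pass splits the totals evenly:

```java
import java.util.ArrayList;
import java.util.List;

public class CreateThenResize {
    static final class CacheHolder { long cacheBytes; long bufferBytes; }

    // Create numThreads holders with zero cache/buffer (like createAndAddStreamThread(0L, 0L, i)),
    // then resize all of them in one pass to an even split of the totals.
    static List<CacheHolder> createAndResize(long totalCache, long totalBuffer, int numThreads) {
        List<CacheHolder> threads = new ArrayList<>();
        for (int i = 0; i < numThreads; i++) {
            threads.add(new CacheHolder()); // starts at 0 cache and 0 buffer
        }
        for (CacheHolder t : threads) {
            t.cacheBytes = totalCache / threads.size();
            t.bufferBytes = totalBuffer / threads.size();
        }
        return threads;
    }

    public static void main(String[] args) {
        List<CacheHolder> threads = createAndResize(10_485_760L, 536_870_912L, 4);
        System.out.println(threads.get(0).cacheBytes); // 2621440 (10 MiB / 4)
    }
}
```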

// and then resize them later
streamThread = createAndAddStreamThread(0L, 0L, threadIdx);
final int numLiveThreads = getNumLiveStreamThreads();
resizeThreadCacheAndBufferMemory(numLiveThreads + 1);
@ableegoldman (Member) commented

I think this feedback got lost in the shuffle when we reverted the original PR, but this needs to be fixed -- the + 1 was only necessary in the old code because we resized the cache before adding the new thread/computing the thread count. Now that we first create the new thread, the numLiveThreads count should accurately reflect the number of current threads, so we shouldn't be adding to it anymore.

I'll include a fix for this on the side in a PR I'm doing so no worries 🙂

Member commented

I read the code change, and I agree with you. We don't need +1 anymore. Nice catch!
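A quick numeric sketch of the off-by-one (thread counts and cache size are made up): since the new thread is already created before the count is taken, numLiveThreads already includes it, and adding 1 on top under-sizes every thread.

```java
public class OffByOne {
    // Correct sizing: the newly created thread is already included in numLiveThreads.
    static long perThreadCache(long totalCache, int numLiveThreads) {
        return totalCache / numLiveThreads;
    }

    public static void main(String[] args) {
        long total = 300L;
        int live = 3; // two existing threads plus the newly created one

        System.out.println(perThreadCache(total, live));     // 100: correct even split
        System.out.println(perThreadCache(total, live + 1)); // 75: the "+ 1" bug leaves cache unused
    }
}
```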

@vamossagar12 (Contributor, Author) commented

Oh yeah, I forgot about it :( Thanks for the new PR. I will take a look.

@ableegoldman (Member) commented

FYI @vamossagar12 here is the PR: #11959

@vamossagar12 (Contributor, Author) commented

> Thanks for the new PR @vamossagar12 ! Sorry I wasn't able to take a look earlier but I just gave it a quick pass now. I took the liberty of moving the #getTotalCacheSize method to the StreamsConfigUtils class myself since I'm doing a quick warning log PR in that part of the code anyways.

Thanks for that! I had thought I would create a follow-up PR for moving it to StreamsConfigUtils after another issue I have been working on, but you have done it already!

ableegoldman added a commit that referenced this pull request Mar 30, 2022
…11959)

Since the topology-level cache size config only controls whether we disable the caching layer entirely for that topology, setting it to anything other than 0 has no effect. The actual cache memory is still just split evenly between the threads, and shared by all topologies.

It's possible we'll want to change this in the future, but for now we should make sure to log a warning so that users who do try to set this override to some nonzero value are made aware that it doesn't work like this.

Also includes some minor refactoring plus a fix for an off-by-one error in #11796

Reviewers: Luke Chen <showuon@gmail.com>, Walker Carlson <wcarlson@confluent.io>, Sagar Rao <sagarmeansocean@gmail.com>
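The behavior described in that commit message can be sketched as follows (hypothetical helper names, not the actual Kafka Streams API): a topology-level cache override only toggles caching on or off, while the per-thread share is always computed from the global cache size.

```java
public class TopologyCacheOverride {
    // A topology-level override of 0 disables the caching layer for that topology;
    // any nonzero override leaves caching enabled but has no other effect.
    static boolean cachingEnabled(long topologyOverrideBytes) {
        return topologyOverrideBytes != 0;
    }

    // The actual cache memory is still split evenly between threads and shared
    // by all topologies, regardless of any topology-level override.
    static long perThreadCache(long globalCacheBytes, int numThreads) {
        return globalCacheBytes / numThreads;
    }

    public static void main(String[] args) {
        System.out.println(cachingEnabled(0));               // false: caching disabled
        System.out.println(perThreadCache(100_000_000L, 4)); // 25000000 per thread
    }
}
```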
wcarlson5 added a commit to confluentinc/kafka that referenced this pull request Jun 9, 2022
* KAFKA-13152: Replace "buffered.records.per.partition" with "input.buffer.max.bytes" (apache#11796)

Implements KIP-770

Reviewers: Guozhang Wang <wangguoz@gmail.com>
(cherry picked from commit 0924fd3)

* KAFKA-13945: add bytes/records consumed and produced metrics (apache#12235)

Implementation of KIP-846: Source/sink node metrics for Consumed/Produced throughput in Streams

Adds the following INFO topic-level metrics for the total bytes/records consumed and produced:

    bytes-consumed-total
    records-consumed-total
    bytes-produced-total
    records-produced-total

Reviewers: Kvicii <Karonazaba@gmail.com>, Guozhang Wang <guozhang@apache.org>, Bruno Cadonna <cadonna@apache.org>

* Update CODEOWNERS (#721)

(cherry picked from commit ad18b43)

Co-authored-by: vamossagar12 <sagarmeansocean@gmail.com>
Co-authored-by: A. Sophie Blee-Goldman <sophie@confluent.io>
Co-authored-by: Chris Johnson <53450846+cjohnson-confluent@users.noreply.github.com>
mjsax added a commit to mjsax/kafka that referenced this pull request Jul 6, 2022

Labels

kip (Requires or implements a KIP), streams


5 participants