KAFKA-10684; Avoid additional envelope copies during network transmission #9563
hachikuji merged 10 commits into apache:trunk
|
Does this improvement work for other requests? |
|
@chia7712 I think it could. I was actually thinking about #9401 when I was working on it. As far as I can tell, the client currently copies the whole produce buffer when the request reaches |
|
I got inspired to try and extend this to apply to all message types. I've updated the patch to remove the custom response logic in Here are the results before the patch (10 topics and 500 topics): Here is the new benchmark (10 topics and 500 topics): So it looks like a modest overall improvement. Note I still need to polish up a few things in the PR. |
Does it need a null check (maybe a no-op)?
I was concerned about this also, but the generated code adds its own null check.
|
@chia7712 Thanks for the reviews. I pushed an update to address your comments. I was a little concerned about the garbage created from the |
|
Here are updated results for Trunk: This patch: So generally better, but not groundbreaking. That is fine since the main improvement is simplifying the generation of |
chia7712 left a comment
@hachikuji This improvement LGTM overall.
It seems to me the serialization mechanism gets more complicated, since we are adding a different implementation for IO-heavy requests to get better memory usage. For more readable and consistent code, is there a follow-up to apply this improvement to all requests (once we complete the auto-generated protocol migration)?
If all requests are using auto-generated data, should this be the default implementation of AbstractRequest?
Yeah, I think so. And looks like we're almost there. After your patch for Produce, the only remaining unconverted API that I see is OffsetsForLeaderEpoch.
I'm thinking about how to simplify this process.
Could we reuse the method void write(Writable writable, ObjectSerializationCache cache, short version)? Maybe we can create a Writable instance that does not write data to any output. Instead, it calculates the size of the buffer according to the input data.
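To make the suggestion concrete, here is a minimal sketch of a size-counting writer. It does not use Kafka's real Writable interface: SimpleWritable, SizeCountingWritable, BufferWritable and ExampleMessage are hypothetical names invented for this illustration, and the real interface has many more methods (varints, strings, tagged fields) that a full version would have to cover.

```java
import java.nio.ByteBuffer;

// Hypothetical, trimmed-down stand-in for a Writable-style interface (an assumption, not Kafka's API).
interface SimpleWritable {
    void writeShort(short val);
    void writeInt(int val);
    void writeByteArray(byte[] arr);
}

// Writes nothing; it only accumulates the number of bytes that would have been written.
final class SizeCountingWritable implements SimpleWritable {
    private int size = 0;

    @Override public void writeShort(short val) { size += Short.BYTES; }
    @Override public void writeInt(int val) { size += Integer.BYTES; }
    @Override public void writeByteArray(byte[] arr) { size += arr.length; }

    int size() { return size; }
}

// Writes the same calls into a real ByteBuffer.
final class BufferWritable implements SimpleWritable {
    private final ByteBuffer buffer;

    BufferWritable(ByteBuffer buffer) { this.buffer = buffer; }

    @Override public void writeShort(short val) { buffer.putShort(val); }
    @Override public void writeInt(int val) { buffer.putInt(val); }
    @Override public void writeByteArray(byte[] arr) { buffer.put(arr); }
}

// Hypothetical message whose single write method drives both sizing and serialization.
final class ExampleMessage {
    private final short apiKey;
    private final int correlationId;
    private final byte[] payload;

    ExampleMessage(short apiKey, int correlationId, byte[] payload) {
        this.apiKey = apiKey;
        this.correlationId = correlationId;
        this.payload = payload;
    }

    void write(SimpleWritable out) {
        out.writeShort(apiKey);
        out.writeInt(correlationId);
        out.writeInt(payload.length);   // length prefix for the payload
        out.writeByteArray(payload);
    }

    ByteBuffer serialize() {
        // First pass: count bytes without producing any output.
        SizeCountingWritable counter = new SizeCountingWritable();
        write(counter);
        // Second pass: write into a buffer of exactly the counted size.
        ByteBuffer buffer = ByteBuffer.allocate(counter.size());
        write(new BufferWritable(buffer));
        buffer.flip();
        return buffer;
    }
}

final class SizeCountingDemo {
    public static void main(String[] args) {
        ExampleMessage message = new ExampleMessage((short) 3, 42, new byte[] {1, 2, 3, 4, 5});
        System.out.println("serialized bytes: " + message.serialize().remaining());
    }
}
```

The trade-off in this approach is that the message is traversed twice, once for sizing and once for writing, in exchange for keeping a single write method as the source of truth for the wire layout.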
if (tagged) {
    buffer.printf("int _sizeBeforeArray = _size.totalSize();%n");
}
If it's ok with you, I'd like to address this in a separate patch. The main difference is the presence of the correlation validation logic in NetworkClient, which has been tailored to a subtle case in SaslClientAuthenticator. I think the envelope parsing logic should also be checking the correlationId, but probably not with the same quirky behavior.
sure. Open a jira as follow-up :)
chia7712 left a comment
@hachikuji this is a great improvement. +1
e625728 to ec0a079
|
@chia7712 Thanks for the reviews, merging to trunk. I will follow up with the suggestion about |
This patch creates a new SendBuilder class which allows us to avoid copying "bytes" types when transmitting an api message over the network. This is used in EnvelopeRequest and EnvelopeResponse to avoid copying the embedded data.

The patch also contains a few minor cleanups such as moving envelope parsing logic into RequestContext.
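The general idea of not copying "bytes" fields can be illustrated with a small standalone sketch. This is not the SendBuilder from the patch: ZeroCopySendSketch and its methods are hypothetical names chosen for illustration. Fixed-size fields are written into a small scratch buffer, while a bytes payload is only referenced through ByteBuffer.duplicate(), so the result is a sequence of buffers that share memory with the original payload.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not Kafka's SendBuilder): collects buffers for a gathering write.
final class ZeroCopySendSketch {
    private final List<ByteBuffer> buffers = new ArrayList<>();
    private ByteBuffer scratch;

    // overheadBytes: total size of all non-payload fields, assumed to be known up front.
    ZeroCopySendSketch(int overheadBytes) {
        this.scratch = ByteBuffer.allocate(overheadBytes);
    }

    void writeShort(short val) { scratch.putShort(val); }
    void writeInt(int val) { scratch.putInt(val); }

    // References the payload instead of copying it; duplicate() shares the underlying bytes
    // and leaves the caller's position and limit untouched.
    void writeBytes(ByteBuffer payload) {
        flushScratch();
        buffers.add(payload.duplicate());
    }

    // Emits the scratch bytes written so far as their own buffer, then continues after them.
    private void flushScratch() {
        if (scratch.position() > 0) {
            ByteBuffer written = scratch.duplicate();
            written.flip();
            buffers.add(written.slice());
            scratch = scratch.slice(); // remaining scratch space, starting at position 0
        }
    }

    ByteBuffer[] build() {
        flushScratch();
        return buffers.toArray(new ByteBuffer[0]);
    }

    static long totalBytes(ByteBuffer[] parts) {
        long total = 0;
        for (ByteBuffer part : parts) {
            total += part.remaining();
        }
        return total;
    }
}

final class ZeroCopySendDemo {
    public static void main(String[] args) {
        ByteBuffer embedded = ByteBuffer.wrap(new byte[] {10, 20, 30});
        ZeroCopySendSketch builder = new ZeroCopySendSketch(Integer.BYTES);
        builder.writeInt(embedded.remaining()); // length prefix goes into the scratch buffer
        builder.writeBytes(embedded);           // payload is referenced, not copied
        ByteBuffer[] parts = builder.build();
        System.out.println(parts.length + " buffers, "
                + ZeroCopySendSketch.totalBytes(parts) + " bytes in total");
    }
}
```

An array like the one returned by build() can be handed to a java.nio.channels.GatheringByteChannel through its write(ByteBuffer[]) method, which is one way such a buffer sequence could reach the network without first being merged into a single buffer.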