KAFKA-5488: Add type-safe split() operator #9107
Hey @mjsax, do you have time to give this a first pass?
|
I'll put it in my backlog. But I am the main reviewer for two other KIPs (216 and 466) that I should review first, as they were approved earlier and their PRs have been open longer.
mjsax left a comment
Thanks for the PR, @inponomarev, and sorry for the long wait for a review.
Many comments are about JavaDocs, so it's mostly small suggestions. A few comments about the code structure are there, too.
nit: supposed -> used?
this function -> the provided function
Actually, I am wondering whether we should allow passing in null at all. Thoughts?
See my reply below, where we discuss null consumers: #9107 (comment)
(in short: I agree, I think we shouldn't)
"If a non-null branch is provided" here? (branch -> consumer?)
But I would propose simplifying it and just using "By default" (as passing in a non-null consumer should be the "default" usage).
As above, should we even allow a not-null consumer?
Should we call branch((k,v) -> true, branched) instead to just add a predicate and branch? This way, the default branch is nothing special at runtime any longer.
The default branch should have index 0 (so it stays stable when branches are added or removed), but it should always be checked after all other branches. And when we reach the default branch during message processing, there is actually no need to dereference a predicate and call test... That's why I treat the default branch differently.
I guess it's fine both ways. -- The point about the index is a good one that I missed. But it would still be doable, I guess.
I don't think that there would be any measurable runtime difference if you use a "default predicate" (which is what we also do in the current implementation) -- the code is just a little "cleaner" as we don't need an extra "if" at the end -- but it's also not the end of the world, as the process method is fairly simple anyway.
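The trade-off discussed above can be sketched without any Kafka dependency. The class below is a hypothetical illustration, not the code from this PR: BranchRouter, DEFAULT_BRANCH and DROPPED are names made up for the example. It assumes the default branch keeps index 0 while ordinary branches are numbered from 1 in the order they were added, and it falls through to the default without testing any predicate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

/** Hypothetical router; a sketch of the idea, not the Kafka Streams implementation. */
final class BranchRouter<K, V> {
    static final int DEFAULT_BRANCH = 0; // stable index, independent of how many branches exist
    static final int DROPPED = -1;       // assumed marker for "no branch matched, no default"

    private final List<BiPredicate<K, V>> predicates;
    private final boolean hasDefault;

    BranchRouter(List<BiPredicate<K, V>> predicates, boolean hasDefault) {
        this.predicates = new ArrayList<>(predicates); // defensive copy
        this.hasDefault = hasDefault;
    }

    /** Returns the index of the first matching branch (1-based), else the default or DROPPED. */
    int route(K key, V value) {
        for (int i = 0; i < predicates.size(); i++) {
            if (predicates.get(i).test(key, value)) {
                return i + 1;
            }
        }
        // The default branch is checked last and needs no predicate call.
        return hasDefault ? DEFAULT_BRANCH : DROPPED;
    }
}
```

The alternative raised in the review, appending an always-true predicate such as (k, v) -> true, would remove the final conditional at the cost of giving the default branch the last index instead of a fixed one.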
I am wondering if it might be better to move this code into a build method that would be called within defaultBranch() / noDefaultBranch()?
The pattern of passing in an empty list that we modify later seems undesirable; we should first build the list and then pass it in -- otherwise, we make assumptions about how ProcessorParameters and ProcessorGraphNode might be implemented, which we should avoid.
I clearly remember that something made me write it this way, but I need to recall what...
Would love to learn about it. -- In general, it's easier to follow the same pattern throughout the code base. It is easier to reason about the code that way, and also easier for people to learn the code base.
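The "build the list first, then pass it in" pattern suggested in this thread could look roughly like the following. This is a minimal sketch under stated assumptions: BranchListBuilder is a hypothetical name, branches are reduced to plain strings, and nothing here reflects how ProcessorParameters or ProcessorGraphNode actually store their inputs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical builder: collects branch names, then hands out an immutable snapshot. */
final class BranchListBuilder {
    private final List<String> names = new ArrayList<>();

    BranchListBuilder add(String name) {
        names.add(name);
        return this;
    }

    /** Copies the current state, so later add() calls cannot affect the consumer of the list. */
    List<String> build() {
        return Collections.unmodifiableList(new ArrayList<>(names));
    }
}
```

With this shape, whoever receives the list never observes later modifications, so no assumption about whether the receiver copies or keeps the reference is needed.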
|
Hi @mjsax, thanks for your thorough review! I have fixed everything according to your comments, except:
|
|
OK @mjsax, concerning #9107 (comment): I remembered why it was implemented this way! The problem is that it is not necessary to invoke
source.split()
    .branch(isCoffee, Branched.withConsumer(issuer::setCoffeePurchases))
    .branch(isElectronics, Branched.withConsumer(issuer::setElectronicsPurchases));
|
About the original comment: #9107 (comment) I am fine with those changes. About #9107 (comment) -- that is a good point. Thanks for explaining. I guess it's a "philosophical" question if we want to allow this pattern though, or if we want to require that either Curious to hear what @vvcephei thinks about it. |
|
@inponomarev -- Can you also update the docs for Kafka Streams and the 2.8 upgrade guide in this PR?
The documentation had been already updated (see changes in I also modified Another question: CI checks fail because of usage of deprecated Most likely we should deprecate the |
|
To make the build pass, for now, it should be sufficient to just deprecate the method via It seems, we need to add |
|
As far as I can judge from the name, |
|
Maybe I misunderstood your question. I thought the build fails because we are using some deprecated method -- in that case, we can make the build pass by suppressing the warning. If you want to deprecate a method in the Scala API, you just add
|
Hey @inponomarev and @mjsax! I'm glad to see this is moving along.
Regarding #9107 (comment): My understanding was that defaultBranch/noDefaultBranch were the terminal operators, in that they close out the context of a BranchedKStream, and you can't add any more branches after one of those methods. But also, the whole branching construct is an incremental builder like the rest of the Kafka Streams API. In other words, just like this is a valid program:
builder.stream("input")
    .filter(myPredicate)
so would be Ivan's example:
builder.split()
    .branch("myBranch", ...)
What I mean by "incremental builder" is that each time you call a chained method in the DSL, it immediately adds nodes to the program, as opposed to having to call any kind of
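The "incremental builder" idea can be illustrated with a toy model. Everything below is hypothetical (ToyTopology, ToyStream and ToyBranched are invented for this sketch and are not Kafka Streams classes); it only assumes that each chained call registers a node right away, so a chain is already a complete program even if no terminal method is called.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a topology: just an ordered list of node names. */
final class ToyTopology {
    private final List<String> nodes = new ArrayList<>();

    ToyStream stream(String topic) {
        nodes.add("source:" + topic);
        return new ToyStream(this);
    }

    void addNode(String node) {
        nodes.add(node);
    }

    List<String> nodes() {
        return new ArrayList<>(nodes); // copy, so callers cannot mutate the topology
    }
}

final class ToyStream {
    private final ToyTopology topology;

    ToyStream(ToyTopology topology) {
        this.topology = topology;
    }

    ToyStream filter(String name) {
        topology.addNode("filter:" + name); // registered immediately
        return this;
    }

    ToyBranched split() {
        topology.addNode("split");
        return new ToyBranched(topology);
    }
}

final class ToyBranched {
    private final ToyTopology topology;

    ToyBranched(ToyTopology topology) {
        this.topology = topology;
    }

    ToyBranched branch(String name) {
        topology.addNode("branch:" + name); // registered immediately, no final step required
        return this;
    }

    /** Closes the branching context; in this toy model there is nothing left to register. */
    void noDefaultBranch() {
    }
}
```

In this model, topology.stream("input").split().branch("myBranch") leaves three nodes in the topology whether or not noDefaultBranch() is ever invoked, which is the property the comment above describes.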
|
Hi @vvcephei , thank you for your comment. There's another question that we were unable to solve without you -- see #9107 (comment) from the words 'CI checks fail' and further discussion. Can you clarify, what's expected from |
|
@vvcephei -- hope you are also ok with the proposed changes to the KIP as per the PR description on top: #9107 (comment) |
|
Hey @inponomarev , I just took a look at the Scala API. Thanks for adding that! I figured it'd be just easier to push a few tweaks than to describe what needs to be done.
These are all separate commits above, so you can scrutinize each one. This PR is your work, so feel free to protest any of my suggestions. |
|
Hi @vvcephei thank you for your commits! Is everything else OK, especially #9107 (comment)? @mjsax I pushed small fixes to Javadoc/Scaladoc, and AFAICS only tests not related to the changes are failing. |
|
Thanks @inponomarev! Ah, I didn't notice that method signature name. I actually prefer it this way :) Thanks also for pointing out the covariance change. This is also fine. Java's type system only contains a partial implementation of variance, so we do the best we can. Did you already update the KIP? If not, please do. I'm +1 on this PR.
* consumers cannot be null
* typo: "function" -> "consumer"
|
Hi @mjsax , I have rebased and manually merged conflicts, and also removed JDK8 build still fails, but this time much later -- something related to integration testing |
|
What failure do you see exactly? Seems Jenkins is still running.
|
I was talking about build 17 (triggered by commit db573f5, see https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-9107/). Where did build 18 come from, and why did it take 8 hours and then time out? I can't understand 😃
|
Ah I see -- well, we do have flaky tests, so nothing to worry about, I guess. The last run timed out, so I retriggered the build. However, I could build it locally with Java 8/Scala 2.12, so I guess we can merge. Just waiting for @vvcephei to take a quick look at the last Scala commit.
|
Thanks, all, that Scala fix looks perfect to me. |
|
Merged to Congrats for getting this into the 2.8.0 release @inponomarev -- great work! |