KAFKA-13785: [6/N][Emit final] emit final for TimeWindowedKStreamImpl#11896
lihaosky wants to merge 45 commits into apache:trunk
guozhangwang left a comment
Made a pass; we need to rebase this branch a bit and after that I will review again.
Should this be updated?
Should hasIndex be true here?
It seems windowStore in the window aggregate is only used via windowStore.fetch(key, window.start()), so I don't need the index: fetching from the time-ordered schema is also a point get.
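To illustrate the point-get argument, here is a minimal sketch in plain Java. It is not the RocksDB-backed store from this PR; it assumes a hypothetical composite key that places the window start before the record key, and `TimeOrderedStoreSketch`, `composite`, and `fetchByStartBefore` are made-up names. Under that assumption, `fetch(key, windowStart)` is a single lookup that needs no secondary index, and scans come back in window-start order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class TimeOrderedStoreSketch {
    // Hypothetical key layout: "<zero-padded windowStart>|<key>", so that
    // lexicographic order equals window-start order (non-negative timestamps assumed).
    private final TreeMap<String, String> entries = new TreeMap<>();

    private static String composite(long windowStartMs, String key) {
        return String.format("%020d|%s", windowStartMs, key);
    }

    public void put(String key, long windowStartMs, String value) {
        entries.put(composite(windowStartMs, key), value);
    }

    /** Point get: key and window start fully determine the composite key. */
    public String fetch(String key, long windowStartMs) {
        return entries.get(composite(windowStartMs, key));
    }

    /** Values of all windows whose start is strictly before the bound, in window-start order. */
    public List<String> fetchByStartBefore(long upperStartExclusiveMs) {
        String upperBound = String.format("%020d|", upperStartExclusiveMs);
        return new ArrayList<>(entries.headMap(upperBound, false).values());
    }
}
```

A key-first layout would instead need an index (or a scan) to answer "all windows that start before T", which is the access pattern emit final relies on.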
This class's changes should be reverted.
Yes. It was previously on top of another change.
nit: would it be better named emitWindowStart?
This is not a suggestion: we are sending null with the guarantee that we have never forwarded anything for this key before. I think good test coverage would be a windowed aggregation with emit final, followed by a join. The join results need both the old and new values to be able to correct themselves, and with emit final we should only emit once, with the old value set to null.
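The old/new value contract described here can be sketched in plain Java. This is an illustration only: `Change` below is a hypothetical stand-in rather than Kafka Streams' internal class, and `applyDownstream` is a made-up consumer that retracts the old value before applying the new one, roughly as a downstream operator would when correcting its result.

```java
import java.util.ArrayList;
import java.util.List;

public class EmitFinalChangeSketch {
    /** Hypothetical (newValue, oldValue) pair; not Kafka's internal Change class. */
    public static final class Change {
        public final Long newValue;
        public final Long oldValue;

        public Change(Long newValue, Long oldValue) {
            this.newValue = newValue;
            this.oldValue = oldValue;
        }
    }

    /** Emit eager: every intermediate result is forwarded with the previous result as old value. */
    public static List<Change> emitEager(long[] runningResults) {
        List<Change> out = new ArrayList<>();
        Long previous = null;
        for (long result : runningResults) {
            out.add(new Change(result, previous));
            previous = result;
        }
        return out;
    }

    /** Emit final: only the last result is forwarded, once, with a null old value. */
    public static List<Change> emitFinal(long[] runningResults) {
        List<Change> out = new ArrayList<>();
        if (runningResults.length > 0) {
            out.add(new Change(runningResults[runningResults.length - 1], null));
        }
        return out;
    }

    /** Downstream consumer that corrects its total: retract old, then apply new. */
    public static long applyDownstream(List<Change> changes) {
        long total = 0L;
        for (Change change : changes) {
            if (change.oldValue != null) {
                total -= change.oldValue;
            }
            if (change.newValue != null) {
                total += change.newValue;
            }
        }
        return total;
    }
}
```

Both paths leave the downstream consumer with the same total; the null old value is only safe because nothing was forwarded for the key earlier, which is the guarantee the comment refers to.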
lihaosky left a comment
@guozhangwang Yeah, I'll need to rebase and add unit tests. Merging this requires #11829, which is also in review. I'll rebase this on top of #11829 and add unit tests. I can start perf testing before this is merged.
Force-pushed from 8c40ed1 to 2f4550b
Force-pushed from 2f4550b to 39ed91b
@guozhangwang, should we give this a different name to avoid overriding the existing emit-eager store?
In what scenario would we end up with two stores with the same name? (I believe there is no problem and the code is fine, but maybe I'm missing something?)
If a user just enables emit final for an existing topology that uses emit eager, will it pick up the existing state store, which has the wrong data format etc.?
I think it's ok to declare this a non-supported pattern? We should call it out in the docs.
Sorry I got to this late --- also sgtm.
Can we add details on what "closes" means, i.e., when a window is considered closed?
Should we also mention caching? Or would that go into the weeds?
Sure. I feel this is not related to caching.
What happens if it's used anyway?
(nit: missing . at the end of the sentence)
Will update with details.
I guess this won't be called many times, so it won't flood the log, and it is helpful information. Will this be printed somewhere else (e.g. in the topology)?
"while this is helpful information"
Well, is it? To me, it sounds as if a filter logged "you are using a filter". The concern is not about spamming the logs; it just seems to be noise rather than helpful. I am also ok with leaving it in.
Is filter logged in the topology? This isn't logged in topology though :)
Similar to the previous comment, I feel this won't be called many times (just once when creating the processor?). And the config won't be printed when we print all configs, since it is an internal config. This will be convenient for debugging.
It's not about spamming; the question is whether it's useful. It also seems to leak an internal config that we might want to keep hidden?
I put it here since we don't log it since it's internal config... I'm ok to delete this though
Did we do the same thing for stream-stream join? I don't think we did. Might be worth doing there, too? (Not part of this PR...)
stream-stream join didn't do this. Created https://issues.apache.org/jira/browse/KAFKA-13817 to track
Why would this record be skipped? You define a hopping window, and while window [0,10) is closed, window [5,15) is still open and the record should still go into this second window?
Yeah. It does go to the second window. Will update the comment.
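For reference, a small plain-Java sketch of the hopping-window case discussed above. It is not code from this PR: the class and method names are made up, and `isClosed` encodes the assumption that a window counts as closed once stream time has reached window end plus grace.

```java
import java.util.ArrayList;
import java.util.List;

public class HoppingWindowSketch {
    /** Start times of all hopping windows [start, start + sizeMs) that contain the timestamp. */
    public static List<Long> windowStartsFor(long timestampMs, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        // Earliest window that can still contain the timestamp, aligned to the advance and clamped at 0.
        long first = Math.max(0L, timestampMs - sizeMs + advanceMs) / advanceMs * advanceMs;
        for (long start = first; start <= timestampMs; start += advanceMs) {
            starts.add(start);
        }
        return starts;
    }

    /** Assumption: a window is closed once stream time has reached window end plus grace. */
    public static boolean isClosed(long windowStartMs, long sizeMs, long graceMs, long streamTimeMs) {
        return windowStartMs + sizeMs + graceMs <= streamTimeMs;
    }
}
```

With size 10, advance 5, and no grace, a record at timestamp 7 belongs to both [0,10) and [5,15). At stream time 10 the first window is closed and the second is still open, so the record is only dropped for the first window.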
The output might be easier to read/understand if we use unique values, 1 to 7 for the seven input records.
Didn't get your question. The fetch for emit final is based on windowStart order.
Why do we get one more output record?
I added one more input in processData: inputTopic.pipeInput("2", "30", 1000L);
Not really 🥲 . Will update
Force-pushed from 6ad18b5 to e1ce13d
Force-pushed from b3dd2b5 to 4042108
Force-pushed from 4543a23 to 36b789d
…pache#12063) Fix two bugs related to dynamic broker configs in KRaft. The first bug is that we are calling reloadUpdatedFilesWithoutConfigChange when a topic configuration is changed, but not when a broker configuration is changed. This is backwards. This function must be called only for broker configs, and never for topic configs or cluster configs. The second bug is that there were several configurations such as max.connections which are related to broker listeners, but which do not involve changing the registered listeners. We should support these configurations in KRaft. This PR fixes the configuration change validation to support this case. Reviewers: Jason Gustafson <jason@confluent.io>, Matthew de Detrich <mdedetrich@gmail.com>
…12004) Reviewers: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>
Improve documentation for Kafka zero-copy. Kafka combines pagecache and zero-copy to greatly improve message consumption efficiency. But zero-copy only works in PlaintextTransportLayer. Reviewers: Divij Vaidya <divijvaidya13@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
…pache#12030) A new cache for RocksDBTimeOrderedWindowStore. Need this because RocksDBTimeOrderedWindowStore's key ordering is different from CachingWindowStore which has issues for MergedSortedCacheWindowStoreIterator Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>
…ation of internal topic names (apache#11703) Reviewers: Guozhang Wang <wangguoz@gmail.com>
…pache#12074) Reviewers: Victoria Xia <victoria.xia@confluent.io>, David Arthur <mumrah@gmail.com>
…pache#12069) Reviewers: Luke Chen <showuon@gmail.com>
In drainBatchesForOneNode method, there's possibility causing some partitions in a node will never get picked. Fix this issue by maintaining a drainIndex for each node. Reviewers: Luke Chen <showuon@gmail.com>, RivenSun <91005273+RivenSun2@users.noreply.github.com>
* Fix UP-TO-DATE check in `create*VersionFile` tasks `create*VersionFile` tasks explicitly declared output UP-TO-DATE status as being false. This change properly sets the inputs to `create*VersionFile` tasks to the `commitId` and `version` values and sets `receiptFile` locally rather than in an extra property. * Enable output caching for `process*Messages` tasks `process*Messages` tasks did not have output caching enabled. This change enables that caching, as well as setting a property name and RELATIVE path sensitivity. * Fix existing Gradle deprecations Replaces `JavaExec#main` with `JavaExec#mainClass` Replaces `Report#destination` with `Report#outputLocation` Adds a `generator` configuration to projects that need to resolve the `generator` project (rather than referencing the runtimeClasspath of the `generator` project from other project contexts. Reviewers: Mickael Maison <mickael.maison@gmail.com>
When LeaderRecoveryState was added to the PartitionChangeRecord, the check for being a noop was not updated. This commit fixes that and improves the associated test to avoid this oversight in the future. Reviewers: Colin Patrick McCabe <cmccabe@apache.org>
…pache#11881) Reviewers: Guozhang Wang <wangguoz@gmail.com>
…rs in KRaft mode (apache#12075) This PR fixes a case where we were unable to place on fenced brokers In KRaft mode. Specifically, if we had a broker registration in the metadata log, but no associated heartbeat, previously the HeartbeatManager would not track the fenced broker. This PR fixes this by adding this logic to the metadata log replay path in ClusterControlManager. Reviewers: David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>
This patch does some initial cleanups in the context of KAFKA-13790. Mainly, it renames `ZkVersion` field to `PartitionEpoch` in the `LeaderAndIsrRequest`, the `LeaderAndIsr` and the `Partition`. Reviewers: Jason Gustafson <jason@confluent.io>, dengziming <dengziming1993@gmail.com>
An unquoted wildcard * in the command is expanded by bash into the names of the files in the current folder, so it needs to be wrapped in single quotes. Update the doc and the command option description. Reviewers: dengziming <dengziming1993@gmail.com>, Luke Chen <showuon@gmail.com>
Reviewers: Luke Chen <showuon@gmail.com>
Using enums instead of Strings for auto.offset.reset configuration Reviewers: Divij Vaidya <divijvaidya13@gmail.com>, Luke Chen <showuon@gmail.com>
…pache#12032) Currently we validate recovery state before checking leader epoch in `KafkaController`. It seems more intuitive to validate leader epoch first since the leader might be working with stale state, which is what we do in KRaft. This patch fixes this and adds a couple additional validations to make the behavior consistent. Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>
This patch refactors kafka.cluster.Replica, its usages and tests. This is part of the work in KAFKA-13790. Reviewers: Jason Gustafson <jason@confluent.io>
Adding KRaft and ZK params to ConfigCommandIntegrationTest wherever appropriate. Reviewers: Kvicii <42023367+Kvicii@users.noreply.github.com>, dengziming <dengziming1993@gmail.com>, José Armando García Sancio <jsancio@users.noreply.github.com>
…2064) The bug was introduced in apache#11689 that an additional onAcknowledgement was made using the InterceptorCallback class. This is undesirable since onSendError will attempt to call onAcknowledgement once more. Reviewers: Jun Rao <junrao@gmail.com>
The html document generation has some errors in it, specifically related to protocols. The two issues identified and resolved are: * Missing </tbody> closing tags added * Invalid usage of a <p> tag as a wrapper element for <table> elements. Changed the <p> tag to be a <div>. Tested by running ./gradlew siteDocsTar and observing that the output was properly formed. Reviewers: Guozhang Wang <wangguoz@gmail.com>
guozhangwang left a comment
I incorporated the latest comments and pushed a new commit, but it seems I messed up the GitHub PR by pulling in other commits from trunk as well.
I will rephrase the comment a bit to make it clearer.
}
}

final long startNs = time.nanoseconds();
If we do intend to use nanoseconds instead of milliseconds, then we should name the metric "...-latency-ns" and also emphasize in the description that it is measured in nanos, since by default all latencies are measured in millis across the AK package unless explicitly named / described otherwise.
Personally I think it's sufficient to measure in millis. WDYT @lihaosky @mjsax ?
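As a sketch of the millis option, the interval can still be taken from a nanosecond clock and converted before it is recorded. This is an assumption about one possible approach rather than what the PR does, and `EmitLatencySketch` and `elapsedMs` are hypothetical names; the sensor call itself is omitted.

```java
import java.util.concurrent.TimeUnit;

public class EmitLatencySketch {
    /** Converts an elapsed nanosecond interval to whole milliseconds (truncating). */
    public static long elapsedMs(long startNs, long endNs) {
        return TimeUnit.NANOSECONDS.toMillis(endNs - startNs);
    }

    public static void main(String[] args) throws InterruptedException {
        long startNs = System.nanoTime();
        Thread.sleep(5); // stand-in for the emit-final work being timed
        long endNs = System.nanoTime();
        System.out.println("emit latency (ms): " + elapsedMs(startNs, endNs));
    }
}
```

If sub-millisecond resolution matters, the alternative is to record the raw nanosecond value and say so in the metric name and description, as suggested above.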
private static final String EMITTED_RECORDS_RATE_DESCRIPTION =
    RATE_DESCRIPTION_PREFIX + EMITTED_RECORDS_DESCRIPTION + RATE_DESCRIPTION_SUFFIX;

private static final String EMIT_FINAL_LATENCY = "window-aggregate-final-emit" + LATENCY_SUFFIX;
Please see my other comment --- if we measure in nanos we'd need to explicitly add that in the name and in the description.
    RATE_DESCRIPTION_PREFIX + EMITTED_RECORDS_DESCRIPTION + RATE_DESCRIPTION_SUFFIX;

private static final String EMIT_FINAL_LATENCY = "window-aggregate-final-emit" + LATENCY_SUFFIX;
private static final String EMIT_FINAL_DESCRIPTION = "calls to emit final";
We can replace with EMITTED_RECORDS + LATENCY_SUFFIX.
builder.stream(TOPIC,
        Consumed.with(Serdes.String(), Serdes.String()))
    .transform(() -> new Transformer<String, String, KeyValue<String, String>>() {
I think all these transforms can be replaced with process, since there's no resulting stream to continue from.
EDIT: never mind, I saw that you've done so in the other PR...
public void shouldSkipNonExistBaseKeyInCache() {
    cachingStore.put(bytesKey("aa"), bytesValue("0002"), 0);

    final SegmentedCacheFunction baseCacheFunction = new SegmentedCacheFunction(new TimeFirstWindowKeySchema(), SEGMENT_INTERVAL);
Description
Initial implementation of emit final for TimeWindowedKStreamImpl. This PR is on top of #12030.
Testing
Unit tests and integration tests.