MINOR: Improve Join integration test coverage, PART I #4331
guozhangwang wants to merge 23 commits into apache:trunk
    final V oldValue = sendOldValues ? serdes.valueFrom(underlying.get(entry.key())) : null;
    underlying.put(entry.key(), entry.newValue());
This is intentional as a bug fix. See description.
Wouldn't we want to have an else clause that still performs underlying.put in case flushListener is null?
Ack, and I think this is the root cause of the Jenkins failure.
    @Rule
    public final TemporaryFolder testFolder = new TemporaryFolder(TestUtils.tempDirectory());
    @Parameterized.Parameters
IMHO we should consider changing to @Parameterized.Parameters(name = "caching enabled = {0}"), which prints whether caching is enabled or not instead of just the index of the parameter.
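
To illustrate what the suggested name pattern would print, here is a minimal stdlib-only sketch. It assumes that JUnit 4 expands a `name` pattern by substituting `{index}` and then applying `java.text.MessageFormat` to the parameter values, and that the default pattern is `{index}`; the class and method names are hypothetical and this is not JUnit code.

```java
import java.text.MessageFormat;

final class ParameterNameSketch {

    /**
     * Approximates the display name of one parameter set (an assumption about
     * JUnit 4 behaviour, not its actual implementation).
     */
    static String displayName(final String pattern, final int index, final Object... parameters) {
        // Replace the literal {index} placeholder first, then fill {0}, {1}, ...
        final String withIndex = pattern.replace("{index}", Integer.toString(index));
        return "[" + MessageFormat.format(withIndex, parameters) + "]";
    }

    public static void main(final String[] args) {
        // Assumed default pattern: only the index is shown.
        System.out.println(displayName("{index}", 0, true));
        // Suggested pattern: the caching flag itself is shown.
        System.out.println(displayName("caching enabled = {0}", 0, true));
        System.out.println(displayName("caching enabled = {0}", 1, false));
    }
}
```

With the suggested pattern, a failing run would be labelled by the caching flag rather than by a bare index, which makes it easier to tell which configuration broke.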
    void runTest(final List<List<String>> expectedResult, final String storeName) throws Exception {
        assert expectedResult.size() == input.size();
        System.out.println(builder.build().describe());

        TestUtils.waitForCondition(new TestCondition() {
            @Override
            public boolean conditionMet() {
                System.out.println("RESULT: " + finalResultReached.get());
Left some comments.

@guozhangwang I think the failure could be related (or maybe a rebase is needed); I can reproduce it locally.

@bbejeck Thanks for the reminder, the leftover debugging output was my bad.

retest this please

Local runs saved about 40 seconds for the streams unit test suite.
    if (flushListener != null) {
        final V oldValue = sendOldValues ? serdes.valueFrom(underlying.get(entry.key())) : null;
        underlying.put(entry.key(), entry.newValue());
Is this actually needed? If a downstream processor tries to get this value, won't it get it from the cache? I.e., I don't think an evicted entry is removed from the cache until after the flush has finished.
Actually it will be removed, i.e. when listener.apply(entries); is called the entry is no longer in the cache.
What I observed originally is an issue when caching is turned off. Note that in that case we still go through this code path (which should be optimized away in the future, I think). When you call put on the store, it immediately triggers a flush, so the record is processed downstream while it has not yet been put into the underlying store and is also no longer in the cache.
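
The ordering discussed in this thread can be sketched with a toy store. This is a simplified illustration under stated assumptions, not the Kafka Streams caching store: the class, its fields, and the `(key, newValue)` listener shape are hypothetical. It shows the two points raised above: the write to the underlying store happens whether or not a flush listener is registered, and it happens before the listener runs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** Toy stand-in for a caching layer in front of a store; all names are hypothetical. */
final class FlushOrderSketch {
    private final Map<String, String> underlying = new HashMap<>();
    // Optional downstream hook receiving (key, newValue); may be null.
    private BiConsumer<String, String> flushListener;

    void setFlushListener(final BiConsumer<String, String> listener) {
        this.flushListener = listener;
    }

    String getFromUnderlying(final String key) {
        return underlying.get(key);
    }

    /** Called when an entry has already left the cache and must be flushed. */
    void onEvict(final String key, final String newValue) {
        // Write first and unconditionally: the entry is no longer in the cache,
        // so the underlying store is the only place a downstream read can find it.
        underlying.put(key, newValue);
        if (flushListener != null) {
            flushListener.accept(key, newValue);
        }
    }

    public static void main(final String[] args) {
        final FlushOrderSketch store = new FlushOrderSketch();
        final List<String> seenDownstream = new ArrayList<>();
        // The listener plays the role of a downstream processor reading the store.
        store.setFlushListener((key, value) -> seenDownstream.add(store.getFromUnderlying(key)));
        store.onEvict("a", "1");
        System.out.println("downstream read during flush: " + seenDownstream);
    }
}
```

If the `put` were moved after the listener call, or kept inside the `if` block only, the downstream read in `main` would observe a missing value, which matches the failure described above.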
    checkResult(OUTPUT_TOPIC, expectedFinalResult, numRecordsExpected);
    if (storeName != null)

        }
    }
    if (storeName != null)
dguy left a comment:
Thanks @guozhangwang. Left a couple of comments, but overall LGTM. Feel free to merge once you've addressed them.
    assertThat(onlyEntry.value, is(expectedFinalResult));
    assertThat(all.hasNext(), is(false));
    all.close();
This will never be called if one of the assertions fails.
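
One common way to address this kind of comment is try-with-resources (or a `finally` block), so the iterator is closed even when a check throws. The sketch below uses a hypothetical `TrackingIterator` as a stand-in for a store iterator and plain `AssertionError`s in place of the Hamcrest assertions; it is an illustration, not the change made in this PR.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class CloseOnFailureSketch {

    /** Hypothetical stand-in for a store iterator that must be closed. */
    static final class TrackingIterator implements Iterator<String>, AutoCloseable {
        private final Iterator<String> delegate;
        boolean closed = false;

        TrackingIterator(final List<String> values) {
            this.delegate = values.iterator();
        }

        @Override
        public boolean hasNext() {
            return delegate.hasNext();
        }

        @Override
        public String next() {
            return delegate.next();
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Checks that the iterator holds exactly one expected entry; closes it even if a check throws. */
    static void verifySingleEntry(final TrackingIterator all, final String expected) {
        try (TrackingIterator it = all) {
            final String only = it.next();
            if (!expected.equals(only)) {
                throw new AssertionError("unexpected value: " + only);
            }
            if (it.hasNext()) {
                throw new AssertionError("more than one entry");
            }
        }
    }

    public static void main(final String[] args) {
        final TrackingIterator it = new TrackingIterator(Arrays.asList("x"));
        try {
            verifySingleEntry(it, "y"); // deliberately wrong expectation
        } catch (final AssertionError expectedFailure) {
            // The check failed, but the iterator was still closed.
        }
        System.out.println("closed after failed check: " + it.closed);
    }
}
```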
Thanks for your reviews. Merged to trunk.
1. In the caching layer's flush listener call, we should always write to the underlying store before flushing (see point 4 of #4331 for a detailed explanation). The fix in #4331 only touched KV stores, but it turns out the window and session stores need the same fix.
2. Also apply the optimization that was already in the session store: when the new value bytes and old value bytes are both null (this is possible, e.g., if a put(K, V) is followed by a remove(K) or put(K, null) and these two operations only hit the cache), then upon flushing the underlying store does not have this value at all and no intermediate value has been sent downstream either. In this case we can skip both putting a null into the underlying store and calling the flush listener with `null -> null`.

Modifies corresponding unit tests.

Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>
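
The skip described in point 2 can be sketched as follows. This is a simplified model with string values instead of byte arrays; the class, the `forwarded` log, and the method name are hypothetical and do not mirror the actual store classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of flushing one dirty cache entry; all names are hypothetical. */
final class SkipNullFlushSketch {
    final Map<String, String> underlying = new HashMap<>();
    // Records what a flush listener would have received, as "key: old -> new".
    final List<String> forwarded = new ArrayList<>();

    void flushEntry(final String key, final String newValue) {
        final String oldValue = underlying.get(key);
        if (newValue == null && oldValue == null) {
            // A put followed by a delete never left the cache:
            // nothing to write and nothing to forward.
            return;
        }
        // Write to the underlying store before notifying downstream.
        if (newValue == null) {
            underlying.remove(key);
        } else {
            underlying.put(key, newValue);
        }
        forwarded.add(key + ": " + oldValue + " -> " + newValue);
    }

    public static void main(final String[] args) {
        final SkipNullFlushSketch store = new SkipNullFlushSketch();
        store.flushEntry("a", null); // skipped: old and new are both null
        store.flushEntry("b", "1");  // forwarded as an insert
        store.flushEntry("b", null); // forwarded as a delete
        System.out.println(store.forwarded);
    }
}
```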
1. `JoinIntegrationTest` to `StreamStreamJoinIntegrationTest`, which is only for KStream-KStream joins.
2. `AbstractJoinIntegrationTest`, which is going to be used for all the join integration test classes, parameterized with and without caching.
3. `KStreamRepartitionJoinTest.java` into `StreamStreamJoinIntegrationTest.java` with augmented stream-stream join.
4. `TableTableJoinIntegrationTest` with detailed per-step expected results and removed `KTableKTableJoinIntegrationTest`.

Findings of the integration test:
Future work, including stream-table joins, will be in other PRs.