KAFKA-13211: add support for infinite range query for WindowStore #11227

Merged
guozhangwang merged 9 commits into apache:trunk from showuon:KAFKA-13211 on Sep 22, 2021

showuon (Member) commented Aug 18, 2021

Add support for infinite range query for WindowStore.
Story JIRA: https://issues.apache.org/jira/browse/KAFKA-13210

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Comment on lines 580 to 583
Member Author

When cacheKeyFrom and cacheKeyTo are null, the range query is used; null bounds are already supported there as of KIP-763.

Comment on lines 77 to 79
Member Author

Side fix: the "all" case should apply only when from == null && to == null. Otherwise, call the range method, which already supports null bounds as of KIP-763.

Contributor

Isn't it sufficient to distinguish the forward and reverse cases and just call range(from, to) or reverseRange(from, to)?

Member Author

Good suggestion. Updated
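
For readers following this thread, here is a minimal sketch of the idea: if range(from, to) and reverseRange(from, to) both treat a null bound as open-ended, the caller only has to pick a direction and needs no separate "all" branch. The class below is a hypothetical TreeMap-backed stand-in with String keys, not the Kafka Streams cache or store API, and all of its names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for a sorted store; NOT the Kafka Streams API.
final class NullBoundRangeSketch {
    private final NavigableMap<String, String> data = new TreeMap<>();

    void put(final String key, final String value) {
        data.put(key, value);
    }

    // A null bound means "unbounded on that side"; non-null bounds are inclusive.
    private NavigableMap<String, String> view(final String from, final String to) {
        if (from == null && to == null) {
            return data;
        } else if (from == null) {
            return data.headMap(to, true);
        } else if (to == null) {
            return data.tailMap(from, true);
        } else {
            return data.subMap(from, true, to, true);
        }
    }

    List<String> range(final String from, final String to) {
        return new ArrayList<>(view(from, to).keySet());
    }

    List<String> reverseRange(final String from, final String to) {
        return new ArrayList<>(view(from, to).descendingKeySet());
    }

    // The caller only distinguishes forward from reverse.
    List<String> fetch(final String from, final String to, final boolean forward) {
        return forward ? range(from, to) : reverseRange(from, to);
    }
}
```

With keys "a", "b" and "c" stored, fetch(null, null, true) walks every key in ascending order, while fetch("b", null, false) walks "c" and then "b".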

@showuon showuon changed the title KAFKA-13211: add support for infinite range query for WindowStore [WIP] KAFKA-13211: add support for infinite range query for WindowStore Aug 19, 2021
@showuon showuon changed the title [WIP] KAFKA-13211: add support for infinite range query for WindowStore KAFKA-13211: add support for infinite range query for WindowStore Aug 20, 2021
showuon (Member Author) commented Aug 20, 2021

@patrickstuedi @vvcephei , please take a look. Thank you.

patrickstuedi (Contributor) left a comment

Thanks for the work @showuon ! Just a few comments..

Comment thread checkstyle/suppressions.xml Outdated
files="StreamThread.java"/>

<suppress checks="BooleanExpressionComplexity"
files="InMemoryWindowStore.java"/>

Contributor

I guess this is because of InMemoryWindowStore::isKeyWithinRange? Can we make that method more readable and at the same time avoid having to do this?

* This iterator must be closed after use.
*
* @param keyFrom the first key in the range
* A null value indicates a starting position from the first element in the store.

Contributor

Can you give those extra lines the same indentation as the previous line, so it is easy to see that they belong together?

}
}

private boolean isKeyWithinRange(final Bytes key) {

Contributor

Per comment above, can you make this method more readable by splitting the statements?

public void shouldThrowNullPointerExceptionOnRangeNullToKey() {
assertThrows(NullPointerException.class, () -> windowStore.fetch(1, null, ofEpochMilli(1L), ofEpochMilli(2L)));
}

Contributor

Instead of deleting, maybe you want to change the name and check that the store returns the right values.

Member Author

Yes, but I've already tested these 2 test cases above (i.e. testFetchRange and testBackwardFetchRange). I don't think we should test them again here. What do you think?

Contributor

Ok, no, if you have them covered up there that's fine.

iterator.next(),
new Windowed<>(bytesKey("a"), new TimeWindow(DEFAULT_TIMESTAMP, DEFAULT_TIMESTAMP + WINDOW_SIZE)),
"a");
assertFalse(iterator.hasNext());

Contributor

Did you check whether you can consolidate that code into a common method? It seems to be the same verification for different value sets in each of the tests.

Member Author

Updated. Thanks.
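
As an illustration of the kind of consolidation discussed here (not the helper that was actually added in this PR), repeated "next(), compare, hasNext()" sequences can be folded into one method that drains an iterator and compares the result with an expected list. The class and method names below are hypothetical, and the sketch uses plain java.util types rather than Kafka's iterator or Windowed classes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical test helper; the names are assumptions made for illustration.
final class IteratorAssertSketch {
    // Drains the iterator and checks that it yields exactly the expected elements, in order.
    static <T> void assertIteratorYields(final Iterator<T> iterator, final List<T> expected) {
        final List<T> actual = new ArrayList<>();
        while (iterator.hasNext()) {
            actual.add(iterator.next());
        }
        if (!actual.equals(expected)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }
}
```

Each test then reduces to a single call with its own expected values. Store iterators must be closed after use, as the Javadoc quoted earlier in this thread notes, so a real helper would also need to close the iterator, for example with try-with-resources.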

patrickstuedi (Contributor) left a comment

Just a few more comments..

Comment on lines 77 to 79
Contributor

Isn't it sufficient to distinguish the forward and reverse cases and just call range(from, to) or reverseRange(from, to)?

showuon (Member Author) commented Aug 26, 2021

@patrickstuedi, thanks for the comments. I've addressed all of them and added test coverage. Please take another look. Thank you.

patrickstuedi (Contributor) left a comment

Thanks for the update!

One more thing: it might be worth adding a TopologyTestDriver-based test and an integration test, to test all combinations of layered window stores. You can take a look at the two tests for the non-window stores:

  • streams/src/test/java/org/apache/kafka/streams/integration/KTableEfficientRangeQueryTest.java
  • streams/src/test/java/org/apache/kafka/streams/integration/RangeQueryIntegrationTest.java

showuon (Member Author) commented Sep 2, 2021

Integration tests added, but I found a bug that makes these tests fail. I will wait for the fix PR (#11292) to be merged and then continue this PR. Thanks.

patrickstuedi (Contributor) commented

@showuon Any new updates on this?

showuon (Member Author) commented Sep 10, 2021

@patrickstuedi, yes, the fix PR (#11292) is under review (it should be close to merging). Thank you.

showuon (Member Author) commented Sep 14, 2021

@patrickstuedi, the PR (#11292) has been merged into trunk. I've rebased this PR, so it is ready for review now. Thank you.

patrickstuedi (Contributor) left a comment

Thanks for the work @showuon! Looking good to me.

guozhangwang (Contributor) left a comment

Thanks @showuon, the PR LGTM. The only meta comment I have is about the rationale for adding a separate integration test on top of all the unit tests. Usually we have integration tests in order to test the interaction between multiple modules; they are more complicated (and more likely to become flaky due to timing issues) and take more time to run. I feel that for this functionality the augmented unit tests alone are sufficient, but I might be wrong, so please let me know if you feel it does bring additional coverage.

} else if (keyFrom == null && key.compareTo(getKey(keyTo)) <= 0) {
// start from the beginning
isKeyInRange = true;
} else if (key.compareTo(getKey(keyFrom)) >= 0 && keyTo == null) {

Contributor

nit: let's move keyTo == null up first so that if it does not satisfy, we do not need to trigger getKey anymore.

Contributor

Thinking about that a bit more, maybe we can make it simpler:

            if (keyFrom == null && keyTo == null) {
                // fetch all
                return true;
            } else if (keyFrom == null) {
                // start from the beginning
                return key.compareTo(getKey(keyTo)) <= 0;
            } else if (keyTo == null) {
                // end to the last
                return key.compareTo(getKey(keyFrom)) >= 0; 
            } else {
                return key.compareTo(getKey(keyFrom)) >= 0 && key.compareTo(getKey(keyTo)) <= 0;
            }

Member Author

Nice refactor! Thanks.
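
To check the four cases by hand, the branch structure suggested above can be run on plain keys. The sketch below is a hypothetical, self-contained version: it assumes Comparable keys and drops the getKey wrapper used in the store code. It also shows an equivalent form that tests each bound independently, which is an observation added here for illustration and not something proposed in this PR.

```java
// Hypothetical, simplified version of the range check; the names are assumptions.
final class KeyRangeSketch {
    // Same four branches as the suggestion above; a null bound is open-ended.
    static <K extends Comparable<K>> boolean fourBranch(final K key, final K keyFrom, final K keyTo) {
        if (keyFrom == null && keyTo == null) {
            // fetch all
            return true;
        } else if (keyFrom == null) {
            // start from the beginning
            return key.compareTo(keyTo) <= 0;
        } else if (keyTo == null) {
            // run to the last key
            return key.compareTo(keyFrom) >= 0;
        } else {
            return key.compareTo(keyFrom) >= 0 && key.compareTo(keyTo) <= 0;
        }
    }

    // Equivalent form: each bound either is absent or must be satisfied.
    static <K extends Comparable<K>> boolean perBound(final K key, final K keyFrom, final K keyTo) {
        final boolean aboveLower = keyFrom == null || key.compareTo(keyFrom) >= 0;
        final boolean belowUpper = keyTo == null || key.compareTo(keyTo) <= 0;
        return aboveLower && belowUpper;
    }
}
```

Both methods evaluate a comparison only when the corresponding bound is non-null, which is the property the earlier nit about the order of the keyTo == null check was after.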

@SuppressWarnings("deprecation") // Old PAPI. Needs to be migrated.
@RunWith(Parameterized.class)
@Category({IntegrationTest.class})
public class RangeQueryForWindowStoreIntegrationTest {

Contributor

Could we refactor it into a unit test instead of an integration test? Does integration environment bring any additional coverage here?

Contributor

Also, what additional coverage does this test provide on top of the augmented unit tests (which is great, btw!) below?

Member Author

The reason we have an integration test here is that the unit tests use TopologyTestDriver, without real brokers. But on second thought, I think we can remove the integration test, because that is what TopologyTestDriver exists for: to simulate a running Streams application. In particular, this test case doesn't involve interaction between multiple modules. So I have removed it now. Thank you.

@guozhangwang guozhangwang merged commit b61ec00 into apache:trunk Sep 22, 2021
guozhangwang (Contributor) commented

LGTM! Merged to trunk.

xdgrulez pushed a commit to xdgrulez/kafka that referenced this pull request Dec 22, 2021
…ache#11227)

Add support for infinite range query for WindowStore. Story JIRA: https://issues.apache.org/jira/browse/KAFKA-13210

Reviewers: Patrick Stuedi <pstuedi@gmail.com>, Guozhang Wang <wangguoz@gmail.com>