[MINOR] Guard against crashing on invalid key range queries #6521
bbejeck merged 10 commits into apache:trunk from
Conversation
Streams should at least be consistent across store types in its handling of invalid range queries, and I felt it was better to log the error and return nothing than to throw an exception. However, maybe silently returning "incorrect" results is worse than crashing and alerting users to the issue... WDYT?
guozhangwang
left a comment
One meta comment: should we add documentation similar to https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBCachingPriorityQueueSet.java#L71 to indicate that the objects' logical ordering and the serialized lexicographic ordering need to be consistent as well?
This ordering only needs to be enforced for IQ, correct?
I think it should be applied universally, since whenever you call a …
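This exchange touches the crux of the ordering issue: a fixed-width signed serialization does not preserve logical ordering under byte-wise lexicographic comparison, so a logically valid range like [-1, 1] looks inverted once serialized. Below is a minimal standalone sketch of the mismatch; the class name and the `compareUnsigned` helper are hypothetical, standing in for the unsigned byte-array comparison a byte-keyed store uses.

```java
import java.nio.ByteBuffer;

public class OrderingMismatch {
    // Hypothetical helper: byte-wise unsigned lexicographic comparison,
    // i.e. the order a byte-keyed store iterates in.
    static int compareUnsigned(final byte[] a, final byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            final int ai = a[i] & 0xFF;
            final int bi = b[i] & 0xFF;
            if (ai != bi) {
                return ai - bi;
            }
        }
        return a.length - b.length;
    }

    public static void main(final String[] args) {
        // Big-endian two's-complement serialization of signed ints
        final byte[] neg = ByteBuffer.allocate(4).putInt(-1).array(); // FF FF FF FF
        final byte[] pos = ByteBuffer.allocate(4).putInt(1).array();  // 00 00 00 01
        // Logically -1 < 1, but the serialized -1 sorts AFTER the serialized 1:
        System.out.println(compareUnsigned(neg, pos) > 0); // prints true
    }
}
```

So a store comparing serialized keys sees the query range [-1, 1] as inverted even though it is logically valid, which is why the two orderings have to agree.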
bbejeck
left a comment
Thanks @ableegoldman. Overall looks good to me; I just have a minor comment regarding the logging level.
nit: since this represents an invalid range, maybe this could be a WARN?
bbejeck
left a comment
Thanks for the update @ableegoldman. LGTM.
Call for second review: any of @guozhangwang @mjsax @vvcephei @cadonna
cadonna
left a comment
Hi @ableegoldman,
Are there unit tests in place to verify the changes in this PR?
For the rest, I have just a couple of nits.
```java
@Override
public KeyValueIterator<Bytes, byte[]> range(final Bytes from,
                                             final Bytes to) {
    // Make sure this is a valid query
```
nit: I would remove the comment here (and in all occurrences below), because the code itself is clear enough about what it does. Maybe rename `from` and `to` to `fromKey` and `toKey` (or similar) to make it even clearer. Renaming would also apply to some of the changes below.
```java
                                             final Bytes to) {
    // Make sure this is a valid query
    if (from.compareTo(to) > 0) {
        LOG.warn("Returning empty iterator for range query with invalid range: keyFrom > keyTo.");
```
nit: I would avoid writing variable names (i.e., `keyFrom` and `keyTo`) to a log, because they are hard to keep consistent with the code (as you can see here).
```java
@Test
public void shouldNotThrowInvalidRangeExceptionWithNegativeFromKey() {
    store.range(-1, 1);
```
You can use `org.apache.kafka.streams.processor.internals.testutil.LogCaptureAppender` to assert the correct log message.
Ah thanks, will add to tests
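`LogCaptureAppender` is a Kafka-internal test utility and not usable outside the project, so as a rough standalone illustration of the same technique, the sketch below captures log messages with plain `java.util.logging`; all names here are hypothetical, not part of this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class LogCaptureSketch {
    // Attach a handler that records every published log message into the given list.
    static void captureInto(final Logger logger, final List<String> captured) {
        logger.setUseParentHandlers(false); // keep captured messages off the console
        logger.addHandler(new Handler() {
            @Override public void publish(final LogRecord record) { captured.add(record.getMessage()); }
            @Override public void flush() { }
            @Override public void close() { }
        });
    }

    public static void main(final String[] args) {
        final Logger logger = Logger.getLogger("store");
        final List<String> captured = new ArrayList<>();
        captureInto(logger, captured);

        // The code under test would log this when handed an invalid range:
        logger.warning("Returning empty iterator for range query with invalid range.");

        // Assert on the captured message, analogous to using LogCaptureAppender in a test.
        System.out.println(captured.contains(
            "Returning empty iterator for range query with invalid range."));
        // prints true
    }
}
```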
retest this, please
```java
LogCaptureAppender.setClassLoggerToDebug(InMemoryWindowStore.class);
final LogCaptureAppender appender = LogCaptureAppender.createAndRegister();

store.range(-1, 1);
```
Could you add a check to verify that the returned iterator is empty? Something along the lines of `assertThat(iterator.hasNext(), is(false))`.
Could you also add a test for a range query where the start key is equal to the end key? Such a unit test ensures correct behaviour for this special case.
nit: I would rename the test to `shouldReturnEmptyIteratorForRangeQueryWithInvalidKeyRange`. Correct me if I am wrong, but I think the empty iterator and the invalid key range are the points here, not the negative starting key. I would even change the range from (-1, 1) to (5, 3). It took me a bit to understand why (-1, 1) is an invalid range.
These comments apply also to the unit tests below.
Ack, re: verifying the returned iterator is empty and adding unit tests for equal start/end keys.
Regarding your third point, this patch is mostly aimed at the bug in https://issues.apache.org/jira/browse/KAFKA-8159, which went undiscovered for a while because there were no tests of range queries with a negative key. I actually think it's fair to say we make no guarantees about what will happen if your app makes an invalid query; however, we definitely shouldn't crash on what appears to be a valid query range (i.e., [-1, 1]), which is the key point here.
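To make the intended behavior concrete, here is a minimal standalone sketch of the guard discussed in this PR; the names are hypothetical and a `TreeMap` stands in for a real state store. An inverted range like (5, 3) yields an empty iterator, while a range over negative keys like (-1, 1) is served normally.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

public class GuardedRangeSketch {
    private final TreeMap<Integer, String> store = new TreeMap<>();

    public void put(final int key, final String value) {
        store.put(key, value);
    }

    // Guard clause: log and return an empty iterator instead of throwing
    // when the range is inverted, mirroring the approach in this PR.
    public Iterator<Map.Entry<Integer, String>> range(final int fromKey, final int toKey) {
        if (fromKey > toKey) {
            System.err.println("Returning empty iterator for range query with invalid range.");
            return Collections.emptyIterator();
        }
        return store.subMap(fromKey, true, toKey, true).entrySet().iterator();
    }

    public static void main(final String[] args) {
        final GuardedRangeSketch store = new GuardedRangeSketch();
        store.put(-1, "a");
        store.put(1, "b");
        System.out.println(store.range(5, 3).hasNext());  // prints false (invalid range)
        System.out.println(store.range(-1, 1).hasNext()); // prints true (valid negative-to-positive range)
        System.out.println(store.range(1, 1).hasNext());  // prints true (equal start/end keys)
    }
}
```

Note that the equal start/end case returns the single matching entry rather than an empty iterator, which is the special case worth pinning down in a unit test.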
Java 8 failed. Retest this please
Java 8 passed; Java 11 failure unrelated. Retest this please
Java 8 failed. Retest this please
Merged #6521 to trunk
Due to KAFKA-8159, Streams will throw an unchecked exception when a caching layer or in-memory underlying store is queried over a range of keys from negative to positive. We should add a check for this, log it, and return an empty iterator (as the RocksDB stores happen to do) rather than crash.

Reviewers: Bruno Cadonna <bruno@confluent.io>, Bill Bejeck <bbejeck@gmail.com>