KAFKA-6704: InvalidStateStoreException from IQ when StreamThread closes store by bbejeck · Pull Request #4801 · apache/kafka

bbejeck · 2018-03-30T22:10:11Z

While using an iterator from IQ, it's possible to get an InvalidStateStoreException if the StreamThread closes the store during a range query.

Added a unit test to SegmentIteratorTest for this condition.

Committer Checklist (excluded from commit message)

Verify design and implementation
Verify test coverage and CI build status
Verify documentation (including upgrade notes)

bbejeck · 2018-03-30T22:12:16Z

This was an expected condition before. This fix will suppress this Exception, do we want to consider another approach?

Should this not still throw a NoSuchElementException? Not sure if we need to test for this here, though. It's just for clarification.

next() does. I agree about the test; maybe we should remove it?
WDYT?

\cc @guozhangwang @vvcephei

I cannot follow your comment. "Maybe we should remove it": remove what?

The test itself.

I think this test is fine -- it makes sense to ensure that hasNext() does not throw. We should add a second test that expects next() to throw, IMHO.

I agree we should keep the test.

The Iterator interface documents this relationship between next() and hasNext(). If it's in doubt whether next() will throw when hasNext() returns false, I think we should test it.
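For reference, a minimal sketch of what such a second test could check. It exercises the general `java.util.Iterator` contract on a plain iterator rather than on `SegmentIterator` itself; the class and method names are hypothetical and not part of this PR.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical helper, not part of the PR: checks the java.util.Iterator
// contract on any iterator instead of on SegmentIterator itself.
class IteratorContractSketch {

    // Drains the iterator, then reports whether one more next() call
    // throws NoSuchElementException as the Iterator javadoc specifies.
    static boolean nextThrowsWhenExhausted(Iterator<?> iterator) {
        while (iterator.hasNext()) {
            iterator.next();
        }
        try {
            iterator.next();
            return false; // next() returned a value after hasNext() was false
        } catch (NoSuchElementException expected) {
            return true;
        }
    }
}
```

In a real test, the iterator under test would be the segment iterator, with the store closed before the final `next()` call.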

bbejeck · 2018-03-30T22:12:40Z

@guozhangwang @mjsax @vvcephei for reviews

mjsax · 2018-04-01T05:43:54Z

Should this not still throw a NoSuchElementException? Not sure if we need to test for this here, though. It's just for clarification.

mjsax · 2018-04-01T05:54:05Z

Why do we need this? I thought close() will trigger an exception in the inner iterator already?

ack. Cleaned the test up.

bbejeck · 2018-04-04T21:18:11Z

updated this

vvcephei

Just to clarify for myself, is there any explicit or implied guarantee that the iterator will return all the records in the store?

Because this appears to be the behavior before this change: that if you drain the iterator until hasNext() returns false, you will see all records in the store. If you get an exception in the middle, you obviously know you may not have seen all records. But if we now just "end" the iterator when the state store is closed, I cannot distinguish whether the state store closed in the middle of my read or whether I read all the records.

vvcephei · 2018-04-04T23:42:40Z

I agree we should keep the test.

The Iterator interface documents this relationship between next() and hasNext(). If it's in doubt whether next() will throw when hasNext() returns false, I think we should test it.

vvcephei · 2018-04-04T23:43:54Z

I guess that comment about verifying that next() throws also applies here.

mjsax · 2018-04-05T00:14:04Z

Meta comment: @vvcephei's last comment just triggered this thought. We should rethink this change. Don't we introduce a race condition between hasNext() and next()? If hasNext() returns true and then the segment expires, next() might still throw...

Maybe it's better to just keep the current behavior that hasNext() might throw, and handle the exception in an upper layer?
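The interleaving in question can be replayed deterministically in a small sketch. Everything here is a hypothetical stand-in: `Segment` is not the Kafka Streams class, and `IllegalStateException` stands in for `InvalidStateStoreException`.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class CheckThenActRaceSketch {

    // Hypothetical stand-in for a store segment that another thread may close.
    static class Segment {
        volatile boolean open = true;
        final List<String> records;

        Segment(List<String> records) {
            this.records = records;
        }
    }

    // hasNext() swallows the "closed" condition, but next() still has to
    // read from the segment and therefore can still fail.
    static class SwallowingIterator implements Iterator<String> {
        private final Segment segment;
        private int position = 0;

        SwallowingIterator(Segment segment) {
            this.segment = segment;
        }

        @Override
        public boolean hasNext() {
            return segment.open && position < segment.records.size();
        }

        @Override
        public String next() {
            if (!segment.open) {
                // Stand-in for InvalidStateStoreException.
                throw new IllegalStateException("segment has closed");
            }
            if (position >= segment.records.size()) {
                throw new NoSuchElementException();
            }
            return segment.records.get(position++);
        }
    }

    // Replays hasNext() -> close -> next() on a single thread.
    static boolean nextThrowsAfterHasNextReturnedTrue() {
        Segment segment = new Segment(Arrays.asList("a", "b"));
        Iterator<String> iterator = new SwallowingIterator(segment);
        boolean hadNext = iterator.hasNext(); // true: segment is still open
        segment.open = false;                 // the segment is closed in between
        try {
            iterator.next();
            return false;
        } catch (IllegalStateException e) {
            return hadNext; // next() threw although hasNext() had returned true
        }
    }
}
```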

bbejeck · 2018-04-05T14:13:32Z

@vvcephei good point.

@mjsax you raise a good point as well about whether we should make this change at all, as it will not cover the case where hasNext() returns true, the store is closed, and then next() is called and throws...

I'm thinking now it's better to leave behavior as we have it. WDYT?

\cc @guozhangwang @mjsax @vvcephei

vvcephei · 2018-04-05T15:51:50Z

Because I hate to make any decision too easy, you can fix that race condition by buffering the next record in hasNext and using the existence of a record in the buffer to determine the answer to hasNext. This is how AbstractIterator in Guava works.

But I still kinda think that the current behavior might be better, for the reason I cited earlier ;)

It might be the case that we could document it better or throw a clearer exception.

mjsax · 2018-04-12T23:07:06Z

I tend to agree that just documenting the behavior in detail might be better -- it would be good to hear @guozhangwang's and @dguy's opinions.

guozhangwang · 2018-04-18T21:13:46Z

I think this issue can be resolved by following the org.apache.kafka.common.utils.AbstractIterator pattern, i.e., when we call hasNext we make the next element ready, so that next() becomes a trivial call that returns that item and does not check hasNext again. By doing this we can remove the gap that can cause the race condition.

Still, if we realize that in hasNext the store has closed, return false.
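A rough, self-contained sketch of the buffering idea, assuming non-null records. It does not extend the real `AbstractIterator`; `Store` and `PrefetchingIterator` are hypothetical names used only for illustration.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class PrefetchingIteratorSketch {

    // Hypothetical store stand-in; `open` may be flipped by another thread.
    static class Store {
        volatile boolean open = true;
        final List<String> records; // assumed to contain no nulls

        Store(List<String> records) {
            this.records = records;
        }
    }

    static class PrefetchingIterator implements Iterator<String> {
        private final Store store;
        private int position = 0;
        private String buffered; // element made ready by hasNext(), or null

        PrefetchingIterator(Store store) {
            this.store = store;
        }

        @Override
        public boolean hasNext() {
            if (buffered != null) {
                return true; // already fetched, so repeated calls are idempotent
            }
            if (!store.open || position >= store.records.size()) {
                return false; // closed or exhausted: end the iteration
            }
            buffered = store.records.get(position++);
            return true;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            // Hand out the buffered element without touching the store again.
            String result = buffered;
            buffered = null;
            return result;
        }
    }
}
```

With this shape, an element that hasNext() has already made ready is still returned by next() even if the store closes in between, and only the following hasNext() call returns false.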

bbejeck · 2018-04-18T21:20:35Z

sounds good to me.

mjsax · 2018-04-19T13:47:31Z

Thanks @guozhangwang -- that's along the lines of what @vvcephei suggested. +1

vvcephei · 2018-04-24T23:03:31Z

@guozhangwang @mjsax , that pattern would resolve the race condition, but I'm still wondering if we're better off throwing an exception.

If I'm iterating over a collection, and I reach the end of the iterator, wouldn't I assume that I've seen the whole collection?

For some context, I searched around a little to find out what happens in other databases. Apparently, in Oracle (and some other unnamed RDBMS), if you concurrently run a query and drop or truncate the table, depending on the exact options, you'll get one of these outcomes:

the table becomes unavailable for new queries, the query completes fully and normally, and then the table is destroyed
the table is immediately destroyed and the query immediately returns with an error (that the "object no longer exists")

FWIW, I think these semantics are actually simpler.

KIP-216 may come to bear on this topic. I think under that proposal we would throw some kind of RetriableException instead of InvalidStateStoreException. Maybe this is a good balance of safe and clear?

guozhangwang · 2018-04-25T15:58:04Z

Note that the issue we're fixing only relates to range queries on window stores, which we organize as segmented stores. This is an internal implementation detail that should be abstracted away from users. So from the user's point of view, as long as the whole window store is not closed (i.e. task not migrated out, thread not dying, etc.), it should always be queryable. So I think capturing the exception internally and skipping the segment is a better idea.

When we capture the exception, we should not terminate the iterator by returning false directly from the outermost hasNext; we should skip only that closed segment, and if there are more segments to iterate over, continue to them.
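To illustrate "skip only the closed segment", here is a simplified sketch. `Segment`, `range()` and `readAll()` are hypothetical, and `IllegalStateException` stands in for `InvalidStateStoreException`; the real segmented store code is more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SkipClosedSegmentSketch {

    // Hypothetical segment; range() throws once the segment is closed.
    static class Segment {
        volatile boolean open = true;
        private final List<String> records;

        Segment(String... records) {
            this.records = Arrays.asList(records);
        }

        Iterator<String> range() {
            if (!open) {
                // Stand-in for InvalidStateStoreException.
                throw new IllegalStateException("segment has closed");
            }
            return records.iterator();
        }
    }

    // Reads every segment in order; a closed segment is skipped, and the
    // scan continues with the remaining segments instead of ending early.
    static List<String> readAll(List<Segment> segments) {
        List<String> result = new ArrayList<>();
        for (Segment segment : segments) {
            try {
                Iterator<String> it = segment.range();
                while (it.hasNext()) {
                    result.add(it.next());
                }
            } catch (IllegalStateException e) {
                // Only this segment is unavailable; move on to the next one.
            }
        }
        return result;
    }
}
```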

guozhangwang · 2018-05-12T00:16:58Z

@bbejeck what's the status of this PR?

bbejeck · 2018-06-04T14:59:21Z

rebased this

guozhangwang · 2018-06-04T17:46:57Z

retest this please.

guozhangwang

I think we can still try out the approach with

private class RocksDbIterator extends AbstractIterator<KeyValue<Bytes, byte[]>> implements KeyValueIterator<Bytes, byte[]>

bbejeck · 2018-06-04T20:11:58Z

I have an update using AbstractIterator; I just need to get some other tests passing and will update this PR soon.

bbejeck · 2018-06-05T13:56:13Z

updated this using AbstractIterator

bbejeck · 2018-06-05T14:15:48Z

        public synchronized KeyValue<Bytes, byte[]> next() {
-            if (!hasNext())
-                throw new NoSuchElementException();
+            return super.next();


Not sure if required but this method was synchronized in the first place so I've kept it that way.

bbejeck · 2018-06-05T14:25:36Z

        public synchronized boolean hasNext() {
            if (!open) {
                throw new InvalidStateStoreException(String.format("RocksDB store %s has closed", storeName));
            }


Having this check here has got me thinking more about this issue.

Without this guard condition, we have some failing unit tests.

In both the RocksDBIterator and the AbstractIterator all calls to next() make a call to hasNext() first before returning the next object. I'm not sure about changing the semantics where we return from next() without calling hasNext() first (which if we end up keeping those semantics, leaves us in the same position as before extending AbstractIterator).

I guess the question is, do we want to continue to throw an exception when hasNext() is called (when the store is closed) or simply return false?

I could be overthinking this, but I'm not entirely comfortable with returning a value from next() after closing the store. I feel like that creates more corner cases for potential errors or unexpected behavior.

WDYT?

@bbejeck That is a good question!

Originally I thought it was okay to always call hasNext inside next(), as long as we make sure the hasNext implementation is idempotent, i.e. calling it multiple times before next() has no side effects. But by making it idempotent we could hit the corner case you mentioned. For example:

t0: call `hasNext()` -> store is still open -> call `makeNext` -> `next` field is set.
t1: store is closed.
t2: call `next()` -> call `hasNext()` again

Without this check, at t3 we would still return the next field.
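The effect of the guard can be sketched with a simplified buffering iterator. This is not the actual RocksDB iterator: `GuardedIterator` and `replay()` are hypothetical, records are assumed non-null, and `IllegalStateException` stands in for `InvalidStateStoreException`.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class GuardedIteratorSketch {

    static class GuardedIterator implements Iterator<String> {
        private final List<String> records; // assumed to contain no nulls
        private int position = 0;
        private String next;          // buffered element, like the `next` field above
        volatile boolean open = true; // flipped when the store closes

        GuardedIterator(List<String> records) {
            this.records = records;
        }

        @Override
        public synchronized boolean hasNext() {
            if (!open) {
                // Guard: fail even if an element is already buffered.
                throw new IllegalStateException("store has closed");
            }
            if (next == null && position < records.size()) {
                next = records.get(position++);
            }
            return next != null;
        }

        @Override
        public synchronized String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            String result = next;
            next = null;
            return result;
        }
    }

    // Replays the timeline: buffer an element, close, then call next().
    static boolean replay() {
        GuardedIterator it = new GuardedIterator(Arrays.asList("a", "b"));
        it.hasNext();    // store open, element is buffered
        it.open = false; // store is closed
        try {
            it.next();   // calls hasNext() again, which hits the guard
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```

Dropping the `!open` check from `hasNext()` in this sketch would let `next()` return the already-buffered element after the close, which is the corner case described above.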

guozhangwang · 2018-06-05T17:58:19Z

+        }
+
+        private KeyValue<Bytes, byte[]> getKeyValue() {
+            return new KeyValue<>(new Bytes(iter.key()), iter.value());


A nit (and paranoid) comment: maybe we can reuse the same KeyValue object, but just set its key / value fields since they are public and not final. So we do not create those short-lived objects for young gen GC. Not sure how much it will really get us, but just want to be safer since it is part of a critical code path (i.e. one object per each iterated element).

On another look, KeyValue is immutable: its key and value fields are final. We could extend KeyValue as an inner class of RocksDBStore to accomplish this. WDYT?

NM that won't work.

I see. Do not bother then :) At least we are not introducing a regression that makes perf worse :)

I am late, thus just a meta comment: we hand the KeyValue object to the user, and the user might actually keep a reference. Thus, we cannot reuse an object anyway, because we might mess up user code if it accesses an earlier returned KeyValue again after retrieving newer ones.

yeah, that's an excellent point.
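A small sketch of the aliasing problem described above. `MutablePair` is a hypothetical mutable holder used only for illustration (the thread notes that the real KeyValue has final fields).

```java
import java.util.ArrayList;
import java.util.List;

class ReusedHolderSketch {

    // Hypothetical mutable holder, standing in for a reusable key/value object.
    static class MutablePair {
        String key;
        String value;
    }

    // Simulates a caller that keeps every reference handed out by an
    // iterator which reuses a single holder for all elements.
    static List<MutablePair> collectWithReuse(List<String> keys) {
        MutablePair shared = new MutablePair();
        List<MutablePair> kept = new ArrayList<>();
        for (String key : keys) {
            shared.key = key;                 // overwrites what the caller already holds
            shared.value = key.toUpperCase();
            kept.add(shared);
        }
        return kept;
    }
}
```

After collecting the keys "a" and "b" this way, both kept entries refer to the same object, so the first entry also reads as "b".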

…es store (apache#4801) While using an iterator from IQ, it's possible to get an InvalidStateStoreException if the StreamThread closes the store during a range query. Added a unit test to SegmentIteratorTest for this condition. Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

mjsax requested review from dguy, guozhangwang and mjsax March 30, 2018 23:14

mjsax added the streams label Mar 30, 2018

mjsax reviewed Apr 1, 2018

bbejeck force-pushed the KAFKA_6704_invalid_state_store_error_possible_from_iq branch from c3b3d43 to da22892 Compare April 4, 2018 21:17

vvcephei requested changes Apr 4, 2018

bbejeck force-pushed the KAFKA_6704_invalid_state_store_error_possible_from_iq branch from da22892 to 3323708 Compare June 4, 2018 14:58

guozhangwang reviewed Jun 4, 2018

bbejeck added 5 commits June 4, 2018 22:12

KAFKA-6704: hasNext should throw InvalidStateStoreException from hasNextCondition.hasNext (c9efd08)

KAFKA-6704: hasNext could throw when IQ iterates over store closed by StreamThread (d97ef62)

updates per comments (f078efe)

fixed checkstyle issue (111ceb3)

Implement AbstractIterator (16e2a57)

Added check for if the store is closed on hasNext (9e8d65c)

bbejeck force-pushed the KAFKA_6704_invalid_state_store_error_possible_from_iq branch from 3323708 to 9e8d65c Compare June 5, 2018 13:55

added synchronized to hasNext and next (01be095)

guozhangwang reviewed Jun 5, 2018

guozhangwang merged commit ef41369 into apache:trunk Jun 5, 2018

bbejeck deleted the KAFKA_6704_invalid_state_store_error_possible_from_iq branch July 10, 2024 12:56
