
KAFKA-6607: Commit correct offsets for transactional input data#8040

Merged
mjsax merged 6 commits into apache:2.5 from mjsax:kafka-6607-eos-commit
Feb 11, 2020

Conversation

@mjsax
Member

@mjsax mjsax commented Feb 5, 2020

Currently, Kafka Streams commits "offset + 1", which may lead to an incorrect "consumer lag" if the input topic is transactional, because the committed offset "steps on" the commit marker instead of skipping it.

With this PR, we commit the offset of the next record, i.e., consumer.position(), to step over potential transactional markers and fix this issue.

Call for review @guozhangwang @ableegoldman

This PR is against 2.5 branch on purpose to avoid conflict with the current Kafka Streams refactoring. After the refactoring is merged, we can port this PR to trunk.
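As a minimal, self-contained sketch of the lag problem (plain Java with illustrative offsets, not the actual Kafka client API): consider a transactional partition whose data records sit at offsets 0..2 and whose commit marker occupies offset 3, so the log-end offset is 4.

```java
public class CommitOffsetDemo {
    static final long LAST_PROCESSED_OFFSET = 2L; // last data record we processed
    static final long LOG_END_OFFSET = 4L;        // one past the commit marker at offset 3

    // Old behavior: commit "offset + 1", which lands on the marker's slot.
    static long oldCommittedOffset() {
        return LAST_PROCESSED_OFFSET + 1; // 3
    }

    // Fixed behavior: commit the offset of the next record to fetch
    // (what consumer.position() returns), stepping over the marker.
    static long newCommittedOffset() {
        return LOG_END_OFFSET; // 4
    }

    public static void main(String[] args) {
        // Although the app is fully caught up, the old scheme reports lag 1.
        System.out.println("old lag = " + (LOG_END_OFFSET - oldCommittedOffset()));
        System.out.println("new lag = " + (LOG_END_OFFSET - newCommittedOffset()));
    }
}
```

With the old scheme the reported lag never reaches zero on a fully caught-up transactional topic; with the fix it does.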

@mjsax mjsax added the streams label Feb 5, 2020
@mjsax mjsax changed the base branch from trunk to 2.5 February 5, 2020 01:16
Member Author

Some additional side cleanup.

Member Author

This new method is added for the fix.

Member Author

To reuse this condition, we move it to our test utils class.

Member Author

New method created in the test utils class to make it reusable.

Member Author

Need to set retries to a non-zero value for transactions...

Contributor

The default value for retries is Integer.MAX_VALUE anyway, right?

Member Author

Yes, but we set it to zero within TestUtils.producerConfig() (not sure why) -- should we remove it there instead?

Contributor

Hmm.. I am not sure either, but we can check whether removing it breaks any other tests --- with the new delivery timeout value we should no longer rely on that config; if we want, we should change delivery.timeout.ms, not this one.

I guess it's up to you :) if it is too much we can just keep it as is.

Member Author

I removed the retries overwrite and left the delivery timeout default. Locally all integration tests passed. Hence, I hope it's fine that way.

Member Author

The original method was too long and checkstyle failed -- we could also add a checkstyle exception... This was just a quick fix -- let me know what you think.

Contributor

Sounds good to me -- better not add more checkstyle exceptions :)

Member Author

Add verification for the "head record offset".

Member Author

Improve some existing tests, and add a couple more that were missing.

Member Author

Side cleanup

Member Author

Add "head record offset" verification.

Member Author

Just some test improvements.

Member Author

Because we call consumer.position() now, we need to fix the mock consumer setup in the TTD (TopologyTestDriver).

Member Author

We cannot share the consumer any longer, because the global task calls unsubscribe(), which nukes our setup from above.

Contributor

Not clear why? In MockConsumer we only do the following:

public synchronized void unsubscribe() {
    ensureNotClosed();
    committed.clear();
    subscriptions.unsubscribe();
}

And the beginningOffsets map is not nuked.

Member Author

The problem is that the subscription is nuked, and when we call position() the MockConsumer checks if the passed-in partition is in its subscription and fails before it tries to access the beginningOffsets map.

Contributor

Ah okay, got it.

Member Author

This was actually detected by the improved tests... Minor side fix.

Contributor

@guozhangwang guozhangwang left a comment

Thanks for the PR @mjsax! The proposed solution looks good to me. Just a few minor comments plus a meta one for consumer.position exception handling.

Contributor

We need to consider handling two exceptions that consumer.position may throw: KafkaException -> should be a fatal one; TimeoutException -> in this case we cannot commit, and we probably have to treat it as fatal, too.

Member Author

Ack. We can wrap KafkaException as StreamsException. A TimeoutException should never happen (compare my comment -- let me know if you think the comment is incorrect) -- hence, we can rethrow TimeoutException as IllegalStateException to flag potential bugs.
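The handling just described could look roughly like this (plain Java; the exception classes here are simplified stand-ins for the real Kafka hierarchy -- org.apache.kafka.common.KafkaException, its TimeoutException subclass, and Streams' StreamsException -- not the actual client types):

```java
public class PositionErrorHandling {
    // Simplified stand-ins for the real Kafka exception hierarchy.
    static class KafkaException extends RuntimeException {
        KafkaException(String msg) { super(msg); }
    }
    static class TimeoutException extends KafkaException {
        TimeoutException(String msg) { super(msg); }
    }
    static class StreamsException extends RuntimeException {
        StreamsException(String msg, Throwable cause) { super(msg, cause); }
    }

    interface Position { long get(); }

    static long positionOrThrow(Position consumerPosition) {
        try {
            return consumerPosition.get();
        } catch (TimeoutException bug) {
            // Should never happen here; surface it as a bug indicator.
            throw new IllegalStateException("Unexpected timeout on position()", bug);
        } catch (KafkaException fatal) {
            // Fatal: wrap and rethrow for Streams-level error handling.
            throw new StreamsException("Failed to retrieve consumer position", fatal);
        }
    }
}
```

Note the catch order: the TimeoutException subclass must be caught before its KafkaException parent.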


Contributor

nit: unkonwn-partition.



@mjsax
Member Author

mjsax commented Feb 6, 2020

Updated.

@guozhangwang
Contributor

LGTM!

@mjsax
Member Author

mjsax commented Feb 6, 2020

Java 8:

kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsAllTopicsAllGroups
org.apache.kafka.streams.processor.internals.RecordQueueTest.shouldThrowOnNegativeTimestamp

Java 11:

kafka.admin.DeleteConsumerGroupsTest.testDeleteEmptyGroup
org.apache.kafka.streams.integration.SmokeTestDriverIntegrationTest.shouldWorkWithRebalance
org.apache.kafka.streams.processor.internals.RecordQueueTest.shouldThrowOnNegativeTimestamp
kafka.api.PlaintextProducerSendTest.testNonBlockingProducer
org.apache.kafka.streams.kstream.internals.KStreamImplTest.shouldSupportTriggerMaterializedWithKTableFromKStream

Retest this please.

@mjsax
Member Author

mjsax commented Feb 6, 2020

Java 8:

kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup
org.apache.kafka.streams.integration.SmokeTestDriverIntegrationTest.shouldWorkWithRebalance

Java 11:

kafka.admin.DescribeConsumerGroupTest.testDescribeGroupWithShortInitializationTimeout
kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup
kafka.api.PlaintextConsumerTest.testLowMaxFetchSizeForRequestAndPartition
org.apache.kafka.streams.integration.SmokeTestDriverIntegrationTest.shouldWorkWithRebalance
org.apache.kafka.streams.kstream.internals.KStreamImplTest.shouldSupportTriggerMaterializedWithKTableFromKStream

Retest this please.

@mjsax mjsax force-pushed the kafka-6607-eos-commit branch from b551ea0 to fc3cc69 Compare February 8, 2020 03:07
@mjsax
Member Author

mjsax commented Feb 8, 2020

@guozhangwang The SmokeTestDriverIntegrationTest failed on this PR, exposing an issue: for an unknown reason, sometimes consumer.position() returned an offset one larger than expected. This seems to happen during a rebalance and leads to data loss -- the original thread did process data up to offset X, but commits offset X+2 (note that this test runs with at-least-once semantics, hence there are no tx-marker gaps), and thus the new thread starts at the wrong position and the record at offset X+1 is never processed. I talked to @hachikuji about this today; he was not sure why this happens either.

Long story short: to fix the issue, I changed this PR to get consumer.position() not when we commit, but each time we add data to the internal task queues. This allows us to track the correct consumer position.

However, with that fix, the EosIntegrationTest failed. The reason is that this test uses the consumer config max.poll.records = 1. This implies that each time we get one record at offset X, the consumer position is always X+1. When we reach the end of the input topic (with the last processed record at offset Y), we commit Y+1 instead of Y+2, because the consumer did not step over the commit marker yet. Only in the next poll() call would the consumer step over it. However, when the consumer steps over it, it does not return any data for the partition (as we are at the end of the partition), and thus we also don't update our tracked consumer position in the record queue. @hachikuji suggested to let the consumer return an empty list of consumer records for the corresponding partition in this case. This PR includes this consumer change.

Let me know what you think. Also @hachikuji for review of the consumer change.
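The approach of tracking the position at enqueue time can be sketched as follows (plain Java with simplified stand-in types, not the actual RecordQueue/PartitionGroup classes): capture consumer.position() right after poll(), when records are added to the task's queue, and commit that tracked position later.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class TrackedRecordQueue {
    private final Deque<Long> bufferedOffsets = new ArrayDeque<>();
    private long trackedPosition = -1L; // consumer position captured at enqueue time

    // 'positionAfterPoll' is consumer.position() taken right after poll();
    // it already points past any tx-marker the consumer stepped over,
    // even when 'batch' is empty (the "empty record list" case).
    void addRawRecords(List<Long> batch, long positionAfterPoll) {
        bufferedOffsets.addAll(batch);
        trackedPosition = positionAfterPoll;
    }

    // Offset to commit once everything buffered has been processed:
    // the tracked position, not lastProcessedOffset + 1.
    long offsetToCommit() {
        if (trackedPosition < 0L) {
            throw new IllegalStateException("no records were ever added");
        }
        return trackedPosition;
    }
}
```

The key point is that an empty batch still updates the tracked position, which is exactly what the consumer-side change below enables.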

Member Author

Even if records might not be empty, we need to filter out the dummy records we added to indicate tx-markers.

Member Author

We add a dummy record if we stepped over a tx-marker.

Member Author

Instead of nothing, we return an empty list if we step over a tx-marker. After a second fetchRecords() we return nothing.
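That "empty list once" behavior can be sketched like this (plain Java stand-in; fetchRecords and the marker flag are illustrative, not the actual Fetcher/MockConsumer code):

```java
import java.util.Collections;
import java.util.List;

public class StepOverMarkerFetch {
    private boolean justSteppedOverMarker;

    void markSteppedOverTxMarker() {
        justSteppedOverMarker = true;
    }

    // The first fetch after stepping over a tx-marker at the end of the
    // partition returns an empty list, so the caller can observe the
    // advanced position; subsequent fetches return null ("nothing").
    List<Long> fetchRecords() {
        if (justSteppedOverMarker) {
            justSteppedOverMarker = false;
            return Collections.emptyList();
        }
        return null;
    }
}
```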

Member Author

We now pass in the current consumerPosition to track it correctly.

Member Author

We track the consumer position explicitly now.

Member Author

This could happen if the buffer is empty and the consumer only stepped over a commit marker, passing in an empty list of records.

Member Author

Previously, we cleared the partitionGroup within closeTopology(), which we call above -- however, because of the consumer position tracking, we need to delay it until after the commit.

Member Author

We add the current consumer position when we add records to the queue now.

Member Author

I improved this test a little bit, adding more conditions.

@mjsax
Member Author

mjsax commented Feb 8, 2020

@guozhangwang @hachikuji This fix is for KS that does manual commits. I am wondering if consumer auto.offset.commit does the right thing already or if we would need a fix for this case, too?

@mjsax mjsax force-pushed the kafka-6607-eos-commit branch from f4cfcc7 to 1281c0b Compare February 8, 2020 20:14
@guozhangwang
Contributor

@mjsax Hmm.. after reading the PR I'm a bit inclined to fix the "unknown reason" that caused consumer.position to return the wrong value here; I'm still wondering if it is related to consumer only (I read through the code once again just now and cannot find an obvious root cause) or it is related to the interaction of streams / consumer. Can we reproduce this bug with a consumer smoke scenario?

@guozhangwang @hachikuji This fix is for KS that does manual commits. I am wondering if consumer auto.offset.commit does the right thing already or if we would need a fix for this case, too?

If we believe the consumer.position is buggy then it would affect auto.offset.commit too since we just rely on the position to commit when it triggers.

@mjsax
Member Author

mjsax commented Feb 10, 2020

I basically agree (I just wanted to make progress on this PR -- it's easy to revert it back :) -- even if I think it's too bad that we need the explicit position tracking -- I'm not sure about the change in the consumer itself...) -- however, I don't understand the consumer code well enough to fix the consumer (if there really is a bug). Not sure if you or @hachikuji would have time to investigate and verify whether there is a bug or if it's a usage issue in Streams?

This fix should go into 2.5 and code freeze is this week (Wednesday 2/12).

For me, SmokeTestDriverIntegrationTest failed in each run, so it should be easy to reproduce with an older version of this PR.

@mjsax mjsax force-pushed the kafka-6607-eos-commit branch from 1281c0b to ab3ba68 Compare February 11, 2020 00:22
@mjsax mjsax force-pushed the kafka-6607-eos-commit branch from ab3ba68 to 7f3c62b Compare February 11, 2020 01:44
@mjsax
Member Author

mjsax commented Feb 11, 2020

Updated this PR.

Contributor

@guozhangwang guozhangwang left a comment

Reviewed the latest commit, just one minor comment otherwise LGTM!

// if we are in PENDING_SHUTDOWN and don't find the task it implies that it was a newly assigned
// task that we just skipped to create;
// hence, we just skip adding the corresponding records
continue;
Contributor

maybe we can also log this as INFO for debugging purposes?

@mjsax
Member Author

mjsax commented Feb 11, 2020

Java 8 passed.
Java 11 failed with org.apache.kafka.connect.mirror.MirrorConnectorsIntegrationTest.testReplication

Retest this please.

@mjsax
Member Author

mjsax commented Feb 11, 2020

Java 8: kafka.admin.DescribeConsumerGroupTest.testDescribeGroupWithShortInitializationTimeout
Java 11: kafka.admin.DescribeConsumerGroupTest.testDescribeGroupMembersWithShortInitializationTimeout

Different test than above. Seems unrelated. Merging this PR.

@mjsax mjsax merged commit 4912a8d into apache:2.5 Feb 11, 2020
@mjsax mjsax deleted the kafka-6607-eos-commit branch February 11, 2020 21:59
@mjsax
Member Author

mjsax commented Feb 11, 2020

PR for trunk: #8091

stanislavkozlovski pushed a commit to stanislavkozlovski/kafka that referenced this pull request Feb 18, 2020