KAFKA-7280: Synchronize consumer fetch request/response handling #5495
hachikuji merged 3 commits into apache:trunk
@hachikuji I haven't written tests yet, but just wanted to know if this is the right place for synchronization. I was initially thinking of making |
Wouldn't it be enough to lock on the given instance of the FetchSessionHandler, given that's the object that runs into a concurrent modification? This way, as I understand it, we completely eliminate concurrency from the sendFetches method for a given instance.
Also, I would generally prefer using a lock object instead of locking on this. The reason is that locking on this exposes the synchronization through the public API (arguable in this case, since it's an internals class), and it enables accidental lock stealing, i.e. a different class locking on Fetcher.this. I don't know if this is a concern now, though.
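To make the lock-object suggestion concrete, here is a minimal sketch. `SessionRegistry`, `nextEpoch`, `epoch`, and `runDemo` are hypothetical names made up for illustration and are not part of Kafka's `Fetcher` or `FetchSessionHandler`. The point is only that a private lock field cannot be acquired by code outside the class, whereas `synchronized (this)` can.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class for illustration only; not the actual Kafka Fetcher.
class SessionRegistry {
    // Private lock object: no code outside this class can synchronize on it,
    // so callers cannot accidentally "steal" the lock.
    private final Object lock = new Object();

    // Plain HashMap; every access is guarded by 'lock'.
    private final Map<Integer, Integer> epochs = new HashMap<>();

    // Increments the epoch for a node and returns the new value.
    int nextEpoch(int nodeId) {
        synchronized (lock) {
            return epochs.merge(nodeId, 1, Integer::sum);
        }
    }

    // Returns the current epoch for a node, or 0 if the node is unknown.
    int epoch(int nodeId) {
        synchronized (lock) {
            return epochs.getOrDefault(nodeId, 0);
        }
    }

    // Starts 'threads' workers that each bump node 1 'perThread' times,
    // waits for them, and returns the final epoch.
    static int runDemo(int threads, int perThread) {
        SessionRegistry registry = new SessionRegistry();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    registry.nextEpoch(1);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return registry.epoch(1);
    }

    public static void main(String[] args) {
        System.out.println("final epoch: " + runDemo(2, 1000));
    }
}
```

The trade-off raised in the thread still applies: a private lock only helps if every piece of shared state in the class is guarded by it.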
@viktorsomogyi Thanks for the review. There is also a sessionHandlers HashMap, which would need to become a ConcurrentHashMap. I wasn't sure if there was any other state. We do broad locking of the coordinator for thread-safety, so I thought the same approach for Fetcher would be the simplest and safest fix. Since this code is generally single-threaded and locking is only needed to avoid concurrent access from the heartbeat thread, I am not sure it matters so much. I will wait for @hachikuji's review and then update if required.
I see, then it's probably ok. I was missing the detail about the sessionHandlers map, but thanks for the heads-up :).
hachikuji
left a comment
@rajinisivaram Thanks for the patch. I guess if we use this approach, then completedFetches may not need to be a ConcurrentLinkedQueue any longer?
I think one of the problems here is that the need for concurrency control is a little obscured. I don't have any great ideas to fix it off the top of my head. Maybe we should just take the brute force approach and synchronize all of the Fetcher APIs (as well as the callbacks). That's pretty much what we ended up doing in AbstractCoordinator. Eventually I hope we can make the consumer more like the producer and the admin client, with all network IO happening in the background.
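The "brute force" option described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in this patch: `MiniFetcher`, `sendFetch`, `onResponse`, `completedCount`, and `runDemo` are hypothetical names. Every public method and every response callback is `synchronized` on the same monitor, so a plain `HashMap` and a plain queue are sufficient for the shared state.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical, heavily simplified stand-in for a fetcher; not Kafka's implementation.
class MiniFetcher {
    // Both collections are non-thread-safe and are guarded by the monitor of 'this'.
    private final Map<Integer, Integer> sessionEpochs = new HashMap<>();
    private final Queue<String> completed = new ArrayDeque<>();

    // API method: updates per-node session state and returns the new epoch.
    synchronized int sendFetch(int nodeId) {
        return sessionEpochs.merge(nodeId, 1, Integer::sum);
    }

    // Response callback: may run on a different thread than sendFetch,
    // so it takes the same lock before touching shared state.
    synchronized void onResponse(int nodeId, int epoch) {
        if (sessionEpochs.getOrDefault(nodeId, 0) >= epoch) {
            completed.add("node-" + nodeId + "@" + epoch);
        }
    }

    // API method: number of responses recorded so far.
    synchronized int completedCount() {
        return completed.size();
    }

    // Each worker acts as one node: it sends a fetch and immediately completes it.
    static int runDemo(int threads, int fetchesPerThread) {
        MiniFetcher fetcher = new MiniFetcher();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int nodeId = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < fetchesPerThread; i++) {
                    int epoch = fetcher.sendFetch(nodeId);
                    fetcher.onResponse(nodeId, epoch);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return fetcher.completedCount();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runDemo(3, 500));
    }
}
```

Coarse locking like this gives up parallelism inside the class, which matches the observation in the thread that the code is mostly single-threaded and the lock exists only to keep a second thread out.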
7a9ff7c to 203d0f3
@hachikuji I have documented the change and also added a test. I left |
hachikuji
left a comment
Thanks for the updates, Rajini. I just had a minor comment about the test case.
        }
    }
    if (fetcher.hasCompletedFetches()) {
        Map<TopicPartition, List<ConsumerRecord<byte[], byte[]>>> fetchedRecords = fetcher.fetchedRecords();
Might be useful to have some assertions which verify fetch progress. Like perhaps we can assert the last fetched offset after we complete fetchesRemaining?
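One way to read the suggestion about verifying fetch progress is sketched below. It does not use Kafka's test utilities: `FetchProgressSketch`, `fetch`, and `consume` are hypothetical stand-ins, and the fake fetch simply returns consecutive offsets. The idea is to track the next offset while `fetchesRemaining` counts down and then assert on the final position.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a progress assertion; not the test added in this patch.
class FetchProgressSketch {
    // Fake fetch: returns 'batchSize' consecutive offsets starting at 'fromOffset'.
    static List<Long> fetch(long fromOffset, int batchSize) {
        List<Long> offsets = new ArrayList<>();
        for (int i = 0; i < batchSize; i++) {
            offsets.add(fromOffset + i);
        }
        return offsets;
    }

    // Runs 'fetches' rounds and returns the next offset to fetch afterwards.
    static long consume(int fetches, int batchSize) {
        long nextOffset = 0;
        int fetchesRemaining = fetches;
        while (fetchesRemaining > 0) {
            List<Long> records = fetch(nextOffset, batchSize);
            if (!records.isEmpty()) {
                // Advance past the last record returned in this round.
                nextOffset = records.get(records.size() - 1) + 1;
            }
            fetchesRemaining--;
        }
        return nextOffset;
    }

    public static void main(String[] args) {
        long nextOffset = consume(10, 2);
        // Progress check: 10 fetches of 2 records each must end at offset 20.
        if (nextOffset != 20) {
            throw new AssertionError("expected next offset 20 but was " + nextOffset);
        }
        System.out.println("next offset: " + nextOffset);
    }
}
```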
@hachikuji Thanks for the review, updated the test.
hachikuji
left a comment
LGTM. Thanks for the patch!
This patch fixes unsafe concurrent access in the consumer by the heartbeat thread and the thread calling `poll()` to the fetch session state in `FetchSessionHandler`. Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Jason Gustafson <jason@confluent.io>