KAFKA-9087 Replace log high watermark by future log high watermark wh…#13075
KAFKA-9087 Replace log high watermark by future log high watermark wh…#13075chia7712 merged 8 commits intoapache:trunkfrom
Conversation
…en adding log dir fetchers
|
Is it possible to add a test? |
|
So the error occurs when the fetch offset is not the futureLog's log end offset. Just curious -- is this because the log's highwatermark is lower or higher than the futureLog's? It seems that the value we put into the InitialFetchState is the first offset we fetch from, so just wanted to clarify the race and why this fixes the issue. (Not doubting you, just trying to understand 😄 ) |
|
@jolshan : It seems that this is a regression introduced in #6841. When a future log is just created, the current log's HWM could be higher than the future replica's HWM (since it's bounded by its log end offset). When the future replica starts fetching, we want it to fetch from its local HWM, not the current log's HWM. Otherwise, there will be holes in the future log. |
higher, and thanks to @junrao for the great explanation. |
| // this method should use hw of future log to create log dir fetcher. Otherwise, it causes offset mismatch error | ||
| val result = rm.maybeAddLogDirFetchers(Set(partition), new LazyOffsetCheckpoints(rm.highWatermarkCheckpoints), _ => None) | ||
| assertEquals(1, result.size) | ||
| assertEquals(0L, result(new TopicPartition(topic, 0)).initOffset) |
|
unrelated error will merge it |
…9-jan-2023 * apache/trunk: (16 commits) KAFKA-14570: Fix parenthesis in verifyFullFetchResponsePartitions output (apache#13072) MINOR: Remove public mutable fields from ProducerAppendInfo (apache#13091) KAFKA-14558: Move/Rewrite LastRecord, TxnMetadata, BatchMetadata, ProducerStateEntry, and ProducerAppendInfo to the storage module (apache#13043) KAFKA-14535: Fix flaky EndToEndAuthorization tests which were sensitive to ACL change reordering (apache#13086) KAFKA-9087 Replace log high watermark by future log high watermark wh… (apache#13075) MINOR: add error reason when controller failed to handle events (apache#13050) MINOR: doc: note how JDK-8136913 can affect client SASL (apache#13071) 2023 (apache#13083) KAFKA-14279; Add 3.3.x to core compatibility tests (apache#13076) MINOR Fixed doc generation for LogConfig class in genTopicConfigDocs. (apache#13079) ...
apache#13075) Reviewers: Ismael Juma <ismael@juma.me.uk>, Justine Olshan <jolshan@confluent.io>, Jun Rao <junrao@gmail.com>
https://issues.apache.org/jira/browse/KAFKA-9087
The race condition of processing LeaderAndIsrRequest and AlterReplicaLogDirsRequest causes above error (on V2 message format). Also, the error can be reproduced easily on V1 since there is no epoch cache. The ReplicaAlterLogDirsThread checks the offset of “future log” rather than “log. Hence, here is my two cents, we can replace log.highWatermark by futureLog.highWatermark to resolve this issue. I tested it on our cluster and it works well (on both V1 and V2).
Committer Checklist (excluded from commit message)