[KAFKA-13369] Follower fetch protocol changes for tiered storage. #11390
junrao merged 46 commits into apache:trunk
@satishd It'd be helpful if you could please update the PR description explaining the scope of the draft PR (in its current form) and what's remaining to be done.
Thanks @junrao for the review. Please find my inline replies; I addressed most of the comments in the latest commits.
    case Errors.OFFSET_MOVED_TO_TIERED_STORAGE =>
      // No need to retry this as it indicates that the requested offset is moved to tiered storage.
      // Check whether topicId is available here.
Was this comment addressed?
The return value is not very intuitive. I'd expect a true return value to indicate that the request is handled successfully.
This is in line with the handleOutOfRangeError contract. I am fine with the suggested change, but it would be good to keep the semantics consistent with the handleOutOfRangeError method for uniformity.
@junrao: Thanks for the review. Please find my inline replies; I updated the PR to address them in the latest commit.
Summary: Added more UTs for RemoteLogManager
Test Plan: UT
Reviewers: abhijeek, satishd
Reviewed By: satishd
Revert Plan: git revert
API Changes: NA
Monitoring and Alerts: NA

Summary:
- The latest metadata version IBP_3_3_IV4 was not referenced in the existing tests, which led to the failures.
- After the local and remote leader endpoint changes, the fetcher thread tests related to remote storage became stale; fixed them.
- Fixed the issue while renaming the parent directory of the index files.
Test Plan: ./gradlew :core:test ./gradlew :metadata:test
Reviewers: abhijeek, satishd
Revert Plan: git revert
API Changes: NA
Monitoring and Alerts: NA

- Added loading of the existing indexes on disk and added a corresponding test. - Addressed other minor comments.
…h setting the right metadata max version
Resolved a few test cases related to new versions.
…amp as the first message timestamp
…or getIndex if it is already closed. Addressed review comments
Updated the high watermark when the producer state is rebuilt in the fetcher. Throwing RemoteStorageException from the follower replica when remote storage is not yet enabled.
…emoteLogManagerTest
Thanks @junrao for your updated review. I replied with inline comments and addressed them in the latest commits.
    def loadProducerState(lastOffset: Long): Unit = lock synchronized {
      rebuildProducerState(lastOffset, producerStateManager)
      maybeIncrementFirstUnstableOffset()
      updateHighWatermark(localLog.logEndOffset)
logEndOffset still uses offset. We want to use logEndOffsetMetadata.
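The distinction in this review comment can be illustrated with a minimal Java sketch. It assumes that the metadata variant carries the segment position alongside the offset, so that passing only the bare offset loses that position. `HighWatermarkSketch` and `OffsetMetadata` are hypothetical stand-ins written for this illustration; they are not the actual UnifiedLog or Kafka offset metadata types.

```java
// Hypothetical stand-ins for illustration; not Kafka's UnifiedLog or its offset metadata class.
final class HighWatermarkSketch {

    /** An offset plus, optionally, where that offset lives inside a segment file. */
    static final class OffsetMetadata {
        static final long UNKNOWN = -1L;
        final long messageOffset;
        final long segmentBaseOffset;
        final int relativePosition;

        OffsetMetadata(long messageOffset, long segmentBaseOffset, int relativePosition) {
            this.messageOffset = messageOffset;
            this.segmentBaseOffset = segmentBaseOffset;
            this.relativePosition = relativePosition;
        }

        /** Wraps a bare offset; the segment position stays unknown. */
        static OffsetMetadata offsetOnly(long offset) {
            return new OffsetMetadata(offset, UNKNOWN, -1);
        }

        boolean hasSegmentPosition() {
            return segmentBaseOffset != UNKNOWN;
        }
    }

    private OffsetMetadata highWatermark = OffsetMetadata.offsetOnly(0L);

    /** Offset-only update: the stored high watermark has no segment position. */
    void updateHighWatermark(long offset) {
        highWatermark = OffsetMetadata.offsetOnly(offset);
    }

    /** Metadata update: the stored high watermark keeps the segment position. */
    void updateHighWatermark(OffsetMetadata metadata) {
        highWatermark = metadata;
    }

    OffsetMetadata highWatermark() {
        return highWatermark;
    }

    public static void main(String[] args) {
        HighWatermarkSketch log = new HighWatermarkSketch();
        OffsetMetadata logEnd = new OffsetMetadata(100L, 80L, 4096);

        log.updateHighWatermark(logEnd.messageOffset);
        System.out.println(log.highWatermark().hasSegmentPosition()); // prints false

        log.updateHighWatermark(logEnd);
        System.out.println(log.highWatermark().hasSegmentPosition()); // prints true
    }
}
```

In this sketch both calls store the same offset; only the second preserves the position information that a later read could use without another lookup.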
…n UnifiedLog#loadProducerState
Thanks @junrao for the review; I addressed it in the latest commit. A few tests failed, but they do not seem to be related to this PR.
Hey @satishd, I wanted to let you know about KAFKA-14470 as I think it affects some of the future KIP-405 PRs. Can we align these efforts so that we can get to the desired end state faster? For example, once the PRs that have been submitted are merged, we can move
This PR implements the follower fetch protocol described in KIP-405.
Added a new version of the ListOffsets protocol to receive the local log start offset on the leader replica. Follower replicas use it to find the local log start offset on the leader.
Added a new version of the FetchRequest protocol to receive the OffsetMovedToTieredStorageException error. This is part of the enhanced fetch protocol described in KIP-405.
We introduced a new field, localLogStartOffset, to maintain the log start offset in the local logs. The existing logStartOffset continues to be the log start offset of the effective log, which includes the segments in remote storage.
When a follower receives OffsetMovedToTieredStorage, it tries to build the required state from the leader and remote storage so that it is ready to move to the fetch state.
Introduced RemoteLogManager, which is responsible for the RemoteStorageManager and RemoteLogMetadataManager instances.
Follow-up PRs will add more functionality, such as copying segments to tiered storage and retention checks to clean local and remote log segments. This will change the local log start offset and make sure the follower fetch protocol works in several cases.
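The relationship between the two start offsets in the description can be sketched in Java as follows. This is a simplified illustration, not the broker's actual logic: `TieredLogSketch`, `FetchOutcome`, and `classifyFollowerFetch` are hypothetical names, and the classification rule is an assumption drawn from the description (offsets below the local log start offset but still inside the effective log have moved to tiered storage).

```java
// Hypothetical sketch of the offset ranges described above; not Kafka's broker code.
final class TieredLogSketch {

    enum FetchOutcome { OK, OFFSET_OUT_OF_RANGE, OFFSET_MOVED_TO_TIERED_STORAGE }

    final long logStartOffset;      // start of the effective log, including remote segments
    final long localLogStartOffset; // start of the segments still held on local disk
    final long logEndOffset;

    TieredLogSketch(long logStartOffset, long localLogStartOffset, long logEndOffset) {
        if (logStartOffset > localLogStartOffset || localLogStartOffset > logEndOffset) {
            throw new IllegalArgumentException(
                "expected logStartOffset <= localLogStartOffset <= logEndOffset");
        }
        this.logStartOffset = logStartOffset;
        this.localLogStartOffset = localLogStartOffset;
        this.logEndOffset = logEndOffset;
    }

    /** Classifies a follower fetch offset against the three boundaries (assumed rule). */
    FetchOutcome classifyFollowerFetch(long fetchOffset) {
        if (fetchOffset < logStartOffset || fetchOffset > logEndOffset) {
            return FetchOutcome.OFFSET_OUT_OF_RANGE;
        }
        if (fetchOffset < localLogStartOffset) {
            // The data exists, but only in remote storage.
            return FetchOutcome.OFFSET_MOVED_TO_TIERED_STORAGE;
        }
        return FetchOutcome.OK;
    }

    public static void main(String[] args) {
        // Offsets 0..499 are in remote storage only; 500..999 are local.
        TieredLogSketch log = new TieredLogSketch(0L, 500L, 1000L);
        System.out.println(log.classifyFollowerFetch(100L));  // OFFSET_MOVED_TO_TIERED_STORAGE
        System.out.println(log.classifyFollowerFetch(700L));  // OK
        System.out.println(log.classifyFollowerFetch(1500L)); // OFFSET_OUT_OF_RANGE
    }
}
```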
You can look at the detailed protocol changes in KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage#KIP405:KafkaTieredStorage-FollowerReplication
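The follower-side handling outlined in the PR description can be sketched roughly as below. This is a sketch under stated assumptions, not the PR's implementation: the `Leader` and `RemoteStateBuilder` interfaces, the `State` enum, and the choice to resume fetching from the leader's local log start offset are all hypothetical simplifications, and the exact state that must be rebuilt is left abstract.

```java
// Hypothetical sketch; the interfaces below are illustrative and are not Kafka APIs.
final class FollowerTieredFetchSketch {

    /** Stand-in for asking the leader for its local log start offset (e.g. via ListOffsets). */
    interface Leader {
        long localLogStartOffset();
    }

    /** Stand-in for rebuilding follower state from the leader and remote storage. */
    interface RemoteStateBuilder {
        void buildStateUpTo(long offset);
    }

    enum State { FETCHING, BUILDING_REMOTE_STATE }

    private final Leader leader;
    private final RemoteStateBuilder remote;
    private State state = State.FETCHING;
    private long fetchOffset;

    FollowerTieredFetchSketch(Leader leader, RemoteStateBuilder remote, long initialFetchOffset) {
        this.leader = leader;
        this.remote = remote;
        this.fetchOffset = initialFetchOffset;
    }

    /** Invoked when a fetch response carries the offset-moved-to-tiered-storage error. */
    void onOffsetMovedToTieredStorage() {
        state = State.BUILDING_REMOTE_STATE;
        long leaderLocalStart = leader.localLogStartOffset();
        remote.buildStateUpTo(leaderLocalStart);
        // Assumption: the follower resumes from the first offset the leader still holds locally.
        fetchOffset = leaderLocalStart;
        state = State.FETCHING;
    }

    State state() {
        return state;
    }

    long fetchOffset() {
        return fetchOffset;
    }

    public static void main(String[] args) {
        FollowerTieredFetchSketch follower = new FollowerTieredFetchSketch(
            () -> 500L,
            offset -> System.out.println("building state up to offset " + offset),
            120L);
        follower.onOffsetMovedToTieredStorage();
        System.out.println(follower.state() + " from offset " + follower.fetchOffset());
    }
}
```

The sketch ignores failure handling; a real follower would also need to deal with errors while the state is being built.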
Authors: satishd@apache.org, kamal.chandraprakash@gmail.com, yingz@uber.com