KAFKA-12181; Loosen raft fetch offset validation of remote replicas #10309
hachikuji merged 7 commits into apache:trunk
Conversation
jsancio
left a comment
Thanks for the PR. Looks good in general.
jsancio
left a comment
LGTM. Thanks for the improvement.
abbccdda
left a comment
Thanks for the patch, left some comments.
      Set<Integer> voters,
-     Set<Integer> grantingVoters
+     Set<Integer> grantingVoters,
+     LogContext logContext
Just for my own education, when is it preferable to use the enclosing class's log context vs creating our own log context?
The log context is useful because it carries with it a logging prefix which can be used to distinguish log messages. For example, in a streams application, the fact that we have multiple producers can make debugging difficult. Or in the context of integration/system/simulation testing, we often get logs from multiple nodes mixed together. With a common prefix, it is easy to grep messages for a particular instance so long as the LogContext is carried through to all the dependencies. Sometimes it is a little annoying to add the extra parameter, but it is worthwhile for improved debugging whenever the parent object already has a log context.
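The prefix-carrying pattern described above can be sketched in a few lines. This is a minimal, dependency-free stand-in, not Kafka's actual `org.apache.kafka.common.utils.LogContext` (which wraps SLF4J loggers); the `RaftManager nodeId=1` prefix is just an illustrative example of how a per-instance prefix makes mixed logs greppable.

```java
// Minimal sketch of the prefix-logger pattern: one LogContext per
// instance, handed to all dependencies, so every message carries a
// common prefix that identifies the instance.
public class LogContextSketch {

    static final class LogContext {
        private final String prefix;
        LogContext(String prefix) { this.prefix = prefix; }
        PrefixedLogger logger(Class<?> clazz) { return new PrefixedLogger(prefix); }
    }

    static final class PrefixedLogger {
        private final String prefix;
        PrefixedLogger(String prefix) { this.prefix = prefix; }
        // Returns the full line so the behavior is easy to verify.
        String info(String msg) {
            String line = prefix + msg;
            System.out.println(line);
            return line;
        }
    }

    public static void main(String[] args) {
        // One context per node; grepping "[RaftManager nodeId=1]"
        // isolates this instance's messages even in interleaved logs.
        LogContext node1 = new LogContext("[RaftManager nodeId=1] ");
        PrefixedLogger log = node1.logger(LogContextSketch.class);
        log.info("Became candidate in epoch 2");
    }
}
```

The key point is the plumbing, not the class itself: the prefix only helps if the `LogContext` is threaded through to every dependency the parent object creates.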
public void testNoOpForNegativeRemoteNodeId() {
    int observerId = -1;
    long endOffset = 10L;
    long epochStartOffset = 10L;
So this offset was named wrong previously?
I'm not sure I'd call it wrong. The epoch start offset is initialized as the current log end offset. But I thought it was better to choose a more explicit name.
    throw new IllegalStateException("Detected non-monotonic update of local " +
        "end offset: " + currentEndOffset.offset + " -> " + endOffsetMetadata.offset);
} else {
    log.warn("Detected non-monotonic update of fetch offset from nodeId {}: {} -> {}",
I wonder whether the current approach is too loose. Maybe this is already done, but do we want to inform failed replica to cleanup or truncate in the FetchResponse?
The situation we are trying to handle is when a follower loses its disk. Basically the damage is already done by the time we receive the Fetch and the only thing we can do is let the follower try to catch back up. The problem with the old logic is that it prevented this even in situations which would not violate guarantees. I am planning to file a follow-up jira to think of some ways to handle disk loss situations more generally. We would like to at least detect the situation and see if we can prevent it from causing too much damage.
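The relaxed check discussed here can be sketched as follows. The names (`ReplicaState`, `updateFetchOffset`, `isLocalLog`) are illustrative, not the exact `KafkaRaftClient` code: a non-monotonic update to the leader's own log end offset remains a fatal invariant violation, while a non-monotonic fetch offset from a remote replica only logs a warning and is accepted, so a follower that lost its disk can re-fetch from an earlier offset and catch back up.

```java
// Hedged sketch of the relaxed fetch-offset validation, assuming
// hypothetical names; the real logic lives in KafkaRaftClient/LeaderState.
public class FetchOffsetValidationSketch {

    static final class ReplicaState {
        final int nodeId;
        long endOffset = -1L; // highest offset observed so far
        ReplicaState(int nodeId) { this.nodeId = nodeId; }
    }

    /** Records the replica's fetch offset and returns the stored value. */
    static long updateFetchOffset(ReplicaState replica, long fetchOffset, boolean isLocalLog) {
        if (fetchOffset < replica.endOffset) {
            if (isLocalLog) {
                // The leader's own log end offset must never move backwards.
                throw new IllegalStateException("Detected non-monotonic update of local " +
                    "end offset: " + replica.endOffset + " -> " + fetchOffset);
            } else {
                // A follower that lost its disk may legitimately fetch from an
                // earlier offset; warn and let it catch back up.
                System.out.printf("WARN Detected non-monotonic update of fetch offset " +
                    "from nodeId %d: %d -> %d%n",
                    replica.nodeId, replica.endOffset, fetchOffset);
            }
        }
        replica.endOffset = fetchOffset;
        return replica.endOffset;
    }

    public static void main(String[] args) {
        ReplicaState follower = new ReplicaState(2);
        updateFetchOffset(follower, 10L, false); // normal progress
        updateFetchOffset(follower, 5L, false);  // disk loss: warn, accept, recover
    }
}
```

As the thread notes, this validation was never a real safeguard against data loss: by the time the leader sees the regressed fetch offset, the committed state on the follower is already gone, so rejecting the fetch only prevents recovery.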
…pache#10309) Currently the Raft leader raises an exception if there is a non-monotonic update to the fetch offset of a replica. In a situation where the replica had lost its disk state, this would prevent the replica from being able to recover. In this patch, we relax the validation to address this problem. It is worth pointing out that this validation could not be relied on to protect from data loss after a voter has lost committed state. Reviewers: José Armando García Sancio <jsancio@gmail.com>, Boyang Chen <boyang@confluent.io>