
KAFKA-12181; Loosen raft fetch offset validation of remote replicas #10309

Merged
hachikuji merged 7 commits into apache:trunk from hachikuji:KAFKA-12181
Mar 22, 2021

Conversation

@hachikuji
Contributor

Currently the Raft leader raises an exception if there is a non-monotonic update to the fetch offset of a replica. In a situation where the replica had lost its disk state, this would prevent the replica from being able to recover. In this patch, we relax the validation to address this problem. It is worth pointing out that this validation could not be relied on to protect from data loss after a voter has lost committed state.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Member

@jsancio jsancio left a comment


Thanks for the PR. Looks good in general.

Comment thread raft/src/main/java/org/apache/kafka/raft/LeaderState.java Outdated
Comment thread raft/src/test/java/org/apache/kafka/raft/LeaderStateTest.java Outdated
Comment thread raft/src/test/java/org/apache/kafka/raft/RaftEventSimulationTest.java Outdated
Comment thread raft/src/test/java/org/apache/kafka/raft/RaftEventSimulationTest.java Outdated
Member

@jsancio jsancio left a comment


LGTM. Thanks for the improvement.


@abbccdda abbccdda left a comment


Thanks for the patch, left some comments.

        Set<Integer> voters,
-       Set<Integer> grantingVoters
+       Set<Integer> grantingVoters,
+       LogContext logContext

Just for my own education, when is it preferable to use the enclosing class's log context vs creating our own log context?

Contributor Author


The log context is useful because it carries with it a logging prefix which can be used to distinguish log messages. For example, in a streams application, the fact that we have multiple producers can make debugging difficult. Or in the context of integration/system/simulation testing, we often get logs from multiple nodes mixed together. With a common prefix, it is easy to grep messages for a particular instance so long as the LogContext is carried through to all the dependencies. Sometimes it is a little annoying to add the extra parameter, but it is worthwhile for improved debugging whenever the parent object already has a log context.
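The prefix idea can be sketched in a few lines. The `LogContext` and `PrefixedLogger` classes below are simplified stand-ins for Kafka's `org.apache.kafka.common.utils.LogContext` (which wraps SLF4J loggers), written to be self-contained rather than to mirror the real implementation:

```java
// Simplified stand-in for Kafka's LogContext: every logger created from
// the context shares a prefix, so log lines from one node/instance are
// easy to grep in interleaved output.
public class LogContextSketch {

    static final class LogContext {
        private final String prefix;

        LogContext(String prefix) {
            this.prefix = prefix;
        }

        // In Kafka this returns an SLF4J Logger wrapper; here we return
        // a tiny formatter to keep the sketch self-contained.
        PrefixedLogger logger(Class<?> clazz) {
            return new PrefixedLogger(prefix + clazz.getSimpleName() + ": ");
        }
    }

    static final class PrefixedLogger {
        private final String prefix;

        PrefixedLogger(String prefix) {
            this.prefix = prefix;
        }

        String format(String message) {
            return prefix + message;
        }
    }

    public static void main(String[] args) {
        LogContext ctx = new LogContext("[RaftManager nodeId=1] ");
        PrefixedLogger log = ctx.logger(LogContextSketch.class);
        System.out.println(log.format("Became leader in epoch 5"));
    }
}
```

Passing the same `LogContext` down to dependencies is what makes the shared prefix appear on every line for that instance.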

Comment thread raft/src/test/java/org/apache/kafka/raft/RaftEventSimulationTest.java Outdated
    public void testNoOpForNegativeRemoteNodeId() {
        int observerId = -1;
        long endOffset = 10L;
        long epochStartOffset = 10L;

So this offset was named wrong previously?

Contributor Author


I'm not sure I'd call it wrong. The epoch start offset is initialized as the current log end offset. But I thought it was better to choose a more explicit name.
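The relationship being described can be shown in a toy snippet (variable names are illustrative, not taken from the test):

```java
// When a leader is elected, the start offset of its new epoch is simply
// the current end offset of its log, which is why both test values above
// are 10L and why "epochStartOffset" is the more explicit name.
public class EpochStartOffsetSketch {
    public static void main(String[] args) {
        long logEndOffset = 10L;              // offsets 0..9 already written
        long epochStartOffset = logEndOffset; // new epoch begins where the log ends
        System.out.println("epochStartOffset=" + epochStartOffset);
    }
}
```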

            throw new IllegalStateException("Detected non-monotonic update of local " +
                "end offset: " + currentEndOffset.offset + " -> " + endOffsetMetadata.offset);
        } else {
            log.warn("Detected non-monotonic update of fetch offset from nodeId {}: {} -> {}",
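The shape of the relaxed validation can be sketched as follows. A non-monotonic end-offset update for the leader's own log is treated as a fatal invariant violation, while one from a remote replica (which may have lost its disk state) is only logged so the replica can catch back up. `ReplicaState` and `updateEndOffset` are illustrative names for this sketch, not the actual `LeaderState.java` code:

```java
// Hedged sketch of the relaxed check: fatal for the local leader log,
// warn-and-accept for remote replicas.
public class FetchOffsetValidationSketch {

    static final class ReplicaState {
        final int nodeId;
        final boolean isLocalLeader;
        long endOffset = -1L;

        ReplicaState(int nodeId, boolean isLocalLeader) {
            this.nodeId = nodeId;
            this.isLocalLeader = isLocalLeader;
        }

        void updateEndOffset(long newEndOffset) {
            if (newEndOffset < endOffset) {
                if (isLocalLeader) {
                    // The leader's own log must never move backwards.
                    throw new IllegalStateException("Detected non-monotonic update of local " +
                        "end offset: " + endOffset + " -> " + newEndOffset);
                } else {
                    // Warn and accept: the follower will re-fetch and catch up.
                    System.out.printf("Detected non-monotonic update of fetch offset " +
                        "from nodeId %d: %d -> %d%n", nodeId, endOffset, newEndOffset);
                }
            }
            endOffset = newEndOffset;
        }
    }

    public static void main(String[] args) {
        ReplicaState follower = new ReplicaState(2, false);
        follower.updateEndOffset(10L);
        follower.updateEndOffset(0L); // accepted with a warning
        System.out.println("follower endOffset=" + follower.endOffset);

        ReplicaState leader = new ReplicaState(1, true);
        leader.updateEndOffset(10L);
        try {
            leader.updateEndOffset(5L);
        } catch (IllegalStateException e) {
            System.out.println("leader rejected: " + e.getMessage());
        }
    }
}
```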

I wonder whether the current approach is too loose. Maybe this is already done, but do we want to inform failed replica to cleanup or truncate in the FetchResponse?

Contributor Author


The situation we are trying to handle is when a follower loses its disk. Basically the damage is already done by the time we receive the Fetch and the only thing we can do is let the follower try to catch back up. The problem with the old logic is that it prevented this even in situations which would not violate guarantees. I am planning to file a follow-up jira to think of some ways to handle disk loss situations more generally. We would like to at least detect the situation and see if we can prevent it from causing too much damage.


@abbccdda abbccdda left a comment


LGTM

@hachikuji hachikuji merged commit f5f66b9 into apache:trunk Mar 22, 2021
Terrdi pushed a commit to Terrdi/kafka that referenced this pull request Apr 1, 2021
…pache#10309)

Currently the Raft leader raises an exception if there is a non-monotonic update to the fetch offset of a replica. In a situation where the replica had lost its disk state, this would prevent the replica from being able to recover. In this patch, we relax the validation to address this problem. It is worth pointing out that this validation could not be relied on to protect from data loss after a voter has lost committed state.

Reviewers: José Armando García Sancio <jsancio@gmail.com>, Boyang Chen <boyang@confluent.io>
@ijuma ijuma added the kraft label Aug 11, 2021
