-
Notifications
You must be signed in to change notification settings - Fork 3.7k
#6386 Enforce null check at Catalog.getDb and Database.getTable #6416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a special Exception to handle those "replay" logic.
I am worried that
MetaNotFoundExceptionwill be thrown elsewhere in the replay logic, but it may be unexpected.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is considerable. But "replay" throwing any exception is an extremely big risk, which will cause all FE crush and cannot recover. These
MetaNotFoundExceptionare mostly thrown by getDb and getTable, due to the lock inconsistence that makes editlogs out of order. Semantically, throwing this exception means some metadata the editlog want to edit on is missing during replay, which makes this replay unable to continue. Just like getDb/Table, This kind of inconsistence cannot recover anyway, and these editlogs are theoretically only affect lost metadata.I prefer to use
MetaNotFoundExceptionthat indicate metadata has lost rather than other exception which may cause confuse. The inconsistence can be checked from warning log.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ccoffline how about check db or table state after getting db or table lock , so that the edit log will always keep in order? we don't need some much check code?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@caiconghui we don't have any promise that the edit logs are in order. These check code is to prevent the worst from happening.
Edit logs that out of order may cause meta inconsistent, which has to be fixed sooner or later. We are exploring ways to ensure consistency and minimize the cost. Until then, we have to check all NPE or let all the FE to crash.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ccoffline so we only need to simply check null for getdb and get table in replay operation and just simply skip the following operation? now, the only way that fe may crash is with that replay operation when fe restart?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it better to split this PR into two PRs ? one for code refactor and another for db and table null check, I think it is easy to review and not-error-prone when PR is not so big? @ccoffline