[fix](clone) fix stale tablet report miss the new cloning replica #38695
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
run buildall
clang-tidy made some suggestions
be/src/olap/task/engine_clone_task.h (outdated)
        std::vector<TTabletInfo>* tablet_infos);
    ~EngineCloneTask() override = default;

public:
warning: redundant access specifier has the same accessibility as the previous access specifier [readability-redundant-access-specifiers]
public:
Additional context
be/src/olap/task/engine_clone_task.h:49: previously declared here
public:
^
TPC-H: Total hot run time: 41337 ms
dataroaring
left a comment
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
TPC-DS: Total hot run time: 169973 ms
ClickBench: Total hot run time: 30.05 s
deardeng
left a comment
LGTM
run external
run buildall
eb8087f to 3fc4f8e
run buildall
run buildall
run buildall
clang-tidy review says "All clean, LGTM! 👍"
run performance
TPC-H: Total hot run time: 41295 ms
TPC-DS: Total hot run time: 169677 ms
ClickBench: Total hot run time: 29.61 s
dataroaring
left a comment
LGTM
…8695)
BUG:
1. BE begins collecting a tablet report.
2. BE clones a new replica A.
3. FE handles this BE's tablet report from step 1. The report is stale: it does not include replica A, so FE marks replica A as bad.
Only about 1 minute later, when BE reports its tablets again, does the new report contain replica A. Only after that does FE change replica A from bad back to good.
Fix:
When BE clones a new replica, it increases its report version and tells FE to update it. When FE then handles the stale tablet report, it compares the report against BE's report version, finds that the tablet report is stale, and discards it.
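The version check described in the fix can be sketched as a small self-contained model. This is only an illustration under stated assumptions: `Backend`, `Frontend`, `TabletReport`, and all of their members are hypothetical names invented for this sketch and are not the actual Doris BE or FE classes. It also assumes that a report carries the report version BE had when collection began.

```cpp
#include <cstdint>
#include <set>
#include <unordered_map>

// Hypothetical, simplified model of the report-version check.
// None of these types are the real Doris classes.
struct TabletReport {
    int64_t backend_id;
    int64_t report_version;        // BE's report version when collection began
    std::set<int64_t> tablet_ids;  // replicas visible at collection time
};

class Backend {
public:
    explicit Backend(int64_t id) : id_(id) {}

    // Step 1: snapshot the current tablets together with the current version.
    TabletReport begin_report() const { return {id_, report_version_, tablets_}; }

    // Step 2: a clone finishes. Register the replica and bump the report
    // version; the returned value is what BE would tell FE to record.
    int64_t finish_clone(int64_t tablet_id) {
        tablets_.insert(tablet_id);
        return ++report_version_;
    }

private:
    int64_t id_;
    int64_t report_version_ = 0;
    std::set<int64_t> tablets_;
};

class Frontend {
public:
    // Record the newest report version announced by a backend.
    void update_report_version(int64_t backend_id, int64_t version) {
        int64_t& known = known_versions_[backend_id];
        if (version > known) known = version;
    }

    // Step 3: returns true if the report was processed, false if it was
    // discarded because it is older than the version FE already knows.
    bool handle_tablet_report(const TabletReport& report) {
        auto it = known_versions_.find(report.backend_id);
        if (it != known_versions_.end() && report.report_version < it->second) {
            return false;  // stale: collected before the clone finished
        }
        // A real FE would reconcile replica state here; the sketch only
        // reports that the report was accepted.
        return true;
    }

private:
    std::unordered_map<int64_t, int64_t> known_versions_;
};

// Replays the three steps from the bug description with the fix in place.
bool replay_race() {
    Backend be(1);
    Frontend fe;
    TabletReport stale = be.begin_report();   // step 1: report collected
    int64_t version = be.finish_clone(42);    // step 2: replica cloned
    fe.update_report_version(1, version);     // fix: BE tells FE the new version
    bool stale_processed = fe.handle_tablet_report(stale);  // step 3
    bool fresh_processed = fe.handle_tablet_report(be.begin_report());
    return !stale_processed && fresh_processed;
}
```

In this model, `replay_race()` returns `true`: the report collected before the clone is discarded, so the new replica is never marked bad, while the next report (which includes the replica) is accepted.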