-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[Fix](cloud-mow) Compaciton should release delete bitmap lock when abort fail #47766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1404,7 +1404,12 @@ int64_t CloudCompactionMixin::get_compaction_permits() { | |
|
|
||
| CloudCompactionMixin::CloudCompactionMixin(CloudStorageEngine& engine, CloudTabletSPtr tablet, | ||
| const std::string& label) | ||
| : Compaction(tablet, label), _engine(engine) {} | ||
| : Compaction(tablet, label), _engine(engine) { | ||
| auto uuid = UUIDGenerator::instance()->next_uuid(); | ||
| std::stringstream ss; | ||
| ss << uuid; | ||
| _uuid = ss.str(); | ||
| } | ||
|
|
||
| Status CloudCompactionMixin::execute_compact_impl(int64_t permits) { | ||
| OlapStopWatch watch; | ||
|
|
@@ -1439,11 +1444,26 @@ Status CloudCompactionMixin::execute_compact_impl(int64_t permits) { | |
| return Status::OK(); | ||
| } | ||
|
|
||
| int64_t CloudCompactionMixin::initiator() const { | ||
| return HashUtil::hash64(_uuid.data(), _uuid.size(), 0) & std::numeric_limits<int64_t>::max(); | ||
| } | ||
|
|
||
| Status CloudCompactionMixin::execute_compact() { | ||
| TEST_INJECTION_POINT("Compaction::do_compaction"); | ||
| int64_t permits = get_compaction_permits(); | ||
| HANDLE_EXCEPTION_IF_CATCH_EXCEPTION(execute_compact_impl(permits), | ||
| [&](const doris::Exception& ex) { garbage_collection(); }); | ||
| HANDLE_EXCEPTION_IF_CATCH_EXCEPTION( | ||
| execute_compact_impl(permits), [&](const doris::Exception& ex) { | ||
| auto st = garbage_collection(); | ||
| if (!st.ok() && initiator() != INVALID_COMPACTION_INITIATOR_ID) { | ||
| // if compaction fail, be will try to abort compaction, and delete bitmap lock | ||
| // will release if abort job successfully, but if abort failed, delete bitmap | ||
| // lock will not release, in this situation, be need to send this rpc to ms | ||
| // to try to release delete bitmap lock. | ||
| _engine.meta_mgr().remove_delete_bitmap_update_lock( | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "If the RPC to release the lock fails here, will the lock still not be released?"
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, this pr just try to release lock, it may fail on TXN_KV_CONFICT, this problem has existed for a long time, if release rpc fail, this lock will be released when lock is expired.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @luwei16 It's ok if the rpc failed to release lock here, it's the same with rpc failed on commit phase. |
||
| _tablet->table_id(), COMPACTION_DELETE_BITMAP_LOCK_ID, initiator(), | ||
| _tablet->tablet_id()); | ||
| } | ||
| }); | ||
| _load_segment_to_cache(); | ||
| return Status::OK(); | ||
| } | ||
|
|
@@ -1488,9 +1508,9 @@ Status CloudCompactionMixin::construct_output_rowset_writer(RowsetWriterContext& | |
| return Status::OK(); | ||
| } | ||
|
|
||
| void CloudCompactionMixin::garbage_collection() { | ||
| Status CloudCompactionMixin::garbage_collection() { | ||
| if (!config::enable_file_cache) { | ||
| return; | ||
| return Status::OK(); | ||
| } | ||
| if (_output_rs_writer) { | ||
| auto* beta_rowset_writer = dynamic_cast<BaseBetaRowsetWriter*>(_output_rs_writer.get()); | ||
|
|
@@ -1501,6 +1521,7 @@ void CloudCompactionMixin::garbage_collection() { | |
| file_cache->remove_if_cached_async(file_key); | ||
| } | ||
| } | ||
| return Status::OK(); | ||
| } | ||
|
|
||
| void CloudCompactionMixin::update_compaction_level() { | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not return the status?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove_delete_bitmap_update_lock just try to release lock, some times it may fail, like lock id not found beacause lock is expired and loading task has already taken the lock, in this situation, we can do nothing, just print a warnning lock is enought, so no need to return a wrong statu