-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[opt](topn)Optimize the time for topn to lazy materialized reading of external tables #52114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
|
run buildall |
TPC-H: Total hot run time: 35153 ms |
TPC-DS: Total hot run time: 186100 ms |
ClickBench: Total hot run time: 29.16 s |
|
run buildall |
TPC-H: Total hot run time: 33815 ms |
TPC-DS: Total hot run time: 184906 ms |
ClickBench: Total hot run time: 29.38 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 33885 ms |
TPC-DS: Total hot run time: 184507 ms |
ClickBench: Total hot run time: 30.09 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 34244 ms |
TPC-DS: Total hot run time: 184964 ms |
ClickBench: Total hot run time: 29.91 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
morningman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
PR approved by at least one committer and no changes requested. |
be/src/exec/rowid_fetcher.cpp
Outdated
| } | ||
|
|
||
| { | ||
| stringstream ss; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not use stringstream use fmt to get better performance
be/src/exec/rowid_fetcher.cpp
Outdated
| RETURN_IF_ERROR(vfile_scanner_ptr->read_one_line_from_range( | ||
| scan_range_desc, request_block_desc.row_id(j), &result_block, external_info, | ||
| init_reader_ms, get_block_ms)); | ||
| dst_col->insert_range_from(src_col, block_idx, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
length 1, can use insert_from to do the work。
if want speed up the call, also can call insert_from_multi_column
be/src/pipeline/dependency.cpp
Outdated
| .request = multi_get_request, | ||
| .callback = nullptr, | ||
| .rpc_timer = MonotonicStopWatch()}); | ||
| // if (profile != nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if unless should delete
be/src/pipeline/dependency.cpp
Outdated
| return Status::OK(); | ||
| } | ||
|
|
||
| Status MaterializationSharedState::_update_profile_info(int64_t backend_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the func only return OK, should be void
be/src/pipeline/dependency.cpp
Outdated
| } | ||
| auto& info_map = backend_profile_info_string[backend_id]; | ||
|
|
||
| auto func = [&](const std::string& info_key) -> Status { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Choose a meaningful name
| child.to_proto(proto_counters, child_counter_map); | ||
| } | ||
|
|
||
| // Recurse into children |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if unless,should detele
be/src/util/runtime_profile.h
Outdated
| } | ||
|
|
||
| PProfileCounter to_proto_peak(const std::string& name) const { | ||
| PProfileCounter counter; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same code like to_proto just call to_proto
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 32929 ms |
TPC-DS: Total hot run time: 186311 ms |
ClickBench: Total hot run time: 29.31 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
|
PR approved by at least one committer and no changes requested. |
HappenLee
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What problem does this PR solve?
Related PR: #51329
Problem Summary:
Topn lazy materialize was introduced in pr#51329 , but the implementation had performance issues when reading external tables. This pr is used for optimization.
RowIDFetchertoMATERIALIZATION_OPERATOR.The example is as follows:
1FE 2BE
sql :select * from ali_hive.tpch100_orc.lineitem order by l_partkey limit 10;
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)