-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[Fix](multi-catalog) Fix core in orc and parquet reader sometimes after low mem exception. #36574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Fix](multi-catalog) Fix core in orc and parquet reader sometimes after low mem exception. #36574
Conversation
|
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website. |
0b15d2a to
234af45
Compare
|
clang-tidy review says "All clean, LGTM! 👍" |
1 similar comment
|
clang-tidy review says "All clean, LGTM! 👍" |
|
run buildall |
234af45 to
5926dd3
Compare
|
run buildall |
|
clang-tidy review says "All clean, LGTM! 👍" |
TPC-H: Total hot run time: 40291 ms |
TPC-DS: Total hot run time: 174994 ms |
ClickBench: Total hot run time: 30.71 s |
5926dd3 to
8083e36
Compare
|
run buildall |
|
clang-tidy review says "All clean, LGTM! 👍" |
TPC-H: Total hot run time: 39538 ms |
TPC-DS: Total hot run time: 173519 ms |
ClickBench: Total hot run time: 31.51 s |
|
run p0 |
1 similar comment
|
run p0 |
morningman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
morningman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2.1 has reverted this PR #37086
8083e36 to
1b27117
Compare
…er low mem exception.
1b27117 to
b6b50e5
Compare
|
run buildall |
|
clang-tidy review says "All clean, LGTM! 👍" |
TPC-H: Total hot run time: 39759 ms |
TPC-DS: Total hot run time: 174441 ms |
ClickBench: Total hot run time: 30.33 s |
morningman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
PR approved by at least one committer and no changes requested. |
…er low mem exception. (apache#36574) ## Proposed changes ### Issue ``` SIGSEGV address not mapped to object (@0x8) received by PID 2264997 (TID 2267559 OR 0x7ff60b2a4640) from PID 8; stack trace: *** 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/common/signal_handler.h:421 1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so 2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so 3# 0x00007FFB96D77520 in /lib/x86_64-linux-gnu/libc.so.6 4# doris::vectorized::Block::clear_column_data(int) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/core/block.cpp:705 5# doris::vectorized::OrcReader::get_next_block_impl(doris::vectorized::Block*, unsigned long*, bool*) in /mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be 6# doris::vectorized::OrcReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/format/orc/vorc_reader.cpp:1533 7# doris::vectorized::VFileScanner::_get_block_wrapped(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/scan/vfile_scanner.cpp:355 8# doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/scan/vfile_scanner.cpp:298 9# doris::vectorized::VScanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) in /mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be 10# doris::vectorized::VScanner::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/scan/vscanner.cpp:96 11# doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:258 ``` ### Root cause It is found that when executing conjuncts expressions when there is insufficient memory, column names and types may be inserted, but the columns are not inserted into the block, which eventually leads to a crash. If an exception occurs after `block->insert({nullptr, _data_type, _expr_name});`, it will lead to the situation mentioned above ``` Status VectorizedFnCall::_do_execute(doris::vectorized::VExprContext* context, doris::vectorized::Block* block, int* result_column_id, std::vector<size_t>& args) { ... // prepare a column to save result block->insert({nullptr, _data_type, _expr_name}); if (_can_fast_execute) { auto can_fast_execute = fast_execute(*block, args, num_columns_without_result, block->rows(), _function->get_name()); if (can_fast_execute) { *result_column_id = num_columns_without_result; return Status::OK(); } } ``` ### Solution apache#37086
…er low mem exception. (#36574) ## Proposed changes ### Issue ``` SIGSEGV address not mapped to object (@0x8) received by PID 2264997 (TID 2267559 OR 0x7ff60b2a4640) from PID 8; stack trace: *** 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/common/signal_handler.h:421 1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so 2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so 3# 0x00007FFB96D77520 in /lib/x86_64-linux-gnu/libc.so.6 4# doris::vectorized::Block::clear_column_data(int) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/core/block.cpp:705 5# doris::vectorized::OrcReader::get_next_block_impl(doris::vectorized::Block*, unsigned long*, bool*) in /mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be 6# doris::vectorized::OrcReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/format/orc/vorc_reader.cpp:1533 7# doris::vectorized::VFileScanner::_get_block_wrapped(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/scan/vfile_scanner.cpp:355 8# doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/scan/vfile_scanner.cpp:298 9# doris::vectorized::VScanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) in /mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be 10# doris::vectorized::VScanner::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/scan/vscanner.cpp:96 11# doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:258 ``` ### Root cause It is found that when executing conjuncts expressions when there is insufficient memory, column names and types may be inserted, but the columns are not inserted into the block, which eventually leads to a crash. If an exception occurs after `block->insert({nullptr, _data_type, _expr_name});`, it will lead to the situation mentioned above ``` Status VectorizedFnCall::_do_execute(doris::vectorized::VExprContext* context, doris::vectorized::Block* block, int* result_column_id, std::vector<size_t>& args) { ... // prepare a column to save result block->insert({nullptr, _data_type, _expr_name}); if (_can_fast_execute) { auto can_fast_execute = fast_execute(*block, args, num_columns_without_result, block->rows(), _function->get_name()); if (can_fast_execute) { *result_column_id = num_columns_without_result; return Status::OK(); } } ``` ### Solution #37086
Proposed changes
Issue
Root cause
It is found that when executing conjuncts expressions when there is insufficient memory, column names and types may be inserted, but the columns are not inserted into the block, which eventually leads to a crash.
If an exception occurs after
block->insert({nullptr, _data_type, _expr_name});, it will lead to the situation mentioned aboveSolution
#37086