@mrhhsg mrhhsg commented Oct 18, 2023

Proposed changes

==2456606==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200e22a872 at pc 0x56412ed8f92d bp 0x7f40311324b0 sp 0x7f4031131c78
READ of size 4 at 0x60200e22a872 thread T440 (_scanner_scan)
    #0 0x56412ed8f92c in __asan_memcpy (/mnt/hdd01/ci/master-deploy/be/lib/doris_be+0x2057992c) (BuildId: ff45118ed8ebe51d)
    #1 0x564130aec228 in doris::ComparisonPredicateBase<(doris::PrimitiveType)5, (doris::PredicateType)1>::is_always_true(std::pair const&) const /home/zcp/repo_center/doris_master/doris/be/src/olap/comparison_predicate.h:184:9
    #2 0x56412f8d5068 in doris::segment_v2::ColumnReader::prune_predicates_by_zone_map(std::vector>&, int) const /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/column_reader.cpp:357:24
    #3 0x56412f35d1f4 in doris::segment_v2::Segment::new_iterator(std::shared_ptr, doris::StorageReadOptions const&, std::unique_ptr>*) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment.cpp:165:28
    #4 0x56412fb20883 in doris::BetaRowsetReader::get_segment_iterators(doris::RowsetReaderContext*, std::vector>, std::allocator>>>*, bool) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/beta_rowset_reader.cpp:237:27
    #5 0x56412fb22001 in doris::BetaRowsetReader::_init_iterator() /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/beta_rowset_reader.cpp:264:5
    #6 0x56412fb2453e in doris::BetaRowsetReader::_init_iterator_once()::$_0::operator()() const /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/beta_rowset_reader.cpp:259:49
    #7 0x56412fb21b56 in doris::Status doris::DorisCallOnce::call(doris::BetaRowsetReader::_init_iterator_once()::$_0) /home/zcp/repo_center/doris_master/doris/be/src/util/once.h:63:27
    #8 0x56412fb21920 in doris::BetaRowsetReader::_init_iterator_once() /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/beta_rowset_reader.cpp:259:28
    #9 0x56412fb230b6 in doris::BetaRowsetReader::next_block(doris::vectorized::Block*) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/beta_rowset_reader.cpp:299:5
    #10 0x564158f71f33 in doris::vectorized::VCollectIterator::Level0Iterator::_refresh() /home/zcp/repo_center/doris_master/doris/be/src/vec/olap/vcollect_iterator.h:256:36
    #11 0x564158f5d74c in doris::vectorized::VCollectIterator::Level0Iterator::refresh_current_row() /home/zcp/repo_center/doris_master/doris/be/src/vec/olap/vcollect_iterator.cpp:510:24
    #12 0x564158f5cf30 in doris::vectorized::VCollectIterator::Level0Iterator::init(bool) /home/zcp/repo_center/doris_master/doris/be/src/vec/olap/vcollect_iterator.cpp:468:15
    #13 0x564158f53ce7 in doris::vectorized::VCollectIterator::build_heap(std::vector, std::allocator>>&) /home/zcp/repo_center/doris_master/doris/be/src/vec/olap/vcollect_iterator.cpp:123:33
    #14 0x564158ef77b2 in doris::vectorized::BlockReader::_init_collect_iter(doris::TabletReader::ReaderParams const&) /home/zcp/repo_center/doris_master/doris/be/src/vec/olap/block_reader.cpp:147:5
    #15 0x564158ef9fad in doris::vectorized::BlockReader::init(doris::TabletReader::ReaderParams const&) /home/zcp/repo_center/doris_master/doris/be/src/vec/olap/block_reader.cpp:225:19
    #16 0x56414b531aa3 in doris::vectorized::NewOlapScanner::open(doris::RuntimeState*) /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/scan/new_olap_scanner.cpp:263:32
    #17 0x56414b5b89b2 in doris::vectorized::ScannerScheduler::_scanner_scan(doris::vectorized::ScannerScheduler*, doris::vectorized::ScannerContext*, std::shared_ptr) /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:344:27
    #18 0x56414b5bcd80 in doris::vectorized::ScannerScheduler::_schedule_scanners(doris::vectorized::ScannerContext*)::$_1::operator()() const::'lambda1'()::operator()() const /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:251:35
    #19 0x56414b5bcc26 in void std::__invoke_impl(std::__invoke_other, doris::vectorized::ScannerScheduler::_schedule_scanners(doris::vectorized::ScannerContext*)::$_1::operator()() const::'lambda1'()&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14
    #20 0x56414b5bcb98 in std::enable_if, void>::type std::__invoke_r(doris::vectorized::ScannerScheduler::_schedule_scanners(doris::vectorized::ScannerContext*)::$_1::operator()() const::'lambda1'()&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:111:2
    #21 0x56414b5bc96e in std::_Function_handler::_M_invoke(std::_Any_data const&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291:9
    #22 0x56412ef901a6 in std::function::operator()() const /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:560:9
    #23 0x56412ef91f10 in doris::WorkThreadPool::work_thread(int) /home/zcp/repo_center/doris_master/doris/be/src/util/work_thread_pool.hpp:159:17
    #24 0x56412ef94614 in void std::__invoke_impl::* const&)(int), doris::WorkThreadPool*&, int&>(std::__invoke_memfun_deref, void (doris::WorkThreadPool::* const&)(int), doris::WorkThreadPool*&, int&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    #25 0x56412ef944d6 in std::__invoke_result::* const&)(int), doris::WorkThreadPool*&, int&>::type std::__invoke::* const&)(int), doris::WorkThreadPool*&, int&>(void (doris::WorkThreadPool::* const&)(int), doris::WorkThreadPool*&, int&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    #26 0x56412ef94496 in decltype(std::__invoke((*this)._M_pmf, std::forward*&>(fp), std::forward(fp))) std::_Mem_fn_base::*)(int), true>::operator()*&, int&>(doris::WorkThreadPool*&, int&) const /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/functional:131:11
    #27 0x56412ef94456 in void std::__invoke_impl::*)(int)>&, doris::WorkThreadPool*&, int&>(std::__invoke_other, std::_Mem_fn::*)(int)>&, doris::WorkThreadPool*&, int&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14
    #28 0x56412ef94358 in std::enable_if::*)(int)>&, doris::WorkThreadPool*&, int&>, void>::type std::__invoke_r::*)(int)>&, doris::WorkThreadPool*&, int&>(std::_Mem_fn::*)(int)>&, doris::WorkThreadPool*&, int&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:111:2
    #29 0x56412ef94285 in void std::_Bind_result::*)(int)> (doris::WorkThreadPool*, int)>::__call(std::tuple<>&&, std::_Index_tuple<0ul, 1ul>) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/functional:570:11
    #30 0x56412ef940df in void std::_Bind_result::*)(int)> (doris::WorkThreadPool*, int)>::operator()<>() /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/functional:629:17
    #31 0x56412ef93fe6 in void std::__invoke_impl::*)(int)> (doris::WorkThreadPool*, int)>>(std::__invoke_other, std::_Bind_result::*)(int)> (doris::WorkThreadPool*, int)>&&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14
    #32 0x56412ef93f86 in std::__invoke_result::*)(int)> (doris::WorkThreadPool*, int)>>::type std::__invoke::*)(int)> (doris::WorkThreadPool*, int)>>(std::_Bind_result::*)(int)> (doris::WorkThreadPool*, int)>&&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    #33 0x56412ef93f4e in void std::thread::_Invoker::*)(int)> (doris::WorkThreadPool*, int)>>>::_M_invoke<0ul>(std::_Index_tuple<0ul>) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:253:13
    #34 0x56412ef93f16 in std::thread::_Invoker::*)(int)> (doris::WorkThreadPool*, int)>>>::operator()() /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:260:11
    #35 0x56412ef93e5a in std::thread::_State_impl::*)(int)> (doris::WorkThreadPool*, int)>>>>::_M_run() /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    #36 0x56415e20cadf in execute_native_thread_routine /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/thread.cc:82:18
    #37 0x7f431cefb608 in start_thread /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:477:8
    #38 0x7f431d1a8132 in __clone /build/glibc-SzIz7B/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
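
Frames #0 to #2 show a 4-byte read through `__asan_memcpy` inside `is_always_true`, called from `prune_predicates_by_zone_map`. The trace alone does not say why the source buffer was shorter than 4 bytes. As an illustration of the failure class only (not the change made in this PR), a bounds-checked load could look like the sketch below. `load_fixed` is a hypothetical helper, not a Doris function, and the raw pointer plus size interface is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <type_traits>

// Hypothetical helper, not part of Doris: copy a fixed-width value out of a
// raw byte buffer only if the buffer really holds sizeof(T) bytes.
template <typename T>
std::optional<T> load_fixed(const char* data, std::size_t size) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "memcpy requires a trivially copyable T");
    assert(data != nullptr || size == 0);  // a null buffer must be empty
    if (size < sizeof(T)) {
        // An unchecked memcpy of sizeof(T) bytes would read past the buffer,
        // which is the kind of access ASan reports as heap-buffer-overflow.
        return std::nullopt;
    }
    T value;
    std::memcpy(&value, data, sizeof(T));
    return value;
}
```

With this shape, a caller that gets `std::nullopt` can simply keep the predicate instead of pruning it, which is the conservative choice when the statistic cannot be read.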

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

mrhhsg commented Oct 18, 2023

run buildall

@github-actions

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.00% (8281/22381)
Line Coverage: 29.08% (66413/228408)
Region Coverage: 27.73% (34463/124277)
Branch Coverage: 24.35% (17511/71914)
Coverage Report: http://coverage.selectdb-in.cc/coverage/9b1752c1dd0f7cc9549e73a986d821ac73d2410d_9b1752c1dd0f7cc9549e73a986d821ac73d2410d/report/index.html

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.54 seconds
stream load tsv: 552 seconds loaded 74807831229 Bytes, about 129 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.7 seconds inserted 10000000 Rows, about 348K ops/s
storage size: 17162086118 Bytes

@mrhhsg mrhhsg force-pushed the fix_prune_predicates branch from 9b1752c to 0b67e0f Compare October 18, 2023 04:40

mrhhsg commented Oct 18, 2023

run buildall

@github-actions

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.01% (8283/22381)
Line Coverage: 29.09% (66436/228418)
Region Coverage: 27.74% (34473/124287)
Branch Coverage: 24.36% (17521/71926)
Coverage Report: http://coverage.selectdb-in.cc/coverage/0b67e0f3a72987a1b6a1b232392c9cc7fab9b684_0b67e0f3a72987a1b6a1b232392c9cc7fab9b684/report/index.html

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.87 seconds
stream load tsv: 575 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.1 seconds inserted 10000000 Rows, about 343K ops/s
storage size: 17162296314 Bytes

@yiguolei yiguolei left a comment

LGTM

@github-actions github-actions bot added the "approved" label (indicates a PR has been approved by one committer) Oct 18, 2023
@github-actions

PR approved by at least one committer and no changes requested.

@github-actions

PR approved by anyone and no changes requested.

@Gabriel39 Gabriel39 left a comment

LGTM

@yiguolei yiguolei merged commit 80e5e72 into apache:master Oct 18, 2023
@mrhhsg mrhhsg deleted the fix_prune_predicates branch October 18, 2023 08:15
@xiaokang xiaokang added the p0_c label Oct 18, 2023
yiguolei pushed a commit that referenced this pull request Oct 18, 2023
…he segment (#25582)

* [improvement](scanner) Remove the predicate that is always true for the segment (#25366) (#25427)

By utilizing the zonemap index of the segment, we can ascertain if a predicate is always true. For example, if the segment’s maximum value is 100 and the predicate is col < 101, then this predicate is always true for this segment.
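
The check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Doris implementation: `ZoneMap`, `Op`, `Predicate` and this free function `is_always_true` are hypothetical names, the column is assumed to be a 32-bit integer, and a segment containing NULLs is assumed never to qualify because a comparison against NULL is not true.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-segment statistics; not the Doris zone map classes.
struct ZoneMap {
    std::int32_t min;  // smallest non-null value in the segment
    std::int32_t max;  // largest non-null value in the segment
    bool has_null;     // true if the segment contains at least one NULL
};

enum class Op { LT, LE, GT, GE };

// A comparison of the form `col <op> value`, e.g. `col < 101`.
struct Predicate {
    Op op;
    std::int32_t value;
};

// Returns true when every row in the segment satisfies the predicate, so the
// predicate can be dropped for this segment without changing the result.
inline bool is_always_true(const Predicate& p, const ZoneMap& zm) {
    assert(zm.min <= zm.max);
    if (zm.has_null) {
        return false;  // assumption: NULL rows never satisfy a comparison
    }
    switch (p.op) {
    case Op::LT: return zm.max < p.value;
    case Op::LE: return zm.max <= p.value;
    case Op::GT: return zm.min > p.value;
    case Op::GE: return zm.min >= p.value;
    }
    return false;
}
```

For the example in the commit message, a segment with maximum 100 and the predicate `col < 101` gives `100 < 101`, so the predicate is always true and can be skipped; with `col < 100` the row holding 100 fails, so the predicate must stay.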

* [fix](scanner) coredump caused by 'prune_predicates_by_zone_map' (#25555)
XuJianxu pushed a commit to XuJianxu/doris that referenced this pull request Dec 14, 2023

Labels

approved (indicates a PR has been approved by one committer), dev/2.0.3-merged, merge_conflict, p0_c, reviewed
