
[C++] Occasional failure arrow-compute-hash-join-node-test #20052

@asfimport

The test seems to be flaky. Excerpt from the full log:


44/84 Test #35: arrow-compute-hash-join-node-test .........***Failed    8.63 sec
Running arrow-compute-hash-join-node-test, redirecting output into /build/cpp/build/test-logs/arrow-compute-hash-join-node-test.txt (attempt 1/1)
/arrow/cpp/build-support/run-test.sh: line 88: 19125 Segmentation fault      (core dumped) $TEST_EXECUTABLE "$@" > $LOGFILE.raw 2>&1
Running main() from /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest_main.cc
[==========] Running 23 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 7 tests from HashJoin
[ RUN      ] HashJoin.Random
/arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:934: Failure
Failed
'_error_or_value46.status()' failed with Cancelled: Scheduler cancelled
Google Test trace:
/arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:1053: FULL_OUTER IS parallel = false
/build/cpp/src/arrow/compute/exec

Another failure was observed in the AMD64 Conda C++ job. Excerpt from the full log:


[----------] 7 tests from HashJoin
[ RUN      ] HashJoin.Random
Found core dump, printing backtrace:
warning: core file may not match specified executable file.
[New LWP 19309]
[New LWP 19308]
[New LWP 19306]
[New LWP 19310]
[New LWP 19307]
[New LWP 19311]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/build/cpp/debug/arrow-compute-hash-join-node-test'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000011479 in ?? ()
[Current thread is 1 (Thread 0x7f8cfcb7d700 (LWP 19309))]

Thread 6 (Thread 0x7f8cf9fff700 (LWP 19311)):
#0  0x00007f8d01131065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f8cf9ffd4a0, expected=0, futex_word=0x7f8cff40a790) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  __pthread_cond_wait_common (abstime=0x7f8cf9ffd4a0, mutex=0x7f8cff40a7d8, cond=0x7f8cff40a768) at pthread_cond_wait.c:539
#2  __pthread_cond_timedwait (cond=0x7f8cff40a768, mutex=0x7f8cff40a7d8, abstime=0x7f8cf9ffd4a0) at pthread_cond_wait.c:667
#3  0x00007f8d04411496 in background_thread_sleep (tsdn=<optimized out>, interval=<optimized out>, info=0x7f8cff40a760) at src/background_thread.c:255
#4  background_work_sleep_once (ind=<optimized out>, info=<optimized out>, tsdn=<optimized out>) at src/background_thread.c:307
#5  background_work (ind=<optimized out>, tsd=<optimized out>) at src/background_thread.c:497
#6  background_thread_entry () at src/background_thread.c:522
#7  0x00007f8d0112a6db in start_thread (arg=0x7f8cf9fff700) at pthread_create.c:463
#8  0x00007f8d0180171f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 5 (Thread 0x7f8cfedff700 (LWP 19307)):
#0  0x00007f8d01131065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f8cfedfd4a0, expected=0, futex_word=0x7f8cff40a5f0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  __pthread_cond_wait_common (abstime=0x7f8cfedfd4a0, mutex=0x7f8cff40a638, cond=0x7f8cff40a5c8) at pthread_cond_wait.c:539
#2  __pthread_cond_timedwait (cond=0x7f8cff40a5c8, mutex=0x7f8cff40a638, abstime=0x7f8cfedfd4a0) at pthread_cond_wait.c:667
#3  0x00007f8d04411bc6 in background_thread_sleep (tsdn=<optimized out>, interval=<optimized out>, info=<optimized out>) at src/background_thread.c:255
#4  background_work_sleep_once (ind=0, info=<optimized out>, tsdn=<optimized out>) at src/background_thread.c:307
#5  background_thread0_work (tsd=<optimized out>) at src/background_thread.c:452
#6  background_work (ind=<optimized out>, tsd=<optimized out>) at src/background_thread.c:490
#7  background_thread_entry () at src/background_thread.c:522
#8  0x00007f8d0112a6db in start_thread (arg=0x7f8cfedff700) at pthread_create.c:463
#9  0x00007f8d0180171f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 4 (Thread 0x7f8cfb5ff700 (LWP 19310)):
#0  0x00007f8d01131065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f8cfb5fd4a0, expected=0, futex_word=0x7f8cff40a6c0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  __pthread_cond_wait_common (abstime=0x7f8cfb5fd4a0, mutex=0x7f8cff40a708, cond=0x7f8cff40a698) at pthread_cond_wait.c:539
#2  __pthread_cond_timedwait (cond=0x7f8cff40a698, mutex=0x7f8cff40a708, abstime=0x7f8cfb5fd4a0) at pthread_cond_wait.c:667
#3  0x00007f8d04411496 in background_thread_sleep (tsdn=<optimized out>, interval=<optimized out>, info=0x7f8cff40a690) at src/background_thread.c:255
#4  background_work_sleep_once (ind=<optimized out>, info=<optimized out>, tsdn=<optimized out>) at src/background_thread.c:307
#5  background_work (ind=<optimized out>, tsd=<optimized out>) at src/background_thread.c:497
#6  background_thread_entry () at src/background_thread.c:522
#7  0x00007f8d0112a6db in start_thread (arg=0x7f8cfb5ff700) at pthread_create.c:463
#8  0x00007f8d0180171f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 3 (Thread 0x7f8cff6db0c0 (LWP 19306)):
#0  0x00005630dcb287d4 in __gnu_cxx::operator==<int const*, std::vector<int, std::allocator<int> > > (__lhs=..., __rhs=...) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/stl_iterator.h:890
#1  0x00005630dcb1d3d1 in std::vector<int, std::allocator<int> >::empty (this=0x7ffc7ab72ae0) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/stl_vector.h:1005
#2  0x00005630dcafd4ea in arrow::compute::HashJoinSimpleInt (join_type=arrow::compute::JoinType::FULL_OUTER, l=..., null_in_key_l=..., r=..., null_in_key_r=..., result_l=0x7ffc7ab72cb0, result_r=0x7ffc7ab72cd0, output_length_limit=100000, length_limit_reached=0x7ffc7ab72e77) at /arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:781
#3  0x00005630dcafe22c in arrow::compute::HashJoinSimple (ctx=0x5630de57a320, join_type=arrow::compute::JoinType::FULL_OUTER, cmp=..., num_key_fields=1, key_id_l=..., key_id_r=..., original_l=..., original_r=..., l=..., r=..., output_ids_l=..., output_ids_r=..., output_length_limit=100000, length_limit_reached=0x7ffc7ab72e77) at /arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:887
#4  0x00005630dcb011c0 in arrow::compute::HashJoin_Random_Test::TestBody (this=0x5630de47a300) at /arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:1067
#5  0x00007f8d056a3c9c in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=0x5630de47a300, method=&virtual testing::Test::TestBody(), location=0x7f8d056b897b "the test body") at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2607
#6  0x00007f8d0569add2 in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=0x5630de47a300, method=&virtual testing::Test::TestBody(), location=0x7f8d056b897b "the test body") at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2643
#7  0x00007f8d05675c03 in testing::Test::Run (this=0x5630de47a300) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2682
#8  0x00007f8d0567663b in testing::TestInfo::Run (this=0x5630de476b50) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2861
#9  0x00007f8d05677010 in testing::TestSuite::Run (this=0x5630de476c70) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:3015
#10 0x00007f8d0568731c in testing::internal::UnitTestImpl::RunAllTests (this=0x5630de4762e0) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:5855
#11 0x00007f8d056a4ce8 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x5630de4762e0, method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x7f8d05686ed8 <testing::internal::UnitTestImpl::RunAllTests()>, location=0x7f8d056b9468 "auxiliary test code (environments or event listeners)") at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2607
#12 0x00007f8d0569c064 in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x5630de4762e0, method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x7f8d05686ed8 <testing::internal::UnitTestImpl::RunAllTests()>, location=0x7f8d056b9468 "auxiliary test code (environments or event listeners)") at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2643
#13 0x00007f8d056857b7 in testing::UnitTest::Run (this=0x7f8d056e5260 <testing::UnitTest::GetInstance()::instance>) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:5438
#14 0x00007f8d056e6919 in RUN_ALL_TESTS () at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/include/gtest/gtest.h:2490
#15 0x00007f8d056e695c in main (argc=1, argv=0x7ffc7ab739d8) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest_main.cc:52
#16 0x00007f8d01701bf7 in __libc_start_main (main=0x7f8d056e691b <main(int, char**)>, argc=1, argv=0x7ffc7ab739d8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffc7ab739c8) at ../csu/libc-start.c:310
#17 0x00005630dcaedf49 in _start ()

Thread 2 (Thread 0x7f8cfdb7e700 (LWP 19308)):
#0  0x00007f8d01130ad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x5630de565a80) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x5630de565a30, cond=0x5630de565a58) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x5630de565a58, mutex=0x5630de565a30) at pthread_cond_wait.c:655
#3  0x00007f8d01b994d1 in __gthread_cond_wait (__mutex=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, __cond=<optimized out>) at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1643063175398/work/build/x86_64-conda-linux-gnu/libstdc++-v3/src/c++11/condition_variable.cc:865
#4  std::__condvar::wait (__m=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, this=<optimized out>) at ../../../../../libstdc++-v3/src/c++11/gthr-default.h:155
#5  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:41
#6  0x00007f8d02f8fbb7 in arrow::internal::WorkerLoop (state=..., it=...) at /arrow/cpp/src/arrow/util/thread_pool.cc:195
#7  0x00007f8d02f90960 in arrow::internal::ThreadPool::<lambda()>::operator()(void) const (__closure=0x5630de561958) at /arrow/cpp/src/arrow/util/thread_pool.cc:344
#8  0x00007f8d02f97498 in std::__invoke_impl<void, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> >(std::__invoke_other, arrow::internal::ThreadPool::<lambda()> &&) (__f=...) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:60
#9  0x00007f8d02f97438 in std::__invoke<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> >(arrow::internal::ThreadPool::<lambda()> &&) (__fn=...) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:95
#10 0x00007f8d02f973d6 in std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> > >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x5630de561958) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:244
#11 0x00007f8d02f97293 in std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> > >::operator()(void) (this=0x5630de561958) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:251
#12 0x00007f8d02f971e4 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> > > >::_M_run(void) (this=0x5630de561950) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:195
#13 0x00007f8d01b9d9d4 in std::execute_native_thread_routine (__p=<optimized out>) at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1643063175398/work/build/x86_64-conda-linux-gnu/libstdc++-v3/include/bits/new_allocator.h:82
#14 0x00007f8d0112a6db in start_thread (arg=0x7f8cfdb7e700) at pthread_create.c:463
#15 0x00007f8d0180171f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7f8cfcb7d700 (LWP 19309)):
#0  0x0000000000011479 in ?? ()
#1  0x00007f8d0331fae3 in arrow::compute::TaskSchedulerImpl::ScheduleMore (this=0x5630de572960, thread_id=0, num_tasks_finished=0) at /arrow/cpp/src/arrow/compute/exec/task_util.cc:326
#2  0x00007f8d0331e94c in arrow::compute::TaskSchedulerImpl::StartTaskGroup (this=0x5630de572960, thread_id=0, group_id=1, total_num_tasks=0) at /arrow/cpp/src/arrow/compute/exec/task_util.cc:153
#3  0x00007f8d0327d952 in arrow::compute::HashJoinBasicImpl::ProbeQueuedBatches (this=0x7f8cec24aee0, thread_index=0) at /arrow/cpp/src/arrow/compute/exec/hash_join.cc:726
#4  0x00007f8d0327d13b in arrow::compute::HashJoinBasicImpl::BuildHashTable_on_finished (this=0x7f8cec24aee0, thread_index=0) at /arrow/cpp/src/arrow/compute/exec/hash_join.cc:663
#5  0x00007f8d0327d2db in arrow::compute::HashJoinBasicImpl::RegisterBuildHashTable()::{lambda(unsigned long)#2}::operator()(unsigned long) const (__closure=0x5630de654840, thread_index=0) at /arrow/cpp/src/arrow/compute/exec/hash_join.cc:674
#6  0x00007f8d0328213c in std::_Function_handler<arrow::Status (unsigned long), arrow::compute::HashJoinBasicImpl::RegisterBuildHashTable()::{lambda(unsigned long)#2}>::_M_invoke(std::_Any_data const&, unsigned long&&) (__functor=..., __args#0=@0x7f8cfcb7b138: 0) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:286
#7  0x00007f8d032aa81e in std::function<arrow::Status (unsigned long)>::operator()(unsigned long) const (this=0x5630de654840, __args#0=0) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:688
#8  0x00007f8d0331f041 in arrow::compute::TaskSchedulerImpl::OnTaskGroupFinished (this=0x5630de572960, thread_id=0, group_id=0, all_task_groups_finished=0x7f8cfcb7b230) at /arrow/cpp/src/arrow/compute/exec/task_util.cc:244
#9  0x00007f8d0331f934 in arrow::compute::TaskSchedulerImpl::<lambda(size_t)>::operator()(size_t) const (__closure=0x5630de6a1390, thread_id=0) at /arrow/cpp/src/arrow/compute/exec/task_util.cc:349
#10 0x00007f8d0332152f in std::_Function_handler<arrow::Status(long unsigned int), arrow::compute::TaskSchedulerImpl::ScheduleMore(size_t, int)::<lambda(size_t)> >::_M_invoke(const std::_Any_data &, unsigned long &&) (__functor=..., __args#0=@0x7f8cfcb7b2b8: 0) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:286
#11 0x00007f8d032aa81e in std::function<arrow::Status (unsigned long)>::operator()(unsigned long) const (this=0x5630de654f70, __args#0=0) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:688
#12 0x00007f8d032a7f8c in arrow::compute::HashJoinNode::ScheduleTaskCallback(std::function<arrow::Status (unsigned long)>)::{lambda()#1}::operator()() const (__closure=0x5630de654f68) at /arrow/cpp/src/arrow/compute/exec/hash_join_node.cc:604
#13 0x00007f8d032b9329 in arrow::internal::FnOnce<void ()>::FnImpl<arrow::compute::HashJoinNode::ScheduleTaskCallback(std::function<arrow::Status (unsigned long)>)::{lambda()#1}>::invoke() (this=0x5630de654f60) at /arrow/cpp/src/arrow/util/functional.h:152
#14 0x00007f8d02f91ade in arrow::internal::FnOnce<void ()>::operator()() && (this=0x7f8cfcb7b3f0) at /arrow/cpp/src/arrow/util/functional.h:140
#15 0x00007f8d02f8fa87 in arrow::internal::WorkerLoop (state=..., it=...) at /arrow/cpp/src/arrow/util/thread_pool.cc:177
#16 0x00007f8d02f90960 in arrow::internal::ThreadPool::<lambda()>::operator()(void) const (__closure=0x5630de659468) at /arrow/cpp/src/arrow/util/thread_pool.cc:344
#17 0x00007f8d02f97498 in std::__invoke_impl<void, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> >(std::__invoke_other, arrow::internal::ThreadPool::<lambda()> &&) (__f=...) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:60
#18 0x00007f8d02f97438 in std::__invoke<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> >(arrow::internal::ThreadPool::<lambda()> &&) (__fn=...) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:95
#19 0x00007f8d02f973d6 in std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> > >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x5630de659468) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:244
#20 0x00007f8d02f97293 in std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> > >::operator()(void) (this=0x5630de659468) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:251
#21 0x00007f8d02f971e4 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> > > >::_M_run(void) (this=0x5630de659460) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:195
#22 0x00007f8d01b9d9d4 in std::execute_native_thread_routine (__p=<optimized out>) at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1643063175398/work/build/x86_64-conda-linux-gnu/libstdc++-v3/include/bits/new_allocator.h:82
#23 0x00007f8d0112a6db in start_thread (arg=0x7f8cfcb7d700) at pthread_create.c:463
#24 0x00007f8d0180171f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
/build/cpp/src/arrow/compute/exec 

Reporter: David Li / @lidavidm

Related issues:

Original Issue Attachments:

Note: This issue was originally created as ARROW-15221. Please see the migration documentation for further details.
