[C++] Segmentation fault in hash-join/swiss-join #39951

@mpimenov

Describe the bug, including details regarding any error messages, version, and platform.

Continuing from #32570

Here is a crash I occasionally see in my code. Unfortunately, I do not have a small, or even a reliable, test case that reproduces it.

@zanmato1984 has fixed several hash-join crashes before; I suspect this one may be another case from that family.

Visit<(lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:332:9)> at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:133 [0x2e57685]
DecodeSelected at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:330 [0x2e57685]
FlushBuildColumn at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:1672 [0x2e5f9f6]
Flush at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:1716 [0x2e60178]
AppendAndOutput<(lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join_internal.h:612:9), (lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:1993:9)> at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join_internal.h:570 [0x2e60dbc]
Append<(lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:1993:9)> at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join_internal.h:610 [0x2e60dbc]
OnNextBatch at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:1993 [0x2e60dbc]
ProbeSingleBatch at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:2144 [0x2e653e7]
OnProbeSideBatch at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:818 [0x2e0766d]
InputReceived at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:891 [0x2e06491]
OutputBatchCallback at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:1004 [0x2e0a3af]
operator() at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:947 [0x2e0a3af]
__invoke_impl<arrow::Status, (lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:947:5) &, long, arrow::compute::ExecBatch> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/invoke.h:61 [0x2e0a1fb]
__invoke_r<arrow::Status, (lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:947:5) &, long, arrow::compute::ExecBatch> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/invoke.h:114 [0x2e0a1fb]
_M_invoke at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/std_function.h:290 [0x2e0a1fb]
operator() at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/std_function.h:591 [0x2e60eb5]
operator() at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:1993 [0x2e60eb5]
AppendAndOutput<(lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join_internal.h:612:9), (lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:1993:9)> at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join_internal.h:571 [0x2e60eb5]
Append<(lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:1993:9)> at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join_internal.h:610 [0x2e60eb5]
OnNextBatch at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:1993 [0x2e60eb5]
ProbeSingleBatch at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:2144 [0x2e653e7]
OnProbeSideBatch at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:818 [0x2e0766d]
InputReceived at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:891 [0x2e06491]
OutputBatchCallback at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:1004 [0x2e0a3af]
operator() at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:947 [0x2e0a3af]
__invoke_impl<arrow::Status, (lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:947:5) &, long, arrow::compute::ExecBatch> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/invoke.h:61 [0x2e0a1fb]
__invoke_r<arrow::Status, (lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:947:5) &, long, arrow::compute::ExecBatch> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/invoke.h:114 [0x2e0a1fb]
_M_invoke at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/std_function.h:290 [0x2e0a1fb]
operator() at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/std_function.h:591 [0x2e618ef]
operator() at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:2039 [0x2e618ef]
Flush<(lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:2039:5)> at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join_internal.h:626 [0x2e618ef]
OnFinished at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:2039 [0x2e618ef]
OnScanHashTableFinished at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:2425 [0x2e683ff]
StartScanHashTable at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:2323 [0x2e68ff0]
ProbingFinished at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/swiss_join.cc:2153 [0x2e655f9]
OnQueuedBatchesProbed at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:876 [0x2e0a61b]
operator() at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:968 [0x2e0a61b]
__invoke_impl<arrow::Status, (lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:967:9) &, unsigned long> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/invoke.h:61 [0x2e0a61b]
__invoke_r<arrow::Status, (lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/hash_join_node.cc:967:9) &, unsigned long> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/invoke.h:114 [0x2e0a61b]
_M_invoke at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/std_function.h:290 [0x2e0a61b]
operator() at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/std_function.h:591 [0x2e695f5]
OnTaskGroupFinished at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/task_util.cc:252 [0x2e695f5]
operator() at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/task_util.cc:371 [0x2e6a313]
__invoke_impl<arrow::Status, (lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/task_util.cc:371:5) &, unsigned long> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/invoke.h:61 [0x2e6a313]
__invoke_r<arrow::Status, (lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/task_util.cc:371:5) &, unsigned long> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/invoke.h:114 [0x2e6a313]
_M_invoke at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/std_function.h:290 [0x2e6a313]
operator() at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/std_function.h:591 [0x2e3279f]
operator() at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/query_context.cc:82 [0x2e3279f]
__invoke_impl<arrow::Status, (lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/query_context.cc:80:40) &> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/invoke.h:61 [0x2e3279f]
__invoke_r<arrow::Status, (lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/acero/query_context.cc:80:40) &> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/invoke.h:114 [0x2e3279f]
_M_invoke at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/std_function.h:290 [0x2e3279f]
operator() at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/std_function.h:591 [0x2e3355e]
operator()<std::function<arrow::Status ()> &, arrow::Status, arrow::Future<arrow::internal::Empty> > at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/util/future.h:150 [0x2e3355e]
__invoke_impl<void, arrow::detail::ContinueFuture &, arrow::Future<arrow::internal::Empty> &, std::function<arrow::Status ()> &> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/invoke.h:61 [0x2e3355e]
operator() at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/util/functional.h:140 [0x3376417]
WorkerLoop at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/util/thread_pool.cc:457 [0x3376417]
operator() at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/util/thread_pool.cc:618 [0x3376417]
__invoke_impl<void, (lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/util/thread_pool.cc:616:23)> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/invoke.h:61 [0x3376417]
__invoke<(lambda at /tmp/source-root/.conan2/p/b/arrow19d6b0dc5db3a/b/src/cpp/src/arrow/util/thread_pool.cc:616:23)> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/invoke.h:96 [0x3376417]
_M_invoke<0UL> at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/std_thread.h:292 [0x3376417]
operator() at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/std_thread.h:299 [0x3376417]
_M_run at /usr/bin/../lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/bits/std_thread.h:244 [0x3376417]

Component(s)

C++
