Describe the bug
In planner.rs, we wrap inputs to HashJoinExec in a CopyExec, but we do not do the same for SortMergeExec. According to DataFusion's documentation, SortMergeExec can cache input batches, so this seems incorrect.
The following error was a factor in identifying this issue:
[info] Cause: org.apache.comet.CometNativeException: overflow
[info] at comet::errors::init::{{closure}}(__internal__:0)
[info] at std::panicking::rust_panic_with_hook(__internal__:0)
[info] at std::panicking::begin_panic_handler::{{closure}}(__internal__:0)
[info] at std::sys::backtrace::__rust_end_short_backtrace(__internal__:0)
[info] at __rustc::rust_begin_unwind(__internal__:0)
[info] at core::panicking::panic_fmt(__internal__:0)
[info] at core::option::expect_failed(__internal__:0)
[info] at arrow_select::take::take_bytes(__internal__:0)
[info] at arrow_select::take::take_impl(__internal__:0)
[info] at arrow_select::take::take(__internal__:0)
[info] at datafusion_physical_plan::joins::sort_merge_join::SortMergeJoinStream::freeze_streamed(__internal__:0)
[info] at datafusion_physical_plan::joins::sort_merge_join::SortMergeJoinStream::poll_buffered_batches(__internal__:0)
[info] at <datafusion_physical_plan::joins::sort_merge_join::SortMergeJoinStream as futures_core::stream::Stream>::poll_next(__internal__:0)
[info] at comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}}::{{closure}}(__internal__:0)
[info] at Java_org_apache_comet_Native_executePlan(__internal__:0)
[info] at <unknown>(__internal__:0)
Steps to reproduce
No response
Expected behavior
No response
Additional context
No response
Describe the bug
In
planner.rs, we wrap inputs toHashJoinExecin aCopyExec, but we do not do the same forSortMergeExec. According to DataFusion's documentation,SortMergeExeccan cache input batches, so this seems incorrect.The following error was a factor in identifying this issue:
Steps to reproduce
No response
Expected behavior
No response
Additional context
No response