Describe the bug
In TPC-DS, the fact table scans fall back to Spark because they use dynamic partition pruning (DPP). For some reason, the filter that wraps the scan also falls back to Spark, but Comet still uses JVM shuffle write after the filter. This means there are two row/columnar transitions in the plan:
CometShuffleWriter
CometRowToColumnar
SparkFilter
SparkColumnarToRow
SparkScan
The CometRowToColumnar operator also involves FFI transfers.
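To make the "two transitions" concrete, here is a minimal sketch that scans the operator list from the plan above and picks out the row/columnar conversions. It is only an illustration: the plan is modeled as a plain list of names, and `find_transitions` and `TRANSITION_MARKERS` are hypothetical helpers, not part of Comet or Spark.

```python
# Operator names copied from the plan above, listed from parent to leaf.
PLAN = [
    "CometShuffleWriter",
    "CometRowToColumnar",
    "SparkFilter",
    "SparkColumnarToRow",
    "SparkScan",
]

# Assumption for this sketch: an operator is a transition if its name
# contains one of these substrings. This is a naming heuristic only.
TRANSITION_MARKERS = ("RowToColumnar", "ColumnarToRow")


def find_transitions(plan):
    """Return the operators in `plan` that convert between row and columnar data."""
    return [op for op in plan if any(marker in op for marker in TRANSITION_MARKERS)]


transitions = find_transitions(PLAN)
print(len(transitions), transitions)
```

For this plan the helper reports both CometRowToColumnar and SparkColumnarToRow, which are the two conversions that a fully Spark (or fully Comet) stage would avoid.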
Steps to reproduce
No response
Expected behavior
Perhaps it would be better for Comet not to try and accelerate this stage or introduce the columnar exchange? I am not sure.
Additional context
No response