Describe the bug
During testing of the Iceberg integration, we ran into memory corruption issues caused by unsafe reuse of mutable buffers.
In one example, the following plan contains a SortExec, which caches batches, but those batches come from a ScanExec that reuses its underlying buffers. There is a CopyExec in the plan, but it uses the mode UnpackOrClone rather than UnpackOrDeepCopy, so the cached batches can apparently still reference buffers that the scan later overwrites.
SortExec: expr=[col_0@0 ASC, col_2@2 ASC], preserve_partitioning=[false]
  CopyExec [UnpackOrClone]
    FilterExec: col_0@0 IS NOT NULL
      ScanExec: source=[CometBatchScan testhadoop.default.table (unknown)], schema=[col_0: Int64, col_1: Int32, col_2: Timestamp(Microsecond, Some("UTC"))]
This issue tracks the work of reviewing buffer reuse across operators again, writing better internal documentation, and improving testing.
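To illustrate the hazard described above, here is a minimal sketch in plain Rust (standard library only). It is not Comet's code: `SharedBuffer`, `shallow_clone`, `deep_copy`, and `cache_batches` are hypothetical names, and the sketch assumes that UnpackOrClone behaves like a reference-counted clone while UnpackOrDeepCopy copies the values into a new allocation.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a column buffer that a scan reuses across batches.
type SharedBuffer = Rc<RefCell<Vec<i64>>>;

/// Shallow copy: only the reference count is bumped, the data is still shared.
/// (Assumed to be analogous to UnpackOrClone.)
fn shallow_clone(buf: &SharedBuffer) -> SharedBuffer {
    Rc::clone(buf)
}

/// Deep copy: the values are copied into a fresh allocation.
/// (Assumed to be analogous to UnpackOrDeepCopy.)
fn deep_copy(buf: &SharedBuffer) -> SharedBuffer {
    Rc::new(RefCell::new(buf.borrow().clone()))
}

/// Simulates a scan that overwrites one buffer for every batch, followed by
/// an operator (like a sort) that caches each batch before reading any of them.
fn cache_batches(copy: fn(&SharedBuffer) -> SharedBuffer) -> Vec<Vec<i64>> {
    let scan_buffer: SharedBuffer = Rc::new(RefCell::new(vec![0; 2]));
    let batches: [[i64; 2]; 2] = [[1, 2], [3, 4]];
    let mut cached: Vec<SharedBuffer> = Vec::new();
    for batch in batches.iter() {
        // The scan writes the next batch into the same memory.
        scan_buffer.borrow_mut().copy_from_slice(batch);
        // The caching operator keeps whatever the copy step hands it.
        cached.push(copy(&scan_buffer));
    }
    // Read the cached batches only after the scan has finished.
    cached.iter().map(|b| b.borrow().clone()).collect()
}

fn main() {
    // With a shallow clone, every cached batch shows the last batch's values.
    println!("shallow: {:?}", cache_batches(shallow_clone));
    // With a deep copy, each cached batch keeps its own values.
    println!("deep:    {:?}", cache_batches(deep_copy));
}
```

In this model the shallow path silently loses the first batch, which is the kind of behavior a regression test for caching operators placed above a buffer-reusing scan could check for.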
Steps to reproduce
No response
Expected behavior
No response
Additional context
No response