EnforceSorting resorts the input of UnionExec unnecessarily #4943

@alamb

Describe the bug

Given the following input plan (I see this by enabling trace logging via RUST_LOG=trace):

SortExec: [tag@2 ASC NULLS LAST]
  ProjectionExec: expr=[bar@0 as bar, foo@1 as foo, tag@2 as tag, time@3 as time]
    DeduplicateExec: [tag@2 ASC,time@3 ASC]
      SortPreservingMergeExec: [tag@2 ASC,time@3 ASC]
        UnionExec
          ParquetExec: limit=None, partitions={1 group: [[d.parquet]]}, output_ordering=[tag@2 ASC, time@3 ASC], projection=[bar, foo, tag, time]
          SortExec: [tag@2 ASC,time@3 ASC]
            RecordBatchesExec: batches_groups=1 batches=1

Here is the input to EnforceSorting:

Optimized physical plan by EnforceDistribution:
SortExec: [tag@2 ASC NULLS LAST]
  CoalescePartitionsExec
    ProjectionExec: expr=[bar@0 as bar, foo@1 as foo, tag@2 as tag, time@3 as time]
      RepartitionExec: partitioning=RoundRobinBatch(4)
        DeduplicateExec: [tag@2 ASC,time@3 ASC]
          SortPreservingMergeExec: [tag@2 ASC,time@3 ASC]
            UnionExec                                 <-- ** Note that the ParquetExec is already sorted correctly!
              ParquetExec: limit=None, partitions={1 group: [[d.parquet]]}, output_ordering=[tag@2 ASC, time@3 ASC], projection=[bar, foo, tag, time]
              SortExec: [tag@2 ASC,time@3 ASC]
                RecordBatchesExec: batches_groups=1 batches=1

And here is the output from EnforceSorting, where it has moved the SortExec up to the top of the union:

Optimized physical plan by EnforceSorting:
SortExec: [tag@2 ASC NULLS LAST]
  CoalescePartitionsExec
    ProjectionExec: expr=[bar@0 as bar, foo@1 as foo, tag@2 as tag, time@3 as time]
      RepartitionExec: partitioning=RoundRobinBatch(4)
        DeduplicateExec: [tag@2 ASC,time@3 ASC]
          SortPreservingMergeExec: [tag@2 ASC,time@3 ASC]
            SortExec: [tag@2 ASC,time@3 ASC]        <-- ** SortExec is moved to the output of Union, *resorting* the parquet file
              UnionExec
                ParquetExec: limit=None, partitions={1 group: [[1/1/1/1/57d6a92a-314a-4a32-a633-33bc3e1fe7a3.parquet]]}, output_ordering=[tag@2 ASC, time@3 ASC], projection=[bar, foo, tag, time]
                RecordBatchesExec: batches_groups=1 batches=1

To Reproduce
I have a reproducer from IOx -- see https://github.com/influxdata/influxdb_iox/pull/6528#discussion_r1070632410

Expected behavior
I expect the SortExec to be left where it is (at the input to the UnionExec).
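
To make the difference between the two plan shapes concrete, here is a minimal sketch in plain Rust. It is a toy model only: the `Plan` enum, `sort_only_unsorted_inputs`, `leaves_under_sort`, and `demo` are hypothetical names invented for illustration and are not DataFusion's types or the EnforceSorting implementation. It lists which sources end up beneath a sort in the reported shape versus the expected one.

```rust
// Toy model only: `Plan` and the helpers below are hypothetical and are
// not DataFusion's types or its EnforceSorting implementation.

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// A source whose output may already be ordered (e.g. a sorted file scan).
    Leaf { name: &'static str, ordering: Option<Vec<&'static str>> },
    /// Re-sorts everything produced by `input`.
    Sort { keys: Vec<&'static str>, input: Box<Plan> },
    /// Passes each input through unchanged.
    Union { inputs: Vec<Plan> },
}

impl Plan {
    /// Ordering of this node's output; in this toy model a union is
    /// ordered only if every input reports the same ordering.
    fn ordering(&self) -> Option<Vec<&'static str>> {
        match self {
            Plan::Leaf { ordering, .. } => ordering.clone(),
            Plan::Sort { keys, .. } => Some(keys.clone()),
            Plan::Union { inputs } => {
                let first = inputs.first()?.ordering()?;
                let all_match = inputs
                    .iter()
                    .all(|i| i.ordering().as_ref() == Some(&first));
                if all_match { Some(first) } else { None }
            }
        }
    }

    /// Names of the leaves that sit beneath at least one Sort node,
    /// i.e. the sources whose rows are sorted again at run time.
    fn leaves_under_sort(&self) -> Vec<&'static str> {
        let mut out = Vec::new();
        self.collect(false, &mut out);
        out
    }

    fn collect(&self, under_sort: bool, out: &mut Vec<&'static str>) {
        match self {
            Plan::Leaf { name, .. } => {
                if under_sort {
                    out.push(*name);
                }
            }
            Plan::Sort { input, .. } => input.collect(true, out),
            Plan::Union { inputs } => {
                for input in inputs {
                    input.collect(under_sort, out);
                }
            }
        }
    }
}

/// Build a union in which a sort is added only below the inputs that do
/// not already satisfy `required`.
fn sort_only_unsorted_inputs(inputs: Vec<Plan>, required: &[&'static str]) -> Plan {
    let inputs = inputs
        .into_iter()
        .map(|input| {
            if input.ordering().as_deref() == Some(required) {
                input // already ordered: leave it alone
            } else {
                Plan::Sort { keys: required.to_vec(), input: Box::new(input) }
            }
        })
        .collect();
    Plan::Union { inputs }
}

/// Returns (sources sorted in the reported shape, sources sorted in the expected shape).
fn demo() -> (Vec<&'static str>, Vec<&'static str>) {
    let keys = vec!["tag ASC", "time ASC"];
    let parquet = Plan::Leaf { name: "ParquetExec", ordering: Some(keys.clone()) };
    let batches = Plan::Leaf { name: "RecordBatchesExec", ordering: None };

    // Shape reported in this issue: a single sort above the union.
    let reported = Plan::Sort {
        keys: keys.clone(),
        input: Box::new(Plan::Union { inputs: vec![parquet.clone(), batches.clone()] }),
    };
    // Expected shape: the sort stays below the union, on the unsorted input only.
    let expected = sort_only_unsorted_inputs(vec![parquet, batches], &keys);

    (reported.leaves_under_sort(), expected.leaves_under_sort())
}
```

Calling `demo()` yields both sources for the reported shape and only `RecordBatchesExec` for the expected one, which is the extra work on the already-sorted parquet file that this issue describes.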

Additional context
I found this in the context of upgrading DataFusion in IOx: https://github.com/influxdata/influxdb_iox/pull/6528

Labels: bug (Something isn't working)