Skip to content

A corner bug in union #1818

@xudong963

Description

@xudong963

Describe the bug
When union some select constant1, contant2, ... similar queries, union can't produce the result.

To Reproduce

SELECT 1 as col1, 'one' as col2
UNION ALL
SELECT 2, 'two'
UNION ALL
SELECT 3, 'three';

Plan("UNION ALL schemas are expected to be the same")

❯ SELECT 1 , 'one'
UNION ALL
SELECT 2, 'two'
UNION ALL
SELECT 3, 'three';

Plan("UNION ALL schemas are expected to be the same")

Expected behavior
Let' see results from postgres:

postgres=# SELECT 1 AS column1, 'one' AS column2
UNION ALL
SELECT 2, 'two'
UNION ALL
SELECT 3, 'three';
 column1 | column2
---------+---------
       1 | one
       2 | two
       3 | three
(3 rows)

postgres=# SELECT 1 , 'one'
UNION ALL
SELECT 2, 'two'
UNION ALL
SELECT 3, 'three';
 ?column? | ?column?
----------+----------
        1 | one
        2 | two
        3 | three
(3 rows)

Additional context
I have done an easy analysis, the followings are my thought to solve the issue:

  1. We should fix the default name when the expr in projection doesn't have a name. Let's see an example. select a from table1, a is the column from table1, so it naturally has a name a, but for select 1, 1 isn't a column, it's a Value expr, so its name will be Int64(1) in current datatfusion's implementation, the related code is https://github.com/apache/arrow-datafusion/blob/7be0e268a69ffecbf06823c98ca572733dddb29e/datafusion/src/physical_plan/planner.rs#L91. So the first thing to solve the issue is to rethink how to create default name. I think we can keep consistent with postgres, such as ?column?.
  2. Currently, DF judges if the logical plans in union have the same schema by arrow schema's field name https://github.com/apache/arrow-datafusion/blob/7be0e268a69ffecbf06823c98ca572733dddb29e/datafusion-common/src/dfschema.rs#L251. After finishing 1, we can add a condition, if the col name is ?column?, we decide if logical plans have the same schema by arrow schema's filed datatype.

cc @alamb @houqp

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions