DataFusion 42 does not raise a plan error when UNION branches have a different number of columns #13092

@emanueledomingo

Describe the bug

With DataFusion 41, I get a DataFusion error when a SQL query contains errors. For example, the query

SELECT "Product" FROM "my_table" UNION ALL SELECT "Product", "Id" FROM "my_table"

raises

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
Cell In[4], line 4
      1 query = """
      2 SELECT "Product" FROM "my_table" UNION ALL SELECT "Product", "Id" FROM "my_table"
      3 """
----> 4 df = ctx.sql(query)

File ~/.../site-packages/datafusion/context.py:514, in SessionContext.sql(self, query, options)
    499 """Create a :py:class:`~datafusion.DataFrame` from SQL query text.
    500 
    501 Note: This API implements DDL statements such as ``CREATE TABLE`` and
   (...)
    511     DataFrame representation of the SQL query.
    512 """
    513 if options is None:
--> 514     return DataFrame(self.ctx.sql(query))
    515 return DataFrame(self.ctx.sql_with_options(query, options.options_internal))

Exception: DataFusion error: Plan("Union queries must have the same number of columns, (left is 1, right is 2)")

With the new DataFusion version (42), this no longer happens: .sql() succeeds, and the failure only surfaces later, when I call .collect() on the .sql() result.

To Reproduce

import pyarrow as pa
import datafusion as df
import pyarrow.dataset as pda

t = pa.Table.from_pydict(
    {
        "Id": [0, 0, 0, 0, 0, 1, 1, 1, 1],
        "Product": ["A", "A", "A", "B", "C", "B", "C", "B", "B"],
    },
)

ctx = df.SessionContext()
ctx.register_dataset(name="my_table", dataset=pda.dataset(t))

query = """
SELECT "Product" FROM "my_table" UNION ALL SELECT "Product", "Id" FROM "my_table"
"""
df = ctx.sql(query)  # with DataFusion 41 this raises an error, with 42 it doesn't
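
To compare the behavior across versions, a small helper along these lines could report which step raises. This is only a sketch and not part of DataFusion: `where_error_surfaces` is a hypothetical name, and it assumes nothing beyond the `ctx.sql(...)` and `.collect()` calls already used in this report.

```python
def where_error_surfaces(ctx, query):
    """Return (stage, message) for the first step that raises.

    stage is "sql" if ctx.sql(query) raises, "collect" if the later
    .collect() call raises, and None if neither step raises.
    Hypothetical helper: it only relies on ctx.sql() returning an
    object that has a .collect() method.
    """
    try:
        frame = ctx.sql(query)  # planning step
    except Exception as exc:
        return "sql", str(exc)

    try:
        frame.collect()  # execution step
    except Exception as exc:
        return "collect", str(exc)

    return None, ""
```

With the `ctx` and `query` from the reproduction, `where_error_surfaces(ctx, query)` should report `"sql"` on DataFusion 41 and, given the behavior described above, `"collect"` on DataFusion 42.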

Expected behavior

ctx.sql() should raise the DataFusion plan error, as it did with DataFusion 41.

Additional context

No response

Labels: bug (Something isn't working)
