Closed
Labels
bug (Something isn't working)
Description
Describe the bug
With datafusion 41, I got a DataFusion error from ctx.sql() when a SQL query contained an error.
For example, the query
SELECT "Product" FROM "my_table" UNION ALL SELECT "Product", "Id" FROM "my_table"
raises:
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
Cell In[4], line 4
1 query = """
2 SELECT "Product" FROM "my_table" UNION ALL SELECT "Product", "Id" FROM "my_table"
3 """
----> 4 df = ctx.sql(query)
File ~/.../site-packages/datafusion/context.py:514, in SessionContext.sql(self, query, options)
499 """Create a :py:class:`~datafusion.DataFrame` from SQL query text.
500
501 Note: This API implements DDL statements such as ``CREATE TABLE`` and
(...)
511 DataFrame representation of the SQL query.
512 """
513 if options is None:
--> 514 return DataFrame(self.ctx.sql(query))
515 return DataFrame(self.ctx.sql_with_options(query, options.options_internal))
Exception: DataFusion error: Plan("Union queries must have the same number of columns, (left is 1, right is 2)")
With the new datafusion version (42), this no longer happens: the call to .sql() succeeds and the failure only shows up later, when I run .collect() on the result.
To Reproduce
import pyarrow as pa
import datafusion as df
import pyarrow.dataset as pda
t = pa.Table.from_pydict(
    {
        "Id": [0, 0, 0, 0, 0, 1, 1, 1, 1],
        "Product": ["A", "A", "A", "B", "C", "B", "C", "B", "B"],
    },
)
ctx = df.SessionContext()
ctx.register_dataset(name="my_table", dataset=pda.dataset(t))
query = """
SELECT "Product" FROM "my_table" UNION ALL SELECT "Product", "Id" FROM "my_table"
"""
df = ctx.sql(query)  # with DF 41 this raises an error, with 42 it doesn't
Expected behavior
The DataFusion error should be raised by ctx.sql(), as it was with datafusion 41, rather than later at .collect().
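To make the difference between the two versions easier to check, here is a minimal sketch of a helper that reports which step raises. The name `failing_stage` and its two callable parameters are hypothetical and not part of datafusion; the helper only wraps whatever planning and execution calls are passed to it.

```python
from typing import Any, Callable, Tuple


def failing_stage(
    plan: Callable[[], Any],
    execute: Callable[[Any], Any],
) -> Tuple[str, str]:
    """Report whether an error surfaces while planning or while executing.

    `plan` stands in for `lambda: ctx.sql(query)` and `execute` for
    `lambda frame: frame.collect()`. Both names are hypothetical.
    """
    try:
        frame = plan()
    except Exception as exc:
        # The error is raised as soon as the query is planned.
        return ("plan", str(exc))
    try:
        execute(frame)
    except Exception as exc:
        # Planning succeeded; the error only appears on execution.
        return ("execute", str(exc))
    return ("ok", "")
```

With the `ctx` and `query` from the reproduction script, it could be called as `failing_stage(lambda: ctx.sql(query), lambda frame: frame.collect())`. Going by the behavior described in this report, I would expect the first element to be "plan" with datafusion 41 and "execute" with 42, but I have not run the helper against both versions.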
Additional context
No response