Creating dataframe with Recordbatch using pyarrow.Table.to_batches gives "type16 not valid error" when schema includes date32[day] type #949

@preetijoshi-womply

Describe the bug
I want to create a datafusion dataframe from an in-memory pyarrow table that has a date32[day] field. Doing so fails with a schema error saying that type 16 is not supported.

To Reproduce
import datetime

import datafusion
import pyarrow

ctx = datafusion.ExecutionContext()

# Column "a" holds integers; pyarrow infers date32[day] for column "b".
batch = pyarrow.RecordBatch.from_arrays(
    [
        pyarrow.array([1, 2, 3]),
        pyarrow.array([datetime.date(1970, 1, day) for day in (1, 2, 3)]),
    ],
    names=["a", "b"],
)

# This call fails with the reported schema error.
df = ctx.create_dataframe([[batch]])

Expected behavior
A dataframe should be created whether or not the schema includes a date column.

Additional context
pa_table.schema gives the following:
a: string
b: date32[day]
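
A possible stopgap, sketched here rather than confirmed: Arrow's date32[day] stores each date as a count of days since 1970-01-01, so dates can be converted to plain integers without losing information. The example below uses only the standard library. The helper names `date_to_days` and `days_to_date` are hypothetical, and the sketch assumes, without confirmation from this issue, that an integer column built from these values would be accepted by create_dataframe.

```python
import datetime

# date32[day] counts whole days from the Unix epoch.
EPOCH = datetime.date(1970, 1, 1)


def date_to_days(d: datetime.date) -> int:
    """Return the number of days between the Unix epoch and d."""
    return (d - EPOCH).days


def days_to_date(n: int) -> datetime.date:
    """Inverse of date_to_days: rebuild the date from a day count."""
    return EPOCH + datetime.timedelta(days=n)


# Same dates as column "b" in the reproduction.
dates = [
    datetime.date(1970, 1, 1),
    datetime.date(1970, 1, 2),
    datetime.date(1970, 1, 3),
]

# Integer values that could stand in for the date column.
days = [date_to_days(d) for d in dates]
print(days)
```

The integers can be turned back into dates with `days_to_date` after querying, at the cost of losing date semantics inside the query itself.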

Labels: bug
