Add casting from arbitrary arrow types #5016

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
While porting some tests to sqllogictest I want to ensure the same coverage for specific types.

For example

https://github.com/apache/arrow-datafusion/blob/350cb47289a76e579b221fe374e4cf09db332569/datafusion/core/tests/sql/mod.rs#L1233

It is easy to cast to some SQL datatype:

-- casts `x` to `bigint` which maps to DataType::Int64
select x::bigint from foo;
-- casts `x` to timestamp (DataType::Timestamp)
select x::timestamp from foo;

However, it is not possible to use SQL to name certain Arrow types (e.g. `TimestampMicrosecond`, as `TIMESTAMP` maps to `TimestampNanos`).

I would like a way to convert an expression to an arbitrary arrow type.

Describe the solution you'd like
I would love to do something like

arrow_cast(source, target_type)

Where target_type is a string that describes an Arrow type.

For example:

--  casts x to an Int8 (which I don't think is possible in SQL)
select arrow_cast(x, 'Int8') from foo;
--  casts x to a LargeBinary
select arrow_cast(x, 'LargeBinary') from foo;

I would like the values accepted as target_type to be the same as those returned by the arrow_typeof() function, which goes in the other direction (expression to a string that represents the Arrow type):

❯ select arrow_typeof('5');
+------------------------+
| arrowtypeof(Utf8("5")) |
+------------------------+
| Utf8                   |
+------------------------+
1 row in set. Query took 0.028 seconds.
❯ select arrow_typeof(5);
+-----------------------+
| arrowtypeof(Int64(5)) |
+-----------------------+
| Int64                 |
+-----------------------+
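
To make the round-trip requirement concrete, here is a minimal sketch in Rust. It assumes a hypothetical `ArrowType` enum standing in for a small subset of Arrow's `DataType`; the enum, its variants, and the `round_trips` helper are illustrative only and are not the actual arrow-rs or DataFusion API. The point is only that the string printed for a type (what arrow_typeof() shows) should parse back to that same type (what arrow_cast() would accept).

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for a small subset of Arrow's `DataType`.
/// Not the real arrow-rs type; used only to illustrate the round trip.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArrowType {
    Int8,
    Int64,
    Utf8,
    LargeBinary,
}

impl fmt::Display for ArrowType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The variant name doubles as the printed type name,
        // mirroring what arrow_typeof() prints for these types.
        write!(f, "{:?}", self)
    }
}

impl FromStr for ArrowType {
    type Err = String;

    /// Parses the string form back into a type, i.e. what the proposed
    /// arrow_cast(source, target_type) would need for its second argument.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim() {
            "Int8" => Ok(ArrowType::Int8),
            "Int64" => Ok(ArrowType::Int64),
            "Utf8" => Ok(ArrowType::Utf8),
            "LargeBinary" => Ok(ArrowType::LargeBinary),
            other => Err(format!("unsupported Arrow type name: '{other}'")),
        }
    }
}

/// Returns true when `name` parses to a type whose printed form is `name`.
fn round_trips(name: &str) -> bool {
    match name.parse::<ArrowType>() {
        Ok(parsed) => parsed.to_string() == name,
        Err(_) => false,
    }
}
```

With this sketch, `round_trips("Int8")` and `round_trips("LargeBinary")` hold, while a SQL type name such as `"bigint"` is rejected because it is not an Arrow type name. A real implementation would also have to handle parameterized types such as timestamps with a unit and timezone, which this sketch does not attempt.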

Describe alternatives you've considered

It would be nice to have some way to specify Arrow types in SQL datatype syntax, perhaps like CAST X as CUSTOM TYPE 'Int64'. This would allow using functions such as try_cast, as well as creating tables with specific column types.

However, I am not really sure how to do this in the parser.

Additional context
See #4460 for more details