Migration of ORC Backed Timestamp Without Zone Tables yields Timestamp With Zone columns #2245

@RussellSpitzer

A user was attempting to convert an ORC-backed external table in Hive to an Iceberg table using the migrate command but was immediately met with a "Can not promote TIMESTAMP type to TIMESTAMP" error. This occurs because our Spark -> Iceberg conversion code always converts to Timestamp.withZone(), even though the existing ORC files hold timestamp-without-zone data.

This is pretty easy to reproduce:

spark.sql("CREATE EXTERNAL TABLE mytable (foo timestamp) STORED AS orc LOCATION '/Users/russellspitzer/Temp/foo'")

spark.sql("INSERT INTO mytable VALUES (now())")

spark.sql("CALL spark_catalog.system.migrate('mytable')")


spark.sql("SELECT * FROM mytable")
java.lang.IllegalArgumentException: Can not promote TIMESTAMP type to TIMESTAMP
	at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:441)
	at org.apache.iceberg.orc.ORCSchemaUtil.buildOrcProjection(ORCSchemaUtil.java:301)
	at org.apache.iceberg.orc.ORCSchemaUtil.buildOrcProjection(ORCSchemaUtil.java:275)
	at org.apache.iceberg.orc.ORCSchemaUtil.buildOrcProjection(ORCSchemaUtil.java:258)
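
The message reads oddly because both sides print as TIMESTAMP. A toy model of what seems to be going on, sketched in plain Python: it is not Iceberg code, every name in it is hypothetical, and it assumes that Iceberg maps timestamp-with-zone to ORC's TIMESTAMP_INSTANT and timestamp-without-zone to ORC's TIMESTAMP, while both Iceberg variants share the single type id TIMESTAMP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IcebergTimestamp:
    """Hypothetical stand-in for Iceberg's timestamp type (not the real class)."""
    with_zone: bool

    @property
    def type_id(self) -> str:
        # Assumption: both variants share one type id, so the zone flag
        # does not show up when the type id is printed in an error message.
        return "TIMESTAMP"


def spark_to_iceberg(spark_type: str) -> IcebergTimestamp:
    """Models the behavior described in this issue: always pick with-zone."""
    if spark_type != "timestamp":
        raise ValueError(f"toy model only handles timestamp, got {spark_type}")
    return IcebergTimestamp(with_zone=True)


def expected_orc_category(iceberg_type: IcebergTimestamp) -> str:
    # Assumed mapping: with zone -> TIMESTAMP_INSTANT, without zone -> TIMESTAMP.
    return "TIMESTAMP_INSTANT" if iceberg_type.with_zone else "TIMESTAMP"


def check_projection(file_orc_category: str, iceberg_type: IcebergTimestamp) -> None:
    """Raises if the ORC file's column category does not match the table type."""
    if file_orc_category != expected_orc_category(iceberg_type):
        raise ValueError(
            f"Can not promote {file_orc_category} type to {iceberg_type.type_id}"
        )


def try_read(file_orc_category: str, iceberg_type: IcebergTimestamp) -> str:
    """Returns 'ok' or the error message, for easy inspection."""
    try:
        check_projection(file_orc_category, iceberg_type)
        return "ok"
    except ValueError as err:
        return str(err)


if __name__ == "__main__":
    # Hive wrote the files with a plain ORC TIMESTAMP column.
    migrated = spark_to_iceberg("timestamp")
    print(try_read("TIMESTAMP", migrated))
    print(try_read("TIMESTAMP", IcebergTimestamp(with_zone=False)))
```

Under those assumptions the first call fails with the same wording as the trace above, and the second succeeds, which suggests that the migrated schema should carry the without-zone variant for these files.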

I attempted to just modify the Iceberg table so that its columns are correctly typed as Timestamp.withoutZone(), but this leads to #2244.
