Closed
Description
I am trying to use SparkTableUtil.importSparkTable to import a Hive table (stored as ORC) as an Iceberg table, but the import fails with the following error:
Exception in thread "main" java.lang.IllegalArgumentException: ORC schema does not contain Iceberg IDs
at org.apache.iceberg.orc.ORCSchemaUtil.convert(ORCSchemaUtil.java:221)
at org.apache.iceberg.orc.OrcMetrics.buildOrcMetrics(OrcMetrics.java:100)
at org.apache.iceberg.orc.OrcMetrics.fromInputFile(OrcMetrics.java:83)
at org.apache.iceberg.orc.OrcMetrics.fromInputFile(OrcMetrics.java:78)
at org.apache.iceberg.spark.SparkTableUtil.lambda$listOrcPartition$8(SparkTableUtil.java:407)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at org.apache.iceberg.spark.SparkTableUtil.listOrcPartition(SparkTableUtil.java:421)
at org.apache.iceberg.spark.SparkTableUtil.listPartition(SparkTableUtil.java:326)
at org.apache.iceberg.spark.SparkTableUtil.importUnpartitionedSparkTable(SparkTableUtil.java:545)
at org.apache.iceberg.spark.SparkTableUtil.importSparkTable(SparkTableUtil.java:519)
Because the ORC data files were written by Hive, they do not carry Iceberg field IDs, so they are rejected by the change in #1140. Is my understanding correct?
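To make the suspected failure mode concrete, here is a minimal, self-contained sketch of the kind of check that would produce this error. It is not Iceberg's actual code: the `Node` class is a hypothetical stand-in for an ORC type description, and the attribute name `iceberg.id` is an assumption about how field IDs are stored in the ORC schema. Only the exception message is taken from the stack trace above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IcebergIdCheckSketch {

    // Hypothetical stand-in for an ORC schema node; NOT org.apache.orc.TypeDescription.
    static final class Node {
        final String name;
        final Map<String, String> attributes = new HashMap<>();
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node attr(String key, String value) {
            attributes.put(key, value);
            return this;
        }

        Node child(Node c) {
            children.add(c);
            return this;
        }
    }

    // Assumed attribute name under which a field ID would be recorded.
    static final String ID_ATTRIBUTE = "iceberg.id";

    // Returns true if this node or any descendant carries the ID attribute.
    static boolean hasIds(Node node) {
        if (node.attributes.containsKey(ID_ATTRIBUTE)) {
            return true;
        }
        for (Node c : node.children) {
            if (hasIds(c)) {
                return true;
            }
        }
        return false;
    }

    // Mirrors the observed behaviour: a schema with no IDs at all is rejected.
    static void requireIds(Node schema) {
        if (!hasIds(schema)) {
            throw new IllegalArgumentException("ORC schema does not contain Iceberg IDs");
        }
    }

    public static void main(String[] args) {
        // A file written by Hive: plain columns, no ID attributes.
        Node hiveSchema = new Node("struct")
            .child(new Node("id"))
            .child(new Node("data"));

        // A file written through Iceberg: every column tagged with an ID.
        Node icebergSchema = new Node("struct")
            .child(new Node("id").attr(ID_ATTRIBUTE, "1"))
            .child(new Node("data").attr(ID_ATTRIBUTE, "2"));

        requireIds(icebergSchema); // passes silently

        try {
            requireIds(hiveSchema);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

If this model is accurate, any ORC file produced outside Iceberg would hit the same rejection during import, regardless of its contents, unless the importer assigns IDs by some other means such as matching columns by name.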