Description
I wanted to see if I could implement #897 for ORC. Given how ORC value readers are implemented for Iceberg-generics and Iceberg-spark, we have to do extra work: we will have to traverse the Iceberg schema in SparkOrcReader, just like we currently do in GenericOrcReader. This traversal of the Iceberg schema is duplicate work that can probably be abstracted into a visitor, just as we do with AvroSchemaWithTypeVisitor.
This is a big refactoring change, but overall it will improve readability and remove duplicate traversal code. We don't absolutely need to do it, though: we can instead traverse the Iceberg schema in SparkOrcReader and have an abstract StructReader that both the Spark and data ORC readers share.
I'm +1 for doing it, but before starting I wanted to see if folks are OK with it. @rdblue, @shardulm94, what do you think?
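
To make the visitor idea a bit more concrete, here is a rough, self-contained sketch of the shape I have in mind. It is not the actual Iceberg or ORC API: `Node` is a hypothetical stand-in for both the Iceberg type and the ORC type, `Visitor` and `DescribeReaders` are made-up names, and fields are paired by name purely to keep the example short. The point is only that the paired traversal lives in one place (`visit`), and each engine supplies its own callbacks.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these classes exist in Iceberg or ORC.
final class OrcVisitorSketch {

  /** Toy schema node used for both the expected (Iceberg) side and the file (ORC) side. */
  static final class Node {
    final String primitiveName;                      // null when this node is a struct
    final Map<String, Node> fields = new LinkedHashMap<>();

    private Node(String primitiveName) { this.primitiveName = primitiveName; }

    static Node primitive(String name) { return new Node(name); }
    static Node struct() { return new Node(null); }
    Node field(String name, Node type) { fields.put(name, type); return this; }
    boolean isStruct() { return primitiveName == null; }
  }

  /** Walks the file schema and hands each node to the callbacks with its expected counterpart. */
  abstract static class Visitor<T> {
    abstract T struct(Node expected, Node file, List<String> names, List<T> fieldResults);
    abstract T primitive(Node expected, Node file);

    static <T> T visit(Node expected, Node file, Visitor<T> visitor) {
      if (file.isStruct()) {
        List<String> names = new ArrayList<>();
        List<T> results = new ArrayList<>();
        for (Map.Entry<String, Node> entry : file.fields.entrySet()) {
          // Assumption: pair fields by name; expected is null when the field is not projected.
          Node expectedField = expected != null ? expected.fields.get(entry.getKey()) : null;
          names.add(entry.getKey());
          results.add(visit(expectedField, entry.getValue(), visitor));
        }
        return visitor.struct(expected, file, names, results);
      }
      return visitor.primitive(expected, file);
    }
  }

  /** Example callbacks: instead of building readers, describe what would be built. */
  static final class DescribeReaders extends Visitor<String> {
    private final String engine;

    DescribeReaders(String engine) { this.engine = engine; }

    @Override
    String struct(Node expected, Node file, List<String> names, List<String> fieldResults) {
      StringBuilder sb = new StringBuilder(engine).append("Struct(");
      for (int i = 0; i < names.size(); i++) {
        if (i > 0) {
          sb.append(", ");
        }
        sb.append(names.get(i)).append('=').append(fieldResults.get(i));
      }
      return sb.append(')').toString();
    }

    @Override
    String primitive(Node expected, Node file) {
      String expectedName = expected != null ? expected.primitiveName : "unprojected";
      return file.primitiveName + "->" + expectedName;
    }
  }

  static String describe(String engine, Node expected, Node file) {
    return Visitor.visit(expected, file, new DescribeReaders(engine));
  }
}
```

With something like this, the Spark and generic readers would each be one `Visitor` subclass, and neither would need its own schema traversal. Calling `describe("Spark", expected, file)` and `describe("Generic", expected, file)` on the same pair of schemas runs the same walk with different callbacks.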