Apache Iceberg version
main (development)
Please describe the bug 🐞
PR to reproduce: puchengy@6808149
pyarrow.lib.ArrowInvalid: No match for FieldRef.Name(letter/abc) in letter_x2Fabc: string not null
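The issue seems to be that when Spark writes Parquet files, column names are sanitized to follow the Avro spec, so the name stored in the file (letter_x2Fabc) no longer matches the Iceberg column name (letter/abc).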
I think we should make some changes in the _task_to_table function:
- Use the sanitized column names for projection when interacting with pyarrow, since those are the actual names stored in the Parquet files.
- Before returning the pyarrow table, rename the columns back to the correct Iceberg column names based on the Parquet field IDs.
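
The two steps above could be sketched roughly as follows. This is only an illustration, not the actual pyiceberg code: `sanitize_name` and `restore_names` are hypothetical helpers, and the escaping rule (`_x` plus the uppercase hex code point, or a leading `_` before a digit) is an assumption inferred from the `letter/abc` to `letter_x2Fabc` mapping in the error above and the Avro naming rule `[A-Za-z_][A-Za-z0-9_]*`.

```python
from typing import Dict, List, Tuple


def _escape(ch: str) -> str:
    # Assumed escaping rule: a digit gets a leading underscore,
    # anything else becomes _x<uppercase hex code point>.
    if ch.isdigit():
        return "_" + ch
    return "_x" + format(ord(ch), "X")


def sanitize_name(name: str) -> str:
    """Hypothetical helper: map an Iceberg column name to the name in the file."""
    if not name:
        return name
    first = name[0]
    # The first character must be an ASCII letter or underscore.
    first_ok = first == "_" or (first.isascii() and first.isalpha())
    out = [first if first_ok else _escape(first)]
    # Later characters may also be ASCII digits.
    for ch in name[1:]:
        ok = ch == "_" or (ch.isascii() and ch.isalnum())
        out.append(ch if ok else _escape(ch))
    return "".join(out)


def restore_names(
    file_columns: List[Tuple[str, int]],
    iceberg_names_by_id: Dict[int, str],
) -> List[str]:
    """Hypothetical helper: look up the Iceberg name for each (file name, field ID)."""
    return [iceberg_names_by_id[field_id] for _, field_id in file_columns]


# Step 1: project using the sanitized name that is stored in the file.
projected = sanitize_name("letter/abc")

# Step 2: map the columns read from the file back to Iceberg names by field ID.
restored = restore_names([(projected, 1)], {1: "letter/abc"})
```

The list returned by `restore_names` could then be passed to pyarrow's `Table.rename_columns` before the table is returned. Matching on field ID rather than reversing the escaping avoids ambiguity when an Iceberg column name legitimately contains a sequence like `_x2F`.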