[Python] Prevent corrupting files with Multiple matches for FieldRef.Name

**Version**: pyarrow 9.0.0

**Description**

Users can add a column with the same name as an existing column to a table via `pyarrow.Table.add_column()`.

That table can then be written to a Parquet file with `pyarrow.parquet.write_table()`.

However, the written file cannot be read back with `pyarrow.parquet.read_table()` because it contains multiple columns with the same name.

Flagging this as a bug because I believe anything that is successfully written by `write_table()` should be readable by `read_table()`.

**Minimum reproducible example**

```
>>> import pyarrow.parquet as pq
>>> import pyarrow as pa
>>> t = pa.Table.from_pydict({'a': [1,2,3]})
>>> pq.write_table(t.add_column(0, 'a', pa.array([1.1,2.2,3.3])), 'test.parquet')
>>> pq.read_table('test.parquet')
pyarrow.lib.ArrowInvalid: Multiple matches for FieldRef.Name(a) in a: double
a: int64
__fragment_index: int32
__batch_index: int32
__last_in_fragment: bool
__filename: string
```
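
For illustration, a caller-side guard along these lines could catch the problem before anything is written. This is only a sketch: `duplicate_column_names` and `ensure_unique_column_names` are hypothetical helpers, not part of pyarrow, and they operate on a plain list of names such as the one returned by `Table.column_names`.

```python
from collections import Counter


def duplicate_column_names(names):
    """Return the names that occur more than once, in first-seen order.

    Hypothetical helper (not a pyarrow API). `names` is any iterable of
    column-name strings, e.g. the value of `table.column_names`.
    """
    counts = Counter(names)
    return [name for name, count in counts.items() if count > 1]


def ensure_unique_column_names(names):
    """Raise ValueError if any column name is repeated.

    Hypothetical guard a caller might run before `pq.write_table(...)`.
    """
    duplicates = duplicate_column_names(names)
    if duplicates:
        raise ValueError(f"Duplicate column names: {duplicates}")
```

Calling `ensure_unique_column_names(table.column_names)` on the table from the example above, before `pq.write_table(...)`, would raise instead of producing a file that `read_table()` rejects.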

**Environment**: macOS, Python 3.10.3
**Reporter**: [Grayden Shand](https://issues.apache.org/jira/browse/ARROW-17388)
**Assignee**: [Miles Granger](https://issues.apache.org/jira/browse/ARROW-17388) / @milesgranger
#### Related issues:
- [[C++][Dataset] Handling of duplicate columns in Dataset factory and scanning](https://github.com/apache/arrow/issues/24407) (relates to)
#### PRs and other links:
- [GitHub Pull Request #13938](https://github.com/apache/arrow/pull/13938)

<sub>**Note**: *This issue was originally created as [ARROW-17388](https://issues.apache.org/jira/browse/ARROW-17388). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>
