[Python] Can not refer to field in a list of structs 

When a dataset has a nested struct column of type `list<struct>`, we cannot use `pyarrow.field(..)` to reference a sub-field of the struct.

 

For example:

 
```python

import pyarrow as pa
import pyarrow.dataset as ds
import pandas as pd

schema = pa.schema(
    [
        pa.field(
            "objects",
            pa.list_(
                pa.struct(
                    [
                        pa.field("name", pa.utf8()),
                        pa.field("attr1", pa.float32()),
                        pa.field("attr2", pa.int32()),
                    ]
                )
            ),
        )
    ]
)

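# Note: `schema` above is not passed to `from_pandas`, so the column types
# are inferred from the data (double / int64, as shown in the error below).
table = pa.Table.from_pandas(
    pd.DataFrame([{"objects": [{"name": "a", "attr1": 5.0, "attr2": 20}]}])
)
print(table)

dataset = ds.dataset(table)
print(dataset)
# Selecting a sub-field of the struct inside the list raises ArrowInvalid.
dataset.scanner(columns=["objects.attr2"]).to_table()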
```

This raises the following exception:

```

Traceback (most recent call last):
  File "foo.py", line 31, in <module>
    dataset.scanner(columns=["objects.attr2"]).to_table()
  File "pyarrow/_dataset.pyx", line 298, in pyarrow._dataset.Dataset.scanner
  File "pyarrow/_dataset.pyx", line 2356, in pyarrow._dataset.Scanner.from_dataset
  File "pyarrow/_dataset.pyx", line 2202, in pyarrow._dataset._populate_builder
  File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: No match for FieldRef.Name(objects.attr2) in objects: list<item: struct<attr1: double, attr2: int64, name: string>>
__fragment_index: int32
__batch_index: int32
__last_in_fragment: bool
__filename: string
```


**Reporter**: [Lei (Eddy) Xu](https://issues.apache.org/jira/browse/ARROW-17540)
#### Related issues:
- [[Python] parquet.read_table nested fields in columns does not work for use_legacy_dataset=False](https://github.com/apache/arrow/issues/30143) (relates to)

<sub>**Note**: *This issue was originally created as [ARROW-17540](https://issues.apache.org/jira/browse/ARROW-17540). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>
