[C++] FileSystemDataset FilenamePartitioning error - fsspec filesystem #32471

Unless this is user error (which it may well be!), Dataset FilenamePartitioning does not appear to work on read with an fsspec filesystem. From what I can tell, the filenames parse successfully when passed to the parse() method, but the fields are not extracted from the filenames passed to dataset(); they appear as nulls instead. When I try the partitioning discover() method (assuming this is a reasonable thing to try), I get the traceback below. (Repro Python script attached.)

Traceback (most recent call last):
  File "/zip_of_csvs_test.py", line 82, in <module>
    ds_partitioned = pds.dataset(
  File "/.pyenv/versions/3.8.2/lib/python3.8/site-packages/pyarrow/dataset.py", line 697, in dataset
    return _filesystem_dataset(source, \*\*kwargs)
  File "/.pyenv/versions/3.8.2/lib/python3.8/site-packages/pyarrow/dataset.py", line 449, in _filesystem_dataset
    return factory.finish(schema)
  File "pyarrow/_dataset.pyx", line 1857, in pyarrow._dataset.DatasetFactory.finish
  File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: No non-null segments were available for field 'frequency'; couldn't infer type
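
For readers unfamiliar with the feature, the following is a minimal plain-Python sketch of the mapping that filename partitioning is expected to perform: the file's basename is split on underscores, and the leading segments are assigned to the partition fields in order. It does not call pyarrow and is not Arrow's implementation. The helper name `parse_filename_segments`, the second field name `year`, and the example paths are hypothetical; only the field name `frequency` comes from the error message above.

```python
import posixpath


def parse_filename_segments(path, field_names, sep="_"):
    """Approximate how filename partitioning maps a file name to fields.

    Hypothetical helper for illustration only; not part of pyarrow.
    """
    # Only the basename matters; directory components are ignored.
    basename = posixpath.basename(path)
    segments = basename.split(sep)
    # The text after the last separator is the remainder of the file name,
    # not a partition value, so it is excluded from the field mapping.
    prefix_segments = segments[:-1]
    values = {}
    for i, name in enumerate(field_names):
        # A field with no matching segment ends up as None (a null).
        values[name] = prefix_segments[i] if i < len(prefix_segments) else None
    return values


# Hypothetical file names, assuming a layout like <frequency>_<year>_<rest>.csv
expected = parse_filename_segments("archive/daily_2020_data.csv", ["frequency", "year"])
all_null = parse_filename_segments("archive/data.csv", ["frequency", "year"])
```

In this sketch, `expected` holds a value for every field, while `all_null` holds only `None` values. The second case mirrors the symptom reported here: if no file yields a non-null segment for `frequency`, there is nothing from which to infer the field's type.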

**Reporter**: [Adam Kirby](https://issues.apache.org/jira/browse/ARROW-17174)
#### Related issues:
- [[C++] Null values in partitioning field for FilenamePartitioning](https://github.com/apache/arrow/issues/31689) (is fixed by)
#### Original Issue Attachments:
- [zip_of_csvs_test.py](https://issues.apache.org/jira/secure/attachment/13047096/zip_of_csvs_test.py)

<sub>**Note**: *This issue was originally created as [ARROW-17174](https://issues.apache.org/jira/browse/ARROW-17174). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>
