[C++][Python][Dataset] Provide an option to toggle validation and schema inference in FileSystemDatasetFactoryOptions #24271

Validating files and inferring the schema during dataset discovery can be costly and is not always necessary.

At the same time, we could move file validation into the scan tasks. Currently every file is inspected as the dataset is constructed, which can be expensive if the filesystem is slow. Validation would then run on every scan rather than once, but the check will be cheap because at scan time we'll be reading the file into memory anyway.
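
To make the trade-off concrete, here is a minimal Python sketch of the two modes. It is not Arrow's API: `DiscoveryOptions`, `validate_files`, `discover`, `scan`, and the toy magic-byte check are all hypothetical names invented for illustration, and real format inspection does more than compare four bytes.

```python
import os
import tempfile
from dataclasses import dataclass

# Toy format check modeled on Parquet's leading magic bytes (assumption:
# a real validator would inspect more than this).
MAGIC = b"PAR1"


@dataclass
class DiscoveryOptions:
    """Hypothetical stand-in for FileSystemDatasetFactoryOptions."""
    validate_files: bool = True  # inspect every file during discovery


def is_valid(path):
    # Read only the leading bytes; still costs one open per file.
    with open(path, "rb") as f:
        return f.read(len(MAGIC)) == MAGIC


def discover(directory, options):
    """Return candidate paths, optionally validating each one up front."""
    paths = sorted(os.path.join(directory, n) for n in os.listdir(directory))
    if options.validate_files:
        # Eager mode: an extra filesystem round trip per file at construction.
        paths = [p for p in paths if is_valid(p)]
    return paths


def scan(paths):
    """Yield (path, payload) pairs, validating each file as it is read."""
    for path in paths:
        with open(path, "rb") as f:
            data = f.read()  # the scan reads the whole file anyway
        if not data.startswith(MAGIC):
            raise ValueError(f"{path} is not a valid data file")
        yield path, data[len(MAGIC):]
```

With `validate_files=False`, discovery only lists the directory, and an invalid file surfaces as an error when the scan reaches it rather than being filtered out at construction time.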

**Reporter**: [Ben Kietzman](https://issues.apache.org/jira/browse/ARROW-8058) / @bkietz
**Assignee**: [Francois Saint-Jacques](https://issues.apache.org/jira/browse/ARROW-8058) / @fsaintjacques
#### Related issues:
- [[C++][Dataset] Revisit File discovery failure mode](https://github.com/apache/arrow/issues/23918) (relates to)
#### PRs and other links:
- [GitHub Pull Request #6687](https://github.com/apache/arrow/pull/6687)

<sub>**Note**: *This issue was originally created as [ARROW-8058](https://issues.apache.org/jira/browse/ARROW-8058). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>
