Skip to content

[C++] Add parallelism to streaming CSV reader #27731

@asfimport

Description

@asfimport

Currently the streaming CSV reader does not allow for much parallelism.  It doesn't allow for reading more than one segment at once (useful in S3) and it doesn't allow for column fan-out for parsing & converting.

It seems both of these options would speed up CSV reading in some scenarios although it's possible this is mostly mitigated in cases where there are many more files than cores (as per-file parallelism will occupy all the cores anyways).

Reporter: Weston Pace / @westonpace
Assignee: Weston Pace / @westonpace

PRs and other links:

Note: This issue was originally created as ARROW-11889. Please see the migration documentation for further details.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions