Description
Expose ARROW-13028 to the R bindings so that users can choose to have read_csv_arrow(as_data_frame = FALSE) and open_dataset(format = "csv") infer 32-bit integer fields instead of always inferring 64-bit integer fields for all integers (the current behavior).
Note that an existing option in the R bindings controls something similar: arrow.int64_downcast. See ARROW-10093 for details. I think we cannot reuse this option to control the CSV reader behavior because (a) users might want to control these behaviors separately, and (b) the default value of arrow.int64_downcast is TRUE, which does not align with the existing behavior of the CSV reader (always inferring 64-bit integer fields, i.e. not downcasting), and we probably want to retain that as the default behavior. So we will want to add a new argument or a new option to control this.
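For reference, a minimal sketch of the current behavior, assuming the arrow R package is installed. The temporary file and the column name x are made up for illustration. The second call is only a per-column workaround using the existing col_types argument; it is not the inference option this issue asks for.

```r
library(arrow)

# Hypothetical one-column CSV holding small integers
tf <- tempfile(fileext = ".csv")
writeLines(c("x", "1", "2", "3"), tf)

# Current behavior: integer columns are inferred as int64
tab <- read_csv_arrow(tf, as_data_frame = FALSE)
print(tab$schema)

# Workaround available today: declare the type explicitly per column.
# This requires knowing the column names up front, which is what an
# inference option would avoid.
tab32 <- read_csv_arrow(
  tf,
  col_types = schema(x = int32()),
  as_data_frame = FALSE
)
print(tab32$schema)
```

The workaround does not scale to wide files or to datasets whose columns are not known in advance, which is part of the motivation for exposing the C++ option.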
Reporter: Ian Cook / @ianmcook
Related issues:
- [C++] unify_schemas can't handle int64 + double, affects CSV dataset (is related to)
- [C++] CSV add convert option to attempt 32bit number inferences (depends upon)
Note: This issue was originally created as ARROW-14528. Please see the migration documentation for further details.