[R] collect int64, uint32, uint64 as R integer type if not out of bounds #25198

@asfimport

bit64::integer64 can be awkward to work with in R (one example: #7385). In Arrow we often get int64 results from compute methods or other translation methods that auto-promote to the largest integer type, even though the values would fit fine in a 32-bit integer, which is R's native integer type.

When calling Array__as_vector on an int64, we could first call the minmax function on the array, and if the extrema are within the range of a 32-bit int, return a regular R integer vector. This would add a little bit of ambiguity as to what R type you'll get from an Arrow type, but I wonder if the benefits are worth it since you can't do much with an integer64 in R. (We could also make this optional, similar to ARROW-7657, so you could specify a "strict" mode if you are in a use case where roundtrip fidelity is more important than R usability.)
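The range check described above can be sketched in standalone C++. This is only an illustration under stated assumptions, not the actual `Array__as_vector` code: `TryNarrowToRInteger` and its `strict` flag are hypothetical names, a plain `std::vector<int64_t>` stands in for an Arrow array, and nulls are ignored. The one R-specific detail is that R reserves `INT_MIN` for `NA_integer_`, so the usable lower bound is one above it.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// R stores NA_integer_ as INT_MIN, so that value is excluded from the usable range.
constexpr int64_t kRIntMin =
    static_cast<int64_t>(std::numeric_limits<int32_t>::min()) + 1;
constexpr int64_t kRIntMax = std::numeric_limits<int32_t>::max();

// Hypothetical helper (not part of Arrow): returns the values narrowed to
// 32 bits if all of them fit in an R integer, or std::nullopt if the caller
// should fall back to integer64. `strict` models the optional mode that
// always preserves the 64-bit type for roundtrip fidelity.
std::optional<std::vector<int32_t>> TryNarrowToRInteger(
    const std::vector<int64_t>& values, bool strict = false) {
  if (strict) return std::nullopt;

  if (!values.empty()) {
    // One pass to find both extrema, mirroring a min/max kernel call.
    auto [lo, hi] = std::minmax_element(values.begin(), values.end());
    if (*lo < kRIntMin || *hi > kRIntMax) return std::nullopt;
  }

  std::vector<int32_t> out;
  out.reserve(values.size());
  for (int64_t v : values) out.push_back(static_cast<int32_t>(v));
  return out;
}
```

The extra pass over the data is the cost of this approach; the caller pays it once per conversion and only copies into 32-bit storage when the check succeeds.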

Likewise, uint32 and uint64 could be returned as R integers when the values are in range, avoiding the conversion to double that is currently implemented.
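For the unsigned case only the upper bound matters. A minimal sketch, again with hypothetical names (`RTarget`, `ChooseRTypeForUnsigned`) and a plain vector standing in for an Arrow array; the fallback to double reflects the current behaviour described above:

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical tag for the R vector type the conversion would produce.
enum class RTarget { Integer, Double };

// Hypothetical helper (not part of Arrow): choose an R integer vector when
// every unsigned value fits in a signed 32-bit integer, otherwise keep the
// existing conversion to double. uint32 input can be widened to uint64_t
// before calling this.
RTarget ChooseRTypeForUnsigned(const std::vector<uint64_t>& values) {
  constexpr uint64_t kRIntMax =
      static_cast<uint64_t>(std::numeric_limits<int32_t>::max());
  for (uint64_t v : values) {
    if (v > kRIntMax) return RTarget::Double;  // out of range for R integer
  }
  return RTarget::Integer;
}
```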

Reporter: Neal Richardson / @nealrichardson
Assignee: Neal Richardson / @nealrichardson

Related issues:

PRs and other links:

Note: This issue was originally created as ARROW-9083. Please see the migration documentation for further details.
