as_vector() downgrades int64 even when arrow.int64_downcast = TRUE #30065

@asfimport

Description

Calling as_vector() on a Table or Array of type Int64 with arrow.int64_downcast = TRUE still downcasts the result to an R integer vector, unless the data contains a value larger than Int32 can store (the switch actually seems to happen at some lower value; my guess is that it is the integer-to-numeric switchover in R).

library(arrow)
options(arrow.int64_downcast = TRUE)
int64s <- bit64::as.integer64(c(1:10, 101:110, 201:205))
y <- Array$create(int64s)
y$type
yv <- y$as_vector()
class(yv)
int64s <- c(int64s, bit64::as.integer64("68719476735")) # 0xF FFFFFFFF
y <- Array$create(int64s)
y$type
yv <- y$as_vector()
class(yv)
Outputs:
Int64
int64
[1] "integer"
Int64
int64
[1] "integer64"
This can cause an unexpected overflow:
int64s <- bit64::as.integer64(c(1:10, 101:110, 201:205, 268435454, 2147483632)) # 0xFFFFFFE, 0x7FFFFFF0
cumsum(int64s)
y <- Array$create(int64s)
y$type
yv <- y$as_vector()
class(yv)
cumsum(yv)
as shown in the output of the second cumsum:
integer64
[1] 1 3 6 10 15 21
[7] 28 36 45 55 156 258
[13] 361 465 570 676 783 891
[19] 1000 1110 1311 1513 1716 1920
[25] 2125 268437579 2415921211
Int64
int64
[1] "integer"
[1] 1 3 6 10 15 21 28
[8] 36 45 55 156 258 361 465
[15] 570 676 783 891 1000 1110 1311
[22] 1513 1716 1920 2125 268437579 NA
Warning message:
integer overflow in 'cumsum'; use 'cumsum(as.numeric(.))'
The arrow version is 5.0.0.9000, running under R 3.6.3.
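
The overflow itself can be reproduced without arrow, which may help separate the conversion question from the arithmetic one. The following is a minimal sketch in base R only, using the last two values from the report; the variable names are illustrative and not part of the original reproduction.

```r
# Base R only; the arrow package is not needed to see the overflow itself.
# These are the last two values from the report above, stored as R integers
# (32-bit), which is what as_vector() returned in the reported session.
x_int <- c(268435454L, 2147483632L)

# 32-bit integer cumsum: the second partial sum exceeds .Machine$integer.max,
# so R yields NA for that element and raises the "integer overflow" warning
# (suppressed here so the script runs quietly).
sum_int <- suppressWarnings(cumsum(x_int))

# Double-precision cumsum, as the warning message itself suggests.
sum_dbl <- cumsum(as.numeric(x_int))

print(.Machine$integer.max)
print(sum_int)
print(sum_dbl)
```

The integer version loses the final partial sum, while the double version keeps it. This matches the difference between the two cumsum outputs in the report.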

Environment: linux
Reporter: Marc Colosimo

Related issues:

Note: This issue was originally created as ARROW-14509. Please see the migration documentation for further details.
