Bug: split.data.table can not split on a factor column if it is named 'x' #3151

@tdeenes

Just found the following bug:

library(data.table)
temp1 <- data.table(x = factor("a"))
split(temp1, by = "x")
# Error: is.data.table(x) is not TRUE

# this works
temp2 <- data.table(.x = factor("a"))
split(temp2, by = ".x")

# this works as well
temp3 <- data.table(x = "a")
split(temp3, by = "x")

So the problem only emerges if the splitting variable is called 'x' and it is a factor. It can be traced back to the temp = eval(dtq) call in split.data.table. In the unevaluated call dtq, make.levels(x, cols=.cols, sorted=.sorted) finds the 'x' column instead of the data.table 'x'. A quick fix is to write make.levels(.___x, cols=.cols, sorted=.sorted) instead, and to make a temporary assignment .___x <- x before temp = eval(dtq).
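To illustrate the kind of name clash I mean, here is a minimal base-R sketch. It is not the actual split.data.table code: `describe`, `tbl`, `q` and `q_fixed` are hypothetical names made up for the example, and a plain data.frame stands in for the data.table. It only shows how a symbol in an unevaluated call resolves to a column when the call is evaluated with the table's columns in scope, and how binding the table to a name that no column uses avoids that.

```r
# Hypothetical helper: reports the class of whatever it receives.
describe <- function(obj) class(obj)[1L]

# A table whose only column is a factor named 'x'.
tbl <- data.frame(x = factor("a"))

# The call refers to `x`, intending the whole table ...
x <- tbl
q <- quote(describe(x))

# ... but when evaluated with the columns in scope, `x` is the column.
shadowed <- eval(q, tbl)

# Workaround in the spirit of the quick fix above: use a name that
# is very unlikely to collide with a column name.
.___x <- tbl
q_fixed <- quote(describe(.___x))
fixed <- eval(q_fixed, tbl)

print(shadowed)  # "factor"
print(fixed)     # "data.frame"
```

The workaround still relies on no column being called `.___x`, so it makes the clash unlikely rather than impossible.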

Output of sessionInfo() (the bug is also present in 1.11.8):

R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 18.2

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.11.9 switchr_0.13.0   

loaded via a namespace (and not attached):
[1] compiler_3.5.1  tools_3.5.1     RCurl_1.95-4.11 remotes_2.0.2  
[5] bitops_1.0-6
