Incorrect setkey result #3766

@kirillmayantsev

Hello!

Minimal reproducible example:

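library(data.table)
x = data.table(a = 1, b = c(1, 4, 2, 3), c = c(4, 3, 2, 1))
# Filter on a, then build y with "b" listed twice (as b and as b2)
y = x[a == 1, .(d = paste(b, c, sep = "_"), b, b2 = b, c)]
y
# Sort y by the character column d
setkey(y, d)
y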

> y
     d b b2 c
1: 1_4 1  1 4
2: 4_3 4  4 3
3: 2_2 2  2 2
4: 3_1 3  3 1
> setkey(y, d)
> y
     d b b2 c
1: 1_4 1  1 4
2: 2_2 3  3 2
3: 3_1 4  4 1
4: 4_3 2  2 3

After setting the key, the values in columns "b" and "b2" are in the wrong order: they no longer match "d" and "c" (for d = "2_2" they show 3 instead of 2).
Any one of these 3 "workarounds" keeps the "y" data.table consistent:

  1. Remove the filter ("a == 1")
  2. Remove the duplicated column ("b2 = b")
  3. Assign a copy of "b" to "b2" (in the example, "b2" seems to refer to the same vector as "b", I guess)
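
A minimal sketch of workaround 3, assuming my guess is right that "b" and "b2" end up sharing one underlying vector (I have not confirmed this in the data.table source). It uses data.table's own copy() to give "b2" a separate vector, and address() to compare where the two columns are stored:

```r
library(data.table)

x = data.table(a = 1, b = c(1, 4, 2, 3), c = c(4, 3, 2, 1))

# Workaround 3: give b2 its own vector instead of reusing b
y = x[a == 1, .(d = paste(b, c, sep = "_"), b, b2 = copy(b), c)]

# Different addresses mean the two columns do not share memory
address(y$b) != address(y$b2)

# Sort by d; b, b2 and c should now follow d row by row
setkey(y, d)
y
```

Running the same address() comparison on the original example (with "b2 = b") would be one way to check whether the shared-reference guess holds on a given data.table version.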

Even with these workarounds, the behavior described above looks like a bug.
It may be related to #3496.

Output of sessionInfo():

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7600)

Matrix products: default

locale:
[1] LC_COLLATE=Russian_Russia.1251  LC_CTYPE=Russian_Russia.1251    LC_MONETARY=Russian_Russia.1251
[4] LC_NUMERIC=C                    LC_TIME=Russian_Russia.1251    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.12.3

loaded via a namespace (and not attached):
[1] compiler_3.5.1 tools_3.5.1    yaml_2.2.0    
