Unexpected result when creating a factor column with 'by=' present #2522

@ben519

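Creating a factor column with := gives an unexpected result when 'by=' is present and the groups do not share the same levels. For example,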

library(data.table)

dt <- data.table(
  id = 1:9,
  grp = rep(1:3, each = 3),
  val = c("a","b","c",   "a","b","c",   "a","b","c")
)

dt[, valfactor1 := factor(val), by = grp]  # fine and dandy
dt[, valfactor2 := factor(val), by = id]   # breaks
dt
   id grp val valfactor1 valfactor2
1:  1   1   a          a          c
2:  2   1   b          b          c
3:  3   1   c          c          c
4:  4   2   a          a          c
5:  5   2   b          b          c
6:  6   2   c          c          c
7:  7   3   a          a          c
8:  8   3   b          b          c
9:  9   3   c          c          c
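
My guess at what is happening (not confirmed against the data.table source): with by = id, each call to factor() sees a single value, so every group returns a one-level factor whose integer code is 1. If only the codes are kept and one levels vector is attached to the whole column, every row collapses to the same label. The base R sketch below only shows the per-group factors; `per_group`, `codes` and `lvls` are names made up for illustration.

```r
# Rebuild the per-id groups with base R only (no data.table involved)
val <- c("a", "b", "c", "a", "b", "c", "a", "b", "c")
id  <- 1:9

# One factor per id, which is what factor(val) returns inside by = id
per_group <- lapply(split(val, id), factor)

# Every group has exactly one level, so its integer code is always 1
codes <- vapply(per_group, as.integer, integer(1))

# ...but that single level differs from group to group
lvls <- vapply(per_group, levels, character(1))

print(unique(codes))
print(unname(lvls))
```

So the integer codes alone cannot distinguish "a" from "b" from "c" here; the information lives entirely in the per-group levels.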

I suspect this has come up before, but I couldn't find an existing report. I'm also not sure what reasonable behavior would be if I were trying to assign factors with ordered levels by group. At the very least, a warning in these cases would be helpful; right now the assignment silently produces wrong values.
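
In the meantime, here is a sketch of the workarounds I would try. They all rest on the same assumption: the column comes out right as long as every group shares one set of levels, as in the by = grp case above. The column names `vf_nogroup`, `val_chr`, `vf_after` and `vf_fixed`, and the helper variable `lv`, are made up for illustration.

```r
library(data.table)

dt <- data.table(
  id = 1:9,
  grp = rep(1:3, each = 3),
  val = c("a", "b", "c", "a", "b", "c", "a", "b", "c")
)

# Option 1: if the factor does not depend on the group, skip by= entirely
dt[, vf_nogroup := factor(val)]

# Option 2: do the grouped work on character values, then convert once
dt[, val_chr := as.character(val), by = id]
dt[, vf_after := factor(val_chr)]

# Option 3: fix the levels up front so every group returns the same levels
lv <- sort(unique(dt$val))
dt[, vf_fixed := factor(val, levels = lv), by = id]

print(dt)
```

Option 3 is the one to adapt for ordered levels: pass the desired order as `levels` (and `ordered = TRUE` if needed) so it is identical in every group.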

Thanks!

sessionInfo()

R version 3.4.2 (2017-09-28)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.10.5

loaded via a namespace (and not attached):
[1] compiler_3.4.2 tools_3.4.2    yaml_2.1.15
