According to #362, recycling with remainder in data.table() was changed to a warning for the sake of consistency within data.table. I cannot access the R-Forge link, but I assume this refers to the original behavior of :=.
However, since data.table v1.12.2 (07 Apr 2019), recycling for vectors of length > 1 in := has thrown an error:
:= no longer recycles length>1 RHS vectors. There was a warning when recycling left a remainder but no warning when the LHS length was an exact multiple of the RHS length (the same behaviour as base R). Consistent feedback for several years has been that recycling is more often a bug. In rare cases where you need to recycle a length>1 vector, please use rep() explicitly. Single values are still recycled silently as before. Early warning was given in this tweet. The 774 CRAN and Bioconductor packages using data.table were tested and the maintainers of the 16 packages affected (2%) were consulted before going ahead, #3310. Upon agreement we went ahead. Many thanks to all those maintainers for already updating on CRAN, #3347.
Would it therefore make sense to revert data.table() to throwing an error when recycling leaves a remainder? This could break existing scripts, but it would make the behavior consistent both within the data.table package and with data.frame(), and the reasons for recycling with remainder in data.table() seem no more compelling than those for :=. Matching := exactly (an error whenever a vector of length > 1 is recycled) is probably too drastic, but an error on recycling with remainder seems like a good compromise, especially since the presumed motivation for changing the behavior to a warning no longer holds. As with :=, I suspect most cases of recycling with remainder are bugs.
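To make the inconsistency concrete, here is a minimal sketch that runs the same kind of mismatched-length input through data.frame(), data.table() and :=, and reports how each one reacts. The helper `outcome()` is a hypothetical name introduced only for this illustration; it is not part of data.table. The result for data.table() depends on the installed version, so no expected value is given for it.

```r
library(data.table)

# Hypothetical helper, for illustration only: evaluate an expression and
# report whether it completed cleanly, raised a warning, or raised an error.
outcome <- function(expr) {
  tryCatch(
    { force(expr); "ok" },
    warning = function(w) "warning",
    error   = function(e) "error"
  )
}

# base R: lengths 3 and 5 do not divide evenly, so data.frame() errors
outcome(data.frame(1:3, 1:5))

# data.table(): result depends on the installed version
# (a warning on 1.12.6, as in the example in this issue)
outcome(data.table(1:3, 1:5))

# := with a length > 1 RHS that does not match the number of rows:
# an error since v1.12.2, per the NEWS item quoted above
DT <- data.table(a = 1:6)
outcome(DT[, b := 1:4])

# Explicit recycling with rep(), as the NEWS item recommends
DT[, b := rep(1:4, length.out = .N)]
```

Under the proposal, the second call would report "error", the same as the data.frame() call, and anyone who really wants the recycled column could write rep() explicitly as in the last line.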
# Minimal reproducible example
library(data.table)
data.table(1:3, 1:5)
#    V1 V2
# 1:  1  1
# 2:  2  2
# 3:  3  3
# 4:  1  4
# 5:  2  5
# Warning message:
# In as.data.table.list(x, keep.rownames = keep.rownames, check.names = check.names, :
# Item 1 has 3 rows but longest item has 5; recycled with remainder.
# Output of sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.12.6
loaded via a namespace (and not attached):
[1] compiler_3.6.1 tools_3.6.1 yaml_2.2.0 knitr_1.25 xfun_0.10