# Minimal reproducible example
I believe there's been a regression in data.table in version 1.12.0 relative to 1.11.8.
TL;DR: I think the implementation of unique.data.table() has changed and that the new implementation doesn't support complex types like lists in columns.
Some users of uptasticsearch have reported receiving an error message from our code that looks like this:
Error in forderv(x, by = by, sort = FALSE, retGrp = TRUE) :
Column 2 of by= (2) is type 'list', not yet supported
After investigating tonight, I found the source of the issue and can reproduce it. I believe the behavior of unique() has changed and I'd consider that change a regression.
On 1.12.0:
someDT <- data.table::data.table(
    col1 = 1:2,
    col2 = list(list(TRUE, FALSE), list(FALSE, TRUE))
)
unique(someDT)
This raises an error:
Error in forderv(x, by = by, sort = FALSE, retGrp = TRUE) :
Column 2 of by= (2) is type 'list', not yet supported
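As a possible stopgap for affected users, one option is to restrict the uniqueness check to the non-list columns via the `by` argument of `unique()`. This is only a sketch and it changes the semantics: it assumes that rows agreeing on every non-list column can be treated as duplicates, which is not what `unique(someDT)` did in 1.11.8. The names `is_list_col`, `atomic_cols` and `dedupedDT`, and the extra third row, are made up for illustration.

```r
library(data.table)

# Same shape as the example above, with one extra row so there is
# something to de-duplicate.
someDT <- data.table(
    col1 = c(1L, 2L, 2L),
    col2 = list(list(TRUE, FALSE), list(FALSE, TRUE), list(FALSE, TRUE))
)

# Find the columns that are NOT lists; only these are passed to by=.
is_list_col <- vapply(someDT, is.list, logical(1))
atomic_cols <- names(someDT)[!is_list_col]

# ASSUMPTION: rows that match on all non-list columns count as duplicates.
# The list column is carried along but never compared.
dedupedDT <- unique(someDT, by = atomic_cols)
print(dedupedDT)
```

Because the list column is ignored when comparing rows, this is not a drop-in replacement when two rows share `col1` but differ in `col2`.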
To downgrade to the previous release, I ran the following from the command line:
Rscript -e "remove.packages('data.table')"
wget http://cran.rstudio.com/src/contrib/Archive/data.table/data.table_1.11.8.tar.gz
R CMD INSTALL data.table_1.11.8.tar.gz
Once I had v1.11.8 installed, I re-ran the R code above:
someDT <- data.table::data.table(
    col1 = 1:2,
    col2 = list(list(TRUE, FALSE), list(FALSE, TRUE))
)
unique(someDT)
Works as expected and returns:
   col1   col2
1:    1 <list>
2:    2 <list>
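To compare the two releases without the script stopping on the error, a small check along these lines could be run once under each installed version. It is only a sketch: `dt_version` and `outcome` are names made up here, and the printed line depends on which data.table version is loaded.

```r
library(data.table)

someDT <- data.table(
    col1 = 1:2,
    col2 = list(list(TRUE, FALSE), list(FALSE, TRUE))
)

# Record which data.table version is actually loaded in this session.
dt_version <- as.character(utils::packageVersion("data.table"))

# Capture the outcome instead of letting an error abort the script.
outcome <- tryCatch(
    {
        unique(someDT)
        "ok"
    },
    error = function(e) paste("error:", conditionMessage(e))
)

cat("data.table", dt_version, "->", outcome, "\n")
```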
So I went to the blame for unique.data.table() to see what had changed. It looked like no substantive changes had been made between 1.11.8 and now, so then I thought to look at the blame for forderv().
I didn't see anything meaningful in the blame for forderv() either.
I decided to try one more thing: searching for the text "not yet supported" (from the error message). That led me to forder.c, whose blame led me to #3124.
As far as I can tell, this PR is the source of the problem above. There is no description on the PR, so I'm not sure if this is an unintended side effect or a known regression that will be fixed in a future release of data.table.
# Output of sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.5.0 tools_3.5.0 yaml_2.2.0 data.table_1.12.0