I was looking to get ranks using frankv(...,ties.method="random") and encountered seemingly inconsistent behavior. Say we'd like to calculate ranks, by reference, using a character vector of names of columns in .SD. The following all work:
Example
library(data.table)
dt = data.table(
x = 1:20,
y = 20:1,
z = rep(c("A","B","C","D"),5)
)
rank_cols = c('x','y')
dt[,ranks := frankv(.SD,cols = rank_cols,ties.method = 'average'),by = 'z']
dt[,ranks := frankv(.SD,cols = rank_cols,ties.method = 'first'),by = 'z']
dt[,ranks := frankv(.SD,cols = rank_cols,ties.method = 'max'),by = 'z']
dt[,ranks := frankv(.SD,cols = rank_cols,ties.method = 'min'),by = 'z']
dt[,ranks := frankv(.SD,cols = rank_cols,ties.method = 'dense'),by = 'z']
dt[,ranks := frankv(.SD[,rank_cols,with = FALSE],cols = rank_cols,ties.method = 'random'),by = 'z']
but these do not:
dt[,ranks := frankv(.SD,cols = rank_cols,ties.method = 'average',na.last = NA),by = 'z']
dt[,ranks := frankv(.SD,cols = rank_cols,ties.method = 'random'),by = 'z']
# two wrongs don't make a right!
dt[,ranks := frankv(.SD,cols = rank_cols,ties.method = 'random',na.last = NA),by = 'z']
It likewise works again when .SD is not invoked explicitly:
dt[,ranks := frankv(get(rank_cols),ties.method = 'random'),by = 'z']
It seems like an interface bug that this call fails for some valid choices of ties.method or na.last ('random' and NA, respectively) while working for all others. In both cases, the error appears to be triggered by frankv calling set on its input during evaluation. Specifying .SDcols does not help either.
In failing cases the error is of the form:
Error in set(x, NULL, "..stats_runif..", v) :
.SD is locked. Updating .SD by reference using := or set are reserved for future use. Use := in j directly. Or use copy(.SD) as a (slow) last resort, until shallow() is exported.
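As a stopgap, the error message's own suggestion can be followed: passing copy(.SD) gives frankv an unlocked copy to modify. The following is only a minimal sketch of that workaround, reusing dt and rank_cols from the example above. It assumes the extra per-group copy is acceptable, and it is not meant as a fix for the underlying behavior.

```r
library(data.table)

# Same example data as above
dt = data.table(
  x = 1:20,
  y = 20:1,
  z = rep(c("A","B","C","D"),5)
)
rank_cols = c('x','y')

# copy(.SD) returns an unlocked deep copy of the group's data, so the
# set() call made inside frankv for ties.method = 'random' no longer
# touches the locked .SD. This costs one extra copy per group.
dt[,ranks := frankv(copy(.SD),cols = rank_cols,ties.method = 'random'),by = 'z']
```

Since x has no ties within any group of z, the ranks here are deterministic despite ties.method = 'random'. The same wrapper would presumably also apply to the na.last = NA case, since that path fails on the same kind of set call, but that is not verified here.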
Session Info
R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.12.8
loaded via a namespace (and not attached):
[1] compiler_3.6.3 tools_3.6.3