diff --git a/NEWS.md b/NEWS.md
index 94856c2871..1e165bd427 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -20,6 +20,8 @@
 
 3. `fread(..., nrows=0L)` now works as intended and the same as `nrows=0`; i.e. returning the column names and typed empty columns determined by the large sample, [#4686](https://github.com/Rdatatable/data.table/issues/4686). Thanks to @hongyuanjia for reporting, and Benjamin Schwendinger for the PR.
 
+4. Passing `.SD` to `frankv()` with `ties.method='random'` or with `na.last=NA` failed with `.SD is locked`, [#4429](https://github.com/Rdatatable/data.table/issues/4429). Thanks @smarches for the report.
+
 ## NOTES
 
 1. New feature 29 in v1.12.4 (Oct 2019) introduced zero-copy coercion. Our thinking is that requiring you to get the type right in the case of `0` (type double) vs `0L` (type integer) is too inconvenient for you the user. So such coercions happen in `data.table` automatically without warning. Thanks to zero-copy coercion there is no speed penalty, even when calling `set()` many times in a loop, so there's no speed penalty to warn you about either. However, we believe that assigning a character value such as `"2"` into an integer column is more likely to be a user mistake that you would like to be warned about. The type difference (character vs integer) may be the only clue that you have selected the wrong column, or typed the wrong variable to be assigned to that column. For this reason we view character to numeric-like coercion differently and will warn about it. If it is correct, then the warning is intended to nudge you to wrap the RHS with `as.<type>()` so that it is clear to readers of your code that a coercion from character to that type is intended. For example :
diff --git a/R/frank.R b/R/frank.R
index 763b8267e5..47e701c4cd 100644
--- a/R/frank.R
+++ b/R/frank.R
@@ -22,10 +22,13 @@ frankv = function(x, cols=seq_along(x), order=1L, na.last=TRUE, ties.method=c("a
     if (!length(cols))
       stop("x is a list, 'cols' can not be 0-length")
   }
-  x = .shallow(x, cols) # shallow copy even if list..
+  # need to unlock for #4429
+  x = .shallow(x, cols, unlock = TRUE) # shallow copy even if list..
   setDT(x)
   cols = seq_along(cols)
   if (is.na(na.last)) {
+    if ("..na_prefix.." %chin% names(x))
+      stop("Input column '..na_prefix..' conflicts with data.table internal usage; please rename")
     set(x, j = "..na_prefix..", value = is_na(x, cols))
     order = if (length(order) == 1L) c(1L, rep(order, length(cols))) else c(1L, order)
     cols = c(ncol(x), cols)
@@ -39,6 +42,8 @@ frankv = function(x, cols=seq_along(x), order=1L, na.last=TRUE, ties.method=c("a
       idx = NULL
       n = nrow(x)
     }
+    if ('..stats_runif..' %chin% names(x))
+      stop("Input column '..stats_runif..' conflicts with data.table internal usage; please rename")
     set(x, idx, '..stats_runif..', stats::runif(n))
     order = if (length(order) == 1L) c(rep(order, length(cols)), 1L) else c(order, 1L)
     cols = c(cols, ncol(x))
diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw
index 81612cceca..24b513875b 100644
--- a/inst/tests/tests.Rraw
+++ b/inst/tests/tests.Rraw
@@ -17329,3 +17329,12 @@ test(2168.01, print(DT, trunc.cols = 5L), error=c("Valid options for trunc.cols
 test(2168.02, print(DT, trunc.cols = NA), error=c("Valid options for trunc.cols are TRUE and FALSE"))
 test(2168.03, print(DT, trunc.cols = "thing"), error=c("Valid options for trunc.cols are TRUE and FALSE"))
 test(2168.04, print(DT, trunc.cols = c(TRUE, FALSE)), error=c("Valid options for trunc.cols are TRUE and FALSE"))
+
+# shallow copy of .SD must be unlocked for frank using na.last=NA or ties.method='random', #4429
+DT = data.table(a=1:10)
+test(2169.1, DT[ , frankv(.SD, ties.method='average', na.last=NA)], as.double(1:10))
+test(2169.2, DT[ , frankv(.SD, ties.method='random')], 1:10)
+# coverage tests for some issues discovered on the way
+DT[, c('..na_prefix..', '..stats_runif..') := 1L]
+test(2169.3, DT[ , frankv(.SD, ties.method='average', na.last=NA)], error="Input column '..na_prefix..' conflicts")
+test(2169.4, DT[ , frankv(.SD, ties.method='random')], error="Input column '..stats_runif..' conflicts")
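
For reviewers who want to see the reserved-name guard in isolation, the following is a minimal base-R sketch of the same pattern: add a temporary random column to break ties, and refuse input that already uses that column's name. It is an illustration only, not data.table's implementation. `rank_random_ties` is a hypothetical helper, it uses `%in%` rather than `%chin%`, and it works on a plain list, so the `.SD` locking that `unlock = TRUE` addresses does not arise here.

```r
# Hypothetical helper (not part of data.table): rank the rows of a list of
# equal-length columns, breaking ties at random through a temporary column.
rank_random_ties <- function(x) {
  helper <- "..stats_runif.."
  # Guard: stop if the input already has a column with the reserved name,
  # mirroring the check this patch adds to frankv().
  if (helper %in% names(x))
    stop("Input column '", helper, "' conflicts with internal usage; please rename")
  n <- length(x[[1L]])
  # Assigning into x only changes this function's local copy of the list.
  x[[helper]] <- stats::runif(n)
  # Order by the real columns first, then by the random helper to break ties.
  # unname() stops column names from being matched to order()'s arguments.
  ord <- do.call(order, unname(x))
  ranks <- integer(n)
  ranks[ord] <- seq_len(n)
  ranks
}

# No ties, so the result is deterministic.
rank_random_ties(list(a = c(10, 30, 20)))

# A column that already uses the reserved name triggers the guard.
msg <- tryCatch(
  rank_random_ties(list(`..stats_runif..` = 1:3)),
  error = function(e) conditionMessage(e)
)
msg
```

Stopping with an explicit message, rather than silently overwriting the user's column, matches the behaviour that tests 2169.3 and 2169.4 check for `frankv()`.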