From 8a1972d24f4c58abdb6dd8b62da447f1673dbff3 Mon Sep 17 00:00:00 2001
From: Matt Dowle
Date: Mon, 30 Aug 2021 16:21:48 -0600
Subject: [PATCH] DT[fctr] now works

---
 NEWS.md               | 4 +++-
 R/data.table.R        | 2 +-
 inst/tests/tests.Rraw | 6 ++++++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index f6c834f07a..7e6c71f9a4 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -352,9 +352,11 @@
 # # remaining 99,987 of these 100,000 were already identical
 ```
- 
+
 41. `dcast(empty-DT)` now returns an empty `data.table` rather than error `Cannot cast an empty data.table`, [#1215](https://github.com/Rdatatable/data.table/issues/1215). Thanks to Damian Betebenner for reporting, and Matt Dowle for fixing.
 
+42. `DT[factor("id")]` now works rather than error `i has evaluated to type integer. Expecting logical, integer or double`, [#1632](https://github.com/Rdatatable/data.table/issues/1632). `DT["id"]` has worked forever by automatically converting to `DT[.("id")]` for convenience, and joins have worked forever between char/fact, fact/char and fact/fact even when levels mismatch, so it was unfortunate that `DT[factor("id")]` managed to escape the simple automatic conversion to `DT[.(factor("id"))]` which is now in place. Thanks to @aushev for reporting, and Matt Dowle for the fix.
+
 ## NOTES
 
 1. New feature 29 in v1.12.4 (Oct 2019) introduced zero-copy coercion. Our thinking is that requiring you to get the type right in the case of `0` (type double) vs `0L` (type integer) is too inconvenient for you the user. So such coercions happen in `data.table` automatically without warning. Thanks to zero-copy coercion there is no speed penalty, even when calling `set()` many times in a loop, so there's no speed penalty to warn you about either. However, we believe that assigning a character value such as `"2"` into an integer column is more likely to be a user mistake that you would like to be warned about. The type difference (character vs integer) may be the only clue that you have selected the wrong column, or typed the wrong variable to be assigned to that column. For this reason we view character to numeric-like coercion differently and will warn about it. If it is correct, then the warning is intended to nudge you to wrap the RHS with `as.()` so that it is clear to readers of your code that a coercion from character to that type is intended. For example :
 
diff --git a/R/data.table.R b/R/data.table.R
index d70e677615..87f6eef10e 100644
--- a/R/data.table.R
+++ b/R/data.table.R
@@ -436,7 +436,7 @@ replace_dot_alias = function(e) {
       }
     }
     if (is.null(i)) return( null.data.table() )
-    if (is.character(i)) {
+    if (is.character(i) || is.factor(i)) {
       isnull_inames = TRUE
       i = data.table(V1=i) # for user convenience; e.g. DT["foo"] without needing DT[.("foo")]
     } else if (identical(class(i),"list") && length(i)==1L && is.data.frame(i[[1L]])) { i = as.data.table(i[[1L]]) }
diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw
index d4681d5471..5c83983a8e 100644
--- a/inst/tests/tests.Rraw
+++ b/inst/tests/tests.Rraw
@@ -18133,3 +18133,9 @@ test(2214.6, droplevels(DT, exclude=c("b", "d"))[["a"]], droplevels(DT[1:5,a], c
 test(2214.7, droplevels(DT, 1)[["a"]], x[1:5])
 test(2214.8, droplevels(DT, in.place=TRUE), DT)
 
+# factor i should be just like character i and work, #1632
+DT = data.table(A=letters[1:3], B=4:6, key="A")
+test(2215.1, DT["b", B], 5L) # has worked forever
+test(2215.2, DT[factor("b"), B], 5L) # now works too, joining fact/fact, char/fact and fact/char have plenty of tests
+
+