From d750413b3425e8e5e312b33c92ddd23cbac8c12d Mon Sep 17 00:00:00 2001
From: Michael Chirico
Date: Thu, 30 Apr 2020 20:05:08 +0800
Subject: [PATCH 1/4] Add . and .. aliases to ?data.table

---
 NEWS.md           | 2 ++
 man/data.table.Rd | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/NEWS.md b/NEWS.md
index 026f06c86e..cbb0571c51 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -147,6 +147,8 @@ unit = "s")
 
 7. Added more explanation/examples to `?data.table` for how to use `.BY`, [#1363](https://github.com/Rdatatable/data.table/issues/1363).
 
+8. Also added some aliases redirecting to `?data.table`: `?"."`, `?".."`, `?".("`, and `?".()"` all point to that help page, [#4385](https://github.com/Rdatatable/data.table/issues/4385) and [#4407](https://github.com/Rdatatable/data.table/issues/4407). This will help new (and maybe some experienced!) users of `data.table` find the documentation for these convenience features available within `[]`. Recall that `.` is an alias for `list` inside `i`, `j`, and `by`, and `..var` forces `data.table` to look for `var` in the calling environment, as opposed to a column of the table.
+
 
 # data.table [v1.12.8](https://github.com/Rdatatable/data.table/milestone/15?closed=1) (09 Dec 2019)
 
diff --git a/man/data.table.Rd b/man/data.table.Rd
index 8c8e0d5375..154331d5e6 100644
--- a/man/data.table.Rd
+++ b/man/data.table.Rd
@@ -5,6 +5,10 @@
 \alias{Ops.data.table}
 \alias{is.na.data.table}
 \alias{[.data.table}
+\alias{.}
+\alias{.(}
+\alias{.()}
+\alias{..}
 \title{ Enhanced data.frame }
 \description{
   \code{data.table} \emph{inherits} from \code{data.frame}. It offers fast and memory efficient: file reader and writer, aggregations, updates, equi, non-equi, rolling, range and interval joins, in a short and flexible syntax, for faster development.

From d1c8ea54761c90ccbe43b661b6548a23caf3971d Mon Sep 17 00:00:00 2001
From: Matt Dowle
Date: Mon, 21 Jun 2021 17:33:13 -0600
Subject: [PATCH 2/4] error message highlighted by #2145 simplified

---
 R/data.table.R | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/R/data.table.R b/R/data.table.R
index b3e6cc826d..06fadd9ee6 100644
--- a/R/data.table.R
+++ b/R/data.table.R
@@ -1351,7 +1351,7 @@ replace_dot_alias = function(e) {
 # There isn't a copy of the columns here, the xvar symbols point to the SD columns (copy-on-write).
 
 if (is.name(jsub) && is.null(lhs) && !exists(jsubChar<-as.character(jsub), SDenv, inherits=FALSE)) {
-  stop("j (the 2nd argument inside [...]) is a single symbol but column name '",jsubChar,"' is not found. Perhaps you intended DT[, ..",jsubChar,"]. This difference to data.frame is deliberate and explained in FAQ 1.1.")
+  stop("j (the 2nd argument inside [...]) is a single symbol but column name '",jsubChar,"' is not found. If you intended to select columns using a variable in calling scope, please try DT[, ..",jsubChar,"]. The .. prefix conveys one-level-up similar to a file system path.")
 }
 
 jval = eval(jsub, SDenv, parent.frame())

From 09774c064e75f2f6347aaa66c41c08a40b19e5c8 Mon Sep 17 00:00:00 2001
From: Matt Dowle
Date: Mon, 21 Jun 2021 19:30:00 -0600
Subject: [PATCH 3/4] revise FAQ 1.5

---
 vignettes/datatable-faq.Rmd | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/vignettes/datatable-faq.Rmd b/vignettes/datatable-faq.Rmd
index 1df42e166c..f66f9611f1 100644
--- a/vignettes/datatable-faq.Rmd
+++ b/vignettes/datatable-faq.Rmd
@@ -66,22 +66,21 @@ Also continue reading and see the FAQ after next. Skim whole documents before ge
 
 The `j` expression is the 2nd argument. Try `DT[ , c("x","y","z")]` or `DT[ , .(x,y,z)]`.
 
-## I assigned a variable `mycol = "x"` but then `DT[ , mycol]` returns `"x"`. How do I get it to look up the column name contained in the `mycol` variable?
+## I assigned a variable `mycol="x"` but then `DT[, mycol]` returns an error. How do I get it to look up the column name contained in the `mycol` variable?
 
-What's happening is that the `j` expression sees objects in the calling scope. The variable `mycol` does not exist as a column name of `DT` so `data.table` then looked in the calling scope and found `mycol` there and returned its value `"x"`. This is correct behaviour currently. Had `mycol` been a column name, then that column's data would have been returned.
+The error is that column named `"mycol"` cannot be found, and this error is correct. `data.table`'s scoping is different to `data.frame` in that you can use column names as if they are variables directly inside `DT[...]` without prefixing each column name with `DT$`; see FAQ 1.1 above.
 
-To get the column `x` from `DT`, there are a few options:
+To use `mycol` to select the column `x` from `DT`, there are a few options:
 
 ```r
-# using .. to tell data.table the variable should be evaluated
-DT[ , ..mycol]
-# using with=FALSE to do the same
-DT[ , mycol, with=FALSE]
-# treating DT as a list and using [[
-DT[[mycol]]
+DT[, ..mycol]            # .. prefix conveys to look for the mycol one level up in calling scope
+DT[, mycol, with=FALSE]  # revert to data.frame behavior
+DT[[mycol]]              # treat DT as a list and use [[ from base R
 ```
 
-The `with` argument refers to the `base` function `with` -- when `with=TRUE`, `data.table` operates similar to `with`, i.e. `DT[ , mycol]` behaves like `with(DT, mycol)`. When `with=FALSE`, the standard `data.frame` evaluation rules apply.
+See `?data.table` for more details about the `..` prefix.
+
+The `with` argument takes its name from the `base` function `with()`. When `with=TRUE` (default), `data.table` operates similar to `with()`, i.e. `DT[, mycol]` behaves like `with(DT, mycol)`. When `with=FALSE`, the standard `data.frame` evaluation rules apply to all variables in `j` and you can no longer use column names directly.
 
 ## What are the benefits of being able to use column names as if they are variables inside `DT[...]`?
 

From e53ad07c5e7f3cd587fe31c76cd5db4e9f5f8ebf Mon Sep 17 00:00:00 2001
From: Matt Dowle
Date: Mon, 21 Jun 2021 19:36:06 -0600
Subject: [PATCH 4/4] simplify comment in example(data.table) by removing comparison to DT[, .SD, .SDcols=colNum]

---
 man/data.table.Rd | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/man/data.table.Rd b/man/data.table.Rd
index e624636ed6..9df490f77d 100644
--- a/man/data.table.Rd
+++ b/man/data.table.Rd
@@ -280,9 +280,9 @@ DT[2:5, cat(v, "\n")] # just for j's side effect
 
 # select columns the data.frame way
 DT[, 2]         # 2nd column, returns a data.table always
-colNum = 2      # to refer vars in `j` from the outside of data use `..` prefix
-DT[, ..colNum]  # same, equivalent to DT[, .SD, .SDcols=colNum]
-DT[["v"]]       # same as DT[, v] but much faster
+colNum = 2
+DT[, ..colNum]  # same, .. prefix conveys to look for colNum one-level-up in calling scope
+DT[["v"]]       # same as DT[, v] but faster if called in a loop
 
 # grouping operations - j and by
 DT[, sum(v), by=x]  # ad hoc by, order of groups preserved in result
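
Reviewer note: the behaviour these four patches document can be checked with a small sketch like the one below. The table `DT`, its columns `x` and `v`, and the variable `mycol` are hypothetical names chosen for illustration, and the sketch assumes a `data.table` version that raises the "single symbol" error touched by patch 2/4.

```r
library(data.table)

# Hypothetical toy table; not taken from the patches.
DT = data.table(x = c("a", "b", "c"), v = 1:3)
mycol = "x"   # variable in calling scope holding a column name

# A single symbol in j is looked up as a column name. `mycol` is not a
# column of DT, so this is expected to raise an error (assumed behaviour
# for versions that include the stop() call edited in patch 2/4).
res = tryCatch(DT[, mycol], error = function(e) "error")

# The three options listed in the revised FAQ entry.
a = DT[, ..mycol]               # .. prefix: look up mycol in calling scope
b = DT[, mycol, with = FALSE]   # data.frame-style evaluation of j
d = DT[[mycol]]                 # list-style extraction, returns a vector

# `.` is an alias for list() inside j, as the new NEWS item recalls.
e = DT[, .(x, v)]
f = DT[, list(x, v)]
```

The first two options return a one-column `data.table`, while `DT[[mycol]]` returns the underlying vector, which is worth keeping in mind when choosing between them.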