Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,8 @@

4. New option `options(datatable.quiet = TRUE)` turns off the package startup message, [#3489](https://github.com/Rdatatable/data.table/issues/3489). `suppressPackageStartupMessages()` continues to work too. Thanks to @leobarlach for the suggestion inspired by `options(tidyverse.quiet = TRUE)`. We don't know of a way to make a package respect the `quietly=` option of `library()` and `require()` because the `quietly=` isn't passed through for use by the package's own `.onAttach`. If you can see how to do that, please submit a patch to R.

5. When loading a `data.table` from disk (e.g. with `readRDS`), best practice is to run `setDT()` on the new object to assure it is correctly allocated memory for new column pointers. Barring this, unexpected behavior can follow; for example, if you assign a new column to `DT` from a function `f`, the new columns will only be assigned within `f` and `DT` will be unchanged. The `verbose` messaging in this situation is now more helpful, [#1729](https://github.com/Rdatatable/data.table/issues/1729). Thanks @vspinu for sharing his experience to spur this.


### Changes in [v1.12.2](https://github.com/Rdatatable/data.table/milestone/14?closed=1) (07 Apr 2019)

Expand Down
10 changes: 8 additions & 2 deletions R/data.table.R
Original file line number Diff line number Diff line change
Expand Up @@ -1161,15 +1161,21 @@ replace_dot_alias <- function(e) {
newnames=setdiff(lhs,names(x))
m[is.na(m)] = ncol(x)+seq_len(length(newnames))
cols = as.integer(m)
if ((ok<-selfrefok(x,verbose))==0L) # ok==0 so no warning when loaded from disk (-1) [-1 considered TRUE by R]
# don't pass verbose to selfrefok here -- only activated when
# ok=-1 which will trigger alloc.col with verbose in the next
# branch, which again calls _selfrefok and returns the message then
if ((ok<-selfrefok(x, verbose=FALSE))==0L) # ok==0 so no warning when loaded from disk (-1) [-1 considered TRUE by R]
warning("Invalid .internal.selfref detected and fixed by taking a (shallow) copy of the data.table so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or was created manually using structure() or similar). Avoid names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. If this message doesn't help, please report your use case to the data.table issue tracker so the root cause can be fixed or this message improved.")
if ((ok<1L) || (truelength(x) < ncol(x)+length(newnames))) {
DT = x # in case getOption contains "ncol(DT)" as it used to. TODO: warn and then remove
n = length(newnames) + eval(getOption("datatable.alloccol")) # TODO: warn about expressions and then drop the eval()
# i.e. reallocate at the size as if the new columns were added followed by alloc.col().
name = substitute(x)
if (is.name(name) && ok && verbose) { # && NAMED(x)>0 (TO DO) # ok here includes -1 (loaded from disk)
cat("Growing vector of column pointers from truelength ",truelength(x)," to ",n,". A shallow copy has been taken, see ?alloc.col. Only a potential issue if two variables point to the same data (we can't yet detect that well) and if not you can safely ignore this. To avoid this message you could alloc.col() first, deep copy first using copy(), wrap with suppressWarnings() or increase the 'datatable.alloccol' option.\n")
cat("Growing vector of column pointers from truelength ", truelength(x), " to ", n, ". A shallow copy has been taken, see ?alloc.col. Only a potential issue if two variables point to the same data (we can't yet detect that well) and if not you can safely ignore this. To avoid this message you could alloc.col() first, deep copy first using copy(), wrap with suppressWarnings() or increase the 'datatable.alloccol' option.\n")
# #1729 -- copying to the wrong environment here can cause some confusion
if (ok == -1L) cat("Note that the shallow copy will assign to the environment from which := was called. That means for example that if := was called within a function, the original table may be unaffected.\n")

# Verbosity should not issue warnings, so cat rather than warning.
# TO DO: Add option 'datatable.pedantic' to turn on warnings like this.

Expand Down
12 changes: 12 additions & 0 deletions inst/tests/tests.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -14467,6 +14467,18 @@ test(2036.1, !any(grepl("1: 1", capture.output(source(tmp, echo = TRUE)), fi
writeLines(c(setup, 'DT[ , a := 1][]'), tmp)
test(2036.2, source(tmp, echo = TRUE), output = "1:\\s+1")

# more helpful guidance when assigning before setDT() after readRDS(); #1729
DT = data.table(a = 1:3)
saveRDS(DT, tmp<-tempfile())
rm(DT)
DT = readRDS(tmp)
foo = function(x) { x[ , b:=4:6, verbose=TRUE][] }
test(2037.1, foo(DT), output='Please remember to always setDT()')
# no assignment was made to DT
test(2037.2, names(DT), 'a')
# _selrefok() verbose message was duplicated
test(2037.3, unname(table(unlist(strsplit(capture.output(foo(DT)), '\n|\\s+')))['ptr']), 1L)


###################################
# Add new tests above this line #
Expand Down
2 changes: 1 addition & 1 deletion src/assign.c
Original file line number Diff line number Diff line change
Expand Up @@ -120,7 +120,7 @@ static int _selfrefok(SEXP x, Rboolean checkNames, Rboolean verbose) {
}
p = R_ExternalPtrAddr(v);
if (p==NULL) {
if (verbose) Rprintf(".internal.selfref ptr is NULL. This is expected and normal for a data.table loaded from disk. If not, please report to data.table issue tracker.\n");
if (verbose) Rprintf(".internal.selfref ptr is NULL. This is expected and normal for a data.table loaded from disk. Please remember to always setDT() immediately after loading to prevent unexpected behavior. If this table was not loaded from disk or you've already run setDT(), please report to data.table issue tracker.\n");
return -1;
}
if (!isNull(p)) error("Internal error: .internal.selfref ptr is not NULL or R_NilValue"); // # nocov
Expand Down