Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -222,6 +222,8 @@

24. `column not found` could incorrectly occur in rare non-equi-join cases, [#3635](https://github.com/Rdatatable/data.table/issues/3635). Thanks to @UweBlock for the report.

25. Slight fix to the logic for auto-naming the `by` clause for using a custom function like `evaluate` to now be named `evaluate` instead of the name of the first symbolic argument, [#3758](https://github.com/Rdatatable/data.table/issues/3758).

#### NOTES

1. `rbindlist`'s `use.names="check"` now emits its message for automatic column names (`"V[0-9]+"`) too, [#3484](https://github.com/Rdatatable/data.table/pull/3484). See news item 5 of v1.12.2 below.
Expand Down
5 changes: 3 additions & 2 deletions R/IDateTime.R
Original file line number Diff line number Diff line change
Expand Up @@ -87,8 +87,9 @@ round.IDate = function (x, digits=c("weeks", "months", "quarters", "years"), ...
`+.IDate` = function (e1, e2) {
if (nargs() == 1L)
return(e1)
# TODO: investigate Ops.IDate method a la Ops.difftime
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted in ?Ops

The classes of both arguments are considered in dispatching any member of this group. For each argument its vector of classes is examined to see if there is a matching specific (preferred) or Ops method. If a method is found for just one argument or the same method is found for both, it is used. If different methods are found, there is a warning about ‘incompatible methods’: in that case or if no method is found for either argument the internal method is used.

Hence we can't really reach this branch since the incompatible methods barrier is hit first.

if (inherits(e1, "difftime") || inherits(e2, "difftime"))
stop("difftime objects may not be added to IDate. Use plain integer instead of difftime.")
stop("Internal error -- difftime objects may not be added to IDate, but Ops dispatch should have intervened to prevent this") # nocov
if (isReallyReal(e1) || isReallyReal(e2)) {
return(`+.Date`(e1, e2))
# IDate doesn't support fractional days; revert to base Date
Expand All @@ -108,7 +109,7 @@ round.IDate = function (x, digits=c("weeks", "months", "quarters", "years"), ...
if (nargs() == 1L)
stop("unary - is not defined for \"IDate\" objects")
if (inherits(e2, "difftime"))
stop("difftime objects may not be subtracted from IDate. Use plain integer instead of difftime.")
stop("Internal error -- difftime objects may not be subtracted from IDate, but Ops dispatch should have intervened to prevent this") # nocov

if ( isReallyReal(e2) ) {
# IDate deliberately doesn't support fractional days so revert to base Date
Expand Down
2 changes: 1 addition & 1 deletion R/bmerge.R
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,7 @@ bmerge = function(i, x, icols, xcols, roll, rollends, nomatch, mult, ops, verbos
}
}
if (xclass == iclass) {
if (verbose) cat("i.",names(i)[ic],"has same type (",xclass,") as x.",names(x)[xc],". No coercion needed.\n", sep="")
if (verbose) cat("i.",names(i)[ic]," has same type (",xclass,") as x.",names(x)[xc],". No coercion needed.\n", sep="")
next
}
if (xclass=="character" || iclass=="character" ||
Expand Down
40 changes: 20 additions & 20 deletions R/data.table.R
Original file line number Diff line number Diff line change
Expand Up @@ -841,8 +841,10 @@ replace_order = function(isub, verbose, env) {
byvars = all.vars(bysubl[[jj+1L]], functions = TRUE)
if (length(byvars) == 1L) tt = byvars
else {
tt = grep("^eval|^[^[:alpha:]. ]",byvars,invert=TRUE,value=TRUE)
if (length(tt)) tt = tt[1L] else all.vars(bysubl[[jj+1L]])[1L]
# take the first variable that is (1) not eval (#3758) and (2) starts with a character that can't start a variable name
tt = grep("^eval$|^[^[:alpha:]. ]", byvars, invert=TRUE, value=TRUE)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simple fix for #3758 -- add ending anchor $ to the regex for eval

# byvars but exclude functions or `0`+`1` becomes `+`
tt = if (length(tt)) tt[1L] else all.vars(bysubl[[jj+1L]])[1L]
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previous code I believe was doing nothing in the else case, so I'm not 100% sure of the intended behavior but I believe this is it.

}
# fix for #497
if (length(byvars) > 1L && tt %chin% all.vars(jsub, FALSE)) {
Expand Down Expand Up @@ -1087,9 +1089,9 @@ replace_order = function(isub, verbose, env) {
if (is.list(k)) {
origj = j = if (name[[1L]] == "$") as.character(name[[3L]]) else eval(name[[3L]], parent.frame(), parent.frame())
if (is.character(j)) {
if (length(j)!=1L) stop("L[[i]][,:=] syntax only valid when i is length 1, but it's length %d",length(j))
if (length(j)!=1L) stop("Cannot assign to an under-allocated recursively indexed list -- L[[i]][,:=] syntax is only valid when i is length 1, but it's length ", length(j))
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite the obscure error. Especially because the first code works while the second produces this error:

opt = options(datatable.alloccol=1L)
l = list(foo = list(bar = data.table(a = 1:3, b = 4:6)))
l$foo$bar[ , (letters) := 16:18]
l = list(foo = list(bar = data.table(a = 1:3, b = 4:6)))
l[[c('foo', 'bar')]][ , (letters) := 16:18]
options(opt)

j = match(j, names(k))
if (is.na(j)) stop("Item '",origj,"' not found in names of list")
if (is.na(j)) stop("Internal error -- item '", origj, "' not found in names of list") # nocov
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see a way for this to be triggered, so marking it as internal

}
.Call(Csetlistelt,k,as.integer(j), x)
} else if (is.environment(k) && exists(as.character(name[[3L]]), k)) {
Expand Down Expand Up @@ -1118,7 +1120,7 @@ replace_order = function(isub, verbose, env) {
xcolsAns = seq_along(ansvars)
icols = icolsAns = integer()
} else {
if (!length(leftcols)) stop("column(s) not found: ", paste(ansvars[wna],collapse=", "))
if (!length(leftcols)) stop("Internal error -- column(s) not found: ", paste(ansvars[wna],collapse=", ")) # nocov
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto above. AFAICT things that might trigger this are caught much earlier than here.

xcols = w[!wna]
xcolsAns = which(!wna)
map = c(seq_along(i), leftcols) # this map is to handle dups in leftcols, #3635
Expand All @@ -1131,7 +1133,7 @@ replace_order = function(isub, verbose, env) {
if (any(w2na <- is.na(w2))) {
ivars[leftcols] = paste0("i.",ivars[leftcols])
w2[w2na] = chmatch(ansvars[wna][w2na], ivars)
if (any(w2na <- is.na(w2))) stop("column(s) not found: ", paste(ansvars[wna][w2na],sep=", "))
if (any(w2na <- is.na(w2))) stop("Internal error -- column(s) not found: ", paste(ansvars[wna][w2na],sep=", ")) # nocov
}
}
icols = w2
Expand Down Expand Up @@ -1294,7 +1296,7 @@ replace_order = function(isub, verbose, env) {
identical(irows, integer(0L)) && !bynull,
length(irows) && !anyNA(irows) && all(irows==0L) ## anyNA() because all() returns NA (not FALSE) when irows is all-NA. TODO: any way to not check all 'irows' values?
))
if (is.atomic(jval)) jval = jval[0L] else jval = lapply(jval, `[`, 0L)
jval = lapply(jval, `[`, 0L)
Copy link
Copy Markdown
Member Author

@MichaelChirico MichaelChirico Aug 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe is.atomic(jval) is possible here.

Just above you can see this branch enters with either is.list(jval) or !missingby:

https://github.com/Rdatatable/data.table/pull/3761/files#diff-650e4a11ed4384f8e560a6ebfff4ff53L1287

if ((is.call(jsub) && is.list(jval) && jsub[[1L]] != "get" && !is.object(jval)) || !missingby)

Obviously is.atomic(jval) is impossible in the first case. You have to scroll a bit further up but I think the second case is also impossible:

https://github.com/Rdatatable/data.table/pull/3761/files#diff-650e4a11ed4384f8e560a6ebfff4ff53L1168

if (missingby || bynull || (!byjoin && !length(byval)))

So we've either got bynull or (!byjoin && !length(byval)). But I'm not sure how length(byval) can be 0 but bynull is FALSE -- e.g. by = integer() would have byval = list(integer = integer()), so it's already en-listed.

if (is.atomic(jval)) {
setattr(jval,"names",NULL)
jval = data.table(jval) # TO DO: should this be setDT(list(jval)) instead?
Expand Down Expand Up @@ -1884,7 +1886,7 @@ as.matrix.data.table = function(x, rownames=NULL, rownames.value=NULL, ...) {
non.numeric = non.atomic = FALSE
all.logical = TRUE
for (j in seq_len(p)) {
if (is.ff(X[[j]])) X[[j]] = X[[j]][] # to bring the ff into memory, since we need to create a matrix in memory
if (is.ff(X[[j]])) X[[j]] = X[[j]][] # nocov to bring the ff into memory, since we need to create a matrix in memory
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't have ff in Suggests, so nocov. We have is.ff as a simple wrapper, so technically we could construct some object and force ff class to it & test but that's not really a test of what this branch is supposed to do. So nocov unless we want to add ff to Suggests, or if you think we should add ff to the test packages.

xj = X[[j]]
if (length(dj <- dim(xj)) == 2L && dj[2L] > 1L) {
if (inherits(xj, "data.table"))
Expand Down Expand Up @@ -1930,13 +1932,13 @@ as.matrix.data.table = function(x, rownames=NULL, rownames.value=NULL, ...) {

# bug #2375. fixed. same as head.data.frame and tail.data.frame to deal with negative indices
head.data.table = function(x, n=6L, ...) {
if (!cedta()) return(NextMethod())
if (!cedta()) return(NextMethod()) # nocov
stopifnot(length(n) == 1L)
i = seq_len(if (n<0L) max(nrow(x)+n, 0L) else min(n,nrow(x)))
x[i, , ]
}
tail.data.table = function(x, n=6L, ...) {
if (!cedta()) return(NextMethod())
if (!cedta()) return(NextMethod()) # nocov
stopifnot(length(n) == 1L)
n = if (n<0L) max(nrow(x) + n, 0L) else min(n, nrow(x))
i = seq.int(to=nrow(x), length.out=n)
Expand Down Expand Up @@ -2077,7 +2079,7 @@ within.data.table = function (data, expr, ...)
# basically within.list but retains key (if any)
# will be slower than using := or a regular query (see ?within for further info).
{
if (!cedta()) return(NextMethod())
if (!cedta()) return(NextMethod()) # nocov
parent = parent.frame()
e = evalq(environment(), data, parent)
eval(substitute(expr), e) # might (and it's known that some user code does) contain rm()
Expand All @@ -2101,7 +2103,7 @@ within.data.table = function (data, expr, ...)
transform.data.table = function (`_data`, ...)
# basically transform.data.frame with data.table instead of data.frame, and retains key
{
if (!cedta()) return(NextMethod())
if (!cedta()) return(NextMethod()) # nocov
e = eval(substitute(list(...)), `_data`, parent.frame())
tags = names(e)
inx = chmatch(tags, names(`_data`))
Expand Down Expand Up @@ -2176,7 +2178,7 @@ any_na = function(x, by=seq_along(x)) .Call(CanyNA, x, by)

na.omit.data.table = function (object, cols = seq_along(object), invert = FALSE, ...) {
# compare to stats:::na.omit.data.frame
if (!cedta()) return(NextMethod())
if (!cedta()) return(NextMethod()) # nocov
if ( !missing(invert) && is.na(as.logical(invert)) )
stop("Argument 'invert' must be logical TRUE/FALSE")
if (is.character(cols)) {
Expand Down Expand Up @@ -2761,15 +2763,15 @@ rowid = function(..., prefix=NULL) {

rowidv = function(x, cols=seq_along(x), prefix=NULL) {
if (!is.null(prefix) && (!is.character(prefix) || length(prefix) != 1L))
stop("prefix must be NULL or a character vector of length=1.")
stop("'prefix' must be NULL or a character vector of length 1.")
if (is.atomic(x)) {
if (!missing(cols) && !is.null(cols))
stop("x is a single vector, non-NULL 'cols' doesn't make sense.")
cols = 1L
x = as_list(x)
} else {
if (!length(cols))
stop("x is a list, 'cols' can not be on 0-length.")
stop("x is a list, 'cols' cannot be 0-length.")
if (is.character(cols))
cols = chmatch(cols, names(x))
cols = as.integer(cols)
Expand All @@ -2790,15 +2792,15 @@ rleid = function(..., prefix=NULL) {

rleidv = function(x, cols=seq_along(x), prefix=NULL) {
if (!is.null(prefix) && (!is.character(prefix) || length(prefix) != 1L))
stop("prefix must be NULL or a character vector of length=1.")
stop("'prefix' must be NULL or a character vector of length 1.")
if (is.atomic(x)) {
if (!missing(cols) && !is.null(cols))
stop("x is a single vector, non-NULL 'cols' doesn't make sense.")
cols = 1L
x = as_list(x)
} else {
if (!length(cols))
stop("x is a list, 'cols' can not be 0-length.")
stop("x is a list, 'cols' cannot be 0-length.")
if (is.character(cols))
cols = chmatch(cols, names(x))
cols = as.integer(cols)
Expand Down Expand Up @@ -2880,7 +2882,7 @@ isReallyReal = function(x) {
## redirect to normal DT[x == TRUE]
stub = call("==", as.symbol(col), TRUE)
}
if (length(stub[[1L]]) != 1) return(NULL) ## Whatever it is, definitely not one of the valid operators
if (length(stub[[1L]]) != 1) return(NULL) # nocov Whatever it is, definitely not one of the valid operators
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see a way to reach this branch and the comment suggests it's there as an internal default, could you confirm @MarkusBonsch?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are completely right.

operator = as.character(stub[[1L]])
if (!operator %chin% validOps$op) return(NULL) ## operator not supported
if (!is.name(stub[[2L]])) return(NULL)
Expand All @@ -2902,7 +2904,6 @@ isReallyReal = function(x) {
# the mode() checks also deals with NULL since mode(NULL)=="NULL" and causes this return, as one CRAN package (eplusr 0.9.1) relies on
return(NULL)
}
if(is.character(x[[col]]) && !operator %chin% c("==", "%in%", "%chin%")) return(NULL) ## base R allows for non-equi operators on character columns, but these can't be optimized.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant to here in current logic:

https://github.com/Rdatatable/data.table/pull/3761/files#diff-650e4a11ed4384f8e560a6ebfff4ff53L2885

if (!operator %chin% validOps$op)

if (!operator %chin% c("%in%", "%chin%")) {
# additional requirements for notjoin and NA values. Behaviour is different for %in%, %chin% compared to other operators
# RHS is of length=1 or n
Expand Down Expand Up @@ -2998,7 +2999,6 @@ isReallyReal = function(x) {
pat = paste0("(", ops, ")", collapse="|")
if (is.call(onsub) && onsub[[1L]] == "eval") {
onsub = eval(onsub[[2L]], parent.frame(2L), parent.frame(2L))
if (is.call(onsub) && onsub[[1L]] == "eval") { onsub = onsub[[2L]] }
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The eval will eliminate any level of nested eval/quote already, so I don't think this branch is possible; see:

DT <- data.table(id = 1:3, `counts(a>=0)` = 1:3, sameName = 1:3)
i <- data.table(idi = 1:3, `  weirdName>=` = 1:3, sameName = 1:3)
DT[i, on = eval(eval(quote(eval("id<=idi"))))]

After eval onsub becomes id<=idi

}
if (is.call(onsub) && as.character(onsub[[1L]]) %chin% c("list", ".")) {
spat = paste0("[ ]+(", pat, ")[ ]+")
Expand Down
2 changes: 1 addition & 1 deletion R/groupingsets.R
Original file line number Diff line number Diff line change
Expand Up @@ -82,7 +82,7 @@ groupingsets.data.table = function(x, j, by, sets, .SDcols, id = FALSE, jj, ...)
}
}
if (id && "grouping" %chin% names(empty)) # `j` could have been evaluated to `grouping` field
stop("When using `id=TRUE` the 'j' expression must not evaluate to column named 'grouping'.")
stop("When using `id=TRUE` the 'j' expression must not evaluate to a column named 'grouping'.")
if (anyDuplicated(names(empty)) > 0L)
stop("There exists duplicated column names in the results, ensure the column passed/evaluated in `j` and those in `by` are not overlapping.")
# adding grouping column to template - aggregation level identifier
Expand Down
3 changes: 1 addition & 2 deletions R/print.data.table.R
Original file line number Diff line number Diff line change
Expand Up @@ -83,8 +83,7 @@ print.data.table = function(x, topn=getOption("datatable.print.topn"),
expression = "<expr>", ordered = "<ord>")
classes = vapply(x, function(col) class(col)[1L], "", USE.NAMES=FALSE)
abbs = unname(class_abb[classes])
if ( length(idx <- which(is.na(abbs))) )
abbs[idx] = paste0("<", classes[idx], ">")
if ( length(idx <- which(is.na(abbs))) ) abbs[idx] = paste0("<", classes[idx], ">")
toprint = rbind(abbs, toprint)
rownames(toprint)[1L] = ""
}
Expand Down
2 changes: 1 addition & 1 deletion R/setops.R
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,7 @@ all.equal.data.table = function(target, current, trim.levels=TRUE, check.attribu
paste0(names(targetTypes)[w],"(",paste(targetTypes[w],currentTypes[w],sep="!="),")")
,collapse=" ")))
}

# check key
k1 = key(target)
k2 = key(current)
Expand Down
4 changes: 2 additions & 2 deletions R/transpose.R
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,8 @@ transpose = function(l, fill=NA, ignore.empty=FALSE, keep.names=NULL, make.names
}

tstrsplit = function(x, ..., fill=NA, type.convert=FALSE, keep, names=FALSE) {
if (!isTRUEorFALSE(names) && !is.character(names))
stop("'names' must be TRUE/FALSE or a character vector.")
ans = transpose(strsplit(as.character(x), ...), fill=fill, ignore.empty=FALSE)
if (!missing(keep)) {
keep = suppressWarnings(as.integer(keep))
Expand All @@ -37,8 +39,6 @@ tstrsplit = function(x, ..., fill=NA, type.convert=FALSE, keep, names=FALSE) {
if(type.convert) ans = lapply(ans, type.convert, as.is = TRUE)
if (isFALSE(names)) return(ans)
else if (isTRUE(names)) names = paste0("V", seq_along(ans))
if (!is.character(names))
stop("'names' must be TRUE/FALSE or a character vector.")
if (length(names) != length(ans)) {
str = if (missing(keep)) "ans" else "keep"
stop("length(names) (= ", length(names),
Expand Down
Loading