Completes coverage of all .R files by MichaelChirico · Pull Request #3761 · Rdatatable/data.table

MichaelChirico · 2019-08-11T05:01:48Z

Closes #3758 too.

Please see also inline notes

MichaelChirico · 2019-08-11T05:02:30Z

    }
    if (xclass == iclass) {
-      if (verbose) cat("i.",names(i)[ic],"has same type (",xclass,") as x.",names(x)[xc],". No coercion needed.\n", sep="")
+      if (verbose) cat("i.", names(i)[ic], " has same type (", xclass, ") as x.", names(x)[xc], ". No coercion needed.\n", sep="")


Mainly a missing space, noticed in the verbose output for test 2074.18.
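A small sketch of why the space matters: with sep="", cat() puts nothing between its arguments, so every visible space has to live inside a quoted string. The column names and type below are made-up placeholders, not values from the actual test.

```r
# Hypothetical stand-ins for names(i)[ic], xclass and names(x)[xc]
icol = "id"; xclass = "integer"; xcol = "key"

# Old message: no leading space in "has same type (", so the words run together
old = capture.output(cat("i.", icol, "has same type (", xclass, ") as x.", xcol, ". No coercion needed.\n", sep=""))

# New message: the space sits inside the quoted string
new = capture.output(cat("i.", icol, " has same type (", xclass, ") as x.", xcol, ". No coercion needed.\n", sep=""))

print(old)
print(new)
```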

MichaelChirico · 2019-08-11T05:03:06Z

-                tt = grep("^eval|^[^[:alpha:]. ]",byvars,invert=TRUE,value=TRUE)
-                if (length(tt)) tt = tt[1L] else all.vars(bysubl[[jj+1L]])[1L]
+                # take the first variable that is (1) not eval (#3758) and (2) starts with a character that can't start a variable name
+                tt = grep("^eval$|^[^[:alpha:]. ]", byvars, invert=TRUE, value=TRUE)


simple fix for #3758 -- add ending anchor $ to the regex for eval
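To illustrate the effect of the anchor, here is a minimal base R sketch. The byvars vector is invented for illustration; the two patterns are the ones from the diff above.

```r
# Hypothetical by-variable names; only the exact name "eval" should be skipped
byvars = c("eval", "evaluated", "+", "x")

# Old pattern: "^eval" also matches names that merely start with "eval"
old = grep("^eval|^[^[:alpha:]. ]", byvars, invert=TRUE, value=TRUE)

# New pattern: "^eval$" matches only the exact name "eval"
new = grep("^eval$|^[^[:alpha:]. ]", byvars, invert=TRUE, value=TRUE)

print(old)
print(new)
```

With the unanchored pattern, a variable such as "evaluated" is dropped along with "eval"; with the anchored one it survives.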

MichaelChirico · 2019-08-11T05:03:33Z

+                # take the first variable that is (1) not eval (#3758) and (2) starts with a character that can't start a variable name
+                tt = grep("^eval$|^[^[:alpha:]. ]", byvars, invert=TRUE, value=TRUE)
+                # byvars but exclude functions or `0`+`1` becomes `+`
+                tt = if (length(tt)) tt[1L] else all.vars(bysubl[[jj+1L]])[1L]


I believe the previous code did nothing in the else case (the all.vars() result was never assigned to tt), so I'm not 100% sure of the intended behavior, but I believe this is it.
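The difference between the two forms can be sketched in base R as follows. The fallback value is a hypothetical stand-in for all.vars(bysubl[[jj+1L]])[1L].

```r
fallback = "v1"                      # hypothetical stand-in for all.vars(...)[1L]

# Old form: the else branch evaluates `fallback` but never assigns it to tt
tt_old = character()                 # pretend grep() found nothing
if (length(tt_old)) tt_old = tt_old[1L] else fallback

# New form: the value of the whole if/else expression is assigned
tt_new = character()
tt_new = if (length(tt_new)) tt_new[1L] else fallback

print(tt_old)   # still empty
print(tt_new)
```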

MichaelChirico · 2019-08-11T05:05:47Z

                origj = j = if (name[[1L]] == "$") as.character(name[[3L]]) else eval(name[[3L]], parent.frame(), parent.frame())
                if (is.character(j)) {
-                  if (length(j)!=1L) stop("L[[i]][,:=] syntax only valid when i is length 1, but it's length %d",length(j))
+                  if (length(j)!=1L) stop("Cannot assign to an under-allocated recursively indexed list -- L[[i]][,:=] syntax is only valid when i is length 1, but it's length ", length(j))


Quite the obscure error, especially because the first snippet below works while the second produces this error:

    opt = options(datatable.alloccol=1L)
    l = list(foo = list(bar = data.table(a = 1:3, b = 4:6)))
    l$foo$bar[ , (letters) := 16:18]
    l = list(foo = list(bar = data.table(a = 1:3, b = 4:6)))
    l[[c('foo', 'bar')]][ , (letters) := 16:18]
    options(opt)

MichaelChirico · 2019-08-11T05:06:36Z

+                  if (length(j)!=1L) stop("Cannot assign to an under-allocated recursively indexed list -- L[[i]][,:=] syntax is only valid when i is length 1, but it's length ", length(j))
                  j = match(j, names(k))
-                  if (is.na(j)) stop("Item '",origj,"' not found in names of list")
+                  if (is.na(j)) stop("Internal error -- item '", origj, "' not found in names of list") # nocov


I can't see a way for this to be triggered, so marking it as internal

MichaelChirico · 2019-08-11T05:07:17Z

        icols = icolsAns = integer()
      } else {
-        if (!length(leftcols)) stop("column(s) not found: ", paste(ansvars[wna],collapse=", "))
+        if (!length(leftcols)) stop("Internal error -- column(s) not found: ", paste(ansvars[wna],collapse=", ")) # nocov


Ditto above. AFAICT things that might trigger this are caught much earlier than here.

codecov · 2019-08-11T05:12:53Z

Codecov Report

Merging #3761 into master will increase coverage by 0.58%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3761      +/-   ##
==========================================
+ Coverage   98.83%   99.41%   +0.58%     
==========================================
  Files          70       70              
  Lines       13228    13204      -24     
==========================================
+ Hits        13074    13127      +53     
+ Misses        154       77      -77

Impacted Files	Coverage Δ
R/setops.R	`100% <ø> (+0.52%)`	⬆️
R/IDateTime.R	`100% <ø> (+1.24%)`	⬆️
src/fread.c	`99.44% <ø> (+0.94%)`	⬆️
R/print.data.table.R	`100% <100%> (+4.95%)`	⬆️
R/transpose.R	`100% <100%> (ø)`	⬆️
R/data.table.R	`100% <100%> (+2.16%)`	⬆️
src/bmerge.c	`100% <100%> (+1.74%)`	⬆️
R/groupingsets.R	`100% <100%> (+3.57%)`	⬆️
R/bmerge.R	`100% <100%> (ø)`	⬆️
... and 11 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 30a4fcd...8c1fe1b.

MichaelChirico · 2019-08-11T05:22:57Z

          length(irows) && !anyNA(irows) && all(irows==0L) ## anyNA() because all() returns NA (not FALSE) when irows is all-NA. TODO: any way to not check all 'irows' values?
          ))
-        if (is.atomic(jval)) jval = jval[0L] else jval = lapply(jval, `[`, 0L)
+        jval = lapply(jval, `[`, 0L)


I don't believe is.atomic(jval) is possible here.

Just above you can see this branch enters with either is.list(jval) or !missingby:

https://github.com/Rdatatable/data.table/pull/3761/files#diff-650e4a11ed4384f8e560a6ebfff4ff53L1287

if ((is.call(jsub) && is.list(jval) && jsub[[1L]] != "get" && !is.object(jval)) || !missingby)

Obviously is.atomic(jval) is impossible in the first case. You have to scroll a bit further up but I think the second case is also impossible:

https://github.com/Rdatatable/data.table/pull/3761/files#diff-650e4a11ed4384f8e560a6ebfff4ff53L1168

if (missingby || bynull || (!byjoin && !length(byval)))

So we've either got bynull or (!byjoin && !length(byval)). But I'm not sure how length(byval) can be 0 but bynull is FALSE -- e.g. by = integer() would have byval = list(integer = integer()), so it's already en-listed.
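A base R sketch of the two facts this argument leans on, using an invented jval: a list is never atomic, and lapply(jval, `[`, 0L) empties each column while keeping its type. It also shows that an en-listed zero-length by value still has length 1.

```r
# Hypothetical j result with two columns of different types
jval = list(a = 1:3, b = c("x", "y", "z"))

# A list is never atomic, so the removed branch cannot fire for list input
stopifnot(is.list(jval), !is.atomic(jval))

# Subset every column to zero rows; column types are preserved
empty = lapply(jval, `[`, 0L)
str(empty)

# A zero-length vector wrapped in a list still gives a list of length 1
byval = list(integer())
print(length(byval))
```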

MichaelChirico · 2019-08-11T05:25:10Z

  all.logical = TRUE
  for (j in seq_len(p)) {
-    if (is.ff(X[[j]])) X[[j]] = X[[j]][]   # to bring the ff into memory, since we need to create a matrix in memory
+    if (is.ff(X[[j]])) X[[j]] = X[[j]][]   # nocov to bring the ff into memory, since we need to create a matrix in memory


We don't have ff in Suggests, so nocov. We have is.ff as a simple wrapper, so technically we could construct some object, force the ff class onto it and test, but that's not really a test of what this branch is supposed to do. So nocov, unless we want to add ff to Suggests or you think we should add ff to the test packages.

MichaelChirico · 2019-08-11T05:26:06Z

      stub = call("==", as.symbol(col), TRUE)
    }
-    if (length(stub[[1L]]) != 1) return(NULL) ## Whatever it is, definitely not one of the valid operators
+    if (length(stub[[1L]]) != 1) return(NULL) # nocov Whatever it is, definitely not one of the valid operators


I can't see a way to reach this branch, and the comment suggests it's there as an internal default. Could you confirm, @MarkusBonsch?

You are completely right.

MichaelChirico · 2019-08-11T05:26:40Z

      # the mode() checks also deals with NULL since mode(NULL)=="NULL" and causes this return, as one CRAN package (eplusr 0.9.1) relies on
      return(NULL)
    }
-    if(is.character(x[[col]]) && !operator %chin% c("==", "%in%", "%chin%")) return(NULL) ## base R allows for non-equi operators on character columns, but these can't be optimized.


Redundant with this check in the current logic:

https://github.com/Rdatatable/data.table/pull/3761/files#diff-650e4a11ed4384f8e560a6ebfff4ff53L2885

if (!operator %chin% validOps$op)

MichaelChirico · 2019-08-11T05:28:13Z

  pat = paste0("(", ops, ")", collapse="|")
  if (is.call(onsub) && onsub[[1L]] == "eval") {
    onsub = eval(onsub[[2L]], parent.frame(2L), parent.frame(2L))
-    if (is.call(onsub) && onsub[[1L]] == "eval") { onsub = onsub[[2L]] }


The eval will eliminate any level of nested eval/quote already, so I don't think this branch is possible; see:

    DT <- data.table(id = 1:3, `counts(a>=0)` = 1:3, sameName = 1:3)
    i <- data.table(idi = 1:3, ` weirdName>=` = 1:3, sameName = 1:3)
    DT[i, on = eval(eval(quote(eval("id<=idi"))))]

After eval onsub becomes id<=idi
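The same unwrapping can be reproduced without data.table. This is only a sketch: onsub is built by hand with quote() rather than captured via substitute(on), and the default evaluation environment is used instead of parent.frame(2L).

```r
# Hand-built stand-in for substitute(on) in the example above
onsub = quote(eval(eval(quote(eval("id<=idi")))))

# The outer check from the diff: a call whose head is `eval`
stopifnot(is.call(onsub), onsub[[1L]] == "eval")

# Evaluating the argument resolves the inner eval/quote layers in one go
res = eval(onsub[[2L]])

print(res)
print(is.call(res))   # no call left, so a second `eval` check has nothing to strip
```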

MichaelChirico · 2019-08-11T05:29:19Z

Once merged we'll be over 99% coverage! Mission accomplished 😎

MichaelChirico · 2019-08-11T12:54:37Z

 `+.IDate` = function (e1, e2) {
  if (nargs() == 1L)
    return(e1)
+  # TODO: investigate Ops.IDate method a la Ops.difftime


As noted in ?Ops

The classes of both arguments are considered in dispatching any member of this group. For each argument its vector of classes is examined to see if there is a matching specific (preferred) or Ops method. If a method is found for just one argument or the same method is found for both, it is used. If different methods are found, there is a warning about ‘incompatible methods’: in that case or if no method is found for either argument the internal method is used.

Hence we can't really reach this branch since the incompatible methods barrier is hit first.
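The dispatch rule quoted above can be seen with two toy classes. The class names foo and bar and their methods are invented for illustration and have nothing to do with IDate itself.

```r
# Two hypothetical classes, each with its own `+` method
`+.foo` = function(e1, e2) "foo method"
`+.bar` = function(e1, e2) "bar method"

x = structure(1, class = "foo")
y = structure(2, class = "bar")

# Different methods are found for the two arguments, so R warns and
# falls back to the internal `+`; capture the warning text for inspection
msg = NULL
res = withCallingHandlers(
  x + y,
  warning = function(w) {
    msg <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

print(msg)
print(unclass(res))   # plain numeric addition, neither method was called
```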

MichaelChirico · 2019-08-11T12:56:17Z

      if (!length(x)) return(x) else return(x[[length(x)]])  # for vectors, [[ works like [
    } else if (is.data.frame(x)) return(x[NROW(x),])
  }
+  # nocov start


Coverage showed the if branches as covered in first but not in last (even though apparently neither is actually hit in tests), so I just put both whole sections in nocov to be sure. Maybe worth surfacing to Jim as well...

MichaelChirico · 2019-08-11T12:57:05Z

      make.names=chmatch(make.names, names(l))
      if (is.na(make.names))
-        stop("make.names='",make.names,"' not found in names of input")
+        stop("make.names not found in names of input")


The original value of make.names has already been overwritten by the chmatch() result at this point. Rather than complicate the logic (by one or two extra lines), I just dropped it from the message here, but that's easy to change.
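For reference, a sketch of the "one or two extra lines" alternative, which keeps the lookup result in a separate variable so the original name is still available for the message. The helper find_col is hypothetical, and base match() stands in for data.table's chmatch().

```r
# Hypothetical helper; match() is used here in place of chmatch()
find_col = function(l, make.names) {
  idx = match(make.names, names(l))   # do not overwrite make.names
  if (is.na(idx))
    stop("make.names='", make.names, "' not found in names of input")
  idx
}

l = list(id = 1:2, val = 3:4)
found = find_col(l, "val")

# Capture the error message for a name that is not present
err = tryCatch(find_col(l, "nope"), error = conditionMessage)

print(found)
print(err)
```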

MichaelChirico · 2019-08-11T12:58:26Z

+test(1613.563, all(
+  all.equal(rbind(x,y), rbind(y,y), ignore.row.order=FALSE),
+  all.equal(rbind(x,y), rbind(y,y), ignore.row.order=TRUE),
+  all.equal(rbind(y,y), rbind(x,y), ignore.row.order=TRUE)


New test switches the order to hit the first-but-not-second-has-dups branch whose second-but-not-first counterpart is hit by the 2nd test already

mattdowle · 2019-08-12T21:58:50Z

Fantastic! I went through it and it all looks good. Great comments. I added a comment in tests.Rraw for the new 2074.* tests pointing back to this PR so we can get back to these comments if we ever need to.

Closes #3758 and completes coverage of data.table.R

45ce60e

MichaelChirico commented Aug 11, 2019

actually cover line

c4a2f9f

jangorecki reviewed Aug 11, 2019

Comment thread R/data.table.R Outdated

complete coverage of all R source files

c27521f

MichaelChirico changed the title ~~Closes #3758 and completes coverage of data.table.R~~ Closes #3758 and completes coverage of all files in R/ Aug 11, 2019

MichaelChirico commented Aug 11, 2019

Michael Chirico added 3 commits August 11, 2019 21:07

covered wrong groupingsets error

6915452

fix test

9c72433

some C coverage too while were at it

b863930

mattdowle changed the title ~~Closes #3758 and completes coverage of all files in R/~~ Completes coverage of all .R files Aug 12, 2019

Merge branch 'master' into dt_covr

5d96c71

mattdowle added this to the 1.12.4 milestone Aug 12, 2019

very minor, spaces only: in cat(..., sep='') I tend to use no spaces between arguments so we can see where the real spaces inside quotes are. In this instance the diff now shows where the real space was added.

8c1fe1b

mattdowle merged commit f7e82a5 into master Aug 12, 2019

mattdowle deleted the dt_covr branch August 12, 2019 22:49

4 participants
