diff --git a/NEWS.md b/NEWS.md index f0581ccb52..c91e72b81c 100644 --- a/NEWS.md +++ b/NEWS.md @@ -18,7 +18,7 @@ DT = data.table(A=1:3, B=letters[1:3]) DT[A>3, .(ITEM='A>3', A, B)] # (1) DT[A>3][, .(ITEM='A>3', A, B)] # (2) - # the above are now equivalent as expected and return: + # the above are now equivalent as expected and return: Empty data.table (0 rows and 3 cols): ITEM,A,B # Previously, (2) returned : ITEM A B @@ -30,12 +30,12 @@ 2: In as.data.table.list(jval, .named = NULL) : Item 3 has 0 rows but longest item has 1; filled with NA ``` - + ```R DT = data.table(A=1:3, B=letters[1:3], key="A") DT[.(1:3, double()), B] # new result : - character(0) + character(0) # old result : [1] "a" "b" "c" Warning message: @@ -51,7 +51,7 @@ DT[, sum(colB), keyby="colA"] DT[, sum(colB), by="colA", keyby=TRUE] # same ``` - + 7. `fwrite()` gains a new `datatable.fwrite.sep` option to change the default separator, still `","` by default. Thanks to Tony Fischetti for the PR. As is good practice in R in general, we usually resist new global options for the reason that a user changing the option for their own code can inadvertently change the behaviour of any package using `data.table` too. However, in this case, the global option affects file output rather than code behaviour. In fact, the very reason the user may wish to change the default separator is that they know a different separator is more appropriate for their data being passed to the package using `fwrite` but cannot otherwise change the `fwrite` call within that package. 8. `melt()` now supports `NA` entries when specifying a list of `measure.vars`, which translate into runs of missing values in the output. Useful for melting wide data with some missing columns, [#4027](https://github.com/Rdatatable/data.table/issues/4027). Thanks to @vspinu for reporting, and @tdhock for implementing. @@ -86,7 +86,7 @@ out_col_name = "sum_x" )] ``` - + 11. `DT[, if (...) .(a=1L) else .(a=1L, b=2L), by=group]` now returns a 1-column result with warning `j may not evaluate to the same number of columns for each group`, rather than error `'names' attribute [2] must be the same length as the vector`, [#4274](https://github.com/Rdatatable/data.table/issues/4274). Thanks to @robitalec for reporting, and Michael Chirico for the PR. 12. Typo checking in `i` available since 1.11.4 is extended to work in non-English sessions, [#4989](https://github.com/Rdatatable/data.table/issues/4989). Thanks to Michael Chirico for the PR. @@ -114,7 +114,7 @@ ```R mtcars |> DT(mpg>20, .(mean_hp=mean(hp)), by=cyl) ``` - + 23. `DT[i, nomatch=NULL]` where `i` contains row numbers now excludes `NA` and any outside the range [1,nrow], [#3109](https://github.com/Rdatatable/data.table/issues/3109) [#3666](https://github.com/Rdatatable/data.table/issues/3666). Before, `NA` rows were returned always for such values; i.e. `nomatch=0|NULL` was ignored. Thanks Michel Lang and Hadley Wickham for the requests, and Jan Gorecki for the PR. Using `nomatch=0` in this case when `i` is row numbers generates the warning `Please use nomatch=NULL instead of nomatch=0; see news item 5 in v1.12.0 (Jan 2019)`. ```R @@ -246,26 +246,26 @@ # 2: b NA # NA because there are no non-NA, naturally # no inconvenient warning ``` - + 36. `DT[, min(int64Col), by=grp]` (and `max`) would return incorrect results for `bit64::integer64` columns, [#4444](https://github.com/Rdatatable/data.table/issues/4444). Thanks to @go-see for reporting, and Michael Chirico for the PR. 37. `fread(dec=',')` was able to guess `sep=','` and return an incorrect result, [#4483](https://github.com/Rdatatable/data.table/issues/4483). Thanks to Michael Chirico for reporting and fixing. It was already an error to provide both `sep=','` and `dec=','` manually. ```R fread('A|B|C\n1|0,4|a\n2|0,5|b\n', dec=',') # no problem - + # A B C # # 1: 1 0.4 a # 2: 2 0.5 b fread('A|B,C\n1|0,4\n2|0,5\n', dec=',') - + # A|B C # old result guessed sep=',' despite dec=',' # # 1: 1|0 4 # 2: 2|0 5 - + # A B,C # now detects sep='|' correctly # # 1: 1 0.4 @@ -276,9 +276,9 @@ ``` IDateTime("20171002095500", format="%Y%m%d%H%M%S") - + # was : - # Error in charToDate(x) : + # Error in charToDate(x) : # character string is not in a standard unambiguous format # now : @@ -287,6 +287,8 @@ # 1: 2017-10-02 09:55:00 ``` +39. `DT[i, sum(b), by=grp]` (and other optimized-by-group aggregates: `mean`, `var`, `sd`, `median`, `prod`, `min`, `max`, `first`, `last`, `head` and `tail`) could segfault if `i` contained row numbers and one or more were NA, [#1994](https://github.com/Rdatatable/data.table/issues/1994). Thanks to Arun Srinivasan for reporting, and Benjamin Schwendinger for the PR. + ## NOTES 1. New feature 29 in v1.12.4 (Oct 2019) introduced zero-copy coercion. Our thinking is that requiring you to get the type right in the case of `0` (type double) vs `0L` (type integer) is too inconvenient for you the user. So such coercions happen in `data.table` automatically without warning. Thanks to zero-copy coercion there is no speed penalty, even when calling `set()` many times in a loop, so there's no speed penalty to warn you about either. However, we believe that assigning a character value such as `"2"` into an integer column is more likely to be a user mistake that you would like to be warned about. The type difference (character vs integer) may be the only clue that you have selected the wrong column, or typed the wrong variable to be assigned to that column. For this reason we view character to numeric-like coercion differently and will warn about it. If it is correct, then the warning is intended to nudge you to wrap the RHS with `as.()` so that it is clear to readers of your code that a coercion from character to that type is intended. For example : diff --git a/R/test.data.table.R b/R/test.data.table.R index 0c7fbeb23a..65a62fd0b5 100644 --- a/R/test.data.table.R +++ b/R/test.data.table.R @@ -178,7 +178,7 @@ test.data.table = function(script="tests.Rraw", verbose=FALSE, pkg=".", silent=F catf("10 longest running tests took %ds (%d%% of %ds)\n", as.integer(tt<-DT[, sum(time)]), as.integer(100*tt/(ss<-timings[,sum(time)])), as.integer(ss)) print(DT, class=FALSE) - catf("All %d tests (last %s) in %s completed ok in %s\n", ntest, env$prevtest, names(fn), timetaken(env$started.at)) + catf("All %d tests (last %.8g) in %s completed ok in %s\n", ntest, env$prevtest, names(fn), timetaken(env$started.at)) ## this chunk requires to include new suggested deps: graphics, grDevices #memtest.plot = function(.inittime) { diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw index d477b13e7d..4d782e4e92 100644 --- a/inst/tests/tests.Rraw +++ b/inst/tests/tests.Rraw @@ -3257,7 +3257,7 @@ Sep,33.5,19.4,15.7,11.9,0,100.8,100.8,0,12.7,12.7,0,174.1") x[ , r := as.raw(c(0, 1))] test(1037.414, melt(x, id.vars='x1', measure.vars='r'), error="Unknown column type 'raw' for column 'r'") - + # test dispatch for non-data.table objects, #4864. if (inherits(try(getNamespace("reshape2"), silent=TRUE),"try-error")) { test(1038.001, melt(as.data.frame(DT), id.vars=1:2, measure.vars=5:6), @@ -6759,7 +6759,7 @@ if (test_xts) { " 6: 1970-01-07 6", " 7: 1970-01-08 7", " 8: 1970-01-09 8", " 9: 1970-01-10 9", "10: 1970-01-11 10")) options(old) - + # as.data.table.xts(foo) had incorrect integer index with a column name called 'x', #4897 M = xts::as.xts(matrix(1, dimnames=list("2021-05-23", "x"))) # xts:: just to be extra robust; shouldn't be needed with rm(as.xts) above test(1465.19, inherits(as.data.table(M)$index,"POSIXct")) @@ -14711,9 +14711,9 @@ DT = data.table(id = c(1L,1L,2L), v = as.raw(0:2)) test(2020.01, DT[, min(v), by=id], error="'raw' not supported by GForce min/max") test(2020.02, DT[, max(v), by=id], error="'raw' not supported by GForce min/max") test(2020.03, DT[, median(v), by=id], error="'raw' not supported by GForce median") -test(2020.04, DT[, head(v, 1), by=id], error="'raw' not supported by GForce head") -test(2020.05, DT[, tail(v, 1), by=id], error="'raw' not supported by GForce tail") -test(2020.06, DT[, v[1], by=id], error="'raw' not supported by GForce subset") +test(2020.04, DT[, head(v, 1), by=id], error="'raw' not supported by GForce head/tail/first/last/`[`") +test(2020.05, DT[, tail(v, 1), by=id], error="'raw' not supported by GForce head/tail/first/last/`[`") +test(2020.06, DT[, v[1], by=id], error="'raw' not supported by GForce head/tail/first/last/`[`") test(2020.07, DT[, sd(v), by=id], error="'raw' not supported by GForce sd") test(2020.08, DT[, var(v), by=id], error="'raw' not supported by GForce var") test(2020.09, DT[, prod(v), by=id], error="'raw' not supported by GForce prod") @@ -17062,7 +17062,7 @@ registerS3method("format_col", "complex", format_col.complex) x = data.table(z = c(1 + 3i, 2 - 1i, pi + 2.718i)) test(2130.12, x, output = '(1.0, 3.0i)') rm(format_col.complex) -registerS3method("format_col", "complex", format_col.default) +registerS3method("format_col", "complex", format_col.default) # otherwise it remains registered after test.data.table() and causes test 1610.1 to fail on the next run for example, and user display if they have complex data # haven't found a way to unregister an S3 method (tried registering NULL but there's an error that NULL isn't a function) @@ -17779,7 +17779,7 @@ test(2188.12, fifelse(c(TRUE, FALSE, TRUE, NA), NA, NA, as.Date("2020-01-01")), test(2188.13, fifelse(TRUE, 1L, 2.0, "a"), error="'na' is of type character but 'no' is double. Please") # smart error message test(2188.14, fifelse(TRUE, NA, 2, as.Date("2019-07-07")), error="'no' has different class than 'na'. Please") test(2188.15, fifelse(TRUE, NA, factor('a'), factor('a', levels = c('a','b'))), error="'no' and 'na' are both type factor but their levels are different") -test(2188.16, fifelse(c(NA, NA), 1L, 2L, NULL), c(NA_integer_, NA_integer_)) # NULL `na` is treated as NA +test(2188.16, fifelse(c(NA, NA), 1L, 2L, NULL), c(NA_integer_, NA_integer_)) # NULL `na` is treated as NA # rolling join expected output on non-matching join column has been fixed #1913 DT = data.table(ID=1:5, A=c(1.3, 1.7, 2.4, 0.9, 0.6)) @@ -17821,7 +17821,7 @@ if (test_bit64) { DT[a==1, a:=12] DT[a==2, a:=as.integer64(13)] test(2193.1, DT, data.table(a = as.integer64(c(12,13,3:10)))) - + # X[Y,,by=.EACHI] when Y contains integer64 also fixed in 1.12.4, #3779 X = data.table(x=1:3) Y = data.table(x=1:2, y=as.integer64(c(10,20))) @@ -17899,7 +17899,7 @@ setDTthreads() # restore default throttle # fwrite now allows sep="", #4817 test(2202.1, fwrite(data.frame(a="id", b=letters[1:5], c=1:5), sep=""), output = c("abc", paste0("id", letters[1:5], 1:5))) -test(2202.2, fwrite(data.frame(a="id", b=1:1e2), sep=""), +test(2202.2, fwrite(data.frame(a="id", b=1:1e2), sep=""), output = c("ab", paste0("id", 1:1e2))) test(2202.3, fwrite(data.table(a=c(NA, 2, 3.01), b=c('foo', NA, 'bar')), sep=""), output=c("ab", "foo", "2", "3.01bar")) @@ -18009,3 +18009,29 @@ test(2210.24, DT[-c(1L,0L)], data.table(x=2:4)) # codecov gap, not related to no test(2210.25, DT[-c(1L,0L), nomatch=NULL], data.table(x=2:4)) test(2210.26, DT[-c(1L,0L), nomatch=0], data.table(x=2:4), warning="Please use nomatch=NULL") +# NA in i would segfault gforce, #1994 +DT = data.table(a=1L, b=2, c="a", grp=1L) +i = c(1L,NA,NA,NA) # 3 NA to trigger segfault in var (min 3 obs) otherwise just c(1L,NA) is enough to trigger the others +funs = c("sum","mean","var","sd","median","prod","min","max","`[`","first","last","head","tail") +EVAL = function(...) { + e = paste0(...) + # cat(e,"\n") # uncomment to check the queries tested + eval(parse(text=e)) +} +testnum = 2211.0 +for (col in c("a","b","c")) { + testnum = testnum+0.1 + for (fi in seq_along(funs)) { + if (col=="c" && fi<=6L) next # first 6 funs don't support type character + f = funs[fi] + testnum = testnum+0.001 + test(testnum, EVAL("DT[i, ",f,"(",col, if(fi>8L)", 1L","), by=grp]"), # segfault before when NA in i + EVAL("DT[i][, ",f,"(",col, if(fi>8L)", 1L","), by=grp]")) # ok before by taking DT[i] subset first + if (fi<=8L) { + testnum = testnum+0.001 + test(testnum, EVAL("DT[i, ",f,"(",col,", na.rm=TRUE), by=grp]"), + EVAL("DT[i][, ",f,"(",col,", na.rm=TRUE), by=grp]")) + } + } +} + diff --git a/src/gsumm.c b/src/gsumm.c index 0fe05d1299..f806b1e3c8 100644 --- a/src/gsumm.c +++ b/src/gsumm.c @@ -235,7 +235,7 @@ void *gather(SEXP x, bool *anyNA) } else { const int *my_x = irows + b*batchSize; for (int i=0; i1) + const bool nosubset = irowslen == -1; + const int n = nosubset ? length(x) : irowslen; SEXP ans; - if (nrow != n) error(_("nrow [%d] != length(x) [%d] in %s"), nrow, n, "gtail"); + if (nrow != n) error(_("nrow [%d] != length(x) [%d] in %s"), nrow, n, first?"gfirst":"glast"); + const bool gnth = w>1; // const bool to avoid fetching grpsize[i] when not needed switch(TYPEOF(x)) { case LGLSXP: { const int *ix = LOGICAL(x); ans = PROTECT(allocVector(LGLSXP, ngrp)); int *ians = LOGICAL(ans); for (int i=0; igrpsize[i]) { ians[i]=NA_LOGICAL; continue; } + int k = first ? ff[i]+w-2 : ff[i]+grpsize[i]-w-1; if (isunsorted) k = oo[k]-1; - k = (irowslen == -1) ? k : irows[k]-1; - ians[i] = ix[k]; + ians[i] = nosubset ? ix[k] : (irows[k]==NA_INTEGER ? NA_LOGICAL : ix[irows[k]-1]); } } break; @@ -921,10 +925,10 @@ SEXP glast(SEXP x) { ans = PROTECT(allocVector(INTSXP, ngrp)); int *ians = INTEGER(ans); for (int i=0; igrpsize[i]) { ians[i]=NA_INTEGER; continue; } + int k = first ? ff[i]+w-2 : ff[i]+grpsize[i]-w-1; if (isunsorted) k = oo[k]-1; - k = (irowslen == -1) ? k : irows[k]-1; - ians[i] = ix[k]; + ians[i] = nosubset ? ix[k] : (irows[k]==NA_INTEGER ? NA_INTEGER : ix[irows[k]-1]); } } break; @@ -933,10 +937,10 @@ SEXP glast(SEXP x) { ans = PROTECT(allocVector(REALSXP, ngrp)); double *dans = REAL(ans); for (int i=0; igrpsize[i]) { dans[i]=NA_REAL; continue; } + int k = first ? ff[i]+w-2 : ff[i]+grpsize[i]-w-1; if (isunsorted) k = oo[k]-1; - k = (irowslen == -1) ? k : irows[k]-1; - dans[i] = dx[k]; + dans[i] = nosubset ? dx[k] : (irows[k]==NA_INTEGER ? NA_REAL : dx[irows[k]-1]); } } break; @@ -945,348 +949,146 @@ SEXP glast(SEXP x) { ans = PROTECT(allocVector(CPLXSXP, ngrp)); Rcomplex *dans = COMPLEX(ans); for (int i=0; igrpsize[i]) { dans[i]=NA_CPLX; continue; } + int k = first ? ff[i]+w-2 : ff[i]+grpsize[i]-w-1; if (isunsorted) k = oo[k]-1; - k = (irowslen == -1) ? k : irows[k]-1; - dans[i] = dx[k]; + dans[i] = nosubset ? dx[k] : (irows[k]==NA_INTEGER ? NA_CPLX : dx[irows[k]-1]); } } break; - case STRSXP: + case STRSXP: { + const SEXP *sx = STRING_PTR(x); ans = PROTECT(allocVector(STRSXP, ngrp)); for (int i=0; igrpsize[i]) { SET_STRING_ELT(ans, i, NA_STRING); continue; } + int k = first ? ff[i]+w-2 : ff[i]+grpsize[i]-w-1; if (isunsorted) k = oo[k]-1; - k = (irowslen == -1) ? k : irows[k]-1; - SET_STRING_ELT(ans, i, STRING_ELT(x, k)); + SET_STRING_ELT(ans, i, nosubset ? sx[k] : (irows[k]==NA_INTEGER ? NA_STRING : sx[irows[k]-1])); } - break; - case VECSXP: + } break; + case VECSXP: { + const SEXP *vx = SEXPPTR_RO(x); ans = PROTECT(allocVector(VECSXP, ngrp)); for (int i=0; igrpsize[i]) { SET_VECTOR_ELT(ans, i, ScalarLogical(NA_LOGICAL)); continue; } + int k = first ? ff[i]+w-2 : ff[i]+grpsize[i]-w-1; if (isunsorted) k = oo[k]-1; - k = (irowslen == -1) ? k : irows[k]-1; - SET_VECTOR_ELT(ans, i, VECTOR_ELT(x, k)); + SET_VECTOR_ELT(ans, i, nosubset ? vx[k] : (irows[k]==NA_INTEGER ? ScalarLogical(NA_LOGICAL) : vx[irows[k]-1])); } - break; + } break; default: - error(_("Type '%s' not supported by GForce tail (gtail). Either add the prefix utils::tail(.) or turn off GForce optimization using options(datatable.optimize=1)"), type2char(TYPEOF(x))); + error(_("Type '%s' not supported by GForce head/tail/first/last/`[`. Either add the prefix utils::head(.) or turn off GForce optimization using options(datatable.optimize=1)"), type2char(TYPEOF(x))); } copyMostAttrib(x, ans); UNPROTECT(1); return(ans); } +SEXP glast(SEXP x) { + return gfirstlast(x, false, 1); +} + SEXP gfirst(SEXP x) { - const int n = (irowslen == -1) ? length(x) : irowslen; - SEXP ans; - if (nrow != n) error(_("nrow [%d] != length(x) [%d] in %s"), nrow, n, "ghead"); - switch(TYPEOF(x)) { - case LGLSXP: { - int const *ix = LOGICAL(x); - ans = PROTECT(allocVector(LGLSXP, ngrp)); - int *ians = LOGICAL(ans); - for (int i=0; i grpsize[i]) { LOGICAL(ans)[i] = NA_LOGICAL; continue; } - int k = ff[i]+val-2; - if (isunsorted) k = oo[k]-1; - k = (irowslen == -1) ? k : irows[k]-1; - ians[i] = ix[k]; - } - } - break; - case INTSXP: { - const int *ix = INTEGER(x); - ans = PROTECT(allocVector(INTSXP, ngrp)); - int *ians = INTEGER(ans); - for (int i=0; i grpsize[i]) { INTEGER(ans)[i] = NA_INTEGER; continue; } - int k = ff[i]+val-2; - if (isunsorted) k = oo[k]-1; - k = (irowslen == -1) ? k : irows[k]-1; - ians[i] = ix[k]; - } - } - break; - case REALSXP: { - const double *dx = REAL(x); - ans = PROTECT(allocVector(REALSXP, ngrp)); - double *dans = REAL(ans); - for (int i=0; i grpsize[i]) { REAL(ans)[i] = NA_REAL; continue; } - int k = ff[i]+val-2; - if (isunsorted) k = oo[k]-1; - k = (irowslen == -1) ? k : irows[k]-1; - dans[i] = dx[k]; - } - } - break; - case CPLXSXP: { - const Rcomplex *dx = COMPLEX(x); - ans = PROTECT(allocVector(CPLXSXP, ngrp)); - Rcomplex *dans = COMPLEX(ans); - for (int i=0; i grpsize[i]) { dans[i].r = NA_REAL; dans[i].i = NA_REAL; continue; } - int k = ff[i]+val-2; - if (isunsorted) k = oo[k]-1; - k = (irowslen == -1) ? k : irows[k]-1; - dans[i] = dx[k]; - } - } break; - case STRSXP: - ans = PROTECT(allocVector(STRSXP, ngrp)); - for (int i=0; i grpsize[i]) { SET_STRING_ELT(ans, i, NA_STRING); continue; } - int k = ff[i]+val-2; - if (isunsorted) k = oo[k]-1; - k = (irowslen == -1) ? k : irows[k]-1; - SET_STRING_ELT(ans, i, STRING_ELT(x, k)); - } - break; - case VECSXP: - ans = PROTECT(allocVector(VECSXP, ngrp)); - for (int i=0; i grpsize[i]) { SET_VECTOR_ELT(ans, i, R_NilValue); continue; } - int k = ff[i]+val-2; - if (isunsorted) k = oo[k]-1; - k = (irowslen == -1) ? k : irows[k]-1; - SET_VECTOR_ELT(ans, i, VECTOR_ELT(x, k)); - } - break; - default: - error(_("Type '%s' not supported by GForce subset `[` (gnthvalue). Either add the prefix utils::head(.) or turn off GForce optimization using options(datatable.optimize=1)"), type2char(TYPEOF(x))); - } - copyMostAttrib(x, ans); - UNPROTECT(1); - return(ans); + return gfirstlast(x, true, INTEGER(valArg)[0]); } // TODO: gwhich.min, gwhich.max // implemented this similar to gmedian to balance well between speed and memory usage. There's one extra allocation on maximum groups and that's it.. and that helps speed things up extremely since we don't have to collect x's values for each group for each step (mean, residuals, mean again and then variance). -SEXP gvarsd1(SEXP x, SEXP narm, Rboolean isSD) +static SEXP gvarsd1(SEXP x, SEXP narmArg, bool isSD) { - if (!isLogical(narm) || LENGTH(narm)!=1 || LOGICAL(narm)[0]==NA_LOGICAL) error(_("na.rm must be TRUE or FALSE")); + if (!isLogical(narmArg) || LENGTH(narmArg)!=1 || LOGICAL(narmArg)[0]==NA_LOGICAL) error(_("na.rm must be TRUE or FALSE")); if (!isVectorAtomic(x)) error(_("GForce var/sd can only be applied to columns, not .SD or similar. For the full covariance matrix of all items in a list such as .SD, either add the prefix stats::var(.SD) (or stats::sd(.SD)) or turn off GForce optimization using options(datatable.optimize=1). Alternatively, if you only need the diagonal elements, 'DT[,lapply(.SD,var),by=,.SDcols=]' is the optimized way to do this.")); if (inherits(x, "factor")) error(_("var/sd is not meaningful for factors.")); const int n = (irowslen == -1) ? length(x) : irowslen; if (nrow != n) error(_("nrow [%d] != length(x) [%d] in %s"), nrow, n, "gvar"); SEXP sub, ans = PROTECT(allocVector(REALSXP, ngrp)); + double *ansd = REAL(ans); + const bool nosubset = irowslen==-1; + const bool narm = LOGICAL(narmArg)[0]; switch(TYPEOF(x)) { - case LGLSXP: case INTSXP: + case LGLSXP: case INTSXP: { sub = PROTECT(allocVector(INTSXP, maxgrpn)); // allocate once upfront - if (!LOGICAL(narm)[0]) { - for (int i=0; i DBL_MAX) REAL(ans)[i] = R_PosInf; - else if (s[i] < -DBL_MAX) REAL(ans)[i] = R_NegInf; - else REAL(ans)[i] = (double)s[i]; - } + s[thisgrp] *= elem; // no under/overflow here, s is long double (like base) + }} break; - case REALSXP: + case REALSXP: { + const double *xd = REAL(x); for (int i=0; i DBL_MAX) REAL(ans)[i] = R_PosInf; - else if (s[i] < -DBL_MAX) REAL(ans)[i] = R_NegInf; - else REAL(ans)[i] = (double)s[i]; - } + const double elem = nosubset ? xd[i] : (irows[i]==NA_INTEGER ? NA_REAL : xd[irows[i]-1]); + if (ISNAN(elem)) { + if (!narm) s[thisgrp] = NA_REAL; + continue; + } + s[thisgrp] *= elem; + }} break; default: free(s); error(_("Type '%s' not supported by GForce prod (gprod). Either add the prefix base::prod(.) or turn off GForce optimization using options(datatable.optimize=1)"), type2char(TYPEOF(x))); } + for (int i=0; i DBL_MAX) ansd[i] = R_PosInf; + else if (s[i] < -DBL_MAX) ansd[i] = R_NegInf; + else ansd[i] = (double)s[i]; + } free(s); copyMostAttrib(x, ans); UNPROTECT(1); // Rprintf(_("this gprod took %8.3f\n"), 1.0*(clock()-start)/CLOCKS_PER_SEC); return(ans); } +