Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -133,6 +133,8 @@

19. Type `complex` is now supported by `setkey`, `setorder`, `:=`, `by=`, `keyby=`, `shift`, `dcast`, `frank`, `rowid`, `rleid`, `CJ`, `coalesce`, `unique`, and `uniqueN`, [#3690](https://github.com/Rdatatable/data.table/issues/3690). Thanks to Gareth Ward and Elio Campitelli for their reports and input. Sorting `complex` is achieved the same way as base R; i.e., first by the real part then by the imaginary part (as if the `complex` column were two separate columns of `double`). There is no plan to support joining/merging on `complex` columns until a user demonstrates a need for that.

20. `setkey`, `[key]by=` and `on=` in verbose mode (`options(datatable.verbose=TRUE)`) now detect any columns inheriting from `Date` which are stored as 8 byte double, test if any fractions are present, and if not suggest using a 4 byte integer instead (such as `data.table::IDate`) to save space and time, [#1738](https://github.com/Rdatatable/data.table/issues/1738). In future this could be upgraded to `message` or `warning` depending on feedback.

#### BUG FIXES

1. `first`, `last`, `head` and `tail` by group no longer error in some cases, [#2030](https://github.com/Rdatatable/data.table/issues/2030) [#3462](https://github.com/Rdatatable/data.table/issues/3462). Thanks to @franknarf1 for reporting.
Expand Down
6 changes: 6 additions & 0 deletions inst/tests/tests.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -15398,6 +15398,12 @@ test(2069.32, DT1[DT2, .(y = sum(b, na.rm=TRUE)), by=.EACHI, on=c(a = 'a', b="b"
DT = data.table(z = 1i)
test(2069.33, DT[DT, on = 'z'], error = "Type 'complex' not supported for joining/merging")

# forder verbose message when !isReallyReal Date, #1738
DT = data.table(d=sample(seq(as.Date("2015-01-01"), as.Date("2015-01-05"), by="days"), 20, replace=TRUE))
test(2070.01, typeof(DT$d), "double")
test(2070.02, DT[, .N, keyby=d, verbose=TRUE], output="Column 1.*date.*8 byte double.*no fractions are present.*4 byte integer.*to save space and time")


###################################
# Add new tests above this line #
###################################
Expand Down
12 changes: 10 additions & 2 deletions src/forder.c
Original file line number Diff line number Diff line change
Expand Up @@ -424,6 +424,7 @@ SEXP forder(SEXP DT, SEXP by, SEXP retGrpArg, SEXP sortGroupsArg, SEXP ascArg, S
#endif

int n_protect = 0;
const bool verbose = GetVerbose();

if (!isNewList(DT)) {
if (!isVectorAtomic(DT)) error("Input is not either a list of columns, or an atomic vector.");
Expand Down Expand Up @@ -484,7 +485,7 @@ SEXP forder(SEXP DT, SEXP by, SEXP retGrpArg, SEXP sortGroupsArg, SEXP ascArg, S
int spare=0; // the amount of bits remaining on the right of the current nradix byte
bool isReal=false;
bool complexRerun = false; // see comments below in CPLXSXP case
SEXP CplxPart = R_NilValue;
SEXP CplxPart = R_NilValue;
if (n_cplx) { CplxPart=PROTECT(allocVector(REALSXP, nrow)); n_protect++; } // one alloc is reused for each part
TEND(2);
for (int col=0; col<ncol; col++) {
Expand Down Expand Up @@ -513,9 +514,16 @@ SEXP forder(SEXP DT, SEXP by, SEXP retGrpArg, SEXP sortGroupsArg, SEXP ascArg, S
x = CplxPart;
} // !no break! so as to fall through to REAL case
case REALSXP :
if (inherits(x, "integer64")) {
if (INHERITS(x, char_integer64)) {
range_i64((int64_t *)REAL(x), nrow, &min, &max, &na_count);
} else {
if (verbose && INHERITS(x, char_Date) && INTEGER(isReallyReal(x))[0]==0) {
Rprintf("\n*** Column %d passed to forder is a date stored as an 8 byte double but no fractions are present. Please consider a 4 byte integer date such as IDate to save space and time.\n", col+1);
// Note the (slightly expensive) isReallyReal will only run when verbose is true. Prefix '***' just to make it stand out in verbose output
// In future this could be upgraded to option warning. But I figured that's what we use verbose to do (to trace problems and look for efficiencies).
// If an automatic coerce is desired (see discussion in #1738) then this is the point to do that in this file. Move the INTSXP case above to be
// next, do the coerce of Date to integer now to a tmp, and then let this case fall through to INTSXP in the same way as CPLXSXP falls through to REALSXP.
}
range_d(REAL(x), nrow, &min, &max, &na_count, &infnan_count);
if (min==0 && na_count<nrow) { min=3; max=4; } // column contains no finite numbers and is not-all NA; create dummies to yield positive min-2 later
isReal = true;
Expand Down