Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -174,6 +174,8 @@

33. `fread(text="a,b,c")` (where input data contains no `\n` but `text=` has been used) now works instead of error `file not found: a,b,c`, [#4689](https://github.com/Rdatatable/data.table/issues/4689). Thanks to @trainormg for reporting, and @ben-schwen for the PR.

34. `na.omit(DT)` did not remove `NA` in `nanotime` columns, [#4744](https://github.com/Rdatatable/data.table/issues/4744). Thanks Jean-Mathieu Vermosen for reporting, and Michael Chirico for the PR.

## NOTES

1. New feature 29 in v1.12.4 (Oct 2019) introduced zero-copy coercion. Our thinking is that requiring you to get the type right in the case of `0` (type double) vs `0L` (type integer) is too inconvenient for you the user. So such coercions happen in `data.table` automatically without warning. Thanks to zero-copy coercion there is no speed penalty, even when calling `set()` many times in a loop, so there's no speed penalty to warn you about either. However, we believe that assigning a character value such as `"2"` into an integer column is more likely to be a user mistake that you would like to be warned about. The type difference (character vs integer) may be the only clue that you have selected the wrong column, or typed the wrong variable to be assigned to that column. For this reason we view character to numeric-like coercion differently and will warn about it. If it is correct, then the warning is intended to nudge you to wrap the RHS with `as.<type>()` so that it is clear to readers of your code that a coercion from character to that type is intended. For example :
Expand Down
5 changes: 5 additions & 0 deletions inst/tests/tests.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -17912,3 +17912,8 @@ test(2203.20, tstrsplit(w, "/", type.convert=list()), error="not support empty l
test(2204, as.data.table(mtcars, keep.rownames='model', key='model'),
setnames(setkey(as.data.table(mtcars, keep.rownames = TRUE), rn), 'rn', 'model'))

# na.omit works for nanotime, #4744
if (test_nanotime) {
DT = data.table(time=nanotime(c(1,NA,3)))
test(2205, na.omit(DT), DT[c(1,3)])
}
6 changes: 3 additions & 3 deletions src/assign.c
Original file line number Diff line number Diff line change
Expand Up @@ -710,8 +710,8 @@ const char *memrecycle(const SEXP target, const SEXP where, const int start, con
snprintf(targetDesc, 500, colnum==0 ? _("target vector") : _("column %d named '%s'"), colnum, colname);
int protecti=0;
const bool sourceIsFactor=isFactor(source), targetIsFactor=isFactor(target);
const bool sourceIsI64=isReal(source) && Rinherits(source, char_integer64);
const bool targetIsI64=isReal(target) && Rinherits(target, char_integer64);
const bool sourceIsI64=isReal(source) && INHERITS(source, char_integer64);
const bool targetIsI64=isReal(target) && INHERITS(target, char_integer64);
if (sourceIsFactor || targetIsFactor) {
if (!targetIsFactor) {
if (!isString(target) && !isNewList(target))
Expand Down Expand Up @@ -1116,7 +1116,7 @@ void writeNA(SEXP v, const int from, const int n, const bool listNA)
for (int i=from; i<=to; ++i) vd[i] = NA_INTEGER;
} break;
case REALSXP: {
if (Rinherits(v, char_integer64)) { // Rinherits covers nanotime too which inherits from integer64 via S4 extends
if (INHERITS(v, char_integer64)) {
int64_t *vd = (int64_t *)REAL(v);
for (int i=from; i<=to; ++i) vd[i] = NA_INTEGER64;
} else {
Expand Down
6 changes: 3 additions & 3 deletions src/between.c
Original file line number Diff line number Diff line change
Expand Up @@ -83,8 +83,8 @@ SEXP between(SEXP x, SEXP lower, SEXP upper, SEXP incbounds, SEXP NAboundsArg, S
} break;

case REALSXP:
if (Rinherits(x, char_integer64)) {
if (!Rinherits(lower, char_integer64) || !Rinherits(upper, char_integer64))
if (INHERITS(x, char_integer64)) {
if (!INHERITS(lower, char_integer64) || !INHERITS(upper, char_integer64))
error(_("x is integer64 but lower and/or upper are not.")); // e.g. between(int64, character, character)
const int64_t *lp = (int64_t *)REAL(lower);
const int64_t *up = (int64_t *)REAL(upper);
Expand All @@ -111,7 +111,7 @@ SEXP between(SEXP x, SEXP lower, SEXP upper, SEXP incbounds, SEXP NAboundsArg, S
}
if (verbose) Rprintf(_("between parallel processing of integer64 took %8.3fs\n"), omp_get_wtime()-tic);
} else {
if (Rinherits(lower, char_integer64) || Rinherits(upper, char_integer64))
if (INHERITS(lower, char_integer64) || INHERITS(upper, char_integer64))
error(_("x is not integer64 but lower and/or upper is integer64. Please align classes."));
const double *lp = REAL(lower);
const double *up = REAL(upper);
Expand Down
2 changes: 1 addition & 1 deletion src/coalesce.c
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@ SEXP coalesce(SEXP x, SEXP inplaceArg) {
}
} break;
case REALSXP: {
if (Rinherits(first, char_integer64)) { // Rinherits() is true for nanotime
if (INHERITS(first, char_integer64)) {
int64_t *xP=(int64_t *)REAL(first), finalVal=NA_INTEGER64;
int k=0;
for (int j=0; j<nval; ++j) {
Expand Down
1 change: 0 additions & 1 deletion src/data.table.h
Original file line number Diff line number Diff line change
Expand Up @@ -233,7 +233,6 @@ SEXP isReallyReal(SEXP x);
bool allNA(SEXP x, bool errorForBadType);
SEXP colnamesInt(SEXP x, SEXP cols, SEXP check_dups);
bool INHERITS(SEXP x, SEXP char_);
bool Rinherits(SEXP x, SEXP char_);
SEXP copyAsPlain(SEXP x);
void copySharedColumns(SEXP x);
SEXP lock(SEXP x);
Expand Down
4 changes: 2 additions & 2 deletions src/fifelse.c
Original file line number Diff line number Diff line change
Expand Up @@ -135,7 +135,7 @@ SEXP fifelseR(SEXP l, SEXP a, SEXP b, SEXP na) {
const double *restrict pa = na_a ? NULL : REAL(a);
const double *restrict pb = na_b ? NULL : REAL(b);
const double *restrict pna = na_n ? NULL : REAL(na);
const double na = Rinherits(a, char_integer64) ? NA_INT64_D : NA_REAL; // Rinherits() is true for nanotime
const double na = INHERITS(a, char_integer64) ? NA_INT64_D : NA_REAL;
#pragma omp parallel for num_threads(getDTthreads(len0, true))
for (int64_t i=0; i<len0; ++i) {
pans[i] = pl[i]==0 ?
Expand Down Expand Up @@ -327,7 +327,7 @@ SEXP fcaseR(SEXP na, SEXP rho, SEXP args) {
case REALSXP: {
const double *restrict pouts = REAL(outs);
double *restrict pans = REAL(ans);
const double na_double = Rinherits(outs, char_integer64) ? NA_INT64_D : NA_REAL;
const double na_double = INHERITS(outs, char_integer64) ? NA_INT64_D : NA_REAL;
const double pna = nonna ? REAL(na)[0] : na_double;
for (int64_t j=0; j<len2; ++j) {
const int64_t idx = imask ? j : p[j];
Expand Down
2 changes: 1 addition & 1 deletion src/nafill.c
Original file line number Diff line number Diff line change
Expand Up @@ -172,7 +172,7 @@ SEXP nafillR(SEXP obj, SEXP type, SEXP fill, SEXP nan_is_na_arg, SEXP inplace, S
bool hasFill = !isLogical(fill) || LOGICAL(fill)[0]!=NA_LOGICAL;
bool *isInt64 = (bool *)R_alloc(nx, sizeof(bool));
for (R_len_t i=0; i<nx; i++)
isInt64[i] = Rinherits(VECTOR_ELT(x, i), char_integer64);
isInt64[i] = INHERITS(VECTOR_ELT(x, i), char_integer64);
const void **fillp = (const void **)R_alloc(nx, sizeof(void*)); // fill is (or will be) a list of length nx of matching types, scalar values for each column, this pointer points to each of those columns data pointers
if (hasFill) {
if (nx!=length(fill) && length(fill)!=1)
Expand Down
2 changes: 1 addition & 1 deletion src/shift.c
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,7 @@ SEXP shift(SEXP obj, SEXP k, SEXP fill, SEXP type)

case REALSXP : {
SEXP thisfill;
if (Rinherits(elem, char_integer64)) {
if (INHERITS(elem, char_integer64)) {
thisfill = PROTECT(allocVector(REALSXP, 1));
unsigned long long *dthisfill = (unsigned long long *)REAL(thisfill);
if (INTEGER(fill)[0] == NA_INTEGER)
Expand Down
40 changes: 19 additions & 21 deletions src/utils.c
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ bool allNA(SEXP x, bool errorForBadType) {
return true;
}
case REALSXP:
if (Rinherits(x,char_integer64)) {
if (INHERITS(x, char_integer64)) {
const int64_t *xd = (int64_t *)REAL(x);
for (int i=0; i<n; ++i) if (xd[i]!=NA_INTEGER64) {
return false;
Expand Down Expand Up @@ -152,32 +152,30 @@ inline bool INHERITS(SEXP x, SEXP char_) {
// ii) no attrib writes must be possible in other threads.
SEXP klass;
if (isString(klass = getAttrib(x, R_ClassSymbol))) {
for (int i=0; i<LENGTH(klass); i++) {
for (int i=0; i<LENGTH(klass); ++i) {
if (STRING_ELT(klass, i) == char_) return true;
}
if (char_==char_integer64) {
// package:nanotime is S4 and inherits from integer64 via S3 extends; i.e. integer64 does not appear in its R_ClassSymbol
// R's C API inherits() does not cover S4 and returns FALSE for nanotime
// R's R-level inherits() calls objects.c:inherits2 which calls attrib.c:R_data_class2 and
// then attrib.c:S4_extends which itself calls R level methods:::.extendsForS3 which then calls R level methods::extends.
// Since that chain of calls is so complicated and involves evaluating R level (not thread-safe) we
// special case nanotime here. We used to have Rinherits() as well which did call R level but couldn't be called from
// parallel regions. That became too hard to reason about two functions, #4752.
// If any other classes come to light that, like nanotime, S4 inherit from integer64, we can i) encourage them to change
// to regular S3, or ii) state we simply don't support that; i.e. nanotime was an exception, or iii) add a function that
// gets called on C entry points which loops through columns and if any are S4 calls the old Rinherits() to see if they S4
// inherit from integer64, and if so add that class to a vector that gets looped through here. That way we isolate the
// non-TS call into argument massage header code, and we can continue to use INHERITS() throughout the code base.
for (int i=0; i<LENGTH(klass); ++i) {
if (STRING_ELT(klass, i) == char_nanotime) return true;
}
}
}
return false;
}

bool Rinherits(SEXP x, SEXP char_) {
// motivation was nanotime which is S4 and inherits from integer64 via S3 extends
// R's C API inherits() does not cover S4 and returns FALSE for nanotime, as does our own INHERITS above.
// R's R-level inherits() calls objects.c:inherits2 which calls attrib.c:R_data_class2 and
// then attrib.c:S4_extends which itself calls R level methods:::.extendsForS3 which then calls R level methods::extends.
// Since that chain of calls is so complicated and involves evaluating R level anyway, let's just reuse it.
// Rinherits prefix with 'R' to signify i) it may call R level and is therefore not thread safe, and ii) includes R level inherits which covers S4.
bool ans = INHERITS(x, char_); // try standard S3 class character vector first
if (!ans && char_==char_integer64) // save the eval() for known S4 classes that inherit from integer64
ans = INHERITS(x, char_nanotime); // comment this out to test the eval() works for nanotime
if (!ans && IS_S4_OBJECT(x)) { // if it's not S4 we can save the overhead of R eval()
SEXP vec = PROTECT(ScalarString(char_)); // TODO: cover this branch by making two new test S4 classes: one that
SEXP call = PROTECT(lang3(sym_inherits, x, vec)); // does inherit from integer64 and one that doesn't
ans = LOGICAL(eval(call, R_GlobalEnv))[0]==1;
UNPROTECT(2);
}
return ans;
}

SEXP copyAsPlain(SEXP x) {
// v1.12.2 and before used standard R duplicate() to do this. But duplicate() is not guaranteed to not return an ALTREP.
// e.g. ALTREP 'wrapper' on factor column (with materialized INTSXP) in package VIM under example(hotdeck)
Expand Down