Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -254,7 +254,7 @@

15. `rbind` and `rbindlist` of zero-row items now retain (again) the unused levels of any (zero-length) factor columns, [#3508](https://github.com/Rdatatable/data.table/issues/3508). This was a regression in v1.12.2 just for zero-row items. Unused factor levels were already retained for items having `nrow>=1`. Thanks to Gregory Demin for reporting.

16. `rbind` and `rbindlist` of an item containing an ordered factor with levels containing an `NA` (as opposed to an NA integer) could segfault, [#3601](https://github.com/Rdatatable/data.table/issues/3601). This was a a regression in v1.12.2. Thanks to Damian Betebenner for reporting.
16. `rbind` and `rbindlist` of an item containing an ordered factor with levels containing an `NA` (as opposed to an NA integer) could segfault, [#3601](https://github.com/Rdatatable/data.table/issues/3601). This was a a regression in v1.12.2. Thanks to Damian Betebenner for reporting. Also a related segfault when recycling a length-1 factor column, [#3662](https://github.com/Rdatatable/data.table/issues/3662).

17. `example(":=", local=TRUE)` now works rather than error, [#2972](https://github.com/Rdatatable/data.table/issues/2972). Thanks @vlulla for the report.

Expand Down
9 changes: 9 additions & 0 deletions inst/tests/tests.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -16124,6 +16124,15 @@ test(2112.13, fwrite(DT, scipen=999), output=c(
"3.14159265358979"
))

# rbindlist segfault / malformed factor when recycling length-1 factor columns, #3662
test(2113.1, rbindlist(list(data.table(a=as.factor(1:2), b=as.factor(2:3)),
list(a=as.factor(3L), b=as.factor(4:5)))),
data.table(a=as.factor(INT(1,2,3,3)), b=as.factor(2:5)))
test(2113.2, rbindlist(list(list(a=as.factor(1:2), b=factor(), c=as.factor(3), d=as.factor(4)),
list(a="3", b=as.factor(3L), c=c("4","5"), d=character()))),
data.table(a=as.factor(c(1,2,3,3)), b=as.factor(c(NA,NA,3L,3L)), c=as.factor(c(3L,3:5)), d=as.factor(c(4L,4L,NA,NA))),
warning="Column 2 ['b'] of item 1 is length 0. This (and 1 other like it) has been filled with NA")


###################################
# Add new tests above this line #
Expand Down
39 changes: 26 additions & 13 deletions src/rbindlist.c
Original file line number Diff line number Diff line change
Expand Up @@ -439,21 +439,34 @@ SEXP rbindlist(SEXP l, SEXP usenamesArg, SEXP fillArg, SEXP idcolArg)
}
int *targetd = INTEGER(target);
if (isFactor(thisCol)) {
// loop through levels. If all i == truelength(i) then just do a memcpy. Otherwise hop via the integer map.
bool nohop = true;
for (int k=0; k<n; ++k) {
SEXP s = thisColStrD[k];
if (s!=NA_STRING && -TRUELENGTH(s)!=k+1) { nohop=false; break; }
}
if (nohop) memcpy(targetd+ansloc, INTEGER(thisCol), thisnrow*SIZEOF(thisCol));
else {
int *id = INTEGER(thisCol);
for (int r=0; r<thisnrow; r++)
targetd[ansloc+r] = id[r]==NA_INTEGER ? NA_INTEGER : -TRUELENGTH(thisColStrD[id[r]-1]);
const int *id = INTEGER(thisCol);
if (length(thisCol)<=1) {
// recycle length-1, or NA-fill length-0
const int val = (length(thisCol)==1 && id[0]!=NA_INTEGER) ? -TRUELENGTH(thisColStrD[id[0]-1]) : NA_INTEGER;
for (int r=0; r<thisnrow; ++r) targetd[ansloc+r] = val;
} else {
// length(thisCol)==thisnrow alreay checked before this truelength-clobber region
// If all i==truelength(i) then just do a memcpy since hop is identity. Otherwise hop via the integer map.
bool hop = false;
for (int k=0; k<n; ++k) {
SEXP s = thisColStrD[k];
if (s!=NA_STRING && -TRUELENGTH(s)!=k+1) { hop=true; break; }
}
if (hop) {
for (int r=0; r<thisnrow; ++r)
targetd[ansloc+r] = id[r]==NA_INTEGER ? NA_INTEGER : -TRUELENGTH(thisColStrD[id[r]-1]);
} else {
memcpy(targetd+ansloc, id, thisnrow*SIZEOF(thisCol));
}
}
} else {
SEXP *sd = STRING_PTR(thisColStr);
for (int r=0; r<thisnrow; r++) targetd[ansloc+r] = sd[r]==NA_STRING ? NA_INTEGER : -TRUELENGTH(sd[r]);
const SEXP *sd = STRING_PTR(thisColStr);
if (length(thisCol)<=1) {
const int val = (length(thisCol)==1 && sd[0]!=NA_STRING) ? -TRUELENGTH(sd[0]) : NA_INTEGER;
for (int r=0; r<thisnrow; ++r) targetd[ansloc+r] = val;
} else {
for (int r=0; r<thisnrow; ++r) targetd[ansloc+r] = sd[r]==NA_STRING ? NA_INTEGER : -TRUELENGTH(sd[r]);
}
}
}
ansloc += thisnrow;
Expand Down