diff --git a/NEWS.md b/NEWS.md index 104d0655c1..6226c17afa 100644 --- a/NEWS.md +++ b/NEWS.md @@ -254,7 +254,7 @@ 15. `rbind` and `rbindlist` of zero-row items now retain (again) the unused levels of any (zero-length) factor columns, [#3508](https://github.com/Rdatatable/data.table/issues/3508). This was a regression in v1.12.2 just for zero-row items. Unused factor levels were already retained for items having `nrow>=1`. Thanks to Gregory Demin for reporting. -16. `rbind` and `rbindlist` of an item containing an ordered factor with levels containing an `NA` (as opposed to an NA integer) could segfault, [#3601](https://github.com/Rdatatable/data.table/issues/3601). This was a a regression in v1.12.2. Thanks to Damian Betebenner for reporting. +16. `rbind` and `rbindlist` of an item containing an ordered factor with levels containing an `NA` (as opposed to an NA integer) could segfault, [#3601](https://github.com/Rdatatable/data.table/issues/3601). This was a a regression in v1.12.2. Thanks to Damian Betebenner for reporting. Also a related segfault when recycling a length-1 factor column, [#3662](https://github.com/Rdatatable/data.table/issues/3662). 17. `example(":=", local=TRUE)` now works rather than error, [#2972](https://github.com/Rdatatable/data.table/issues/2972). Thanks @vlulla for the report. diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw index 4553c6181a..97ab9ea49a 100644 --- a/inst/tests/tests.Rraw +++ b/inst/tests/tests.Rraw @@ -16124,6 +16124,15 @@ test(2112.13, fwrite(DT, scipen=999), output=c( "3.14159265358979" )) +# rbindlist segfault / malformed factor when recycling length-1 factor columns, #3662 +test(2113.1, rbindlist(list(data.table(a=as.factor(1:2), b=as.factor(2:3)), + list(a=as.factor(3L), b=as.factor(4:5)))), + data.table(a=as.factor(INT(1,2,3,3)), b=as.factor(2:5))) +test(2113.2, rbindlist(list(list(a=as.factor(1:2), b=factor(), c=as.factor(3), d=as.factor(4)), + list(a="3", b=as.factor(3L), c=c("4","5"), d=character()))), + data.table(a=as.factor(c(1,2,3,3)), b=as.factor(c(NA,NA,3L,3L)), c=as.factor(c(3L,3:5)), d=as.factor(c(4L,4L,NA,NA))), + warning="Column 2 ['b'] of item 1 is length 0. This (and 1 other like it) has been filled with NA") + ################################### # Add new tests above this line # diff --git a/src/rbindlist.c b/src/rbindlist.c index 78d0cc1099..b785543160 100644 --- a/src/rbindlist.c +++ b/src/rbindlist.c @@ -439,21 +439,34 @@ SEXP rbindlist(SEXP l, SEXP usenamesArg, SEXP fillArg, SEXP idcolArg) } int *targetd = INTEGER(target); if (isFactor(thisCol)) { - // loop through levels. If all i == truelength(i) then just do a memcpy. Otherwise hop via the integer map. - bool nohop = true; - for (int k=0; k