In data.table versions before 1.12.2, factor levels were preserved by rbind even when the data.table had zero rows. In 1.12.2 the levels are dropped. See the example:
# Minimal reproducible example
library(data.table)
DT = data.table(f = factor(c("a", "b", "c")))
# all ok, rbind works as expected
str(rbind(DT, DT))
# Classes ‘data.table’ and 'data.frame': 6 obs. of 1 variable:
# $ f: Factor w/ 3 levels "a","b","c": 1 2 3 1 2 3
# - attr(*, ".internal.selfref")=<externalptr>
# zero-row data.table
zeroDT = DT[FALSE,]
str(zeroDT)
# Classes ‘data.table’ and 'data.frame': 0 obs. of 1 variable:
# $ f: Factor w/ 3 levels "a","b","c":
# - attr(*, ".internal.selfref")=<externalptr>
# unexpected: 0 factor levels after rbind
str(rbind(zeroDT, zeroDT))
# Classes ‘data.table’ and 'data.frame': 0 obs. of 1 variable:
# $ f: Factor w/ 0 levels:
# - attr(*, ".internal.selfref")=<externalptr>
str(rbindlist(list(zeroDT, zeroDT)))
# Classes ‘data.table’ and 'data.frame': 0 obs. of 1 variable:
# $ f: Factor w/ 0 levels:
# - attr(*, ".internal.selfref")=<externalptr>
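A possible workaround until this is resolved is to re-apply the original levels after binding. This is only a sketch, not a fix for the underlying behaviour; the names `lv` and `res` are mine and are not part of data.table.

```r
library(data.table)

DT = data.table(f = factor(c("a", "b", "c")))
zeroDT = DT[FALSE,]

# remember the levels before binding (lv is an illustrative name)
lv = levels(zeroDT$f)

res = rbind(zeroDT, zeroDT)

# restore the levels explicitly; this is harmless if rbind already kept them
res[, f := factor(f, levels = lv)]

str(res)
```

Because `factor()` is called with an explicit `levels` argument, the result should have the same three levels whether or not rbind dropped them.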
# Output of sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 19.1
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
locale:
[1] LC_CTYPE=ru_RU.UTF-8 LC_NUMERIC=C LC_TIME=ru_RU.UTF-8 LC_COLLATE=ru_RU.UTF-8
[5] LC_MONETARY=ru_RU.UTF-8 LC_MESSAGES=ru_RU.UTF-8 LC_PAPER=ru_RU.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=ru_RU.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.12.2
loaded via a namespace (and not attached):
[1] compiler_3.5.3 tools_3.5.3 packrat_0.5.0