An in-place merge of the form dt1[dt2, x := y, on = .(col1, col2)] is useful when dt1 is very large. It also supports merging multiple columns from dt2 via `:=`(x1 = y1, x2 = y2). However, when I need to merge many columns from dt2 into dt1, it seems I can only list every column explicitly: the column names cannot be determined dynamically from a character vector, as they can with .SD. Otherwise I need to use meta-programming facilities to generate an expression and evaluate it.
A simple example follows. A practical use case is when dt1 and dt2 are very large and using merge would make a copy, which is very slow and may exceed the memory limit (which is exactly why in-place operations were introduced).
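library(data.table)

# d1: target table with id plus columns x1..x10
d1 <- data.table(id = 1:10)
for (i in 1:10) {
  d1[, paste0("x", i) := rnorm(.N)]
}

# d2: source table with id plus columns y1..y5
d2 <- data.table(id = 3:6)
for (i in 1:5) {
  d2[, paste0("y", i) := rnorm(.N)]
}

# In-place merge: every source column has to be listed by hand
d1[d2, paste0("z", 1:5) := list(y1, y2, y3, y4, y5), on = "id"]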
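For reference, the meta-programming route mentioned above could look roughly like the sketch below. It uses only base R (as.call, as.name, bquote, eval) to build the `:=` call from two character vectors. The names src, tgt and j_call are made up for this illustration, and the tables are smaller, fixed-value stand-ins for d1 and d2.

```r
library(data.table)

# Small stand-ins for d1 and d2 with fixed values
d1 <- data.table(id = 1:10)
d2 <- data.table(id = 3:6, y1 = c(0.1, 0.2, 0.3, 0.4), y2 = c(1, 2, 3, 4))

src <- c("y1", "y2")  # columns to read from d2
tgt <- c("z1", "z2")  # hypothetical names of the columns to create in d1

# Build the call `:=`(z1 = y1, z2 = y2) from the two character vectors
j_call <- as.call(c(list(as.name(":=")), setNames(lapply(src, as.name), tgt)))

# Splice the call into the join and evaluate it; d1 is updated by reference
eval(bquote(d1[d2, .(j_call), on = "id"]))
```

This works, but the intent is buried in expression-building code, which is what I would like to avoid.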
A similar problem is merging all columns of d2 into d1 in place without specifying the source and target column names.
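A workaround that may cover this case is to look up the i.-prefixed columns by name with mget() on the right-hand side. The sketch below assumes the new columns in d1 should keep the names they have in d2, and that mget() resolves the i. prefix inside j in the data.table version in use. The variable cols is introduced only for this illustration.

```r
library(data.table)

# Small stand-ins for d1 and d2 with fixed values
d1 <- data.table(id = 1:10, x1 = 10:1)
d2 <- data.table(id = 3:6, y1 = c(0.1, 0.2, 0.3, 0.4), y2 = c(1, 2, 3, 4))

# All columns of d2 except the join key
cols <- setdiff(names(d2), "id")

# Add them to d1 by reference; "i." refers to columns of the table in i (d2)
d1[d2, (cols) := mget(paste0("i.", cols)), on = "id"]
```

Rows of d1 with no match in d2 get NA in the new columns. It would still be nicer to have this supported directly rather than through mget() and string pasting.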