This works:
library(data.table)
dt1 = data.table(v1 = c("a", "b"), v2 = 1:2, v3 = 3:4)
dt2 = data.table(v1 = c("b", "c"), v4 = 5:6)
dt = merge(dt1, dt2)
dt
# Key: <v1>
#        v1    v2    v3    v4
#    <char> <int> <int> <int>
# 1:      b     2     4     5
But if I add a column v2 to dt2, so that v2 exists in both tables with non-overlapping values, merge() joins on every shared column (v1 and v2) and returns an empty data.table without error, which probably makes sense:
dt2 = data.table(v1 = c("b", "c"), v2 = 5:6, v4 = 5:6)
dt = merge(dt1, dt2)
dt
# Empty data.table (0 rows and 4 cols): v1,v2,v3,v4
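For reference, a minimal sketch of why the result is empty and how to restrict the join, assuming the intent is to join on v1 only. With two unkeyed tables, merge() falls back to the column names the tables share, so v2 silently becomes a join column. Passing by = "v1" keeps v2 out of the join, and the two v2 columns then come back with the default suffixes.

```r
library(data.table)

dt1 = data.table(v1 = c("a", "b"), v2 = 1:2, v3 = 3:4)
dt2 = data.table(v1 = c("b", "c"), v2 = 5:6, v4 = 5:6)

# Columns shared by both tables; with unkeyed inputs these are the default join columns
shared = intersect(names(dt1), names(dt2))
shared

# Join on v1 only; the clashing v2 columns are disambiguated with suffixes
res = merge(dt1, dt2, by = "v1")
res
```

Here shared contains both v1 and v2, and res has a single row for v1 == "b".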
The problem is that the same empty result is returned when "on" is supplied instead of "by" (an easy mistake to make, given that dt1[dt2, on = "v1"] is an alternative way to join): the unknown argument is ignored and only a warning is issued. An error would be desirable.
dt = merge(dt1, dt2, on = "v1")
# Warning message:
# In merge.data.table(dt1, dt2, on = "v1") :
#   Unknown argument 'on' has been passed.
dt
# Empty data.table (0 rows and 4 cols): v1,v2,v3,v4
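Until merge() errors on its own, one possible workaround is to promote the warning to an error at the call site. The sketch below is only an illustration: strict_merge is a hypothetical helper, not part of data.table, and it converts any warning raised by merge(), not just the one about unknown arguments.

```r
library(data.table)

dt1 = data.table(v1 = c("a", "b"), v2 = 1:2, v3 = 3:4)
dt2 = data.table(v1 = c("b", "c"), v2 = 5:6, v4 = 5:6)

# Hypothetical helper (not part of data.table): stop on any warning from merge()
strict_merge = function(...) {
  tryCatch(
    merge(...),
    warning = function(w) stop(conditionMessage(w), call. = FALSE)
  )
}

# The mistyped argument now fails instead of returning an empty table
bad = try(strict_merge(dt1, dt2, on = "v1"), silent = TRUE)
inherits(bad, "try-error")

# The correctly spelled argument still works
ok = strict_merge(dt1, dt2, by = "v1")
```

The trade-off is that harmless warnings would also abort the merge, so this is a stopgap rather than a substitute for an error raised inside merge() itself.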
Session info:
> sessionInfo()
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)
Matrix products: default
locale:
[1] LC_COLLATE=French_France.utf8 LC_CTYPE=French_France.utf8
[3] LC_MONETARY=French_France.utf8 LC_NUMERIC=C
[5] LC_TIME=French_France.utf8
time zone: Europe/Paris
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.16.0
loaded via a namespace (and not attached):
[1] compiler_4.4.1 cli_3.6.3 tools_4.4.1 rstudioapi_0.16.0