Using 'on' style join with two different column names #2383

@peterlittlejohn

Dear data.table team,

I came across a small issue with the use of 'on' in table joins, or perhaps I am misunderstanding the documentation (quoted below):

```
R version 3.3.2 (2016-10-31) -- "Sincere Pumpkin Patch"
Platform: x86_64-redhat-linux-gnu (64-bit)
> library(data.table)
data.table 1.10.4

> d1 <- data.table(A=c(1,2,4),B=c("a","b","d"))
> d2 <- data.table(A=c(1,1,4))
> d1[d2,on="A"]
   A B
1: 1 a
2: 1 a
3: 4 d
```

This works as expected. But when I use a different column name in d2, the join fails:

```
> d1 <- data.table(A=c(1,2,4),B=c("a","b","d"))
> d2 <- data.table(W=c(1,1,4))
> d1[d2,on=c(x="A",y="W")]
Error in `[.data.table`(d1, d2, on = c(x = "A", y = "W")) :
  Column(s) [x,y] not found in x
```
Am I misunderstanding the documentation (?data.table), which states (in one bullet point on the use of "on"):

  • As a named character vector, e.g., X[Y, on=c(x="a", y="b")]. This is useful when column names to join by are different between the two tables.

Given my understanding of this sentence in the documentation, the code above should produce the same output as the first example.
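
For comparison, here is a minimal sketch of the other possible reading of that bullet. It assumes that in the named-vector form each name is a column of the outer table (d1 here) and each value is the matching column of the table passed in (d2 here), so that `x` and `y` in the documentation example are column names in X, not keywords. The error message above, which looks for columns `x` and `y` in d1, is consistent with this reading.

```r
library(data.table)

d1 <- data.table(A = c(1, 2, 4), B = c("a", "b", "d"))
d2 <- data.table(W = c(1, 1, 4))

# Assumption: the name ("A") refers to a column of d1 and the
# value ("W") refers to the column of d2 it is matched against.
res <- d1[d2, on = c(A = "W")]
print(res)
```

Under that assumption, the join column in the result keeps the name from d1 (`A`), and the rows match those of the first example.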
