multi-column non-equi join produces invalid results #2360

@ethanbsmith

I'm trying to do a self non-equi join on a date column and a numeric column. The single-column versions of both non-equi joins produce the expected results, but when I put both columns in the on clause, I get some very funky output.

Sample script to reproduce the problem.

    library(data.table)

    d <- data.table(index = as.Date(c('2017-08-25', '2017-08-28', '2017-08-29', '2017-08-30', '2017-08-31', '2017-09-01', '2017-09-05', '2017-09-06', '2017-09-07', '2017-09-08', '2017-09-11', '2017-09-12', '2017-09-13')),
        High = c(52.85, 51.81, 51.86, 52.29, 52.59, 52.77, 51.9, 50.77, 50.69, 50.45, 50.67, 51.07, 51.105),
        FwdHi = c(51.81, 51.86, 52.29, 51.9, 50.77, 50.69, 50.45, 50.45, 50.45, 50.67, NA, NA, NA))
    setkey(d, index)
    # single-column non-equi self joins: both give the expected results
    d[d, .(FirstDate = min(x.index)), on = .(index > index), by = .EACHI]
    d[d, .(FirstDate = min(x.index)), on = .(FwdHi > High), by = .EACHI]
    # both conditions in the on clause: FirstDate comes back wrong
    d[d, .(FirstDate = min(x.index)), on = .(index > index, FwdHi > High), by = .EACHI]

The final line produces FirstDate = "1970-01-01".
[image]
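For comparison, here is a rough brute-force reference in base R for what the two-column join would be expected to return. It is only a sketch: it assumes `on = .(index > index, FwdHi > High)` means "rows of the outer table whose `index` is later than this row's `index` and whose `FwdHi` exceeds this row's `High`", and that rows with no match should give `NA`. The name `expected` is introduced here for illustration and is not part of the original script.

```r
# Same sample data as above, as a plain data.frame (no data.table needed).
d <- data.frame(
  index = as.Date(c('2017-08-25', '2017-08-28', '2017-08-29', '2017-08-30',
                    '2017-08-31', '2017-09-01', '2017-09-05', '2017-09-06',
                    '2017-09-07', '2017-09-08', '2017-09-11', '2017-09-12',
                    '2017-09-13')),
  High  = c(52.85, 51.81, 51.86, 52.29, 52.59, 52.77, 51.9, 50.77, 50.69,
            50.45, 50.67, 51.07, 51.105),
  FwdHi = c(51.81, 51.86, 52.29, 51.9, 50.77, 50.69, 50.45, 50.45, 50.45,
            50.67, NA, NA, NA)
)

# For each row i, find the earliest later date whose FwdHi exceeds row i's High.
# Assumption: no matching row means NA, not a sentinel date.
expected <- rep(as.Date(NA), nrow(d))
for (i in seq_len(nrow(d))) {
  # which() drops the NA comparisons caused by missing FwdHi values
  hit <- which(d$index > d$index[i] & d$FwdHi > d$High[i])
  if (length(hit) > 0) expected[i] <- min(d$index[hit])
}

data.frame(index = d$index, FirstDate = expected)
```

Under those assumptions, only the rows dated 2017-08-28 and 2017-08-29 get a `FirstDate` (2017-08-29 and 2017-08-30 respectively); every other row is `NA`, so a value of 1970-01-01 should not appear anywhere.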

Labels: bug, non-equi joins