diff --git a/DESCRIPTION b/DESCRIPTION
index 3e878e84bd..00ec8c3bb6 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -60,7 +60,8 @@ Authors@R: c(
   person("Jens Peder","Meldgaard", role="ctb"),
   person("Vaclav","Tlapak", role="ctb"),
   person("Kevin","Ushey", role="ctb"),
-  person("Dirk","Eddelbuettel", role="ctb"))
+  person("Dirk","Eddelbuettel", role="ctb"),
+  person("Ben","Schwen", role="ctb"))
 Depends: R (>= 3.1.0)
 Imports: methods
 Suggests: bit64 (>= 4.0.0), bit (>= 4.0.4), curl, R.utils, xts, nanotime, zoo (>= 1.8-1), yaml, knitr, rmarkdown
diff --git a/NEWS.md b/NEWS.md
index df1b96436c..3c62f3c261 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -8,6 +8,8 @@
 
 1. If `fread()` discards a single line footer, the warning message which includes the discarded text now displays any non-ASCII characters correctly on Windows, [#4747](https://github.com/Rdatatable/data.table/issues/4747). Thanks to @shrektan for reporting and the PR.
 
+2. `fintersect()` now retains the order of the first argument as reasonably expected, rather than retaining the order of the second argument, [#4716](https://github.com/Rdatatable/data.table/issues/4716). Thanks to Michel Lang for reporting, and Ben Schwen for the PR.
+
 ## NOTES
 
 1. Compiling from source no longer requires `zlib` header files to be available, [#4844](https://github.com/Rdatatable/data.table/pull/4844). The output suggests installing `zlib` headers, and how (e.g. `zlib1g-dev` on Ubuntu) as before, but now proceeds with `gzip` compression disabled in `fwrite`. Upon calling `fwrite(DT, "file.csv.gz")` at runtime, an error message suggests to reinstall `data.table` with `zlib` headers available. This does not apply to users on Windows or Mac who install the pre-compiled binary package from CRAN.
diff --git a/R/setops.R b/R/setops.R
index 1ac949601b..b6dcd7b0b2 100644
--- a/R/setops.R
+++ b/R/setops.R
@@ -62,11 +62,12 @@ fintersect = function(x, y, all=FALSE) {
   if (all) {
     x = shallow(x)[, ".seqn" := rowidv(x)]
     y = shallow(y)[, ".seqn" := rowidv(y)]
-    jn.on = c(".seqn",setdiff(names(x),".seqn"))
-    x[y, .SD, .SDcols=setdiff(names(x),".seqn"), nomatch=NULL, on=jn.on]
+    jn.on = c(".seqn",setdiff(names(y),".seqn"))
+    # fixes #4716 by preserving order of 1st (uses y[x] join) argument instead of 2nd (uses x[y] join)
+    y[x, .SD, .SDcols=setdiff(names(y),".seqn"), nomatch=NULL, on=jn.on]
   } else {
-    z = funique(y)  # fixes #3034. When .. prefix in i= is implemented (TODO), this can be x[funique(..y), on=, multi=]
-    x[z, nomatch=NULL, on=names(x), mult="first"]
+    z = funique(x)  # fixes #3034. When .. prefix in i= is implemented (TODO), this can be x[funique(..y), on=, multi=]
+    y[z, nomatch=NULL, on=names(y), mult="first"]
   }
 }
 
diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw
index a6013a3a11..2b44b3038f 100644
--- a/inst/tests/tests.Rraw
+++ b/inst/tests/tests.Rraw
@@ -17258,3 +17258,6 @@ if (identical(x, enc2native(x))) {
   test(2162.2, fread(text = txt2, encoding = 'Latin-1'), out, warning="Discarded single-line footer: <>")
 }
 
+
+# fintersect now preserves order of first argument like intersect, #4716
+test(2163, fintersect(data.table(x=c("b", "c", "a")), data.table(x=c("a","c")))$x, c("c", "a"))