Skip to content

uniqueN(by=character()) returns nrow, as does by=NULL#4595

Merged
mattdowle merged 9 commits intomasterfrom
uniqueN-empty-b
Jun 17, 2021
Merged

uniqueN(by=character()) returns nrow, as does by=NULL#4595
mattdowle merged 9 commits intomasterfrom
uniqueN-empty-b

Conversation

@MichaelChirico
Copy link
Copy Markdown
Member

Closes #4594

Not 100% sure this is the 'perfect' fix, but I think consistency with by=NULL makes sense, at least.

@codecov
Copy link
Copy Markdown

codecov bot commented Jul 10, 2020

Codecov Report

Merging #4595 (9faf6f2) into master (ebc5bc3) will increase coverage by 0.17%.
The diff coverage is 100.00%.

❗ Current head 9faf6f2 differs from pull request most recent head e424e62. Consider uploading reports for the commit e424e62 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4595      +/-   ##
==========================================
+ Coverage   99.43%   99.61%   +0.17%     
==========================================
  Files          73       73              
  Lines       14460    14120     -340     
==========================================
- Hits        14379    14065     -314     
+ Misses         81       55      -26     
Impacted Files Coverage Δ
R/duplicated.R 100.00% <100.00%> (ø)
src/fmelt.c 99.00% <0.00%> (-0.42%) ⬇️
src/ijoin.c 95.29% <0.00%> (-0.16%) ⬇️
src/fsort.c 95.83% <0.00%> (-0.10%) ⬇️
R/like.R 100.00% <0.00%> (ø)
src/cj.c 100.00% <0.00%> (ø)
R/fcast.R 100.00% <0.00%> (ø)
R/frank.R 100.00% <0.00%> (ø)
R/fread.R 100.00% <0.00%> (ø)
R/merge.R 100.00% <0.00%> (ø)
... and 39 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ebc5bc3...e424e62. Read the comment docs.

@ColeMiller1
Copy link
Copy Markdown
Contributor

ColeMiller1 commented Jul 10, 2020

Should unique.data.table() behave similar as well as any other duplicated type of operations by taking by = character(0L) as NULL? If so, uniqueN() and unique() need work as well. I would lean towards resolving everything in the same PR for consistency...

library(data.table)
DT <- data.table(idx=1:4, value="val")

uniqueN(DT, by = character(0))
#> Error in forderv(x, by = by, retGrp = TRUE, na.last = if (!na.rm) FALSE else NA): Internal error: DT has 2 columns but 'by' is either not integer or is length 0
uniqueN(DT, by = NULL)
#> [1] 4

unique(DT, by = character(0L))
#> Error in forderv(x, by = by, sort = FALSE, retGrp = TRUE): Internal error: DT has 2 columns but 'by' is either not integer or is length 0
unique(DT, by = NULL)
#>    idx value
#> 1:   1   val
#> 2:   2   val
#> 3:   3   val
#> 4:   4   val

duplicated(DT, by = character(0L))
#> logical(0)
duplicated(DT, by = NULL)
#> [1] FALSE FALSE FALSE FALSE

anyDuplicated(DT, by = character(0L))
#> [1] 0
anyDuplicated(DT, by = NULL)
#> [1] 0

Comment thread inst/tests/tests.Rraw
error = 'specify non existing column*.*y')
test(1962.0061, duplicated(data.table(NULL)), logical(0L))
test(1962.0062, duplicated(data.table(a = 1L), by = character()), logical())
test(1962.006, duplicated(data.table(NULL)), logical(0L))
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this duplicated(., by=character()) was tested against for coverage before, we might count that as a breaking change?

cc @ColeMiller1

if so, maybe want to sneak it into the upcoming release as "potentially breaking"

@mattdowle mattdowle added this to the 1.14.1 milestone Jun 17, 2021
@mattdowle
Copy link
Copy Markdown
Member

We seem to have lost Travis CI, and therefore coverage.
I saw the news about Travis ... have we hit a quota or something?
@jangorecki any magic you can work?

@mattdowle mattdowle merged commit 80bb6b3 into master Jun 17, 2021
@mattdowle mattdowle deleted the uniqueN-empty-b branch June 17, 2021 07:29
@jangorecki jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

uniqueN() fails for zero-length vectors in by

4 participants