Skip to content

Sets clearer warning for keyby in ad-hoc columns#3685

Merged
mattdowle merged 6 commits intoRdatatable:masterfrom
eliocamp:fix-warning
Jul 16, 2019
Merged

Sets clearer warning for keyby in ad-hoc columns#3685
mattdowle merged 6 commits intoRdatatable:masterfrom
eliocamp:fix-warning

Conversation

@eliocamp
Copy link
Copy Markdown
Contributor

@eliocamp eliocamp commented Jul 7, 2019

Hi!
I wanted to contribute with some easy fixes to start learning about the codebase.

This fixes #2763.

I changed the appropriate test. I tested locally and got some errors unrelated to my change (since they were related to dates, I think they might have to do with my locale?).

@MichaelChirico
Copy link
Copy Markdown
Member

Thanks for this! I'm afraid I misled in the issue about what is going on. Here's the relevant code block:

if (keyby) {
  cnames = as.character(bysubl)[-1L]
  cnames = gsub('^`|`$', '', cnames)  # the wrapping backticks that were added above can be removed now, #3378
  if (all(cnames %chin% names(x))) {
    if (verbose) {last.started.at=proc.time();cat("setkey() after the := with keyby= ... ");flush.console()}
    setkeyv(x,cnames)  # TO DO: setkey before grouping to get memcpy benefit.
    if (verbose) {cat(timetaken(last.started.at),"\n"); flush.console()}
  }
  else warning(":= keyby not straightforward character column names or list() of column names, treating as a by:",paste(cnames,collapse=","),"\n")
}
return(suppPrint(x))

So normal behavior is: columns from by are %chin% names(x), so we can simply run setkeyv. If not, there's no way to setkey to columns that don't exist.

Hence the else branch simply throws this warning and then return()s suppPrint(x). So this part isn't true (sorry!) "Skipping to setkey() following computation in j" since no setkey is done.

@MichaelChirico
Copy link
Copy Markdown
Member

Don't forget to add an entry to NEWS :)

@MichaelChirico
Copy link
Copy Markdown
Member

I'm wondering now whether this actually rises to the level of being a warning or is something that can just be reported in verbose? @jangorecki thoughts?

@codecov
Copy link
Copy Markdown

codecov bot commented Jul 7, 2019

Codecov Report

Merging #3685 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #3685   +/-   ##
=======================================
  Coverage   98.25%   98.25%           
=======================================
  Files          69       69           
  Lines       13088    13088           
=======================================
  Hits        12860    12860           
  Misses        228      228
Impacted Files Coverage Δ
R/data.table.R 97.87% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dbb0d0b...cefbd53. Read the comment docs.

@codecov
Copy link
Copy Markdown

codecov bot commented Jul 7, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@00cbb11). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #3685   +/-   ##
=========================================
  Coverage          ?   98.25%           
=========================================
  Files             ?       69           
  Lines             ?    13085           
  Branches          ?        0           
=========================================
  Hits              ?    12857           
  Misses            ?      228           
  Partials          ?        0
Impacted Files Coverage Δ
R/data.table.R 97.87% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00cbb11...e27cc21. Read the comment docs.

@eliocamp
Copy link
Copy Markdown
Contributor Author

eliocamp commented Jul 7, 2019

I deleted the second sentence, then!

I'm the end result is not a keyed data.table, so I think it's important to at least throw a warning to alert the user that the result is not as expected.

Comment thread R/data.table.R Outdated
Copy link
Copy Markdown
Member

@MichaelChirico MichaelChirico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! thanks!

@mattdowle
Copy link
Copy Markdown
Member

The new warning is keyby can only be used with existing columns, treating as by=.

But that's not true:

> DT = data.table(a=1:2, b=1:10)
> DT
        a     b
    <int> <int>
 1:     1     1
 2:     2     2
 3:     1     3
 4:     2     4
 5:     1     5
 6:     2     6
 7:     1     7
 8:     2     8
 9:     1     9
10:     2    10
> DT[,sum(a),by=b%%2L]
       b    V1
   <int> <int>
1:     1     5
2:     0    10
> DT[,sum(a),keyby=b%%2L]
       b    V1
   <int> <int>
1:     0    10
2:     1     5
> key(.Last.value)
[1] "b"
> 

It's keyby= in combination with := that this warning refers to. The old warning message referred to := but the new one doesn't.

@mattdowle mattdowle added this to the 1.12.4 milestone Jul 16, 2019
@mattdowle mattdowle merged commit ff7686c into Rdatatable:master Jul 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Confusing warning message

3 participants