Skip to content

Should never recycle the zero-length vector#4262

Merged
mattdowle merged 22 commits intomasterfrom
fix3727
Apr 26, 2021
Merged

Should never recycle the zero-length vector#4262
mattdowle merged 22 commits intomasterfrom
fix3727

Conversation

@shrektan
Copy link
Copy Markdown
Member

Closes #3727

1. non atomic like list() is going to be recycled and not the "zero-length" we are referring to (we can regard list() is length 1 here)
2. NULL element is going to be discarded in later loop
@jangorecki

This comment has been minimized.

@codecov
Copy link
Copy Markdown

codecov bot commented Feb 27, 2020

Codecov Report

Merging #4262 (15514b0) into master (7fbb471) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #4262   +/-   ##
=======================================
  Coverage   99.42%   99.42%           
=======================================
  Files          73       73           
  Lines       14446    14447    +1     
=======================================
+ Hits        14363    14364    +1     
  Misses         83       83           
Impacted Files Coverage Δ
R/as.data.table.R 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7fbb471...15514b0. Read the comment docs.

@shrektan

This comment has been minimized.

@shrektan shrektan requested a review from mattdowle May 9, 2020 06:06
@shrektan
Copy link
Copy Markdown
Member Author

shrektan commented May 9, 2020

@mattdowle It would be great if this patch can get your approval first, as I'm developping an internal package where DT[J('a', x), ...] is used a lot while x could possible be a zero-length date... The current behavior forces me to handle the length-0 x case specially, literally everywhere... If you agree with this patch's behavior, I can ensure I'm good to depend on this patch, which will save my lots of efforts...

Thanks a lot.

@shrektan

This comment has been minimized.

@jangorecki
Copy link
Copy Markdown
Member

jangorecki commented May 9, 2020

1967.35 and 2139.1 are currently failing in this PR, it is quite likely that some revdeps might be failing as well, blocking release process. IF the change is going to be incorporated, it needs to be made in a less breaking manner. Maybe hold on with farther changes till we all agree on behavior.


I am having the very same problem in #4370. I need to recycle there as well, but my preference is to recycle 0 and 1 rows data.table to bigger one, quite opposite what you are trying to push here.
Good references for you can be data.frame function, and for me cbind, they both seems to be consistent on recycling, 0 rows throws error.

d1 = 1L
d2 = 1:2

data.frame(d2, d1)
#  d2 d1
#1  1  1
#2  2  1
cbind(d2, d1)
#     d2 d1
#[1,]  1  1
#[2,]  2  1

@shrektan
Copy link
Copy Markdown
Member Author

shrektan commented May 9, 2020

The problem is:

Is recyling the zero-length element meaningful, since there's no value at all? Reclying it with NA is not necessarily a reasonable one.

That's why data.frame, cbind throws error while tibble() gives the zero-row table.

In addition, as I've mentioned in #3727, recycling the zero-length element will trigger a warning of data.table. Someone may accidentally depend this behavior, but I doubt anyone would do it intentionally... Anyway, I will take an investigation on the failed tests.

@jangorecki
Copy link
Copy Markdown
Member

Agree, you have my vote for recycling to 0 rows. It is easy to escape that by DT[J('a', x[1L]), ...], the other way around is not that easy.

@jangorecki jangorecki added the WIP label May 18, 2020
@jangorecki

This comment has been minimized.

Comment thread R/as.data.table.R Outdated
@MichaelChirico

This comment has been minimized.

@shrektan

This comment has been minimized.

@shrektan shrektan removed the WIP label May 23, 2020
@MichaelChirico
Copy link
Copy Markdown
Member

LGTM

@jangorecki
Copy link
Copy Markdown
Member

This email seems slightly related: https://stat.ethz.ch/pipermail/r-devel/2020-May/079572.html

@shrektan shrektan changed the title Recycle the scalar when only scalar and zero-length vector exists Should never recycle the zero-length vector May 26, 2020
@shrektan shrektan added this to the 1.13.5 milestone Nov 29, 2020
Copy link
Copy Markdown
Member

@mattdowle mattdowle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your examples really helped, thanks.

@mattdowle mattdowle merged commit 18528fa into master Apr 26, 2021
@mattdowle mattdowle deleted the fix3727 branch April 26, 2021 23:20
@mattdowle
Copy link
Copy Markdown
Member

mattdowle commented Apr 29, 2021

I reran all revdeps to check the impact. After resolving unrelated issues, only one possible impact (RstoxData) which is comforting. Even that one might be something else in dev and not this change. I'll file a separate issue.

> status()
Installed data.table to be tested against: 1.14.1 2021-04-28 04:39:30
CRAN:
 ERROR   :    2 : RstoxData traitdataform 
 WARNING :    1 : fwildclusterboot 
 NOTE    :  350 
 OK      :  698 
 TOTAL   : 1051 / 1051

> cran()
tools::CRAN_check_results() returned 226,625 rows in 5.584s elapsed (1.249s cpu) 
R-release is used for revdep checking so comparing to CRAN results for R-release
            Package r-release-linux-x86_64 r-release-macos-x86_64 r-release-windows-ix86+x86_64
1:        RstoxData                     OK                   NOTE                            OK
2:    traitdataform                  ERROR                     OK                         ERROR
3: fwildclusterboot                   WARN                     OK                          WARN

Comment thread inst/tests/tests.Rraw
@jangorecki jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

data.table(double(0), 1) returns a zero-row table

5 participants