
Fixed .shallow to consistently retain keys and indices.#2337

Merged
mattdowle merged 10 commits intoRdatatable:masterfrom
MarkusBonsch:shallow_fix
Sep 11, 2017

@MarkusBonsch
Contributor

Closes #2336.

With `retain.key = TRUE` and `cols != NULL`, `.shallow` now retains keys and indices correctly.
The largest possible part of an index is retained.
If the original index was on `x1, x2, x3` and `cols = c("x1", "x2")`,
the shallow copy gets the same index for `c("x1", "x2")` (if there was already a native index on `x1` and `x2`, that one is kept).
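The retention rule can be sketched in base R. This is only an illustration of the rule as described above, not the code in `.shallow`: the helper name `retained_prefix` is hypothetical, and it assumes that "the largest possible part" means the longest run of leading index columns that are all present in `cols`.

```r
# Hypothetical helper (not part of data.table): given the columns of an
# index (or key) in order, return the leading part that survives when only
# `cols` are kept in the shallow copy.
retained_prefix <- function(index_cols, cols) {
  # positions of index columns that are NOT among the selected columns
  missing_pos <- which(!index_cols %in% cols)
  # every index column is selected: the index survives unchanged
  if (length(missing_pos) == 0L) return(index_cols)
  # otherwise keep everything before the first missing column
  index_cols[seq_len(missing_pos[1L] - 1L)]
}

retained_prefix(c("x1", "x2", "x3"), c("x1", "x2"))  # "x1" "x2"
retained_prefix(c("x1", "x2", "x3"), c("x1", "x3"))  # "x1"
retained_prefix(c("x1", "x2", "x3"), c("x2", "x3"))  # character(0)
```

Under this assumption, only a leading prefix can be kept, because an ordering by `x1, x2, x3` implies an ordering by `x1, x2` but says nothing about the order of `x2, x3` alone.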

Tests have been added.

A benchmark (code at end of this post) shows that there is no negative speed impact:

| master | PR |
| --- | --- |
| 33 ms | 33 ms |

The improved key retention makes it possible to clean up the `setattr` calls in foverlaps.R while still passing all tests.

Code for benchmark

library(data.table)
library(microbenchmark)

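nrow <- 1e6      # rows in the test table
ncol <- 20       # total number of columns
nindex <- 20     # nested indices are added while col < nindex
ncopycol <- 10   # number of columns selected in the shallow copy

DT <- data.table(x1 = rnorm(nrow))
setindex(DT, x1)
index <- c("x1")
# add columns x2, x3, ... and index each growing prefix (x1, ..., xk)
for(col in seq_len(ncol-1)){
  DT[, paste0("x", col + 1) := rnorm(nrow)]
  if(col < nindex){
    index <- c(index, paste0("x", col+1))
    setindexv(DT, index)
  }
}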

test <- microbenchmark(shallow = data.table:::.shallow(DT, 
                                                       cols = paste0("x", seq_len(ncopycol)), 
                                                       retain.key = TRUE), 
                       times = 100, 
                       unit = "ms")

@MarkusBonsch
Contributor Author

A foverlaps test fails on Travis that does not fail locally. I will need to investigate.

@MarkusBonsch MarkusBonsch reopened this Sep 8, 2017
@MarkusBonsch MarkusBonsch reopened this Sep 9, 2017
@codecov-io

codecov-io commented Sep 9, 2017

Codecov Report

Merging #2337 into master will increase coverage by 0.05%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master    #2337      +/-   ##
==========================================
+ Coverage   91.09%   91.14%   +0.05%     
==========================================
  Files          61       61              
  Lines       11785    11804      +19     
==========================================
+ Hits        10735    10759      +24     
+ Misses       1050     1045       -5
| Impacted Files | Coverage Δ |
| --- | --- |
| R/foverlaps.R | 94.3% <100%> (-0.11%) ⬇️ |
| R/data.table.R | 97.45% <100%> (+0.02%) ⬆️ |
| src/rbindlist.c | 88.93% <0%> (+0.19%) ⬆️ |
| src/forder.c | 94.47% <0%> (+0.52%) ⬆️ |


Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3c1b6d0...52622cc.

@MarkusBonsch
Contributor Author

Initially, my local tests didn't include foverlaps because GenomicRanges was not installed. Now everything should be fixed.

mattdowle added a commit that referenced this pull request Sep 11, 2017
… to logic changes only (concerns shallow and keys)
@mattdowle mattdowle merged commit 61a25ba into Rdatatable:master Sep 11, 2017
@mattdowle mattdowle added this to the v1.10.6 milestone Sep 11, 2017