diff --git a/NEWS.md b/NEWS.md index 12078fc12b..9edb65a03e 100644 --- a/NEWS.md +++ b/NEWS.md @@ -103,6 +103,8 @@ unit = "s") 10. When updating by reference, the use of `get` could result in columns being re-ordered silently, [#4089](https://github.com/Rdatatable/data.table/issues/4089). Thanks to @dmongin for reporting and Cole Miller for the fix. +11. `copy()` now overallocates deeply nested lists of `data.table`s, [#4205](https://github.com/Rdatatable/data.table/issues/4205). Thanks to @d-sci for reporting and the PR. + ## NOTES 0. Retrospective license change permission was sought from and granted by 4 contributors who were missed in [PR#2456](https://github.com/Rdatatable/data.table/pull/2456), [#4140](https://github.com/Rdatatable/data.table/pull/4140). We had used [GitHub's contributor page](https://github.com/Rdatatable/data.table/graphs/contributors) which omits 3 of these due to invalid email addresses, unlike GitLab's contributor page which includes the ids. The 4th omission was a PR to a script which should not have been excluded; a script is code too. We are sorry these contributors were not properly credited before. They have now been added to the contributors list as displayed on CRAN. All the contributors of code to data.table hold its copyright jointly; your contributions belong to you. You contributed to data.table when it had a particular license at that time, and you contributed on that basis. This is why in the last license change, all contributors of code were consulted and each had a veto. diff --git a/R/data.table.R b/R/data.table.R index 86d3726d14..9292ee940b 100644 --- a/R/data.table.R +++ b/R/data.table.R @@ -2345,21 +2345,18 @@ copy = function(x) { newx = .Call(Ccopy,x) # copies at length but R's duplicate() also copies truelength over. # TO DO: inside Ccopy it could reset tl to 0 or length, but no matter as selfrefok detects it # TO DO: revisit duplicate.c in R 3.0.3 and see where it's at - if (!is.data.table(x)) { - # fix for #1476. TODO: find if a cleaner fix is possible.. - if (is.list(x)) { - anydt = vapply_1b(x, is.data.table, use.names=FALSE) - if (sum(anydt)) { - newx[anydt] = lapply(newx[anydt], function(x) { - .Call(C_unlock, x) - setalloccol(x) - }) - } + + reallocate = function(y) { + if (is.data.table(y)) { + .Call(C_unlock, y) + setalloccol(y) + } else if (is.list(y)) { + y[] = lapply(y, reallocate) } - return(newx) # e.g. in as.data.table.list() the list is copied before changing to data.table + y } - .Call(C_unlock, newx) - setalloccol(newx) + + reallocate(newx) } .shallow = function(x, cols = NULL, retain.key = FALSE, unlock = FALSE) { diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw index 6767e52998..38b65cba21 100644 --- a/inst/tests/tests.Rraw +++ b/inst/tests/tests.Rraw @@ -8277,6 +8277,8 @@ test(1606, fread("2,\n1,a,b", fill=TRUE), data.table(V1=2:1, V2=c("","a"), V3=c( dt = data.table(resp=c(1:5)) wide = copy(list(metrics = dt))$metrics # copy here copies the list of data.table and therefore didn't overallocate before.. test(1607, wide[, id := .I], data.table(resp = 1:5, id = 1:5)) +wide = copy(list(a = list(b = dt)))$a$b # check again on doubly nested list +test(1607.1, wide[, id := .I], data.table(resp = 1:5, id = 1:5)) # better fix for #1462, + improved error message (if this better fix fails) # no need for quote="" and sep="\t"..