Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,8 @@ unit = "s")

10. When updating by reference, the use of `get` could result in columns being re-ordered silently, [#4089](https://github.com/Rdatatable/data.table/issues/4089). Thanks to @dmongin for reporting and Cole Miller for the fix.

11. `copy()` now overallocates deeply nested lists of `data.table`s, [#4205](https://github.com/Rdatatable/data.table/issues/4205). Thanks to @d-sci for reporting and the PR.

## NOTES

0. Retrospective license change permission was sought from and granted by 4 contributors who were missed in [PR#2456](https://github.com/Rdatatable/data.table/pull/2456), [#4140](https://github.com/Rdatatable/data.table/pull/4140). We had used [GitHub's contributor page](https://github.com/Rdatatable/data.table/graphs/contributors) which omits 3 of these due to invalid email addresses, unlike GitLab's contributor page which includes the ids. The 4th omission was a PR to a script which should not have been excluded; a script is code too. We are sorry these contributors were not properly credited before. They have now been added to the contributors list as displayed on CRAN. All the contributors of code to data.table hold its copyright jointly; your contributions belong to you. You contributed to data.table when it had a particular license at that time, and you contributed on that basis. This is why in the last license change, all contributors of code were consulted and each had a veto.
Expand Down
23 changes: 10 additions & 13 deletions R/data.table.R
Original file line number Diff line number Diff line change
Expand Up @@ -2345,21 +2345,18 @@ copy = function(x) {
newx = .Call(Ccopy,x) # copies at length but R's duplicate() also copies truelength over.
# TO DO: inside Ccopy it could reset tl to 0 or length, but no matter as selfrefok detects it
# TO DO: revisit duplicate.c in R 3.0.3 and see where it's at
if (!is.data.table(x)) {
# fix for #1476. TODO: find if a cleaner fix is possible..
if (is.list(x)) {
anydt = vapply_1b(x, is.data.table, use.names=FALSE)
if (sum(anydt)) {
newx[anydt] = lapply(newx[anydt], function(x) {
.Call(C_unlock, x)
setalloccol(x)
})
}

reallocate = function(y) {
if (is.data.table(y)) {
.Call(C_unlock, y)
setalloccol(y)
} else if (is.list(y)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess is.recursive would be more appropriate here? IIUC it would apply to anything for which the lapply(y, reallocate) part would go through

Copy link
Copy Markdown
Member

@mattdowle mattdowle Feb 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thinking, but is.recursive is, unlike its name suggests, more like !is.atomic. For example, is.recursive(mean) is true! Feels right to me to use a direct is.list to target what's in mind, and postpone an expansion to is.recurisve until we know we need to and have an example and test.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good observation, I wish it was in the doc s!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?is.recursive does include "Most types of objects are regarded as recursive. Exceptions are ..." but then I'm not really sure what it is useful for vs !is.atomic.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just happened upon the source for this as well:

/* is.recursive */
	switch(TYPEOF(CAR(args))) {
	case VECSXP:
	case LISTSXP:
	case CLOSXP:
	case ENVSXP:
	case PROMSXP:
	case LANGSXP:
	case SPECIALSXP:
	case BUILTINSXP:
	case DOTSXP:
	case ANYSXP:
	case EXPRSXP:
	// Not recursive, as long as not subsettable (on the R level)
	// case EXTPTRSXP:
	// case BCODESXP:
	// case WEAKREFSXP:
	    LOGICAL0(ans)[0] = 1;
	    break;
	default:
	    LOGICAL0(ans)[0] = 0;
	    break;
	}

y[] = lapply(y, reallocate)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if you have something like list(data.table(), list(list(list())))?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eventually it either runs into an is.data.table branch or by default just does nothing right?

}
return(newx) # e.g. in as.data.table.list() the list is copied before changing to data.table
y
}
.Call(C_unlock, newx)
setalloccol(newx)

reallocate(newx)
}

.shallow = function(x, cols = NULL, retain.key = FALSE, unlock = FALSE) {
Expand Down
2 changes: 2 additions & 0 deletions inst/tests/tests.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -8277,6 +8277,8 @@ test(1606, fread("2,\n1,a,b", fill=TRUE), data.table(V1=2:1, V2=c("","a"), V3=c(
dt = data.table(resp=c(1:5))
wide = copy(list(metrics = dt))$metrics # copy here copies the list of data.table and therefore didn't overallocate before..
test(1607, wide[, id := .I], data.table(resp = 1:5, id = 1:5))
wide = copy(list(a = list(b = dt)))$a$b # check again on doubly nested list
test(1607.1, wide[, id := .I], data.table(resp = 1:5, id = 1:5))

# better fix for #1462, + improved error message (if this better fix fails)
# no need for quote="" and sep="\t"..
Expand Down