In previous versions of data.table, dt1 <- dt[TRUE] created a shallow copy of dt, so dt and dt1 had different memory addresses. Columns could then be added to dt1 by reference without affecting dt, and without the cost of copying any existing column. This is particularly useful when dt is extremely large and different scripts need to compute different columns from it without copying it.
library(data.table)
dt <- data.table(id = 1:10)
dt1 <- dt[TRUE]
dt1[, x := 1]
dt2 <- dt[TRUE]
dt2[, x := 2]
dt
#> id
#> 1: 1
#> 2: 2
#> 3: 3
#> 4: 4
#> 5: 5
#> 6: 6
#> 7: 7
#> 8: 8
#> 9: 9
#> 10: 10
dt1
#> id x
#> 1: 1 1
#> 2: 2 1
#> 3: 3 1
#> 4: 4 1
#> 5: 5 1
#> 6: 6 1
#> 7: 7 1
#> 8: 8 1
#> 9: 9 1
#> 10: 10 1
dt2
#> id x
#> 1: 1 2
#> 2: 2 2
#> 3: 3 2
#> 4: 4 2
#> 5: 5 2
#> 6: 6 2
#> 7: 7 2
#> 8: 8 2
#> 9: 9 2
#> 10: 10 2
address(dt)
#> [1] "0x7f8655ef4200"
address(dt1)
#> [1] "0x7f8655f40000"
address(dt2)
#> [1] "0x7f8655eae200"
Commit 3937881 changes this behavior: dt[TRUE] no longer shallow copies dt, so the following code no longer works as before. dt, dt1 and dt2 all end up as the same object, and the last := wins.
library(data.table)
dt <- data.table(id = 1:10)
dt1 <- dt[TRUE]
dt1[, x := 1]
dt2 <- dt[TRUE]
dt2[, x := 2]
dt
#> id x
#> 1: 1 2
#> 2: 2 2
#> 3: 3 2
#> 4: 4 2
#> 5: 5 2
#> 6: 6 2
#> 7: 7 2
#> 8: 8 2
#> 9: 9 2
#> 10: 10 2
dt1
#> id x
#> 1: 1 2
#> 2: 2 2
#> 3: 3 2
#> 4: 4 2
#> 5: 5 2
#> 6: 6 2
#> 7: 7 2
#> 8: 8 2
#> 9: 9 2
#> 10: 10 2
dt2
#> id x
#> 1: 1 2
#> 2: 2 2
#> 3: 3 2
#> 4: 4 2
#> 5: 5 2
#> 6: 6 2
#> 7: 7 2
#> 8: 8 2
#> 9: 9 2
#> 10: 10 2
address(dt)
#> [1] "0x7fb92930be00"
address(dt1)
#> [1] "0x7fb92930be00"
address(dt2)
#> [1] "0x7fb92930be00"
Currently data.table:::shallow is not exported, so there is no way to shallow copy a data.table through the public API without losing its key. (Calling .subset(dt, ...) and then setDT does shallow copy dt, but the key is lost.)
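For reference, the .subset plus setDT workaround mentioned above can be sketched as follows. This is only a minimal illustration on a small keyed table (the data and the key on id are assumptions made for the example, not part of the original report), and it relies on setDT not copying the columns of a plain list.

```r
library(data.table)

# Small keyed table standing in for a large shared dataset (illustrative data).
dt <- data.table(id = 1:10, key = "id")

# .subset() returns a plain list that points at the same column vectors
# and drops every attribute except names; setDT() then converts that
# list into a data.table by reference.
dt1 <- setDT(.subset(dt, names(dt)))

# Adding a column by reference to dt1 leaves dt untouched.
dt1[, x := 1]

names(dt)   # "id"
names(dt1)  # "id" "x"

# The key attribute does not survive the round trip.
key(dt)     # "id"
key(dt1)    # NULL

# Compare column addresses to check whether the column data was shared
# rather than copied.
identical(address(dt$id), address(dt1$id))
```

The sketch avoids copying column data but has to re-establish the key separately, which is the gap an exported shallow copy function would close.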