Luke Tierney contacted me about a memory issue in R-devel. Tests are passing on CRAN and on Travis/Appveyor, but memory usage is higher in some cases. As before, the garbage collector gets turned off.
I'm not sure exactly when the R-devel change was made, but it was some time in the last month or so. Either the R-devel change causes the new memory issue, or it reveals a problem that already exists (possibly even with data.table-release on R-release). In any case, it needs to be fixed.
Luke wrote:
I distilled the issue in 'constellation' down to the attached file
from the examples. If I run this the memory usage is much bigger than
previously and the gc() output is garbled. It's a
multi-threading issue again; everything looks fine with
OMP_NUM_THREADS=1. With multiple threads you are
calling DATAPTR from threads other than the main one and that creates
a race on setting the R_GCEnabled flag, so eventually it is getting
stuck on off. I instrumented the places where the GC is disabled and
tracked this as the first one from a thread other than the main one:
#4 0x00007ffff78ba9d5 in DATAPTR (x=x@entry=0x1167458)
at ../../../R/src/include/Rinlinedfuns.h:106
#5 0x00007fffea02c4c8 in subsetVectorRaw (target=0x4529590, source=0x1167458,
idx=0x4175210, any0orNA=FALSE) at subset.c:44
#6 0x00007fffea02c7a6 in subsetDT (x=<optimized out>, rows=<optimized out>,
cols=<optimized out>) at subset.c:272
Given Luke's detail, it's easy to see the problem in the code without needing to reproduce it. All R API usage needs to be moved outside all parallel regions, as Luke requested before. I delayed doing that last time due to time pressure, and now it needs tackling. Last time I did only the first step, which was to ensure that DATAPTR inside parallel regions did not receive ALTREP vectors.
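To illustrate the intended shape of the fix, here is a minimal sketch of hoisting the pointer lookup out of the parallel region. It is not the actual subset.c code and does not use R's headers: `vec_t`, `get_dataptr` and `subset_double` are hypothetical stand-ins (for SEXP, DATAPTR and subsetVectorRaw respectively), and the accessor is simply assumed to be unsafe to call from worker threads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an R vector (SEXP); not R's real type. */
typedef struct { double *data; size_t n; } vec_t;

/* Hypothetical stand-in for DATAPTR(). Assumption: like the real accessor,
   it touches unsynchronised global state, so it must only be called from
   the main thread. The counter mimics that shared state. */
static int accessor_calls = 0;
static double *get_dataptr(vec_t *x) { accessor_calls++; return x->data; }

/* target[i] = source[idx[i]] for i in [0, n); idx is 0-based in this sketch. */
void subset_double(vec_t *target, vec_t *source, const int *idx, size_t n)
{
    assert(target->n >= n);

    /* Fetch the raw pointers once, on the main thread, BEFORE the parallel
       region. The loop body then touches only plain C memory. */
    double *tp = get_dataptr(target);
    const double *sp = get_dataptr(source);

    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        tp[i] = sp[idx[i]];
    }
}
```

The accessor is called exactly twice regardless of the number of threads, so there is no longer any shared state for the threads to race on. Without OpenMP enabled the pragma is ignored and the loop runs serially with the same result.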
This warrants accelerating release of 1.12.0, not least because it impacts Luke.
$ grep "omp.*parallel" *.c
$ grep ALTREP *.c