THANK YOU FOR ALL YOUR GREAT WORK! You enable my research efforts to be efficient on a daily basis.
I have searched the NEWS, issues, milestones, and Stack Overflow, all to no avail.
What happens is that one of my variables accumulates an ever-growing string of ""'s each time it is written by fwrite and read back by fread. This goes on until the value reaches a size that causes fwrite to crash:
fwrite(DT , "testDT.csv")
*** caught segfault ***
address 0x7f9ce33ff000, cause 'memory not mapped'
Traceback:
1: fwrite(DT, "testDT.csv")
I have worked around this problem by deleting the many repeated ""'s using gsub, but I don't know why they arise in the first place. I suspect there is something in the way fread and fwrite handle ""'s. (I am using R and data.table to sort references, which I admit is not a very frequent use case. My reference programs may be putting them in, and some of my authors have unusual letters in their names.)
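For reference, the gsub workaround can be sketched roughly as below. This is a minimal illustration and not the exact call I use: the helper name `collapse_quotes` and the sample string are made up for the example, and it assumes that every run of two or more consecutive double quotes should become a single one.

```r
# Hypothetical helper (not part of data.table): collapse every run of two or
# more consecutive double quotes into a single double quote.
collapse_quotes <- function(x) {
  gsub('"{2,}', '"', x)
}

# Made-up value standing in for a field after several write/read cycles.
title <- 'A study of """"""quoted"""""" words'
cleaned <- collapse_quotes(title)
print(cleaned)
```

Applied to each character column before calling fwrite, this keeps the values from growing, but it only hides the symptom. It would also alter any field that legitimately contains doubled quotes.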
I have tested this both in the data.table version from CRAN and the development version, installed as described on your home page.
Thank you for your time!
# [Minimal reproducible example]
library(data.table)
DT <- fread("testDT.txt") ##Using .txt as GitHub does not like .csv
## Attached file: testDT.txt
fwrite(DT, "testDT.csv")
DT <- fread("testDT.csv")
DT
## You will notice that the number of ""'s has grown now; it increases with each cycle.
# Output of sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_DK.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_DK.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_DK.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_DK.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.12.3
loaded via a namespace (and not attached):
[1] compiler_3.5.3 tools_3.5.3