I'm testing the latest development version of data.table (1.10.5) and found that fread does not respect integer64 = "double" for some of my data. The release version does not have this problem.
The test code is:
dt <- fread("test_data.txt", sep = "|", integer64 = "double")
but the resulting data.table still has an integer64 column (V5):
> str(dt)
Classes ‘data.table’ and 'data.frame': 2583 obs. of 15 variables:
$ V1 : num 0.1 0.01 0.04 0.04 0.02 NA 0.01 0.01 0.03 0.02 ...
$ V2 : num 0.0509 0.01 0.0138 0.0248 0.0141 ...
$ V3 : num 0.0451 0.01 0.0153 0.0445 0.0133 ...
$ V4 : num 0.0386 0.01 0.0153 0.0557 0.0129 ...
$ V5 :integer64 1020396 55949051 4935942 3668540 11818540 30119787 115742884 0 ...
$ V6 : num 1734596 60721591 10311588 1711172 15439786 ...
$ V7 : num 1541020 46302203 5275696 1276379 13350567 ...
$ V8 : num 1408261 42321135 3844999 1173221 11545387 ...
$ V9 : num 9282004 84280297 15006800 14062030 81537656 ...
$ V10: logi NA NA NA NA NA NA ...
$ V11: logi NA NA NA NA NA NA ...
$ V12: logi NA NA NA NA NA NA ...
$ V13: logi NA NA NA NA NA NA ...
$ V14: logi NA NA NA NA NA NA ...
$ V15: num 2.35e+09 2.45e+09 1.11e+09 2.28e+09 5.38e+09 ...
- attr(*, ".internal.selfref")=<externalptr>
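For anyone trying to confirm the behaviour, the check above can be sketched programmatically. This is only a minimal sketch: the temporary file and its three rows are a hypothetical stand-in for test_data.txt, and such a small file may well not trigger the problem. The names `tmp`, `col_classes` and `int64_cols` are illustrative.

```r
library(data.table)

# Hypothetical stand-in for test_data.txt: a '|'-separated file whose
# last column holds values beyond the 32-bit integer range.
tmp <- tempfile(fileext = ".txt")
writeLines(c("0.1|1020396|4294967296",
             "0.01|55949051|8589934592",
             "0.04|4935942|12884901888"), tmp)

dt <- fread(tmp, sep = "|", header = FALSE, integer64 = "double")

# Take the first class of every column; any "integer64" entry means
# the integer64 = "double" argument was not honoured for that column.
col_classes <- vapply(dt, function(x) class(x)[1L], character(1))
int64_cols <- names(col_classes)[col_classes == "integer64"]
print(int64_cols)

unlink(tmp)
```

Pointing `fread` at the attached file instead of `tmp` should list V5 if the problem is present.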
The data is attached below:
test_data.txt
The same happens on both macOS and Ubuntu, the two platforms I tested.
Here's my session info:
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.3
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.10.5
loaded via a namespace (and not attached):
[1] bit_1.1-12 httr_1.3.1 compiler_3.4.3 R6_2.2.2 tools_3.4.3 withr_2.1.1 curl_3.1 yaml_2.1.16
[9] memoise_1.1.0 bit64_0.9-7 knitr_1.19 git2r_0.21.0 digest_0.6.15 devtools_1.13.4
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS
Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.10.5
loaded via a namespace (and not attached):
[1] compiler_3.4.3 tools_3.4.3 yaml_2.1.16