fread occasionally reads in differently rounded non-exact fp numbers than base R #4461

@gmbecker

This is likely a non-issue (I do understand that these numbers are not meaningfully different), and apologies if I missed a mention of this in the documentation or previous issues (I did look).

There are certain values (I ran into one in the wild) for which fread and read.table (which agrees with R's parser) parse the same string representing a floating-point number into nearly equal but non-identical doubles, i.e. different byte representations.

Note that this means cached results cannot be trusted to remain valid (non-stale) when read.table calls are upgraded to fread, even though the docs and a naive understanding of what is happening suggest they could.
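One possible way to make a cache check tolerant of this kind of last-bit difference is to compare rounded keys rather than raw doubles. The following is only a sketch under stated assumptions: `cache_key` is a hypothetical helper (not part of data.table or base R), and `b` is constructed by hand as the next representable double above the value from this report, standing in for what a second parser might return. Values that happen to straddle a rounding boundary could still produce different keys, so this reduces rather than eliminates the problem.

```r
# Hypothetical helper: build a cache key from 15 significant digits so that
# doubles differing only in the last bit usually map to the same key.
cache_key <- function(x) sprintf("%.15g", x)

a <- 0.8060667366   # value as parsed by R
b <- a + 2^-53      # assumed stand-in: the next representable double above a

identical(a, b)                        # bitwise comparison of the doubles
isTRUE(all.equal(a, b))                # tolerance-based comparison
identical(cache_key(a), cache_key(b))  # comparison of the rounded keys
```

Here `identical()` distinguishes the two values, while `all.equal()` and the rounded keys treat them as the same.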

Reproducible example:

library(data.table)
## data.table 1.12.8 using 12 threads (see ?getDTthreads).  Latest news: r-datatable.com
exchar = "0.8060667366"
exnum = 0.8060667366
rtres = read.table(text = exchar)
rtres
##          V1
## 1 0.8060667
rtval = rtres[1,1]

identical(rtval, exnum)
## [1] TRUE

frres = fread(text = c(exchar, exchar))
frres
##           V1
## 1: 0.8060667
## 2: 0.8060667
frval = frres[1,V1]

identical(frval, exnum)
## [1] FALSE

sprintf("%1.17f", rtval)
## [1] "0.80606673659999994"
sprintf("%1.17f", frval)
## [1] "0.80606673660000006"
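The two printed values look like adjacent doubles: they differ by roughly 2^-53, which is the spacing of doubles in [0.5, 1). The sketch below shows one way to inspect that using base R only. It assumes, based on the printed digits rather than on fread itself, that the second value is exactly one unit in the last place above the first; `neighbor` is a hand-built stand-in, not actual fread output.

```r
exnum <- 0.8060667366
ulp <- 2^-53               # spacing of doubles in [0.5, 1)
neighbor <- exnum + ulp    # assumed stand-in for the value fread returned

sprintf("%1.17f", c(exnum, neighbor))  # decimal view, as in the report
sprintf("%a", c(exnum, neighbor))      # hexadecimal view of the stored value
(neighbor - exnum) / ulp               # distance in units in the last place
```

Running the same three lines on the real `rtval` and `frval` would confirm whether the parsers disagree by a single unit in the last place.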

sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: CentOS Linux 7 (Core)

## Matrix products: default
## BLAS:   <snip>/R-3.6.1/lib64/R/lib/libRblas.so
## LAPACK: <snip>/R-3.6.1/lib64/R/lib/libRlapack.so

## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     

## other attached packages:
## [1] data.table_1.12.8

## loaded via a namespace (and not attached):
## [1] compiler_3.6.1 tools_3.6.1   
