when nrows = 0 in fread, field types returned are not correct #4029

@michaelpaulhirsch

Hi All,
First off, thanks for your work on an amazing package.

In the documentation for fread, the description of the nrows argument (quoted below) suggests to me that fread is intended to return a 0-row data.table with the correct column classes.

'nrows=0' returns the column names and typed empty columns determined by the large sample; useful for a dry run of a large file or to quickly check format consistency of a set of files before starting to read any of them.

However, in the reprex below, all columns are returned as logical.

library(data.table)
tmp <- tempfile(fileext = ".csv")
fwrite(iris, tmp)
str(fread(tmp, nrows = 0))
#> Classes 'data.table' and 'data.frame':   0 obs. of  5 variables:
#>  $ Sepal.Length: logi 
#>  $ Sepal.Width : logi 
#>  $ Petal.Length: logi 
#>  $ Petal.Width : logi 
#>  $ Species     : logi 
#>  - attr(*, ".internal.selfref")=<externalptr>

I was expecting the same output as the following:

str(fread(tmp)[0])
#> Classes ‘data.table’ and 'data.frame':	0 obs. of  5 variables:
#>  $ Sepal.Length: num 
#>  $ Sepal.Width : num 
#>  $ Petal.Length: num 
#>  $ Petal.Width : num 
#>  $ Species     : chr 
#>  - attr(*, ".internal.selfref")=<externalptr>

sessionInfo()
#> R version 3.6.1 (2019-07-05)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 18.04.3 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
#>  [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
#> [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] data.table_1.12.6
#> 
#> loaded via a namespace (and not attached):
#> [1] compiler_3.6.1 tools_3.6.1   
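
The comparison above can also be made programmatically. This is a minimal sketch: `dry_run`, `expected`, `mismatch`, and `workaround` are illustrative names, and reading a few rows and then dropping them is only a possible stopgap that I am assuming is acceptable, not documented behaviour. The types it returns reflect only what fread infers for that shorter read.

```r
library(data.table)

# Write a small CSV with known column types (4 numeric, 1 character).
tmp <- tempfile(fileext = ".csv")
fwrite(iris, tmp)

# First class of each column: dry run vs. a full read truncated to zero rows.
first_class <- function(x) class(x)[1L]
dry_run  <- vapply(fread(tmp, nrows = 0), first_class, character(1))
expected <- vapply(fread(tmp)[0], first_class, character(1))

# Columns whose dry-run class differs from the full-read class.
mismatch <- names(expected)[dry_run != expected]
print(mismatch)

# Possible stopgap (an assumption): read a few rows, then keep none of them.
workaround <- fread(tmp, nrows = 100)[0]
str(workaround)
```

If `mismatch` is empty, the dry run and the full read agree on every column class.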

Thanks!
