Description
When reading a CSV file that contains empty quoted strings (""), I expect these values to be parsed as NA_character_ when "" is explicitly listed in the na.strings argument. However, fread() does not convert them to NA. Instead, the fields are returned as empty strings in the output.
Minimal reproducible example
temp_file <- tempfile(fileext = ".csv")
writeLines(
text = c(
'"Date Example","Question 1","Question 2","Site: Country"',
'"12/5/2012","Yes","Yes","Chile"',
'"","","","Virgin Islands, British"'
),
con = temp_file
)
# Missing values not shown as NA by default
data.table::fread(temp_file)
#>    Date Example Question 1 Question 2           Site: Country
#>          <char>     <char>     <char>                  <char>
#> 1:    12/5/2012        Yes        Yes                   Chile
#> 2:                                    Virgin Islands, British
# Missing values still not shown as NA, even with na.strings = '""'
data.table::fread(temp_file, na.strings = '""')
#>    Date Example Question 1 Question 2           Site: Country
#>          <char>     <char>     <char>                  <char>
#> 1:    12/5/2012        Yes        Yes                   Chile
#> 2:                                    Virgin Islands, British
unlink(temp_file)
sessionInfo()
#> R version 4.2.3 (2023-03-15 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 26100)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.utf8
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.31 withr_2.5.2 R.methodsS3_1.8.2 lifecycle_1.0.4
#> [5] magrittr_2.0.3 reprex_2.0.2 evaluate_0.23 rlang_1.1.3
#> [9] cli_3.6.2 data.table_1.17.99 rstudioapi_0.15.0 fs_1.6.3
#> [13] R.utils_2.12.2 R.oo_1.25.0 vctrs_0.6.5 styler_1.10.2
#> [17] rmarkdown_2.25 tools_4.2.3 R.cache_0.16.0 glue_1.7.0
#> [21] purrr_1.0.1 xfun_0.39 yaml_2.3.10 fastmap_1.1.1
#> [25] compiler_4.2.3 htmltools_0.5.5 knitr_1.45
Created on 2025-05-13 with reprex v2.0.2
Expected behavior
I expect the second row’s empty quoted fields ("") to be shown as NA values when na.strings = '""' is provided. Instead, they are displayed as empty strings.
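To make the expected result concrete, here is a minimal sketch of a post-processing workaround that yields the output I would expect from fread() directly. It assumes the affected columns are read as character, and the helper variable char_cols is only illustrative. It is not a proposal for how fread() should implement this.

```r
library(data.table)

# Recreate the example file from the reprex above
temp_file <- tempfile(fileext = ".csv")
writeLines(
  c(
    '"Date Example","Question 1","Question 2","Site: Country"',
    '"12/5/2012","Yes","Yes","Chile"',
    '"","","","Virgin Islands, British"'
  ),
  con = temp_file
)

dt <- fread(temp_file)

# Workaround (assumption: only character columns need fixing):
# replace empty strings with NA_character_ by reference
char_cols <- names(dt)[vapply(dt, is.character, logical(1))]
for (col in char_cols) {
  set(dt, i = which(dt[[col]] == ""), j = col, value = NA_character_)
}

unlink(temp_file)
print(dt)
```

After this step, the first three columns of the second row are NA, while "Virgin Islands, British" is left unchanged.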
Thank you for your great work on data.table!