fwrite() cannot handle umlauts (and presumably any non-ASCII characters) in file names and paths on Windows (here with LC_COLLATE=German_Germany.1252, but in my experience this will also be a problem in other non-UTF-8 locales).
When the umlaut is in the file name, fwrite writes the file, but with a faulty file name.
```r
library(data.table)
setwd(tempdir())
DF = data.frame(A=1:3, B=c("foo","A,Name","baz"))
fwrite(DF, "töst.csv")
list.files(pattern = "\\.csv")
#> [1] "töst.csv"
```
When the umlaut is in the path, fwrite cannot write the file at all.
```r
dir.create("ä")
data.table::fwrite(DF, "ä/test.csv")
#> Error in data.table::fwrite(DF, "ä/test.csv"): No such file or directory: 'ä/test.csv'. Unable to create new file for writing (it does not exist already). Do you have permission to write here, is there space on the disk and does the path exist?
```
I looked at and debugged the R code, and it seems to me that up until line 67 the file argument is encoded as “UTF-8” (as it should be, IMO) and looks fine. So my guess is that the file path’s encoding goes wrong in the CfwriteR code.
Judging from the characters that appear in place of “ö” and “ä”, the problem seems to be that CfwriteR gets a UTF-8 string but handles it as if it were encoded as latin-1; see this table.
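The suspected misinterpretation can be reproduced in plain R, without fwrite. This is only a sketch of the mechanism I am guessing at (UTF-8 bytes being read as latin-1), not a trace of what CfwriteR actually does; the variable names are made up for illustration.

```r
# "ö" written as an escape so the example does not depend on the script's encoding
x <- "t\u00f6st.csv"
Encoding(x)   # "UTF-8"

# Keep the same bytes but deliberately mislabel them as latin-1
y <- x
Encoding(y) <- "latin1"

# Converting the mislabelled string back to UTF-8 exposes the garbling:
# the two UTF-8 bytes of "ö" (0xC3 0xB6) are each read as one latin-1 character
garbled <- enc2utf8(y)
print(garbled)
garbled == "t\u00c3\u00b6st.csv"   # TRUE, i.e. "ö" has become "Ã¶"
```

Each non-ASCII character in the name turns into two characters this way, which is why the garbled name is one character longer than the intended one.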
If the error were in the R code, I would solve it with a line like Encoding(file) <- "UTF-8", but I do not know how this is done in C.
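Until the C side is sorted out, a possible workaround on the R side is to hand fwrite a path that is already in the session's native encoding, using base R's enc2native(). This is a sketch under the assumption that CfwriteR passes the bytes it receives straight to the operating system; I have not verified that in the C source, and it only helps when the characters can be represented in the native code page (CP1252 here).

```r
library(data.table)

DF <- data.frame(A = 1:3, B = c("foo", "A,Name", "baz"))

# Work in a scratch directory, as in the examples above
old <- setwd(tempdir())

# Non-ASCII names written as escapes: an "ä" directory and a "töst.csv" file
dir.create("\u00e4", showWarnings = FALSE)
path <- file.path("\u00e4", "t\u00f6st.csv")

# enc2native() re-encodes the path into the native encoding;
# in a UTF-8 locale it leaves the string unchanged
fwrite(DF, enc2native(path))

# Check through base R, which handles the path encoding itself
file.exists(path)

setwd(old)
```

If the assumption holds, the file should show up under its intended name in list.files(); if it does not, the problem lies somewhere other than the encoding of the path string.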
Session info
```r
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value
#>  version  R version 3.5.1 (2018-07-02)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language en
#>  collate  German_Germany.1252
#>  tz       Europe/Berlin
#>  date     2018-09-27
#> Packages -----------------------------------------------------------------
#>  package    * version date       source
#>  base       * 3.5.1   2018-07-02 local
#>  compiler     3.5.1   2018-07-02 local
#>  data.table * 1.11.6  2018-09-19 CRAN (R 3.5.1)
#>  datasets   * 3.5.1   2018-07-02 local
#>  devtools     1.13.6  2018-06-27 CRAN (R 3.5.1)
#>  digest       0.6.17  2018-09-12 CRAN (R 3.5.1)
#>  evaluate     0.11    2018-07-17 CRAN (R 3.5.1)
#>  graphics   * 3.5.1   2018-07-02 local
#>  grDevices  * 3.5.1   2018-07-02 local
#>  htmldeps     0.1.1   2018-07-30 Github (rstudio/htmldeps@c1023e0)
#>  htmltools    0.3.6   2017-04-28 CRAN (R 3.5.1)
#>  knitr        1.20    2018-02-20 CRAN (R 3.5.1)
#>  magrittr     1.5     2014-11-22 CRAN (R 3.5.1)
#>  memoise      1.1.0   2017-04-21 CRAN (R 3.5.1)
#>  methods    * 3.5.1   2018-07-02 local
#>  Rcpp         0.12.18 2018-07-23 CRAN (R 3.5.1)
#>  rmarkdown    1.10.13 2018-09-04 Github (rstudio/rmarkdown@19008bf)
#>  stats      * 3.5.1   2018-07-02 local
#>  stringi      1.2.4   2018-07-20 CRAN (R 3.5.1)
#>  stringr      1.3.1   2018-05-10 CRAN (R 3.5.1)
#>  tools        3.5.1   2018-07-02 local
#>  utils      * 3.5.1   2018-07-02 local
#>  withr        2.1.2   2018-03-15 CRAN (R 3.5.1)
#>  yaml         2.2.0   2018-07-25 CRAN (R 3.5.1)
```