First off, thank you for this fantastic package. It effortlessly powers many of my data caving adventures.
Sometimes it's necessary to distinguish between null (NA) and empty ("") strings, and I'm trying to establish a pipeline that preserves this distinction with minimal markup. This doesn't currently work. For it to work, fread() would need to distinguish between an unquoted empty field and "", and fwrite() would ideally quote empty strings when quote = "auto".
Consider this data.table:
dt <- data.table::data.table(chr = c(NA, "", "a"), num = c(NA, NA, 1))
Here is the fwrite output with quote="auto":
csv <- paste(
capture.output(
data.table::fwrite(dt, quote = "auto")
),
collapse = "\n"
)
cat(csv)
The empty string is not quoted, and thus indistinguishable from the null string. They are both read back in as empty strings:
data.table::fread(csv)
   chr num
1:      NA
2:      NA
3:   a   1
If instead we force quotes, the distinction is kept between the null and empty strings:
csv_quoted <- paste(
capture.output(
data.table::fwrite(dt, quote = TRUE)
),
collapse = "\n"
)
cat(csv_quoted)
However, there is no way to read them back in as such. Either they are both empty:
data.table::fread(csv_quoted)
   chr num
1:      NA
2:      NA
3:   a   1
Or both null:
data.table::fread(csv_quoted, na.strings = "")
   chr num
1:  NA  NA
2:  NA  NA
3:   a   1
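For anyone who wants to check this on their own installation, the round trips described above could be tested programmatically along these lines. This is only a sketch: `round_trip_keeps_na()` is a hypothetical helper written for this report, not part of data.table, and the results may differ between data.table versions, so no expected output is shown.

```r
library(data.table)

dt <- data.table(chr = c(NA, "", "a"), num = c(NA, NA, 1))

# Hypothetical helper (not part of data.table): write `x` to a temporary CSV,
# read it back, and report whether the NA pattern of column `chr` survived.
# Extra arguments in `...` are passed to fwrite(); `na.strings` goes to fread().
round_trip_keeps_na <- function(x, ..., na.strings = "NA") {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))
  fwrite(x, tmp, ...)
  back <- fread(tmp, na.strings = na.strings)
  identical(is.na(x$chr), is.na(back$chr))
}

round_trip_keeps_na(dt, quote = "auto")                 # unquoted empty strings
round_trip_keeps_na(dt, quote = TRUE)                   # forced quotes, default na.strings
round_trip_keeps_na(dt, quote = TRUE, na.strings = "")  # forced quotes, "" treated as NA
```

A pipeline that preserves the distinction would return TRUE for at least one of these calls.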