Hi:
I'm trying to load a 1.6 GB CSV file with data.table::fread. The read fails partway through with an error about a specific line:
Read 76.2% of 5288107 rows
Error in fread("2015-03.csv") :
Expecting 50 cols, but line 4128650 contains text after processing all cols.
Try again with fill=TRUE. Another reason could be that fread's logic in distinguishing one or more fields having embedded sep=','
and/or (unescaped) '\n' characters within unbalanced unescaped quotes has failed.
If quote='' doesn't help, please file an issue to figure out if the logic could be improved.
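For reference, the two retries that the message suggests (fill=TRUE and quote='') can be sketched as below. This is only a minimal sketch: `try_fread` is a hypothetical helper name, not part of data.table, and it makes no claim about whether either option gets past the failing line. It returns the parsed table for each attempt that succeeds and the error message for each attempt that fails.

```r
library(data.table)

# Hypothetical helper (not part of data.table): run fread three times,
# once with defaults and once with each option named in the error message.
try_fread <- function(path) {
  attempts <- list(
    default  = list(),             # plain fread(path)
    fill     = list(fill = TRUE),  # pad short rows instead of stopping
    no_quote = list(quote = "")    # turn off quote handling entirely
  )
  lapply(attempts, function(args) {
    tryCatch(
      do.call(fread, c(list(input = path), args)),
      # Keep the message so the failing attempts can be compared.
      error = function(e) conditionMessage(e)
    )
  })
}
```

Calling `results <- try_fread("test.txt")` and then `sapply(results, is.data.frame)` shows which of the three attempts produced a table.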
I have checked that the offending line is valid (with csvfix and also https://csvlint.io/). I have included it in the following example file, which contains the header, a non-failing line, and then the failing line:
test.txt
As you can see, it has some non-trivial quoting and escaping.
Do you think fread's CSV logic could be extended to handle cases like this?
Thanks a lot, best regards!