fread fails on a tab-separated file in which some fields are only partially quoted, i.e. contain embedded double quotes. To illustrate (str_works, which has the quotes stripped, confirms that the embedded quotes are the cause):
str_fails = 'L1\tsome\tunquoted\tstuff\nL2\tsome\t"half" quoted\tstuff\nL3\tthis\t"should work"\tok thought'
str_works = gsub('"', '', str_fails)
fread(str_works, sep='\t', header=F, skip=0L)
# V1 V2 V3 V4
#1: L1 some unquoted stuff
#2: L2 some half quoted stuff
#3: L3 this should work ok thought
fread(str_fails, sep='\t', header=F, skip=0L)
# Error in fread(str_fails, sep = "\t", header = F, skip = 0L) :
# Expected sep ('\t') but '\n' ends field 3 on line 1 when detecting types: L2 some "half" quoted stuff
Would it be possible to add a quote argument to mitigate this, or alternatively to detect embedded quotes automatically in the quote classification code?
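In the meantime, a possible stopgap (a sketch, not a fix for fread itself) is to read the same input with base R's read.table, which already accepts quote = "" to turn off quote handling. The name df is only illustrative, and the commented-out conversion to a data.table assumes the data.table package is installed.

```r
# Same input as above: fields 3 of lines 2 and 3 contain embedded double quotes.
str_fails <- 'L1\tsome\tunquoted\tstuff\nL2\tsome\t"half" quoted\tstuff\nL3\tthis\t"should work"\tok thought'

# quote = "" disables quote interpretation, so '"' is kept as a literal character.
df <- read.table(text = str_fails, sep = "\t", quote = "", header = FALSE,
                 stringsAsFactors = FALSE)

# Optional: convert afterwards (assumes data.table is installed).
# dt <- data.table::as.data.table(df)
```

This keeps the double quotes in the data rather than stripping them as the gsub call does, at the cost of read.table being slower than fread on large files.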