Handle missing values for delimited text files when Nullhandling is enabled #8779
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -814,9 +814,7 @@ private ParseSpec getParseSpec(TimestampSpec timestampSpec, DimensionsSpec dimen | |
| private String getUnparseableTimestampString() | ||
| { | ||
| return ParserType.STR_CSV.equals(parserType) | ||
| - ? (USE_DEFAULT_VALUE_FOR_NULL | ||
| - ? "Unparseable timestamp found! Event: {t=bad_timestamp, dim1=foo, dim2=null, met1=6}" | ||
| - : "Unparseable timestamp found! Event: {t=bad_timestamp, dim1=foo, dim2=, met1=6}") | ||
| + ? "Unparseable timestamp found! Event: {t=bad_timestamp, dim1=foo, dim2=null, met1=6}" | ||
|
Contributor
Author
@nishantmonu51 Please let me know if you think there is a better way to do this.
Contributor
I think this should be OK. @nishantmonu51 do you have something in mind? |
||
| : "Unparseable timestamp found! Event: {t=bad_timestamp, dim1=foo, met1=6}"; | ||
| } | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -161,7 +161,7 @@ public void setup() throws Exception | |
| "2011-01-12T00:00:00.000Z,product_1,t1\tt2\tt3,u1\tu2", | ||
| "2011-01-13T00:00:00.000Z,product_2,t3\tt4\tt5,u3\tu4", | ||
| "2011-01-14T00:00:00.000Z,product_3,t5\tt6\tt7,u1\tu5", | ||
| - "2011-01-14T00:00:00.000Z,product_4,,u2" | ||
| + "2011-01-14T00:00:00.000Z,product_4,\"\",u2" | ||
|
Contributor
What is this change for?
Member
I think this is to preserve the original test: previously the CSV parser would always produce empty strings here, but after this change it will produce a null in null-compatible mode.
Contributor
Oh I see.
Contributor
Author
Thanks for clarifying this. I was too late 💤 |
||
| }; | ||
|
|
||
| for (String row : rows) { | ||
|
|
||
I wonder if this could just be the mode used all the time. I don't think it would change anything for the default mode either, since the values would be coerced to '' anyway.
The intention was to preserve backward compatibility, but thinking about it, the empty separators flag might not impact the default mode. I'll do a few tests and raise a PR to make `CSVReaderNullFieldIndicator.EMPTY_SEPARATORS` the default mode if it goes well.
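
For readers following the thread, the distinction between `,,` and `,"",` in the test row can be sketched in plain Java. This is only an illustration under stated assumptions: `NullAwareCsvSketch`, `parseLine`, and `coerceForDefaultMode` are hypothetical names, and the code is neither the Druid parser nor opencsv. It treats an unquoted empty field as null and a quoted empty field as an empty string, and it mimics the coercion of null to '' that the comments above attribute to the default mode.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch only; not the Druid or opencsv implementation.
 * Escaped quotes and multi-line fields are deliberately not handled.
 */
public class NullAwareCsvSketch
{
  /**
   * Splits one CSV line on commas. An unquoted empty field becomes null,
   * while a quoted empty field ("") stays an empty string.
   */
  public static List<String> parseLine(String line)
  {
    List<String> fields = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean inQuotes = false;
    boolean sawQuote = false; // true if the current field contained a quote character

    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      if (c == '"') {
        inQuotes = !inQuotes;
        sawQuote = true;
      } else if (c == ',' && !inQuotes) {
        fields.add(toField(current, sawQuote));
        current.setLength(0);
        sawQuote = false;
      } else {
        current.append(c);
      }
    }
    fields.add(toField(current, sawQuote));
    return fields;
  }

  // An empty, never-quoted field is reported as null; everything else is kept as text.
  private static String toField(StringBuilder value, boolean quoted)
  {
    return (value.length() == 0 && !quoted) ? null : value.toString();
  }

  /** Assumed model of the default mode discussed above: null strings are coerced to "". */
  public static String coerceForDefaultMode(String value)
  {
    return value == null ? "" : value;
  }

  public static void main(String[] args)
  {
    // Third field is null for the unquoted row and "" for the quoted row.
    System.out.println(parseLine("2011-01-14T00:00:00.000Z,product_4,,u2"));
    System.out.println(parseLine("2011-01-14T00:00:00.000Z,product_4,\"\",u2"));
  }
}
```

Under this sketch, both rows yield the same value once `coerceForDefaultMode` is applied, which is the reason the reviewer suspects the flag would not change behavior in the default mode.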