Record column name for exceptions while writing frames in RowBasedFrameWriter#16130
Conversation
There was a problem hiding this comment.
Thanks for the PR!
- I have added a few comments about the relevance of the new error class.
- Can you please add a test case where the exception gets generated?
- Also, what does it mean for MSQ, should MSQ retry if it encounters such exceptions or not? It seems like retrying shouldn't help here, but per my understanding, MSQ will still treat errors like these as
UnknownFaults,and will retry. We should update that logic in MSQ (check how we do it forInvalidNullByte)
There was a problem hiding this comment.
I doubt that we need this class anywhere. Can't we use a generic RuntimeException with the proper error message? Or better yet, use DruidException, so that we can analyze the message.
There was a problem hiding this comment.
I think we can make a MSQ fault out of this. That way we donot have to adjust any retry logic.
There was a problem hiding this comment.
Converting FrameFieldWriterException now to an MSQFault type
| ) | ||
| { | ||
| super(StringUtils.format( | ||
| super(column, StringUtils.format( |
There was a problem hiding this comment.
Feels weird that we have this extra layer of indirection, to what will essentially call the RuntimeException. If we don't extract the relevant fields (.getColumn()) from the FrameFieldWriterException anywhere, or use check for it's instanceof, then I suppose we can get rid of it.
There was a problem hiding this comment.
Converting to MSQFault now for which the specific exception class is required. InvalidNullByteException now extends RuntimeException directly.
Tried multiple routes incl insert via external inline tables and insert via joins, but an error isn't generated. Therefore not adding the test case.
The conversion to |
LakshSingla
left a comment
There was a problem hiding this comment.
minor comment, overall the changes LGTM
| FrameFieldWriterException.builder() | ||
| .source(ffre.getSource()) | ||
| .rowNumber(ffre.getRowNumber()) | ||
| .column(ffre.getColumn()) | ||
| .errorMsg(ffre.getErrorMsg()) | ||
| .build(), |
There was a problem hiding this comment.
Let's use the normal constructor since we are using all the parameters in a single place. Invalid null byte exception has the builder because it populates different information at different places up the call stack.
There was a problem hiding this comment.
IIUC, the ask is to remove the builder from FrameFieldWriterException. But we do build it progressively in ScanQueryFrameProcessor.
There was a problem hiding this comment.
Let's add this class to MSQFaultsSerdeTest for completeness.
There was a problem hiding this comment.
Done. I've also added the description of the fault in reference.md and removed the term frame from the fault code and log message since that is user-facing.
| @JsonTypeName(FrameFieldWriterFault.CODE) | ||
| public class FrameFieldWriterFault extends BaseMSQFault | ||
| { | ||
| static final String CODE = "FieldWriterError"; |
There was a problem hiding this comment.
This doesn't match the class name, can you please update the class as well? Also, wrt the naming convention of other faults, we don't have "Error" in the suffix. I think something like "InvalidFieldFault" might be more suitable, and the name could be "InvalidField". I am open to other naming conventions as well.
There was a problem hiding this comment.
Went with your suggestion. We do have UnknownFault->UnknownError though, but InvalidField sounded fine.
| CODE, | ||
| StringUtils.format( | ||
| "Error[%s] while writing frame for source[%s], rowNumber[%d] column[%s]", | ||
| "Error[%s] while writing a field for source[%s], rowNumber[%d], column[%s].", |
There was a problem hiding this comment.
nit: Duplicate error message as FrameFieldWriterException , can extract the format string out. It will make the updates easier.
There was a problem hiding this comment.
Now propagate the RE's message directly to the Fault class.
| .source(ffre.getSource()) | ||
| .rowNumber(ffre.getRowNumber()) | ||
| .column(ffre.getColumn()) | ||
| .errorMsg(ffre.getErrorMsg()) |
There was a problem hiding this comment.
We should also add the problematic field.
There was a problem hiding this comment.
The field value isn't available at the place the exception starts -- which is RowBasedFrameWriter#writeDataUsingFieldWriters
LakshSingla
left a comment
There was a problem hiding this comment.
LGTM. Failures are code coverage checks - but there isn't a known dataset that will trigger this. Since the changes are isolated, they shouldn't affect the current code.
Current Runtime Exceptions generated while writing frames only include the exception itself without including the name of the column they were encountered in, for e.g.
java.lang.RuntimeException: java.lang.ClassCastException: class java.lang.Long cannot be cast to class java.lang.String (java.lang.Long and java.lang.String are in module java.base of loader 'bootstrap')Start including additional details for easier debugging.
New Exception:
[InvalidFieldError](https://druid.apache.org/docs/latest/multi-stage-query/reference.html#error_InvalidFieldError): Error[class java.lang.Long cannot be cast to class java.lang.String (java.lang.Long and java.lang.String are in module java.base of loader 'bootstrap')] while writing field for source[external input source: LocalInputSource{baseDir="null", filter=null, files=[foo]}], rowNumber[1], column[arrayString].Tested the generated exception by raising it explicitly in a custom build.