Fix serialization in TaskReportFileWriters. #12938
For some reason, serializing a Map<String, TaskReport> would omit the "type" field. Explicitly sending each value through the ObjectMapper fixes this, because the type information does not get lost.
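A minimal sketch of the workaround described above, assuming Jackson 2.x on the classpath. `DemoReport`, `StatsReport`, and `writeEntries` are hypothetical stand-ins for illustration, not the actual Druid classes or the code in this patch. The idea is to write the enclosing JSON object by hand and pass each map value through the ObjectMapper on its own, so each value is serialized with its polymorphic type information.

```java
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.annotation.JsonTypeName;
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;

public class TypedMapWriteSketch
{
  // Hypothetical stand-in for TaskReport: a polymorphic interface whose
  // subtype name is written into a "type" property.
  @JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "type")
  public interface DemoReport
  {
  }

  // Hypothetical stand-in for a concrete report implementation.
  @JsonTypeName("stats")
  public static class StatsReport implements DemoReport
  {
    private final long rows;

    public StatsReport(long rows)
    {
      this.rows = rows;
    }

    public long getRows()
    {
      return rows;
    }
  }

  // Writes the map as a JSON object, sending each value through the
  // ObjectMapper individually instead of serializing the whole map at once.
  public static String writeEntries(ObjectMapper mapper, Map<String, ? extends DemoReport> reports)
      throws IOException
  {
    final StringWriter out = new StringWriter();
    try (JsonGenerator jg = mapper.getFactory().createGenerator(out)) {
      jg.writeStartObject();
      for (Map.Entry<String, ? extends DemoReport> entry : reports.entrySet()) {
        jg.writeFieldName(entry.getKey());
        // Each value is serialized as a standalone value at this position.
        mapper.writeValue(jg, entry.getValue());
      }
      jg.writeEndObject();
    }
    // The generator is closed (and flushed) before the buffer is read.
    return out.toString();
  }

  public static void main(String[] args) throws IOException
  {
    final ObjectMapper mapper = new ObjectMapper();
    final Map<String, DemoReport> reports = new LinkedHashMap<>();
    reports.put("report1", new StatsReport(10));

    // Whole-map serialization, which is the path reported above to omit "type".
    System.out.println(mapper.writeValueAsString(reports));
    // Entry-by-entry serialization, the workaround sketched here.
    System.out.println(writeEntries(mapper, reports));
  }
}
```

Comparing the two printed lines shows whether the "type" property survives on each path for a given Jackson version.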
|
Thanks for the fix! LGTM post CI (non-binding) |
|
Interesting. There's a similar problem, #12912, where type info is lost in the serialized JSON string; it is caused by Jackson when serializing Since we're hitting the same kind of problem here, I was wondering if we can add a serializer to Jackson to handle |
|
@FrankChen021 It does sound like the same problem. When looking into this issue (with TaskReportFileWriter) I was indeed wondering if it was a bug in Jackson, since it seems like in principle the built-in MapSerializer could do the same thing. Due to your comment I looked into this a bit more. I just tried it on the latest Jackson (2.13.3), and the behavior is still the same. I raised an issue with jackson-databind with a small test case: FasterXML/jackson-databind#3583. Hopefully a response to this issue will give us some insight into the best way to approach the problem. If it takes some time to figure out a comprehensive solution, IMO we could commit two shorter-term solutions in the meantime, so we don't have the current bugs with task reports and emitters. Then when we find a comprehensive solution we can unwind the shorter-term fixes. |
|
@gianm There's a comment on this problem from several years ago: FasterXML/jackson-databind#699 (comment). I think Jackson won't fix it, although from the user's side it really does seem like a bug. |
|
Hmm. I don't understand the comment on that issue, because it's clear (from the gist I posted) that Jackson does have all the info it needs to do the serialization properly. The writeAsEntries case works with the exact same map and no additional type hints. |
|
Thanks for the review. I'll merge this and keep an eye on the linked Jackson issue. If it turns out to be closed as not-a-bug, we could explore making our own version of MapSerializer. |
For some reason, serializing a Map<String, TaskReport> would omit the "type" field. Explicitly sending each value through the ObjectMapper fixes this, because the type information does not get lost. This eliminates the need for a hack in IngestionStatsAndErrorsTaskReport.