Skip to content

Fix serialization in TaskReportFileWriters.#12938

Merged
gianm merged 2 commits intoapache:masterfrom
gianm:fix-task-report-serialization
Aug 24, 2022
Merged

Fix serialization in TaskReportFileWriters.#12938
gianm merged 2 commits intoapache:masterfrom
gianm:fix-task-report-serialization

Conversation

@gianm
Copy link
Copy Markdown
Contributor

@gianm gianm commented Aug 22, 2022

For some reason, serializing a Map<String, TaskReport> would omit the "type" field. Explicitly sending each value through the ObjectMapper fixes this, because the type information does not get lost. This eliminates the need for a hack in IngestionStatsAndErrorsTaskReport.

For some reason, serializing a Map<String, TaskReport> would omit the
"type" field. Explicitly sending each value through the ObjectMapper
fixes this, because the type information does not get lost.
@LakshSingla
Copy link
Copy Markdown
Contributor

Thanks for the fix! LGTM post CI (non-binding)

@FrankChen021
Copy link
Copy Markdown
Member

FrankChen021 commented Aug 23, 2022

Interesting. There's another similar problem #12912 that type info is lost in the serialized json string caused by jackson when serializing Map object. In that problem, I proposed to use JsonSerializer annotation to a customized map to hide the serialization detail.

Since we experience such problem here, I was wondering if we can add a serializer to jackson to handle Map object in global so that we don't deal with such problem case by case.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Aug 24, 2022

@FrankChen021 It does sound like the same problem. When looking into this issue (with TaskReportFileWriter) I was indeed wondering if it was a bug in Jackson, since it seems like in principle the builtin MapSerializer could do the same thing.

Due to your comment I looked into this a bit more. I just tried it on latest Jackson (4.13.3), behavior is still the same. I raised an issue with jackson-databind with a small test case: FasterXML/jackson-databind#3583. Hopefully a response to this issue would give us some insight about the best way to approach the problem.

If it takes some time to figure out a comprehensive solution, IMO we could commit two shorter term solutions in the meantime, so we don't have the current bugs with task reports and emitters. Then when we find a comprehensive solution we can unwind the shorter term fixes.

@FrankChen021
Copy link
Copy Markdown
Member

@gianm There's a comment on this problem several years ago: FasterXML/jackson-databind#699 (comment)

I think jackson won't fix it although it really seems like a bug from user side.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Aug 24, 2022

Hmm. I don't understand the comment on that issue, because it's clear (from the gist I posted) that Jackson does have all the info it needs to do the serialization properly. The writeAsEntries case works with the exact same map and no additional type hints.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Aug 24, 2022

Thanks for the review. I'll merge this and keep an eye on the linked Jackson issue. If it turns out to be closed as not-a-bug, we could explore making our own version of MapSerializer.

@gianm gianm merged commit 35aaaa9 into apache:master Aug 24, 2022
@gianm gianm deleted the fix-task-report-serialization branch August 24, 2022 15:11
@abhishekagarwal87 abhishekagarwal87 added this to the 24.0.0 milestone Aug 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants