feat: json_merge expression and sql function #15926
| throw JsonMergeExprMacro.this.validationFailed( | ||
| "invalid input expected %s but got %s instead", | ||
| ExpressionType.STRING, | ||
| arg.type() |
The error is misleading since the issue is that the value is null. I am ok with strict validation, assuming there is a way to get past these null values somehow.
I could tolerate null in the first argument, so essentially just return the 2nd argument unmodified.
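The null-tolerant behaviour proposed here can be sketched as follows. This is a minimal illustration on plain `java.util.Map` values, not the PR's code: the class name `NullTolerantMerge` and the `merge` helper are hypothetical, and treating a null second argument as "nothing to add" is an assumption the thread does not discuss.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NullTolerantMerge {
    // Hypothetical helper: shallow-merges `update` into a copy of `base`.
    // A null `base` means there is nothing to merge into, so `update` is returned unmodified.
    static Map<String, Object> merge(Map<String, Object> base, Map<String, Object> update) {
        if (base == null) {
            return update;
        }
        Map<String, Object> result = new LinkedHashMap<>(base);
        if (update != null) { // assumption: a null update adds nothing
            result.putAll(update);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> update = new LinkedHashMap<>();
        update.put("a", 1);

        Map<String, Object> base = new LinkedHashMap<>();
        base.put("a", 0);
        base.put("b", 2);

        System.out.println(merge(null, update)); // {a=1}
        System.out.println(merge(base, update)); // {a=1, b=2}
    }
}
```

With strict validation kept for every other argument, only the first position would need this special case.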
| ); | ||
| } | ||
|
|
||
| try { |
There could be a nice optimization where any literals are parsed into a map only once (again assuming that the string-to-JSON parsing happens inside readValue).
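One way to picture the suggested optimization: parse the literal argument when the expression is built, and parse only the per-row argument at evaluation time. The sketch below is an assumption-laden illustration, not the PR's implementation. `LiteralCachingMerge`, `evalRow` and `toyParse` are hypothetical names, and `toyParse` (which reads `k=v` pairs) merely stands in for a real JSON parser such as Jackson's `readValue`.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class LiteralCachingMerge {
    private final Function<String, Map<String, Object>> parser;
    private final Map<String, Object> parsedLiteral; // parsed once, in the constructor

    LiteralCachingMerge(String literal, Function<String, Map<String, Object>> parser) {
        this.parser = parser;
        this.parsedLiteral = Collections.unmodifiableMap(parser.apply(literal));
    }

    // Per-row evaluation: only the row value is parsed; the literal is reused.
    Map<String, Object> evalRow(String rowValue) {
        Map<String, Object> result = new LinkedHashMap<>(parsedLiteral);
        result.putAll(parser.apply(rowValue));
        return result;
    }

    // Toy stand-in for a JSON parser: reads "k1=v1,k2=v2" into a map.
    static Map<String, Object> toyParse(String s) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String pair : s.split(",")) {
            String[] kv = pair.split("=", 2);
            out.put(kv[0], kv[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        AtomicInteger parseCalls = new AtomicInteger();
        LiteralCachingMerge merge = new LiteralCachingMerge("a=1", s -> {
            parseCalls.incrementAndGet();
            return toyParse(s);
        });
        System.out.println(merge.evalRow("b=2")); // {a=1, b=2}
        System.out.println(merge.evalRow("c=3")); // {a=1, c=3}
        System.out.println(parseCalls.get());     // 3: one literal parse plus one per row
    }
}
```

Because the cached literal is copied on every row, the saving is the repeated parse, not the allocation.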
| if (arg.type().is(ExprType.COMPLEX)) { | ||
| try { | ||
| return jsonMapper.writeValueAsString(unwrap(arg)); | ||
| } | ||
| catch (JsonProcessingException e) { | ||
| throw JsonMergeExprMacro.this.processingFailed(e, "bad complex input [%s]", arg.asString()); | ||
| } | ||
| } |
Is there a way to avoid the json -> string -> json hop when arg is already JSON?
There isn't a way for the second parameter, because I'm using the Jackson updater, but for the first there is. However, that leads to the first value somehow being reused on the second run, which is the reason for the test case testJsonMergeOverflow(). I'm not sure why this is happening, but essentially on the second run you get {"blah":"blahblah","blah2":"blahblah2"}.
Happy to take any pointers here. I suspect I would need to copy the object in memory to avoid the reference in which case I'm not sure how much of an optimisation it really would be.
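One possible explanation for the symptom described above, not confirmed anywhere in this thread, is plain aliasing: if the first argument is updated in place and the same object is handed back on the next run, keys from earlier runs accumulate. The sketch below reproduces that effect with `java.util.Map` instead of Jackson; `AliasingSketch`, `mergeInPlace` and `mergeWithCopy` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AliasingSketch {
    // Updates `base` in place, analogous to updating an existing object.
    static Map<String, Object> mergeInPlace(Map<String, Object> base, Map<String, Object> update) {
        base.putAll(update);
        return base;
    }

    // Copies `base` first so the caller's object is never modified (shallow copy only).
    static Map<String, Object> mergeWithCopy(Map<String, Object> base, Map<String, Object> update) {
        Map<String, Object> copy = new LinkedHashMap<>(base);
        copy.putAll(update);
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Object> run1 = new LinkedHashMap<>();
        run1.put("blah", "blahblah");
        Map<String, Object> run2 = new LinkedHashMap<>();
        run2.put("blah2", "blahblah2");

        // The same first argument is reused across runs and mutated each time.
        Map<String, Object> shared = new LinkedHashMap<>();
        mergeInPlace(shared, run1);
        System.out.println(mergeInPlace(shared, run2)); // {blah=blahblah, blah2=blahblah2}

        // Copying keeps each run independent.
        Map<String, Object> fresh = new LinkedHashMap<>();
        mergeWithCopy(fresh, run1);
        System.out.println(mergeWithCopy(fresh, run2)); // {blah2=blahblah2}
    }
}
```

The copy here is shallow. If nested objects are also updated in place, a deep copy would be needed, which is the cost the comment above is weighing against the saved serialization round trip.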
| return jsonMapper.writeValueAsString(unwrap(arg)); | ||
| } | ||
| catch (JsonProcessingException e) { | ||
| throw JsonMergeExprMacro.this.processingFailed(e, "bad complex input [%s]", arg.asString()); |
The error message should also say what is wrong with this input and which column the message corresponds to.
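A more descriptive message along these lines might look like the sketch below. Everything in it is hypothetical: the `MergeErrorMessage` class, the `describe` helper, the message wording, and in particular the assumption that the argument position and a column name are available at the point where the exception is thrown.

```java
class MergeErrorMessage {
    // Hypothetical formatter: names the argument, the column, the underlying cause and the raw input.
    static String describe(int argIndex, String columnName, String rawValue, Exception cause) {
        return String.format(
            "json_merge: argument %d (column [%s]) could not be processed: %s; input was [%s]",
            argIndex, columnName, cause.getMessage(), rawValue
        );
    }

    public static void main(String[] args) {
        Exception cause = new IllegalArgumentException("unexpected end of input");
        System.out.println(describe(1, "attrs", "{\"a\":", cause));
    }
}
```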
|
Btw, have you looked at #15521 (flattening arrays of objects using UNNEST and JSON_QUERY_ARRAY)? |
That looks great, but it's the other way I'm going. Using that example I would end up with |
|
You are right. This function has broader usage. I think when you say "arrays of objects", you don't mean arrays in a single record but rather multiple records. Since |
|
This pull request has been marked as stale due to 60 days of inactivity. |
|
This pull request/issue has been closed due to lack of activity. If you think that |
|
I've rebased this so we can revive now |
|
@lkm - are you going to revive this PR? |
|
@abhishekagarwal87 I don't think I have the permissions to do that. I've been running this in prod for a while and am happy with the functionality. |
|
You might have to open a separate PR since I can't figure out a way to re-open it myself. |
|
FWIW the PR looks good to me in the current shape so we should be able to merge your PR pretty quick. |
|
Thanks. I've created a new one here: #17081 |
Description
This adds a new json_merge expression, which takes two or more arguments and merges them into a single JSON structure. It also adds the corresponding SQL function. This is useful for transforming data in Druid; for instance, I use it for flattening arrays of objects using fold, e.g.
fold((a, acc) -> json_merge(acc, json_object(json_value(a, '$.key'), json_value(a, '$.value'))), arrayofkv, json_object())
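To make the fold example concrete, here is a rough Java analogue of what that expression computes. It is a sketch on plain maps and lists, not Druid code: `FoldMergeSketch`, `merge` and `flatten` are hypothetical names, and the merge is assumed to be shallow with later values winning, which the description does not spell out.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FoldMergeSketch {
    // Stand-in for json_merge(acc, next): shallow merge, later keys win (assumption).
    static Map<String, Object> merge(Map<String, Object> acc, Map<String, Object> next) {
        Map<String, Object> result = new LinkedHashMap<>(acc);
        result.putAll(next);
        return result;
    }

    // Stand-in for fold((a, acc) -> json_merge(acc, json_object(key, value)), arrayofkv, json_object()).
    static Map<String, Object> flatten(List<Map<String, Object>> arrayOfKv) {
        Map<String, Object> acc = new LinkedHashMap<>(); // json_object()
        for (Map<String, Object> a : arrayOfKv) {
            Map<String, Object> single = new LinkedHashMap<>();
            single.put(String.valueOf(a.get("key")), a.get("value")); // json_object($.key, $.value)
            acc = merge(acc, single);
        }
        return acc;
    }

    public static void main(String[] args) {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("key", "color");
        first.put("value", "red");
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("key", "size");
        second.put("value", "L");

        System.out.println(flatten(List.of(first, second))); // {color=red, size=L}
    }
}
```

An array of {"key": ..., "value": ...} objects thus collapses into a single object with one field per entry.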
Release note
Key changed/added classes in this PR
json_merge expression function
json_merge sql function
This PR has: