add JSON_QUERY_ARRAY function to pluck ARRAY<COMPLEX<json>> out of COMPLEX<json> #15521
Merged
clintropolis merged 4 commits into apache:master on Dec 8, 2023
abhishekagarwal87 approved these changes on Dec 8, 2023
abhishekagarwal87 (Contributor) left a comment:
some minor comments. LGTM otherwise. Very useful capability.
    @Override
    public Expr apply(List<Expr> args)
    {
      if (args.get(1).isLiteral()) {
Contributor
we should add some validation on args count.
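A minimal sketch of the kind of check suggested here, written against a plain `List` rather than Druid's actual `Expr` and macro classes. The names `ArgCountCheck` and `validateArgCount` are hypothetical, and the expected count of two (an input expression and a path) is an assumption inferred only from the `args.get(1)` access in the snippet above. The idea is to fail with a descriptive message before indexing into the argument list:

```java
import java.util.List;

// Hypothetical stand-in for illustration; Druid's real macro works on List<Expr>.
final class ArgCountCheck
{
  // Throws a descriptive error unless exactly `expected` arguments were supplied.
  static void validateArgCount(String functionName, List<?> args, int expected)
  {
    if (args.size() != expected) {
      throw new IllegalArgumentException(
          String.format("Function[%s] requires %d arguments, got %d", functionName, expected, args.size())
      );
    }
  }
}
```

Calling a check like this at the top of `apply` would turn a malformed call into a clear error instead of an `IndexOutOfBoundsException` from `args.get(1)`.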
    }

    @Override
    public SqlTypeMappingRule getTypeMappingRule()
Contributor
can you add a comment here as to why this is overridden?
Member
Author
oops, didn't mean to commit, was experimenting with CAST and forgot to remove this


Description
This PR adds `JSON_QUERY_ARRAY`, which is similar to `JSON_QUERY` but, instead of returning `COMPLEX<json>` for any value extracted from a JSON path, returns `ARRAY<COMPLEX<json>>`. This is currently implemented purely with `ExpressionVirtualColumn` via a `DirectOperatorConversion`, rather than with the specialized `NestedFieldVirtualColumn` used by `JSON_VALUE` and `JSON_QUERY`. That is mostly because there isn't a lot of room for optimization yet, and I would rather wait to see whether we introduce specialized array column selectors in the future than try to extend the existing selectors of this virtual column to also handle arrays of objects.

Similar to other array handling, values which are not arrays are coerced into single-element arrays, though I am open to discussion on this, since it would seem equally valid to handle them as null values.
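To make the coercion rule concrete, here is a minimal sketch that models it on plain Java values. It is not the PR's implementation and does not use Druid's expression types; `ArrayCoercionSketch` and `coerceToArray` are hypothetical names, and the handling of `null` is an assumption, since the description does not say what happens when the path matches nothing.

```java
import java.util.Collections;
import java.util.List;

// Illustrative model only; not Druid's actual expression or column types.
final class ArrayCoercionSketch
{
  static List<?> coerceToArray(Object extracted)
  {
    if (extracted == null) {
      return null; // assumption: a missing value stays null (not stated in the PR)
    }
    if (extracted instanceof List) {
      return (List<?>) extracted; // already an array: returned unchanged
    }
    return Collections.singletonList(extracted); // non-array value becomes a single-element array
  }
}
```

Under the alternative raised above, the last branch would return `null` instead of a single-element list.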
This allows for a lot of useful things, such as using `UNNEST` on arrays of objects to transform an array of JSON objects into rows of JSON objects.

For example, take some data sourced from a discussion in a community Slack thread, which has top-level arrays of objects (this would also work with nested arrays of objects at some path). We can use `JSON_QUERY_ARRAY` to translate it into a separate row per object, and further use `JSON_VALUE` to extract values from these objects and group or aggregate on them.

Will add docs in a follow-up PR.
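The two steps described above (one row per object, then grouping on an extracted value) can be modeled in plain Java as a rough analogy. This sketch does not run against Druid and is not the SQL from the PR; the class name, method names, and the idea of representing each JSON object as a `Map` are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java analogy of UNNEST over arrays of objects followed by a grouped count.
final class UnnestSketch
{
  // One output element per object, analogous to unnesting an array of JSON objects into rows.
  static List<Map<String, Object>> unnest(List<List<Map<String, Object>>> rows)
  {
    return rows.stream().flatMap(List::stream).collect(Collectors.toList());
  }

  // Counts objects per value of `field`, analogous to grouping on an extracted value.
  // A missing field is grouped under the string "null".
  static Map<String, Long> countBy(List<Map<String, Object>> objects, String field)
  {
    return objects.stream().collect(
        Collectors.groupingBy(o -> String.valueOf(o.get(field)), TreeMap::new, Collectors.counting())
    );
  }
}
```

With hypothetical input of two rows holding three objects in total, `unnest` yields three elements, and `countBy` on a shared field yields one count per distinct value.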
Release note
Added `JSON_QUERY_ARRAY`, which is similar to `JSON_QUERY` except the return type is always `ARRAY<COMPLEX<json>>` instead of `COMPLEX<json>`. Essentially, this function allows extracting arrays of objects from nested data and performing operations such as `UNNEST`, `ARRAY_LENGTH`, `ARRAY_SLICE`, or any other available ARRAY operations.

This PR has: