@xinyual
Contributor

@xinyual xinyual commented Apr 17, 2025

Description

This PR adds JSON-related functions to align with Spark.

The table below lists each function's arguments, behavior, return type, and implementation.

| Function | Arguments | Description | Return type | Implementation |
|---|---|---|---|---|
| json | `json(value: ANY)` | Returns `value` if it is valid JSON; otherwise returns null. E.g. `json("123") = "123"`, `json(json_object("1", "2")) = {"1": "2"}`, `json(1) = null`. | Same as the type of the first argument | Custom implementation |
| json_valid | `json_valid(value: ANY)` | Returns true if `value` is valid JSON; otherwise returns false. E.g. `json_valid("123") = true`, `json_valid(json_object("1", "2")) = true`, `json_valid(1) = false`. | boolean | Calcite `SqlStdOperatorTable.IS_JSON_VALUE` |
| json_object | `json_object(key1: STRING, value1: ANY, key2: STRING, value2: ANY, ...)` | Creates an object from key-value pairs. Keys must be strings. | string | Calcite `SqlStdOperatorTable.JSON_OBJECT` |
| json_array | `json_array(value1: ANY, value2: ANY, ...)` | Creates an array from the given values. | string | Calcite `SqlStdOperatorTable.JSON_ARRAY` |
| json_array_length (aligned with Spark, since Splunk has no such function) | `json_array_length(value: STRING)` | Parses the string as a JSON array and returns its size. Returns null if the string cannot be converted to an array. | integer | Custom implementation |
| json_extract | `json_extract(target: ANY, path1: STRING, path2: STRING, ...)` | Converts `target` to JSON, then extracts values using the paths. With a single path, returns the value; with several paths, returns the list of values. If a path matches nothing, the result for that path is null. Paths use `{}` for array indexes, e.g. with `demo = {"a": [{"b": 1}, {"b": 2}]}`, `json_extract(demo, "a{0}.b") = 1`. `{}` is equivalent to `{*}`. | string for a single path; otherwise a list of strings | Calcite `JsonFunctions::json_value`, with paths modified to align with Splunk |
| json_delete | `json_delete(target: ANY, path1: STRING, path2: STRING, ...)` | Converts `target` to JSON, then deletes the values at the given paths and returns the resulting object. A path that matches nothing is skipped. | string (aligned with Splunk) | Calcite `JsonFunctions::json_remove`, with paths modified to align with Splunk |
| json_set | `json_set(target: ANY, path1: STRING, value1: ANY, path2: STRING, value2: ANY, ...)` | Converts `target` to JSON, then sets each value at its target path. A value is set only when the parent node of the target path is an object; otherwise the pair is skipped. E.g. with `demo = {"a": [{"b": 1}, {"b": 2}]}`, `json_set(demo, "a{0}.b", 3) = {"a": [{"b": 3}, {"b": 2}]}` and `json_set(demo, "a{0}.b.d", 3) = {"a": [{"b": 1}, {"b": 2}]}`. | string (aligned with Splunk) | Calcite `JsonFunctions::json_set`, with paths modified to align with Splunk |
| json_append | `json_append(target: ANY, path1: STRING, value1: ANY, path2: STRING, value2: ANY, ...)` | Converts `target` to JSON, then appends each value at its target path if the path points to an array; otherwise the pair is skipped. E.g. with `demo = {"a": [{"b": 1}, {"b": 2}]}`, `json_append(demo, "a", 3) = {"a": [{"b": 1}, {"b": 2}, 3]}` and `json_append(demo, "a{0}.b", 3) = {"a": [{"b": 1}, {"b": 2}]}`. | string (aligned with Splunk) | Calcite `JsonFunctions::json_insert`, with paths modified to align with Splunk |
| json_extend | `json_extend(target: ANY, path1: STRING, value1: ANY, path2: STRING, value2: ANY, ...)` | Converts `target` to JSON, then extends the array at each target path with the given values if the path points to an array; otherwise the pair is skipped. E.g. with `demo = {"a": [{"b": 1}, {"b": 2}]}`, `json_extend(demo, "a", json_array(1, 2)) = {"a": [{"b": 1}, {"b": 2}, 1, 2]}` and `json_extend(demo, "a{0}.b", 3) = {"a": [{"b": 1}, {"b": 2}]}`. | string (aligned with Splunk) | Calcite `JsonFunctions::json_insert`, with paths modified to align with Splunk |
| json_keys | `json_keys(target: ANY)` | Converts `target` to a JSON object and returns its keys. Returns null if `target` is not a JSON object. | string | Calcite `JsonFunctions.jsonKeys` |
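
To make the Splunk-style path convention above concrete, here is a minimal Python sketch of how a path such as `a{0}.b` might map to a JSONPath expression, and how `json_extract` behaves on the `demo` object. This is an illustrative model only, not the code in this PR (which is Java on top of Calcite). The names `splunk_path_to_jsonpath`, `_tokens`, `_walk`, and `WILDCARD` are hypothetical, and the sketch returns plain Python values rather than the string results the real functions return.

```python
import json
import re

WILDCARD = object()  # hypothetical marker for '{}' / '{*}'


def splunk_path_to_jsonpath(path):
    """Rewrite a Splunk-style path like 'a{0}.b' as JSONPath '$.a[0].b'."""
    path = path.replace("{}", "{*}")  # '{}' is shorthand for '{*}'
    path = re.sub(r"\{(\d+|\*)\}", r"[\1]", path)  # '{n}' -> '[n]'
    if not path:
        return "$"
    return "$" + path if path.startswith("[") else "$." + path


def _tokens(path):
    """Split 'a{0}.b' into ['a', 0, 'b']; '{}' and '{*}' become WILDCARD."""
    tokens = []
    for part in re.findall(r"[^.{}]+|\{[^}]*\}", path):
        if part.startswith("{"):
            inner = part[1:-1]
            tokens.append(WILDCARD if inner in ("", "*") else int(inner))
        else:
            tokens.append(part)
    return tokens


def _walk(node, tokens):
    """Return every value reachable from node by following tokens."""
    if not tokens:
        return [node]
    head, rest = tokens[0], tokens[1:]
    if head is WILDCARD:
        if isinstance(node, list):
            return [m for item in node for m in _walk(item, rest)]
        return []
    if isinstance(head, int):
        if isinstance(node, list) and 0 <= head < len(node):
            return _walk(node[head], rest)
        return []
    if isinstance(node, dict) and head in node:
        return _walk(node[head], rest)
    return []


def json_extract(target, *paths):
    """Model of json_extract: one path -> one result, several -> a list."""
    try:
        doc = json.loads(target) if isinstance(target, str) else target
    except ValueError:
        return None  # assumption: an unparsable target yields null
    results = []
    for path in paths:
        matches = _walk(doc, _tokens(path))
        if not matches:
            results.append(None)  # a path that finds nothing yields null
        elif len(matches) == 1:
            results.append(matches[0])
        else:
            results.append(matches)
    return results[0] if len(paths) == 1 else results
```

With `demo = '{"a": [{"b": 1}, {"b": 2}]}'`, calling `json_extract(demo, "a{0}.b")` in this sketch gives `1`, matching the example in the table. How the real implementation treats a wildcard path that matches exactly one element is not stated in this PR, so the single-match shortcut here is an assumption.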

Related Issues

Resolves #3565

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

xinyual added 5 commits April 17, 2025 15:02
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
@LantaoJin
Member

LantaoJin commented Apr 22, 2025

Why didn't we leverage Calcite's built-in JSON functions, for instance SqlStdOperatorTable.JSON_ARRAY?

@xinyual
Contributor Author

xinyual commented Apr 22, 2025

> Why didn't we leverage Calcite's built-in JSON functions, for instance SqlStdOperatorTable.JSON_ARRAY?

I have listed the functions in the description above and tried to reuse Calcite code.

Signed-off-by: xinyual <xinyual@amazon.com>
@LantaoJin
Member

The CI failure "CalcitePPLAppendcolTest > ... FAILED" is not related to this PR.

@xinyual xinyual force-pushed the addJsonFunctions branch from 03931e1 to 2a0bad8 Compare June 6, 2025 02:03
@xinyual
Contributor Author

xinyual commented Jun 6, 2025

Force-pushed to revert unnecessary changes and resolve DCO problems.

xinyual added 3 commits June 6, 2025 10:05
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
@penghuo penghuo merged commit 05aa651 into opensearch-project:main Jun 9, 2025
22 checks passed
penghuo pushed a commit that referenced this pull request Jun 16, 2025
* add json functions
* add new functions
* add json related function
* add all functions
* add json functions
* fix IT
* fix error message
* add json
* try field
* add json
* update to splunk for extract
* align with splunk
* fi original IT
* change array and json function
* return string
* apply spotless
* revert useless change
* apply spotless
* revert useless change
* revert useless change
* align with splunk
* remove useless change
* remove useless change
* remove useless change
* remove useless change
* remove useless change
* change extract implementation
* apply spotles
* add more IT
* remove to json string
* remove useless change
* add UT
* add some doc
* add docs
* fix doc
* add json path
* optimize doc
* optimize doc
* optimize doc
* optimize doc
* optimize doc
* optimize doc
* optimize doc
* add version and limitation to docs
* remove original implementation of V2
* revert useless change
* remove useless UTs
* Ignore useless IT
* update doc
* add type checker for json

---------

Signed-off-by: xinyual <xinyual@amazon.com>
Labels

calcite: calcite migration related
