[FEATURE] Add json_object constructor function to PPL #3208

@acarbonetto

Is your feature request related to a problem?

As part of the RFC to add JSON functions, a json_object function would make it possible to construct JSON objects from scalars, multi-valued fields, nested objects, and arrays.

What solution would you like?

To consider: returning JSON objects or JSON-encoded strings.

  • Returning JSON objects reduces the number of serialize/deserialize operations in the plugin, which should slightly improve performance. Returning JSON objects is also compatible with the AWS Athena SQL language, opensearch-spark PPL, and Splunk.
  • Returning JSON-encoded strings reduces the complexity of the language and removes the need for the user to call to_json_string or cast to String in order to output fields or apply string operations.
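
The trade-off can be illustrated outside the plugin. The following Python sketch is only a model of the two options, not the plugin's implementation; `json_object_as_value` and `json_object_as_string` are hypothetical names, and a Python dict stands in for a JSON object type. It suggests why string results cost extra parsing when calls are nested: the inner result has to be parsed again before it is embedded, otherwise it ends up as an escaped string.

```python
import json

def json_object_as_value(*pairs):
    """Option 1 (hypothetical): return a structured object, modeled as a dict."""
    return dict(zip(pairs[0::2], pairs[1::2]))

def json_object_as_string(*pairs):
    """Option 2 (hypothetical): return a JSON-encoded string."""
    return json.dumps(dict(zip(pairs[0::2], pairs[1::2])), separators=(",", ":"))

# Option 1: nested calls compose directly and are serialized once, at output time.
nested = json_object_as_value("outer", json_object_as_value("inner", 123.45))
encoded_once = json.dumps(nested, separators=(",", ":"))

# Option 2: the inner result is already a string, so naive nesting
# embeds it as an escaped string instead of an object.
inner = json_object_as_string("inner", 123.45)
double_encoded = json_object_as_string("outer", inner)

# To get a real nested object, the inner string has to be parsed again first.
reparsed = json_object_as_string("outer", json.loads(inner))

print(encoded_once)
print(double_encoded)
print(reparsed)
```

In this model, option 1 and the re-parsed variant of option 2 produce the same text, but option 2 needs one extra parse per nesting level.
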

Tasks to complete:

  • Introduce the json_object(key, scalar_value [, key, scalar_value]*) -> json_object function, where scalar_value is a scalar expression.
  • Introduce the json_object(key, value [, key, value]*) -> json_object function, where value may be another json_object.
  • Introduce the json_object(key, value [, key, value]*) -> json_object function, where value can be a field expression (for multi-valued fields, only the first value is taken).
  • Introduce the json_object(key, value [, key, value]*) -> json_object function, where value can be a json_array.
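
The four tasks above could be modeled roughly as follows. This is a Python sketch of the intended semantics under stated assumptions, not the plugin code: `MultiValue` is a hypothetical stand-in for a multi-valued field, and the error handling for an odd argument count, a non-string key, and an empty multi-valued field is an assumption that the issue does not specify.

```python
import json

class MultiValue(list):
    """Hypothetical stand-in for a multi-valued field."""

def json_array(*values):
    """Model of json_array: collect the arguments into an array."""
    return list(values)

def json_object(*args):
    """Model of json_object(key, value [, key, value]*)."""
    if len(args) % 2 != 0:
        # Assumption: an unpaired key is rejected.
        raise ValueError("json_object expects key/value pairs")
    result = {}
    for key, value in zip(args[0::2], args[1::2]):
        if not isinstance(key, str):
            # Assumption: a non-string key is rejected rather than coerced.
            raise TypeError("json_object keys must be strings")
        if isinstance(value, MultiValue):
            # Multi-valued field: take the first value only.
            # Assumption: an empty multi-valued field becomes null.
            value = value[0] if value else None
        result[key] = value
    return result

tags = MultiValue(["x", "y"])
doc = json_object(
    "name", "a",                               # scalar value
    "tags", tags,                              # multi-valued field, first value only
    "all_tags", json_array(*tags),             # explicit array from the multi-value
    "nested", json_object("inner", 123.45),    # nested object
)
print(json.dumps(doc, separators=(",", ":")))
```

Passing the field directly keeps only `"x"`, while wrapping it in `json_array` keeps both values.
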

### `JSON_OBJECT`

**Description**

`json_object(<key>, <value>[, <key>, <value>]...)` returns a JSON object from key-value pairs.

**Argument type:**
- A \<key\> must be STRING.
- A \<value\> can be a scalar, another JSON object, or a JSON array. Note: scalar fields are treated as single-valued. Use `json_array` to construct an array value from a multi-valued field.

**Return type:** JSON Object

Example:

    os> source=people | eval result = json_object('key', 123.45) | fields result
    fetched rows / total rows = 1/1
    +------------------+
    | result           |
    +------------------+
    | {"key":123.45}   |
    +------------------+

    os> source=people | eval result = json_object('outer', json_object('inner', 123.45)) | fields result
    fetched rows / total rows = 1/1
    +------------------------------+
    | result                       |
    +------------------------------+
    | {"outer":{"inner":123.45}}   |
    +------------------------------+

    os> source=people | eval result = json_object('array_doc', json_array(123.45, "string", true, null)) | fields result
    fetched rows / total rows = 1/1
    +------------------------------------------------+
    | result                                         |
    +------------------------------------------------+
    | {"array_doc":[123.45, "string", true, null]}   |
    +------------------------------------------------+

What alternatives have you considered?

N/A

Do you have any additional context?

opensearch-project/opensearch-spark#780 - PR to add feature to opensearch-spark PPL

Metadata

Labels: PPL (Piped processing language), enhancement (New feature or request)

Status: Done