document arrays in sql#12549

Merged
techdocsmith merged 13 commits into apache:master from clintropolis:document-sql-array-stuffs
Apr 18, 2023

Conversation

@clintropolis
Member

These were intentionally not documented, but I think it's now appropriate to mention them, especially since #12078 was added.

@clintropolis
Member Author

Hmm, thinking a bit more about this, I think there are some holes in array functionality that need to be addressed by documentation (or fixed in code) before we add this to the website docs. The biggest holes are probably around filtering: using the dedicated array functions is cool, but IIRC it is not fully cool to use things like `=` or `IN` or similar, since the predicates don't handle them yet. We added an "object" predicate to `DruidPredicateFactory` to be able to handle this case (and also complex dimensions), but AFAIK we haven't actually wired it up to anything yet.

Comment thread docs/querying/sql-multivalue-string-functions.md Outdated
Comment thread docs/querying/sql-data-types.md Outdated
Comment thread docs/querying/sql-array-functions.md Outdated
Contributor

@ektravel ektravel left a comment

Added some suggestions mainly around removing passive voice and future tense.

Comment thread docs/querying/sql-array-functions.md
Comment thread docs/querying/sql-array-functions.md
Comment thread docs/querying/sql-array-functions.md Outdated
Comment thread docs/querying/sql-data-types.md
Comment thread docs/querying/sql-data-types.md
Contributor

@ektravel ektravel left a comment

Added a few more suggestions, mainly changing passive voice to active. Otherwise, LGTM from the docs perspective.

clintropolis and others added 6 commits February 9, 2023 14:12
Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>
Comment thread docs/querying/sql-data-types.md Outdated
Comment on lines 34 to 35
Druid natively supports five basic column types: "long" (64 bit signed int), "float" (32 bit float), "double" (64 bit
float) "string" (UTF-8 encoded strings and string arrays), and "complex" (catch-all for more exotic data types like
Member

Suggested change
Druid natively supports five basic column types: "long" (64 bit signed int), "float" (32 bit float), "double" (64 bit
float) "string" (UTF-8 encoded strings and string arrays), and "complex" (catch-all for more exotic data types like
Druid natively supports the following basic column types: "long" (64-bit signed int), "float" (32-bit float), "double" (64-bit
float) "string" (UTF-8 encoded strings and string arrays), "complex" (catch-all for more exotic data types like

Contributor

@ektravel ektravel Feb 24, 2023

Suggested change
Druid natively supports five basic column types: "long" (64 bit signed int), "float" (32 bit float), "double" (64 bit
float) "string" (UTF-8 encoded strings and string arrays), and "complex" (catch-all for more exotic data types like
Druid natively supports the following column types:
- "long": 64-bit signed int
- "float": 32-bit float
- "double": 64-bit float
- "string": UTF-8 encoded strings and string arrays
- "complex": non-standard data types, such as nested JSON, hyperUnique and approxHistogram aggregators, and DataSketches aggregators
- "array": array of any supported column types

Consider presenting this information as a list to improve readability.

Member

I like the list formatting for readability; don't forget to remove L36 if using this option

Comment thread docs/querying/sql-data-types.md Outdated
|BIGINT|LONG|`0`|Druid LONG columns (except `__time`) are reported as BIGINT|
|TIMESTAMP|LONG|`0`, meaning 1970-01-01 00:00:00 UTC|Druid's `__time` column is reported as TIMESTAMP. Casts between string and timestamp types assume standard SQL formatting, e.g. `2000-01-02 03:04:05`, _not_ ISO8601 formatting. For handling other formats, use one of the [time functions](sql-scalar.md#date-and-time-functions).|
|DATE|LONG|`0`, meaning 1970-01-01|Casting TIMESTAMP to DATE rounds down the timestamp to the nearest day. Casts between string and date types assume standard SQL formatting, e.g. `2000-01-02`. For handling other formats, use one of the [time functions](sql-scalar.md#date-and-time-functions).|
|ARRAY|ARRAY|Druid native array types work as SQL arrays, and multi-value strings can be converted to arrays. See the [`ARRAY` details](#arrays).|
Member

I think you're missing a column value. You have three columns but should have four. The third column should be default value, and the fourth column is notes.

Comment thread docs/querying/sql-data-types.md Outdated
Comment on lines +98 to +99
multi-value functions. You can convert multi-value dimensions to standard SQL arrays either by explicitly converting
them with `MV_TO_ARRAY`, or implicitly when used within the [array functions](./sql-array-functions.md). Arrays may
Member

Suggested change
multi-value functions. You can convert multi-value dimensions to standard SQL arrays either by explicitly converting
them with `MV_TO_ARRAY`, or implicitly when used within the [array functions](./sql-array-functions.md). Arrays may
multi-value functions. You can convert multi-value dimensions to standard SQL arrays either explicitly by converting
them with `MV_TO_ARRAY` or implicitly when used within the [array functions](./sql-array-functions.md). Arrays may

Comment thread docs/querying/sql-data-types.md Outdated
The behavior of Druid [multi-value string dimensions](multi-value-dimensions.md) varies depending on the context of their usage.

When used with standard `VARCHAR` functions, which expect a single input value per row, Druid maps the function across all values in the row.
the function across all values in the row. If the row is null or empty, the function will recieve `NULL` as its input,
Member

Suggested change
the function across all values in the row. If the row is null or empty, the function will recieve `NULL` as its input,
If the row is null or empty, the function will receive `NULL` as its input,

Contributor

Suggested change
the function across all values in the row. If the row is null or empty, the function will recieve `NULL` as its input,
the function across all values in the row. If the row is null or empty, the function receives `NULL` as its input,

Contributor

Try to avoid future tense when possible.

Comment thread docs/querying/sql-data-types.md Outdated
When converted to `ARRAY` or used with [array functions](./sql-array-functions.md), multi-value strings behave as standard SQL arrays and can no longer
be manipulated with non-array functions.

Druid serializes multi-value `VARCHAR` results as a JSON string of the array, if the value was not grouped on. If the
Member

Suggested change
Druid serializes multi-value `VARCHAR` results as a JSON string of the array, if the value was not grouped on. If the
Druid serializes multi-value `VARCHAR` results as a JSON string of the array, if grouping was not applied on the value. If the

Comment thread docs/querying/sql-array-functions.md Outdated
Comment on lines +50 to +51
|`ARRAY_OFFSET_OF(arr, expr)`|Returns the 0 based index of the first occurrence of `expr` in the array, or `-1` or `null` if `druid.generic.useDefaultValueForNull=false` if no matching elements exist in the array.|
|`ARRAY_ORDINAL_OF(arr, expr)`|Returns the 1 based index of the first occurrence of `expr` in the array, or `-1` or `null` if `druid.generic.useDefaultValueForNull=false` if no matching elements exist in the array.|
Member

Suggested change
|`ARRAY_OFFSET_OF(arr, expr)`|Returns the 0 based index of the first occurrence of `expr` in the array, or `-1` or `null` if `druid.generic.useDefaultValueForNull=false` if no matching elements exist in the array.|
|`ARRAY_ORDINAL_OF(arr, expr)`|Returns the 1 based index of the first occurrence of `expr` in the array, or `-1` or `null` if `druid.generic.useDefaultValueForNull=false` if no matching elements exist in the array.|
|`ARRAY_OFFSET_OF(arr, expr)`|Returns the 0-based index of the first occurrence of `expr` in the array. If no matching elements exist in the array, returns `-1` or `null` if `druid.generic.useDefaultValueForNull=false`.|
|`ARRAY_ORDINAL_OF(arr, expr)`|Returns the 1-based index of the first occurrence of `expr` in the array. If no matching elements exist in the array, returns `-1` or `null` if `druid.generic.useDefaultValueForNull=false`.|
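
For readers following this thread, here is a minimal Python sketch of the behavior the suggested wording describes. It is only a model of the documented semantics, not Druid's implementation. The function names and the `use_default_value_for_null` parameter (standing in for `druid.generic.useDefaultValueForNull`) are hypothetical.

```python
from typing import Any, Optional, Sequence


def array_offset_of(arr: Sequence[Any], expr: Any, *,
                    use_default_value_for_null: bool) -> Optional[int]:
    """Model of ARRAY_OFFSET_OF: 0-based index of the first match."""
    for index, element in enumerate(arr):
        if element == expr:
            return index
    # No match: -1 in default-value mode, null (None) when the flag is false.
    return -1 if use_default_value_for_null else None


def array_ordinal_of(arr: Sequence[Any], expr: Any, *,
                     use_default_value_for_null: bool) -> Optional[int]:
    """Model of ARRAY_ORDINAL_OF: 1-based index of the first match."""
    for index, element in enumerate(arr, start=1):
        if element == expr:
            return index
    # Same no-match rule as the 0-based variant.
    return -1 if use_default_value_for_null else None
```

The two functions differ only in where counting starts, which is why the suggested wording keeps the rows parallel.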

Comment thread docs/querying/sql-array-functions.md Outdated
|`ARRAY_PREPEND(expr, arr)`|Prepends `expr` to `arr`. The type of `arr` determines the resulting array type.|
|`ARRAY_APPEND(arr, expr)`|Appends `expr` to `arr`. The type of `arr` determines the resulting array type.|
|`ARRAY_CONCAT(arr1, arr2)`|Concatenates two arrays. The type of `arr1` determines the resulting array type.|
|`ARRAY_SLICE(arr, start, end)`|Returns the subarray of `arr` from the 0 based index `start` (inclusive) to `end` (exclusive). Returns `null`, if `start` is less than 0, greater than length of `arr`, or less than `end`.|
Member

Suggested change
|`ARRAY_SLICE(arr, start, end)`|Returns the subarray of `arr` from the 0 based index `start` (inclusive) to `end` (exclusive). Returns `null`, if `start` is less than 0, greater than length of `arr`, or less than `end`.|
|`ARRAY_SLICE(arr, start, end)`|Returns the subarray of `arr` from the 0-based index `start` (inclusive) to `end` (exclusive). Returns `null` if `start` is less than 0, greater than length of `arr`, or less than `end`.|

Comment thread docs/querying/sql-array-functions.md Outdated
|`ARRAY_PREPEND(expr, arr)`|Prepends `expr` to `arr`. The type of `arr` determines the resulting array type.|
|`ARRAY_APPEND(arr, expr)`|Appends `expr` to `arr`. The type of `arr` determines the resulting array type.|
|`ARRAY_CONCAT(arr1, arr2)`|Concatenates two arrays. The type of `arr1` determines the resulting array type.|
|`ARRAY_SLICE(arr, start, end)`|Returns the subarray of `arr` from the 0 based index `start` (inclusive) to `end` (exclusive). Returns `null`, if `start` is less than 0, greater than length of `arr`, or less than `end`.|
Member

Should that say if start is greater than end?
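
To make the question concrete, a small Python sketch of the reading proposed here, in which the slice is null when `start` is greater than `end`. This is an assumption about the intended wording, not a statement of what Druid does, and `array_slice` is a hypothetical helper.

```python
from typing import Any, List, Optional, Sequence


def array_slice(arr: Sequence[Any], start: int, end: int) -> Optional[List[Any]]:
    """Model of ARRAY_SLICE with a 0-based, end-exclusive range."""
    # Null (None) for the out-of-range cases in the row description,
    # reading the last condition as "start is greater than end".
    if start < 0 or start > len(arr) or start > end:
        return None
    # The quoted description does not say what happens when `end` exceeds
    # the array length; Python slicing simply truncates in that case.
    return list(arr[start:end])
```

Under the literal wording ("less than `end`"), every ordinary slice such as `start=1, end=3` would return null, which is what prompts the question.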

Comment thread docs/querying/sql-array-functions.md Outdated
|`ARRAY_ORDINAL_OF(arr, expr)`|Returns the 1 based index of the first occurrence of `expr` in the array, or `-1` or `null` if `druid.generic.useDefaultValueForNull=false` if no matching elements exist in the array.|
|`ARRAY_PREPEND(expr, arr)`|Prepends `expr` to `arr`. The type of `arr` determines the resulting array type.|
|`ARRAY_APPEND(arr, expr)`|Appends `expr` to `arr`. The type of `arr` determines the resulting array type.|
|`ARRAY_CONCAT(arr1, arr2)`|Concatenates two arrays. The type of `arr1` determines the resulting array type.|
Member

Suggests that arr1 goes before arr2, or could make more explicit

Suggested change
|`ARRAY_CONCAT(arr1, arr2)`|Concatenates two arrays. The type of `arr1` determines the resulting array type.|
|`ARRAY_CONCAT(arr1, arr2)`|Concatenates `arr2` to `arr1`. The type of `arr1` determines the resulting array type.|
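
A short Python sketch of the element ordering these three rows describe. The helper names are hypothetical, and the rule that the type of `arr` (or `arr1`) determines the result type is not modeled, since Python lists are untyped.

```python
from typing import Any, List, Sequence


def array_prepend(expr: Any, arr: Sequence[Any]) -> List[Any]:
    """Model of ARRAY_PREPEND: expr comes first, then the elements of arr."""
    return [expr, *arr]


def array_append(arr: Sequence[Any], expr: Any) -> List[Any]:
    """Model of ARRAY_APPEND: the elements of arr, then expr."""
    return [*arr, expr]


def array_concat(arr1: Sequence[Any], arr2: Sequence[Any]) -> List[Any]:
    """Model of ARRAY_CONCAT: elements of arr1 followed by elements of arr2."""
    return [*arr1, *arr2]
```

Note that the argument order differs between prepend (`expr, arr`) and append (`arr, expr`), matching the signatures in the table.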

Comment thread docs/querying/sql-array-functions.md Outdated
|`ARRAY_LENGTH(arr)`|Returns length of array expression.|
|`ARRAY_OFFSET(arr, long)`|Returns the array element at the 0 based index supplied, or null for an out of range index.|
|`ARRAY_ORDINAL(arr, long)`|Returns the array element at the 1 based index supplied, or null for an out of range index.|
|`ARRAY_CONTAINS(arr, expr)`|Returns 1 if the array contains the element specified by `expr`, or contains all elements specified by `expr` if `expr` is an array, else 0.|
Member

This description is pretty dense, trying to unpack a bit:

Suggested change
|`ARRAY_CONTAINS(arr, expr)`|Returns 1 if the array contains the element specified by `expr`, or contains all elements specified by `expr` if `expr` is an array, else 0.|
|`ARRAY_CONTAINS(arr, expr)`|If `expr` is a single element, returns 1 if `arr` contains the element. If `expr` is an array, returns 1 if `arr` contains all elements specified by `expr`. Otherwise returns 0.|
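
The two cases in the unpacked description can be sketched in Python as follows. This only models the wording above; `array_contains` is a hypothetical helper, and treating Python lists and tuples as "arrays" is an assumption of the sketch.

```python
from typing import Any, Sequence


def array_contains(arr: Sequence[Any], expr: Any) -> int:
    """Model of ARRAY_CONTAINS, returning 1 or 0 as in the table row."""
    if isinstance(expr, (list, tuple)):
        # expr is an array: every one of its elements must be present in arr.
        return 1 if all(item in arr for item in expr) else 0
    # expr is a single element: it must be present in arr.
    return 1 if expr in arr else 0
```

For example, `array_contains(["a", "b", "c"], ["a", "c"])` yields 1, while `array_contains(["a", "b", "c"], ["a", "z"])` yields 0.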

Comment thread docs/querying/sql-array-functions.md Outdated

This page describes the operations you can perform on arrays using [Druid SQL](./sql.md). See [`ARRAY` data type documentation](./sql-data-types.md#arrays) for additional details.

All 'array' references in the array function documentation can refer to multi-value string columns or `ARRAY` literals. These functions are largely
Contributor

Suggested change
All 'array' references in the array function documentation can refer to multi-value string columns or `ARRAY` literals. These functions are largely
All array references in the array function documentation can refer to multi-value string columns or `ARRAY` literals. These functions are largely

Contributor

@ektravel ektravel left a comment

Added a suggestion on how to list out the supported column types.

@vogievetsky
Contributor

it looks like you are adding a new page of functions, could you please make sure to add it here https://github.com/apache/druid/blob/master/web-console/script/create-sql-docs.js#L65 so that the console can read it?

@clintropolis clintropolis added this to the 26.0 milestone Apr 1, 2023
Contributor

@techdocsmith techdocsmith left a comment

tables LGTM
#14104 for follow-up

@techdocsmith techdocsmith merged commit f6a0888 into apache:master Apr 18, 2023
@clintropolis clintropolis deleted the document-sql-array-stuffs branch April 18, 2023 19:40
clintropolis added a commit to clintropolis/druid that referenced this pull request Apr 18, 2023
* document arrays in sql

* adjustments

* Update docs/querying/sql-array-functions.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/querying/sql-data-types.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/querying/sql-data-types.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/querying/sql-array-functions.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/querying/sql-array-functions.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update sql-array-functions.md

* fix stuff

* fix spelling

---------

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>
clintropolis added a commit that referenced this pull request Apr 19, 2023
* document arrays in sql

* adjustments

* Update docs/querying/sql-array-functions.md

* Update docs/querying/sql-data-types.md

* Update docs/querying/sql-data-types.md

* Update docs/querying/sql-array-functions.md

* Update docs/querying/sql-array-functions.md

* Update sql-array-functions.md

* fix stuff

* fix spelling

---------

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>