From 0b54fdaccfac1c378dec38534ea243dad7c0cbf5 Mon Sep 17 00:00:00 2001 From: Clint Wylie Date: Fri, 4 Aug 2023 17:50:15 -0700 Subject: [PATCH 1/6] document new filters and stuff --- docs/ingestion/ingestion-spec.md | 2 +- docs/ingestion/schema-design.md | 2 + docs/querying/filters.md | 739 ++++++++++++++++++++--------- docs/querying/sql-query-context.md | 1 + website/.spelling | 1 + website/static/css/custom.css | 12 - 6 files changed, 527 insertions(+), 230 deletions(-) diff --git a/docs/ingestion/ingestion-spec.md b/docs/ingestion/ingestion-spec.md index 43baf601bdbf..6ed2f0a838f9 100644 --- a/docs/ingestion/ingestion-spec.md +++ b/docs/ingestion/ingestion-spec.md @@ -186,7 +186,7 @@ Treat `__time` as a millisecond timestamp: the number of milliseconds since Jan ### `dimensionsSpec` The `dimensionsSpec` is located in `dataSchema` → `dimensionsSpec` and is responsible for -configuring [dimensions](./schema-model.md#dimensions). An example `dimensionsSpec` is: +configuring [dimensions](./schema-model.md#dimensions). You can either manually specify the dimensions or take advantage of schema auto-discovery where you allow Druid to infer all or some of the schema for your data. This means that you don't have to explicitly specify your dimensions and their type. diff --git a/docs/ingestion/schema-design.md b/docs/ingestion/schema-design.md index 6d385c7b60e7..e59515b465ad 100644 --- a/docs/ingestion/schema-design.md +++ b/docs/ingestion/schema-design.md @@ -261,6 +261,8 @@ native boolean types, Druid ingests these values as strings if `druid.expression the [array functions](../querying/sql-array-functions.md) or [UNNEST](../querying/sql-functions.md#unnest). Nested columns can be queried with the [JSON functions](../querying/sql-json-functions.md). +We also highly recommend setting `druid.generic.useDefaultValueForNull=false` when using these columns since it also enables out of the box `ARRAY` type filtering. 
If this is not set to `false`, setting `sqlUseBoundAndSelectors` to `false` on the [SQL query context](../querying/sql-query-context.md) can enable `ARRAY` filtering. + Mixed type columns are stored in the _least_ restrictive type that can represent all values in the column. For example: - Mixed numeric columns are `DOUBLE` diff --git a/docs/querying/filters.md b/docs/querying/filters.md index 82fdb8116886..26a21bede4ed 100644 --- a/docs/querying/filters.md +++ b/docs/querying/filters.md @@ -37,261 +37,219 @@ Apache Druid supports the following types of filters. The simplest filter is a selector filter. The selector filter will match a specific dimension with a specific value. Selector filters can be used as the base filters for more complex Boolean expressions of filters. -The grammar for a SELECTOR filter is as follows: +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "selector".| Yes | +| `dimension` | Input column or virtual column name to filter. | Yes | +| `value` | String value to match. | No, if not specified the filter will match NULL values. | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | -``` json -"filter": { "type": "selector", "dimension": , "value": } -``` - -This is the equivalent of `WHERE = ''` or `WHERE IS NULL` -(if the `value` is `null`). +The selector filter can only match against `STRING` (single and multi-valued), `LONG`, `FLOAT`, +and `DOUBLE` types. Use the newer null and equality filters to match against `ARRAY` or `COMPLEX` types. -The selector filter supports the use of extraction functions, see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. 
+When the selector filter matches against numeric inputs, the string `value` will be coerced into a numeric value on a best-effort basis. -## Column comparison filter - -The column comparison filter is similar to the selector filter, but instead compares dimensions to each other. For example: +### Example: equivalent of `WHERE someColumn = 'hello'` ``` json -"filter": { "type": "columnComparison", "dimensions": [, ] } +{ "type": "selector", "dimension": "someColumn", "value": "hello" } ``` -This is the equivalent of `WHERE = `. -`dimensions` is list of [DimensionSpecs](./dimensionspecs.md), making it possible to apply an extraction function if needed. - -## Regular expression filter - -The regular expression filter is similar to the selector filter, but using regular expressions. It matches the specified dimension with the given pattern. The pattern can be any standard [Java regular expression](http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html). +### Example: equivalent of `WHERE someColumn IS NULL` ``` json -"filter": { "type": "regex", "dimension": , "pattern": } +{ "type": "selector", "dimension": "someColumn", "value": null } ``` -The regex filter supports the use of extraction functions, see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. +## Equality filter -## Logical expression filters +The equality filter is a replacement for the selector filter with the ability to match against any type of column. It is intended to have more SQL compatible behavior than the selector filter and so cannot match NULL values; use the null filter instead. -### AND +Druid's SQL planner will by default use the equality filter instead of the selector filter whenever `druid.generic.useDefaultValueForNull=false`, or if `sqlUseBoundAndSelectors` is set to false on the [SQL query context](./sql-query-context.md). 
-The grammar for an AND filter is as follows: +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "equality".| Yes | +| `column` | Input column or virtual column name to filter. | Yes | +| `matchValueType` | String specifying the type of value to match, e.g. `STRING`, `LONG`, `DOUBLE`, `FLOAT`, `ARRAY<STRING>`, `ARRAY<LONG>` etc. The `matchValueType` indicates how Druid should interpret the `matchValue` to assist in converting to the type of the `column` being matched. | Yes | +| `matchValue` | Value to match, must not be null. | Yes | -``` json -"filter": { "type": "and", "fields": [, , ...] } +### Example: equivalent of `WHERE someColumn = 'hello'` + +```json +{ "type": "equality", "column": "someColumn", "matchValueType": "STRING", "matchValue": "hello" } ``` -The filters in fields can be any other filter defined on this page. +### Example: equivalent of `WHERE someNumericColumn = 1.23` -### OR +```json +{ "type": "equality", "column": "someNumericColumn", "matchValueType": "DOUBLE", "matchValue": 1.23 } +``` -The grammar for an OR filter is as follows: +### Example: equivalent of `WHERE someArrayColumn = ARRAY[1, 2, 3]` -``` json -"filter": { "type": "or", "fields": [, , ...] } +```json +{ "type": "equality", "column": "someArrayColumn", "matchValueType": "ARRAY<LONG>", "matchValue": [1, 2, 3] } ``` -The filters in fields can be any other filter defined on this page. -### NOT +## Null filter + +The null filter is a partial replacement for the selector filter dedicated to matching NULL values. + +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "null".| Yes | +| `column` | Input column or virtual column name to filter. 
| Yes | -The grammar for a NOT filter is as follows: +### Example: equivalent of `WHERE someColumn IS NULL` ```json -"filter": { "type": "not", "field": } +{ "type": "null", "column": "someColumn" } ``` -The filter specified at field can be any other filter defined on this page. -## JavaScript filter +## Column comparison filter -The JavaScript filter matches a dimension against the specified JavaScript function predicate. The filter matches values for which the function returns true. +The column comparison filter is similar to the selector filter, but instead compares dimensions to each other. -The function takes a single argument, the dimension value, and returns either true or false. +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "columnComparison".| Yes | +| `dimensions` | List of [`DimensionSpec`](./dimensionspecs.md) to compare. | Yes | -```json -{ - "type" : "javascript", - "dimension" : , - "function" : "function(value) { <...> }" -} -``` +`dimensions` is a list of [DimensionSpecs](./dimensionspecs.md), making it possible to apply an extraction function if needed. -**Example** -The following matches any dimension values for the dimension `name` between `'bar'` and `'foo'` +Note that the column comparison filter converts all values to strings prior to comparison, which allows differently typed input columns to match each other without casting. -```json +### Example: equivalent of `WHERE someColumn = someLongColumn` + +``` json { - "type" : "javascript", - "dimension" : "name", - "function" : "function(x) { return(x >= 'bar' && x <= 'foo') }" + "type": "columnComparison", + "dimensions": [ + "someColumn", + { + "type" : "default", + "dimension" : "someLongColumn", + "outputType": "LONG" + } + ] } ``` -The JavaScript filter supports the use of extraction functions, see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. 
-> JavaScript-based functionality is disabled by default. Please refer to the Druid [JavaScript programming guide](../development/javascript.md) for guidelines about using Druid's JavaScript functionality, including instructions on how to enable it. +## Logical expression filters -## Extraction filter +### AND -> The extraction filter is now deprecated. The selector filter with an extraction function specified -> provides identical functionality and should be used instead. +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "and".| Yes | +| `fields` | List of filter JSON objects, such as any other filter defined on this page or provided by extensions. | Yes | -Extraction filter matches a dimension using some specific [Extraction function](./dimensionspecs.md#extraction-functions). -The following filter matches the values for which the extraction function has transformation entry `input_key=output_value` where -`output_value` is equal to the filter `value` and `input_key` is present as dimension. -**Example** -The following matches dimension values in `[product_1, product_3, product_5]` for the column `product` +#### Example: equivalent of `WHERE someColumn = 'a' AND otherColumn = 1234 AND anotherColumn IS NULL` -```json +``` json { - "filter": { - "type": "extraction", - "dimension": "product", - "value": "bar_1", - "extractionFn": { - "type": "lookup", - "lookup": { - "type": "map", - "map": { - "product_1": "bar_1", - "product_5": "bar_1", - "product_3": "bar_1" - } - } - } - } + "type": "and", + "fields": [ + { "type": "equality", "column": "someColumn", "matchValue": "a", "matchValueType": "STRING" }, + { "type": "equality", "column": "otherColumn", "matchValue": 1234, "matchValueType": "LONG" }, + { "type": "null", "column": "anotherColumn" } + ] } ``` -## Search filter +### OR -Search filters can be used to filter on partial string matches. 
+| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "or".| Yes | +| `fields` | List of filter JSON objects, such as any other filter defined on this page or provided by extensions. | Yes | -```json +#### Example: equivalent of `WHERE someColumn = 'a' OR otherColumn = 1234 OR anotherColumn IS NULL` + +``` json { - "filter": { - "type": "search", - "dimension": "product", - "query": { - "type": "insensitive_contains", - "value": "foo" - } - } + "type": "or", + "fields": [ + { "type": "equality", "column": "someColumn", "matchValue": "a", "matchValueType": "STRING" }, + { "type": "equality", "column": "otherColumn", "matchValue": 1234, "matchValueType": "LONG" }, + { "type": "null", "column": "anotherColumn" } + ] } ``` -|property|description|required?| -|--------|-----------|---------| -|type|This String should always be "search".|yes| -|dimension|The dimension to perform the search over.|yes| -|query|A JSON object for the type of search. See [search query spec](#search-query-spec) for more information.|yes| -|extractionFn|[Extraction function](#filtering-with-extraction-functions) to apply to the dimension|no| - -The search filter supports the use of extraction functions, see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. 
- -### Search query spec - -#### Contains - -|property|description|required?| -|--------|-----------|---------| -|type|This String should always be "contains".|yes| -|value|A String value to run the search over.|yes| -|caseSensitive|Whether two string should be compared as case sensitive or not|no (default == false)| - -#### Insensitive Contains +### NOT -|property|description|required?| -|--------|-----------|---------| -|type|This String should always be "insensitive_contains".|yes| -|value|A String value to run the search over.|yes| +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "not".| Yes | +| `field` | Filter JSON object, such as any other filter defined on this page or provided by extensions. | Yes | -Note that an "insensitive_contains" search is equivalent to a "contains" search with "caseSensitive": false (or not -provided). +#### Example: equivalent of `WHERE someColumn IS NOT NULL` -#### Fragment +```json +{ "type": "not", "field": { "type": "null", "column": "someColumn" }} +``` -|property|description|required?| -|--------|-----------|---------| -|type|This String should always be "fragment".|yes| -|values|A JSON array of String values to run the search over.|yes| -|caseSensitive|Whether strings should be compared as case sensitive or not. Default: false(insensitive)|no| ## In filter +The in filter can match input rows against a set of values, where a match occurs if the value is contained in the set. -In filter can be used to express the following SQL query: - -```sql - SELECT COUNT(*) AS 'Count' FROM `table` WHERE `outlaw` IN ('Good', 'Bad', 'Ugly') -``` +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "in".| Yes | +| `dimension` | Input column or virtual column name to filter. | Yes | +| `values` | List of string values to match. 
| Yes | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | -The grammar for a "in" filter is as follows: - -```json -{ - "type": "in", - "dimension": "outlaw", - "values": ["Good", "Bad", "Ugly"] -} -``` - -The "in" filter supports the use of extraction functions, see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. If an empty `values` array is passed to the "in" filter, it will simply return an empty result. -If the `dimension` is a multi-valued dimension, the "in" filter will return true if one of the dimension values is -in the `values` array. - If the `values` array contains `null`, the "in" filter matches null values. This differs from the SQL IN filter, which does not match NULL values. -## Like filter - -Like filters can be used for basic wildcard searches. They are equivalent to the SQL LIKE operator. Special characters -supported are "%" (matches any number of characters) and "\_" (matches any one character). - -|property|type|description|required?| -|--------|-----------|---------|---------| -|type|String|This should always be "like".|yes| -|dimension|String|The dimension to filter on|yes| -|pattern|String|LIKE pattern, such as "foo%" or "___bar".|yes| -|escape|String|An escape character that can be used to escape special characters.|no| -|extractionFn|[Extraction function](#filtering-with-extraction-functions)| Extraction function to apply to the dimension|no| - -Like filters support the use of extraction functions, see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. - -This Like filter expresses the condition `last_name LIKE "D%"` (i.e. last_name starts with "D"). 
+### Example: equivalent of `WHERE outlaw IN ('Good', 'Bad', 'Ugly')` ```json { - "type": "like", - "dimension": "last_name", - "pattern": "D%" + "type": "in", + "dimension": "outlaw", + "values": ["Good", "Bad", "Ugly"] } ``` + ## Bound filter Bound filters can be used to filter on ranges of dimension values. It can be used for comparison filtering like greater than, less than, greater than or equal to, less than or equal to, and "between" (if both "lower" and "upper" are set). -|property|type|description|required?| -|--------|-----------|---------|---------| -|type|String|This should always be "bound".|yes| -|dimension|String|The dimension to filter on|yes| -|lower|String|The lower bound for the filter|no| -|upper|String|The upper bound for the filter|no| -|lowerStrict|Boolean|Perform strict comparison on the lower bound (">" instead of ">=")|no, default: false| -|upperStrict|Boolean|Perform strict comparison on the upper bound ("<" instead of "<=")|no, default: false| -|ordering|String|Specifies the sorting order to use when comparing values against the bound. Can be one of the following values: "lexicographic", "alphanumeric", "numeric", "strlen", "version". See [Sorting Orders](./sorting-orders.md) for more details.|no, default: "lexicographic"| -|extractionFn|[Extraction function](#filtering-with-extraction-functions)| Extraction function to apply to the dimension|no| +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "bound". | Yes | +| `dimension` | Input column or virtual column name to filter. | Yes | +| `lower` | The lower bound string match value for the filter. | No | +| `upper`| The upper bound string match value for the filter. | No | +| `lowerStrict` | Boolean indicating whether to perform strict comparison on the `lower` bound (">" instead of ">="). | No, default: `false` | +| `upperStrict` | Boolean indicating whether to perform strict comparison on the `upper` bound ("<" instead of "<="). 
| No, default: `false`| +| `ordering` | String that specifies the sorting order to use when comparing values against the bound. Can be one of the following values: `"lexicographic"`, `"alphanumeric"`, `"numeric"`, `"strlen"`, `"version"`. See [Sorting Orders](./sorting-orders.md) for more details. | No, default: `"lexicographic"`| +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | -Bound filters support the use of extraction functions, see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. +When the bound filter matches against numeric inputs, the string `lower` and `upper` bound values will be coerced into numeric values on a best-effort basis when using the `"numeric"` mode of ordering. -The following bound filter expresses the condition `21 <= age <= 31`: +The bound filter can only match against `STRING` (single and multi-valued), `LONG`, `FLOAT`, +and `DOUBLE` types. Use the newer range filter to match against `ARRAY` or `COMPLEX` types. + +Also note that the bound filter will match null values if no lower bound is specified. Use the range filter instead if SQL compatible behavior is desired. + +### Example: equivalent to `WHERE 21 <= age <= 31` ```json { @@ -303,7 +261,7 @@ The following bound filter expresses the condition `21 <= age <= 31`: } ``` -This filter expresses the condition `foo <= name <= hoo`, using the default lexicographic sorting order. +### Example: equivalent to `WHERE 'foo' <= name <= 'hoo'`, using the default lexicographic sorting order 
```json { @@ -314,7 +272,7 @@ This filter expresses the condition `foo <= name <= hoo`, using the default lexi } ``` -Using strict bounds, this filter expresses the condition `21 < age < 31` +### Example: equivalent to `WHERE 21 < age < 31` ```json { @@ -328,7 +286,7 @@ Using strict bounds, this filter expresses the condition `21 < age < 31` } ``` -The user can also specify a one-sided bound by omitting "upper" or "lower". This filter expresses `age < 31`. +### Example: equivalent to `WHERE age < 31` ```json { @@ -340,7 +298,7 @@ The user can also specify a one-sided bound by omitting "upper" or "lower". This } ``` -Likewise, this filter expresses `age >= 18` +### Example: equivalent to `WHERE age >= 18` ```json { @@ -352,18 +310,154 @@ Likewise, this filter expresses `age >= 18` ``` +## Range filter + +The range filter is a replacement for the bound filter with the ability to compare against any type of column. The range filter has more SQL compliant behavior and will never match NULL values, even if no lower bound is specified. + +Druid's SQL planner will by default use the range filter instead of the bound filter whenever `druid.generic.useDefaultValueForNull=false`, or if `sqlUseBoundAndSelectors` is set to false on the [SQL query context](./sql-query-context.md). + +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "range".| Yes | +| `column` | Input column or virtual column name to filter. | Yes | +| `matchValueType` | String specifying the type of bounds to match, e.g. `STRING`, `LONG`, `DOUBLE`, `FLOAT`, `ARRAY<STRING>`, `ARRAY<LONG>` etc. The `matchValueType` indicates how Druid should interpret the `lower` and `upper` bounds to assist in converting to the type of the `column` being matched and also defines the type of comparison to use when matching values. | Yes | +| `lower` | Lower bound value to match. | No, at least one of `lower` or `upper` must not be null. | +| `upper` | Upper bound value to match. 
| No, at least one of `lower` or `upper` must not be null. | +| `lowerOpen` | Boolean indicating if the lower bound is open in the interval of values defined by the range (">" instead of ">="). | No | +| `upperOpen` | Boolean indicating if the upper bound is open in the interval of values defined by the range ("<" instead of "<="). | No | + +### Example: equivalent to `WHERE 21 <= age <= 31` + +```json +{ + "type": "range", + "column": "age", + "matchValueType": "LONG", + "lower": 21, + "upper": 31 +} +``` + +### Example: equivalent to `WHERE 'foo' <= name <= 'hoo'`, using STRING comparison + +```json +{ + "type": "range", + "column": "name", + "matchValueType": "STRING", + "lower": "foo", + "upper": "hoo" +} +``` + +### Example: equivalent to `WHERE 21 < age < 31` + +```json +{ + "type": "range", + "column": "age", + "matchValueType": "LONG", + "lower": 21, + "lowerOpen": true, + "upper": 31, + "upperOpen": true +} +``` + +### Example: equivalent to `WHERE age < 31` + +```json +{ + "type": "range", + "column": "age", + "matchValueType": "LONG", + "upper": 31, + "upperOpen": true +} +``` + +### Example: equivalent to `WHERE age >= 18` + +```json +{ + "type": "range", + "column": "age", + "matchValueType": "LONG", + "lower": 18 +} +``` + +### Example: equivalent to `WHERE ARRAY['a','b','c'] < arrayColumn < ARRAY['d','e','f']`, using ARRAY comparison + +```json +{ + "type": "range", + "column": "arrayColumn", + "matchValueType": "ARRAY<STRING>", + "lower": ["a","b","c"], + "lowerOpen": true, + "upper": ["d","e","f"], + "upperOpen": true +} +``` + + +## Like filter + +Like filters can be used for basic wildcard searches. They are equivalent to the SQL LIKE operator. Special characters +supported are "%" (matches any number of characters) and "\_" (matches any one character). + +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "like".| Yes | +| `dimension` | Input column or virtual column name to filter. 
| Yes | +| `pattern` | String LIKE pattern, such as "foo%" or "___bar".| Yes | +| `escape`| A string escape character that can be used to escape special characters. | No | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | + +Like filters support the use of extraction functions; see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. + +### Example: equivalent of `WHERE last_name LIKE 'D%'` (last_name starts with "D") + +```json +{ + "type": "like", + "dimension": "last_name", + "pattern": "D%" +} +``` + +## Regular expression filter + +The regular expression filter is similar to the selector filter, but uses regular expressions. It matches the specified dimension with the given pattern. + +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "regex".| Yes | +| `dimension` | Input column or virtual column name to filter. | Yes | +| `pattern` | String pattern to match - any standard [Java regular expression](http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html). | Yes | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | + +Note that it is often more efficient to use a like filter instead of a regex for simple matching of prefixes. + +### Example: matches values that start with "50" + +``` json +{ "type": "regex", "dimension": "someColumn", "pattern": "^50.*" } +``` + ## Interval filter The Interval filter enables range filtering on columns that contain long millisecond values, with the boundaries specified as ISO 8601 time intervals. 
It is suitable for the `__time` column, long metric columns, and dimensions with values that can be parsed as long milliseconds. This filter converts the ISO 8601 intervals to long millisecond start/end ranges and translates to an OR of Bound filters on those millisecond ranges, with numeric comparison. The Bound filters will have left-closed and right-open matching (i.e., start <= time < end). -|property|type|description|required?| -|--------|-----------|---------|---------| -|type|String|This should always be "interval".|yes| -|dimension|String|The dimension to filter on|yes| -|intervals|Array|A JSON array containing ISO-8601 interval strings. This defines the time ranges to filter on.|yes| -|extractionFn|[Extraction function](#filtering-with-extraction-functions)| Extraction function to apply to the dimension|no| +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "interval". | Yes | +| `dimension` | Input column or virtual column name to filter. | Yes | +| `intervals` | A JSON array containing ISO-8601 interval strings. This defines the time ranges to filter on. | Yes | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | The interval filter supports the use of extraction functions, see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. @@ -410,6 +504,162 @@ The filter above is equivalent to the following OR of Bound filters: } ``` + +## True filter +The true filter is a filter which matches all values. It can be used to temporarily disable other filters without removing the filter. + +```json +{ "type" : "true" } +``` + +## False filter +The false filter matches no values. It can be used to force a query to match nothing. 
+ +```json +{ "type": "false" } +``` + + +## Search filter + +Search filters can be used to filter on partial string matches. + +```json +{ + "filter": { + "type": "search", + "dimension": "product", + "query": { + "type": "insensitive_contains", + "value": "foo" + } + } +} +``` + +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "search". | Yes | +| `dimension` | Input column or virtual column name to filter. | Yes | +| `query`| A JSON object for the type of search. See [search query spec](#search-query-spec) for more information. | Yes | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | + +The search filter supports the use of extraction functions; see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. + +### Search query spec + +#### Contains + +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "contains". | Yes | +| `value` | A String value to run the search over. | Yes | +| `caseSensitive` | Whether two strings should be compared as case sensitive or not. | No, default is false (insensitive) | + +#### Insensitive Contains + +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "insensitive_contains". | Yes | +| `value` | A String value to run the search over. | Yes | + +Note that an "insensitive_contains" search is equivalent to a "contains" search with "caseSensitive": false (or not +provided). + +#### Fragment + +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "fragment". | Yes | +| `values` | A JSON array of String values to run the search over. | Yes | +| `caseSensitive` | Whether strings should be compared as case sensitive or not. 
| No, default is false (insensitive) | + + + +## Expression filter + +The expression filter allows for the implementation of arbitrary conditions, leveraging the Druid expression system. This filter allows for complete flexibility, but it might be less performant than a combination of the other filters on this page because it cannot always take advantage of the same optimizations. + +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "expression" | Yes | +| `expression` | Expression string to evaluate into true or false. See the [Druid expression system](math-expr.md) for more details. | Yes | + +### Example: expression based matching + +```json +{ + "type" : "expression" , + "expression" : "((product_type == 42) && (!is_deleted))" +} +``` + + +## JavaScript filter + +The JavaScript filter matches a dimension against the specified JavaScript function predicate. The filter matches values for which the function returns true. + +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "javascript" | Yes | +| `dimension` | Input column or virtual column name to filter. | Yes | +| `function` | JavaScript function which accepts a single argument, the dimension value, and returns either true or false. | Yes | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | + +### Example: matching any dimension values for the dimension `name` between `'bar'` and `'foo'` + +```json +{ + "type" : "javascript", + "dimension" : "name", + "function" : "function(x) { return(x >= 'bar' && x <= 'foo') }" +} +``` + +The JavaScript filter supports the use of extraction functions; see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. 
+ +> JavaScript-based functionality is disabled by default. Please refer to the Druid [JavaScript programming guide](../development/javascript.md) for guidelines about using Druid's JavaScript functionality, including instructions on how to enable it. + + +## Extraction filter + +> The extraction filter is now deprecated. The selector filter with an extraction function specified +> provides identical functionality and should be used instead. + +Extraction filter matches a dimension using some specific [Extraction function](./dimensionspecs.md#extraction-functions). +The following filter matches the values for which the extraction function has transformation entry `input_key=output_value` where +`output_value` is equal to the filter `value` and `input_key` is present as dimension. + +| Property | Description | Required | +| -------- | ----------- | -------- | +| `type` | Must be "extraction" | Yes | +| `dimension` | Input column or virtual column name to filter. | Yes | +| `value` | String value to match. | No, if not specified the filter will match NULL values. | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | + +### Example: matching dimension values in `[product_1, product_3, product_5]` for the column `product` + +```json +{ + "filter": { + "type": "extraction", + "dimension": "product", + "value": "bar_1", + "extractionFn": { + "type": "lookup", + "lookup": { + "type": "map", + "map": { + "product_1": "bar_1", + "product_5": "bar_1", + "product_3": "bar_1" + } + } + } + } +} +``` + ## Filtering with extraction functions All filters except the "spatial" filter support extraction functions. @@ -420,9 +670,7 @@ If specified, the extraction function will be used to transform input values bef The example below shows a selector filter combined with an extraction function. 
This filter will transform input values according to the values defined in the lookup map; transformed values will then be matched with the string "bar_1". - -**Example** -The following matches dimension values in `[product_1, product_3, product_5]` for the column `product` +### Example: matches dimension values in `[product_1, product_3, product_5]` for the column `product` ```json { @@ -449,29 +697,97 @@ The following matches dimension values in `[product_1, product_3, product_5]` fo Druid supports filtering on timestamp, string, long, and float columns. -Note that only string columns have bitmap indexes. Therefore, queries that filter on other column types will need to +Note that only string columns and columns produced with the ['auto' ingestion spec](../ingestion/ingestion-spec.md#dimension-objects) also used by [type aware schema discovery](../ingestion/schema-design.md#type-aware-schema-discovery) have bitmap indexes. Therefore, queries that filter on other column types will need to scan those columns. +### Filtering on multi-value string columns + +All filters will return true if any one of the dimension values is satisfies the filter. + +#### Example: multi-value match behavior. +Given a multi-value STRING row with values `['a', 'b', 'c']`, a filter such as + +```json +{ "type": "equality", "column": "someMultiValueColumn", "matchValueType": "STRING", "matchValue": "b" } +``` +will successfully match the entire row. This can produce sometimes unintuitive behavior when coupled with the implicit UNNEST functionality of Druid [GroupBy](./groupbyquery.md) and [TopN](./topnquery.md) queries. + +Additionally, contradictory filters may be defined and perfectly legal in native queries which will not work in SQL. + +#### Example: SQL "contradiction" +This query is impossible to express as is in SQL since it is a contradiction that the SQL planner will optimize to false and match nothing. 
+ +Given a multi-value STRING row with values `['a', 'b', 'c']`, and filter such as +```json +{ + "type": "and", + "fields": [ + { + "type": "equality", + "column": "someMultiValueColumn", + "matchValueType": "STRING", + "matchValue": "a" + }, + { + "type": "equality", + "column": "someMultiValueColumn", + "matchValueType": "STRING", + "matchValue": "b" + } + ] +} +``` +will successfully match the entire row, but not match a row with value `['a', 'c']`. + +To express this filter in SQL, one would need to use [SQL multi-value string functions](./sql-multivalue-string-functions.md) such as `MV_CONTAINS`, which can be optimized by the planner to the same native filters. + ### Filtering on numeric columns -When filtering on numeric columns, you can write filters as if they were strings. In most cases, your filter will be +Some filters, such as equality and range filters allow accepting numeric match values directly since they include a secondary `matchValueType` parameter. + +When filtering on numeric columns using string based filters such as the selector, in, and bounds filters, you can write filter match values as if they were strings. In most cases, your filter will be converted into a numeric predicate and will be applied to the numeric column values directly. In some cases (such as the "regex" filter) the numeric column values will be converted to strings during the scan. 
-For example, filtering on a specific value, `myFloatColumn = 10.1`: +#### Example: filtering on a specific value, `myFloatColumn = 10.1`: + +```json +{ + "type": "equality", + "column": "myFloatColumn", + "matchValueType": "FLOAT", + "matchValue": 10.1 +} +``` + +or with a selector filter: ```json -"filter": { +{ "type": "selector", "dimension": "myFloatColumn", "value": "10.1" } ``` -Filtering on a range of values, `10 <= myFloatColumn < 20`: +#### Example: filtering on a range of values, `10 <= myFloatColumn < 20`: ```json -"filter": { +{ + "type": "range", + "column": "myFloatColumn", + "matchValueType": "FLOAT", + "lower": 10, + "lowerOpen": false, + "upper": 20, + "upperOpen": true +} +``` + +or with a bound filter: + +```json +{ "type": "bound", "dimension": "myFloatColumn", "ordering": "numeric", @@ -490,20 +806,31 @@ should be specified as if the timestamp values were strings. If the user wishes to interpret the timestamp with a specific format, timezone, or locale, the [Time Format Extraction Function](./dimensionspecs.md#time-format-extraction-function) is useful. -For example, filtering on a long timestamp value: +#### Example: filtering on a long timestamp value ```json -"filter": { +{ + "type": "equality", + "column": "__time", + "matchValueType": "LONG", + "matchValue": 124457387532 +} +``` + +or with a selector filter: + +```json +{ "type": "selector", "dimension": "__time", "value": "124457387532" } ``` -Filtering on day of week: +#### Example: filtering on day of week using an extractionFn ```json -"filter": { +{ "type": "selector", "dimension": "__time", "value": "Friday", @@ -516,7 +843,7 @@ Filtering on day of week: } ``` -Filtering on a set of ISO 8601 intervals: +#### Example: filtering on a set of ISO 8601 intervals ```json { @@ -529,25 +856,3 @@ Filtering on a set of ISO 8601 intervals: } ``` -### True filter -The true filter is a filter which matches all values.
It can be used to temporarily disable other filters without removing the filter. - -```json - -{ "type" : "true" } -``` - -### Expression filter -The expression filter allows for the implementation of arbitrary conditions, leveraging the Druid expression system. - -This filter allows for more flexibility, but it might be less performant than a combination of the other filters on this page due to the fact that not all filter optimizations are in place yet. - -```json - -{ - "type" : "expression" , - "expression" : "((product_type == 42) && (!is_deleted))" -} -``` - -See the [Druid expression system](math-expr.md) for more details. diff --git a/docs/querying/sql-query-context.md b/docs/querying/sql-query-context.md index e469fa390a7e..ec05e06e20f7 100644 --- a/docs/querying/sql-query-context.md +++ b/docs/querying/sql-query-context.md @@ -44,6 +44,7 @@ Configure Druid SQL query planning using the parameters in the table below. |`enableTimeBoundaryPlanning`|If true, SQL queries will get converted to TimeBoundary queries wherever possible. TimeBoundary queries are very efficient for min-max calculation on `__time` column in a datasource |`druid.query.default.context.enableTimeBoundaryPlanning` on the Broker (default: false)| |`useNativeQueryExplain`|If true, `EXPLAIN PLAN FOR` will return the explain plan as a JSON representation of equivalent native query(s), else it will return the original version of explain plan generated by Calcite.

This property is provided for backwards compatibility. It is not recommended to use this parameter unless you were depending on the older behavior.|`druid.sql.planner.useNativeQueryExplain` on the Broker (default: true)| |`sqlFinalizeOuterSketches`|If false (default behavior in Druid 25.0.0 and later), `DS_HLL`, `DS_THETA`, and `DS_QUANTILES_SKETCH` return sketches in query results, as documented. If true (default behavior in Druid 24.0.1 and earlier), sketches from these functions are finalized when they appear in query results.

This property is provided for backwards compatibility with behavior in Druid 24.0.1 and earlier. It is not recommended to use this parameter unless you were depending on the older behavior. Instead, use a function that does not return a sketch, such as `APPROX_COUNT_DISTINCT_DS_HLL`, `APPROX_COUNT_DISTINCT_DS_THETA`, `APPROX_QUANTILE_DS`, `DS_THETA_ESTIMATE`, or `DS_GET_QUANTILE`.|`druid.query.default.context.sqlFinalizeOuterSketches` on the Broker (default: false)| +|`sqlUseBoundAndSelectors`|If false (default behavior if `druid.generic.useDefaultValueForNull=false` in Druid 27 and later), the SQL planner will use [equality](./filters.md#equality-filter), [null](./filters.md#null-filter), and [range](./filters.md#range-filter) filters instead of [selector](./filters.md#selector-filter) and [bounds](./filters.md#bounds-filters). This value must be set to `false` for correct behavior for filtering `ARRAY` typed values. | Defaults to same value as `druid.generic.useDefaultValueForNull`. | ## Setting the query context The query context parameters can be specified as a "context" object in the [JSON API](../api-reference/sql-api.md) or as a [JDBC connection properties object](../api-reference/sql-jdbc.md). 
diff --git a/website/.spelling b/website/.spelling index 775c056dcf0b..64d426e1c96d 100644 --- a/website/.spelling +++ b/website/.spelling @@ -347,6 +347,7 @@ interruptible isAllowList jackson-jq javadoc +javascript joinable jsonCompression json_keys diff --git a/website/static/css/custom.css b/website/static/css/custom.css index c625c6826253..c73a53081425 100644 --- a/website/static/css/custom.css +++ b/website/static/css/custom.css @@ -99,15 +99,3 @@ article iframe { margin-right: auto; max-width: 100%; } -.getAPI { - color: #0073e6; - font-weight: bold; -} -.postAPI { - color: #00bf7d; - font-weight: bold; -} -.deleteAPI { - color: #f49200; - font-weight: bold; -} \ No newline at end of file From bb55f526d688dce265750d11c42bac67ffa181ae Mon Sep 17 00:00:00 2001 From: Clint Wylie Date: Fri, 4 Aug 2023 17:54:09 -0700 Subject: [PATCH 2/6] oops --- docs/querying/sql-query-context.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/querying/sql-query-context.md b/docs/querying/sql-query-context.md index ec05e06e20f7..e3f131001bfb 100644 --- a/docs/querying/sql-query-context.md +++ b/docs/querying/sql-query-context.md @@ -44,7 +44,7 @@ Configure Druid SQL query planning using the parameters in the table below. |`enableTimeBoundaryPlanning`|If true, SQL queries will get converted to TimeBoundary queries wherever possible. TimeBoundary queries are very efficient for min-max calculation on `__time` column in a datasource |`druid.query.default.context.enableTimeBoundaryPlanning` on the Broker (default: false)| |`useNativeQueryExplain`|If true, `EXPLAIN PLAN FOR` will return the explain plan as a JSON representation of equivalent native query(s), else it will return the original version of explain plan generated by Calcite.

This property is provided for backwards compatibility. It is not recommended to use this parameter unless you were depending on the older behavior.|`druid.sql.planner.useNativeQueryExplain` on the Broker (default: true)| |`sqlFinalizeOuterSketches`|If false (default behavior in Druid 25.0.0 and later), `DS_HLL`, `DS_THETA`, and `DS_QUANTILES_SKETCH` return sketches in query results, as documented. If true (default behavior in Druid 24.0.1 and earlier), sketches from these functions are finalized when they appear in query results.

This property is provided for backwards compatibility with behavior in Druid 24.0.1 and earlier. It is not recommended to use this parameter unless you were depending on the older behavior. Instead, use a function that does not return a sketch, such as `APPROX_COUNT_DISTINCT_DS_HLL`, `APPROX_COUNT_DISTINCT_DS_THETA`, `APPROX_QUANTILE_DS`, `DS_THETA_ESTIMATE`, or `DS_GET_QUANTILE`.|`druid.query.default.context.sqlFinalizeOuterSketches` on the Broker (default: false)| -|`sqlUseBoundAndSelectors`|If false (default behavior if `druid.generic.useDefaultValueForNull=false` in Druid 27 and later), the SQL planner will use [equality](./filters.md#equality-filter), [null](./filters.md#null-filter), and [range](./filters.md#range-filter) filters instead of [selector](./filters.md#selector-filter) and [bounds](./filters.md#bounds-filters). This value must be set to `false` for correct behavior for filtering `ARRAY` typed values. | Defaults to same value as `druid.generic.useDefaultValueForNull`. | +|`sqlUseBoundAndSelectors`|If false (default behavior if `druid.generic.useDefaultValueForNull=false` in Druid 27 and later), the SQL planner will use [equality](./filters.md#equality-filter), [null](./filters.md#null-filter), and [range](./filters.md#range-filter) filters instead of [selector](./filters.md#selector-filter) and [bounds](./filters.md#bound-filters). This value must be set to `false` for correct behavior for filtering `ARRAY` typed values. | Defaults to same value as `druid.generic.useDefaultValueForNull`. | ## Setting the query context The query context parameters can be specified as a "context" object in the [JSON API](../api-reference/sql-api.md) or as a [JDBC connection properties object](../api-reference/sql-jdbc.md). 
From 0506b47cd2abdf2d885586ae1c0d4bcc9ce67f1c Mon Sep 17 00:00:00 2001 From: Clint Wylie Date: Fri, 4 Aug 2023 17:57:41 -0700 Subject: [PATCH 3/6] double oops --- docs/querying/sql-query-context.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/querying/sql-query-context.md b/docs/querying/sql-query-context.md index e3f131001bfb..2ea51b9c9ef1 100644 --- a/docs/querying/sql-query-context.md +++ b/docs/querying/sql-query-context.md @@ -44,7 +44,7 @@ Configure Druid SQL query planning using the parameters in the table below. |`enableTimeBoundaryPlanning`|If true, SQL queries will get converted to TimeBoundary queries wherever possible. TimeBoundary queries are very efficient for min-max calculation on `__time` column in a datasource |`druid.query.default.context.enableTimeBoundaryPlanning` on the Broker (default: false)| |`useNativeQueryExplain`|If true, `EXPLAIN PLAN FOR` will return the explain plan as a JSON representation of equivalent native query(s), else it will return the original version of explain plan generated by Calcite.

This property is provided for backwards compatibility. It is not recommended to use this parameter unless you were depending on the older behavior.|`druid.sql.planner.useNativeQueryExplain` on the Broker (default: true)| |`sqlFinalizeOuterSketches`|If false (default behavior in Druid 25.0.0 and later), `DS_HLL`, `DS_THETA`, and `DS_QUANTILES_SKETCH` return sketches in query results, as documented. If true (default behavior in Druid 24.0.1 and earlier), sketches from these functions are finalized when they appear in query results.

This property is provided for backwards compatibility with behavior in Druid 24.0.1 and earlier. It is not recommended to use this parameter unless you were depending on the older behavior. Instead, use a function that does not return a sketch, such as `APPROX_COUNT_DISTINCT_DS_HLL`, `APPROX_COUNT_DISTINCT_DS_THETA`, `APPROX_QUANTILE_DS`, `DS_THETA_ESTIMATE`, or `DS_GET_QUANTILE`.|`druid.query.default.context.sqlFinalizeOuterSketches` on the Broker (default: false)| -|`sqlUseBoundAndSelectors`|If false (default behavior if `druid.generic.useDefaultValueForNull=false` in Druid 27 and later), the SQL planner will use [equality](./filters.md#equality-filter), [null](./filters.md#null-filter), and [range](./filters.md#range-filter) filters instead of [selector](./filters.md#selector-filter) and [bounds](./filters.md#bound-filters). This value must be set to `false` for correct behavior for filtering `ARRAY` typed values. | Defaults to same value as `druid.generic.useDefaultValueForNull`. | +|`sqlUseBoundAndSelectors`|If false (default behavior if `druid.generic.useDefaultValueForNull=false` in Druid 27 and later), the SQL planner will use [equality](./filters.md#equality-filter), [null](./filters.md#null-filter), and [range](./filters.md#range-filter) filters instead of [selector](./filters.md#selector-filter) and [bounds](./filters.md#bound-filter). This value must be set to `false` for correct behavior for filtering `ARRAY` typed values. | Defaults to same value as `druid.generic.useDefaultValueForNull`. | ## Setting the query context The query context parameters can be specified as a "context" object in the [JSON API](../api-reference/sql-api.md) or as a [JDBC connection properties object](../api-reference/sql-jdbc.md). 
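For reference, a minimal sketch of setting `sqlUseBoundAndSelectors` through the "context" object of the SQL JSON API mentioned above. The datasource name `example_datasource` and the query text are placeholders chosen for illustration and are not part of this patch; `someColumn` is borrowed from the filter examples in `filters.md`.

```json
{
  "query": "SELECT COUNT(*) FROM \"example_datasource\" WHERE \"someColumn\" = 'hello'",
  "context": {
    "sqlUseBoundAndSelectors": false
  }
}
```

With the parameter set to `false`, the table row above indicates that the planner should produce equality, null, and range filters rather than selector and bound filters for a query like this one.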
From 52a54bfa61ff4a9e1abcd7818f3b8daf87897d89 Mon Sep 17 00:00:00 2001 From: Clint Wylie Date: Mon, 7 Aug 2023 13:34:44 -0700 Subject: [PATCH 4/6] adjust for review --- docs/ingestion/schema-design.md | 2 +- docs/querying/filters.md | 89 ++++++++++++++---------------- docs/querying/sql-query-context.md | 2 +- 3 files changed, 44 insertions(+), 49 deletions(-) diff --git a/docs/ingestion/schema-design.md b/docs/ingestion/schema-design.md index e59515b465ad..974bf5956a61 100644 --- a/docs/ingestion/schema-design.md +++ b/docs/ingestion/schema-design.md @@ -261,7 +261,7 @@ native boolean types, Druid ingests these values as strings if `druid.expression the [array functions](../querying/sql-array-functions.md) or [UNNEST](../querying/sql-functions.md#unnest). Nested columns can be queried with the [JSON functions](../querying/sql-json-functions.md). -We also highly recommend setting `druid.generic.useDefaultValueForNull=false` when using these columns since it also enables out of the box `ARRAY` type filtering. If this is not set to true, setting `sqlUseBoundsAndSelectors` to `false` on the [SQL query context](../querying/sql-query-context.md) can enable `ARRAY` filtering. +We also highly recommend setting `druid.generic.useDefaultValueForNull=false` when using these columns since it also enables out of the box `ARRAY` type filtering. If not set to `false`, setting `sqlUseBoundAndSelectors` to `false` on the [SQL query context](../querying/sql-query-context.md) can enable `ARRAY` filtering instead. Mixed type columns are stored in the _least_ restrictive type that can represent all values in the column. For example: diff --git a/docs/querying/filters.md b/docs/querying/filters.md index 26a21bede4ed..65c6243ef934 100644 --- a/docs/querying/filters.md +++ b/docs/querying/filters.md @@ -35,19 +35,18 @@ Apache Druid supports the following types of filters. ## Selector filter -The simplest filter is a selector filter.
The selector filter will match a specific dimension with a specific value. Selector filters can be used as the base filters for more complex Boolean expressions of filters. +The simplest filter is a selector filter. The selector filter matches a specific dimension with a specific value. Selector filters can be used as the base filters for more complex Boolean expressions of filters. | Property | Description | Required | | -------- | ----------- | -------- | | `type` | Must be "selector".| Yes | | `dimension` | Input column or virtual column name to filter. | Yes | -| `value` | String value to match. | No, if not specified the filter will match NULL values. | +| `value` | String value to match. | No. If not specified the filter matches NULL values. | | `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | -The selector filter is limited to only being able to match against `STRING` (single and multi-valued), `LONG`, `FLOAT`, -`DOUBLE` types. Use the newer null and equality filters to match against `ARRAY` or `COMPLEX` types. +The selector filter can only match against `STRING` (single and multi-valued), `LONG`, `FLOAT`, `DOUBLE` types. Use the newer null and equality filters to match against `ARRAY` or `COMPLEX` types. -When the selector filter matches against numeric inputs, the string `value` will be best effort coerced into a numeric value. +When the selector filter matches against numeric inputs, the string `value` will be best-effort coerced into a numeric value. ### Example: equivalent of `WHERE someColumn = 'hello'` @@ -65,16 +64,16 @@ When the selector filter matches against numeric inputs, the string `value` will ## Equality Filter -The equality filter is a replacement for the selector filter with the ability to match against any type of column. 
The equality filter intends to have more SQL compatible behavior than the selector filter and so cannot match NULL values, use the null filter instead. +The equality filter is a replacement for the selector filter with the ability to match against any type of column. The equality filter is designed to have more SQL compatible behavior than the selector filter and so can not match null values. To match null values use the null filter. -Druid's SQL planner will by default use the equality filter instead of selector filter whenever `druid.generic.useDefaultValueForNull=false`, or if `sqlUseBoundAndSelectors` is set to false on the [SQL query context](./sql-query-context.md). +Druid's SQL planner uses the equality filter by default instead of selector filter whenever `druid.generic.useDefaultValueForNull=false`, or if `sqlUseBoundAndSelectors` is set to false on the [SQL query context](./sql-query-context.md). | Property | Description | Required | | -------- | ----------- | -------- | | `type` | Must be "equality".| Yes | | `column` | Input column or virtual column name to filter. | Yes | -| `matchValueType` | String specifying the type of value to match, e.g. `STRING`, `LONG`, `DOUBLE`, `FLOAT`, `ARRAY`, `ARRAY` etc. The `matchValueType` indicates how Druid should interpret the `matchValue` to assist in converting to the type of the `column` being matched. | Yes | -| `matchValue` | Value to match, must not me null. | Yes | +| `matchValueType` | String specifying the type of value to match, for example `STRING`, `LONG`, `DOUBLE`, `FLOAT`, `ARRAY`, `ARRAY`, or any other Druid type. The `matchValueType` determines how Druid interprets the `matchValue` to assist in converting to the type of the matched `column`. | Yes | +| `matchValue` | Value to match, must not be null. 
| Yes | ### Example: equivalent of `WHERE someColumn = 'hello'` @@ -97,7 +96,9 @@ Druid's SQL planner will by default use the equality filter instead of selector ## Null Filter -The null filter is a partial replacement for the selector filter dedicated to matching NULL values. +The null filter is a partial replacement for the selector filter. It is dedicated to matching NULL values. + +Druid's SQL planner uses the null filter by default instead of selector filter whenever `druid.generic.useDefaultValueForNull=false`, or if `sqlUseBoundAndSelectors` is set to false on the [SQL query context](./sql-query-context.md). | Property | Description | Required | | -------- | ----------- | -------- | @@ -113,7 +114,7 @@ The null filter is a partial replacement for the selector filter dedicated to ma ## Column comparison filter -The column comparison filter is similar to the selector filter, but instead compares dimensions to each other. For example: +The column comparison filter is similar to the selector filter, but compares dimensions to each other. For example: | Property | Description | Required | | -------- | ----------- | -------- | @@ -122,7 +123,7 @@ The column comparison filter is similar to the selector filter, but instead comp `dimensions` is list of [DimensionSpecs](./dimensionspecs.md), making it possible to apply an extraction function if needed. -Note that the column comparison filter converts all values to strings prior to comparison, which can allow differently typed input columns to still match each other without any sort of casting required. +Note that the column comparison filter converts all values to strings prior to comparison. This allows differently-typed input columns to match without a cast operation. ### Example: equivalent of `WHERE someColumn = someLongColumn`. 
@@ -242,12 +243,11 @@ greater than, less than, greater than or equal to, less than or equal to, and "b | `ordering` | String that specifies the sorting order to use when comparing values against the bound. Can be one of the following values: `"lexicographic"`, `"alphanumeric"`, `"numeric"`, `"strlen"`, `"version"`. See [Sorting Orders](./sorting-orders.md) for more details. | No, default: `"lexicographic"`| | `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | -When the bound filter matches against numeric inputs, the string `lower` and `upper` bound values will be best effort coerced into a numeric value when using the `"numeric"` mode of ordering. +When the bound filter matches against numeric inputs, the string `lower` and `upper` bound values are best-effort coerced into a numeric value when using the `"numeric"` mode of ordering. -The bound filter is limited to only being able to match against `STRING` (single and multi-valued), `LONG`, `FLOAT`, -`DOUBLE` types. Use the newer range to match against `ARRAY` or `COMPLEX` types. +The bound filter can only match against `STRING` (single and multi-valued), `LONG`, `FLOAT`, `DOUBLE` types. Use the newer range filter to match against `ARRAY` or `COMPLEX` types. -Also note that the bound filter will match null values if no lower bound is specified. Use the range filter instead if SQL compatible behavior is desired. +Note that the bound filter matches null values if you don't specify a lower bound. Use the range filter if you want SQL-compatible behavior. ### Example: equivalent to `WHERE 21 <= age <= 31`: @@ -312,17 +312,17 @@ Also note that the bound filter will match null values if no lower bound is spec ## Range filter -The range filter is a replacement for the bound filter with the ability to compare against any type of column.
The range filter has more SQL compliant behavior, and will never match NULL values, even if no lower bound is specified. +The range filter is a replacement for the bound filter. It compares against any type of column and is designed to have more SQL-compliant behavior than the bound filter. It won't match null values, even if you don't specify a lower bound. -Druid's SQL planner will by default use the range filter instead of bound filter whenever `druid.generic.useDefaultValueForNull=false`, or if `sqlUseBoundAndSelectors` is set to false on the [SQL query context](./sql-query-context.md). +Druid's SQL planner uses the range filter by default instead of bound filter whenever `druid.generic.useDefaultValueForNull=false`, or if `sqlUseBoundAndSelectors` is set to false on the [SQL query context](./sql-query-context.md). | Property | Description | Required | | -------- | ----------- | -------- | | `type` | Must be "range".| Yes | | `column` | Input column or virtual column name to filter. | Yes | -| `matchValueType` | String specifying the type of bounds to match, e.g. `STRING`, `LONG`, `DOUBLE`, `FLOAT`, `ARRAY`, `ARRAY` etc. The `matchValueType` indicates how Druid should interpret the `matchValue` to assist in converting to the type of the `column` being matched and also defines the type of comparison to use when matching values. | Yes | -| `lower` | Lower bound value to match. | No, at least one of `lower` or `upper` must not me null. | -| `upper` | Upper bound value to match. | No, at least one of `lower` or `upper` must not me null. | +| `matchValueType` | String specifying the type of bounds to match, for example `STRING`, `LONG`, `DOUBLE`, `FLOAT`, `ARRAY`, `ARRAY`, or any other Druid type. The `matchValueType` determines how Druid interprets the `matchValue` to assist in converting to the type of the matched `column` and also defines the type of comparison used when matching values. | Yes | +| `lower` | Lower bound value to match. | No.
At least one of `lower` or `upper` must not be null. | +| `upper` | Upper bound value to match. | No. At least one of `lower` or `upper` must not be null. | | `lowerOpen` | Boolean indicating if lower bound is open in the interval of values defined by the range (">" instead of ">="). | No | | `upperOpen` | Boolean indicating if upper bound is open on the interval of values defined by range ("<" instead of "<="). | No | @@ -456,7 +456,7 @@ This filter converts the ISO 8601 intervals to long millisecond start/end ranges | -------- | ----------- | -------- | | `type` | Must be "interval". | Yes | | `dimension` | Input column or virtual column name to filter. | Yes | -| `intervals` | A JSON array containing ISO-8601 interval strings. This defines the time ranges to filter on. | Yes | +| `intervals` | A JSON array containing ISO-8601 interval strings that define the time ranges to filter on. | Yes | | `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | The interval filter supports the use of extraction functions, see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. @@ -506,14 +506,14 @@ The filter above is equivalent to the following OR of Bound filters: ## True filter -The true filter is a filter which matches all values. It can be used to temporarily disable other filters without removing the filter. +A filter which matches all values. You can use it to temporarily disable other filters without removing them. ```json { "type" : "true" } ``` ## False filter -The false filter matches no values. It can be used to force a query to match nothing. +A filter which matches no values. You can use it to force a query to match no values. ```json {"type": "false" } @@ -522,7 +522,7 @@ The false filter matches no values.
It can be used to force a query to match not ## Search filter -Search filters can be used to filter on partial string matches. +You can use search filters to filter on partial string matches. ```json { @@ -544,8 +544,6 @@ Search filters can be used to filter on partial string matches. | `query`| A JSON object for the type of search. See [search query spec](#search-query-spec) for more information. | Yes | | `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | -The search filter supports the use of extraction functions, see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. - ### Search query spec #### Contains @@ -553,15 +551,15 @@ The search filter supports the use of extraction functions, see [Filtering with | Property | Description | Required | | -------- | ----------- | -------- | | `type` | Must be "contains". | Yes | -| `value` |A String value to run the search over. | Yes | -| `caseSensitive` | Whether two string should be compared as case sensitive or not. | No, default is false (insensitive) | +| `value` | A String value to search. | Yes | +| `caseSensitive` | Whether the string comparison is case-sensitive or not. | No, default is false (insensitive) | #### Insensitive Contains | Property | Description | Required | | -------- | ----------- | -------- | | `type` | Must be "insensitive_contains". | Yes | -| `value` | A String value to run the search over. | Yes | +| `value` | A String value to search. | Yes | Note that an "insensitive_contains" search is equivalent to a "contains" search with "caseSensitive": false (or not provided). @@ -571,14 +569,14 @@ provided). | Property | Description | Required | | -------- | ----------- | -------- | | `type` | Must be "fragment". | Yes | -| `values` | A JSON array of String values to run the search over. 
| Yes | -| `caseSensitive` | Whether strings should be compared as case sensitive or not. | No, default is false (insensitive) | +| `values` | A JSON array of string values to search. | Yes | +| `caseSensitive` | Whether the string comparison is case-sensitive or not. | No, default is false (insensitive) | ## Expression filter -The expression filter allows for the implementation of arbitrary conditions, leveraging the Druid expression system. This filter allows for complete flexibility, but it might be less performant than a combination of the other filters on this page due to the fact that not always being able to take advantage of the same optimizations others might benefit from. +The expression filter allows for the implementation of arbitrary conditions, leveraging the Druid expression system. This filter allows for complete flexibility, but it might be less performant than a combination of the other filters on this page because it can't always use the same optimizations available to other filters. | Property | Description | Required | | -------- | ----------- | -------- | @@ -603,7 +601,7 @@ The JavaScript filter matches a dimension against the specified JavaScript funct | -------- | ----------- | -------- | | `type` | Must be "javascript" | Yes | | `dimension` | Input column or virtual column name to filter. | Yes | -| `function` | JavaScript function which accepts a single argument, the dimension value, and returns either true or false. | Yes | +| `function` | JavaScript function which accepts the dimension value as a single argument, and returns either true or false. | Yes | | `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. 
| No | ### Example: matching any dimension values for the dimension `name` between `'bar'` and `'foo'` @@ -616,25 +614,22 @@ The JavaScript filter matches a dimension against the specified JavaScript funct } ``` -The JavaScript filter supports the use of extraction functions, see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. - -> JavaScript-based functionality is disabled by default. Please refer to the Druid [JavaScript programming guide](../development/javascript.md) for guidelines about using Druid's JavaScript functionality, including instructions on how to enable it. +> JavaScript-based functionality is disabled by default. Refer to the Druid [JavaScript programming guide](../development/javascript.md) for guidelines about using Druid's JavaScript functionality, including instructions on how to enable it. ## Extraction filter -> The extraction filter is now deprecated. The selector filter with an extraction function specified -> provides identical functionality and should be used instead. +> The extraction filter is now deprecated. Use the selector filter with an extraction function instead. -Extraction filter matches a dimension using some specific [Extraction function](./dimensionspecs.md#extraction-functions). -The following filter matches the values for which the extraction function has transformation entry `input_key=output_value` where -`output_value` is equal to the filter `value` and `input_key` is present as dimension. +Extraction filter matches a dimension using a specific [extraction function](./dimensionspecs.md#extraction-functions). +The following filter matches the values for which the extraction function has a transformation entry `input_key=output_value` where +`output_value` is equal to the filter `value` and `input_key` is present as a dimension. 
| Property | Description | Required | | -------- | ----------- | -------- | | `type` | Must be "extraction" | Yes | | `dimension` | Input column or virtual column name to filter. | Yes | -| `value` | String value to match. | No, if not specified the filter will match NULL values. | +| `value` | String value to match. | No. If not specified the filter will match NULL values. | | `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | ### Example: matching dimension values in `[product_1, product_3, product_5]` for the column `product` @@ -697,12 +692,12 @@ according to the values defined in the lookup map; transformed values will then Druid supports filtering on timestamp, string, long, and float columns. -Note that only string columns and columns produced with the ['auto' ingestion spec](../ingestion/ingestion-spec.md#dimension-objects) also used by [type aware schema discovery](../ingestion/schema-design.md#type-aware-schema-discovery) have bitmap indexes. Therefore, queries that filter on other column types will need to +Note that only string columns and columns produced with the ['auto' ingestion spec](../ingestion/ingestion-spec.md#dimension-objects) also used by [type aware schema discovery](../ingestion/schema-design.md#type-aware-schema-discovery) have bitmap indexes. Queries that filter on other column types must scan those columns. ### Filtering on multi-value string columns -All filters will return true if any one of the dimension values is satisfies the filter. +All filters return true if any one of the dimension values is satisfies the filter. #### Example: multi-value match behavior. 
Given a multi-value STRING row with values `['a', 'b', 'c']`, a filter such as @@ -739,7 +734,7 @@ Given a multi-value STRING row with values `['a', 'b', 'c']`, and filter such as ``` will successfully match the entire row, but not match a row with value `['a', 'c']`. -To express this filter in SQL, one would need to use [SQL multi-value string functions](./sql-multivalue-string-functions.md) such as `MV_CONTAINS`, which can be optimized by the planner to the same native filters. +To express this filter in SQL, use [SQL multi-value string functions](./sql-multivalue-string-functions.md) such as `MV_CONTAINS`, which can be optimized by the planner to the same native filters. ### Filtering on numeric columns @@ -804,7 +799,7 @@ Query filters can also be applied to the timestamp column. The timestamp column to the timestamp column, use the string `__time` as the dimension name. Like numeric dimensions, timestamp filters should be specified as if the timestamp values were strings. -If the user wishes to interpret the timestamp with a specific format, timezone, or locale, the [Time Format Extraction Function](./dimensionspecs.md#time-format-extraction-function) is useful. +If you want to interpret the timestamp with a specific format, timezone, or locale, the [Time Format Extraction Function](./dimensionspecs.md#time-format-extraction-function) is useful. #### Example: filtering on a long timestamp value diff --git a/docs/querying/sql-query-context.md b/docs/querying/sql-query-context.md index 2ea51b9c9ef1..2c9c5576c838 100644 --- a/docs/querying/sql-query-context.md +++ b/docs/querying/sql-query-context.md @@ -44,7 +44,7 @@ Configure Druid SQL query planning using the parameters in the table below. |`enableTimeBoundaryPlanning`|If true, SQL queries will get converted to TimeBoundary queries wherever possible. 
TimeBoundary queries are very efficient for min-max calculation on `__time` column in a datasource |`druid.query.default.context.enableTimeBoundaryPlanning` on the Broker (default: false)| |`useNativeQueryExplain`|If true, `EXPLAIN PLAN FOR` will return the explain plan as a JSON representation of equivalent native query(s), else it will return the original version of explain plan generated by Calcite.

This property is provided for backwards compatibility. It is not recommended to use this parameter unless you were depending on the older behavior.|`druid.sql.planner.useNativeQueryExplain` on the Broker (default: true)| |`sqlFinalizeOuterSketches`|If false (default behavior in Druid 25.0.0 and later), `DS_HLL`, `DS_THETA`, and `DS_QUANTILES_SKETCH` return sketches in query results, as documented. If true (default behavior in Druid 24.0.1 and earlier), sketches from these functions are finalized when they appear in query results.

This property is provided for backwards compatibility with behavior in Druid 24.0.1 and earlier. It is not recommended to use this parameter unless you were depending on the older behavior. Instead, use a function that does not return a sketch, such as `APPROX_COUNT_DISTINCT_DS_HLL`, `APPROX_COUNT_DISTINCT_DS_THETA`, `APPROX_QUANTILE_DS`, `DS_THETA_ESTIMATE`, or `DS_GET_QUANTILE`.|`druid.query.default.context.sqlFinalizeOuterSketches` on the Broker (default: false)| -|`sqlUseBoundAndSelectors`|If false (default behavior if `druid.generic.useDefaultValueForNull=false` in Druid 27 and later), the SQL planner will use [equality](./filters.md#equality-filter), [null](./filters.md#null-filter), and [range](./filters.md#range-filter) filters instead of [selector](./filters.md#selector-filter) and [bounds](./filters.md#bound-filter). This value must be set to `false` for correct behavior for filtering `ARRAY` typed values. | Defaults to same value as `druid.generic.useDefaultValueForNull`. | +|`sqlUseBoundAndSelectors`|If false (default behavior if `druid.generic.useDefaultValueForNull=false` in Druid 27.0.0 and later), the SQL planner will use [equality](./filters.md#equality-filter), [null](./filters.md#null-filter), and [range](./filters.md#range-filter) filters instead of [selector](./filters.md#selector-filter) and [bounds](./filters.md#bound-filter). This value must be set to `false` for correct behavior for filtering `ARRAY` typed values. | Defaults to same value as `druid.generic.useDefaultValueForNull`. | ## Setting the query context The query context parameters can be specified as a "context" object in the [JSON API](../api-reference/sql-api.md) or as a [JDBC connection properties object](../api-reference/sql-jdbc.md). 
From 7c49710e3253b92399654749c1dd2f919027b603 Mon Sep 17 00:00:00 2001 From: Clint Wylie Date: Tue, 8 Aug 2023 14:59:30 -0700 Subject: [PATCH 5/6] review stuff --- docs/querying/filters.md | 28 ++++++++++++++-------------- docs/querying/sql-query-context.md | 2 +- website/static/css/custom.css | 12 ++++++++++++ 3 files changed, 27 insertions(+), 15 deletions(-) diff --git a/docs/querying/filters.md b/docs/querying/filters.md index 65c6243ef934..d103aa8c3781 100644 --- a/docs/querying/filters.md +++ b/docs/querying/filters.md @@ -55,7 +55,7 @@ When the selector filter matches against numeric inputs, the string `value` will ``` -### Example: equivalent of `WHERE someColumn IS NULL`. +### Example: equivalent of `WHERE someColumn IS NULL` ``` json { "type": "selector", "dimension": "someColumn", "value": null } @@ -72,7 +72,7 @@ Druid's SQL planner uses the equality filter by default instead of selector filt | -------- | ----------- | -------- | | `type` | Must be "equality".| Yes | | `column` | Input column or virtual column name to filter. | Yes | -| `matchValueType` | String specifying the type of value to match, for example `STRING`, `LONG`, `DOUBLE`, `FLOAT`, `ARRAY`, `ARRAY`, or any other Druid type. The `matchValueType` determines how Druid interprets the `matchValue` to assist in converting to the type of the matched `column`. | Yes | +| `matchValueType` | String specifying the type of value to match. For example `STRING`, `LONG`, `DOUBLE`, `FLOAT`, `ARRAY`, `ARRAY`, or any other Druid type. The `matchValueType` determines how Druid interprets the `matchValue` to assist in converting to the type of the matched `column`. | Yes | | `matchValue` | Value to match, must not be null. | Yes | ### Example: equivalent of `WHERE someColumn = 'hello'` @@ -125,7 +125,7 @@ The column comparison filter is similar to the selector filter, but compares dim Note that the column comparison filter converts all values to strings prior to comparison. 
This allows differently-typed input columns to match without a cast operation. -### Example: equivalent of `WHERE someColumn = someLongColumn`. +### Example: equivalent of `WHERE someColumn = someLongColumn` ``` json { @@ -249,7 +249,7 @@ The bound filter can only match against `STRING` (single and multi-valued), `LON Note that the bound filter matches null values if you don't specify a lower bound. Use the range filter if SQL-compatible behavior. -### Example: equivalent to `WHERE 21 <= age <= 31`: +### Example: equivalent to `WHERE 21 <= age <= 31` ```json { @@ -261,7 +261,7 @@ Note that the bound filter matches null values if you don't specify a lower boun } ``` -### Example: equivalent to `WHERE 'foo' <= name <= 'hoo'`, using the default lexicographic sorting order. +### Example: equivalent to `WHERE 'foo' <= name <= 'hoo'`, using the default lexicographic sorting order ```json { @@ -286,7 +286,7 @@ Note that the bound filter matches null values if you don't specify a lower boun } ``` -### Example: equivalent to `WHERE age < 31`. +### Example: equivalent to `WHERE age < 31` ```json { @@ -320,13 +320,13 @@ Druid's SQL planner uses the range filter by default instead of bound filter whe | -------- | ----------- | -------- | | `type` | Must be "range".| Yes | | `column` | Input column or virtual column name to filter. | Yes | -| `matchValueType` | String specifying the type of bounds to match, for example `STRING`, `LONG`, `DOUBLE`, `FLOAT`, `ARRAY`, `ARRAY`, or any other Druid type. The `matchValueType` determines how Druid interprets the `matchValue` to assist in converting to the type of the matched `column` and also defines the type of comparison used when matching values. | Yes | +| `matchValueType` | String specifying the type of bounds to match. For example `STRING`, `LONG`, `DOUBLE`, `FLOAT`, `ARRAY`, `ARRAY`, or any other Druid type. 
The `matchValueType` determines how Druid interprets the `matchValue` to assist in converting to the type of the matched `column` and also defines the type of comparison used when matching values. | Yes | | `lower` | Lower bound value to match. | No. At least one of `lower` or `upper` must not be null. | | `upper` | Upper bound value to match. | No. At least one of `lower` or `upper` must not be null. | | `lowerOpen` | Boolean indicating if lower bound is open in the interval of values defined by the range (">" instead of ">="). | No | | `upperOpen` | Boolean indicating if upper bound is open on the interval of values defined by range ("<" instead of "<="). | No | -### Example: equivalent to `WHERE 21 <= age <= 31`: +### Example: equivalent to `WHERE 21 <= age <= 31` ```json { @@ -338,7 +338,7 @@ Druid's SQL planner uses the range filter by default instead of bound filter whe } ``` -### Example: equivalent to `WHERE 'foo' <= name <= 'hoo'`, using STRING comparison. +### Example: equivalent to `WHERE 'foo' <= name <= 'hoo'`, using STRING comparison ```json { @@ -364,7 +364,7 @@ Druid's SQL planner uses the range filter by default instead of bound filter whe } ``` -### Example: equivalent to `WHERE age < 31`. +### Example: equivalent to `WHERE age < 31` ```json { @@ -699,7 +699,7 @@ scan those columns. All filters return true if any one of the dimension values is satisfies the filter. -#### Example: multi-value match behavior. +#### Example: multi-value match behavior Given a multi-value STRING row with values `['a', 'b', 'c']`, a filter such as ```json @@ -744,7 +744,7 @@ When filtering on numeric columns using string based filters such as the selecto converted into a numeric predicate and will be applied to the numeric column values directly. In some cases (such as the "regex" filter) the numeric column values will be converted to strings during the scan. 
-#### Example: filtering on a specific value, `myFloatColumn = 10.1`: +#### Example: filtering on a specific value, `myFloatColumn = 10.1` ```json { @@ -765,7 +765,7 @@ or with a selector filter: } ``` -#### Example: filtering on a range of values, `10 <= myFloatColumn < 20`: +#### Example: filtering on a range of values, `10 <= myFloatColumn < 20` ```json { @@ -822,7 +822,7 @@ or with a selector filter: } ``` -#### Example: filtering on day of week using an extractionFn +#### Example: filtering on day of week using an extraction function ```json { diff --git a/docs/querying/sql-query-context.md b/docs/querying/sql-query-context.md index 2c9c5576c838..7798fbf34ce2 100644 --- a/docs/querying/sql-query-context.md +++ b/docs/querying/sql-query-context.md @@ -44,7 +44,7 @@ Configure Druid SQL query planning using the parameters in the table below. |`enableTimeBoundaryPlanning`|If true, SQL queries will get converted to TimeBoundary queries wherever possible. TimeBoundary queries are very efficient for min-max calculation on `__time` column in a datasource |`druid.query.default.context.enableTimeBoundaryPlanning` on the Broker (default: false)| |`useNativeQueryExplain`|If true, `EXPLAIN PLAN FOR` will return the explain plan as a JSON representation of equivalent native query(s), else it will return the original version of explain plan generated by Calcite.

This property is provided for backwards compatibility. It is not recommended to use this parameter unless you were depending on the older behavior.|`druid.sql.planner.useNativeQueryExplain` on the Broker (default: true)| |`sqlFinalizeOuterSketches`|If false (default behavior in Druid 25.0.0 and later), `DS_HLL`, `DS_THETA`, and `DS_QUANTILES_SKETCH` return sketches in query results, as documented. If true (default behavior in Druid 24.0.1 and earlier), sketches from these functions are finalized when they appear in query results.

This property is provided for backwards compatibility with behavior in Druid 24.0.1 and earlier. It is not recommended to use this parameter unless you were depending on the older behavior. Instead, use a function that does not return a sketch, such as `APPROX_COUNT_DISTINCT_DS_HLL`, `APPROX_COUNT_DISTINCT_DS_THETA`, `APPROX_QUANTILE_DS`, `DS_THETA_ESTIMATE`, or `DS_GET_QUANTILE`.|`druid.query.default.context.sqlFinalizeOuterSketches` on the Broker (default: false)| -|`sqlUseBoundAndSelectors`|If false (default behavior if `druid.generic.useDefaultValueForNull=false` in Druid 27.0.0 and later), the SQL planner will use [equality](./filters.md#equality-filter), [null](./filters.md#null-filter), and [range](./filters.md#range-filter) filters instead of [selector](./filters.md#selector-filter) and [bounds](./filters.md#bound-filter). This value must be set to `false` for correct behavior for filtering `ARRAY` typed values. | Defaults to same value as `druid.generic.useDefaultValueForNull`. | +|`sqlUseBoundAndSelectors`|If false (default behavior if `druid.generic.useDefaultValueForNull=false` in Druid 27.0.0 and later), the SQL planner will use [equality](./filters.md#equality-filter), [null](./filters.md#null-filter), and [range](./filters.md#range-filter) filters instead of [selector](./filters.md#selector-filter) and [bounds](./filters.md#bound-filter). This value must be set to `false` for correct behavior for filtering `ARRAY` typed values. | Defaults to same value as `druid.generic.useDefaultValueForNull` | ## Setting the query context The query context parameters can be specified as a "context" object in the [JSON API](../api-reference/sql-api.md) or as a [JDBC connection properties object](../api-reference/sql-jdbc.md). 
diff --git a/website/static/css/custom.css b/website/static/css/custom.css index c73a53081425..c625c6826253 100644 --- a/website/static/css/custom.css +++ b/website/static/css/custom.css @@ -99,3 +99,15 @@ article iframe { margin-right: auto; max-width: 100%; } +.getAPI { + color: #0073e6; + font-weight: bold; +} +.postAPI { + color: #00bf7d; + font-weight: bold; +} +.deleteAPI { + color: #f49200; + font-weight: bold; +} \ No newline at end of file From ce3171e40402fe06647cad177c39fa5c3a8c320c Mon Sep 17 00:00:00 2001 From: Clint Wylie Date: Tue, 8 Aug 2023 15:13:14 -0700 Subject: [PATCH 6/6] fixes --- docs/querying/filters.md | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/docs/querying/filters.md b/docs/querying/filters.md index d103aa8c3781..74ae3f406a34 100644 --- a/docs/querying/filters.md +++ b/docs/querying/filters.md @@ -42,7 +42,7 @@ The simplest filter is a selector filter. The selector filter matches a specific | `type` | Must be "selector".| Yes | | `dimension` | Input column or virtual column name to filter. | Yes | | `value` | String value to match. | No. If not specified the filter matches NULL values. | -| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [filtering with extraction functions](#filtering-with-extraction-functions) for details. | No | The selector filter can only match against `STRING` (single and multi-valued), `LONG`, `FLOAT`, `DOUBLE` types. Use the newer null and equality filters to match against `ARRAY` or `COMPLEX` types. 
@@ -207,7 +207,7 @@ The in filter can match input rows against a set of values, where a match occurs | `type` | Must be "in".| Yes | | `dimension` | Input column or virtual column name to filter. | Yes | | `values` | List of string value to match. | Yes | -| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [filtering with extraction functions](#filtering-with-extraction-functions) for details. | No | If an empty `values` array is passed to the "in" filter, it will simply return an empty result. @@ -241,7 +241,7 @@ greater than, less than, greater than or equal to, less than or equal to, and "b | `lowerStrict` | Boolean indicating whether to perform strict comparison on the `lower` bound (">" instead of ">="). | No, default: `false` | | `upperStrict` | Boolean indicating whether to perform strict comparison on the upper bound ("<" instead of "<="). | No, default: `false`| | `ordering` | String that specifies the sorting order to use when comparing values against the bound. Can be one of the following values: `"lexicographic"`, `"alphanumeric"`, `"numeric"`, `"strlen"`, `"version"`. See [Sorting Orders](./sorting-orders.md) for more details. | No, default: `"lexicographic"`| -| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [filtering with extraction functions](#filtering-with-extraction-functions) for details. 
| No | When the bound filter matches against numeric inputs, the string `lower` and `upper` bound values are best-effort coerced into a numeric value when using the `"numeric"` mode of ordering. @@ -387,7 +387,7 @@ Druid's SQL planner uses the range filter by default instead of bound filter whe } ``` -### Example: equivalent to `WHERE ARRAY['a','b','c'] < arrayColumn < ARRAY['d','e','f']`, using ARRAY comparison. +### Example: equivalent to `WHERE ARRAY['a','b','c'] < arrayColumn < ARRAY['d','e','f']`, using ARRAY comparison ```json { @@ -413,7 +413,7 @@ supported are "%" (matches any number of characters) and "\_" (matches any one c | `dimension` | Input column or virtual column name to filter. | Yes | | `pattern` | String LIKE pattern, such as "foo%" or "___bar".| Yes | | `escape`| A string escape character that can be used to escape special characters. | No | -| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [filtering with extraction functions](#filtering-with-extraction-functions) for details. | No | Like filters support the use of extraction functions, see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. @@ -436,7 +436,7 @@ The regular expression filter is similar to the selector filter, but using regul | `type` | Must be "regex".| Yes | | `dimension` | Input column or virtual column name to filter. | Yes | | `pattern` | String pattern to match - any standard [Java regular expression](http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html). | Yes | -| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. 
See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [filtering with extraction functions](#filtering-with-extraction-functions) for details. | No | Note that it is often more optimal to use a like filter instead of a regex for simple matching of prefixes. @@ -457,7 +457,7 @@ This filter converts the ISO 8601 intervals to long millisecond start/end ranges | `type` | Must be "interval". | Yes | | `dimension` | Input column or virtual column name to filter. | Yes | | `intervals` | A JSON array containing ISO-8601 interval strings that defines the time ranges to filter on. | Yes | -| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [filtering with extraction functions](#filtering-with-extraction-functions) for details. | No | The interval filter supports the use of extraction functions, see [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. @@ -542,7 +542,7 @@ You can use search filters to filter on partial string matches. | `type` | Must be "search". | Yes | | `dimension` | Input column or virtual column name to filter. | Yes | | `query`| A JSON object for the type of search. See [search query spec](#search-query-spec) for more information. | Yes | -| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. 
| No | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [filtering with extraction functions](#filtering-with-extraction-functions) for details. | No | ### Search query spec @@ -554,7 +554,7 @@ You can use search filters to filter on partial string matches. | `value` | A String value to search. | Yes | | `caseSensitive` | Whether the string comparison is case-sensitive or not. | No, default is false (insensitive) | -#### Insensitive Contains +#### Insensitive contains | Property | Description | Required | | -------- | ----------- | -------- | @@ -602,7 +602,7 @@ The JavaScript filter matches a dimension against the specified JavaScript funct | `type` | Must be "javascript" | Yes | | `dimension` | Input column or virtual column name to filter. | Yes | | `function` | JavaScript function which accepts the dimension value as a single argument, and returns either true or false. | Yes | -| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [filtering with extraction functions](#filtering-with-extraction-functions) for details. | No | ### Example: matching any dimension values for the dimension `name` between `'bar'` and `'foo'` @@ -630,7 +630,7 @@ The following filter matches the values for which the extraction function has a | `type` | Must be "extraction" | Yes | | `dimension` | Input column or virtual column name to filter. | Yes | | `value` | String value to match. | No. If not specified the filter will match NULL values. | -| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. 
See [Filtering with Extraction Functions](#filtering-with-extraction-functions) for details. | No | +| `extractionFn` | [Extraction function](./dimensionspecs.md#extraction-functions) to apply to `dimension` prior to value matching. See [filtering with extraction functions](#filtering-with-extraction-functions) for details. | No | ### Example: matching dimension values in `[product_1, product_3, product_5]` for the column `product`