From ccbf3abee9a341d8f5c80f37e7506088f10252bf Mon Sep 17 00:00:00 2001 From: Jill Osborne Date: Thu, 10 Nov 2022 15:13:32 +0000 Subject: [PATCH 1/8] Update and document experimental features --- docs/configuration/index.md | 2 +- docs/development/experimental-features.md | 109 ++++++++++++++++++ .../extensions-core/druid-lookups.md | 3 - .../kafka-supervisor-reference.md | 2 - .../extensions-core/kinesis-ingestion.md | 2 - docs/querying/lookups.md | 4 +- 6 files changed, 112 insertions(+), 10 deletions(-) create mode 100644 docs/development/experimental-features.md diff --git a/docs/configuration/index.md b/docs/configuration/index.md index 6fb3201d98f4..992b5f82da21 100644 --- a/docs/configuration/index.md +++ b/docs/configuration/index.md @@ -2188,7 +2188,7 @@ Supported query contexts: |Key|Description|Default| |---|-----------|-------| |`druid.expressions.useStrictBooleans`|Controls the behavior of Druid boolean operators and functions, if set to `true` all boolean values will be either a `1` or `0`. See [expression documentation](../misc/math-expr.md#logical-operator-modes)|false| -|`druid.expressions.allowNestedArrays`|If enabled, Druid array expressions can create nested arrays. This is experimental and should be used with caution.|false| +|`druid.expressions.allowNestedArrays`|If enabled, Druid array expressions can create nested arrays.|false| ### Router #### Router Process Configs diff --git a/docs/development/experimental-features.md b/docs/development/experimental-features.md new file mode 100644 index 000000000000..bdbec47b9365 --- /dev/null +++ b/docs/development/experimental-features.md @@ -0,0 +1,109 @@ +--- +id: experimental-features +title: "Experimental features" +--- + + + +The following features are marked [experimental](./experimental.md) in the Druid docs. + +This document includes each page that mentions an experimental feature. To graduate a feature, remove all mentions of its experimental status on all relevant pages. 
+ +## SQL-based ingestion + +[SQL-based ingestion](../multi-stage-query/index.md) + +- As an experimental feature, the task engine also supports running SELECT queries as batch tasks. + +[SQL-based ingestion concepts](../multi-stage-query/concepts.md) + +- As an experimental feature, the MSQ task engine also supports running SELECT queries as batch tasks. The behavior and result format of plain SELECT (without INSERT or REPLACE) is subject to change. + +[SQL-based ingestion and multi-stage query task API](../multi-stage-query/api.md) + +- As an experimental feature, the `/druid/v2/sql/task/` endpoint also accepts SELECT queries. +- As an experimental feature, the MSQ task engine supports running SELECT queries. + +## Nested columns + +[Nested columns](../querying/nested-columns.md) + +- Nested columns is an experimental feature available starting in Apache Druid 24.0. + +## Indexer process + +[Indexer process](../design/indexer.md) + +- The Indexer is an optional and experimental feature. Its memory management system is still under development and will be significantly enhanced in later releases. + +[Configuration reference](../configuration/index.md) + +- Data Server: Configuration options for the experimental Indexer process are also provided here. + +[Processes and servers](../design/processes.md) + +- Indexer process (optional): The Indexer is designed to be easier to configure and deploy compared to the MiddleManager + Peon system and to better enable resource sharing across tasks. The Indexer is a newer feature and is currently designated experimental due to the fact that its memory management system is still under development. It will continue to mature in future versions of Druid. + +## Kubernetes + +[Kubernetes](../development/extensions-core/kubernetes.md) + +- Consider this an EXPERIMENTAL feature mostly because it has not been tested yet on a wide variety of long running Druid clusters. 
+ +## Segment locking + +[Configuration reference](../configuration/index.md) + +- Overlord operations property `druid.indexer.tasklock.forceTimeChunkLock` +
Setting this to false is still experimental. If set, all tasks are enforced to use time chunk lock. If not set, each task automatically chooses a lock type to use. This configuration can be overwritten by setting `forceTimeChunkLock` in the task context. + +[Task reference](../ingestion/tasks.md) + +- Locking: The segment locking is still experimental. It could have unknown bugs which potentially lead to incorrect query results. +- Context parameter `forceTimeChunkLock`: Setting this to false is still experimental. + +[Design](../design/architecture.md) + +- Druid also supports an experimental segment locking mode that is activated by setting `forceTimeChunkLock` to false in the context of an ingestion task. + +## Materialized view + +[Materialized view](../development/extensions-contrib/materialized-view.md) + +- Note that Materialized View is currently designated as experimental. Please make sure the time of all processes are the same and increase monotonically. Otherwise, some unexpected errors may happen on query results. + +## Moments sketch + +[Aggregations](../querying/aggregations.md) + +- Moments Sketch (Experimental): The Moments Sketch extension-provided aggregator is an experimental aggregator that provides quantile estimates using the Moments Sketch. +- The Moments Sketch aggregator is provided as an experimental option. It is optimized for merging speed and it can have higher aggregation performance compared to the DataSketches quantiles aggregator. However, the accuracy of the Moments Sketch is distribution-dependent, so users will need to empirically verify that the aggregator is suitable for their input data. +- As a general guideline for experimentation, the Moments Sketch paper points out that this algorithm works better on inputs with high entropy. In particular, the algorithm is not a good fit when the input data consists of a small number of clustered discrete values. 
+ +## Other configuration properties + +[Configuration reference](../configuration/index.md) + +- Task runner `httpRemote` +
Overlord operations property `druid.indexer.runner.type`: Choices "local" or "remote". Indicates whether tasks should be run locally or in a distributed environment. Experimental task runner "httpRemote" is also available which is same as "remote" but uses HTTP to interact with Middle Managers instead of Zookeeper.

+- `CLOSED_SEGMENTS_SINKS` mode +
Peon config `druid.indexer.task.batchProcessingMode`: `CLOSED_SEGMENTS_SINKS` mode isn't as well tested as other modes so is currently considered experimental.

+- Expression processing configuration `druid.expressions.allowNestedArrays` +
If enabled, Druid array expressions can create nested arrays. This is experimental and should be used with caution. diff --git a/docs/development/extensions-core/druid-lookups.md b/docs/development/extensions-core/druid-lookups.md index b44f9620bd0a..5b19508c2375 100644 --- a/docs/development/extensions-core/druid-lookups.md +++ b/docs/development/extensions-core/druid-lookups.md @@ -22,9 +22,6 @@ title: "Cached Lookup Module" ~ under the License. --> - -> Please note that this is an experimental module and the development/testing still at early stage. Feel free to try it and give us your feedback. - ## Description This Apache Druid module provides a per-lookup caching mechanism for JDBC data sources. The main goal of this cache is to speed up the access to a high latency lookup sources and to provide a caching isolation for every lookup source. diff --git a/docs/development/extensions-core/kafka-supervisor-reference.md b/docs/development/extensions-core/kafka-supervisor-reference.md index 6f94f90ffc71..81ba59be93be 100644 --- a/docs/development/extensions-core/kafka-supervisor-reference.md +++ b/docs/development/extensions-core/kafka-supervisor-reference.md @@ -56,8 +56,6 @@ This topic contains configuration reference information for the Apache Kafka sup ## Task Autoscaler Properties -> Note that Task AutoScaler is currently designated as experimental. - | Property | Description | Required | | ------------- | ------------- | ------------- | | `enableTaskAutoScaler` | Enable or disable autoscaling. 
`false` or blank disables the `autoScaler` even when `autoScalerConfig` is not null| no (default == false) | diff --git a/docs/development/extensions-core/kinesis-ingestion.md b/docs/development/extensions-core/kinesis-ingestion.md index 916ea911d3c6..c0e5d5c9d64c 100644 --- a/docs/development/extensions-core/kinesis-ingestion.md +++ b/docs/development/extensions-core/kinesis-ingestion.md @@ -149,8 +149,6 @@ Where the file `supervisor-spec.json` contains a Kinesis supervisor spec: #### Task Autoscaler Properties -> Note that Task AutoScaler is currently designated as experimental. - | Property | Description | Required | | ------------- | ------------- | ------------- | | `enableTaskAutoScaler` | Enable or disable the auto scaler. When false or absent, Druid disables the `autoScaler` even when `autoScalerConfig` is not null.| no (default == false) | diff --git a/docs/querying/lookups.md b/docs/querying/lookups.md index 57b406afb199..860a3ed2e277 100644 --- a/docs/querying/lookups.md +++ b/docs/querying/lookups.md @@ -115,8 +115,8 @@ will not detect this automatically. Dynamic Configuration --------------------- -> Dynamic lookup configuration is an [experimental](../development/experimental.md) feature. Static -> configuration is no longer supported. +> Static configuration is no longer supported. + The following documents the behavior of the cluster-wide config which is accessible through the Coordinator. The configuration is propagated through the concept of "tier" of servers. A "tier" is defined as a group of services which should receive a set of lookups. 
From d7b8fae15fc389ddbac1a57121ff965adb63f2af Mon Sep 17 00:00:00 2001 From: Jill Osborne Date: Thu, 17 Nov 2022 13:02:01 +0000 Subject: [PATCH 2/8] Updated --- docs/development/experimental-features.md | 75 +++++------------------ 1 file changed, 17 insertions(+), 58 deletions(-) diff --git a/docs/development/experimental-features.md b/docs/development/experimental-features.md index bdbec47b9365..708629886b33 100644 --- a/docs/development/experimental-features.md +++ b/docs/development/experimental-features.md @@ -28,82 +28,41 @@ This document includes each page that mentions an experimental feature. To gradu ## SQL-based ingestion -[SQL-based ingestion](../multi-stage-query/index.md) - -- As an experimental feature, the task engine also supports running SELECT queries as batch tasks. - -[SQL-based ingestion concepts](../multi-stage-query/concepts.md) - -- As an experimental feature, the MSQ task engine also supports running SELECT queries as batch tasks. The behavior and result format of plain SELECT (without INSERT or REPLACE) is subject to change. - -[SQL-based ingestion and multi-stage query task API](../multi-stage-query/api.md) - -- As an experimental feature, the `/druid/v2/sql/task/` endpoint also accepts SELECT queries. -- As an experimental feature, the MSQ task engine supports running SELECT queries. +- [SQL-based ingestion](../multi-stage-query/index.md) +- [SQL-based ingestion concepts](../multi-stage-query/concepts.md) +- [SQL-based ingestion and multi-stage query task API](../multi-stage-query/api.md) ## Nested columns -[Nested columns](../querying/nested-columns.md) - -- Nested columns is an experimental feature available starting in Apache Druid 24.0. +- [Nested columns](../querying/nested-columns.md) ## Indexer process -[Indexer process](../design/indexer.md) - -- The Indexer is an optional and experimental feature. Its memory management system is still under development and will be significantly enhanced in later releases. 
- -[Configuration reference](../configuration/index.md) - -- Data Server: Configuration options for the experimental Indexer process are also provided here. - -[Processes and servers](../design/processes.md) - -- Indexer process (optional): The Indexer is designed to be easier to configure and deploy compared to the MiddleManager + Peon system and to better enable resource sharing across tasks. The Indexer is a newer feature and is currently designated experimental due to the fact that its memory management system is still under development. It will continue to mature in future versions of Druid. +- [Indexer process](../design/indexer.md) +- [Configuration reference](../configuration/index.md) +- [Processes and servers](../design/processes.md) ## Kubernetes -[Kubernetes](../development/extensions-core/kubernetes.md) - -- Consider this an EXPERIMENTAL feature mostly because it has not been tested yet on a wide variety of long running Druid clusters. +- [Kubernetes](../development/extensions-core/kubernetes.md) ## Segment locking -[Configuration reference](../configuration/index.md) - -- Overlord operations property `druid.indexer.tasklock.forceTimeChunkLock` -
Setting this to false is still experimental. If set, all tasks are enforced to use time chunk lock. If not set, each task automatically chooses a lock type to use. This configuration can be overwritten by setting `forceTimeChunkLock` in the task context. - -[Task reference](../ingestion/tasks.md) - -- Locking: The segment locking is still experimental. It could have unknown bugs which potentially lead to incorrect query results. -- Context parameter `forceTimeChunkLock`: Setting this to false is still experimental. - -[Design](../design/architecture.md) - -- Druid also supports an experimental segment locking mode that is activated by setting `forceTimeChunkLock` to false in the context of an ingestion task. +- [Configuration reference](../configuration/index.md) +- [Task reference](../ingestion/tasks.md) +- [Design](../design/architecture.md) ## Materialized view -[Materialized view](../development/extensions-contrib/materialized-view.md) - -- Note that Materialized View is currently designated as experimental. Please make sure the time of all processes are the same and increase monotonically. Otherwise, some unexpected errors may happen on query results. +- [Materialized view](../development/extensions-contrib/materialized-view.md) ## Moments sketch -[Aggregations](../querying/aggregations.md) - -- Moments Sketch (Experimental): The Moments Sketch extension-provided aggregator is an experimental aggregator that provides quantile estimates using the Moments Sketch. -- The Moments Sketch aggregator is provided as an experimental option. It is optimized for merging speed and it can have higher aggregation performance compared to the DataSketches quantiles aggregator. However, the accuracy of the Moments Sketch is distribution-dependent, so users will need to empirically verify that the aggregator is suitable for their input data. 
-- As a general guideline for experimentation, the Moments Sketch paper points out that this algorithm works better on inputs with high entropy. In particular, the algorithm is not a good fit when the input data consists of a small number of clustered discrete values. +- [Aggregations](../querying/aggregations.md) ## Other configuration properties -[Configuration reference](../configuration/index.md) - -- Task runner `httpRemote` -
Overlord operations property `druid.indexer.runner.type`: Choices "local" or "remote". Indicates whether tasks should be run locally or in a distributed environment. Experimental task runner "httpRemote" is also available which is same as "remote" but uses HTTP to interact with Middle Managers instead of Zookeeper.

-- `CLOSED_SEGMENTS_SINKS` mode -
Peon config `druid.indexer.task.batchProcessingMode`: `CLOSED_SEGMENTS_SINKS` mode isn't as well tested as other modes so is currently considered experimental.

-- Expression processing configuration `druid.expressions.allowNestedArrays` -
If enabled, Druid array expressions can create nested arrays. This is experimental and should be used with caution. +- [Configuration reference](../configuration/index.md) + - Experimental task runner `httpRemote` + - `CLOSED_SEGMENTS_SINKS` mode + - Expression processing configuration `druid.expressions.allowNestedArrays` From 77c0ac1b8e2f3b909990fd1977af9636d5a30da4 Mon Sep 17 00:00:00 2001 From: Jill Osborne Date: Thu, 17 Nov 2022 16:16:38 +0000 Subject: [PATCH 3/8] Update experimental-features.md --- docs/development/experimental-features.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/docs/development/experimental-features.md b/docs/development/experimental-features.md index 708629886b33..b55232e0e17a 100644 --- a/docs/development/experimental-features.md +++ b/docs/development/experimental-features.md @@ -60,6 +60,10 @@ This document includes each page that mentions an experimental feature. To gradu - [Aggregations](../querying/aggregations.md) +## Front coding + +- [Ingestion spec reference](../ingestion/ingestion-spec.md#front-coding) + ## Other configuration properties - [Configuration reference](../configuration/index.md) From 98c18e41dea40dbc08d9b4e1a88ec5acb658b2c7 Mon Sep 17 00:00:00 2001 From: Jill Osborne Date: Fri, 18 Nov 2022 16:37:03 +0000 Subject: [PATCH 4/8] Update docs/development/experimental-features.md Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com> --- docs/development/experimental-features.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/development/experimental-features.md b/docs/development/experimental-features.md index b55232e0e17a..c3b839fb3468 100644 --- a/docs/development/experimental-features.md +++ b/docs/development/experimental-features.md @@ -49,7 +49,7 @@ This document includes each page that mentions an experimental feature. 
To gradu ## Segment locking - [Configuration reference](../configuration/index.md) -- [Task reference](../ingestion/tasks.md) +- [Task reference](../ingestion/tasks.md#locking) - [Design](../design/architecture.md) ## Materialized view From 975ae240ca2c1e008cd0ef7ad968beb6edd3413c Mon Sep 17 00:00:00 2001 From: Jill Osborne Date: Fri, 18 Nov 2022 16:54:16 +0000 Subject: [PATCH 5/8] Updated after review --- docs/development/experimental-features.md | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/docs/development/experimental-features.md b/docs/development/experimental-features.md index c3b839fb3468..98c537af1920 100644 --- a/docs/development/experimental-features.md +++ b/docs/development/experimental-features.md @@ -39,8 +39,8 @@ This document includes each page that mentions an experimental feature. To gradu ## Indexer process - [Indexer process](../design/indexer.md) -- [Configuration reference](../configuration/index.md) -- [Processes and servers](../design/processes.md) +- [Configuration reference](../configuration/index.md#overlord-operations) +- [Processes and servers](../design/processes.md#indexer-process-optional) ## Kubernetes @@ -48,9 +48,9 @@ This document includes each page that mentions an experimental feature. To gradu ## Segment locking -- [Configuration reference](../configuration/index.md) +- [Configuration reference](../configuration/index.md#overlord-operations) - [Task reference](../ingestion/tasks.md#locking) -- [Design](../design/architecture.md) +- [Design](../design/architecture.md#availability-and-consistency) ## Materialized view @@ -58,7 +58,7 @@ This document includes each page that mentions an experimental feature. To gradu ## Moments sketch -- [Aggregations](../querying/aggregations.md) +- [Aggregations](../querying/aggregations.md#moments-sketch-experimental) ## Front coding @@ -67,6 +67,5 @@ This document includes each page that mentions an experimental feature. 
To gradu ## Other configuration properties - [Configuration reference](../configuration/index.md) - - Experimental task runner `httpRemote` - `CLOSED_SEGMENTS_SINKS` mode - Expression processing configuration `druid.expressions.allowNestedArrays` From eb8268ec44490134c16656245b8f73a8fcb550f6 Mon Sep 17 00:00:00 2001 From: Jill Osborne Date: Mon, 21 Nov 2022 14:59:13 +0000 Subject: [PATCH 6/8] Updated --- docs/configuration/index.md | 2 +- docs/development/experimental-features.md | 5 ----- docs/development/extensions-contrib/materialized-view.md | 2 -- 3 files changed, 1 insertion(+), 8 deletions(-) diff --git a/docs/configuration/index.md b/docs/configuration/index.md index 992b5f82da21..8ac9382b7e21 100644 --- a/docs/configuration/index.md +++ b/docs/configuration/index.md @@ -1375,7 +1375,7 @@ For GCE's properties, please refer to the [gce-extensions](../development/extens This section contains the configuration options for the processes that reside on Data servers (MiddleManagers/Peons and Historicals) in the suggested [three-server configuration](../design/processes.md#server-types). -Configuration options for the experimental [Indexer process](../design/indexer.md) are also provided here. +Configuration options for the [Indexer process](../design/indexer.md) are also provided here. ### MiddleManager and Peons diff --git a/docs/development/experimental-features.md b/docs/development/experimental-features.md index 98c537af1920..18f45a786e22 100644 --- a/docs/development/experimental-features.md +++ b/docs/development/experimental-features.md @@ -39,7 +39,6 @@ This document includes each page that mentions an experimental feature. To gradu ## Indexer process - [Indexer process](../design/indexer.md) -- [Configuration reference](../configuration/index.md#overlord-operations) - [Processes and servers](../design/processes.md#indexer-process-optional) ## Kubernetes @@ -52,10 +51,6 @@ This document includes each page that mentions an experimental feature. 
To gradu - [Task reference](../ingestion/tasks.md#locking) - [Design](../design/architecture.md#availability-and-consistency) -## Materialized view - -- [Materialized view](../development/extensions-contrib/materialized-view.md) - ## Moments sketch - [Aggregations](../querying/aggregations.md#moments-sketch-experimental) diff --git a/docs/development/extensions-contrib/materialized-view.md b/docs/development/extensions-contrib/materialized-view.md index a493c3c4177b..f01738c89f4a 100644 --- a/docs/development/extensions-contrib/materialized-view.md +++ b/docs/development/extensions-contrib/materialized-view.md @@ -132,5 +132,3 @@ There are 2 parts in a view query: |--------|-----------|---------| |queryType |The query type. This should always be view |yes| |query |The real query of this `view` query. The real query must be [groupBy](../../querying/groupbyquery.md), [topN](../../querying/topnquery.md), or [timeseries](../../querying/timeseriesquery.md) type.|yes| - -**Note that Materialized View is currently designated as experimental. Please make sure the time of all processes are the same and increase monotonically. Otherwise, some unexpected errors may happen on query results.** From 53c3bde505abc861d22896b46204b65ec1e7851f Mon Sep 17 00:00:00 2001 From: Jill Osborne Date: Wed, 23 Nov 2022 12:29:47 +0000 Subject: [PATCH 7/8] Update materialized-view.md --- docs/development/extensions-contrib/materialized-view.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/development/extensions-contrib/materialized-view.md b/docs/development/extensions-contrib/materialized-view.md index f01738c89f4a..a493c3c4177b 100644 --- a/docs/development/extensions-contrib/materialized-view.md +++ b/docs/development/extensions-contrib/materialized-view.md @@ -132,3 +132,5 @@ There are 2 parts in a view query: |--------|-----------|---------| |queryType |The query type. This should always be view |yes| |query |The real query of this `view` query. 
The real query must be [groupBy](../../querying/groupbyquery.md), [topN](../../querying/topnquery.md), or [timeseries](../../querying/timeseriesquery.md) type.|yes| + +**Note that Materialized View is currently designated as experimental. Please make sure the time of all processes are the same and increase monotonically. Otherwise, some unexpected errors may happen on query results.** From 77148f7e97f5e1bef6a3474ee698ce37e6213307 Mon Sep 17 00:00:00 2001 From: Jill Osborne Date: Wed, 23 Nov 2022 12:31:51 +0000 Subject: [PATCH 8/8] Update experimental-features.md --- docs/development/experimental-features.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/development/experimental-features.md b/docs/development/experimental-features.md index 18f45a786e22..a68fc47411af 100644 --- a/docs/development/experimental-features.md +++ b/docs/development/experimental-features.md @@ -26,6 +26,8 @@ The following features are marked [experimental](./experimental.md) in the Druid This document includes each page that mentions an experimental feature. To graduate a feature, remove all mentions of its experimental status on all relevant pages. +Note that this document does not track the status of contrib extensions, some of which are experimental. + ## SQL-based ingestion - [SQL-based ingestion](../multi-stage-query/index.md)
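
The graduation rule stated in `experimental-features.md` (remove all mentions of a feature's experimental status on all relevant pages) can be checked mechanically before a feature is graduated. The following is a minimal sketch and is not part of this patch series: the function name `find_experimental_mentions`, the `docs` default directory, and the choice to flag any case-insensitive occurrence of the word "experimental" are assumptions made for illustration. It also flags links such as `experimental-features.md`, so the output is a review list rather than a verdict.

```python
import re
import sys
from pathlib import Path

# Assumption: any case-insensitive occurrence of the word "experimental"
# is a mention that a reviewer should look at.
EXPERIMENTAL = re.compile(r"\bexperimental\b", re.IGNORECASE)


def find_experimental_mentions(docs_root):
    """Return (relative_path, line_number, stripped_line) for every
    line in a .md file under docs_root that mentions "experimental"."""
    root = Path(docs_root)
    hits = []
    # Sort the paths so the report order is stable between runs.
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if EXPERIMENTAL.search(line):
                rel_path = path.relative_to(root).as_posix()
                hits.append((rel_path, number, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical usage: python find_mentions.py docs
    target = sys.argv[1] if len(sys.argv) > 1 else "docs"
    for rel_path, number, line in find_experimental_mentions(target):
        print(f"{rel_path}:{number}: {line}")
```

Run against the `docs` directory, a script like this would list each page that still carries an experimental note, which could then be compared with the links collected in `experimental-features.md`.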