From 6c8f5728bed04e64b45ef88f1fb6b8c65b88874f Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Sun, 28 Jun 2020 15:08:02 +0800
Subject: [PATCH 01/28] Update aggregate-group-by-functions.md

---
 functions-and-operators/aggregate-group-by-functions.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/functions-and-operators/aggregate-group-by-functions.md b/functions-and-operators/aggregate-group-by-functions.md
index a0790299dbd5d..edcf67bcdde7a 100644
--- a/functions-and-operators/aggregate-group-by-functions.md
+++ b/functions-and-operators/aggregate-group-by-functions.md
@@ -21,7 +21,9 @@ This section describes the supported MySQL group (aggregate) functions in TiDB.
 | [`AVG()`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_avg) | Return the average value of the argument |
 | [`MAX()`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_max) | Return the maximum value |
 | [`MIN()`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_min) | Return the minimum value |
-| [`GROUP_CONCAT()`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat) | Return a concatenated string |
+| [`GROUP_CONCAT()`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat) | Return a concatenated string |
+| [`VARIANCE()`, `VAR_POP()`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_var-pop) | Return the population standard variance|
+| [`JSON_OBJECTAGG(key, value)`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_json-objectagg) | Return result set as a single JSON object|
 
 - Unless otherwise stated, group functions ignore `NULL` values.
 - If you use a group function in a statement containing no `GROUP BY` clause, it is equivalent to grouping on all rows.
@@ -118,7 +120,5 @@ The following aggregate functions are currently unsupported in TiDB. You can tra
 
 - `STD`, `STDDEV`, `STDDEV_POP`
 - `STDDEV_SAMP`
-- `VARIANCE`, `VAR_POP`
 - `VAR_SAMP`
 - `JSON_ARRAYAGG`
-- `JSON_OBJECTAGG`

From 7cb9fe4adb1c0710aea1562dff07b848a13e9d47 Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Sun, 28 Jun 2020 15:21:24 +0800
Subject: [PATCH 02/28] Update json-functions.md

---
 functions-and-operators/json-functions.md | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/functions-and-operators/json-functions.md b/functions-and-operators/json-functions.md
index 3272b9e553a85..d2b580228b1c8 100644
--- a/functions-and-operators/json-functions.md
+++ b/functions-and-operators/json-functions.md
@@ -46,6 +46,8 @@ TiDB supports most of the JSON functions that shipped with the GA release of MyS
 | [JSON_REPLACE(json_doc, path, val[, path, val] ...)][json_replace] | Replaces existing values in a JSON document and returns the result |
 | [JSON_SET(json_doc, path, val[, path, val] ...)][json_set] | Inserts or updates data in a JSON document and returns the result |
 | [JSON_UNQUOTE(json_val)][json_unquote] | Unquotes a JSON value and returns the result as a string |
+| [JSON_ARRAY_APPEND(json_doc, path, val[, path, val] ...)][json_array_append] | Appends values to the end of the indicated arrays within a JSON document and returns the result |
+| [JSON_ARRAY_INSERT(json_doc, path, val[, path, val] ...)][json_array_insert] | Updates a JSON document, inserting into an array within the document and returning the modified document |
 
 ## Functions that return JSON value attributes
 
@@ -54,16 +56,15 @@ TiDB supports most of the JSON functions that shipped with the GA release of MyS
 | [JSON_DEPTH(json_doc)][json_depth] | Returns the maximum depth of a JSON document |
 | [JSON_LENGTH(json_doc[, path])][json_length] | Returns the length of a JSON document, or, if a path argument is given, the length of the value within the path |
 | [JSON_TYPE(json_val)][json_type] | Returns a string indicating the type of a JSON value |
-
+| [JSON_VALID(json_val)][json_valid] | Returns 0 or 1 to indicate whether a value is valid JSON |
 ## Unsupported functions
 
 The following JSON functions are unsupported in TiDB. You can track the progress in adding them in [TiDB #7546](https://github.com/pingcap/tidb/issues/7546):
 
-* `JSON_ARRAY_INSERT`
+
 * `JSON_MERGE_PATCH`
 * `JSON_PRETTY`
 * `JSON_STORAGE_SIZE`
-* `JSON_VALID`
 * `JSON_ARRAYAGG`
 * `JSON_OBJECTAGG`
 
@@ -91,3 +92,5 @@ The following JSON functions are unsupported in TiDB. You can track the progress
 [json_search]: https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#function_json-search
 [json_append]: https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-append
 [json_array_append]: https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-array-append
+[json_array_insert]: https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-array-insert
+[json_search]: https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#function_json-search

From 014eb9cca10c6502f89fc56c1674ff9a5a691c85 Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Sun, 28 Jun 2020 18:15:13 +0800
Subject: [PATCH 03/28] Update generated-columns.md

---
 generated-columns.md | 100 +++++++++++++++++++++++++++++++++----------
 1 file changed, 78 insertions(+), 22 deletions(-)

diff --git a/generated-columns.md b/generated-columns.md
index 8ae0c92b9b9bb..1dad9745c474e 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -11,9 +11,18 @@ aliases: ['/docs/dev/generated-columns/','/docs/dev/reference/sql/generated-colu
 >
 > This is still an experimental feature. It is **NOT** recommended that you use it in the production environment.
 
-TiDB supports generated columns as part of MySQL 5.7 compatibility. One of the primary use cases for generated columns is to extract data out of a JSON data type and enable it to be indexed.
+This document introduces the concept and usage of generated columns.
 
-## Index JSON using generated column
+## Basic concepts of generated columns
+
+Unlike general columns, the value of the generated column is calculated by the expression in the column definition. When inserting or updating a generated column, you cannot assign a value, you can only use `DEFAULT`.
+
+There are two kinds of generated columns: virtual and stored. A virtual generated column occupies no storage and is computed when it is read. A stored generated column is computed when it is written (inserted or updated) and occupies storage as if it were a normal column. Compared with the virtual generated columns, the stored generated columns perform better, but it takes up more disk space.
+
+You can create an index on a generated column whether it is virtual or stored.
+
+## Usage of Generated columns
+One of the main usage of generated columns: extracting data from the JSON data type and indexing the data.
 
 In both MySQL 5.7 and TiDB, columns of type JSON can not be indexed directly. i.e. The following table structure is **not supported**:
 
@@ -26,65 +35,112 @@ CREATE TABLE person (
 );
 ```
 
-To index a JSON column, you must first extract it as a generated column.
+To index a JSON column, you must extract it as a generated column first.
 
-Using the `city` stored generated column as an example, you are then able to add an index:
+Using the `city` field in `address_info` as an example, you can create a virtual generated column and add an index for it:
 
+{{< copyable "sql" >}}
 ```sql
 CREATE TABLE person (
     id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
     name VARCHAR(255) NOT NULL,
     address_info JSON,
-    city VARCHAR(64) AS (JSON_UNQUOTE(JSON_EXTRACT(address_info, '$.city'))) STORED,
+    city VARCHAR(64) AS (JSON_UNQUOTE(JSON_EXTRACT(address_info, '$.city'))),
     KEY (city)
 );
 ```
 
-In this table, the `city` column is a **generated column**. As the name implies, the column is generated from other columns in the table, and cannot be assigned a value when inserted or updated. This column is generated based on a defined expression and is stored in the database. Thus this column can be read directly, not in a way that its dependent column `address_info` is read first and then the data is calculated. The index on `city` however is _stored_ and uses the same structure as other indexes of the type `varchar(64)`.
+In this table, the `city` column is a **virtual generated column**, and there is an index on `city` column. The following query can use the index to speed up:
 
-You can use the index on the stored generated column in order to speed up the following statement:
+{{< copyable "sql" >}}
 
 ```sql
 SELECT name, id FROM person WHERE city = 'Beijing';
 ```
 
-If no data exists at path `$.city`, `JSON_EXTRACT` returns `NULL`. If you want to enforce a constraint that `city` must be `NOT NULL`, you can define the virtual column as follows:
+{{< copyable "sql" >}}
+
+```sql
+EXPLAIN SELECT name, id FROM person WHERE city = 'Beijing';
+```
+```
++---------------------------------+---------+-----------+--------------------------------+-------------------------------------------------------------+
+| id                              | estRows | task      | access object                  | operator info                                               |
++---------------------------------+---------+-----------+--------------------------------+-------------------------------------------------------------+
+| Projection_4                    | 10.00   | root      |                                | test.person.name, test.person.id                            |
+| └─IndexLookUp_10                | 10.00   | root      |                                |                                                             |
+|   ├─IndexRangeScan_8(Build)     | 10.00   | cop[tikv] | table:person, index:city(city) | range:["Beijing","Beijing"], keep order:false, stats:pseudo |
+|   └─TableRowIDScan_9(Probe)     | 10.00   | cop[tikv] | table:person                   | keep order:false, stats:pseudo                              |
++---------------------------------+---------+-----------+--------------------------------+-------------------------------------------------------------+
+```
+
+
+From the query execution plan, it can be seen that the index of city is used to read the `HANDLE` of the row that meets the condition `city ='Beijing'`, and then use this `HANDLE` to read the data of the row.
 
+If no data exists at path `$.city`, `JSON_EXTRACT` returns `NULL`. If you want to enforce a constraint that `city` must be `NOT NULL`, you can define the generated virtual column as follows:
+
+{{< copyable "sql" >}}
 ```sql
 CREATE TABLE person (
     id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
     name VARCHAR(255) NOT NULL,
     address_info JSON,
-    city VARCHAR(64) AS (JSON_UNQUOTE(JSON_EXTRACT(address_info, '$.city'))) STORED NOT NULL,
+    city VARCHAR(64) AS (JSON_UNQUOTE(JSON_EXTRACT(address_info, '$.city'))) NOT NULL,
     KEY (city)
 );
 ```
 
+## Validation of Generated columns
 Both `INSERT` and `UPDATE` statements check virtual column definitions. Rows that do not pass validation return errors:
 
+{{< copyable "sql" >}}
 ```sql
 mysql> INSERT INTO person (name, address_info) VALUES ('Morgan', JSON_OBJECT('Country', 'Canada'));
 ERROR 1048 (23000): Column 'city' cannot be null
 ```
 
-## Use generated virtual columns
-
-TiDB also supports generated virtual columns. Different from generated store columns, generated virtual columns are **virtual** in that they are generated as needed and are not stored in the database or cached in the memory.
+## Generated columns index replacement
+When an expression in a query is equivalent to a generated column with an index, TiDB replaces the expression with the corresponding generated column, so that the optimizer can take that index into account during execution plan construction.
 
+For example, the following example creates a generated column for the expression `a+1` and adds an index:
 ```sql
-CREATE TABLE person (
-    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
-    name VARCHAR(255) NOT NULL,
-    address_info JSON,
-    city VARCHAR(64) AS (JSON_UNQUOTE(JSON_EXTRACT(address_info, '$.city'))) VIRTUAL
-);
+create table t(a int);
+desc select a+1 from t where a+1=3;
++---------------------------+----------+-----------+---------------+--------------------------------+
+| id                        | estRows  | task      | access object | operator info                  |
++---------------------------+----------+-----------+---------------+--------------------------------+
+| Projection_4              | 8000.00  | root      |               | plus(test.t.a, 1)->Column#3    |
+| └─TableReader_7           | 8000.00  | root      |               | data:Selection_6               |
+|   └─Selection_6           | 8000.00  | cop[tikv] |               | eq(plus(test.t.a, 1), 3)       |
+|     └─TableFullScan_5     | 10000.00 | cop[tikv] | table:t       | keep order:false, stats:pseudo |
++---------------------------+----------+-----------+---------------+--------------------------------+
+4 rows in set (0.00 sec)
+
+alter table t add column b bigint as (a+1) virtual;
+alter table t add index idx_b(b);
+desc select a+1 from t where a+1=3;
++------------------------+---------+-----------+-------------------------+---------------------------------------------+
+| id                     | estRows | task      | access object           | operator info                               |
++------------------------+---------+-----------+-------------------------+---------------------------------------------+
+| IndexReader_6          | 10.00   | root      |                         | index:IndexRangeScan_5                      |
+| └─IndexRangeScan_5     | 10.00   | cop[tikv] | table:t, index:idx_b(b) | range:[3,3], keep order:false, stats:pseudo |
++------------------------+---------+-----------+-------------------------+---------------------------------------------+
+2 rows in set (0.01 sec)
 ```
+> **Note:**
+>
+> Only when the expression type to be replaced and the generated column type are strictly equal, the conversion will be performed.
+>
+> In the above example, the column type of `a` is int and the column type of `a+1` is bigint. If the type of the generated column is set to int, the replacement will not occur.
+>
+> For type conversion rules, see [Type Conversion of Expression Evaluation] (/functions-and-operators/type-conversion-in-expression-evaluation.md).
 
-## Limitations
+## Limitations of Generated columns
 
 The current limitations of JSON and generated columns are as follows:
 
-- You cannot add the generated column in the storage type of `STORED` through `ALTER TABLE`.
+- You cannot add the generated column through `ALTER TABLE`.
 - You can neither convert a generated stored column to a normal column through the `ALTER TABLE` statement nor convert a normal column to a generated stored column.
-- You cannot modify the **expression** of a generated stored column through the `ALTER TABLE` statement.
-- Not all [JSON functions](/functions-and-operators/json-functions.md) are supported.
+- Not all [JSON functions](/functions-and-operators/json-functions.md) are supported;
+- Currently, the generated column index replacement rule is valid only when the generated column is a virtual generated column. It is not valid on the stored generated column, but the index can still be used by directly using the generated column itself.
+

From 4898ce97e4a05209455f1621d107690e18e0c729 Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Sun, 28 Jun 2020 18:24:43 +0800
Subject: [PATCH 04/28] Update generated-columns.md

---
 generated-columns.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/generated-columns.md b/generated-columns.md
index 1dad9745c474e..35428b194bf7e 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -13,11 +13,11 @@ aliases: ['/docs/dev/generated-columns/','/docs/dev/reference/sql/generated-colu
 
 This document introduces the concept and usage of generated columns.
 
-## Basic concepts of generated columns
+## Basic concepts of Generated columns
 
 Unlike general columns, the value of the generated column is calculated by the expression in the column definition. When inserting or updating a generated column, you cannot assign a value, you can only use `DEFAULT`.
 
-There are two kinds of generated columns: virtual and stored. A virtual generated column occupies no storage and is computed when it is read. A stored generated column is computed when it is written (inserted or updated) and occupies storage as if it were a normal column. Compared with the virtual generated columns, the stored generated columns perform better, but it takes up more disk space.
+There are two kinds of generated columns: virtual and stored. A virtual generated column occupies no storage and is computed when it is read. A stored generated column is computed when it is written (inserted or updated) and occupies storage. Compared with the virtual generated columns, the stored generated columns perform better, but it takes up more disk space.
 
 You can create an index on a generated column whether it is virtual or stored.
 
@@ -77,7 +77,7 @@ EXPLAIN SELECT name, id FROM person WHERE city = 'Beijing';
 
 From the query execution plan, it can be seen that the index of city is used to read the `HANDLE` of the row that meets the condition `city ='Beijing'`, and then use this `HANDLE` to read the data of the row.
 
-If no data exists at path `$.city`, `JSON_EXTRACT` returns `NULL`. If you want to enforce a constraint that `city` must be `NOT NULL`, you can define the generated virtual column as follows:
+If no data exists at path `$.city`, `JSON_EXTRACT` returns `NULL`. If you want to enforce a constraint that `city` must be `NOT NULL`, you can define the virtual generated column as follows:
 
 {{< copyable "sql" >}}
 ```sql
@@ -99,7 +99,7 @@ mysql> INSERT INTO person (name, address_info) VALUES ('Morgan', JSON_OBJECT('Co
 ERROR 1048 (23000): Column 'city' cannot be null
 ```
 
-## Generated columns index replacement
+## Generated columns index replacement rule
 When an expression in a query is equivalent to a generated column with an index, TiDB replaces the expression with the corresponding generated column, so that the optimizer can take that index into account during execution plan construction.
 
 For example, the following example creates a generated column for the expression `a+1` and adds an index:
@@ -129,7 +129,7 @@ desc select a+1 from t where a+1=3;
 ```
 > **Note:**
 >
-> Only when the expression type to be replaced and the generated column type are strictly equal, the conversion will be performed.
+> Only when the expression type and the generated column type are strictly equal, the replacement will be performed.
 >
 > In the above example, the column type of `a` is int and the column type of `a+1` is bigint. If the type of the generated column is set to int, the replacement will not occur.
 >

From aa6fc4b7adbcdf34888b564e444864c43a12b6c3 Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Sun, 28 Jun 2020 19:15:07 +0800
Subject: [PATCH 05/28] Update json-functions.md

---
 functions-and-operators/json-functions.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/functions-and-operators/json-functions.md b/functions-and-operators/json-functions.md
index d2b580228b1c8..e7ac42c31d8dd 100644
--- a/functions-and-operators/json-functions.md
+++ b/functions-and-operators/json-functions.md
@@ -57,11 +57,11 @@ TiDB supports most of the JSON functions that shipped with the GA release of MyS
 | [JSON_LENGTH(json_doc[, path])][json_length] | Returns the length of a JSON document, or, if a path argument is given, the length of the value within the path |
 | [JSON_TYPE(json_val)][json_type] | Returns a string indicating the type of a JSON value |
 | [JSON_VALID(json_val)][json_valid] | Returns 0 or 1 to indicate whether a value is valid JSON |
+
 ## Unsupported functions
 
 The following JSON functions are unsupported in TiDB. You can track the progress in adding them in [TiDB #7546](https://github.com/pingcap/tidb/issues/7546):
 
-
 * `JSON_MERGE_PATCH`
 * `JSON_PRETTY`
 * `JSON_STORAGE_SIZE`

From 12ef4a78c2f5927a0dfb1b1444362e38763a6c86 Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Sun, 28 Jun 2020 19:18:51 +0800
Subject: [PATCH 06/28] Update generated-columns.md

---
 generated-columns.md | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index 35428b194bf7e..ab609b6d64aef 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -22,10 +22,13 @@ There are two kinds of generated columns: virtual and stored. A virtual generate
 You can create an index on a generated column whether it is virtual or stored.
 
 ## Usage of Generated columns
+
 One of the main usage of generated columns: extracting data from the JSON data type and indexing the data.
 
 In both MySQL 5.7 and TiDB, columns of type JSON can not be indexed directly. i.e. The following table structure is **not supported**:
 
+{{< copyable "sql" >}}
+
 ```sql
 CREATE TABLE person (
     id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
@@ -40,6 +43,7 @@ To index a JSON column, you must extract it as a generated column first.
 Using the `city` field in `address_info` as an example, you can create a virtual generated column and add an index for it:
 
 {{< copyable "sql" >}}
+
 ```sql
 CREATE TABLE person (
     id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
@@ -63,6 +67,7 @@ SELECT name, id FROM person WHERE city = 'Beijing';
 ```sql
 EXPLAIN SELECT name, id FROM person WHERE city = 'Beijing';
 ```
+
 ```
 +---------------------------------+---------+-----------+--------------------------------+-------------------------------------------------------------+
 | id                              | estRows | task      | access object                  | operator info                                               |
@@ -80,6 +85,7 @@ From the query execution plan, it can be seen that the index of city is used to
 If no data exists at path `$.city`, `JSON_EXTRACT` returns `NULL`. If you want to enforce a constraint that `city` must be `NOT NULL`, you can define the virtual generated column as follows:
 
 {{< copyable "sql" >}}
+
 ```sql
 CREATE TABLE person (
     id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
@@ -103,6 +109,7 @@ ERROR 1048 (23000): Column 'city' cannot be null
 When an expression in a query is equivalent to a generated column with an index, TiDB replaces the expression with the corresponding generated column, so that the optimizer can take that index into account during execution plan construction.
 
 For example, the following example creates a generated column for the expression `a+1` and adds an index:
+
 ```sql
 create table t(a int);
 desc select a+1 from t where a+1=3;
@@ -127,6 +134,7 @@ desc select a+1 from t where a+1=3;
 +------------------------+---------+-----------+-------------------------+---------------------------------------------+
 2 rows in set (0.01 sec)
 ```
+
 > **Note:**
 >
 > Only when the expression type and the generated column type are strictly equal, the replacement will be performed.
@@ -138,7 +146,6 @@ desc select a+1 from t where a+1=3;
 ## Limitations of Generated columns
 
 The current limitations of JSON and generated columns are as follows:
-
 - You cannot add the generated column through `ALTER TABLE`.
 - You can neither convert a generated stored column to a normal column through the `ALTER TABLE` statement nor convert a normal column to a generated stored column.
 - Not all [JSON functions](/functions-and-operators/json-functions.md) are supported;

From 47f3eb88ff2937848a043917223f8c63a043812f Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Tue, 30 Jun 2020 15:13:32 +0800
Subject: [PATCH 07/28] Update generated-columns.md

---
 generated-columns.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index ab609b6d64aef..e8cf75809b91a 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -79,7 +79,6 @@ EXPLAIN SELECT name, id FROM person WHERE city = 'Beijing';
 +---------------------------------+---------+-----------+--------------------------------+-------------------------------------------------------------+
 ```
 
-
 From the query execution plan, it can be seen that the index of city is used to read the `HANDLE` of the row that meets the condition `city ='Beijing'`, and then use this `HANDLE` to read the data of the row.
 
 If no data exists at path `$.city`, `JSON_EXTRACT` returns `NULL`. If you want to enforce a constraint that `city` must be `NOT NULL`, you can define the virtual generated column as follows:
@@ -97,6 +96,7 @@ CREATE TABLE person (
 ```
 
 ## Validation of Generated columns
+
 Both `INSERT` and `UPDATE` statements check virtual column definitions. Rows that do not pass validation return errors:
 
 {{< copyable "sql" >}}
@@ -106,6 +106,7 @@ ERROR 1048 (23000): Column 'city' cannot be null
 ```
 
 ## Generated columns index replacement rule
+
 When an expression in a query is equivalent to a generated column with an index, TiDB replaces the expression with the corresponding generated column, so that the optimizer can take that index into account during execution plan construction.
 
 For example, the following example creates a generated column for the expression `a+1` and adds an index:

From 484f89661d23e4a7c8f2ed9b4aff64756a9eb1fe Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Tue, 30 Jun 2020 15:15:51 +0800
Subject: [PATCH 08/28] Update generated-columns.md

---
 generated-columns.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index e8cf75809b91a..4022ac366b0f1 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -100,6 +100,7 @@ CREATE TABLE person (
 Both `INSERT` and `UPDATE` statements check virtual column definitions. Rows that do not pass validation return errors:
 
 {{< copyable "sql" >}}
+
 ```sql
 mysql> INSERT INTO person (name, address_info) VALUES ('Morgan', JSON_OBJECT('Country', 'Canada'));
 ERROR 1048 (23000): Column 'city' cannot be null
@@ -147,8 +148,8 @@ desc select a+1 from t where a+1=3;
 ## Limitations of Generated columns
 
 The current limitations of JSON and generated columns are as follows:
+
 - You cannot add the generated column through `ALTER TABLE`.
 - You can neither convert a generated stored column to a normal column through the `ALTER TABLE` statement nor convert a normal column to a generated stored column.
 - Not all [JSON functions](/functions-and-operators/json-functions.md) are supported;
 - Currently, the generated column index replacement rule is valid only when the generated column is a virtual generated column. It is not valid on the stored generated column, but the index can still be used by directly using the generated column itself.
-

From 05174b90f3215a15a2b20b98074ffc5000927661 Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:15:29 +0800
Subject: [PATCH 09/28] Update functions-and-operators/aggregate-group-by-functions.md

Co-authored-by: Ran
---
 functions-and-operators/aggregate-group-by-functions.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/functions-and-operators/aggregate-group-by-functions.md b/functions-and-operators/aggregate-group-by-functions.md
index edcf67bcdde7a..2f250a7170db2 100644
--- a/functions-and-operators/aggregate-group-by-functions.md
+++ b/functions-and-operators/aggregate-group-by-functions.md
@@ -22,7 +22,7 @@ This section describes the supported MySQL group (aggregate) functions in TiDB.
 | [`MAX()`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_max) | Return the maximum value |
 | [`MIN()`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_min) | Return the minimum value |
 | [`GROUP_CONCAT()`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat) | Return a concatenated string |
-| [`VARIANCE()`, `VAR_POP()`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_var-pop) | Return the population standard variance|
+| [`VARIANCE()`, `VAR_POP()`](https://dev.mysql.com/doc/refman/5.7/en/aggregate-functions.html#function_var-pop) | Return the population standard variance|
 | [`JSON_OBJECTAGG(key, value)`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_json-objectagg) | Return result set as a single JSON object|
 
 - Unless otherwise stated, group functions ignore `NULL` values.

From dd421cb389a427e6dab465679ccddb70e88702a3 Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:15:47 +0800
Subject: [PATCH 10/28] Update functions-and-operators/aggregate-group-by-functions.md

Co-authored-by: Ran
---
 functions-and-operators/aggregate-group-by-functions.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/functions-and-operators/aggregate-group-by-functions.md b/functions-and-operators/aggregate-group-by-functions.md
index 2f250a7170db2..ad6bca9e953d9 100644
--- a/functions-and-operators/aggregate-group-by-functions.md
+++ b/functions-and-operators/aggregate-group-by-functions.md
@@ -23,7 +23,7 @@ This section describes the supported MySQL group (aggregate) functions in TiDB.
 | [`MIN()`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_min) | Return the minimum value |
 | [`GROUP_CONCAT()`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat) | Return a concatenated string |
 | [`VARIANCE()`, `VAR_POP()`](https://dev.mysql.com/doc/refman/5.7/en/aggregate-functions.html#function_var-pop) | Return the population standard variance|
-| [`JSON_OBJECTAGG(key, value)`](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_json-objectagg) | Return result set as a single JSON object|
+| [`JSON_OBJECTAGG(key, value)`](https://dev.mysql.com/doc/refman/5.7/en/aggregate-functions.html#function_json-objectagg) | Return the result set as a single JSON object containing key-value pairs |
 
 - Unless otherwise stated, group functions ignore `NULL` values.
 - If you use a group function in a statement containing no `GROUP BY` clause, it is equivalent to grouping on all rows.

From ea669d1b36e37b5784de91147916b5cd63a77f80 Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:16:10 +0800
Subject: [PATCH 11/28] Update functions-and-operators/json-functions.md

Co-authored-by: Ran
---
 functions-and-operators/json-functions.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/functions-and-operators/json-functions.md b/functions-and-operators/json-functions.md
index e7ac42c31d8dd..6d666e838188e 100644
--- a/functions-and-operators/json-functions.md
+++ b/functions-and-operators/json-functions.md
@@ -47,7 +47,7 @@ TiDB supports most of the JSON functions that shipped with the GA release of MyS
 | [JSON_SET(json_doc, path, val[, path, val] ...)][json_set] | Inserts or updates data in a JSON document and returns the result |
 | [JSON_UNQUOTE(json_val)][json_unquote] | Unquotes a JSON value and returns the result as a string |
 | [JSON_ARRAY_APPEND(json_doc, path, val[, path, val] ...)][json_array_append] | Appends values to the end of the indicated arrays within a JSON document and returns the result |
-| [JSON_ARRAY_INSERT(json_doc, path, val[, path, val] ...)][json_array_insert] | Updates a JSON document, inserting into an array within the document and returning the modified document |
+| [JSON_ARRAY_INSERT(json_doc, path, val[, path, val] ...)][json_array_insert] | Insert values into the specified location of a JSON document and returns the result |
 
 ## Functions that return JSON value attributes
 

From 10e23c31e0fd627ae2a3f8bbab9058259fca4f88 Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:16:26 +0800
Subject: [PATCH 12/28] Update generated-columns.md

Co-authored-by: Ran
---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index 4022ac366b0f1..1460ae20510a3 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -13,7 +13,7 @@ aliases: ['/docs/dev/generated-columns/','/docs/dev/reference/sql/generated-colu
 
 This document introduces the concept and usage of generated columns.
 
-## Basic concepts of Generated columns
+## Basic concepts
 
 Unlike general columns, the value of the generated column is calculated by the expression in the column definition. When inserting or updating a generated column, you cannot assign a value, you can only use `DEFAULT`.
 

From 35ef841270df07a2f17e78cbac22d59b5754b7b0 Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:16:42 +0800
Subject: [PATCH 13/28] Update generated-columns.md

Co-authored-by: Ran
---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index 1460ae20510a3..da4595681084d 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -15,7 +15,7 @@ This document introduces the concept and usage of generated columns.
 

 ## Basic concepts

-Unlike general columns, the value of the generated column is calculated by the expression in the column definition. When inserting or updating a generated column, you cannot assign a value, you can only use `DEFAULT`.
+Unlike general columns, the value of the generated column is calculated by the expression in the column definition. When inserting or updating a generated column, you cannot assign a value, but only use `DEFAULT`.

 There are two kinds of generated columns: virtual and stored. A virtual generated column occupies no storage and is computed when it is read. A stored generated column is computed when it is written (inserted or updated) and occupies storage. Compared with the virtual generated columns, the stored generated columns perform better, but it takes up more disk space.

From 3d23e7ea4c642fe34775d75ae40aa479f04f48de Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:17:07 +0800
Subject: [PATCH 14/28] Update generated-columns.md

Co-authored-by: Ran
---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index da4595681084d..308bf9e966fdf 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -17,7 +17,7 @@ This document introduces the concept and usage of generated columns.

 Unlike general columns, the value of the generated column is calculated by the expression in the column definition. When inserting or updating a generated column, you cannot assign a value, but only use `DEFAULT`.

-There are two kinds of generated columns: virtual and stored. A virtual generated column occupies no storage and is computed when it is read. A stored generated column is computed when it is written (inserted or updated) and occupies storage. Compared with the virtual generated columns, the stored generated columns perform better, but it takes up more disk space.
+There are two kinds of generated columns: virtual and stored. A virtual generated column occupies no storage and is computed when it is read. A stored generated column is computed when it is written (inserted or updated) and occupies storage. Compared with the virtual generated columns, the stored generated columns have better read performance, but take up more disk space.

 You can create an index on a generated column whether it is virtual or stored.

From 021454d7a062facc818b4419899ce9d796e3951f Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:19:44 +0800
Subject: [PATCH 15/28] Update generated-columns.md

Co-authored-by: Ran
---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index 308bf9e966fdf..e83749f824a54 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -81,7 +81,7 @@ EXPLAIN SELECT name, id FROM person WHERE city = 'Beijing';

 From the query execution plan, it can be seen that the index of city is used to read the `HANDLE` of the row that meets the condition `city ='Beijing'`, and then use this `HANDLE` to read the data of the row.

-If no data exists at path `$.city`, `JSON_EXTRACT` returns `NULL`. If you want to enforce a constraint that `city` must be `NOT NULL`, you can define the virtual generated column as follows:
+If no data exists at path `$.city`, `JSON_EXTRACT` returns `NULL`. If you want to enforce a constraint that `city` must be `NOT NULL`, you can define the virtual generated column as follows:

 {{< copyable "sql" >}}

From ecf4fc7cbac0e816709445bcae131af425c79c4b Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:20:05 +0800
Subject: [PATCH 16/28] Update generated-columns.md

Co-authored-by: Ran
---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index e83749f824a54..3406a368b5633 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -139,7 +139,7 @@ desc select a+1 from t where a+1=3;

 > **Note:**
 >
-> Only when the expression type and the generated column type are strictly equal, the replacement will be performed.
+> Only when the expression type and the generated column type are strictly equal, the replacement is performed.
 >
 > In the above example, the column type of `a` is int and the column type of `a+1` is bigint. If the type of the generated column is set to int, the replacement will not occur.
 >

From 9037578c580c7e60b501800d65eb4a73e80e67bb Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:20:13 +0800
Subject: [PATCH 17/28] Update generated-columns.md

Co-authored-by: Ran
---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index 3406a368b5633..e3bf3a7bacf22 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -145,7 +145,7 @@ desc select a+1 from t where a+1=3;
 >
 > For type conversion rules, see [Type Conversion of Expression Evaluation] (/functions-and-operators/type-conversion-in-expression-evaluation.md).

-## Limitations of Generated columns
+## Limitations

 The current limitations of JSON and generated columns are as follows:

From 41445b6ee1cd51e1ee7ac39356416febcb44cf0f Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:20:44 +0800
Subject: [PATCH 18/28] Update generated-columns.md

Co-authored-by: Ran
---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index e3bf3a7bacf22..8c6ebe1d1e967 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -150,6 +150,6 @@ desc select a+1 from t where a+1=3;
 The current limitations of JSON and generated columns are as follows:

 - You cannot add the generated column through `ALTER TABLE`.
-- You can neither convert a generated stored column to a normal column through the `ALTER TABLE` statement nor convert a normal column to a generated stored column.
+- You can neither convert a stored generated column to a normal column through the `ALTER TABLE` statement nor convert a normal column to a stored generated column.
 - Not all [JSON functions](/functions-and-operators/json-functions.md) are supported;
 - Currently, the generated column index replacement rule is valid only when the generated column is a virtual generated column. It is not valid on the stored generated column, but the index can still be used by directly using the generated column itself.

From ddd171502d431753f3ba890370433ec503892e34 Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:26:32 +0800
Subject: [PATCH 19/28] Update generated-columns.md

Co-authored-by: Ran
---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index 8c6ebe1d1e967..4c4a57011994a 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -21,7 +21,7 @@ There are two kinds of generated columns: virtual and stored. A virtual generate

 You can create an index on a generated column whether it is virtual or stored.

-## Usage of Generated columns
+## Usage

 One of the main usage of generated columns: extracting data from the JSON data type and indexing the data.

From 6b171e0434297d23c4a73a4e74f34279ddfb106d Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:26:42 +0800
Subject: [PATCH 20/28] Update generated-columns.md

Co-authored-by: Ran
---
 generated-columns.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/generated-columns.md b/generated-columns.md
index 4c4a57011994a..bb2f6b5da5e18 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -151,5 +151,6 @@ The current limitations of JSON and generated columns are as follows:

 - You cannot add the generated column through `ALTER TABLE`.
 - You can neither convert a stored generated column to a normal column through the `ALTER TABLE` statement nor convert a normal column to a stored generated column.
+- You cannot modify the expression of a stored generated column through the `ALTER TABLE` statement.
 - Not all [JSON functions](/functions-and-operators/json-functions.md) are supported;
 - Currently, the generated column index replacement rule is valid only when the generated column is a virtual generated column. It is not valid on the stored generated column, but the index can still be used by directly using the generated column itself.

From 6b8db3dc1e24727849fb7f0db07a3e32df0dba2a Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:27:05 +0800
Subject: [PATCH 21/28] Update generated-columns.md

Co-authored-by: Ran
---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index bb2f6b5da5e18..b3e1ce6163e86 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -23,7 +23,7 @@ You can create an index on a generated column whether it is virtual or stored.

 ## Usage

-One of the main usage of generated columns: extracting data from the JSON data type and indexing the data.
+One of the main usage of generated columns is to extract data from the JSON data type and indexing the data.

 In both MySQL 5.7 and TiDB, columns of type JSON can not be indexed directly. i.e. The following table structure is **not supported**:

From 0d7bdeca417b28002f736e579b66ae909286ef5f Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:27:32 +0800
Subject: [PATCH 22/28] Update generated-columns.md

Co-authored-by: Ran
---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index b3e1ce6163e86..44661e630ac5b 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -79,7 +79,7 @@ EXPLAIN SELECT name, id FROM person WHERE city = 'Beijing';
 +---------------------------------+---------+-----------+--------------------------------+-------------------------------------------------------------+
 ```

-From the query execution plan, it can be seen that the index of city is used to read the `HANDLE` of the row that meets the condition `city ='Beijing'`, and then use this `HANDLE` to read the data of the row.
+From the query execution plan, it can be seen that the `city` index is used to read the `HANDLE` of the row that meets the condition `city ='Beijing'`, and then it uses this `HANDLE` to read the data of the row.

 If no data exists at path `$.city`, `JSON_EXTRACT` returns `NULL`. If you want to enforce a constraint that `city` must be `NOT NULL`, you can define the virtual generated column as follows:

From 8e9744457de704ba10a7169b3e1b6cc0a3b84b3d Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:27:48 +0800
Subject: [PATCH 23/28] Update generated-columns.md

Co-authored-by: Ran
---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index 44661e630ac5b..8e6e598d51a22 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -25,7 +25,7 @@ You can create an index on a generated column whether it is virtual or stored.

 One of the main usage of generated columns is to extract data from the JSON data type and indexing the data.

-In both MySQL 5.7 and TiDB, columns of type JSON can not be indexed directly. i.e. The following table structure is **not supported**:
+In both MySQL 5.7 and TiDB, columns of type JSON can not be indexed directly. That is, the following table schema is **not supported**:

 {{< copyable "sql" >}}

From 9b238dd6ac4176477d4adfb120f57d993eeea6b9 Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Fri, 3 Jul 2020 10:28:23 +0800
Subject: [PATCH 24/28] Update generated-columns.md

Co-authored-by: Ran
---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index 8e6e598d51a22..607d00e5759a8 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -54,7 +54,7 @@ CREATE TABLE person (
 );
 ```

-In this table, the `city` column is a **virtual generated column**, and there is an index on `city` column. The following query can use the index to speed up:
+In this table, the `city` column is a **virtual generated column** and has an index. The following query can use the index to speed up the execution:

 {{< copyable "sql" >}}

From af69b201560a273fb9ea726fc90af2576b2ec82e Mon Sep 17 00:00:00 2001
From: Ran
Date: Fri, 3 Jul 2020 11:32:51 +0800
Subject: [PATCH 25/28] Update generated-columns.md

---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index 607d00e5759a8..f3609e5b355f0 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -95,7 +95,7 @@ CREATE TABLE person (
 );
 ```

-## Validation of Generated columns
+## Validation of generated columns

 Both `INSERT` and `UPDATE` statements check virtual column definitions. Rows that do not pass validation return errors:

From 13c411226cf989d7266ab517befaf30dddaee999 Mon Sep 17 00:00:00 2001
From: Ran
Date: Fri, 3 Jul 2020 12:22:47 +0800
Subject: [PATCH 26/28] Update generated-columns.md

---
 generated-columns.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/generated-columns.md b/generated-columns.md
index f3609e5b355f0..67d926d89fa0b 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -152,5 +152,6 @@ The current limitations of JSON and generated columns are as follows:
 - You cannot add the generated column through `ALTER TABLE`.
 - You can neither convert a stored generated column to a normal column through the `ALTER TABLE` statement nor convert a normal column to a stored generated column.
 - You cannot modify the expression of a stored generated column through the `ALTER TABLE` statement.
+- You cannot modify the expression of a stored generated column through the `ALTER TABLE` statement.
 - Not all [JSON functions](/functions-and-operators/json-functions.md) are supported;
 - Currently, the generated column index replacement rule is valid only when the generated column is a virtual generated column. It is not valid on the stored generated column, but the index can still be used by directly using the generated column itself.

From 13acf019e59b2ce18d87eff25e23cf284a96966a Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Mon, 6 Jul 2020 11:33:09 +0800
Subject: [PATCH 27/28] Update generated-columns.md

---
 generated-columns.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/generated-columns.md b/generated-columns.md
index 67d926d89fa0b..cb3db398407f5 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -149,9 +149,8 @@ desc select a+1 from t where a+1=3;

 The current limitations of JSON and generated columns are as follows:

-- You cannot add the generated column through `ALTER TABLE`.
+- You cannot add the stored generated column through `ALTER TABLE`.
 - You can neither convert a stored generated column to a normal column through the `ALTER TABLE` statement nor convert a normal column to a stored generated column.
 - You cannot modify the expression of a stored generated column through the `ALTER TABLE` statement.
-- You cannot modify the expression of a stored generated column through the `ALTER TABLE` statement.
 - Not all [JSON functions](/functions-and-operators/json-functions.md) are supported;
 - Currently, the generated column index replacement rule is valid only when the generated column is a virtual generated column. It is not valid on the stored generated column, but the index can still be used by directly using the generated column itself.

From ed9e2b3eb0eaef45a9a9f032158402c9ad090395 Mon Sep 17 00:00:00 2001
From: "Zefeng.Nie" <37355882+niezefeng@users.noreply.github.com>
Date: Mon, 6 Jul 2020 11:35:23 +0800
Subject: [PATCH 28/28] Update generated-columns.md

---
 generated-columns.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/generated-columns.md b/generated-columns.md
index cb3db398407f5..70761d4eb000d 100644
--- a/generated-columns.md
+++ b/generated-columns.md
@@ -149,7 +149,7 @@ desc select a+1 from t where a+1=3;

 The current limitations of JSON and generated columns are as follows:

-- You cannot add the stored generated column through `ALTER TABLE`.
+- You cannot add a stored generated column through `ALTER TABLE`.
 - You can neither convert a stored generated column to a normal column through the `ALTER TABLE` statement nor convert a normal column to a stored generated column.
 - You cannot modify the expression of a stored generated column through the `ALTER TABLE` statement.
 - Not all [JSON functions](/functions-and-operators/json-functions.md) are supported;