From 5a6982a99c85c44872a5d5b0a4494cea386436d6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Fri, 10 May 2024 11:49:43 +0200 Subject: [PATCH 01/12] Update statistics related things to use EBNF etc --- sql-statements/show-column-stats-usage.md | 56 +++++ sql-statements/show-stats-buckets.md | 65 +++++ sql-statements/sql-statement-drop-stats.md | 48 ++-- .../sql-statement-show-stats-healthy.md | 9 + .../sql-statement-show-stats-histograms.md | 22 +- .../sql-statement-show-stats-locked.md | 9 + .../sql-statement-show-stats-topn.md | 58 +++++ statistics.md | 237 ++---------------- 8 files changed, 265 insertions(+), 239 deletions(-) create mode 100644 sql-statements/show-column-stats-usage.md create mode 100644 sql-statements/show-stats-buckets.md create mode 100644 sql-statements/sql-statement-show-stats-topn.md diff --git a/sql-statements/show-column-stats-usage.md b/sql-statements/show-column-stats-usage.md new file mode 100644 index 0000000000000..8fa6e041b19c7 --- /dev/null +++ b/sql-statements/show-column-stats-usage.md @@ -0,0 +1,56 @@ +--- +title: SHOW COLUMN_STATS_USAGE +summary: An overview of the usage of SHOW COLUMN_STATS_USAGE for TiDB database. +--- + +# SHOW COLUMN_STATS_USAGE + +The `SHOW COLUMN_STATS_USAGE` statement returns the following 6 columns: + +| Column name | Description | +| -------- | ------------- | +| `Db_name` | The database name | +| `Table_name` | The table name | +| `Partition_name` | The partition name | +| `Column_name` | The column name | +| `Last_used_at` | The last time when the column statistics were used in the query optimization | +| `Last_analyzed_at` | The last time when the column statistics were collected | + +## Synopsis + +```ebnf+diagram +ShowColumnStatsUsageStmt ::= + "SHOW" "COLUMN_STATS_USAGE" ShowLikeOrWhere? 
+ +ShowLikeOrWhere ::= + "LIKE" SimpleExpr +| "WHERE" Expression +``` + +## Examples + +```sql +SHOW COLUMN_STATS_USAGE; +``` + +``` ++---------+------------+----------------+-------------+--------------+---------------------+ +| Db_name | Table_name | Partition_name | Column_name | Last_used_at | Last_analyzed_at | ++---------+------------+----------------+-------------+--------------+---------------------+ +| test | t1 | | id | NULL | 2024-05-10 11:04:23 | +| test | t1 | | b | NULL | 2024-05-10 11:04:23 | +| test | t1 | | pad | NULL | 2024-05-10 11:04:23 | +| test | t | | a | NULL | 2024-05-10 11:37:06 | +| test | t | | b | NULL | 2024-05-10 11:37:06 | ++---------+------------+----------------+-------------+--------------+---------------------+ +5 rows in set (0.00 sec) +``` + +## MySQL compatibility + +This statement is a TiDB extension to MySQL syntax. + +## See also + +* [ANALYZE](/sql-statements/sql-statement-analyze-table.md) +* [Introduction to Statistics](/statistics.md) \ No newline at end of file diff --git a/sql-statements/show-stats-buckets.md b/sql-statements/show-stats-buckets.md new file mode 100644 index 0000000000000..ba32975dd71cf --- /dev/null +++ b/sql-statements/show-stats-buckets.md @@ -0,0 +1,65 @@ +--- +title: SHOW STATS_BUCKETS +summary: An overview of the usage of SHOW STATS_BUCKETS for TiDB database. +--- + +# SHOW STATS_BUCKETS + +This statement returns information about all the buckets. 
+ +Currently, the `SHOW STATS_BUCKETS` statement returns the following 11 columns: + +| Column name | Description | +| :-------- | :------------- | +| `Db_name` | The database name | +| `Table_name` | The table name | +| `Partition_name` | The partition name | +| `Column_name` | The column name (when `is_index` is `0`) or the index name (when `is_index` is `1`) | +| `Is_index` | Whether it is an index column or not | +| `Bucket_id` | The ID of a bucket | +| `Count` | The number of all the values that falls on the bucket and the previous buckets | +| `Repeats` | The occurrence number of the maximum value | +| `Lower_bound` | The minimum value | +| `Upper_bound` | The maximum value | +| `Ndv` | The number of different values in the bucket. When `tidb_analyze_version` = `1`, `ndv` is always `0`, which has no actual meaning. | + +## Synopsis + +```ebnf+diagram +ShowStatsBucketsStmt ::= + "SHOW" "STATS_BUCKETS" ShowLikeOrWhere? + +ShowLikeOrWhere ::= + "LIKE" SimpleExpr +| "WHERE" Expression +``` + +## Examples + +```sql +SHOW STATS_BUCKETS WHERE Table_name='t'; +``` + +``` ++---------+------------+----------------+-------------+----------+-----------+-------+---------+--------------------------+--------------------------+------+ +| Db_name | Table_name | Partition_name | Column_name | Is_index | Bucket_id | Count | Repeats | Lower_Bound | Upper_Bound | Ndv | ++---------+------------+----------------+-------------+----------+-----------+-------+---------+--------------------------+--------------------------+------+ +| test | t | | a | 0 | 0 | 1 | 1 | 2023-12-27 00:00:00 | 2023-12-27 00:00:00 | 0 | +| test | t | | a | 0 | 1 | 2 | 1 | 2023-12-28 00:00:00 | 2023-12-28 00:00:00 | 0 | +| test | t | | ia | 1 | 0 | 1 | 1 | (NULL, 2) | (NULL, 2) | 0 | +| test | t | | ia | 1 | 1 | 2 | 1 | (NULL, 4) | (NULL, 4) | 0 | +| test | t | | ia | 1 | 2 | 3 | 1 | (2023-12-27 00:00:00, 1) | (2023-12-27 00:00:00, 1) | 0 | +| test | t | | ia | 1 | 3 | 4 | 1 | (2023-12-28 00:00:00, 3) | 
(2023-12-28 00:00:00, 3) | 0 | ++---------+------------+----------------+-------------+----------+-----------+-------+---------+--------------------------+--------------------------+------+ +6 rows in set (0.00 sec) +``` + + +## MySQL compatibility + +This statement is a TiDB extension to MySQL syntax. + +## See also + +* [ANALYZE](/sql-statements/sql-statement-analyze-table.md) +* [Introduction to Statistics](/statistics.md) \ No newline at end of file diff --git a/sql-statements/sql-statement-drop-stats.md b/sql-statements/sql-statement-drop-stats.md index 86ddcfd01bf93..c2aaf1a54ad82 100644 --- a/sql-statements/sql-statement-drop-stats.md +++ b/sql-statements/sql-statement-drop-stats.md @@ -12,10 +12,7 @@ The `DROP STATS` statement is used to delete the statistics of the selected tabl ```ebnf+diagram DropStatsStmt ::= - 'DROP' 'STATS' TableNameList - -TableNameList ::= - TableName ( ',' TableName )* + 'DROP' 'STATS' TableName ("PARTITION" partition | "GLOBAL")? ( ',' TableName )* TableName ::= Identifier ('.' Identifier)? 
@@ -23,23 +20,19 @@ TableName ::= ## Examples -{{< copyable "sql" >}} - ```sql CREATE TABLE t(a INT); ``` -```sql +``` Query OK, 0 rows affected (0.01 sec) ``` -{{< copyable "sql" >}} - ```sql SHOW STATS_META WHERE db_name='test' and table_name='t'; ``` -```sql +``` +---------+------------+----------------+---------------------+--------------+-----------+ | Db_name | Table_name | Partition_name | Update_time | Modify_count | Row_count | +---------+------------+----------------+---------------------+--------------+-----------+ @@ -48,25 +41,50 @@ SHOW STATS_META WHERE db_name='test' and table_name='t'; 1 row in set (0.00 sec) ``` -{{< copyable "sql" >}} - ```sql DROP STATS t; ``` +``` +Query OK, 0 rows affected (0.00 sec) +``` + +```sql +SHOW STATS_META WHERE db_name='test' and table_name='t'; +``` + +``` +Empty set (0.00 sec) +``` + ```sql +DROP STATS TableName +``` + +``` Query OK, 0 rows affected (0.00 sec) ``` -{{< copyable "sql" >}} +The preceding statement deletes all statistics of `TableName`. If a partitioned table is specified, this statement will delete statistics of all partitions in this table as well as GlobalStats generated in dynamic pruning mode. ```sql -SHOW STATS_META WHERE db_name='test' and table_name='t'; +DROP STATS TableName PARTITION PartitionNameList; ``` +``` +Query OK, 0 rows affected (0.00 sec) +``` + +This preceding statement only deletes statistics of the specified partitions in `PartitionNameList`. + ```sql -Empty set (0.00 sec) +DROP STATS TableName GLOBAL; +``` + +``` +Query OK, 0 rows affected (0.00 sec) ``` +The preceding statement only deletes GlobalStats generated in dynamic pruning mode of the specified table. 
## MySQL compatibility diff --git a/sql-statements/sql-statement-show-stats-healthy.md b/sql-statements/sql-statement-show-stats-healthy.md index 631e2d97c1d28..e397d352df723 100644 --- a/sql-statements/sql-statement-show-stats-healthy.md +++ b/sql-statements/sql-statement-show-stats-healthy.md @@ -9,6 +9,15 @@ The `SHOW STATS_HEALTHY` statement shows an estimation of how accurate statistic The health of a table can be improved by running the [`ANALYZE`](/sql-statements/sql-statement-analyze-table.md) statement. `ANALYZE` runs automatically when the health drops below the [`tidb_auto_analyze_ratio`](/system-variables.md#tidb_auto_analyze_ratio) threshold. +Currently, the `SHOW STATS_HEALTHY` statement outputs 4 columns: + +| Column name | Description | +| -------- | ------------- | +| Db_name | Database name | +| Table_name | Table name | +| Partition_name | Partition name | +| Healthy | Healthy percentage between 0 and 100 | + ## Synopsis ```ebnf+diagram diff --git a/sql-statements/sql-statement-show-stats-histograms.md b/sql-statements/sql-statement-show-stats-histograms.md index 3bc9cfe0b17d8..97a656c181e41 100644 --- a/sql-statements/sql-statement-show-stats-histograms.md +++ b/sql-statements/sql-statement-show-stats-histograms.md @@ -1,6 +1,6 @@ --- title: SHOW STATS_HISTOGRAMS -summary: An overview of the usage of SHOW HISTOGRAMS for TiDB database. +summary: An overview of the usage of SHOW STATS_HISTOGRAMS for TiDB database. aliases: ['/docs/dev/sql-statements/sql-statement-show-histograms/','/tidb/dev/sql-statement-show-histograms'] --- @@ -8,6 +8,26 @@ aliases: ['/docs/dev/sql-statements/sql-statement-show-histograms/','/tidb/dev/s This statement shows the histogram information collected by the [`ANALYZE` statement](/sql-statements/sql-statement-analyze-table.md) as part of database [statistics](/statistics.md). 
+Currently, the `SHOW STATS_HISTOGRAMS` statement outputs 15 columns: + +| Column name | Description | +| -------- | ------------- | +| Db_name | Database name | +| Table_name | Table name | +| Partition_name | Partition name | +| Column_name | Column name | +| Is_index | 1 if this is an index, else 0 | +| Update_time | Update time | +| Distinct_count | Distinct count | +| Null_count | NULL count | +| Avg_col_size | Average col size | +| Correlation | Correlation | +| Load_status | Load status like `allEvicted`, `allLoaded`, etc | +| Total_mem_usage | Total memory usage | +| Hist_mem_usage | Historical memory usage | +| Topn_mem_usage | TopN memory usage | +| Cms_mem_usage | CMS memory usage | + ## Synopsis ```ebnf+diagram diff --git a/sql-statements/sql-statement-show-stats-locked.md b/sql-statements/sql-statement-show-stats-locked.md index a3f9d66e41273..4e854fd25c41d 100644 --- a/sql-statements/sql-statement-show-stats-locked.md +++ b/sql-statements/sql-statement-show-stats-locked.md @@ -7,6 +7,15 @@ summary: An overview of the usage of SHOW STATS_LOCKED for the TiDB database. `SHOW STATS_LOCKED` shows the tables whose statistics are locked. +Currently, the `SHOW STATS_LOCKED` statement outputs 4 columns: + +| Column name | Description | +| -------- | ------------- | +| Db_name | Database name | +| Table_name | Table name | +| Partition_name | Partition name | +| Status | Status, e.g. `locked` | + ## Synopsis ```ebnf+diagram diff --git a/sql-statements/sql-statement-show-stats-topn.md b/sql-statements/sql-statement-show-stats-topn.md new file mode 100644 index 0000000000000..5f0767e1f494b --- /dev/null +++ b/sql-statements/sql-statement-show-stats-topn.md @@ -0,0 +1,58 @@ +--- +title: SHOW STATS_TOPN +summary: An overview of the usage of SHOW STATS_TOPN for TiDB database. 
+--- + +# SHOW STATS_TOPN + +Currently, the `SHOW STATS_TOPN` statement returns the following 7 columns: + +| Column name | Description | +| ---- | ----| +| `Db_name` | The database name | +| `Table_name` | The table name | +| `Partition_name` | The partition name | +| `Column_name` | The column name (when `is_index` is `0`) or the index name (when `is_index` is `1`) | +| `Is_index` | Whether it is an index column or not | +| `Value` | The value of this column | +| `Count` | How many times the value appears | + +## Synopsis + +```ebnf+diagram +ShowStatsTopnStmt ::= + "SHOW" "STATS_TOPN" ShowLikeOrWhere? + +ShowLikeOrWhere ::= + "LIKE" SimpleExpr +| "WHERE" Expression +``` + +## Example + +```sql +SHOW STATS_TOPN WHERE Table_name='t'; +``` + +``` ++---------+------------+----------------+-------------+----------+--------------------------+-------+ +| Db_name | Table_name | Partition_name | Column_name | Is_index | Value | Count | ++---------+------------+----------------+-------------+----------+--------------------------+-------+ +| test | t | | a | 0 | 2023-12-27 00:00:00 | 1 | +| test | t | | a | 0 | 2023-12-28 00:00:00 | 1 | +| test | t | | ia | 1 | (NULL, 2) | 1 | +| test | t | | ia | 1 | (NULL, 4) | 1 | +| test | t | | ia | 1 | (2023-12-27 00:00:00, 1) | 1 | +| test | t | | ia | 1 | (2023-12-28 00:00:00, 3) | 1 | ++---------+------------+----------------+-------------+----------+--------------------------+-------+ +6 rows in set (0.00 sec) +``` + +## MySQL compatibility + +This statement is a TiDB extension to MySQL syntax. 
+ +## See also + +* [ANALYZE](/sql-statements/sql-statement-analyze-table.md) +* [Introduction to Statistics](/statistics.md) \ No newline at end of file diff --git a/statistics.md b/statistics.md index 782278d92a9c5..66a7cfe58d246 100644 --- a/statistics.md +++ b/statistics.md @@ -12,7 +12,7 @@ TiDB uses statistics as input to the optimizer to estimate the number of rows pr ### Automatic update -For the `INSERT`, `DELETE`, or `UPDATE` statements, TiDB automatically updates the number of rows and modified rows in statistics. +For the [`INSERT`](/sql-statements/sql-statement-insert.md), [`DELETE`](/sql-statements/sql-statement-delete.md), or [`UPDATE`](/sql-statements/sql-statement-update.md) statements, TiDB automatically updates the number of rows and modified rows in statistics. @@ -132,8 +132,6 @@ If a table has many columns, collecting statistics on all the columns can cause - To collect statistics on specific columns, use the following syntax: - {{< copyable "sql" >}} - ```sql ANALYZE TABLE TableName COLUMNS ColumnNameList [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; ``` @@ -162,8 +160,6 @@ If a table has many columns, collecting statistics on all the columns can cause 2. After the query pattern of your business is relatively stable, collect statistics on `PREDICATE COLUMNS` by using the following syntax: - {{< copyable "sql" >}} - ```sql ANALYZE TABLE TableName PREDICATE COLUMNS [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; ``` @@ -172,13 +168,11 @@ If a table has many columns, collecting statistics on all the columns can cause > **Note:** > - > - If the `mysql.column_stats_usage` system table does not contain any `PREDICATE COLUMNS` recorded for that table, the preceding syntax collects statistics on all columns and all indexes in that table. 
+ > - If the [`mysql.column_stats_usage`](/mysql-schema.md) system table does not contain any `PREDICATE COLUMNS` recorded for that table, the preceding syntax collects statistics on all columns and all indexes in that table. > - Any columns excluded from collection (either by manually listing columns or using `PREDICATE COLUMNS`) will not have their statistics overwritten. When executing a new type of SQL query, the optimizer will use the old statistics for such columns if it exists or pseudo column statistics if columns never had statistics collected. The next ANALYZE using `PREDICATE COLUMNS` will collect the statistics on those columns. - To collect statistics on all columns and indexes, use the following syntax: - {{< copyable "sql" >}} - ```sql ANALYZE TABLE TableName ALL COLUMNS [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; ``` @@ -187,16 +181,12 @@ If a table has many columns, collecting statistics on all the columns can cause - To collect statistics on all partitions in `PartitionNameList` in `TableName`, use the following syntax: - {{< copyable "sql" >}} - ```sql ANALYZE TABLE TableName PARTITION PartitionNameList [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; ``` - To collect index statistics on all partitions in `PartitionNameList` in `TableName`, use the following syntax: - {{< copyable "sql" >}} - ```sql ANALYZE TABLE TableName PARTITION PartitionNameList INDEX [IndexNameList] [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; ``` @@ -207,8 +197,6 @@ If a table has many columns, collecting statistics on all the columns can cause > > Currently, collecting statistics on `PREDICATE COLUMNS` is an experimental feature. It is not recommended that you use it in production environments. 
- {{< copyable "sql" >}} - ```sql ANALYZE TABLE TableName PARTITION PartitionNameList [COLUMNS ColumnNameList|PREDICATE COLUMNS|ALL COLUMNS] [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; ``` @@ -342,22 +330,7 @@ If you want to persist the column configuration in the `ANALYZE` statement (incl - When TiDB collects statistics automatically or when you manually collect statistics by executing the `ANALYZE` statement without specifying the column configuration, TiDB continues using the previously persisted configuration for statistics collection. - When you manually execute the `ANALYZE` statement multiple times with column configuration specified, TiDB overwrites the previously recorded persistent configuration using the new configuration specified by the latest `ANALYZE` statement. -To locate `PREDICATE COLUMNS` and columns on which statistics have been collected, use the following syntax: - -```sql -SHOW COLUMN_STATS_USAGE [ShowLikeOrWhere]; -``` - -The `SHOW COLUMN_STATS_USAGE` statement returns the following 6 columns: - -| Column name | Description | -| -------- | ------------- | -| `Db_name` | The database name | -| `Table_name` | The table name | -| `Partition_name` | The partition name | -| `Column_name` | The column name | -| `Last_used_at` | The last time when the column statistics were used in the query optimization | -| `Last_analyzed_at` | The last time when the column statistics were collected | +To locate `PREDICATE COLUMNS` and columns on which statistics have been collected, use the [`SHOW COLUMN_STATS_USAGE`](/sql-statements/show-column-stats-usage.md) statement. In the following example, after executing `ANALYZE TABLE t PREDICATE COLUMNS;`, TiDB collects statistics on columns `b`, `c`, and `d`, where column `b` is a `PREDICATE COLUMN` and columns `c` and `d` are index columns. 
@@ -401,7 +374,7 @@ WHERE db_name = 'test' AND table_name = 't' AND last_analyzed_at IS NOT NULL; ## Versions of statistics -The `tidb_analyze_version` variable controls the statistics collected by TiDB. Currently, two versions of statistics are supported: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`. +The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, two versions of statistics are supported: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`. - For TiDB Self-Hosted, the default value of this variable changes from `1` to `2` starting from v5.3.0. - For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0. @@ -459,30 +432,7 @@ You can view the `ANALYZE` status and statistics information using the following ### `ANALYZE` state -When executing the `ANALYZE` statement, you can view the current state of `ANALYZE` using the following SQL statement: - -{{< copyable "sql" >}} - -```sql -SHOW ANALYZE STATUS [ShowLikeOrWhere] -``` - -This statement returns the state of `ANALYZE`. You can use `ShowLikeOrWhere` to filter the information you need. - -Currently, the `SHOW ANALYZE STATUS` statement returns the following 11 columns: - -| Column name | Description | -| :-------- | :------------- | -| table_schema | The database name | -| table_name | The table name | -| partition_name| The partition name | -| job_info | The task information. If an index is analyzed, this information will include the index name. When `tidb_analyze_version =2`, this information will include configuration items such as sample rate. | -| processed_rows | The number of rows that have been analyzed | -| start_time | The time at which the task starts | -| state | The state of a task, including `pending`, `running`, `finished`, and `failed` | -| fail_reason | The reason why the task fails. If the execution is successful, the value is `NULL`. 
| -| instance | The TiDB instance that executes the task | -| process_id | The process ID that executes the task | +When executing the `ANALYZE` statement, you can view the current state of `ANALYZE` using [`SHOW ANALYZE STATUS`](/sql-statements/sql-statement-show-analyze-status.md). Starting from TiDB v6.1.0, the `SHOW ANALYZE STATUS` statement supports showing cluster-level tasks. Even after a TiDB restart, you can still view task records before the restart using this statement. Before TiDB v6.1.0, the `SHOW ANALYZE STATUS` statement can only show instance-level tasks, and task records are cleared after a TiDB restart. @@ -503,172 +453,27 @@ mysql> SHOW ANALYZE STATUS [ShowLikeOrWhere]; ### Metadata of tables -You can use the `SHOW STATS_META` statement to view the total number of rows and the number of updated rows. - -{{< copyable "sql" >}} - -```sql -SHOW STATS_META [ShowLikeOrWhere]; -``` - -The syntax of `ShowLikeOrWhereOpt` is as follows: - -![ShowLikeOrWhereOpt](/media/sqlgram/ShowLikeOrWhereOpt.png) - -Currently, the `SHOW STATS_META` statement returns the following 6 columns: - -| Column name | Description | -| :-------- | :------------- | -| `db_name` | The database name | -| `table_name` | The table name | -| `partition_name`| The partition name | -| `update_time` | The time of the update | -| `modify_count` | The number of modified rows | -| `row_count` | The total number of rows | - -> **Note:** -> -> When TiDB automatically updates the total number of rows and the number of modified rows according to DML statements, `update_time` is also updated. Therefore, `update_time` does not necessarily indicate the last time when the `ANALYZE` statement is executed. +You can use the [`SHOW STATS_META`](/sql-statements/sql-statement-show-stats-meta.md) statement to view the total number of rows and the number of updated rows. 
### Health state of tables -You can use the `SHOW STATS_HEALTHY` statement to check the health state of tables and roughly estimate the accuracy of the statistics. When `modify_count` >= `row_count`, the health state is 0; when `modify_count` < `row_count`, the health state is (1 - `modify_count`/`row_count`) * 100. - -The syntax is as follows: - -{{< copyable "sql" >}} - -```sql -SHOW STATS_HEALTHY [ShowLikeOrWhere]; -``` - -The synopsis of `SHOW STATS_HEALTHY` is: - -![ShowStatsHealthy](/media/sqlgram/ShowStatsHealthy.png) - -Currently, the `SHOW STATS_HEALTHY` statement returns the following 4 columns: - -| Column name | Description | -| :-------- | :------------- | -| `db_name` | The database name | -| `table_name` | The table name | -| `partition_name` | The partition name | -| `healthy` | The health state of tables | +You can use the [`SHOW STATS_HEALTHY`](/sql-statements/sql-statement-show-stats-healthy.md) statement to check the health state of tables and roughly estimate the accuracy of the statistics. When `modify_count` >= `row_count`, the health state is 0; when `modify_count` < `row_count`, the health state is (1 - `modify_count`/`row_count`) * 100. ### Metadata of columns -You can use the `SHOW STATS_HISTOGRAMS` statement to view the number of different values and the number of `NULL` in all the columns. - -Syntax as follows: - -{{< copyable "sql" >}} - -```sql -SHOW STATS_HISTOGRAMS [ShowLikeOrWhere] -``` - -This statement returns the number of different values and the number of `NULL` in all the columns. You can use `ShowLikeOrWhere` to filter the information you need. 
- -Currently, the `SHOW STATS_HISTOGRAMS` statement returns the following 10 columns: - -| Column name | Description | -| :-------- | :------------- | -| `db_name` | The database name | -| `table_name` | The table name | -| `partition_name` | The partition name | -| `column_name` | The column name (when `is_index` is `0`) or the index name (when `is_index` is `1`) | -| `is_index` | Whether it is an index column or not | -| `update_time` | The time of the update | -| `distinct_count` | The number of different values | -| `null_count` | The number of `NULL` | -| `avg_col_size` | The average length of columns | -| correlation | The Pearson correlation coefficient of the column and the integer primary key, which indicates the degree of association between the two columns| +You can use the [`SHOW STATS_HISTOGRAMS`](/sql-statements/sql-statement-show-stats-histograms.md) statement to view the number of different values and the number of `NULL` in all the columns. ### Buckets of histogram -You can use the `SHOW STATS_BUCKETS` statement to view each bucket of the histogram. - -The syntax is as follows: - -{{< copyable "sql" >}} - -```sql -SHOW STATS_BUCKETS [ShowLikeOrWhere] -``` - -The diagram is as follows: - -![SHOW STATS_BUCKETS](/media/sqlgram/SHOW_STATS_BUCKETS.png) - -This statement returns information about all the buckets. You can use `ShowLikeOrWhere` to filter the information you need. 
- -Currently, the `SHOW STATS_BUCKETS` statement returns the following 11 columns: - -| Column name | Description | -| :-------- | :------------- | -| `db_name` | The database name | -| `table_name` | The table name | -| `partition_name` | The partition name | -| `column_name` | The column name (when `is_index` is `0`) or the index name (when `is_index` is `1`) | -| `is_index` | Whether it is an index column or not | -| `bucket_id` | The ID of a bucket | -| `count` | The number of all the values that falls on the bucket and the previous buckets | -| `repeats` | The occurrence number of the maximum value | -| `lower_bound` | The minimum value | -| `upper_bound` | The maximum value | -| `ndv` | The number of different values in the bucket. When `tidb_analyze_version` = `1`, `ndv` is always `0`, which has no actual meaning. | +You can use the [`SHOW STATS_BUCKETS`](/sql-statements/show-stats-buckets.md) statement to view each bucket of the histogram. ### Top-N information -You can use the `SHOW STATS_TOPN` statement to view the Top-N information currently collected by TiDB. - -The syntax is as follows: - -{{< copyable "sql" >}} - -```sql -SHOW STATS_TOPN [ShowLikeOrWhere]; -``` - -Currently, the `SHOW STATS_TOPN` statement returns the following 7 columns: - -| Column name | Description | -| ---- | ----| -| `db_name` | The database name | -| `table_name` | The table name | -| `partition_name` | The partition name | -| `column_name` | The column name (when `is_index` is `0`) or the index name (when `is_index` is `1`) | -| `is_index` | Whether it is an index column or not | -| `value` | The value of this column | -| `count` | How many times the value appears | +You can use the [`SHOW STATS_TOPN`](/sql-statements/sql-statement-show-stats-topn.md) statement to view the Top-N information currently collected by TiDB. ## Delete statistics -You can run the `DROP STATS` statement to delete statistics. 
- -{{< copyable "sql" >}} - -```sql -DROP STATS TableName -``` - -The preceding statement deletes all statistics of `TableName`. If a partitioned table is specified, this statement will delete statistics of all partitions in this table as well as GlobalStats generated in dynamic pruning mode. - -{{< copyable "sql" >}} - -```sql -DROP STATS TableName PARTITION PartitionNameList; -``` - -This preceding statement only deletes statistics of the specified partitions in `PartitionNameList`. - -{{< copyable "sql" >}} - -```sql -DROP STATS TableName GLOBAL; -``` - -The preceding statement only deletes GlobalStats generated in dynamic pruning mode of the specified table. +You can run the [`DROP STATS`](/sql-statements/sql-statement-drop-stats.md) statement to delete statistics. ## Load statistics @@ -726,24 +531,18 @@ The interface to export statistics is as follows: + To obtain the JSON format statistics of the `${table_name}` table in the `${db_name}` database: - {{< copyable "" >}} - ``` http://${tidb-server-ip}:${tidb-server-status-port}/stats/dump/${db_name}/${table_name} ``` For example: - {{< copyable "" >}} - - ``` + ```shell curl -s http://127.0.0.1:10080/stats/dump/test/t1 -o /tmp/t1.json ``` + To obtain the JSON format statistics of the `${table_name}` table in the `${db_name}` database at specific time: - {{< copyable "" >}} - ``` http://${tidb-server-ip}:${tidb-server-status-port}/stats/dump/${db_name}/${table_name}/${yyyyMMddHHmmss} ``` @@ -756,15 +555,7 @@ The interface to export statistics is as follows: Generally, the imported statistics refer to the JSON file obtained using the export interface. -Syntax: - -{{< copyable "sql" >}} - -``` -LOAD STATS 'file_name' -``` - -`file_name` is the file name of the statistics to be imported. +Loading statistics can be done with the [`LOAD STATS`](/sql-statements/sql-statement-load-stats.md) statement. 
## Lock statistics @@ -837,7 +628,7 @@ mysql> SHOW WARNINGS; 1 row in set (0.00 sec) ``` -In addition, you can also lock the statistics of a partition using `LOCK STATS`. For example: +In addition, you can also lock the statistics of a partition using [`LOCK STATS`](/sql-statements/sql-statement-lock-stats.md). For example: Create a partition table `t`, and insert data into it. When the statistics of partition `p1` are not locked, the `ANALYZE` statement can be successfully executed. From 723b9dd40cf1688c3ff9226b88a0cdb9c316cbf1 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?= Date: Fri, 10 May 2024 11:56:10 +0200 Subject: [PATCH 02/12] Small fixes --- sql-statements/show-stats-buckets.md | 1 - sql-statements/sql-statement-drop-stats.md | 1 + 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/sql-statements/show-stats-buckets.md b/sql-statements/show-stats-buckets.md index ba32975dd71cf..2c80a525d19cb 100644 --- a/sql-statements/show-stats-buckets.md +++ b/sql-statements/show-stats-buckets.md @@ -54,7 +54,6 @@ SHOW STATS_BUCKETS WHERE Table_name='t'; 6 rows in set (0.00 sec) ``` - ## MySQL compatibility This statement is a TiDB extension to MySQL syntax. diff --git a/sql-statements/sql-statement-drop-stats.md b/sql-statements/sql-statement-drop-stats.md index c2aaf1a54ad82..e9bce879e056f 100644 --- a/sql-statements/sql-statement-drop-stats.md +++ b/sql-statements/sql-statement-drop-stats.md @@ -84,6 +84,7 @@ DROP STATS TableName GLOBAL; ``` Query OK, 0 rows affected (0.00 sec) ``` + The preceding statement only deletes GlobalStats generated in dynamic pruning mode of the specified table. 
 ## MySQL compatibility

From 74e4e1b3e3148ee619b485870593710d1924a3f0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Fri, 10 May 2024 12:14:12 +0200
Subject: [PATCH 03/12] Update toc and correct filenames

---
 TOC-tidb-cloud.md | 11 +++++++----
 TOC.md | 3 +++
 ...ge.md => sql-statement-show-column-stats-usage.md} | 0
 ...buckets.md => sql-statement-show-stats-buckets.md} | 0
 4 files changed, 10 insertions(+), 4 deletions(-)
 rename sql-statements/{show-column-stats-usage.md => sql-statement-show-column-stats-usage.md} (100%)
 rename sql-statements/{show-stats-buckets.md => sql-statement-show-stats-buckets.md} (100%)

diff --git a/TOC-tidb-cloud.md b/TOC-tidb-cloud.md
index 27e0234bb1137..9d2a551b1c475 100644
--- a/TOC-tidb-cloud.md
+++ b/TOC-tidb-cloud.md
@@ -449,7 +449,8 @@
  - [`SHOW BUILTINS`](/sql-statements/sql-statement-show-builtins.md)
  - [`SHOW CHARACTER SET`](/sql-statements/sql-statement-show-character-set.md)
  - [`SHOW COLLATION`](/sql-statements/sql-statement-show-collation.md)
- - [`SHOW [FULL] COLUMNS FROM`](/sql-statements/sql-statement-show-columns-from.md)
+ - [`SHOW COLUMN_STATS_USAGE`](/sql-statements/sql-statement-show-column-stats-usage.md)
+ - [`SHOW COLUMNS FROM`](/sql-statements/sql-statement-show-columns-from.md)
  - [`SHOW CREATE DATABASE`](/sql-statements/sql-statement-show-create-database.md)
  - [`SHOW CREATE PLACEMENT POLICY`](/sql-statements/sql-statement-show-create-placement-policy.md)
  - [`SHOW CREATE RESOURCE GROUP`](/sql-statements/sql-statement-show-create-resource-group.md)
@@ -459,7 +460,7 @@
  - [`SHOW DATABASES`](/sql-statements/sql-statement-show-databases.md)
  - [`SHOW ENGINES`](/sql-statements/sql-statement-show-engines.md)
  - [`SHOW ERRORS`](/sql-statements/sql-statement-show-errors.md)
- - [`SHOW [FULL] FIELDS FROM`](/sql-statements/sql-statement-show-fields-from.md)
+ - [`SHOW FIELDS FROM`](/sql-statements/sql-statement-show-fields-from.md)
  - [`SHOW GRANTS`](/sql-statements/sql-statement-show-grants.md)
  - [`SHOW IMPORT JOB`](/sql-statements/sql-statement-show-import-job.md)
  - [`SHOW INDEXES [FROM|IN]`](/sql-statements/sql-statement-show-indexes.md)
@@ -469,18 +470,20 @@
  - [`SHOW PLACEMENT LABELS`](/sql-statements/sql-statement-show-placement-labels.md)
  - [`SHOW PLUGINS`](/sql-statements/sql-statement-show-plugins.md)
  - [`SHOW PRIVILEGES`](/sql-statements/sql-statement-show-privileges.md)
- - [`SHOW [FULL] PROCESSSLIST`](/sql-statements/sql-statement-show-processlist.md)
+ - [`SHOW PROCESSSLIST`](/sql-statements/sql-statement-show-processlist.md)
  - [`SHOW PROFILES`](/sql-statements/sql-statement-show-profiles.md)
  - [`SHOW SCHEMAS`](/sql-statements/sql-statement-show-schemas.md)
+ - [`SHOW STATS_BUCKETS`](/sql-statements/sql-statement-show-stats-buckets.md)
  - [`SHOW STATS_HEALTHY`](/sql-statements/sql-statement-show-stats-healthy.md)
  - [`SHOW STATS_HISTOGRAMS`](/sql-statements/sql-statement-show-stats-histograms.md)
  - [`SHOW STATS_LOCKED`](/sql-statements/sql-statement-show-stats-locked.md)
  - [`SHOW STATS_META`](/sql-statements/sql-statement-show-stats-meta.md)
+ - [`SHOW STATS_TOPN`](/sql-statements/sql-statement-show-stats-topn.md)
  - [`SHOW STATUS`](/sql-statements/sql-statement-show-status.md)
  - [`SHOW TABLE NEXT_ROW_ID`](/sql-statements/sql-statement-show-table-next-rowid.md)
  - [`SHOW TABLE REGIONS`](/sql-statements/sql-statement-show-table-regions.md)
  - [`SHOW TABLE STATUS`](/sql-statements/sql-statement-show-table-status.md)
- - [`SHOW [FULL] TABLES`](/sql-statements/sql-statement-show-tables.md)
+ - [`SHOW TABLES`](/sql-statements/sql-statement-show-tables.md)
  - [`SHOW [GLOBAL|SESSION] VARIABLES`](/sql-statements/sql-statement-show-variables.md)
  - [`SHOW WARNINGS`](/sql-statements/sql-statement-show-warnings.md)
  - [`SPLIT REGION`](/sql-statements/sql-statement-split-region.md)
diff --git a/TOC.md b/TOC.md
index 7e64f636574dc..115236559a7a4 100644
--- a/TOC.md
+++ b/TOC.md
@@ -823,6 +823,7 @@
  - [`SHOW BUILTINS`](/sql-statements/sql-statement-show-builtins.md)
  - [`SHOW CHARACTER SET`](/sql-statements/sql-statement-show-character-set.md)
  - [`SHOW COLLATION`](/sql-statements/sql-statement-show-collation.md)
+ - [`SHOW COLUMN_STATS_USAGE`](/sql-statements/sql-statement-show-column-stats-usage.md)
  - [`SHOW COLUMNS FROM`](/sql-statements/sql-statement-show-columns-from.md)
  - [`SHOW CONFIG`](/sql-statements/sql-statement-show-config.md)
  - [`SHOW CREATE DATABASE`](/sql-statements/sql-statement-show-create-database.md)
@@ -849,10 +850,12 @@
  - [`SHOW PROFILES`](/sql-statements/sql-statement-show-profiles.md)
  - [`SHOW PUMP STATUS`](/sql-statements/sql-statement-show-pump-status.md)
  - [`SHOW SCHEMAS`](/sql-statements/sql-statement-show-schemas.md)
+ - [`SHOW STATS_BUCKETS`](/sql-statements/sql-statement-show-stats-buckets.md)
  - [`SHOW STATS_HEALTHY`](/sql-statements/sql-statement-show-stats-healthy.md)
  - [`SHOW STATS_HISTOGRAMS`](/sql-statements/sql-statement-show-stats-histograms.md)
  - [`SHOW STATS_LOCKED`](/sql-statements/sql-statement-show-stats-locked.md)
  - [`SHOW STATS_META`](/sql-statements/sql-statement-show-stats-meta.md)
+ - [`SHOW STATS_TOPN`](/sql-statements/sql-statement-show-stats-topn.md)
  - [`SHOW STATUS`](/sql-statements/sql-statement-show-status.md)
  - [`SHOW TABLE NEXT_ROW_ID`](/sql-statements/sql-statement-show-table-next-rowid.md)
  - [`SHOW TABLE REGIONS`](/sql-statements/sql-statement-show-table-regions.md)
diff --git a/sql-statements/show-column-stats-usage.md b/sql-statements/sql-statement-show-column-stats-usage.md
similarity index 100%
rename from sql-statements/show-column-stats-usage.md
rename to sql-statements/sql-statement-show-column-stats-usage.md
diff --git a/sql-statements/show-stats-buckets.md b/sql-statements/sql-statement-show-stats-buckets.md
similarity index 100%
rename from sql-statements/show-stats-buckets.md
rename to sql-statements/sql-statement-show-stats-buckets.md

From 0be069ba92a95e38dd9be1f237092d3b6f70bb0b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Fri, 10 May 2024 12:34:39 +0200
Subject: [PATCH 04/12] Fixup links

---
 statistics.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/statistics.md b/statistics.md
index 66a7cfe58d246..0eca125f5d832 100644
--- a/statistics.md
+++ b/statistics.md
@@ -330,7 +330,7 @@ If you want to persist the column configuration in the `ANALYZE` statement (incl
 - When TiDB collects statistics automatically or when you manually collect statistics by executing the `ANALYZE` statement without specifying the column configuration, TiDB continues using the previously persisted configuration for statistics collection.
 - When you manually execute the `ANALYZE` statement multiple times with column configuration specified, TiDB overwrites the previously recorded persistent configuration using the new configuration specified by the latest `ANALYZE` statement.

-To locate `PREDICATE COLUMNS` and columns on which statistics have been collected, use the [`SHOW COLUMN_STATS_USAGE`](/sql-statements/show-column-stats-usage.md) statement.
+To locate `PREDICATE COLUMNS` and columns on which statistics have been collected, use the [`SHOW COLUMN_STATS_USAGE`](/sql-statements/sql-statement-show-column-stats-usage.md) statement.

 In the following example, after executing `ANALYZE TABLE t PREDICATE COLUMNS;`, TiDB collects statistics on columns `b`, `c`, and `d`, where column `b` is a `PREDICATE COLUMN` and columns `c` and `d` are index columns.

@@ -465,7 +465,7 @@ You can use the [`SHOW STATS_HISTOGRAMS`](/sql-statements/sql-statement-show-sta

 ### Buckets of histogram

-You can use the [`SHOW STATS_BUCKETS`](/sql-statements/show-stats-buckets.md) statement to view each bucket of the histogram.
+You can use the [`SHOW STATS_BUCKETS`](/sql-statements/sql-statement-show-stats-buckets.md statement to view each bucket of the histogram.
 ### Top-N information

From 0a811d3e0d33aee0b2a4a739799fecaf4d538d47 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Thu, 16 May 2024 11:17:33 +0200
Subject: [PATCH 05/12] Update sql-statements/sql-statement-show-column-stats-usage.md

Co-authored-by: Grace Cai
---
 sql-statements/sql-statement-show-column-stats-usage.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/sql-statements/sql-statement-show-column-stats-usage.md b/sql-statements/sql-statement-show-column-stats-usage.md
index 8fa6e041b19c7..e34a094248cba 100644
--- a/sql-statements/sql-statement-show-column-stats-usage.md
+++ b/sql-statements/sql-statement-show-column-stats-usage.md
@@ -5,7 +5,9 @@ summary: An overview of the usage of SHOW COLUMN_STATS_USAGE for TiDB database.

 # SHOW COLUMN_STATS_USAGE

-The `SHOW COLUMN_STATS_USAGE` statement returns the following 6 columns:
+The `SHOW COLUMN_STATS_USAGE` statement shows the last usage time and collection time of column statistics. You can also use it to locate `PREDICATE COLUMNS` and columns on which statistics have been collected.
+
+Currently, the `SHOW COLUMN_STATS_USAGE` statement returns the following columns:

 | Column name | Description |
 | -------- | ------------- |

From 8fa7e07bea228bf90fedff489d47631c8921c862 Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Thu, 16 May 2024 17:24:47 +0800
Subject: [PATCH 06/12] drop-stats: add a usage section

---
 sql-statements/sql-statement-drop-stats.md | 50 +++++++++++-----------
 1 file changed, 26 insertions(+), 24 deletions(-)

diff --git a/sql-statements/sql-statement-drop-stats.md b/sql-statements/sql-statement-drop-stats.md
index e9bce879e056f..8aeed832f72e6 100644
--- a/sql-statements/sql-statement-drop-stats.md
+++ b/sql-statements/sql-statement-drop-stats.md
@@ -18,75 +18,77 @@ TableName ::=
 Identifier ('.' Identifier)?
 ```

-## Examples
+## Usage
+
+The following statement deletes all statistics of `TableName`. If a partitioned table is specified, this statement deletes statistics of all partitions in this table as well as [GlobalStats generated in dynamic pruning mode](/statistics.md#collect-statistics-of-partitioned-tables-in-dynamic-pruning-mode).

 ```sql
-CREATE TABLE t(a INT);
+DROP STATS TableName
 ```

 ```
-Query OK, 0 rows affected (0.01 sec)
+Query OK, 0 rows affected (0.00 sec)
 ```

+The following statement only deletes statistics of the specified partitions in `PartitionNameList`.
+
 ```sql
-SHOW STATS_META WHERE db_name='test' and table_name='t';
+DROP STATS TableName PARTITION PartitionNameList;
 ```

 ```
-+---------+------------+----------------+---------------------+--------------+-----------+
-| Db_name | Table_name | Partition_name | Update_time | Modify_count | Row_count |
-+---------+------------+----------------+---------------------+--------------+-----------+
-| test | t | | 2020-05-25 20:34:33 | 0 | 0 |
-+---------+------------+----------------+---------------------+--------------+-----------+
-1 row in set (0.00 sec)
+Query OK, 0 rows affected (0.00 sec)
 ```

+The following statement only deletes GlobalStats generated in dynamic pruning mode of the specified table.
+
 ```sql
-DROP STATS t;
+DROP STATS TableName GLOBAL;
 ```

 ```
 Query OK, 0 rows affected (0.00 sec)
 ```

+## Examples
+
 ```sql
-SHOW STATS_META WHERE db_name='test' and table_name='t';
+CREATE TABLE t(a INT);
 ```

 ```
-Empty set (0.00 sec)
+Query OK, 0 rows affected (0.01 sec)
 ```

 ```sql
-DROP STATS TableName
+SHOW STATS_META WHERE db_name='test' and table_name='t';
 ```

 ```
-Query OK, 0 rows affected (0.00 sec)
++---------+------------+----------------+---------------------+--------------+-----------+
+| Db_name | Table_name | Partition_name | Update_time | Modify_count | Row_count |
++---------+------------+----------------+---------------------+--------------+-----------+
+| test | t | | 2020-05-25 20:34:33 | 0 | 0 |
++---------+------------+----------------+---------------------+--------------+-----------+
+1 row in set (0.00 sec)
 ```

-The preceding statement deletes all statistics of `TableName`. If a partitioned table is specified, this statement will delete statistics of all partitions in this table as well as GlobalStats generated in dynamic pruning mode.
-
 ```sql
-DROP STATS TableName PARTITION PartitionNameList;
+DROP STATS t;
 ```

 ```
 Query OK, 0 rows affected (0.00 sec)
 ```

-This preceding statement only deletes statistics of the specified partitions in `PartitionNameList`.
-
 ```sql
-DROP STATS TableName GLOBAL;
+SHOW STATS_META WHERE db_name='test' and table_name='t';
 ```

 ```
-Query OK, 0 rows affected (0.00 sec)
+Empty set (0.00 sec)
 ```

-The preceding statement only deletes GlobalStats generated in dynamic pruning mode of the specified table.
-
 ## MySQL compatibility

 This statement is a TiDB extension to MySQL syntax.

From 7215e9a377977e4021e45a7c70ae1d9553f16e13 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Thu, 16 May 2024 12:03:23 +0200
Subject: [PATCH 07/12] Apply suggestions from code review

Co-authored-by: Grace Cai
---
 sql-statements/sql-statement-show-stats-buckets.md | 4 ++--
 sql-statements/sql-statement-show-stats-healthy.md | 2 +-
 sql-statements/sql-statement-show-stats-histograms.md | 2 +-
 sql-statements/sql-statement-show-stats-locked.md | 4 ++--
 sql-statements/sql-statement-show-stats-topn.md | 4 +++-
 5 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/sql-statements/sql-statement-show-stats-buckets.md b/sql-statements/sql-statement-show-stats-buckets.md
index 2c80a525d19cb..7e2a714bdfa42 100644
--- a/sql-statements/sql-statement-show-stats-buckets.md
+++ b/sql-statements/sql-statement-show-stats-buckets.md
@@ -5,9 +5,9 @@ summary: An overview of the usage of SHOW STATS_BUCKETS for TiDB database.

 # SHOW STATS_BUCKETS

-This statement returns information about all the buckets.
+The `SHOW STATS_BUCKETS` statement shows the bucket information in [statistics](/statistics.md).

-Currently, the `SHOW STATS_BUCKETS` statement returns the following 11 columns:
+Currently, the `SHOW STATS_BUCKETS` statement returns the following columns:

 | Column name | Description |
 | :-------- | :------------- |
diff --git a/sql-statements/sql-statement-show-stats-healthy.md b/sql-statements/sql-statement-show-stats-healthy.md
index e397d352df723..641a4f569352b 100644
--- a/sql-statements/sql-statement-show-stats-healthy.md
+++ b/sql-statements/sql-statement-show-stats-healthy.md
@@ -9,7 +9,7 @@ The `SHOW STATS_HEALTHY` statement shows an estimation of how accurate statistic

 The health of a table can be improved by running the [`ANALYZE`](/sql-statements/sql-statement-analyze-table.md) statement. `ANALYZE` runs automatically when the health drops below the [`tidb_auto_analyze_ratio`](/system-variables.md#tidb_auto_analyze_ratio) threshold.

-Currently, the `SHOW STATS_HEALTHY` statement outputs 4 columns:
+Currently, the `SHOW STATS_HEALTHY` statement returns the following columns:

 | Column name | Description |
 | -------- | ------------- |
diff --git a/sql-statements/sql-statement-show-stats-histograms.md b/sql-statements/sql-statement-show-stats-histograms.md
index 97a656c181e41..27dce935f323f 100644
--- a/sql-statements/sql-statement-show-stats-histograms.md
+++ b/sql-statements/sql-statement-show-stats-histograms.md
@@ -8,7 +8,7 @@ aliases: ['/docs/dev/sql-statements/sql-statement-show-histograms/','/tidb/dev/s

 This statement shows the histogram information collected by the [`ANALYZE` statement](/sql-statements/sql-statement-analyze-table.md) as part of database [statistics](/statistics.md).

-Currently, the `SHOW STATS_HISTOGRAMS` statement outputs 15 columns:
+Currently, the `SHOW STATS_HISTOGRAMS` statement returns the following columns:

 | Column name | Description |
 | -------- | ------------- |
diff --git a/sql-statements/sql-statement-show-stats-locked.md b/sql-statements/sql-statement-show-stats-locked.md
index 4e854fd25c41d..e66d535f38a40 100644
--- a/sql-statements/sql-statement-show-stats-locked.md
+++ b/sql-statements/sql-statement-show-stats-locked.md
@@ -7,14 +7,14 @@ summary: An overview of the usage of SHOW STATS_LOCKED for the TiDB database.

 `SHOW STATS_LOCKED` shows the tables whose statistics are locked.

-Currently, the `SHOW STATS_LOCKED` statement outputs 4 columns:
+Currently, the `SHOW STATS_LOCKED` statement returns the following columns:

 | Column name | Description |
 | -------- | ------------- |
 | Db_name | Database name |
 | Table_name | Table name |
 | Partition_name | Partition name |
-| Status | Status, e.g. `locked` |
+| Status | Statistics status, such as `locked` |

 ## Synopsis

diff --git a/sql-statements/sql-statement-show-stats-topn.md b/sql-statements/sql-statement-show-stats-topn.md
index 5f0767e1f494b..bddc711cd8774 100644
--- a/sql-statements/sql-statement-show-stats-topn.md
+++ b/sql-statements/sql-statement-show-stats-topn.md
@@ -5,7 +5,9 @@ summary: An overview of the usage of SHOW STATS_TOPN for TiDB database.

 # SHOW STATS_TOPN

-Currently, the `SHOW STATS_TOPN` statement returns the following 7 columns:
+The `SHOW STATS_TOPN` statement shows the Top-N information in [statistics](/statistics.md).
+
+Currently, the `SHOW STATS_TOPN` statement returns the following columns:

 | Column name | Description |
 | ---- | ----|

From d07d2a81fe91813b7778cae03f9e26977521a82a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Thu, 16 May 2024 11:52:43 +0100
Subject: [PATCH 08/12] Add column description for SHOW ANALYZE STATUS back

---
 .../sql-statement-show-analyze-status.md | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/sql-statements/sql-statement-show-analyze-status.md b/sql-statements/sql-statement-show-analyze-status.md
index 087a6e8819176..ea82ae75fcbd2 100644
--- a/sql-statements/sql-statement-show-analyze-status.md
+++ b/sql-statements/sql-statement-show-analyze-status.md
@@ -14,6 +14,21 @@ Starting from TiDB v6.1.0, you can view the history tasks within the last 7 days

 Starting from TiDB v7.3.0, you can view the progress of the current `ANALYZE` task through the system table `mysql.analyze_jobs` or `SHOW ANALYZE STATUS`.

+Currently, the `SHOW ANALYZE STATUS` statement returns the following columns:
+
+| Column name | Description |
+| :--------------- | :------------- |
+| `Table_schema` | The database name |
+| `Table_name` | The table name |
+| `Partition_name` | The partition name |
+| `Job_info` | The task information. If an index is analyzed, this information will include the index name. When `tidb_analyze_version =2`, this information will include configuration items such as sample rate. |
+| `Processed_rows` | The number of rows that have been analyzed |
+| `Start_time` | The time at which the task starts |
+| `State` | The state of a task, including `pending`, `running`, `finished`, and `failed` |
+| `Fail_reason` | The reason why the task fails. If the execution is successful, the value is `NULL`. |
+| `Instance` | The TiDB instance that executes the task |
+| `Process_id` | The process ID that executes the task |
+
 ## Synopsis

 ```ebnf+diagram
@@ -72,4 +87,4 @@ This statement is a TiDB extension to MySQL syntax.

 ## See also

-* [ANALYZE_STATUS table](/information-schema/information-schema-analyze-status.md)
\ No newline at end of file
+* [ANALYZE_STATUS table](/information-schema/information-schema-analyze-status.md)

From 63f7c33857a9d14f01ce0beca1ad1d6e6f38d01e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Wed, 22 May 2024 10:46:48 +0200
Subject: [PATCH 09/12] Update statistics.md

Co-authored-by: Grace Cai
---
 statistics.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/statistics.md b/statistics.md
index 0eca125f5d832..dbb20106be58b 100644
--- a/statistics.md
+++ b/statistics.md
@@ -557,6 +557,15 @@ Generally, the imported statistics refer to the JSON file obtained using the exp

 Loading statistics can be done with the [`LOAD STATS`](/sql-statements/sql-statement-load-stats.md) statement.

+For example:
+
+```sql
+LOAD STATS 'file_name'
+```
+
+`file_name` is the file name of the statistics to be imported.
+
+
 ## Lock statistics

 Starting from v6.5.0, TiDB supports locking statistics. After the statistics of a table or a partition are locked, the statistics of the table cannot be modified and the `ANALYZE` statement cannot be executed on the table. For example:

From dca3b5a2cb819185486c4602a9f9118af3d54c4c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Wed, 22 May 2024 10:52:50 +0200
Subject: [PATCH 10/12] Fix MD012

---
 statistics.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/statistics.md b/statistics.md
index dbb20106be58b..18292a30ea5b9 100644
--- a/statistics.md
+++ b/statistics.md
@@ -565,7 +565,6 @@ LOAD STATS 'file_name'

 `file_name` is the file name of the statistics to be imported.

-
 ## Lock statistics

 Starting from v6.5.0, TiDB supports locking statistics. After the statistics of a table or a partition are locked, the statistics of the table cannot be modified and the `ANALYZE` statement cannot be executed on the table. For example:

From ddf7e0744db4f95acfad509e998dfb606f512864 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Wed, 22 May 2024 15:48:37 +0200
Subject: [PATCH 11/12] Update sql-statements/sql-statement-show-stats-topn.md

Co-authored-by: xixirangrang
---
 sql-statements/sql-statement-show-stats-topn.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sql-statements/sql-statement-show-stats-topn.md b/sql-statements/sql-statement-show-stats-topn.md
index bddc711cd8774..767c4de4daa08 100644
--- a/sql-statements/sql-statement-show-stats-topn.md
+++ b/sql-statements/sql-statement-show-stats-topn.md
@@ -56,5 +56,5 @@ This statement is a TiDB extension to MySQL syntax.

 ## See also

-* [ANALYZE](/sql-statements/sql-statement-analyze-table.md)
+* [`ANALYZE`](/sql-statements/sql-statement-analyze-table.md)
 * [Introduction to Statistics](/statistics.md)
\ No newline at end of file

From 292e1bc6d8c8e4c36e32e959770e0f9ce2b77afd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dani=C3=ABl=20van=20Eeden?=
Date: Wed, 22 May 2024 15:49:42 +0200
Subject: [PATCH 12/12] Apply suggestions from code review

Co-authored-by: xixirangrang
---
 .../sql-statement-show-column-stats-usage.md | 2 +-
 .../sql-statement-show-stats-buckets.md | 4 +--
 .../sql-statement-show-stats-healthy.md | 8 +++---
 .../sql-statement-show-stats-histograms.md | 28 +++++++++----------
 .../sql-statement-show-stats-locked.md | 8 +++---
 5 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/sql-statements/sql-statement-show-column-stats-usage.md b/sql-statements/sql-statement-show-column-stats-usage.md
index e34a094248cba..b7d28adc4d04e 100644
--- a/sql-statements/sql-statement-show-column-stats-usage.md
+++ b/sql-statements/sql-statement-show-column-stats-usage.md
@@ -54,5 +54,5 @@ This statement is a TiDB extension to MySQL syntax.

 ## See also

-* [ANALYZE](/sql-statements/sql-statement-analyze-table.md)
+* [`ANALYZE`](/sql-statements/sql-statement-analyze-table.md)
 * [Introduction to Statistics](/statistics.md)
\ No newline at end of file
diff --git a/sql-statements/sql-statement-show-stats-buckets.md b/sql-statements/sql-statement-show-stats-buckets.md
index 7e2a714bdfa42..6d8540b2adb18 100644
--- a/sql-statements/sql-statement-show-stats-buckets.md
+++ b/sql-statements/sql-statement-show-stats-buckets.md
@@ -21,7 +21,7 @@ Currently, the `SHOW STATS_BUCKETS` statement returns the following columns:
 | `Repeats` | The occurrence number of the maximum value |
 | `Lower_bound` | The minimum value |
 | `Upper_bound` | The maximum value |
-| `Ndv` | The number of different values in the bucket. When `tidb_analyze_version` = `1`, `ndv` is always `0`, which has no actual meaning. |
+| `Ndv` | The number of different values in the bucket. When `tidb_analyze_version` = `1`, `Ndv` is always `0`, which has no actual meaning. |

 ## Synopsis

@@ -60,5 +60,5 @@ This statement is a TiDB extension to MySQL syntax.

 ## See also

-* [ANALYZE](/sql-statements/sql-statement-analyze-table.md)
+* [`ANALYZE`](/sql-statements/sql-statement-analyze-table.md)
 * [Introduction to Statistics](/statistics.md)
\ No newline at end of file
diff --git a/sql-statements/sql-statement-show-stats-healthy.md b/sql-statements/sql-statement-show-stats-healthy.md
index 641a4f569352b..0339aac26fdd7 100644
--- a/sql-statements/sql-statement-show-stats-healthy.md
+++ b/sql-statements/sql-statement-show-stats-healthy.md
@@ -13,10 +13,10 @@ Currently, the `SHOW STATS_HEALTHY` statement returns the following columns:

 | Column name | Description |
 | -------- | ------------- |
-| Db_name | Database name |
-| Table_name | Table name |
-| Partition_name | Partition name |
-| Healthy | Healthy percentage between 0 and 100 |
+| `Db_name` | The database name |
+| `Table_name` | The table name |
+| `Partition_name` | The partition name |
+| `Healthy` | The healthy percentage between 0 and 100 |

 ## Synopsis

diff --git a/sql-statements/sql-statement-show-stats-histograms.md b/sql-statements/sql-statement-show-stats-histograms.md
index 27dce935f323f..2214d15a2abb8 100644
--- a/sql-statements/sql-statement-show-stats-histograms.md
+++ b/sql-statements/sql-statement-show-stats-histograms.md
@@ -13,20 +13,20 @@ Currently, the `SHOW STATS_HISTOGRAMS` statement returns the following columns:

 | Column name | Description |
 | -------- | ------------- |
 | Db_name | Database name |
-| Table_name | Table name |
-| Partition_name | Partition name |
-| Column_name | Column name |
-| Is_index | 1 if this is an index, else 0 |
-| Update_time | Update time |
-| Distinct_count | Distinct count |
-| Null_count | NULL count |
-| Avg_col_size | Average col size |
-| Correlation | Correlation |
-| Load_status | Load status like `allEvicted`, `allLoaded`, etc |
-| Total_mem_usage | Total memory usage |
-| Hist_mem_usage | Historical memory usage |
-| Topn_mem_usage | TopN memory usage |
-| Cms_mem_usage | CMS memory usage |
+| `Table_name` | The table name |
+| `Partition_name` | The partition name |
+| `Column_name` | The column name |
+| `Is_index` | Whether it is an index column or not |
+| `Update_time` | The update time |
+| `Distinct_count` | The distinct count |
+| `Null_count` | NULL count |
+| `Avg_col_size` | The average col size |
+| `Correlation` | Correlation |
+| `Load_status` | Load status, such as `allEvicted` and `allLoaded` |
+| `Total_mem_usage` | The total memory usage |
+| `Hist_mem_usage` | The historical memory usage |
+| `Topn_mem_usage` | The TopN memory usage |
+| `Cms_mem_usage` | The CMS memory usage |

 ## Synopsis

diff --git a/sql-statements/sql-statement-show-stats-locked.md b/sql-statements/sql-statement-show-stats-locked.md
index e66d535f38a40..165578f5dfcd8 100644
--- a/sql-statements/sql-statement-show-stats-locked.md
+++ b/sql-statements/sql-statement-show-stats-locked.md
@@ -11,10 +11,10 @@ Currently, the `SHOW STATS_LOCKED` statement returns the following columns:

 | Column name | Description |
 | -------- | ------------- |
-| Db_name | Database name |
-| Table_name | Table name |
-| Partition_name | Partition name |
-| Status | Statistics status, such as `locked` |
+| `Db_name` | The database name |
+| `Table_name` | The table name |
+| `Partition_name` | The partition name |
+| `Status` | The statistics status, such as `locked` |

 ## Synopsis