diff --git a/TOC-tidb-cloud.md b/TOC-tidb-cloud.md index f94cf4d8b4e02..b3ccb6d4d76ef 100644 --- a/TOC-tidb-cloud.md +++ b/TOC-tidb-cloud.md @@ -451,7 +451,8 @@ - [`SHOW BUILTINS`](/sql-statements/sql-statement-show-builtins.md) - [`SHOW CHARACTER SET`](/sql-statements/sql-statement-show-character-set.md) - [`SHOW COLLATION`](/sql-statements/sql-statement-show-collation.md) - - [`SHOW [FULL] COLUMNS FROM`](/sql-statements/sql-statement-show-columns-from.md) + - [`SHOW COLUMN_STATS_USAGE`](/sql-statements/sql-statement-show-column-stats-usage.md) + - [`SHOW COLUMNS FROM`](/sql-statements/sql-statement-show-columns-from.md) - [`SHOW CREATE DATABASE`](/sql-statements/sql-statement-show-create-database.md) - [`SHOW CREATE PLACEMENT POLICY`](/sql-statements/sql-statement-show-create-placement-policy.md) - [`SHOW CREATE RESOURCE GROUP`](/sql-statements/sql-statement-show-create-resource-group.md) @@ -461,7 +462,7 @@ - [`SHOW DATABASES`](/sql-statements/sql-statement-show-databases.md) - [`SHOW ENGINES`](/sql-statements/sql-statement-show-engines.md) - [`SHOW ERRORS`](/sql-statements/sql-statement-show-errors.md) - - [`SHOW [FULL] FIELDS FROM`](/sql-statements/sql-statement-show-fields-from.md) + - [`SHOW FIELDS FROM`](/sql-statements/sql-statement-show-fields-from.md) - [`SHOW GRANTS`](/sql-statements/sql-statement-show-grants.md) - [`SHOW IMPORT JOB`](/sql-statements/sql-statement-show-import-job.md) - [`SHOW INDEXES [FROM|IN]`](/sql-statements/sql-statement-show-indexes.md) @@ -471,18 +472,20 @@ - [`SHOW PLACEMENT LABELS`](/sql-statements/sql-statement-show-placement-labels.md) - [`SHOW PLUGINS`](/sql-statements/sql-statement-show-plugins.md) - [`SHOW PRIVILEGES`](/sql-statements/sql-statement-show-privileges.md) - - [`SHOW [FULL] PROCESSSLIST`](/sql-statements/sql-statement-show-processlist.md) + - [`SHOW PROCESSLIST`](/sql-statements/sql-statement-show-processlist.md) - [`SHOW PROFILES`](/sql-statements/sql-statement-show-profiles.md) - [`SHOW 
SCHEMAS`](/sql-statements/sql-statement-show-schemas.md) + - [`SHOW STATS_BUCKETS`](/sql-statements/sql-statement-show-stats-buckets.md) - [`SHOW STATS_HEALTHY`](/sql-statements/sql-statement-show-stats-healthy.md) - [`SHOW STATS_HISTOGRAMS`](/sql-statements/sql-statement-show-stats-histograms.md) - [`SHOW STATS_LOCKED`](/sql-statements/sql-statement-show-stats-locked.md) - [`SHOW STATS_META`](/sql-statements/sql-statement-show-stats-meta.md) + - [`SHOW STATS_TOPN`](/sql-statements/sql-statement-show-stats-topn.md) - [`SHOW STATUS`](/sql-statements/sql-statement-show-status.md) - [`SHOW TABLE NEXT_ROW_ID`](/sql-statements/sql-statement-show-table-next-rowid.md) - [`SHOW TABLE REGIONS`](/sql-statements/sql-statement-show-table-regions.md) - [`SHOW TABLE STATUS`](/sql-statements/sql-statement-show-table-status.md) - - [`SHOW [FULL] TABLES`](/sql-statements/sql-statement-show-tables.md) + - [`SHOW TABLES`](/sql-statements/sql-statement-show-tables.md) - [`SHOW [GLOBAL|SESSION] VARIABLES`](/sql-statements/sql-statement-show-variables.md) - [`SHOW WARNINGS`](/sql-statements/sql-statement-show-warnings.md) - [`SPLIT REGION`](/sql-statements/sql-statement-split-region.md) diff --git a/TOC.md b/TOC.md index 53b619169ffae..ec1cc7475953a 100644 --- a/TOC.md +++ b/TOC.md @@ -824,6 +824,7 @@ - [`SHOW BUILTINS`](/sql-statements/sql-statement-show-builtins.md) - [`SHOW CHARACTER SET`](/sql-statements/sql-statement-show-character-set.md) - [`SHOW COLLATION`](/sql-statements/sql-statement-show-collation.md) + - [`SHOW COLUMN_STATS_USAGE`](/sql-statements/sql-statement-show-column-stats-usage.md) - [`SHOW COLUMNS FROM`](/sql-statements/sql-statement-show-columns-from.md) - [`SHOW CONFIG`](/sql-statements/sql-statement-show-config.md) - [`SHOW CREATE DATABASE`](/sql-statements/sql-statement-show-create-database.md) @@ -850,10 +851,12 @@ - [`SHOW PROFILES`](/sql-statements/sql-statement-show-profiles.md) - [`SHOW PUMP STATUS`](/sql-statements/sql-statement-show-pump-status.md) - 
[`SHOW SCHEMAS`](/sql-statements/sql-statement-show-schemas.md) + - [`SHOW STATS_BUCKETS`](/sql-statements/sql-statement-show-stats-buckets.md) - [`SHOW STATS_HEALTHY`](/sql-statements/sql-statement-show-stats-healthy.md) - [`SHOW STATS_HISTOGRAMS`](/sql-statements/sql-statement-show-stats-histograms.md) - [`SHOW STATS_LOCKED`](/sql-statements/sql-statement-show-stats-locked.md) - [`SHOW STATS_META`](/sql-statements/sql-statement-show-stats-meta.md) + - [`SHOW STATS_TOPN`](/sql-statements/sql-statement-show-stats-topn.md) - [`SHOW STATUS`](/sql-statements/sql-statement-show-status.md) - [`SHOW TABLE NEXT_ROW_ID`](/sql-statements/sql-statement-show-table-next-rowid.md) - [`SHOW TABLE REGIONS`](/sql-statements/sql-statement-show-table-regions.md) diff --git a/sql-statements/sql-statement-drop-stats.md b/sql-statements/sql-statement-drop-stats.md index 65d1326e5e101..3be3431ca29fa 100644 --- a/sql-statements/sql-statement-drop-stats.md +++ b/sql-statements/sql-statement-drop-stats.md @@ -11,34 +11,59 @@ The `DROP STATS` statement is used to delete the statistics of the selected tabl ```ebnf+diagram DropStatsStmt ::= - 'DROP' 'STATS' TableNameList - -TableNameList ::= - TableName ( ',' TableName )* + 'DROP' 'STATS' TableName ("PARTITION" partition | "GLOBAL")? ( ',' TableName )* TableName ::= Identifier ('.' Identifier)? ``` -## Examples +## Usage -{{< copyable "sql" >}} +The following statement deletes all statistics of `TableName`. If a partitioned table is specified, this statement deletes statistics of all partitions in this table as well as [GlobalStats generated in dynamic pruning mode](/statistics.md#collect-statistics-of-partitioned-tables-in-dynamic-pruning-mode). ```sql -CREATE TABLE t(a INT); +DROP STATS TableName +``` + ``` +Query OK, 0 rows affected (0.00 sec) +``` + +The following statement only deletes statistics of the specified partitions in `PartitionNameList`. 
```sql -Query OK, 0 rows affected (0.01 sec) +DROP STATS TableName PARTITION PartitionNameList; ``` -{{< copyable "sql" >}} +``` +Query OK, 0 rows affected (0.00 sec) +``` + +The following statement only deletes GlobalStats generated in dynamic pruning mode of the specified table. ```sql -SHOW STATS_META WHERE db_name='test' and table_name='t'; +DROP STATS TableName GLOBAL; +``` + +``` +Query OK, 0 rows affected (0.00 sec) +``` + +## Examples + +```sql +CREATE TABLE t(a INT); +``` + +``` +Query OK, 0 rows affected (0.01 sec) ``` ```sql +SHOW STATS_META WHERE db_name='test' and table_name='t'; +``` + +``` +---------+------------+----------------+---------------------+--------------+-----------+ | Db_name | Table_name | Partition_name | Update_time | Modify_count | Row_count | +---------+------------+----------------+---------------------+--------------+-----------+ @@ -47,23 +72,19 @@ SHOW STATS_META WHERE db_name='test' and table_name='t'; 1 row in set (0.00 sec) ``` -{{< copyable "sql" >}} - ```sql DROP STATS t; ``` -```sql +``` Query OK, 0 rows affected (0.00 sec) ``` -{{< copyable "sql" >}} - ```sql SHOW STATS_META WHERE db_name='test' and table_name='t'; ``` -```sql +``` Empty set (0.00 sec) ``` diff --git a/sql-statements/sql-statement-show-analyze-status.md b/sql-statements/sql-statement-show-analyze-status.md index 2f127c1f9617e..59efee4dc4483 100644 --- a/sql-statements/sql-statement-show-analyze-status.md +++ b/sql-statements/sql-statement-show-analyze-status.md @@ -13,6 +13,21 @@ Starting from TiDB v6.1.0, you can view the history tasks within the last 7 days Starting from TiDB v7.3.0, you can view the progress of the current `ANALYZE` task through the system table `mysql.analyze_jobs` or `SHOW ANALYZE STATUS`. 
+Currently, the `SHOW ANALYZE STATUS` statement returns the following columns: + +| Column name | Description | +| :--------------- | :------------- | +| `Table_schema` | The database name | +| `Table_name` | The table name | +| `Partition_name` | The partition name | +| `Job_info` | The task information. If an index is analyzed, this information includes the index name. When `tidb_analyze_version = 2`, this information includes configuration items such as sample rate. | +| `Processed_rows` | The number of rows that have been analyzed | +| `Start_time` | The time at which the task starts | +| `State` | The state of a task, including `pending`, `running`, `finished`, and `failed` | +| `Fail_reason` | The reason why the task fails. If the execution is successful, the value is `NULL`. | +| `Instance` | The TiDB instance that executes the task | +| `Process_id` | The process ID that executes the task | + ## Synopsis ```ebnf+diagram @@ -71,4 +86,4 @@ This statement is a TiDB extension to MySQL syntax. ## See also -* [ANALYZE_STATUS table](/information-schema/information-schema-analyze-status.md) \ No newline at end of file +* [ANALYZE_STATUS table](/information-schema/information-schema-analyze-status.md) diff --git a/sql-statements/sql-statement-show-column-stats-usage.md b/sql-statements/sql-statement-show-column-stats-usage.md new file mode 100644 index 0000000000000..b7d28adc4d04e --- /dev/null +++ b/sql-statements/sql-statement-show-column-stats-usage.md @@ -0,0 +1,58 @@ +--- +title: SHOW COLUMN_STATS_USAGE +summary: An overview of the usage of SHOW COLUMN_STATS_USAGE for TiDB database. +--- + +# SHOW COLUMN_STATS_USAGE + +The `SHOW COLUMN_STATS_USAGE` statement shows the last usage time and collection time of column statistics. You can also use it to locate `PREDICATE COLUMNS` and columns on which statistics have been collected. 
+ +Currently, the `SHOW COLUMN_STATS_USAGE` statement returns the following columns: + +| Column name | Description | +| -------- | ------------- | +| `Db_name` | The database name | +| `Table_name` | The table name | +| `Partition_name` | The partition name | +| `Column_name` | The column name | +| `Last_used_at` | The last time when the column statistics were used in the query optimization | +| `Last_analyzed_at` | The last time when the column statistics were collected | + +## Synopsis + +```ebnf+diagram +ShowColumnStatsUsageStmt ::= + "SHOW" "COLUMN_STATS_USAGE" ShowLikeOrWhere? + +ShowLikeOrWhere ::= + "LIKE" SimpleExpr +| "WHERE" Expression +``` + +## Examples + +```sql +SHOW COLUMN_STATS_USAGE; +``` + +``` ++---------+------------+----------------+-------------+--------------+---------------------+ +| Db_name | Table_name | Partition_name | Column_name | Last_used_at | Last_analyzed_at | ++---------+------------+----------------+-------------+--------------+---------------------+ +| test | t1 | | id | NULL | 2024-05-10 11:04:23 | +| test | t1 | | b | NULL | 2024-05-10 11:04:23 | +| test | t1 | | pad | NULL | 2024-05-10 11:04:23 | +| test | t | | a | NULL | 2024-05-10 11:37:06 | +| test | t | | b | NULL | 2024-05-10 11:37:06 | ++---------+------------+----------------+-------------+--------------+---------------------+ +5 rows in set (0.00 sec) +``` + +## MySQL compatibility + +This statement is a TiDB extension to MySQL syntax. + +## See also + +* [`ANALYZE`](/sql-statements/sql-statement-analyze-table.md) +* [Introduction to Statistics](/statistics.md) \ No newline at end of file diff --git a/sql-statements/sql-statement-show-stats-buckets.md b/sql-statements/sql-statement-show-stats-buckets.md new file mode 100644 index 0000000000000..6d8540b2adb18 --- /dev/null +++ b/sql-statements/sql-statement-show-stats-buckets.md @@ -0,0 +1,64 @@ +--- +title: SHOW STATS_BUCKETS +summary: An overview of the usage of SHOW STATS_BUCKETS for TiDB database. 
+--- + +# SHOW STATS_BUCKETS + +The `SHOW STATS_BUCKETS` statement shows the bucket information in [statistics](/statistics.md). + +Currently, the `SHOW STATS_BUCKETS` statement returns the following columns: + +| Column name | Description | +| :-------- | :------------- | +| `Db_name` | The database name | +| `Table_name` | The table name | +| `Partition_name` | The partition name | +| `Column_name` | The column name (when `Is_index` is `0`) or the index name (when `Is_index` is `1`) | +| `Is_index` | Whether it is an index column or not | +| `Bucket_id` | The ID of a bucket | +| `Count` | The number of values that fall into this bucket and all previous buckets | +| `Repeats` | The number of occurrences of the maximum value | +| `Lower_bound` | The minimum value | +| `Upper_bound` | The maximum value | +| `Ndv` | The number of different values in the bucket. When `tidb_analyze_version` = `1`, `Ndv` is always `0`, which has no actual meaning. | + +## Synopsis + +```ebnf+diagram +ShowStatsBucketsStmt ::= + "SHOW" "STATS_BUCKETS" ShowLikeOrWhere? 
+ +ShowLikeOrWhere ::= + "LIKE" SimpleExpr +| "WHERE" Expression +``` + +## Examples + +```sql +SHOW STATS_BUCKETS WHERE Table_name='t'; +``` + +``` ++---------+------------+----------------+-------------+----------+-----------+-------+---------+--------------------------+--------------------------+------+ +| Db_name | Table_name | Partition_name | Column_name | Is_index | Bucket_id | Count | Repeats | Lower_Bound | Upper_Bound | Ndv | ++---------+------------+----------------+-------------+----------+-----------+-------+---------+--------------------------+--------------------------+------+ +| test | t | | a | 0 | 0 | 1 | 1 | 2023-12-27 00:00:00 | 2023-12-27 00:00:00 | 0 | +| test | t | | a | 0 | 1 | 2 | 1 | 2023-12-28 00:00:00 | 2023-12-28 00:00:00 | 0 | +| test | t | | ia | 1 | 0 | 1 | 1 | (NULL, 2) | (NULL, 2) | 0 | +| test | t | | ia | 1 | 1 | 2 | 1 | (NULL, 4) | (NULL, 4) | 0 | +| test | t | | ia | 1 | 2 | 3 | 1 | (2023-12-27 00:00:00, 1) | (2023-12-27 00:00:00, 1) | 0 | +| test | t | | ia | 1 | 3 | 4 | 1 | (2023-12-28 00:00:00, 3) | (2023-12-28 00:00:00, 3) | 0 | ++---------+------------+----------------+-------------+----------+-----------+-------+---------+--------------------------+--------------------------+------+ +6 rows in set (0.00 sec) +``` + +## MySQL compatibility + +This statement is a TiDB extension to MySQL syntax. 
+ +## See also + +* [`ANALYZE`](/sql-statements/sql-statement-analyze-table.md) +* [Introduction to Statistics](/statistics.md) \ No newline at end of file diff --git a/sql-statements/sql-statement-show-stats-healthy.md b/sql-statements/sql-statement-show-stats-healthy.md index 631e2d97c1d28..0339aac26fdd7 100644 --- a/sql-statements/sql-statement-show-stats-healthy.md +++ b/sql-statements/sql-statement-show-stats-healthy.md @@ -9,6 +9,15 @@ The `SHOW STATS_HEALTHY` statement shows an estimation of how accurate statistic The health of a table can be improved by running the [`ANALYZE`](/sql-statements/sql-statement-analyze-table.md) statement. `ANALYZE` runs automatically when the health drops below the [`tidb_auto_analyze_ratio`](/system-variables.md#tidb_auto_analyze_ratio) threshold. +Currently, the `SHOW STATS_HEALTHY` statement returns the following columns: + +| Column name | Description | +| -------- | ------------- | +| `Db_name` | The database name | +| `Table_name` | The table name | +| `Partition_name` | The partition name | +| `Healthy` | The health state of the statistics, as a percentage between 0 and 100 | + ## Synopsis ```ebnf+diagram diff --git a/sql-statements/sql-statement-show-stats-histograms.md b/sql-statements/sql-statement-show-stats-histograms.md index 564204a62942b..7e071b8833a81 100644 --- a/sql-statements/sql-statement-show-stats-histograms.md +++ b/sql-statements/sql-statement-show-stats-histograms.md @@ -8,6 +8,26 @@ summary: An overview of the usage of SHOW HISTOGRAMS for TiDB database. This statement shows the histogram information collected by the [`ANALYZE` statement](/sql-statements/sql-statement-analyze-table.md) as part of database [statistics](/statistics.md). 
+Currently, the `SHOW STATS_HISTOGRAMS` statement returns the following columns: + +| Column name | Description | +| -------- | ------------- | +| `Db_name` | The database name | +| `Table_name` | The table name | +| `Partition_name` | The partition name | +| `Column_name` | The column name (when `Is_index` is `0`) or the index name (when `Is_index` is `1`) | +| `Is_index` | Whether it is an index column or not | +| `Update_time` | The time of the update | +| `Distinct_count` | The number of different values | +| `Null_count` | The number of `NULL` values | +| `Avg_col_size` | The average length of columns | +| `Correlation` | The Pearson correlation coefficient of the column and the integer primary key, which indicates the degree of association between the two columns | +| `Load_status` | The load status, such as `allEvicted` and `allLoaded` | +| `Total_mem_usage` | The total memory usage | +| `Hist_mem_usage` | The memory usage of the histogram | +| `Topn_mem_usage` | The memory usage of the TopN | +| `Cms_mem_usage` | The memory usage of the Count-Min Sketch (CMS) | + ## Synopsis ```ebnf+diagram diff --git a/sql-statements/sql-statement-show-stats-locked.md b/sql-statements/sql-statement-show-stats-locked.md index a3f9d66e41273..165578f5dfcd8 100644 --- a/sql-statements/sql-statement-show-stats-locked.md +++ b/sql-statements/sql-statement-show-stats-locked.md @@ -7,6 +7,15 @@ summary: An overview of the usage of SHOW STATS_LOCKED for the TiDB database. `SHOW STATS_LOCKED` shows the tables whose statistics are locked. 
+Currently, the `SHOW STATS_LOCKED` statement returns the following columns: + +| Column name | Description | +| -------- | ------------- | +| `Db_name` | The database name | +| `Table_name` | The table name | +| `Partition_name` | The partition name | +| `Status` | The statistics status, such as `locked` | + ## Synopsis ```ebnf+diagram diff --git a/sql-statements/sql-statement-show-stats-topn.md b/sql-statements/sql-statement-show-stats-topn.md new file mode 100644 index 0000000000000..767c4de4daa08 --- /dev/null +++ b/sql-statements/sql-statement-show-stats-topn.md @@ -0,0 +1,60 @@ +--- +title: SHOW STATS_TOPN +summary: An overview of the usage of SHOW STATS_TOPN for TiDB database. 
+--- + +# SHOW STATS_TOPN + +The `SHOW STATS_TOPN` statement shows the Top-N information in [statistics](/statistics.md). + +Currently, the `SHOW STATS_TOPN` statement returns the following columns: + +| Column name | Description | +| ---- | ---- | +| `Db_name` | The database name | +| `Table_name` | The table name | +| `Partition_name` | The partition name | +| `Column_name` | The column name (when `Is_index` is `0`) or the index name (when `Is_index` is `1`) | +| `Is_index` | Whether it is an index column or not | +| `Value` | The value of this column | +| `Count` | How many times the value appears | + +## Synopsis + +```ebnf+diagram +ShowStatsTopnStmt ::= + "SHOW" "STATS_TOPN" ShowLikeOrWhere? + +ShowLikeOrWhere ::= + "LIKE" SimpleExpr +| "WHERE" Expression +``` + +## Examples + +```sql +SHOW STATS_TOPN WHERE Table_name='t'; +``` + +``` ++---------+------------+----------------+-------------+----------+--------------------------+-------+ +| Db_name | Table_name | Partition_name | Column_name | Is_index | Value | Count | ++---------+------------+----------------+-------------+----------+--------------------------+-------+ +| test | t | | a | 0 | 2023-12-27 00:00:00 | 1 | +| test | t | | a | 0 | 2023-12-28 00:00:00 | 1 | +| test | t | | ia | 1 | (NULL, 2) | 1 | +| test | t | | ia | 1 | (NULL, 4) | 1 | +| test | t | | ia | 1 | (2023-12-27 00:00:00, 1) | 1 | +| test | t | | ia | 1 | (2023-12-28 00:00:00, 3) | 1 | ++---------+------------+----------------+-------------+----------+--------------------------+-------+ +6 rows in set (0.00 sec) +``` + +## MySQL compatibility + +This statement is a TiDB extension to MySQL syntax. 
+ +## See also + +* [`ANALYZE`](/sql-statements/sql-statement-analyze-table.md) +* [Introduction to Statistics](/statistics.md) \ No newline at end of file diff --git a/statistics.md b/statistics.md index bd47c973187d3..b2084804467ea 100644 --- a/statistics.md +++ b/statistics.md @@ -11,7 +11,7 @@ TiDB uses statistics as input to the optimizer to estimate the number of rows pr ### Automatic update -For the `INSERT`, `DELETE`, or `UPDATE` statements, TiDB automatically updates the number of rows and modified rows in statistics. +For the [`INSERT`](/sql-statements/sql-statement-insert.md), [`DELETE`](/sql-statements/sql-statement-delete.md), or [`UPDATE`](/sql-statements/sql-statement-update.md) statements, TiDB automatically updates the number of rows and modified rows in statistics. @@ -131,8 +131,6 @@ If a table has many columns, collecting statistics on all the columns can cause - To collect statistics on specific columns, use the following syntax: - {{< copyable "sql" >}} - ```sql ANALYZE TABLE TableName COLUMNS ColumnNameList [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; ``` @@ -161,8 +159,6 @@ If a table has many columns, collecting statistics on all the columns can cause 2. After the query pattern of your business is relatively stable, collect statistics on `PREDICATE COLUMNS` by using the following syntax: - {{< copyable "sql" >}} - ```sql ANALYZE TABLE TableName PREDICATE COLUMNS [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; ``` @@ -171,13 +167,11 @@ If a table has many columns, collecting statistics on all the columns can cause > **Note:** > - > - If the `mysql.column_stats_usage` system table does not contain any `PREDICATE COLUMNS` recorded for that table, the preceding syntax collects statistics on all columns and all indexes in that table. 
+ > - If the [`mysql.column_stats_usage`](/mysql-schema.md) system table does not contain any `PREDICATE COLUMNS` recorded for that table, the preceding syntax collects statistics on all columns and all indexes in that table. > - Any columns excluded from collection (either by manually listing columns or using `PREDICATE COLUMNS`) will not have their statistics overwritten. When executing a new type of SQL query, the optimizer will use the old statistics for such columns if it exists or pseudo column statistics if columns never had statistics collected. The next ANALYZE using `PREDICATE COLUMNS` will collect the statistics on those columns. - To collect statistics on all columns and indexes, use the following syntax: - {{< copyable "sql" >}} - ```sql ANALYZE TABLE TableName ALL COLUMNS [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; ``` @@ -186,16 +180,12 @@ If a table has many columns, collecting statistics on all the columns can cause - To collect statistics on all partitions in `PartitionNameList` in `TableName`, use the following syntax: - {{< copyable "sql" >}} - ```sql ANALYZE TABLE TableName PARTITION PartitionNameList [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; ``` - To collect index statistics on all partitions in `PartitionNameList` in `TableName`, use the following syntax: - {{< copyable "sql" >}} - ```sql ANALYZE TABLE TableName PARTITION PartitionNameList INDEX [IndexNameList] [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; ``` @@ -206,8 +196,6 @@ If a table has many columns, collecting statistics on all the columns can cause > > Currently, collecting statistics on `PREDICATE COLUMNS` is an experimental feature. It is not recommended that you use it in production environments. 
- {{< copyable "sql" >}} - ```sql ANALYZE TABLE TableName PARTITION PartitionNameList [COLUMNS ColumnNameList|PREDICATE COLUMNS|ALL COLUMNS] [WITH NUM BUCKETS|TOPN|CMSKETCH DEPTH|CMSKETCH WIDTH]|[WITH NUM SAMPLES|WITH FLOATNUM SAMPLERATE]; ``` @@ -341,22 +329,7 @@ If you want to persist the column configuration in the `ANALYZE` statement (incl - When TiDB collects statistics automatically or when you manually collect statistics by executing the `ANALYZE` statement without specifying the column configuration, TiDB continues using the previously persisted configuration for statistics collection. - When you manually execute the `ANALYZE` statement multiple times with column configuration specified, TiDB overwrites the previously recorded persistent configuration using the new configuration specified by the latest `ANALYZE` statement. -To locate `PREDICATE COLUMNS` and columns on which statistics have been collected, use the following syntax: - -```sql -SHOW COLUMN_STATS_USAGE [ShowLikeOrWhere]; -``` - -The `SHOW COLUMN_STATS_USAGE` statement returns the following 6 columns: - -| Column name | Description | -| -------- | ------------- | -| `Db_name` | The database name | -| `Table_name` | The table name | -| `Partition_name` | The partition name | -| `Column_name` | The column name | -| `Last_used_at` | The last time when the column statistics were used in the query optimization | -| `Last_analyzed_at` | The last time when the column statistics were collected | +To locate `PREDICATE COLUMNS` and columns on which statistics have been collected, use the [`SHOW COLUMN_STATS_USAGE`](/sql-statements/sql-statement-show-column-stats-usage.md) statement. In the following example, after executing `ANALYZE TABLE t PREDICATE COLUMNS;`, TiDB collects statistics on columns `b`, `c`, and `d`, where column `b` is a `PREDICATE COLUMN` and columns `c` and `d` are index columns. 
@@ -400,7 +373,7 @@ WHERE db_name = 'test' AND table_name = 't' AND last_analyzed_at IS NOT NULL; ## Versions of statistics -The `tidb_analyze_version` variable controls the statistics collected by TiDB. Currently, two versions of statistics are supported: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`. +The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, two versions of statistics are supported: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`. - For TiDB Self-Hosted, the default value of this variable changes from `1` to `2` starting from v5.3.0. - For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0. @@ -458,30 +431,7 @@ You can view the `ANALYZE` status and statistics information using the following ### `ANALYZE` state -When executing the `ANALYZE` statement, you can view the current state of `ANALYZE` using the following SQL statement: - -{{< copyable "sql" >}} - -```sql -SHOW ANALYZE STATUS [ShowLikeOrWhere] -``` - -This statement returns the state of `ANALYZE`. You can use `ShowLikeOrWhere` to filter the information you need. - -Currently, the `SHOW ANALYZE STATUS` statement returns the following 11 columns: - -| Column name | Description | -| :-------- | :------------- | -| table_schema | The database name | -| table_name | The table name | -| partition_name| The partition name | -| job_info | The task information. If an index is analyzed, this information will include the index name. When `tidb_analyze_version =2`, this information will include configuration items such as sample rate. | -| processed_rows | The number of rows that have been analyzed | -| start_time | The time at which the task starts | -| state | The state of a task, including `pending`, `running`, `finished`, and `failed` | -| fail_reason | The reason why the task fails. If the execution is successful, the value is `NULL`. 
| -| instance | The TiDB instance that executes the task | -| process_id | The process ID that executes the task | +When executing the `ANALYZE` statement, you can view the current state of `ANALYZE` using [`SHOW ANALYZE STATUS`](/sql-statements/sql-statement-show-analyze-status.md). Starting from TiDB v6.1.0, the `SHOW ANALYZE STATUS` statement supports showing cluster-level tasks. Even after a TiDB restart, you can still view task records before the restart using this statement. Before TiDB v6.1.0, the `SHOW ANALYZE STATUS` statement can only show instance-level tasks, and task records are cleared after a TiDB restart. @@ -502,172 +452,27 @@ mysql> SHOW ANALYZE STATUS [ShowLikeOrWhere]; ### Metadata of tables -You can use the `SHOW STATS_META` statement to view the total number of rows and the number of updated rows. - -{{< copyable "sql" >}} - -```sql -SHOW STATS_META [ShowLikeOrWhere]; -``` - -The syntax of `ShowLikeOrWhereOpt` is as follows: - -![ShowLikeOrWhereOpt](/media/sqlgram/ShowLikeOrWhereOpt.png) - -Currently, the `SHOW STATS_META` statement returns the following 6 columns: - -| Column name | Description | -| :-------- | :------------- | -| `db_name` | The database name | -| `table_name` | The table name | -| `partition_name`| The partition name | -| `update_time` | The time of the update | -| `modify_count` | The number of modified rows | -| `row_count` | The total number of rows | - -> **Note:** -> -> When TiDB automatically updates the total number of rows and the number of modified rows according to DML statements, `update_time` is also updated. Therefore, `update_time` does not necessarily indicate the last time when the `ANALYZE` statement is executed. +You can use the [`SHOW STATS_META`](/sql-statements/sql-statement-show-stats-meta.md) statement to view the total number of rows and the number of updated rows. 
### Health state of tables -You can use the `SHOW STATS_HEALTHY` statement to check the health state of tables and roughly estimate the accuracy of the statistics. When `modify_count` >= `row_count`, the health state is 0; when `modify_count` < `row_count`, the health state is (1 - `modify_count`/`row_count`) * 100. - -The syntax is as follows: - -{{< copyable "sql" >}} - -```sql -SHOW STATS_HEALTHY [ShowLikeOrWhere]; -``` - -The synopsis of `SHOW STATS_HEALTHY` is: - -![ShowStatsHealthy](/media/sqlgram/ShowStatsHealthy.png) - -Currently, the `SHOW STATS_HEALTHY` statement returns the following 4 columns: - -| Column name | Description | -| :-------- | :------------- | -| `db_name` | The database name | -| `table_name` | The table name | -| `partition_name` | The partition name | -| `healthy` | The health state of tables | +You can use the [`SHOW STATS_HEALTHY`](/sql-statements/sql-statement-show-stats-healthy.md) statement to check the health state of tables and roughly estimate the accuracy of the statistics. When `modify_count` >= `row_count`, the health state is 0; when `modify_count` < `row_count`, the health state is (1 - `modify_count`/`row_count`) * 100. ### Metadata of columns -You can use the `SHOW STATS_HISTOGRAMS` statement to view the number of different values and the number of `NULL` in all the columns. - -Syntax as follows: - -{{< copyable "sql" >}} - -```sql -SHOW STATS_HISTOGRAMS [ShowLikeOrWhere] -``` - -This statement returns the number of different values and the number of `NULL` in all the columns. You can use `ShowLikeOrWhere` to filter the information you need. 
- -Currently, the `SHOW STATS_HISTOGRAMS` statement returns the following 10 columns: - -| Column name | Description | -| :-------- | :------------- | -| `db_name` | The database name | -| `table_name` | The table name | -| `partition_name` | The partition name | -| `column_name` | The column name (when `is_index` is `0`) or the index name (when `is_index` is `1`) | -| `is_index` | Whether it is an index column or not | -| `update_time` | The time of the update | -| `distinct_count` | The number of different values | -| `null_count` | The number of `NULL` | -| `avg_col_size` | The average length of columns | -| correlation | The Pearson correlation coefficient of the column and the integer primary key, which indicates the degree of association between the two columns| +You can use the [`SHOW STATS_HISTOGRAMS`](/sql-statements/sql-statement-show-stats-histograms.md) statement to view the number of different values and the number of `NULL` in all the columns. ### Buckets of histogram -You can use the `SHOW STATS_BUCKETS` statement to view each bucket of the histogram. - -The syntax is as follows: - -{{< copyable "sql" >}} - -```sql -SHOW STATS_BUCKETS [ShowLikeOrWhere] -``` - -The diagram is as follows: - -![SHOW STATS_BUCKETS](/media/sqlgram/SHOW_STATS_BUCKETS.png) - -This statement returns information about all the buckets. You can use `ShowLikeOrWhere` to filter the information you need. 
- -Currently, the `SHOW STATS_BUCKETS` statement returns the following 11 columns: - -| Column name | Description | -| :-------- | :------------- | -| `db_name` | The database name | -| `table_name` | The table name | -| `partition_name` | The partition name | -| `column_name` | The column name (when `is_index` is `0`) or the index name (when `is_index` is `1`) | -| `is_index` | Whether it is an index column or not | -| `bucket_id` | The ID of a bucket | -| `count` | The number of all the values that falls on the bucket and the previous buckets | -| `repeats` | The occurrence number of the maximum value | -| `lower_bound` | The minimum value | -| `upper_bound` | The maximum value | -| `ndv` | The number of different values in the bucket. When `tidb_analyze_version` = `1`, `ndv` is always `0`, which has no actual meaning. | +You can use the [`SHOW STATS_BUCKETS`](/sql-statements/sql-statement-show-stats-buckets.md) statement to view each bucket of the histogram. ### Top-N information -You can use the `SHOW STATS_TOPN` statement to view the Top-N information currently collected by TiDB. 
- -The syntax is as follows: - -{{< copyable "sql" >}} - -```sql -SHOW STATS_TOPN [ShowLikeOrWhere]; -``` - -Currently, the `SHOW STATS_TOPN` statement returns the following 7 columns: - -| Column name | Description | -| ---- | ----| -| `db_name` | The database name | -| `table_name` | The table name | -| `partition_name` | The partition name | -| `column_name` | The column name (when `is_index` is `0`) or the index name (when `is_index` is `1`) | -| `is_index` | Whether it is an index column or not | -| `value` | The value of this column | -| `count` | How many times the value appears | +You can use the [`SHOW STATS_TOPN`](/sql-statements/sql-statement-show-stats-topn.md) statement to view the Top-N information currently collected by TiDB. ## Delete statistics -You can run the `DROP STATS` statement to delete statistics. 
- -{{< copyable "sql" >}} - -```sql -DROP STATS TableName -``` - -The preceding statement deletes all statistics of `TableName`. If a partitioned table is specified, this statement will delete statistics of all partitions in this table as well as GlobalStats generated in dynamic pruning mode. - -{{< copyable "sql" >}} - -```sql -DROP STATS TableName PARTITION PartitionNameList; -``` - -This preceding statement only deletes statistics of the specified partitions in `PartitionNameList`. - -{{< copyable "sql" >}} - -```sql -DROP STATS TableName GLOBAL; -``` - -The preceding statement only deletes GlobalStats generated in dynamic pruning mode of the specified table. +You can run the [`DROP STATS`](/sql-statements/sql-statement-drop-stats.md) statement to delete statistics. ## Load statistics @@ -725,24 +530,18 @@ The interface to export statistics is as follows: + To obtain the JSON format statistics of the `${table_name}` table in the `${db_name}` database: - {{< copyable "" >}} - ``` http://${tidb-server-ip}:${tidb-server-status-port}/stats/dump/${db_name}/${table_name} ``` For example: - {{< copyable "" >}} - - ``` + ```shell curl -s http://127.0.0.1:10080/stats/dump/test/t1 -o /tmp/t1.json ``` + To obtain the JSON format statistics of the `${table_name}` table in the `${db_name}` database at specific time: - {{< copyable "" >}} - ``` http://${tidb-server-ip}:${tidb-server-status-port}/stats/dump/${db_name}/${table_name}/${yyyyMMddHHmmss} ``` @@ -755,11 +554,11 @@ The interface to export statistics is as follows: Generally, the imported statistics refer to the JSON file obtained using the export interface. -Syntax: +Loading statistics can be done with the [`LOAD STATS`](/sql-statements/sql-statement-load-stats.md) statement. -{{< copyable "sql" >}} +For example: -``` +```sql LOAD STATS 'file_name' ``` @@ -836,7 +635,7 @@ mysql> SHOW WARNINGS; 1 row in set (0.00 sec) ``` -In addition, you can also lock the statistics of a partition using `LOCK STATS`. 
For example: +In addition, you can also lock the statistics of a partition using [`LOCK STATS`](/sql-statements/sql-statement-lock-stats.md). For example: Create a partition table `t`, and insert data into it. When the statistics of partition `p1` are not locked, the `ANALYZE` statement can be successfully executed.