From 2c9132dba3ca4a982d7c5352fe9bd87a61e40468 Mon Sep 17 00:00:00 2001 From: Yecheng Fu Date: Tue, 30 Jun 2020 16:54:07 +0800 Subject: [PATCH 1/3] en: SQL Prepare Plan Cache --- TOC.md | 1 + sql-prepare-plan-cache.md | 127 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 128 insertions(+) create mode 100644 sql-prepare-plan-cache.md diff --git a/TOC.md b/TOC.md index ce4643a991335..830affce05a6d 100644 --- a/TOC.md +++ b/TOC.md @@ -102,6 +102,7 @@ + [Join Reorder](/join-reorder.md) + Physical Optimization + [Statistics](/statistics.md) + + [Prepare Plan Cache](/sql-prepare-plan-cache.md) + Control Execution Plan + [Optimizer Hints](/optimizer-hints.md) + [SQL Plan Management](/sql-plan-management.md) diff --git a/sql-prepare-plan-cache.md b/sql-prepare-plan-cache.md new file mode 100644 index 0000000000000..3a8a3acc59d72 --- /dev/null +++ b/sql-prepare-plan-cache.md @@ -0,0 +1,127 @@ +--- +title: SQL Prepare Execution Plan Cache +summary: Learn about SQL Prepare Execution Plan Cache in TiDB. +category: reference +--- + +# SQL Prepare Execution Plan Cache + +TiDB supports execution plan caching for `Prepare` / `Execute` queries. + +There are two forms of `Prepare` / `Execute` queries: + +- in the binary communication protocol, use `COM_STMT_PREPARE` and + `COM_STMT_EXECUTE` to execute general parameterized queries; +- in the text communication protocol, use `COM_QUERY` to execute `Prepare` and + `Execution` queries. + +The optimizer handles these two types of queries in the same way: in preparing, the parameterized query will be parsed into an AST (Abstract Syntax Tree) and cached; in later executing, the execution plan will be generated based on saved AST and specific parameter values. 
+ +When the execution plan cache is turned on, in the first execution every `Prepare` statement will check whether the current query can use the execution plan cache, and if it can be used, then put the generated execution plan into a cache implemented by LRU (Least Recently Used) linked list. In the subsequent `Execute` queries, the execution plan will be obtained from the cache and checked for availability. If the check succeeds, the step of generating an execution plan is skipped, otherwise, the execution plan is regenerated and saved in the cache. + +In the current version, when the `Prepare` statement meets any of the following conditions, the query cannot use the execution plan cache: + +- the query contains variables other than `?` (including system variables or user-defined +variables); +- the query contains sub-queries; +- the query contains functions that cannot be cached, such as `current_user()`, `database()`, and `last_insert_id()`, etc.; +- the `Order By` statement of the query contains `?`; +- the `Group By` statement of the query contains `?`; +- the `Limit [Offset]` statement of the query contains `?`; +- the window frame definition of the `Window` function contains `?`; +- partition tables are involved in the query. + +The LRU linked list is designed as a session-level cache because `Prepare` / +`Execute` cannot be executed across sessions. Each element of the LRU list is a +key-value pair, value is the execution plan, and the key is composed of the +following parts: + +- the name of the database where `Execute` is executed; +- the identifier of the `Prepare` statement, the name after the `PREPARE` + keyword; +- the current schema version, which will be updated after every successful DDL statement; +- the SQL Mode when executing `Execute`; +- the current time zone, which is the value of the system variable + `time_zone`. + +Any change in the above information (e.g. 
switching databases, renaming `Prepare` statement, executing DDL statements, or modifying the value of SQL mode / `time_zone`), or the LRU cache elimination mechanism will cause the execution plan cache miss in executing. + +After the execution plan cache is obtained from the cache, TiDB will first check whether the execution plan is still valid. If the current `Execute` statement is executed in an explicit transaction, and the referenced table is modified in the transaction pre-order statement, the cached execution plan accessing this table does not contain the `UnionScan` operator, then it cannot be executed. + +After the validation test is passed, the scan range of the execution plan will +be adjusted accordingly according to the current parameter values, and then used +to perform data querying. + +There are two points worth noting about execution plan caching and query +performance: + +- Considering that the parameters of `Execute` will be different, the execution plan cache will prohibit some aggressive query optimization methods that are closely related to specific parameter values to ensure adaptability, resulting in the query plan may not be optimal for certain parameter values. For example, the filter condition of the query is `where a > ? And a < ?`, the parameters of the first `Execute` statement are 2 and 1 respectively, considering that these two parameters maybe be 1 and 2 in the next execution time, the optimizer will not generate the optimal `TableDual` execution plan that is specific to current parameter values; +- If cache invalidation and elimination are not considered, an execution plan cache is applied to various parameter values, which in theory will also result in non-optimal execution plans for certain values. For example, if the filter condition is `where a < ?` and the parameter value used for the first execution is 1, then the optimizer generates the optimal `IndexScan` execution plan and puts it into the cache. 
In the subsequent executions, if the value becomes 10000, the `TableScan` plan will be the better. Due to the execution plan, the previously generated `IndexScan` will be used for execution. Therefore, the execution plan cache is more suitable for business scenarios where the query is simple (the ratio of compilation is high) and the execution plan is relatively fixed. + +The execution plan cache is currently disabled by default. You can enable this function by turning on the [`prepare-plan-cache`](/tidb-configuration-file.md#prepared-plan-cache) in the configuration file. + +> **Note:** +> +> The execution plan cache function applies only for `Prepare` / `Execute` queries and does not take effect for normal queries. + +After the execution plan cache function is enabled, you can use the session-level system variable `last_plan_from_cache` to see whether the previous `Execute` statement used the cached execution plan, for example: + +{{< copyable "sql" >}} + +```sql +MySQL [test]> create table t(a int); +Query OK, 0 rows affected (0.00 sec) +MySQL [test]> prepare stmt from 'select * from t where a = ?'; +Query OK, 0 rows affected (0.00 sec) +MySQL [test]> set @a = 1; +Query OK, 0 rows affected (0.00 sec) +-- The first execution generates an execution plan and saves it in the cache +MySQL [test]> execute stmt using @a; +Empty set (0.00 sec) +MySQL [test]> select @@last_plan_from_cache; ++------------------------+ +| @@last_plan_from_cache | ++------------------------+ +| 0 | ++------------------------+ +1 row in set (0.00 sec) +-- The second execution hits the cache +MySQL [test]> execute stmt using @a; +Empty set (0.00 sec) +MySQL [test]> select @@last_plan_from_cache; ++------------------------+ +| @@last_plan_from_cache | ++------------------------+ +| 1 | ++------------------------+ +1 row in set (0.00 sec) +``` + +If you find that a certain set of `Prepare` / `Execute` caused unexpected behavior due to the execution plan cache, you can use SQL Hint 
`ignore_plan_cache()` to skip using the execution plan cache for the current statement. Still using the above statement as an example: + +{{< copyable "sql" >}} + +```sql +MySQL [test]> prepare stmt from 'select /*+ ignore_plan_cache() */ * from t where a = ?'; +Query OK, 0 rows affected (0.00 sec) +MySQL [test]> set @a = 1; +Query OK, 0 rows affected (0.00 sec) +MySQL [test]> execute stmt using @a; +Empty set (0.00 sec) +MySQL [test]> select @@last_plan_from_cache; ++------------------------+ +| @@last_plan_from_cache | ++------------------------+ +| 0 | ++------------------------+ +1 row in set (0.00 sec) +MySQL [test]> execute stmt using @a; +Empty set (0.00 sec) +MySQL [test]> select @@last_plan_from_cache; ++------------------------+ +| @@last_plan_from_cache | ++------------------------+ +| 0 | ++------------------------+ +1 row in set (0.00 sec) +``` From a2da097733c4c6ba234c60384e0d61f9556f7980 Mon Sep 17 00:00:00 2001 From: yikeke Date: Fri, 10 Jul 2020 13:20:50 +0800 Subject: [PATCH 2/3] minor edits to improve format and unify doc style --- sql-prepare-plan-cache.md | 70 +++++++++++++++++++-------------------- 1 file changed, 34 insertions(+), 36 deletions(-) diff --git a/sql-prepare-plan-cache.md b/sql-prepare-plan-cache.md index 3a8a3acc59d72..fe09078e705c8 100644 --- a/sql-prepare-plan-cache.md +++ b/sql-prepare-plan-cache.md @@ -1,7 +1,6 @@ --- title: SQL Prepare Execution Plan Cache summary: Learn about SQL Prepare Execution Plan Cache in TiDB. -category: reference --- # SQL Prepare Execution Plan Cache @@ -10,61 +9,58 @@ TiDB supports execution plan caching for `Prepare` / `Execute` queries. There are two forms of `Prepare` / `Execute` queries: -- in the binary communication protocol, use `COM_STMT_PREPARE` and - `COM_STMT_EXECUTE` to execute general parameterized queries; -- in the text communication protocol, use `COM_QUERY` to execute `Prepare` and - `Execution` queries. 
+- In the binary communication protocol, use `COM_STMT_PREPARE` and + `COM_STMT_EXECUTE` to execute general parameterized SQL queries; +- In the text communication protocol, use `COM_QUERY` to execute `Prepare` and + `Execute` SQL queries. -The optimizer handles these two types of queries in the same way: in preparing, the parameterized query will be parsed into an AST (Abstract Syntax Tree) and cached; in later executing, the execution plan will be generated based on saved AST and specific parameter values. +The optimizer handles these two types of queries in the same way: when preparing, the parameterized query is parsed into an AST (Abstract Syntax Tree) and cached; in later execution, the execution plan is generated based on the stored AST and the specific parameter values. -When the execution plan cache is turned on, in the first execution every `Prepare` statement will check whether the current query can use the execution plan cache, and if it can be used, then put the generated execution plan into a cache implemented by LRU (Least Recently Used) linked list. In the subsequent `Execute` queries, the execution plan will be obtained from the cache and checked for availability. If the check succeeds, the step of generating an execution plan is skipped, otherwise, the execution plan is regenerated and saved in the cache. +When the execution plan cache is enabled, the first execution of each `Prepare` statement checks whether the current query can use the execution plan cache. If it can, the generated execution plan is put into a cache implemented as an LRU (Least Recently Used) linked list. In subsequent `Execute` queries, the execution plan is obtained from the cache and checked for availability. If the check succeeds, the step of generating an execution plan is skipped. Otherwise, the execution plan is regenerated and saved in the cache. 
-In the current version, when the `Prepare` statement meets any of the following conditions, the query cannot use the execution plan cache: +In the current version of TiDB, when the `Prepare` statement meets any of the following conditions, the query cannot use the execution plan cache: -- the query contains variables other than `?` (including system variables or user-defined -variables); -- the query contains sub-queries; -- the query contains functions that cannot be cached, such as `current_user()`, `database()`, and `last_insert_id()`, etc.; -- the `Order By` statement of the query contains `?`; -- the `Group By` statement of the query contains `?`; -- the `Limit [Offset]` statement of the query contains `?`; -- the window frame definition of the `Window` function contains `?`; -- partition tables are involved in the query. +- The query contains variables other than `?` (including system variables or user-defined variables); +- The query contains sub-queries; +- The query contains functions that cannot be cached, such as `current_user()`, `database()`, and `last_insert_id()`; +- The `Order By` statement of the query contains `?`; +- The `Group By` statement of the query contains `?`; +- The `Limit [Offset]` statement of the query contains `?`; +- The window frame definition of the `Window` function contains `?`; +- Partition tables are involved in the query. The LRU linked list is designed as a session-level cache because `Prepare` / `Execute` cannot be executed across sessions. Each element of the LRU list is a -key-value pair, value is the execution plan, and the key is composed of the +key-value pair. 
The value is the execution plan, and the key is composed of the following parts: -- the name of the database where `Execute` is executed; -- the identifier of the `Prepare` statement, the name after the `PREPARE` +- The name of the database where `Execute` is executed; +- The identifier of the `Prepare` statement, that is, the name after the `PREPARE` keyword; -- the current schema version, which will be updated after every successful DDL statement; -- the SQL Mode when executing `Execute`; -- the current time zone, which is the value of the system variable - `time_zone`. +- The current schema version, which is updated after every successfully executed DDL statement; +- The SQL mode when executing `Execute`; +- The current time zone, which is the value of the `time_zone` system variable. -Any change in the above information (e.g. switching databases, renaming `Prepare` statement, executing DDL statements, or modifying the value of SQL mode / `time_zone`), or the LRU cache elimination mechanism will cause the execution plan cache miss in executing. +Any change in the above information (for example, switching databases, renaming the `Prepare` statement, executing DDL statements, or modifying the SQL mode or `time_zone`), or eviction by the LRU cache, causes an execution plan cache miss during execution. -After the execution plan cache is obtained from the cache, TiDB will first check whether the execution plan is still valid. If the current `Execute` statement is executed in an explicit transaction, and the referenced table is modified in the transaction pre-order statement, the cached execution plan accessing this table does not contain the `UnionScan` operator, then it cannot be executed. +After the execution plan is obtained from the cache, TiDB first checks whether the execution plan is still valid. 
If the current `Execute` statement is executed in an explicit transaction, the referenced table has been modified by an earlier statement in that transaction, and the cached execution plan accessing this table does not contain the `UnionScan` operator, then the cached plan cannot be executed. -After the validation test is passed, the scan range of the execution plan will -be adjusted accordingly according to the current parameter values, and then used +After the validation test is passed, the scan range of the execution plan is adjusted according to the current parameter values, and then used to perform data querying. There are two points worth noting about execution plan caching and query performance: -- Considering that the parameters of `Execute` will be different, the execution plan cache will prohibit some aggressive query optimization methods that are closely related to specific parameter values to ensure adaptability, resulting in the query plan may not be optimal for certain parameter values. For example, the filter condition of the query is `where a > ? And a < ?`, the parameters of the first `Execute` statement are 2 and 1 respectively, considering that these two parameters maybe be 1 and 2 in the next execution time, the optimizer will not generate the optimal `TableDual` execution plan that is specific to current parameter values; -- If cache invalidation and elimination are not considered, an execution plan cache is applied to various parameter values, which in theory will also result in non-optimal execution plans for certain values. For example, if the filter condition is `where a < ?` and the parameter value used for the first execution is 1, then the optimizer generates the optimal `IndexScan` execution plan and puts it into the cache. In the subsequent executions, if the value becomes 10000, the `TableScan` plan will be the better. Due to the execution plan, the previously generated `IndexScan` will be used for execution. 
Therefore, the execution plan cache is more suitable for business scenarios where the query is simple (the ratio of compilation is high) and the execution plan is relatively fixed. +- Because the parameters of `Execute` can differ between executions, the execution plan cache prohibits some aggressive query optimization methods that are closely related to specific parameter values to ensure adaptability. As a result, the query plan might not be optimal for certain parameter values. For example, suppose the filter condition of the query is `where a > ? And a < ?` and the parameters of the first `Execute` statement are `2` and `1` respectively. Considering that these two parameters might be `1` and `2` in the next execution, the optimizer does not generate the optimal `TableDual` execution plan that is specific to the current parameter values; +- If cache invalidation and elimination are not considered, one cached execution plan is applied to various parameter values, which in theory also results in non-optimal execution plans for certain values. For example, if the filter condition is `where a < ?` and the parameter value used for the first execution is `1`, then the optimizer generates the optimal `IndexScan` execution plan and puts it into the cache. In the subsequent executions, if the value becomes `10000`, the `TableScan` plan might be the better one. But due to the execution plan cache, the previously generated `IndexScan` is used for execution. Therefore, the execution plan cache is more suitable for application scenarios where the query is simple (compilation accounts for a high proportion of the total execution time) and the execution plan is relatively fixed. -The execution plan cache is currently disabled by default. You can enable this function by turning on the [`prepare-plan-cache`](/tidb-configuration-file.md#prepared-plan-cache) in the configuration file. +Currently, the execution plan cache is disabled by default. 
You can enable this feature by enabling [`prepared-plan-cache`](/tidb-configuration-file.md#prepared-plan-cache) in the configuration file. > **Note:** > -> The execution plan cache function applies only for `Prepare` / `Execute` queries and does not take effect for normal queries. +> The execution plan cache feature applies only to `Prepare` / `Execute` queries and does not take effect for normal queries. -After the execution plan cache function is enabled, you can use the session-level system variable `last_plan_from_cache` to see whether the previous `Execute` statement used the cached execution plan, for example: +After the execution plan cache feature is enabled, you can use the session-level system variable `last_plan_from_cache` to see whether the previous `Execute` statement used the cached execution plan, for example: {{< copyable "sql" >}} @@ -75,7 +71,8 @@ MySQL [test]> prepare stmt from 'select * from t where a = ?'; Query OK, 0 rows affected (0.00 sec) MySQL [test]> set @a = 1; Query OK, 0 rows affected (0.00 sec) --- The first execution generates an execution plan and saves it in the cache + +-- The first execution generates an execution plan and saves it in the cache. MySQL [test]> execute stmt using @a; Empty set (0.00 sec) MySQL [test]> select @@last_plan_from_cache; @@ -85,7 +82,8 @@ MySQL [test]> select @@last_plan_from_cache; | 0 | +------------------------+ 1 row in set (0.00 sec) --- The second execution hits the cache + +-- The second execution hits the cache. MySQL [test]> execute stmt using @a; Empty set (0.00 sec) MySQL [test]> select @@last_plan_from_cache; @@ -97,7 +95,7 @@ MySQL [test]> select @@last_plan_from_cache; 1 row in set (0.00 sec) ``` -If you find that a certain set of `Prepare` / `Execute` caused unexpected behavior due to the execution plan cache, you can use SQL Hint `ignore_plan_cache()` to skip using the execution plan cache for the current statement. 
Still using the above statement as an example: +If you find that a certain set of `Prepare` / `Execute` statements behaves unexpectedly because of the execution plan cache, you can use the `ignore_plan_cache()` SQL hint to skip the execution plan cache for the current statement. The following example reuses the statement above: {{< copyable "sql" >}} From eb56b2c1b93682909e4844bf77f2749efb7aa0b4 Mon Sep 17 00:00:00 2001 From: Yecheng Fu Date: Fri, 10 Jul 2020 15:49:38 +0800 Subject: [PATCH 3/3] Apply suggestions from code review Co-authored-by: Keke Yi <40977455+yikeke@users.noreply.github.com> --- TOC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/TOC.md b/TOC.md index 830affce05a6d..dab191c9ac21a 100644 --- a/TOC.md +++ b/TOC.md @@ -102,7 +102,7 @@ + [Join Reorder](/join-reorder.md) + Physical Optimization + [Statistics](/statistics.md) - + [Prepare Plan Cache](/sql-prepare-plan-cache.md) + + [Prepare Execution Plan Cache](/sql-prepare-plan-cache.md) + Control Execution Plan + [Optimizer Hints](/optimizer-hints.md) + [SQL Plan Management](/sql-plan-management.md)
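
Note on the cache key described in `sql-prepare-plan-cache.md`: the document says each session keeps an LRU list whose key combines the database name, the `Prepare` statement name, the schema version, the SQL mode, and the time zone. The following is a minimal Python sketch of that idea only, not TiDB's actual implementation. The names `PlanKey` and `SessionPlanCache` are hypothetical, and plans are represented as plain strings for illustration.

```python
from collections import OrderedDict
from typing import NamedTuple, Optional


class PlanKey(NamedTuple):
    """Hypothetical key mirroring the five parts listed in the document."""
    database: str
    stmt_name: str
    schema_version: int
    sql_mode: str
    time_zone: str


class SessionPlanCache:
    """Toy session-level LRU cache; an illustration, not TiDB's real code."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._plans: "OrderedDict[PlanKey, str]" = OrderedDict()

    def get(self, key: PlanKey) -> Optional[str]:
        plan = self._plans.get(key)
        if plan is not None:
            self._plans.move_to_end(key)  # mark as most recently used
        return plan

    def put(self, key: PlanKey, plan: str) -> None:
        self._plans[key] = plan
        self._plans.move_to_end(key)
        if len(self._plans) > self.capacity:
            self._plans.popitem(last=False)  # evict the least recently used entry


cache = SessionPlanCache(capacity=2)
key = PlanKey("test", "stmt", 42, "STRICT_TRANS_TABLES", "SYSTEM")
cache.put(key, "cached plan for stmt")

hit = cache.get(key)
# Any change to a key component yields a different key, hence a miss.
miss_after_ddl = cache.get(key._replace(schema_version=43))
miss_after_use = cache.get(key._replace(database="other_db"))
print(hit, miss_after_ddl, miss_after_use)
```

Because every component is part of the key, a DDL statement (new schema version) or a switch to another database simply looks up a different key and misses. Under this sketch no explicit invalidation step is needed, and stale entries age out through LRU eviction.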