From e17f5f2857a96c2f5e608395e2f5ba5d1d3c4ba2 Mon Sep 17 00:00:00 2001 From: en-jin19 Date: Tue, 29 Mar 2022 14:57:51 +0800 Subject: [PATCH 01/15] BR: Add a new doc about the batch create table --- TOC.md | 2 ++ br/backup-and-restore-faq.md | 6 ++++ br/br-batch-create-table.md | 65 ++++++++++++++++++++++++++++++++++++ 3 files changed, 73 insertions(+) create mode 100644 br/br-batch-create-table.md diff --git a/TOC.md b/TOC.md index b662aa456ca4b..83dccb5bcccbe 100644 --- a/TOC.md +++ b/TOC.md @@ -72,6 +72,7 @@ - [Back up and Restore Data on Azure Blob Storage](/br/backup-and-restore-azblob.md) - BR Features - [Auto Tune](/br/br-auto-tune.md) + - [Batch Create Table](/br/br-batch-create-table.md) - [BR FAQ](/br/backup-and-restore-faq.md) - [Configure Time Zone](/configure-time-zone.md) - [Daily Checklist](/daily-check.md) @@ -203,6 +204,7 @@ - [External Storages](/br/backup-and-restore-storages.md) - BR Features - [Auto Tune](/br/br-auto-tune.md) + - [Batch Create Table](/br/br-batch-create-table.md) - [BR FAQ](/br/backup-and-restore-faq.md) - TiDB Binlog - [Overview](/tidb-binlog/tidb-binlog-overview.md) diff --git a/br/backup-and-restore-faq.md b/br/backup-and-restore-faq.md index 6dca31b9c41c4..b16bd186ac77c 100644 --- a/br/backup-and-restore-faq.md +++ b/br/backup-and-restore-faq.md @@ -164,6 +164,12 @@ You can use [`filter.rules`](https://github.com/pingcap/tiflow/blob/7c3c2336f981 Yes. BR backs up the [`SHARD_ROW_ID_BITS` and `PRE_SPLIT_REGIONS`](/sql-statements/sql-statement-split-region.md#pre_split_regions) information of a table. The data of the restored table is also split into multiple Regions. +## What should I do when the `entry too large, the max entry size is 6291456, the size of data is 7690800` error reported during data restoration using BR? + +You can try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. 
+ +Suppose that the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, and you use BR to restore the backup data. In this case, TiDB writes the queue of DDL jobs that create tables to TiKV. The total schema size of all tables sent by TiDB at one time should not exceed 6 MB, because the default maximum value of job messages that TiDB can send at one time is `6 MB`. You are **not recommended** to modify this value. For details, see [txn-entry-size-limit](/tidb-configuration-file.md#txn-entry-size-limit-new-in-50) and [raft-entry-max-size](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. + ## Why is the `region is unavailable` error reported for a SQL query after I use BR to restore the backup data? If the cluster backed up using BR has TiFlash, `TableInfo` stores the TiFlash information when BR restores the backup data. If the cluster to be restored does not have TiFlash, the `region is unavailable` error is reported. diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md new file mode 100644 index 0000000000000..3aae2a31012f2 --- /dev/null +++ b/br/br-batch-create-table.md @@ -0,0 +1,65 @@ +--- +title: BR Batch Create Table +summary: Learn how to use BR batch create table feature. When restoring data, BR can use batch create table feature to speed up the restoration. +--- + +# BR Batch Create Table + +When restoring data using Backup & Restore (BR), BR creates databases and tables on the downstream TiDB cluster first, and then restores data. In the versions earlier than TiDB v6.0, BR uses the [serial execution](#Implementation) scheme to create tables in the restoration process. 
However, when restoring data with a large number (nearly 50000) of tables, this scheme takes much time to create tables. + +To speed up the table creation process, and thereby reduce the time for restoring data, in v6.0, TiDB introduces BR batch create table feature. This feature is enabled by default. + +> **Note:** +> +> - To use the BR batch create table feature, both TiDB and BR should be in 6.0 or later versions. If either TiDB or BR is in the version lower than 6.0, BR uses the serial execution scheme. +> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are in 6.0 or later versions. In this case, BR enables batch create table feature by default without additional configuration. + +## User scenario + +When you need to restore data with a considerable number of tables, for example, 50000 tables, you can use BR batch create table feature to speed up the restoration process. + +For the detailed effect, see [Test batch create table feature](#feature-test). + +## Use batch create table + +BR enables batch create table feature and configures `--ddl-batch-size=128` by default in v6.0 or later versions. Therefore, you do not need to configure this parameter additionally. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. + +To disable this feature, you can set `--ddl-batch-size` to `0` by the following command: + +{{< copyable "shell-regular" >}} + +```shell +br restore full -s local:///br_data/ --pd 172.16.5.198:2379 --log-file restore.log --ddl-batch-size=0 +``` + +After disabling the feature, BR uses the [serial execution scheme](#implementation) instead. + +## Implementation principles + +- Serial execution scheme before v6.0: + + In the versions earlier than 6.0, BR uses the serial execution scheme. When restoring data, BR creates the database and table in the target TiDB first, then starts restoring data. 
To create tables, after calling TiDB API, BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the schema version changes correspondingly, and each change synchronizes to other BRs and other TiDB DDL workers. Hence, when restoring a large number of tables, the serial execution scheme takes too much time. + +- Batch create table scheme since v6.0: + + The batch create table feature uses the concurrent batch table creation scheme. From v6.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. + +## Test batch create table + +This section describes the information of testing batch create table feature. The test environment is as follows: + +- Cluster configurations: + + - 15 TiKV instances. Each TiKV instance has 16 cores, 80 GB memory, and 16 threads to process RPC requests ([`import.num-threads`](/tikv-configuration-file.md#num-threads) = 16). + - 3 TiDB instances. Each TiDB instance has 16 cores, 32 GB memory. + - 3 PD instances. Each PD instance has 16 cores, 32 GB memory. + +- Data to be restored: 16.16 TB + +The test result is as follows: + +``` +‘[2022/03/12 22:37:49.060 +08:00] [INFO] [collector.go:67] ["Full restore success summary"] [total-ranges=751760] [ranges-succeed=751760] [ranges-failed=0] [split-region=1h33m18.078448449s] [restore-ranges=542693] [total-take=1h41m35.471476438s] [restore-data-size(after-compressed)=8.337TB] [Size=8336694965072] [BackupTS=431773933856882690] [total-kv=148015861383] [total-kv-size=16.16TB] [average-speed=2.661GB/s]’ +``` + +In the result, you can find that the average speed of restoring one TiKV instance is as high as 181.65 MB/s (`average-speed(GB/s)`/`tikv_count` = `181.65(MB/s)`). 
\ No newline at end of file From 5354405e28969b8422cf2cbe03fda9518168b31e Mon Sep 17 00:00:00 2001 From: en-jin19 Date: Tue, 29 Mar 2022 15:32:25 +0800 Subject: [PATCH 02/15] update translations --- br/backup-and-restore-faq.md | 6 +++--- br/br-batch-create-table.md | 30 +++++++++++++++--------------- 2 files changed, 18 insertions(+), 18 deletions(-) diff --git a/br/backup-and-restore-faq.md b/br/backup-and-restore-faq.md index b16bd186ac77c..0722276590c6c 100644 --- a/br/backup-and-restore-faq.md +++ b/br/backup-and-restore-faq.md @@ -164,11 +164,11 @@ You can use [`filter.rules`](https://github.com/pingcap/tiflow/blob/7c3c2336f981 Yes. BR backs up the [`SHARD_ROW_ID_BITS` and `PRE_SPLIT_REGIONS`](/sql-statements/sql-statement-split-region.md#pre_split_regions) information of a table. The data of the restored table is also split into multiple Regions. -## What should I do when the `entry too large, the max entry size is 6291456, the size of data is 7690800` error reported during data restoration using BR? +## What should I do if the `entry too large, the max entry size is 6291456, the size of data is 7690800` error is reported when restoring data using BR? -You can try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. +Try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. -Suppose that the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, and you use BR to restore the backup data. In this case, TiDB writes the queue of DDL jobs that create tables to TiKV. The total schema size of all tables sent by TiDB at one time should not exceed 6 MB, because the default maximum value of job messages that TiDB can send at one time is `6 MB`. You are **not recommended** to modify this value. 
For details, see [txn-entry-size-limit](/tidb-configuration-file.md#txn-entry-size-limit-new-in-50) and [raft-entry-max-size](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. +Suppose that the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, and you use BR to restore the backup data. In this case, TiDB writes the queue of DDL jobs that create tables to TiKV. At this time, the total schema size of all tables sent by TiDB at a time should not exceed 6 MB, because the default maximum value of job messages that TiDB can send at a time is `6 MB` (you are **not recommended** to modify this value; for details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. ## Why is the `region is unavailable` error reported for a SQL query after I use BR to restore the backup data? diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index 3aae2a31012f2..c6d8757a9538a 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -5,24 +5,24 @@ summary: Learn how to use BR batch create table feature. When restoring data, BR # BR Batch Create Table -When restoring data using Backup & Restore (BR), BR creates databases and tables on the downstream TiDB cluster first, and then restores data. 
In the versions earlier than TiDB v6.0, BR uses the [serial execution](#Implementation) scheme to create tables in the restoration process. However, when restoring data with a large number (nearly 50000) of tables, this scheme takes much time to create tables. +When restoring data using Backup & Restore (BR), BR creates the databases and tables in the target TiDB first, then starts restoring data. In the versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) scheme to create tables in the restoration process. However, when restoring data with a large number (nearly 50000) of tables, this scheme takes much time to create tables. -To speed up the table creation process, and thereby reduce the time for restoring data, in v6.0, TiDB introduces BR batch create table feature. This feature is enabled by default. +To speed up the table creation process, and thereby reduce the time for restoring data, the BR batch create table feature is introduced in TiDB v6.0.0. This feature is enabled by default. > **Note:** > -> - To use the BR batch create table feature, both TiDB and BR should be in 6.0 or later versions. If either TiDB or BR is in the version lower than 6.0, BR uses the serial execution scheme. -> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are in 6.0 or later versions. In this case, BR enables batch create table feature by default without additional configuration. +> - To use the BR batch create table feature, both TiDB and BR should be in 6.0.0 or later versions. If either TiDB or BR is in the version lower than 6.0.0, BR uses the serial execution scheme. +> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are in 6.0.0 or later versions. In this case, BR enables batch create table feature by default without additional configuration. 
## User scenario -When you need to restore data with a considerable number of tables, for example, 50000 tables, you can use BR batch create table feature to speed up the restoration process. +When you need to restore data with a considerable number of tables, for example, 50000 tables, you can use BR batch create table feature to speed up the restoration process. -For the detailed effect, see [Test batch create table feature](#feature-test). +For the detailed effect, see [Test batch create table feature](#test-batch-create-table). ## Use batch create table -BR enables batch create table feature and configures `--ddl-batch-size=128` by default in v6.0 or later versions. Therefore, you do not need to configure this parameter additionally. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. +BR enables batch create table feature and configures `--ddl-batch-size=128` by default in 6.0.0 or later versions. Therefore, you do not need to configure this parameter additionally. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. To disable this feature, you can set `--ddl-batch-size` to `0` by the following command: @@ -32,17 +32,17 @@ To disable this feature, you can set `--ddl-batch-size` to `0` by the following br restore full -s local:///br_data/ --pd 172.16.5.198:2379 --log-file restore.log --ddl-batch-size=0 ``` -After disabling the feature, BR uses the [serial execution scheme](#implementation) instead. +After disabling the feature, BR uses the [serial execution scheme](#implementation-principles) instead. ## Implementation principles -- Serial execution scheme before v6.0: +- Serial execution scheme before v6.0.0: - In the versions earlier than 6.0, BR uses the serial execution scheme. When restoring data, BR creates the database and table in the target TiDB first, then starts restoring data. 
To create tables, after calling TiDB API, BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the schema version changes correspondingly, and each change synchronizes to other BRs and other TiDB DDL workers. Hence, when restoring a large number of tables, the serial execution scheme takes too much time. + In the versions earlier than 6.0.0, BR uses the serial execution scheme. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. To create tables, after calling TiDB API, BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the schema version changes correspondingly, and each version change synchronizes to other BRs and other TiDB DDL workers. Hence, when restoring a large number of tables, the serial execution scheme takes too much time. -- Batch create table scheme since v6.0: +- Batch create table scheme since v6.0.0: - The batch create table feature uses the concurrent batch table creation scheme. From v6.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. + The batch create table feature uses the concurrent batch table creation scheme. From v6.0.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. ## Test batch create table @@ -50,9 +50,9 @@ This section describes the information of testing batch create table feature. Th - Cluster configurations: - - 15 TiKV instances. 
Each TiKV instance has 16 cores, 80 GB memory, and 16 threads to process RPC requests ([`import.num-threads`](/tikv-configuration-file.md#num-threads) = 16). - - 3 TiDB instances. Each TiDB instance has 16 cores, 32 GB memory. - - 3 PD instances. Each PD instance has 16 cores, 32 GB memory. + - 15 TiKV instances. Each TiKV instance has 16 CPU cores, 80 GB memory, and 16 threads to process RPC requests ([`import.num-threads`](/tikv-configuration-file.md#num-threads) = 16). + - 3 TiDB instances. Each TiDB instance has 16 CPU cores, 32 GB memory. + - 3 PD instances. Each PD instance has 16 CPU cores, 32 GB memory. - Data to be restored: 16.16 TB From b49571f1e1297a68d28a63b0a000e4abbcf8f904 Mon Sep 17 00:00:00 2001 From: en-jin19 Date: Tue, 29 Mar 2022 15:54:55 +0800 Subject: [PATCH 03/15] Fix a CI error --- br/backup-and-restore-faq.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/br/backup-and-restore-faq.md b/br/backup-and-restore-faq.md index 0722276590c6c..34240d3ccc0d7 100644 --- a/br/backup-and-restore-faq.md +++ b/br/backup-and-restore-faq.md @@ -168,7 +168,7 @@ Yes. BR backs up the [`SHARD_ROW_ID_BITS` and `PRE_SPLIT_REGIONS`](/sql-statemen Try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. -Suppose that the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, and you use BR to restore the backup data. In this case, TiDB writes the queue of DDL jobs that create tables to TiKV. At this time, the total schema size of all tables sent by TiDB at a time should not exceed 6 MB, because the default maximum value of job messages that TiDB can send at a time is `6 MB` (you are **not recommended** to modify this value; for details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size). 
Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. +Suppose that the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, and you use BR to restore the backup data. In this case, TiDB writes the queue of DDL jobs that create tables to TiKV. At this time, the total schema size of all tables sent by TiDB at a time should not exceed 6 MB, because the default maximum value of job messages that TiDB can send at a time is `6 MB` (you are **not recommended** to modify this value; for details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-v50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. ## Why is the `region is unavailable` error reported for a SQL query after I use BR to restore the backup data? From 511da179846b6f07b57fad34c5ea0981994aa56a Mon Sep 17 00:00:00 2001 From: Enwei Date: Tue, 29 Mar 2022 20:24:03 +0800 Subject: [PATCH 04/15] Apply suggestions from code review Co-authored-by: fengou1 <85682690+fengou1@users.noreply.github.com> --- br/backup-and-restore-faq.md | 2 +- br/br-batch-create-table.md | 12 ++++++------ 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/br/backup-and-restore-faq.md b/br/backup-and-restore-faq.md index 34240d3ccc0d7..b41299fa7ec09 100644 --- a/br/backup-and-restore-faq.md +++ b/br/backup-and-restore-faq.md @@ -168,7 +168,7 @@ Yes. 
BR backs up the [`SHARD_ROW_ID_BITS` and `PRE_SPLIT_REGIONS`](/sql-statemen Try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. -Suppose that the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, and you use BR to restore the backup data. In this case, TiDB writes the queue of DDL jobs that create tables to TiKV. At this time, the total schema size of all tables sent by TiDB at a time should not exceed 6 MB, because the default maximum value of job messages that TiDB can send at a time is `6 MB` (you are **not recommended** to modify this value; for details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-v50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. +When using BR to restore the back up data with the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, TiDB writes a DDL job to the DDL jobs queue that is maintained by TiKV. At this time, the total size of all tables schema sent by TiDB at a time should not exceed 6 MB, because the maximum value of job messages is `6 MB` by default (you are **not recommended** to modify this value; for details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-v50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. 
## Why is the `region is unavailable` error reported for a SQL query after I use BR to restore the backup data? diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index c6d8757a9538a..e4012ea76607e 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -5,24 +5,24 @@ summary: Learn how to use BR batch create table feature. When restoring data, BR # BR Batch Create Table -When restoring data using Backup & Restore (BR), BR creates the databases and tables in the target TiDB first, then starts restoring data. In the versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) scheme to create tables in the restoration process. However, when restoring data with a large number (nearly 50000) of tables, this scheme takes much time to create tables. +When restoring data using Backup & Restore (BR), BR creates the databases and tables in the target TiDB first, then starts restoring table data. In the versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) scheme to create tables in the restoration process. However, when restoring data with a large number (nearly 50000) of tables, this scheme takes much time to create tables. To speed up the table creation process, and thereby reduce the time for restoring data, the BR batch create table feature is introduced in TiDB v6.0.0. This feature is enabled by default. > **Note:** > -> - To use the BR batch create table feature, both TiDB and BR should be in 6.0.0 or later versions. If either TiDB or BR is in the version lower than 6.0.0, BR uses the serial execution scheme. +> - To use the BR batch create table feature, both TiDB and BR should be in v6.0.0 or later. If either TiDB or BR is in the version lower than v6.0.0, BR uses the serial execution scheme. > - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are in 6.0.0 or later versions. 
In this case, BR enables batch create table feature by default without additional configuration. ## User scenario -When you need to restore data with a considerable number of tables, for example, 50000 tables, you can use BR batch create table feature to speed up the restoration process. +When you need to restore data with massive tables, for example, 50000 tables, you can use BR batch create table feature to speed up the restoration process. For the detailed effect, see [Test batch create table feature](#test-batch-create-table). ## Use batch create table -BR enables batch create table feature and configures `--ddl-batch-size=128` by default in 6.0.0 or later versions. Therefore, you do not need to configure this parameter additionally. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. +BR enables batch create table feature and configures `--ddl-batch-size=128` by default in v6.0.0 or later. Therefore, you do not need to configure this parameter additionally. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. To disable this feature, you can set `--ddl-batch-size` to `0` by the following command: @@ -51,8 +51,8 @@ This section describes the information of testing batch create table feature. Th - Cluster configurations: - 15 TiKV instances. Each TiKV instance has 16 CPU cores, 80 GB memory, and 16 threads to process RPC requests ([`import.num-threads`](/tikv-configuration-file.md#num-threads) = 16). - - 3 TiDB instances. Each TiDB instance has 16 CPU cores, 32 GB memory. - - 3 PD instances. Each PD instance has 16 CPU cores, 32 GB memory. + - 3 TiDB instances. Each TiDB instance is equipped with 16 CPU cores, 32 GB memory. + - 3 PD instances. Each PD instance is equipped with 16 CPU cores, 32 GB memory. 
- Data to be restored: 16.16 TB From 0afbb34219022aa284b2f1cbfacd6b30251d949f Mon Sep 17 00:00:00 2001 From: TomShawn <41534398+TomShawn@users.noreply.github.com> Date: Wed, 30 Mar 2022 10:56:05 +0800 Subject: [PATCH 05/15] Update br/backup-and-restore-faq.md Co-authored-by: Enwei --- br/backup-and-restore-faq.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/br/backup-and-restore-faq.md b/br/backup-and-restore-faq.md index b41299fa7ec09..a5370a841b613 100644 --- a/br/backup-and-restore-faq.md +++ b/br/backup-and-restore-faq.md @@ -164,7 +164,7 @@ You can use [`filter.rules`](https://github.com/pingcap/tiflow/blob/7c3c2336f981 Yes. BR backs up the [`SHARD_ROW_ID_BITS` and `PRE_SPLIT_REGIONS`](/sql-statements/sql-statement-split-region.md#pre_split_regions) information of a table. The data of the restored table is also split into multiple Regions. -## What should I do if the `entry too large, the max entry size is 6291456, the size of data is 7690800` error is reported when restoring data using BR? +## What should I do if the restore fails with the error message `the entry too large, the max entry size is 6291456, the size of data is 7690800`? Try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. From 5c02df39a0ad9938a0ed84705345923e1150ca3e Mon Sep 17 00:00:00 2001 From: en-jin19 Date: Wed, 30 Mar 2022 15:47:17 +0800 Subject: [PATCH 06/15] update two parts --- br/br-batch-create-table.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index e4012ea76607e..7827f4564424d 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -42,7 +42,7 @@ After disabling the feature, BR uses the [serial execution scheme](#implementati - Batch create table scheme since v6.0.0: - The batch create table feature uses the concurrent batch table creation scheme. 
From v6.0.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. + From v6.0.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. ## Test batch create table @@ -50,7 +50,7 @@ This section describes the information of testing batch create table feature. Th - Cluster configurations: - - 15 TiKV instances. Each TiKV instance has 16 CPU cores, 80 GB memory, and 16 threads to process RPC requests ([`import.num-threads`](/tikv-configuration-file.md#num-threads) = 16). + - 15 TiKV instances. Each TiKV instance is equipped with 16 CPU cores, 80 GB memory, and 16 threads to process RPC requests ([`import.num-threads`](/tikv-configuration-file.md#num-threads) = 16). - 3 TiDB instances. Each TiDB instance is equipped with 16 CPU cores, 32 GB memory. - 3 PD instances. Each PD instance is equipped with 16 CPU cores, 32 GB memory. From d7bc6ff87f96e89a1d5450996ad0b909101ca3d7 Mon Sep 17 00:00:00 2001 From: Enwei Date: Thu, 31 Mar 2022 20:07:41 +0800 Subject: [PATCH 07/15] Apply suggestions from code review Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com> --- br/backup-and-restore-faq.md | 4 ++-- br/br-batch-create-table.md | 26 +++++++++++++------------- 2 files changed, 15 insertions(+), 15 deletions(-) diff --git a/br/backup-and-restore-faq.md b/br/backup-and-restore-faq.md index a5370a841b613..2e6ece4643fc2 100644 --- a/br/backup-and-restore-faq.md +++ b/br/backup-and-restore-faq.md @@ -166,9 +166,9 @@ Yes. 
BR backs up the [`SHARD_ROW_ID_BITS` and `PRE_SPLIT_REGIONS`](/sql-statemen ## What should I do if the restore fails with the error message `the entry too large, the max entry size is 6291456, the size of data is 7690800`? -Try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. +You can try to reduce the number of tables to be created in a batch by setting `--ddl-batch-size` to `128` or a smaller value. -When using BR to restore the back up data with the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, TiDB writes a DDL job to the DDL jobs queue that is maintained by TiKV. At this time, the total size of all tables schema sent by TiDB at a time should not exceed 6 MB, because the maximum value of job messages is `6 MB` by default (you are **not recommended** to modify this value; for details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-v50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. +When using BR to restore the backup data with the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) greater than `1`, TiDB writes a DDL job of table creation to the DDL jobs queue that is maintained by TiKV. At this time, the total size of all tables schema sent by TiDB at one time should not exceed 6 MB, because the maximum value of job messages is `6 MB` by default (it is **not recommended** to modify this value. For details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-v50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size)). 
Therefore, if you set `--ddl-batch-size` to an excessively large value, the schema size of the tables sent by TiDB in a batch at one time exceeds the specified value, which causes BR to report the `entry too large, the max entry size is 6291456, the size of data is 7690800` error. ## Why is the `region is unavailable` error reported for a SQL query after I use BR to restore the backup data? diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index 7827f4564424d..697c627bce5ed 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -1,30 +1,30 @@ --- title: BR Batch Create Table -summary: Learn how to use BR batch create table feature. When restoring data, BR can use batch create table feature to speed up the restoration. +summary: Learn how to use the BR batch create table feature. When restoring data, BR can create tables in batches to speed up the restore process. --- # BR Batch Create Table -When restoring data using Backup & Restore (BR), BR creates the databases and tables in the target TiDB first, then starts restoring table data. In the versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) scheme to create tables in the restoration process. However, when restoring data with a large number (nearly 50000) of tables, this scheme takes much time to create tables. +When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB before it starts to restore the table data. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this serial implementation of table creation takes much time. To speed up the table creation process, and thereby reduce the time for restoring data, the BR batch create table feature is introduced in TiDB v6.0.0. This feature is enabled by default. 
> **Note:** > -> - To use the BR batch create table feature, both TiDB and BR should be in v6.0.0 or later. If either TiDB or BR is in the version lower than v6.0.0, BR uses the serial execution scheme. -> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are in 6.0.0 or later versions. In this case, BR enables batch create table feature by default without additional configuration. +> - To use the BR batch create table feature, both TiDB and BR are expected to be of v6.0.0 or later. If either TiDB or BR is earlier than v6.0.0, BR uses the serial execution implementation. +> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are of v6.0.0 or later versions, or your TiDB and BR are upgraded from a version earlier than v6.0.0 to v6.0.0 or later. In this case, BR enables the batch create table feature by default without additional configuration. -## User scenario +## Usage scenario -When you need to restore data with massive tables, for example, 50000 tables, you can use BR batch create table feature to speed up the restoration process. +If you need to restore data with a massive amount of tables, for example, 50000 tables, you can use the BR batch create table feature to speed up the restore process. -For the detailed effect, see [Test batch create table feature](#test-batch-create-table). +For the detailed effect, see [Test against the batch create table feature](#test-batch-create-table). ## Use batch create table -BR enables batch create table feature and configures `--ddl-batch-size=128` by default in v6.0.0 or later. Therefore, you do not need to configure this parameter additionally. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. +BR enables the batch create table feature by default, with the default configuration of `--ddl-batch-size=128` in v6.0.0 or later to speed up the restore process. 
Therefore, you do not need to configure this parameter. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. -To disable this feature, you can set `--ddl-batch-size` to `0` by the following command: +To disable this feature, you can set `--ddl-batch-size` to `0`. See the following example command: {{< copyable "shell-regular" >}} @@ -32,7 +32,7 @@ To disable this feature, you can set `--ddl-batch-size` to `0` by the following br restore full -s local:///br_data/ --pd 172.16.5.198:2379 --log-file restore.log --ddl-batch-size=0 ``` -After disabling the feature, BR uses the [serial execution scheme](#implementation-principles) instead. +After this feature is disabled, BR uses the [serial execution implementation](#implementation-principles) instead. ## Implementation principles @@ -40,13 +40,13 @@ After disabling the feature, BR uses the [serial execution scheme](#implementati In the versions earlier than 6.0.0, BR uses the serial execution scheme. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. To create tables, after calling TiDB API, BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the schema version changes correspondingly, and each version change synchronizes to other BRs and other TiDB DDL workers. Hence, when restoring a large number of tables, the serial execution scheme takes too much time. -- Batch create table scheme since v6.0.0: +- Batch create table implementation since v6.0.0: From v6.0.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. 
-## Test batch create table +## Test against the batch create table feature -This section describes the information of testing batch create table feature. The test environment is as follows: +This section describes the information of testing the batch create table feature. The test environment is as follows: - Cluster configurations: From 4adf3768ee90f029361f000a3e9d0d5a56bf0d73 Mon Sep 17 00:00:00 2001 From: fengou1 <85682690+fengou1@users.noreply.github.com> Date: Fri, 1 Apr 2022 08:30:52 +0800 Subject: [PATCH 08/15] Update br/br-batch-create-table.md Co-authored-by: Enwei --- br/br-batch-create-table.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index 697c627bce5ed..78fb4c9801683 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -36,7 +36,7 @@ After this feature is disabled, BR uses the [serial execution implementation](#i ## Implementation principles -- Serial execution scheme before v6.0.0: +- Serial execution solution before v6.0.0: In the versions earlier than 6.0.0, BR uses the serial execution scheme. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. To create tables, after calling TiDB API, BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the schema version changes correspondingly, and each version change synchronizes to other BRs and other TiDB DDL workers. Hence, when restoring a large number of tables, the serial execution scheme takes too much time. 
From 33f967d62ad86e597b3afd5c60fb7a64f86d1974 Mon Sep 17 00:00:00 2001 From: fengou1 <85682690+fengou1@users.noreply.github.com> Date: Fri, 1 Apr 2022 08:59:55 +0800 Subject: [PATCH 09/15] Update br/br-batch-create-table.md --- br/br-batch-create-table.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index 78fb4c9801683..b928d48a3fe47 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -38,7 +38,7 @@ After this feature is disabled, BR uses the [serial execution implementation](#i - Serial execution solution before v6.0.0: - In the versions earlier than 6.0.0, BR uses the serial execution scheme. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. To create tables, after calling TiDB API, BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the schema version changes correspondingly, and each version change synchronizes to other BRs and other TiDB DDL workers. Hence, when restoring a large number of tables, the serial execution scheme takes too much time. + In the versions earlier, BR uses the serial execution implementation. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. To create tables, BR calls TiDB internal API, more like BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. 
- Batch create table implementation since v6.0.0: From 5ee5a498ca7ac6b5fed8dd4405ef8da97079ebff Mon Sep 17 00:00:00 2001 From: Enwei Date: Fri, 1 Apr 2022 12:53:21 +0800 Subject: [PATCH 10/15] Apply suggestions from code review --- br/br-batch-create-table.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index b928d48a3fe47..b321b83f99c39 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -38,7 +38,7 @@ After this feature is disabled, BR uses the [serial execution implementation](#i - Serial execution solution before v6.0.0: - In the versions earlier, BR uses the serial execution implementation. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. To create tables, BR calls TiDB internal API, more like BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. + In the versions earlier, BR uses the serial execution implementation. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. BR calls TiDB internal API to create tables, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. 
- Batch create table implementation since v6.0.0: From d3f78eb4d1032ca7dbd4d79c00cf9c721bf67682 Mon Sep 17 00:00:00 2001 From: en-jin19 Date: Fri, 1 Apr 2022 15:38:46 +0800 Subject: [PATCH 11/15] update the translation --- br/br-batch-create-table.md | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index b321b83f99c39..55c3b6fe0fda3 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -1,28 +1,28 @@ --- -title: BR Batch Create Table -summary: Learn how to use the BR batch create table feature. When restoring data, BR can create tables in batches to speed up the restore process. +title: Batch Create Table +summary: Learn how to use the Batch Create Table feature. When restoring data, BR can create tables in batches to speed up the restore process. --- -# BR Batch Create Table +# Batch Create Table -When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB before it starts to restore the table data. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this serial implementation of table creation takes much time. +When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB before restoring the table data. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) solution to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this solution takes much time on creating tables. -To speed up the table creation process, and thereby reduce the time for restoring data, the BR batch create table feature is introduced in TiDB v6.0.0. This feature is enabled by default. 
+To speed up the table creation process, and thereby reduce the time for restoring data, the Batch Create Table feature is introduced in TiDB v6.0.0. This feature is enabled by default. > **Note:** > -> - To use the BR batch create table feature, both TiDB and BR are expected to be of v6.0.0 or later. If either TiDB or BR is earlier than v6.0.0, BR uses the serial execution implementation. -> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are of v6.0.0 or later versions, or your TiDB and BR are upgraded from a version earlier than v6.0.0 to v6.0.0 or later. In this case, BR enables the batch create table feature by default without additional configuration. +> - To use the Batch Create Table feature, both TiDB and BR are expected to be of v6.0.0 or later. If either TiDB or BR is earlier than v6.0.0, BR uses the serial execution solution. +> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are of v6.0.0 or later versions, or your TiDB and BR are upgraded from a version earlier than v6.0.0 to v6.0.0 or later. In this case, BR enables the Batch Create Table feature by default, without additional configuration. ## Usage scenario -If you need to restore data with a massive amount of tables, for example, 50000 tables, you can use the BR batch create table feature to speed up the restore process. +If you need to restore data with a massive amount of tables, for example, 50000 tables, you can use the Batch Create Table feature to speed up the restore process. -For the detailed effect, see [Test against the batch create table feature](#test-batch-create-table). +For the detailed effect, see [Test for the Batch Create Table Feature](#test-batch-create-table). -## Use batch create table +## Use the Batch Create Table feature -BR enables the batch create table feature by default, with the default configuration of `--ddl-batch-size=128` in v6.0.0 or later to speed up the restore process. 
Therefore, you do not need to configure this parameter. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. +BR enables the Batch Create Table Feature by default, with the default configuration of `--ddl-batch-size=128` in v6.0.0 or later to speed up the restore process. Therefore, you do not need to configure this parameter. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. To disable this feature, you can set `--ddl-batch-size` to `0`. See the following example command: @@ -32,21 +32,21 @@ To disable this feature, you can set `--ddl-batch-size` to `0`. See the followin br restore full -s local:///br_data/ --pd 172.16.5.198:2379 --log-file restore.log --ddl-batch-size=0 ``` -After this feature is disabled, BR uses the [serial execution implementation](#implementation-principles) instead. +After this feature is disabled, BR uses the [serial execution solution](#implementation-principles) instead. ## Implementation principles - Serial execution solution before v6.0.0: - In the versions earlier, BR uses the serial execution implementation. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. BR calls TiDB internal API to create tables, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. + When restoring data using BR, BR creates databases and tables in the target TiDB before restoring the table data.To create tables, BR calls TiDB internal API first, and then process table creation tasks, which operation looks like BR executes the SQL `Create Table` statement. 
TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution solution takes too much time. -- Batch create table implementation since v6.0.0: +- Batch create table solution since v6.0.0: - From v6.0.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. + By default, BR creates tables in multiple batches, and each batch has 128 tables. Using this solution, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. -## Test against the batch create table feature +## Test for the Batch Create Table feature -This section describes the information of testing the batch create table feature. The test environment is as follows: +This section describes the test information about the Batch Create Table feature. The test environment is as follows: - Cluster configurations: @@ -62,4 +62,4 @@ The test result is as follows: ‘[2022/03/12 22:37:49.060 +08:00] [INFO] [collector.go:67] ["Full restore success summary"] [total-ranges=751760] [ranges-succeed=751760] [ranges-failed=0] [split-region=1h33m18.078448449s] [restore-ranges=542693] [total-take=1h41m35.471476438s] [restore-data-size(after-compressed)=8.337TB] [Size=8336694965072] [BackupTS=431773933856882690] [total-kv=148015861383] [total-kv-size=16.16TB] [average-speed=2.661GB/s]’ ``` -In the result, you can find that the average speed of restoring one TiKV instance is as high as 181.65 MB/s (`average-speed(GB/s)`/`tikv_count` = `181.65(MB/s)`). 
\ No newline at end of file +In the test result, you can find that the average speed of restoring one TiKV instance is as high as 181.65 MB/s (`average-speed(GB/s)`/`tikv_count` = `181.65(MB/s)`). \ No newline at end of file From c9d4a9a94b21714f6da5937d675df09cb14c59ac Mon Sep 17 00:00:00 2001 From: en-jin19 Date: Fri, 1 Apr 2022 17:07:01 +0800 Subject: [PATCH 12/15] change "solution" to "implementation" --- br/br-batch-create-table.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index 55c3b6fe0fda3..e465fc46e911c 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -5,13 +5,13 @@ summary: Learn how to use the Batch Create Table feature. When restoring data, B # Batch Create Table -When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB before restoring the table data. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) solution to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this solution takes much time on creating tables. +When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB before restoring the table data. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) implementation to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this implementation takes much time on creating tables. To speed up the table creation process, and thereby reduce the time for restoring data, the Batch Create Table feature is introduced in TiDB v6.0.0. This feature is enabled by default. > **Note:** > -> - To use the Batch Create Table feature, both TiDB and BR are expected to be of v6.0.0 or later. 
If either TiDB or BR is earlier than v6.0.0, BR uses the serial execution solution. +> - To use the Batch Create Table feature, both TiDB and BR are expected to be of v6.0.0 or later. If either TiDB or BR is earlier than v6.0.0, BR uses the serial execution implementation. > - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are of v6.0.0 or later versions, or your TiDB and BR are upgraded from a version earlier than v6.0.0 to v6.0.0 or later. In this case, BR enables the Batch Create Table feature by default, without additional configuration. ## Usage scenario @@ -32,17 +32,17 @@ To disable this feature, you can set `--ddl-batch-size` to `0`. See the followin br restore full -s local:///br_data/ --pd 172.16.5.198:2379 --log-file restore.log --ddl-batch-size=0 ``` -After this feature is disabled, BR uses the [serial execution solution](#implementation-principles) instead. +After this feature is disabled, BR uses the [serial execution implementation](#implementation-principles) instead. ## Implementation principles -- Serial execution solution before v6.0.0: +- Serial execution implementation before v6.0.0: - When restoring data using BR, BR creates databases and tables in the target TiDB before restoring the table data.To create tables, BR calls TiDB internal API first, and then process table creation tasks, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution solution takes too much time. 
+ When restoring data, BR creates databases and tables in the target TiDB before restoring the table data.To create tables, BR calls TiDB internal API first, and then process table creation tasks, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. -- Batch create table solution since v6.0.0: +- Batch create table implementation since v6.0.0: - By default, BR creates tables in multiple batches, and each batch has 128 tables. Using this solution, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. + By default, BR creates tables in multiple batches, and each batch has 128 tables. Using this implementation, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. ## Test for the Batch Create Table feature From b7016ecb34172b28cb786a377d6839cd38861cdf Mon Sep 17 00:00:00 2001 From: Enwei Date: Fri, 1 Apr 2022 17:31:12 +0800 Subject: [PATCH 13/15] Apply suggestions from code review --- br/br-batch-create-table.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index e465fc46e911c..82c064a795d6d 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -5,7 +5,7 @@ summary: Learn how to use the Batch Create Table feature. When restoring data, B # Batch Create Table -When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB before restoring the table data. 
In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) implementation to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this implementation takes much time on creating tables. +When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB and then restores the backed-up data to the tables. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) implementation to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this implementation takes much time on creating tables. To speed up the table creation process, and thereby reduce the time for restoring data, the Batch Create Table feature is introduced in TiDB v6.0.0. This feature is enabled by default. @@ -38,7 +38,7 @@ After this feature is disabled, BR uses the [serial execution implementation](#i - Serial execution implementation before v6.0.0: - When restoring data, BR creates databases and tables in the target TiDB before restoring the table data.To create tables, BR calls TiDB internal API first, and then process table creation tasks, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. + When restoring data, BR creates databases and tables in the target TiDB and then restores the backed-up data to the tables. To create tables, BR calls TiDB internal API first, and then processes table creation tasks, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. 
Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. - Batch create table implementation since v6.0.0: From 0bf96d767ef4c7d4c2d3eccd70c507240a83b133 Mon Sep 17 00:00:00 2001 From: Enwei Date: Fri, 1 Apr 2022 17:51:32 +0800 Subject: [PATCH 14/15] fix a CI error --- br/br-batch-create-table.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index 82c064a795d6d..aa10d182942e7 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -18,7 +18,7 @@ To speed up the table creation process, and thereby reduce the time for restorin If you need to restore data with a massive amount of tables, for example, 50000 tables, you can use the Batch Create Table feature to speed up the restore process. -For the detailed effect, see [Test for the Batch Create Table Feature](#test-batch-create-table). +For the detailed effect, see [Test for the Batch Create Table Feature](#test-for-the-batch-create-table-feature). ## Use the Batch Create Table feature From de38ae9cc7c29870f0f013000970a5ac449639d0 Mon Sep 17 00:00:00 2001 From: Enwei Date: Fri, 1 Apr 2022 19:56:33 +0800 Subject: [PATCH 15/15] Apply suggestions from code review Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com> --- br/br-batch-create-table.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index aa10d182942e7..b10e066f6185c 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -5,14 +5,14 @@ summary: Learn how to use the Batch Create Table feature. 
When restoring data, B # Batch Create Table -When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB and then restores the backed-up data to the tables. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) implementation to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this implementation takes much time on creating tables. +When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB cluster and then restores the backup data to the tables. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) implementation to create tables in the restore process. However, when BR restores data with a large number of tables (nearly 50000), this implementation takes much time on creating tables. -To speed up the table creation process, and thereby reduce the time for restoring data, the Batch Create Table feature is introduced in TiDB v6.0.0. This feature is enabled by default. +To speed up the table creation process and reduce the time for restoring data, the Batch Create Table feature is introduced in TiDB v6.0.0. This feature is enabled by default. > **Note:** > > - To use the Batch Create Table feature, both TiDB and BR are expected to be of v6.0.0 or later. If either TiDB or BR is earlier than v6.0.0, BR uses the serial execution implementation. -> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are of v6.0.0 or later versions, or your TiDB and BR are upgraded from a version earlier than v6.0.0 to v6.0.0 or later. In this case, BR enables the Batch Create Table feature by default, without additional configuration. 
+> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are of v6.0.0 or later versions, or your TiDB and BR are upgraded from a version earlier than v6.0.0 to v6.0.0 or later. In this case, BR enables the Batch Create Table feature by default. ## Usage scenario @@ -22,7 +22,7 @@ For the detailed effect, see [Test for the Batch Create Table Feature](#test-for ## Use the Batch Create Table feature -BR enables the Batch Create Table Feature by default, with the default configuration of `--ddl-batch-size=128` in v6.0.0 or later to speed up the restore process. Therefore, you do not need to configure this parameter. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. +BR enables the Batch Create Table feature by default, with the default configuration of `--ddl-batch-size=128` in v6.0.0 or later to speed up the restore process. Therefore, you do not need to configure this parameter. `--ddl-batch-size=128` means that BR creates tables in batches, each batch with 128 tables. To disable this feature, you can set `--ddl-batch-size` to `0`. See the following example command: @@ -38,11 +38,11 @@ After this feature is disabled, BR uses the [serial execution implementation](#i - Serial execution implementation before v6.0.0: - When restoring data, BR creates databases and tables in the target TiDB and then restores the backed-up data to the tables. To create tables, BR calls TiDB internal API first, and then processes table creation tasks, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. 
+ When restoring data, BR creates databases and tables in the target TiDB cluster and then restores the backup data to the tables. To create tables, BR first calls the TiDB internal API and then processes table creation tasks, which works similarly to BR executing the `Create Table` statement. The TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly and each version change is synchronized to other TiDB DDL workers (including BR). Therefore, when BR restores a large number of tables, the serial execution implementation is time-consuming. - Batch create table implementation since v6.0.0: - By default, BR creates tables in multiple batches, and each batch has 128 tables. Using this implementation, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. + By default, BR creates tables in multiple batches, and each batch has 128 tables. Using this implementation, when BR creates one batch of tables, the TiDB schema version only changes once. This implementation significantly increases the speed of table creation. ## Test for the Batch Create Table feature @@ -54,7 +54,7 @@ This section describes the test information about the Batch Create Table feature - 3 TiDB instances. Each TiDB instance is equipped with 16 CPU cores, 32 GB memory. - 3 PD instances. Each PD instance is equipped with 16 CPU cores, 32 GB memory. 
-- Data to be restored: 16.16 TB +- The size of data to be restored: 16.16 TB The test result is as follows: @@ -62,4 +62,4 @@ The test result is as follows: ‘[2022/03/12 22:37:49.060 +08:00] [INFO] [collector.go:67] ["Full restore success summary"] [total-ranges=751760] [ranges-succeed=751760] [ranges-failed=0] [split-region=1h33m18.078448449s] [restore-ranges=542693] [total-take=1h41m35.471476438s] [restore-data-size(after-compressed)=8.337TB] [Size=8336694965072] [BackupTS=431773933856882690] [total-kv=148015861383] [total-kv-size=16.16TB] [average-speed=2.661GB/s]’ ``` -In the test result, you can find that the average speed of restoring one TiKV instance is as high as 181.65 MB/s (`average-speed(GB/s)`/`tikv_count` = `181.65(MB/s)`). \ No newline at end of file +From the test result, you can see that the average speed of restoring one TiKV instance is as high as 181.65 MB/s (which is equal to `average-speed`/`tikv_count`). \ No newline at end of file