From 84206d23b1ac8784bea53011d479bf6aa03a1985 Mon Sep 17 00:00:00 2001 From: en-jin19 Date: Tue, 29 Mar 2022 14:57:51 +0800 Subject: [PATCH 01/15] BR: Add a new doc about the batch create table --- TOC.md | 2 ++ br/backup-and-restore-faq.md | 6 ++++ br/br-batch-create-table.md | 65 ++++++++++++++++++++++++++++++++++++ 3 files changed, 73 insertions(+) create mode 100644 br/br-batch-create-table.md diff --git a/TOC.md b/TOC.md index c8b23709cf798..d7230f8c82e93 100644 --- a/TOC.md +++ b/TOC.md @@ -72,6 +72,7 @@ - [Back up and Restore Data on Azure Blob Storage](/br/backup-and-restore-azblob.md) - BR Features - [Auto Tune](/br/br-auto-tune.md) + - [Batch Create Table](/br/br-batch-create-table.md) - [BR FAQ](/br/backup-and-restore-faq.md) - [Configure Time Zone](/configure-time-zone.md) - [Daily Checklist](/daily-check.md) @@ -203,6 +204,7 @@ - [External Storages](/br/backup-and-restore-storages.md) - BR Features - [Auto Tune](/br/br-auto-tune.md) + - [Batch Create Table](/br/br-batch-create-table.md) - [BR FAQ](/br/backup-and-restore-faq.md) - TiDB Binlog - [Overview](/tidb-binlog/tidb-binlog-overview.md) diff --git a/br/backup-and-restore-faq.md b/br/backup-and-restore-faq.md index 6dca31b9c41c4..b16bd186ac77c 100644 --- a/br/backup-and-restore-faq.md +++ b/br/backup-and-restore-faq.md @@ -164,6 +164,12 @@ You can use [`filter.rules`](https://github.com/pingcap/tiflow/blob/7c3c2336f981 Yes. BR backs up the [`SHARD_ROW_ID_BITS` and `PRE_SPLIT_REGIONS`](/sql-statements/sql-statement-split-region.md#pre_split_regions) information of a table. The data of the restored table is also split into multiple Regions. +## What should I do when the `entry too large, the max entry size is 6291456, the size of data is 7690800` error reported during data restoration using BR? + +You can try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. 
+ +Suppose that the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, and you use BR to restore the backup data. In this case, TiDB writes the queue of DDL jobs that create tables to TiKV. The total schema size of all tables sent by TiDB at one time should not exceed 6 MB, because the default maximum value of job messages that TiDB can send at one time is `6 MB`. You are **not recommended** to modify this value. For details, see [txn-entry-size-limit](/tidb-configuration-file.md#txn-entry-size-limit-new-in-50) and [raft-entry-max-size](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. + ## Why is the `region is unavailable` error reported for a SQL query after I use BR to restore the backup data? If the cluster backed up using BR has TiFlash, `TableInfo` stores the TiFlash information when BR restores the backup data. If the cluster to be restored does not have TiFlash, the `region is unavailable` error is reported. diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md new file mode 100644 index 0000000000000..3aae2a31012f2 --- /dev/null +++ b/br/br-batch-create-table.md @@ -0,0 +1,65 @@ +--- +title: BR Batch Create Table +summary: Learn how to use BR batch create table feature. When restoring data, BR can use batch create table feature to speed up the restoration. +--- + +# BR Batch Create Table + +When restoring data using Backup & Restore (BR), BR creates databases and tables on the downstream TiDB cluster first, and then restores data. In the versions earlier than TiDB v6.0, BR uses the [serial execution](#Implementation) scheme to create tables in the restoration process. 
However, when restoring data with a large number (nearly 50000) of tables, this scheme takes much time to create tables. + +To speed up the table creation process, and thereby reduce the time for restoring data, in v6.0, TiDB introduces BR batch create table feature. This feature is enabled by default. + +> **Note:** +> +> - To use the BR batch create table feature, both TiDB and BR should be in 6.0 or later versions. If either TiDB or BR is in the version lower than 6.0, BR uses the serial execution scheme. +> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are in 6.0 or later versions. In this case, BR enables batch create table feature by default without additional configuration. + +## User scenario + +When you need to restore data with a considerable number of tables, for example, 50000 tables, you can use BR batch create table feature to speed up the restoration process. + +For the detailed effect, see [Test batch create table feature](#feature-test). + +## Use batch create table + +BR enables batch create table feature and configures `--ddl-batch-size=128` by default in v6.0 or later versions. Therefore, you do not need to configure this parameter additionally. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. + +To disable this feature, you can set `--ddl-batch-size` to `0` by the following command: + +{{< copyable "shell-regular" >}} + +```shell +br restore full -s local:///br_data/ --pd 172.16.5.198:2379 --log-file restore.log --ddl-batch-size=0 +``` + +After disabling the feature, BR uses the [serial execution scheme](#implementation) instead. + +## Implementation principles + +- Serial execution scheme before v6.0: + + In the versions earlier than 6.0, BR uses the serial execution scheme. When restoring data, BR creates the database and table in the target TiDB first, then starts restoring data. 
To create tables, after calling TiDB API, BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the schema version changes correspondingly, and each change synchronizes to other BRs and other TiDB DDL workers. Hence, when restoring a large number of tables, the serial execution scheme takes too much time. + +- Batch create table scheme since v6.0: + + The batch create table feature uses the concurrent batch table creation scheme. From v6.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. + +## Test batch create table + +This section describes the information of testing batch create table feature. The test environment is as follows: + +- Cluster configurations: + + - 15 TiKV instances. Each TiKV instance has 16 cores, 80 GB memory, and 16 threads to process RPC requests ([`import.num-threads`](/tikv-configuration-file.md#num-threads) = 16). + - 3 TiDB instances. Each TiDB instance has 16 cores, 32 GB memory. + - 3 PD instances. Each PD instance has 16 cores, 32 GB memory. + +- Data to be restored: 16.16 TB + +The test result is as follows: + +``` +‘[2022/03/12 22:37:49.060 +08:00] [INFO] [collector.go:67] ["Full restore success summary"] [total-ranges=751760] [ranges-succeed=751760] [ranges-failed=0] [split-region=1h33m18.078448449s] [restore-ranges=542693] [total-take=1h41m35.471476438s] [restore-data-size(after-compressed)=8.337TB] [Size=8336694965072] [BackupTS=431773933856882690] [total-kv=148015861383] [total-kv-size=16.16TB] [average-speed=2.661GB/s]’ +``` + +In the result, you can find that the average speed of restoring one TiKV instance is as high as 181.65 MB/s (`average-speed(GB/s)`/`tikv_count` = `181.65(MB/s)`). 
\ No newline at end of file From 7340da82dcbce4fc22fcdd4a77bfa54625ffb43e Mon Sep 17 00:00:00 2001 From: en-jin19 Date: Tue, 29 Mar 2022 15:32:25 +0800 Subject: [PATCH 02/15] update translations --- br/backup-and-restore-faq.md | 6 +++--- br/br-batch-create-table.md | 30 +++++++++++++++--------------- 2 files changed, 18 insertions(+), 18 deletions(-) diff --git a/br/backup-and-restore-faq.md b/br/backup-and-restore-faq.md index b16bd186ac77c..0722276590c6c 100644 --- a/br/backup-and-restore-faq.md +++ b/br/backup-and-restore-faq.md @@ -164,11 +164,11 @@ You can use [`filter.rules`](https://github.com/pingcap/tiflow/blob/7c3c2336f981 Yes. BR backs up the [`SHARD_ROW_ID_BITS` and `PRE_SPLIT_REGIONS`](/sql-statements/sql-statement-split-region.md#pre_split_regions) information of a table. The data of the restored table is also split into multiple Regions. -## What should I do when the `entry too large, the max entry size is 6291456, the size of data is 7690800` error reported during data restoration using BR? +## What should I do if the `entry too large, the max entry size is 6291456, the size of data is 7690800` error is reported when restoring data using BR? -You can try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. +Try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. -Suppose that the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, and you use BR to restore the backup data. In this case, TiDB writes the queue of DDL jobs that create tables to TiKV. The total schema size of all tables sent by TiDB at one time should not exceed 6 MB, because the default maximum value of job messages that TiDB can send at one time is `6 MB`. You are **not recommended** to modify this value. 
For details, see [txn-entry-size-limit](/tidb-configuration-file.md#txn-entry-size-limit-new-in-50) and [raft-entry-max-size](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. +Suppose that the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, and you use BR to restore the backup data. In this case, TiDB writes the queue of DDL jobs that create tables to TiKV. At this time, the total schema size of all tables sent by TiDB at a time should not exceed 6 MB, because the default maximum value of job messages that TiDB can send at a time is `6 MB` (you are **not recommended** to modify this value; for details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. ## Why is the `region is unavailable` error reported for a SQL query after I use BR to restore the backup data? diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index 3aae2a31012f2..c6d8757a9538a 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -5,24 +5,24 @@ summary: Learn how to use BR batch create table feature. When restoring data, BR # BR Batch Create Table -When restoring data using Backup & Restore (BR), BR creates databases and tables on the downstream TiDB cluster first, and then restores data. 
In the versions earlier than TiDB v6.0, BR uses the [serial execution](#Implementation) scheme to create tables in the restoration process. However, when restoring data with a large number (nearly 50000) of tables, this scheme takes much time to create tables. +When restoring data using Backup & Restore (BR), BR creates the databases and tables in the target TiDB first, then starts restoring data. In the versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) scheme to create tables in the restoration process. However, when restoring data with a large number (nearly 50000) of tables, this scheme takes much time to create tables. -To speed up the table creation process, and thereby reduce the time for restoring data, in v6.0, TiDB introduces BR batch create table feature. This feature is enabled by default. +To speed up the table creation process, and thereby reduce the time for restoring data, the BR batch create table feature is introduced in TiDB v6.0.0. This feature is enabled by default. > **Note:** > -> - To use the BR batch create table feature, both TiDB and BR should be in 6.0 or later versions. If either TiDB or BR is in the version lower than 6.0, BR uses the serial execution scheme. -> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are in 6.0 or later versions. In this case, BR enables batch create table feature by default without additional configuration. +> - To use the BR batch create table feature, both TiDB and BR should be in 6.0.0 or later versions. If either TiDB or BR is in the version lower than 6.0.0, BR uses the serial execution scheme. +> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are in 6.0.0 or later versions. In this case, BR enables batch create table feature by default without additional configuration. 
## User scenario -When you need to restore data with a considerable number of tables, for example, 50000 tables, you can use BR batch create table feature to speed up the restoration process. +When you need to restore data with a considerable number of tables, for example, 50000 tables, you can use BR batch create table feature to speed up the restoration process. -For the detailed effect, see [Test batch create table feature](#feature-test). +For the detailed effect, see [Test batch create table feature](#test-batch-create-table). ## Use batch create table -BR enables batch create table feature and configures `--ddl-batch-size=128` by default in v6.0 or later versions. Therefore, you do not need to configure this parameter additionally. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. +BR enables batch create table feature and configures `--ddl-batch-size=128` by default in 6.0.0 or later versions. Therefore, you do not need to configure this parameter additionally. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. To disable this feature, you can set `--ddl-batch-size` to `0` by the following command: @@ -32,17 +32,17 @@ To disable this feature, you can set `--ddl-batch-size` to `0` by the following br restore full -s local:///br_data/ --pd 172.16.5.198:2379 --log-file restore.log --ddl-batch-size=0 ``` -After disabling the feature, BR uses the [serial execution scheme](#implementation) instead. +After disabling the feature, BR uses the [serial execution scheme](#implementation-principles) instead. ## Implementation principles -- Serial execution scheme before v6.0: +- Serial execution scheme before v6.0.0: - In the versions earlier than 6.0, BR uses the serial execution scheme. When restoring data, BR creates the database and table in the target TiDB first, then starts restoring data. 
To create tables, after calling TiDB API, BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the schema version changes correspondingly, and each change synchronizes to other BRs and other TiDB DDL workers. Hence, when restoring a large number of tables, the serial execution scheme takes too much time. + In the versions earlier than 6.0.0, BR uses the serial execution scheme. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. To create tables, after calling TiDB API, BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the schema version changes correspondingly, and each version change synchronizes to other BRs and other TiDB DDL workers. Hence, when restoring a large number of tables, the serial execution scheme takes too much time. -- Batch create table scheme since v6.0: +- Batch create table scheme since v6.0.0: - The batch create table feature uses the concurrent batch table creation scheme. From v6.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. + The batch create table feature uses the concurrent batch table creation scheme. From v6.0.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. ## Test batch create table @@ -50,9 +50,9 @@ This section describes the information of testing batch create table feature. Th - Cluster configurations: - - 15 TiKV instances. 
Each TiKV instance has 16 cores, 80 GB memory, and 16 threads to process RPC requests ([`import.num-threads`](/tikv-configuration-file.md#num-threads) = 16). - - 3 TiDB instances. Each TiDB instance has 16 cores, 32 GB memory. - - 3 PD instances. Each PD instance has 16 cores, 32 GB memory. + - 15 TiKV instances. Each TiKV instance has 16 CPU cores, 80 GB memory, and 16 threads to process RPC requests ([`import.num-threads`](/tikv-configuration-file.md#num-threads) = 16). + - 3 TiDB instances. Each TiDB instance has 16 CPU cores, 32 GB memory. + - 3 PD instances. Each PD instance has 16 CPU cores, 32 GB memory. - Data to be restored: 16.16 TB From 6b5773f3cddc9473a396c3245d86befcd5fac6aa Mon Sep 17 00:00:00 2001 From: en-jin19 Date: Tue, 29 Mar 2022 15:54:55 +0800 Subject: [PATCH 03/15] Fix a CI error --- br/backup-and-restore-faq.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/br/backup-and-restore-faq.md b/br/backup-and-restore-faq.md index 0722276590c6c..34240d3ccc0d7 100644 --- a/br/backup-and-restore-faq.md +++ b/br/backup-and-restore-faq.md @@ -168,7 +168,7 @@ Yes. BR backs up the [`SHARD_ROW_ID_BITS` and `PRE_SPLIT_REGIONS`](/sql-statemen Try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. -Suppose that the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, and you use BR to restore the backup data. In this case, TiDB writes the queue of DDL jobs that create tables to TiKV. At this time, the total schema size of all tables sent by TiDB at a time should not exceed 6 MB, because the default maximum value of job messages that TiDB can send at a time is `6 MB` (you are **not recommended** to modify this value; for details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size). 
Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. +Suppose that the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, and you use BR to restore the backup data. In this case, TiDB writes the queue of DDL jobs that create tables to TiKV. At this time, the total schema size of all tables sent by TiDB at a time should not exceed 6 MB, because the default maximum value of job messages that TiDB can send at a time is `6 MB` (you are **not recommended** to modify this value; for details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-v50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. ## Why is the `region is unavailable` error reported for a SQL query after I use BR to restore the backup data? From c93b87a1353e07adc6f2a9f6ed055de4e0dc713c Mon Sep 17 00:00:00 2001 From: Enwei Date: Tue, 29 Mar 2022 20:24:03 +0800 Subject: [PATCH 04/15] Apply suggestions from code review Co-authored-by: fengou1 <85682690+fengou1@users.noreply.github.com> --- br/backup-and-restore-faq.md | 2 +- br/br-batch-create-table.md | 12 ++++++------ 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/br/backup-and-restore-faq.md b/br/backup-and-restore-faq.md index 34240d3ccc0d7..b41299fa7ec09 100644 --- a/br/backup-and-restore-faq.md +++ b/br/backup-and-restore-faq.md @@ -168,7 +168,7 @@ Yes. 
BR backs up the [`SHARD_ROW_ID_BITS` and `PRE_SPLIT_REGIONS`](/sql-statemen Try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. -Suppose that the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, and you use BR to restore the backup data. In this case, TiDB writes the queue of DDL jobs that create tables to TiKV. At this time, the total schema size of all tables sent by TiDB at a time should not exceed 6 MB, because the default maximum value of job messages that TiDB can send at a time is `6 MB` (you are **not recommended** to modify this value; for details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-v50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. +When using BR to restore the back up data with the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, TiDB writes a DDL job to the DDL jobs queue that is maintained by TiKV. At this time, the total size of all tables schema sent by TiDB at a time should not exceed 6 MB, because the maximum value of job messages is `6 MB` by default (you are **not recommended** to modify this value; for details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-v50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. 
## Why is the `region is unavailable` error reported for a SQL query after I use BR to restore the backup data? diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index c6d8757a9538a..e4012ea76607e 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -5,24 +5,24 @@ summary: Learn how to use BR batch create table feature. When restoring data, BR # BR Batch Create Table -When restoring data using Backup & Restore (BR), BR creates the databases and tables in the target TiDB first, then starts restoring data. In the versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) scheme to create tables in the restoration process. However, when restoring data with a large number (nearly 50000) of tables, this scheme takes much time to create tables. +When restoring data using Backup & Restore (BR), BR creates the databases and tables in the target TiDB first, then starts restoring table data. In the versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) scheme to create tables in the restoration process. However, when restoring data with a large number (nearly 50000) of tables, this scheme takes much time to create tables. To speed up the table creation process, and thereby reduce the time for restoring data, the BR batch create table feature is introduced in TiDB v6.0.0. This feature is enabled by default. > **Note:** > -> - To use the BR batch create table feature, both TiDB and BR should be in 6.0.0 or later versions. If either TiDB or BR is in the version lower than 6.0.0, BR uses the serial execution scheme. +> - To use the BR batch create table feature, both TiDB and BR should be in v6.0.0 or later. If either TiDB or BR is in the version lower than v6.0.0, BR uses the serial execution scheme. > - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are in 6.0.0 or later versions. 
In this case, BR enables batch create table feature by default without additional configuration. ## User scenario -When you need to restore data with a considerable number of tables, for example, 50000 tables, you can use BR batch create table feature to speed up the restoration process. +When you need to restore data with massive tables, for example, 50000 tables, you can use BR batch create table feature to speed up the restoration process. For the detailed effect, see [Test batch create table feature](#test-batch-create-table). ## Use batch create table -BR enables batch create table feature and configures `--ddl-batch-size=128` by default in 6.0.0 or later versions. Therefore, you do not need to configure this parameter additionally. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. +BR enables batch create table feature and configures `--ddl-batch-size=128` by default in v6.0.0 or later. Therefore, you do not need to configure this parameter additionally. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. To disable this feature, you can set `--ddl-batch-size` to `0` by the following command: @@ -51,8 +51,8 @@ This section describes the information of testing batch create table feature. Th - Cluster configurations: - 15 TiKV instances. Each TiKV instance has 16 CPU cores, 80 GB memory, and 16 threads to process RPC requests ([`import.num-threads`](/tikv-configuration-file.md#num-threads) = 16). - - 3 TiDB instances. Each TiDB instance has 16 CPU cores, 32 GB memory. - - 3 PD instances. Each PD instance has 16 CPU cores, 32 GB memory. + - 3 TiDB instances. Each TiDB instance is equipped with 16 CPU cores, 32 GB memory. + - 3 PD instances. Each PD instance is equipped with 16 CPU cores, 32 GB memory. 
- Data to be restored: 16.16 TB From 73b41ccad111dd24b35249bd11f2880e43d2ab89 Mon Sep 17 00:00:00 2001 From: TomShawn <41534398+TomShawn@users.noreply.github.com> Date: Wed, 30 Mar 2022 10:56:05 +0800 Subject: [PATCH 05/15] Update br/backup-and-restore-faq.md Co-authored-by: Enwei --- br/backup-and-restore-faq.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/br/backup-and-restore-faq.md b/br/backup-and-restore-faq.md index b41299fa7ec09..a5370a841b613 100644 --- a/br/backup-and-restore-faq.md +++ b/br/backup-and-restore-faq.md @@ -164,7 +164,7 @@ You can use [`filter.rules`](https://github.com/pingcap/tiflow/blob/7c3c2336f981 Yes. BR backs up the [`SHARD_ROW_ID_BITS` and `PRE_SPLIT_REGIONS`](/sql-statements/sql-statement-split-region.md#pre_split_regions) information of a table. The data of the restored table is also split into multiple Regions. -## What should I do if the `entry too large, the max entry size is 6291456, the size of data is 7690800` error is reported when restoring data using BR? +## What should I do if the restore fails with the error message `the entry too large, the max entry size is 6291456, the size of data is 7690800`? Try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. From c275cf0ecccfbb79ec775ba62946c470fc2781e1 Mon Sep 17 00:00:00 2001 From: en-jin19 Date: Wed, 30 Mar 2022 15:47:17 +0800 Subject: [PATCH 06/15] update two parts --- br/br-batch-create-table.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index e4012ea76607e..7827f4564424d 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -42,7 +42,7 @@ After disabling the feature, BR uses the [serial execution scheme](#implementati - Batch create table scheme since v6.0.0: - The batch create table feature uses the concurrent batch table creation scheme. 
From v6.0.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. + From v6.0.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. ## Test batch create table @@ -50,7 +50,7 @@ This section describes the information of testing batch create table feature. Th - Cluster configurations: - - 15 TiKV instances. Each TiKV instance has 16 CPU cores, 80 GB memory, and 16 threads to process RPC requests ([`import.num-threads`](/tikv-configuration-file.md#num-threads) = 16). + - 15 TiKV instances. Each TiKV instance is equipped with 16 CPU cores, 80 GB memory, and 16 threads to process RPC requests ([`import.num-threads`](/tikv-configuration-file.md#num-threads) = 16). - 3 TiDB instances. Each TiDB instance is equipped with 16 CPU cores, 32 GB memory. - 3 PD instances. Each PD instance is equipped with 16 CPU cores, 32 GB memory. From 0ed6304c7faf103e78ff30a53c1a9a120c27efde Mon Sep 17 00:00:00 2001 From: Enwei Date: Thu, 31 Mar 2022 20:07:41 +0800 Subject: [PATCH 07/15] Apply suggestions from code review Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com> --- br/backup-and-restore-faq.md | 4 ++-- br/br-batch-create-table.md | 26 +++++++++++++------------- 2 files changed, 15 insertions(+), 15 deletions(-) diff --git a/br/backup-and-restore-faq.md b/br/backup-and-restore-faq.md index a5370a841b613..2e6ece4643fc2 100644 --- a/br/backup-and-restore-faq.md +++ b/br/backup-and-restore-faq.md @@ -166,9 +166,9 @@ Yes. 
BR backs up the [`SHARD_ROW_ID_BITS` and `PRE_SPLIT_REGIONS`](/sql-statemen ## What should I do if the restore fails with the error message `the entry too large, the max entry size is 6291456, the size of data is 7690800`? -Try reducing the size of the concurrent batch table creation by setting `--ddl-batch-size` to `128` or a smaller value. +You can try to reduce the number of tables to be created in a batch by setting `--ddl-batch-size` to `128` or a smaller value. -When using BR to restore the back up data with the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) is greater than `1`, TiDB writes a DDL job to the DDL jobs queue that is maintained by TiKV. At this time, the total size of all tables schema sent by TiDB at a time should not exceed 6 MB, because the maximum value of job messages is `6 MB` by default (you are **not recommended** to modify this value; for details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-v50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size). Therefore, if you set the too large value to `--ddl-batch-size`, the schema size of the batch table sent by TiDB at a time exceeds the specified value, resulting in BR reporting `entry too large, the max entry size is 6291456, the size of data is 7690800` error. +When using BR to restore the backup data with the value of [`--ddl-batch-size`](/br/br-batch-create-table.md#how to use) greater than `1`, TiDB writes a DDL job of table creation to the DDL jobs queue that is maintained by TiKV. At this time, the total size of all tables schema sent by TiDB at one time should not exceed 6 MB, because the maximum value of job messages is `6 MB` by default (it is **not recommended** to modify this value. For details, see [`txn-entry-size-limit`](/tidb-configuration-file.md#txn-entry-size-limit-new-in-v50) and [`raft-entry-max-size`](/tikv-configuration-file.md#raft-entry-max-size)). 
Therefore, if you set `--ddl-batch-size` to an excessively large value, the schema size of the tables sent by TiDB in a batch at one time exceeds the specified value, which causes BR to report the `entry too large, the max entry size is 6291456, the size of data is 7690800` error. ## Why is the `region is unavailable` error reported for a SQL query after I use BR to restore the backup data? diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index 7827f4564424d..697c627bce5ed 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -1,30 +1,30 @@ --- title: BR Batch Create Table -summary: Learn how to use BR batch create table feature. When restoring data, BR can use batch create table feature to speed up the restoration. +summary: Learn how to use the BR batch create table feature. When restoring data, BR can create tables in batches to speed up the restore process. --- # BR Batch Create Table -When restoring data using Backup & Restore (BR), BR creates the databases and tables in the target TiDB first, then starts restoring table data. In the versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) scheme to create tables in the restoration process. However, when restoring data with a large number (nearly 50000) of tables, this scheme takes much time to create tables. +When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB before it starts to restore the table data. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this serial implementation of table creation takes much time. To speed up the table creation process, and thereby reduce the time for restoring data, the BR batch create table feature is introduced in TiDB v6.0.0. This feature is enabled by default. 
> **Note:** > -> - To use the BR batch create table feature, both TiDB and BR should be in v6.0.0 or later. If either TiDB or BR is in the version lower than v6.0.0, BR uses the serial execution scheme. -> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are in 6.0.0 or later versions. In this case, BR enables batch create table feature by default without additional configuration. +> - To use the BR batch create table feature, both TiDB and BR are expected to be of v6.0.0 or later. If either TiDB or BR is earlier than v6.0.0, BR uses the serial execution implementation. +> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are of v6.0.0 or later versions, or your TiDB and BR are upgraded from a version earlier than v6.0.0 to v6.0.0 or later. In this case, BR enables the batch create table feature by default without additional configuration. -## User scenario +## Usage scenario -When you need to restore data with massive tables, for example, 50000 tables, you can use BR batch create table feature to speed up the restoration process. +If you need to restore data with a massive amount of tables, for example, 50000 tables, you can use the BR batch create table feature to speed up the restore process. -For the detailed effect, see [Test batch create table feature](#test-batch-create-table). +For the detailed effect, see [Test against the batch create table feature](#test-batch-create-table). ## Use batch create table -BR enables batch create table feature and configures `--ddl-batch-size=128` by default in v6.0.0 or later. Therefore, you do not need to configure this parameter additionally. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. +BR enables the batch create table feature by default, with the default configuration of `--ddl-batch-size=128` in v6.0.0 or later to speed up the restore process. 
Therefore, you do not need to configure this parameter. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. -To disable this feature, you can set `--ddl-batch-size` to `0` by the following command: +To disable this feature, you can set `--ddl-batch-size` to `0`. See the following example command: {{< copyable "shell-regular" >}} @@ -32,7 +32,7 @@ To disable this feature, you can set `--ddl-batch-size` to `0` by the following br restore full -s local:///br_data/ --pd 172.16.5.198:2379 --log-file restore.log --ddl-batch-size=0 ``` -After disabling the feature, BR uses the [serial execution scheme](#implementation-principles) instead. +After this feature is disabled, BR uses the [serial execution implementation](#implementation-principles) instead. ## Implementation principles @@ -40,13 +40,13 @@ After disabling the feature, BR uses the [serial execution scheme](#implementati In the versions earlier than 6.0.0, BR uses the serial execution scheme. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. To create tables, after calling TiDB API, BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the schema version changes correspondingly, and each version change synchronizes to other BRs and other TiDB DDL workers. Hence, when restoring a large number of tables, the serial execution scheme takes too much time. -- Batch create table scheme since v6.0.0: +- Batch create table implementation since v6.0.0: From v6.0.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. 
-## Test batch create table +## Test against the batch create table feature -This section describes the information of testing batch create table feature. The test environment is as follows: +This section describes the information of testing the batch create table feature. The test environment is as follows: - Cluster configurations: From 8ac9aae6e6f0fb43b8590f49040194c805b4193b Mon Sep 17 00:00:00 2001 From: fengou1 <85682690+fengou1@users.noreply.github.com> Date: Fri, 1 Apr 2022 08:30:52 +0800 Subject: [PATCH 08/15] Update br/br-batch-create-table.md Co-authored-by: Enwei --- br/br-batch-create-table.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index 697c627bce5ed..78fb4c9801683 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -36,7 +36,7 @@ After this feature is disabled, BR uses the [serial execution implementation](#i ## Implementation principles -- Serial execution scheme before v6.0.0: +- Serial execution solution before v6.0.0: In the versions earlier than 6.0.0, BR uses the serial execution scheme. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. To create tables, after calling TiDB API, BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the schema version changes correspondingly, and each version change synchronizes to other BRs and other TiDB DDL workers. Hence, when restoring a large number of tables, the serial execution scheme takes too much time. 
From b78162b2925e76f0a279f78dbe975cefbdfb9707 Mon Sep 17 00:00:00 2001 From: fengou1 <85682690+fengou1@users.noreply.github.com> Date: Fri, 1 Apr 2022 08:59:55 +0800 Subject: [PATCH 09/15] Update br/br-batch-create-table.md --- br/br-batch-create-table.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index 78fb4c9801683..b928d48a3fe47 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -38,7 +38,7 @@ After this feature is disabled, BR uses the [serial execution implementation](#i - Serial execution solution before v6.0.0: - In the versions earlier than 6.0.0, BR uses the serial execution scheme. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. To create tables, after calling TiDB API, BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the schema version changes correspondingly, and each version change synchronizes to other BRs and other TiDB DDL workers. Hence, when restoring a large number of tables, the serial execution scheme takes too much time. + In the versions earlier, BR uses the serial execution implementation. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. To create tables, BR calls TiDB internal API, more like BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. 
- Batch create table implementation since v6.0.0: From 8491b56016cf8c526fe4b8a7fa277d7c77fc1a74 Mon Sep 17 00:00:00 2001 From: Enwei Date: Fri, 1 Apr 2022 12:53:21 +0800 Subject: [PATCH 10/15] Apply suggestions from code review --- br/br-batch-create-table.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index b928d48a3fe47..b321b83f99c39 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -38,7 +38,7 @@ After this feature is disabled, BR uses the [serial execution implementation](#i - Serial execution solution before v6.0.0: - In the versions earlier, BR uses the serial execution implementation. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. To create tables, BR calls TiDB internal API, more like BR uses the SQL statement `Create Table`. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. + In the versions earlier, BR uses the serial execution implementation. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. BR calls TiDB internal API to create tables, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. 
- Batch create table implementation since v6.0.0: From 08ced1be044bef49285d98e49659c7042389b181 Mon Sep 17 00:00:00 2001 From: en-jin19 Date: Fri, 1 Apr 2022 15:38:46 +0800 Subject: [PATCH 11/15] update the translation --- br/br-batch-create-table.md | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index b321b83f99c39..55c3b6fe0fda3 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -1,28 +1,28 @@ --- -title: BR Batch Create Table -summary: Learn how to use the BR batch create table feature. When restoring data, BR can create tables in batches to speed up the restore process. +title: Batch Create Table +summary: Learn how to use the Batch Create Table feature. When restoring data, BR can create tables in batches to speed up the restore process. --- -# BR Batch Create Table +# Batch Create Table -When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB before it starts to restore the table data. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this serial implementation of table creation takes much time. +When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB before restoring the table data. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) solution to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this solution takes much time on creating tables. -To speed up the table creation process, and thereby reduce the time for restoring data, the BR batch create table feature is introduced in TiDB v6.0.0. This feature is enabled by default. 
+To speed up the table creation process, and thereby reduce the time for restoring data, the Batch Create Table feature is introduced in TiDB v6.0.0. This feature is enabled by default. > **Note:** > -> - To use the BR batch create table feature, both TiDB and BR are expected to be of v6.0.0 or later. If either TiDB or BR is earlier than v6.0.0, BR uses the serial execution implementation. -> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are of v6.0.0 or later versions, or your TiDB and BR are upgraded from a version earlier than v6.0.0 to v6.0.0 or later. In this case, BR enables the batch create table feature by default without additional configuration. +> - To use the Batch Create Table feature, both TiDB and BR are expected to be of v6.0.0 or later. If either TiDB or BR is earlier than v6.0.0, BR uses the serial execution solution. +> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are of v6.0.0 or later versions, or your TiDB and BR are upgraded from a version earlier than v6.0.0 to v6.0.0 or later. In this case, BR enables the Batch Create Table feature by default, without additional configuration. ## Usage scenario -If you need to restore data with a massive amount of tables, for example, 50000 tables, you can use the BR batch create table feature to speed up the restore process. +If you need to restore data with a massive amount of tables, for example, 50000 tables, you can use the Batch Create Table feature to speed up the restore process. -For the detailed effect, see [Test against the batch create table feature](#test-batch-create-table). +For the detailed effect, see [Test for the Batch Create Table Feature](#test-batch-create-table). -## Use batch create table +## Use the Batch Create Table feature -BR enables the batch create table feature by default, with the default configuration of `--ddl-batch-size=128` in v6.0.0 or later to speed up the restore process. 
Therefore, you do not need to configure this parameter. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. +BR enables the Batch Create Table Feature by default, with the default configuration of `--ddl-batch-size=128` in v6.0.0 or later to speed up the restore process. Therefore, you do not need to configure this parameter. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. To disable this feature, you can set `--ddl-batch-size` to `0`. See the following example command: @@ -32,21 +32,21 @@ To disable this feature, you can set `--ddl-batch-size` to `0`. See the followin br restore full -s local:///br_data/ --pd 172.16.5.198:2379 --log-file restore.log --ddl-batch-size=0 ``` -After this feature is disabled, BR uses the [serial execution implementation](#implementation-principles) instead. +After this feature is disabled, BR uses the [serial execution solution](#implementation-principles) instead. ## Implementation principles - Serial execution solution before v6.0.0: - In the versions earlier, BR uses the serial execution implementation. When restoring data, BR creates the databases and tables in the target TiDB first, then starts restoring data. BR calls TiDB internal API to create tables, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. + When restoring data using BR, BR creates databases and tables in the target TiDB before restoring the table data.To create tables, BR calls TiDB internal API first, and then process table creation tasks, which operation looks like BR executes the SQL `Create Table` statement. 
TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution solution takes too much time. -- Batch create table implementation since v6.0.0: +- Batch create table solution since v6.0.0: - From v6.0.0, by default, BR creates tables in multiple batches, and each batch has 128 tables. Using this scheme, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. + By default, BR creates tables in multiple batches, and each batch has 128 tables. Using this solution, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. -## Test against the batch create table feature +## Test for the Batch Create Table feature -This section describes the information of testing the batch create table feature. The test environment is as follows: +This section describes the test information about the Batch Create Table feature. The test environment is as follows: - Cluster configurations: @@ -62,4 +62,4 @@ The test result is as follows: ‘[2022/03/12 22:37:49.060 +08:00] [INFO] [collector.go:67] ["Full restore success summary"] [total-ranges=751760] [ranges-succeed=751760] [ranges-failed=0] [split-region=1h33m18.078448449s] [restore-ranges=542693] [total-take=1h41m35.471476438s] [restore-data-size(after-compressed)=8.337TB] [Size=8336694965072] [BackupTS=431773933856882690] [total-kv=148015861383] [total-kv-size=16.16TB] [average-speed=2.661GB/s]’ ``` -In the result, you can find that the average speed of restoring one TiKV instance is as high as 181.65 MB/s (`average-speed(GB/s)`/`tikv_count` = `181.65(MB/s)`). 
\ No newline at end of file +In the test result, you can find that the average speed of restoring one TiKV instance is as high as 181.65 MB/s (`average-speed(GB/s)`/`tikv_count` = `181.65(MB/s)`). \ No newline at end of file From 879e174b7acf6e9064730bcc689c7cbf530ffb87 Mon Sep 17 00:00:00 2001 From: en-jin19 Date: Fri, 1 Apr 2022 17:07:01 +0800 Subject: [PATCH 12/15] change "solution" to "implementation" --- br/br-batch-create-table.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index 55c3b6fe0fda3..e465fc46e911c 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -5,13 +5,13 @@ summary: Learn how to use the Batch Create Table feature. When restoring data, B # Batch Create Table -When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB before restoring the table data. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) solution to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this solution takes much time on creating tables. +When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB before restoring the table data. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) implementation to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this implementation takes much time on creating tables. To speed up the table creation process, and thereby reduce the time for restoring data, the Batch Create Table feature is introduced in TiDB v6.0.0. This feature is enabled by default. > **Note:** > -> - To use the Batch Create Table feature, both TiDB and BR are expected to be of v6.0.0 or later. 
If either TiDB or BR is earlier than v6.0.0, BR uses the serial execution solution. +> - To use the Batch Create Table feature, both TiDB and BR are expected to be of v6.0.0 or later. If either TiDB or BR is earlier than v6.0.0, BR uses the serial execution implementation. > - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are of v6.0.0 or later versions, or your TiDB and BR are upgraded from a version earlier than v6.0.0 to v6.0.0 or later. In this case, BR enables the Batch Create Table feature by default, without additional configuration. ## Usage scenario @@ -32,17 +32,17 @@ To disable this feature, you can set `--ddl-batch-size` to `0`. See the followin br restore full -s local:///br_data/ --pd 172.16.5.198:2379 --log-file restore.log --ddl-batch-size=0 ``` -After this feature is disabled, BR uses the [serial execution solution](#implementation-principles) instead. +After this feature is disabled, BR uses the [serial execution implementation](#implementation-principles) instead. ## Implementation principles -- Serial execution solution before v6.0.0: +- Serial execution implementation before v6.0.0: - When restoring data using BR, BR creates databases and tables in the target TiDB before restoring the table data.To create tables, BR calls TiDB internal API first, and then process table creation tasks, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution solution takes too much time. 
+ When restoring data, BR creates databases and tables in the target TiDB before restoring the table data.To create tables, BR calls TiDB internal API first, and then process table creation tasks, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. -- Batch create table solution since v6.0.0: +- Batch create table implementation since v6.0.0: - By default, BR creates tables in multiple batches, and each batch has 128 tables. Using this solution, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. + By default, BR creates tables in multiple batches, and each batch has 128 tables. Using this implementation, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation. ## Test for the Batch Create Table feature From 45e87bd70bd5cdbab6574701a96fb1099aea0171 Mon Sep 17 00:00:00 2001 From: Enwei Date: Fri, 1 Apr 2022 17:31:12 +0800 Subject: [PATCH 13/15] Apply suggestions from code review --- br/br-batch-create-table.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index e465fc46e911c..82c064a795d6d 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -5,7 +5,7 @@ summary: Learn how to use the Batch Create Table feature. When restoring data, B # Batch Create Table -When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB before restoring the table data. 
In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) implementation to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this implementation takes much time on creating tables. +When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB and then restores the backed-up data to the tables. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) implementation to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this implementation takes much time on creating tables. To speed up the table creation process, and thereby reduce the time for restoring data, the Batch Create Table feature is introduced in TiDB v6.0.0. This feature is enabled by default. @@ -38,7 +38,7 @@ After this feature is disabled, BR uses the [serial execution implementation](#i - Serial execution implementation before v6.0.0: - When restoring data, BR creates databases and tables in the target TiDB before restoring the table data.To create tables, BR calls TiDB internal API first, and then process table creation tasks, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. + When restoring data, BR creates databases and tables in the target TiDB and then restores the backed-up data to the tables. To create tables, BR calls TiDB internal API first, and then processes table creation tasks, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. 
Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. - Batch create table implementation since v6.0.0: From 0db72b79aa420b715085fd76ab3b67405b55f550 Mon Sep 17 00:00:00 2001 From: Enwei Date: Fri, 1 Apr 2022 17:51:32 +0800 Subject: [PATCH 14/15] fix a CI error --- br/br-batch-create-table.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index 82c064a795d6d..aa10d182942e7 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -18,7 +18,7 @@ To speed up the table creation process, and thereby reduce the time for restorin If you need to restore data with a massive amount of tables, for example, 50000 tables, you can use the Batch Create Table feature to speed up the restore process. -For the detailed effect, see [Test for the Batch Create Table Feature](#test-batch-create-table). +For the detailed effect, see [Test for the Batch Create Table Feature](#test-for-the-batch-create-table-feature). ## Use the Batch Create Table feature From 3b231b4b5896bf0b56cc78417808e21506ee152b Mon Sep 17 00:00:00 2001 From: Enwei Date: Fri, 1 Apr 2022 19:56:33 +0800 Subject: [PATCH 15/15] Apply suggestions from code review Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com> --- br/br-batch-create-table.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/br/br-batch-create-table.md b/br/br-batch-create-table.md index aa10d182942e7..b10e066f6185c 100644 --- a/br/br-batch-create-table.md +++ b/br/br-batch-create-table.md @@ -5,14 +5,14 @@ summary: Learn how to use the Batch Create Table feature. 
When restoring data, B # Batch Create Table -When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB and then restores the backed-up data to the tables. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) implementation to create tables in the restore process. However, when BR restores data with a large number (nearly 50000) of tables, this implementation takes much time on creating tables. +When restoring data, Backup & Restore (BR) creates databases and tables in the target TiDB cluster and then restores the backup data to the tables. In versions earlier than TiDB v6.0.0, BR uses the [serial execution](#implementation-principles) implementation to create tables in the restore process. However, when BR restores data with a large number of tables (nearly 50000), this implementation spends a long time creating tables. -To speed up the table creation process, and thereby reduce the time for restoring data, the Batch Create Table feature is introduced in TiDB v6.0.0. This feature is enabled by default. +To speed up the table creation process and reduce the time for restoring data, the Batch Create Table feature is introduced in TiDB v6.0.0. This feature is enabled by default. > **Note:** > > - To use the Batch Create Table feature, both TiDB and BR are expected to be of v6.0.0 or later. If either TiDB or BR is earlier than v6.0.0, BR uses the serial execution implementation. -> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are of v6.0.0 or later versions, or your TiDB and BR are upgraded from a version earlier than v6.0.0 to v6.0.0 or later. In this case, BR enables the Batch Create Table feature by default, without additional configuration.
+> - Suppose that you use a cluster management tool (for example, TiUP), and your TiDB and BR are of v6.0.0 or later versions, or your TiDB and BR are upgraded from a version earlier than v6.0.0 to v6.0.0 or later. In this case, BR enables the Batch Create Table feature by default. ## Usage scenario @@ -22,7 +22,7 @@ For the detailed effect, see [Test for the Batch Create Table Feature](#test-for ## Use the Batch Create Table feature -BR enables the Batch Create Table Feature by default, with the default configuration of `--ddl-batch-size=128` in v6.0.0 or later to speed up the restore process. Therefore, you do not need to configure this parameter. `--ddl-batch-size=128` means that BR creates tables in multiple batches, and each batch has 128 tables. +BR enables the Batch Create Table feature by default, with the default configuration of `--ddl-batch-size=128` in v6.0.0 or later to speed up the restore process. Therefore, you do not need to configure this parameter. `--ddl-batch-size=128` means that BR creates tables in batches, each batch with 128 tables. To disable this feature, you can set `--ddl-batch-size` to `0`. See the following example command: @@ -38,11 +38,11 @@ After this feature is disabled, BR uses the [serial execution implementation](#i - Serial execution implementation before v6.0.0: - When restoring data, BR creates databases and tables in the target TiDB and then restores the backed-up data to the tables. To create tables, BR calls TiDB internal API first, and then processes table creation tasks, which operation looks like BR executes the SQL `Create Table` statement. TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly, and each version change synchronizes to other TiDB DDL workers (including BR). Hence, when restoring a large number of tables, the serial execution implementation takes too much time. 
+    When restoring data, BR creates databases and tables in the target TiDB cluster and then restores the backup data to the tables. To create tables, BR calls TiDB internal API first, and then processes table creation tasks, which works similarly to executing the `Create Table` statement by BR. The TiDB DDL owner creates tables sequentially. Once the DDL owner creates a table, the DDL schema version changes correspondingly and each version change is synchronized to other TiDB DDL workers (including BR). Therefore, when BR restores a large number of tables, the serial execution implementation is time-consuming.

 - Batch create table implementation since v6.0.0:

-    By default, BR creates tables in multiple batches, and each batch has 128 tables. Using this implementation, when BR creates one batch of tables, TiDB schema version only changes once. This scheme significantly increases the speed of table creation.
+    By default, BR creates tables in multiple batches, and each batch has 128 tables. Using this implementation, when BR creates one batch of tables, the TiDB schema version only changes once. This implementation significantly increases the speed of table creation.

 ## Test for the Batch Create Table feature

@@ -54,7 +54,7 @@ This section describes the test information about the Batch Create Table feature
     - 3 TiDB instances. Each TiDB instance is equipped with 16 CPU cores, 32 GB memory.
     - 3 PD instances. Each PD instance is equipped with 16 CPU cores, 32 GB memory.
-- Data to be restored: 16.16 TB
+- The size of data to be restored: 16.16 TB

 The test result is as follows:

@@ -62,4 +62,4 @@ The test result is as follows:
 ‘[2022/03/12 22:37:49.060 +08:00] [INFO] [collector.go:67] ["Full restore success summary"] [total-ranges=751760] [ranges-succeed=751760] [ranges-failed=0] [split-region=1h33m18.078448449s] [restore-ranges=542693] [total-take=1h41m35.471476438s] [restore-data-size(after-compressed)=8.337TB] [Size=8336694965072] [BackupTS=431773933856882690] [total-kv=148015861383] [total-kv-size=16.16TB] [average-speed=2.661GB/s]’
 ```

-In the test result, you can find that the average speed of restoring one TiKV instance is as high as 181.65 MB/s (`average-speed(GB/s)`/`tikv_count` = `181.65(MB/s)`).
\ No newline at end of file
+From the test result, you can see that the average speed of restoring one TiKV instance is as high as 181.65 MB/s (which equals to `average-speed`/`tikv_count`).
\ No newline at end of file