From 8e78e7784be1c0af3927791bd65f5178440bd732 Mon Sep 17 00:00:00 2001
From: yikeke
Date: Tue, 28 Jul 2020 18:20:16 +0800
Subject: [PATCH 1/5] Update export-or-backup-using-dumpling.md

---
 export-or-backup-using-dumpling.md | 87 ++++++++++++++++++++++++------
 1 file changed, 71 insertions(+), 16 deletions(-)

diff --git a/export-or-backup-using-dumpling.md b/export-or-backup-using-dumpling.md
index e7ae899253ef9..a0d359da946b1 100644
--- a/export-or-backup-using-dumpling.md
+++ b/export-or-backup-using-dumpling.md
@@ -10,11 +10,19 @@ This document introduces how to use the [Dumpling](https://github.com/pingcap/du
 
 For backups of SST files (KV pairs) or backups of incremental data that are not sensitive to latency, refer to [BR](/br/backup-and-restore-tool.md). For real-time backups of incremental data, refer to [TiCDC](/ticdc/ticdc-overview.md).
 
+For detailed usage of Dumpling, use the `--help` command to view it.
+
 When using Dumpling, you need to execute the export command on a running cluster. This document assumes that there is a TiDB instance on the `127.0.0.1:4000` host and that this TiDB instance has a root user without a password.
 
+## Download Dumpling
+
+To download the latest version of Dumpling, click the [download link](https://download.pingcap.org/dumpling-nightly-linux-amd64.tar.gz).
+
 ## Export data from TiDB
 
-Export data using the following command:
+### Export to SQL files
+
+Dumpling exports data to SQL files by default. You can also export data to SQL files by adding the `--filetype sql` flag:
 
 {{< copyable "shell-regular" >}}
 
@@ -22,16 +30,18 @@ Export data using the following command:
 dumpling \
   -u root \
   -P 4000 \
-  -H 127.0.0.1 \
+  -h 127.0.0.1 \
   --filetype sql \
   --threads 32 \
   -o /tmp/test \
-  -F $(( 1024 * 1024 * 256 ))
+  -F 256
 ```
 
-In the above command, `-H`, `-P` and `-u` mean address, port and user, respectively. If password authentication is required, you can pass it to Dumpling with `-p $YOUR_SECRET_PASSWORD`.
+In the above command, `-h`, `-P` and `-u` mean address, port and user, respectively. If password authentication is required, you can pass it to Dumpling with `-p $YOUR_SECRET_PASSWORD`.
+
+### Export to CSV files
 
-Dumpling exports all tables (except for system tables) in the entire database by default. You can use `--where ` to select the records to be exported. If the exported data is in CSV format (CSV files can be exported using `--filetype csv`), you can also use `--sql ` to export records selected by the specified SQL statement.
+If Dumpling exports data to CSV files (use `--filetype csv` to export to CSV files), you can also use `--sql ` to export the records selected by the specified SQL statement.
 
 For example, you can export all records that match `id < 100` in `test.sbtest1` using the following command:
 
@@ -41,17 +51,23 @@ For example, you can export all records that match `id < 100` in `test.sbtest1`
 ./dumpling \
   -u root \
   -P 4000 \
-  -H 127.0.0.1 \
+  -h 127.0.0.1 \
   -o /tmp/test \
   --filetype csv \
-  --sql "select * from `test`.`sbtest1` where id < 100"
+  --sql 'select * from `test`.`sbtest1` where id < 100'
 ```
 
-Note that the `--sql` option can be used only for exporting CSV files for now. However, you can use `--where` to filter the rows to be exported, and use the following command to export all rows with `id < 100`:
-
 > **Note:**
 >
-> You need to execute the `select * from where id < 100` statement on all tables to be exported. If any table does not have the specified field, then the export fails.
+> 1. Currently, the `--sql` option can be used only for exporting to CSV files.
+>
+> 2. Here you need to execute the `select * from where id <100` statement on all tables to be exported. If some tables do not have specified fields, the export fails.
+
+### Filter the exported data
+
+#### Use the `--where` command to filter data
+
+By default, Dumpling exports the tables of the entire database except the tables in the system databases. You can use `--where ` to select the records to be exported.
 
 {{< copyable "shell-regular" >}}
 
@@ -59,24 +75,63 @@ Note that the `--sql` option can be used only for exporting CSV files for now. H
 ./dumpling \
   -u root \
   -P 4000 \
-  -H 127.0.0.1 \
+  -h 127.0.0.1 \
   -o /tmp/test \
   --where "id < 100"
 ```
 
+The above command exports the data that matches `id < 100` from each table.
+
+#### Use the `--filter` command to filter data
+
+Dumpling can filter a specific library table by specifying table-filter with `--filter`. The syntax of table-filter is similar to that of .gitignore, [Detailed syntax reference](/table-filter.md).
+
+{{< copyable "shell-regular" >}}
+
+```shell
+./dumpling \
+  -u root \
+  -P 4000 \
+  -h 127.0.0.1 \
+  -o /tmp/test \
+  --filter "employees.*"
+  --filter "*.WorkOrder"
+```
+
+The above command exports all the tables in the `employees` database and the `WorkOrder` tables in all databases.
+
+#### Use `-B` or `-T` command to filter data
+
+Dumpling can also export a specific database/data table with the `-B` or `-T` parameter.
+
 > **Note:**
 >
-> Currently, Dumpling does not support exporting only certain tables specified by users (i.e. `-T` flag, see [this issue](https://github.com/pingcap/dumpling/issues/76)). If you do need this feature, you can use [MyDumper](/backup-and-restore-using-mydumper-lightning.md) instead.
+> 1. The `--filter` parameter and the `-T` parameter cannot be used at the same time.
+>
+> 2. The `-T` parameter can only accept the complete `library name.table name` format, and only the table name is not supported. Example: Dumpling cannot recognize `-T WorkOrder`.
+
+For example, by specifying:
+
+-`-B employees` exports the `employees` database
+-`-T employees.WorkOrder` exports the `employees.WorkOrder` data table
+
+### Improve Dumpling export efficiency through concurrency
 
 The exported file is stored in the `./export-` directory by default. Commonly used parameters are as follows:
 
 - `-o` is used to select the directory where the exported files are stored.
-- `-F` option is used to specify the maximum size of a single file (the unit here is byte, different from MyDumper).
-- `-r` option is used to specify the maximum number of records (or the number of rows in the database) for a single file.
+- `-F` option is used to specify the maximum size of a single file (the unit here is `MiB`; inputs like `5GiB` or `8KB` are also acceptable).
+- `-r` option is used to specify the maximum number of records (or the number of rows in the database) for a single file. When it is enabled, Dumpling enables concurrency in the table to improve the speed of exporting large tables.
 
-You can use the above parameters to provide Dumpling with a higher degree of parallelism.
+You can use the above parameters to provide Dumpling with a higher degree of concurrency.
+
+### Adjust Dumpling's data consistency options
+
+> **Note:**
+>
+> In most scenarios, you do not need to adjust the default data consistency options of Dumpling.
 
-Another flag that is not mentioned above is `--consistency `, which controls the way in which data is exported for "consistency assurance". For TiDB, consistency is ensured by getting a snapshot of a certain timestamp by default (i.e. `--consistency snapshot`). When using snapshot for consistency, you can use the `--snapshot` parameter to specify the timestamp to be backed up. You can also use the following levels of consistency:
+Dumpling uses the `--consistency ` option to control the way in which data is exported for "consistency assurance". For TiDB, data consistency is guaranteed by getting a snapshot of a certain timestamp by default (i.e. `--consistency snapshot`). When using snapshot for consistency, you can use the `--snapshot` parameter to specify the timestamp to be backed up. You can also use the following levels of consistency:
 
 - `flush`: Use [`FLUSH TABLES WITH READ LOCK`](https://dev.mysql.com/doc/refman/8.0/en/flush.html#flush-tables-with-read-lock) to ensure consistency.
 - `snapshot`: Get a consistent snapshot of the specified timestamp and export it.

From eeff0bf0ac6d8b181bd00728c7f1896b3044981e Mon Sep 17 00:00:00 2001
From: yikeke
Date: Tue, 28 Jul 2020 18:31:13 +0800
Subject: [PATCH 2/5] refine content

---
 export-or-backup-using-dumpling.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/export-or-backup-using-dumpling.md b/export-or-backup-using-dumpling.md
index a0d359da946b1..58f9ef83adb72 100644
--- a/export-or-backup-using-dumpling.md
+++ b/export-or-backup-using-dumpling.md
@@ -84,7 +84,7 @@ The above command exports the data that matches `id < 100` from each table.
 
 #### Use the `--filter` command to filter data
 
-Dumpling can filter a specific library table by specifying table-filter with `--filter`. The syntax of table-filter is similar to that of .gitignore, [Detailed syntax reference](/table-filter.md).
+Dumpling can filter specific databases or tables by specifying the table filter with the `--filter` command. The syntax of table filters is similar to that of .gitignore. For details, see [Table Filter](/table-filter.md).
 
 {{< copyable "shell-regular" >}}
 
@@ -100,22 +100,22 @@ Dumpling can filter a specific library table by specifying table-filter with `--
 
 The above command exports all the tables in the `employees` database and the `WorkOrder` tables in all databases.
 
-#### Use `-B` or `-T` command to filter data
+#### Use the `-B` or `-T` command to filter data
 
-Dumpling can also export a specific database/data table with the `-B` or `-T` parameter.
+Dumpling can also export specific databases or tables with the `-B` or `-T` command.
 
 > **Note:**
 >
-> 1. The `--filter` parameter and the `-T` parameter cannot be used at the same time.
+> 1. The `--filter` command and the `-T` command cannot be used at the same time.
 >
-> 2. The `-T` parameter can only accept the complete `library name.table name` format, and only the table name is not supported. Example: Dumpling cannot recognize `-T WorkOrder`.
+> 2. The `-T` command can only accept a complete form of inputs like `database-name.table-name`, and inputs with only the table name are not accepted. Example: Dumpling cannot recognize `-T WorkOrder`.
 
-For example, by specifying:
+Examples:
 
 -`-B employees` exports the `employees` database
--`-T employees.WorkOrder` exports the `employees.WorkOrder` data table
+-`-T employees.WorkOrder` exports the `employees.WorkOrder` table
 
-### Improve Dumpling export efficiency through concurrency
+### Improve export efficiency through concurrency
 
 The exported file is stored in the `./export-` directory by default. Commonly used parameters are as follows:
 

From 48a0ef22e3f555765495b5d6714b722f1f03d732 Mon Sep 17 00:00:00 2001
From: yikeke
Date: Tue, 28 Jul 2020 19:08:12 +0800
Subject: [PATCH 3/5] change list style

---
 export-or-backup-using-dumpling.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/export-or-backup-using-dumpling.md b/export-or-backup-using-dumpling.md
index 58f9ef83adb72..fe771e6354204 100644
--- a/export-or-backup-using-dumpling.md
+++ b/export-or-backup-using-dumpling.md
@@ -59,9 +59,9 @@ For example, you can export all records that match `id < 100` in `test.sbtest1`
 
 > **Note:**
 >
-> 1. Currently, the `--sql` option can be used only for exporting to CSV files.
+> - Currently, the `--sql` option can be used only for exporting to CSV files.
 >
-> 2. Here you need to execute the `select * from where id <100` statement on all tables to be exported. If some tables do not have specified fields, the export fails.
+> - Here you need to execute the `select * from where id <100` statement on all tables to be exported. If some tables do not have specified fields, the export fails.
 
 ### Filter the exported data
 
@@ -106,9 +106,9 @@ Dumpling can also export specific databases or tables with the `-B` or `-T` comm
 
 > **Note:**
 >
-> 1. The `--filter` command and the `-T` command cannot be used at the same time.
+> - The `--filter` command and the `-T` command cannot be used at the same time.
 >
-> 2. The `-T` command can only accept a complete form of inputs like `database-name.table-name`, and inputs with only the table name are not accepted. Example: Dumpling cannot recognize `-T WorkOrder`.
+> - The `-T` command can only accept a complete form of inputs like `database-name.table-name`, and inputs with only the table name are not accepted. Example: Dumpling cannot recognize `-T WorkOrder`.
 
 Examples:
 

From a2d512a733d2247c5532d8931e37da49c2e522e4 Mon Sep 17 00:00:00 2001
From: Keke Yi <40977455+yikeke@users.noreply.github.com>
Date: Tue, 28 Jul 2020 19:08:35 +0800
Subject: [PATCH 4/5] Apply suggestions from code review

Co-authored-by: Ran
---
 export-or-backup-using-dumpling.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/export-or-backup-using-dumpling.md b/export-or-backup-using-dumpling.md
index fe771e6354204..8cfd6c4b6844a 100644
--- a/export-or-backup-using-dumpling.md
+++ b/export-or-backup-using-dumpling.md
@@ -84,7 +84,7 @@ The above command exports the data that matches `id < 100` from each table.
 
 #### Use the `--filter` command to filter data
 
-Dumpling can filter specific databases or tables by specifying the table filter with the `--filter` command. The syntax of table filters is similar to that of .gitignore. For details, see [Table Filter](/table-filter.md).
+Dumpling can filter specific databases or tables by specifying the table filter with the `--filter` command. The syntax of table filters is similar to that of `.gitignore`. For details, see [Table Filter](/table-filter.md).
 
 {{< copyable "shell-regular" >}}
 

From 2d0426ae680d9213400dbab745e17586737b415a Mon Sep 17 00:00:00 2001
From: Keke Yi <40977455+yikeke@users.noreply.github.com>
Date: Tue, 28 Jul 2020 19:21:16 +0800
Subject: [PATCH 5/5] Apply suggestions from code review

Co-authored-by: Ran
---
 export-or-backup-using-dumpling.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/export-or-backup-using-dumpling.md b/export-or-backup-using-dumpling.md
index 8cfd6c4b6844a..2a4aa296271b2 100644
--- a/export-or-backup-using-dumpling.md
+++ b/export-or-backup-using-dumpling.md
@@ -10,7 +10,7 @@ This document introduces how to use the [Dumpling](https://github.com/pingcap/du
 
 For backups of SST files (KV pairs) or backups of incremental data that are not sensitive to latency, refer to [BR](/br/backup-and-restore-tool.md). For real-time backups of incremental data, refer to [TiCDC](/ticdc/ticdc-overview.md).
 
-For detailed usage of Dumpling, use the `--help` command to view it.
+For detailed usage of Dumpling, use the `--help` command or refer to [Dumpling User Guide](https://github.com/pingcap/dumpling/blob/master/docs/en/user-guide.md).
 
 When using Dumpling, you need to execute the export command on a running cluster. This document assumes that there is a TiDB instance on the `127.0.0.1:4000` host and that this TiDB instance has a root user without a password.
 
@@ -102,7 +102,7 @@ The above command exports all the tables in the `employees` database and the `Wo
 
 #### Use the `-B` or `-T` command to filter data
 
-Dumpling can also export specific databases or tables with the `-B` or `-T` command.
+Dumpling can also export specific databases with the `-B` command or specific tables with the `-T` command.
 
 > **Note:**
 >