From e46b07f32f69c7f177b54defaa3fdb2d5910eb1b Mon Sep 17 00:00:00 2001 From: yikeke Date: Tue, 28 Jul 2020 17:47:35 +0800 Subject: [PATCH 01/17] align the structure of https://github.com/pingcap/docs-cn/pull/3800/ --- TOC.md | 5 +- ...up-and-restore-using-dumpling-lightning.md | 83 +++++++++++++++++++ download-ecosystem-tools.md | 13 +++ ...-using-dumpling.md => dumpling-overview.md | 0 ecosystem-tool-user-case.md | 6 +- ecosystem-tool-user-guide.md | 4 +- migrate-from-mysql-mydumper-files.md | 2 +- mydumper-overview.md | 8 +- table-filter.md | 2 +- 9 files changed, 112 insertions(+), 11 deletions(-) create mode 100644 backup-and-restore-using-dumpling-lightning.md rename export-or-backup-using-dumpling.md => dumpling-overview.md (100%) diff --git a/TOC.md b/TOC.md index 21f245ddf17e7..1d31ee1a876bb 100644 --- a/TOC.md +++ b/TOC.md @@ -60,12 +60,12 @@ + [Use TiDB Ansible](/scale-tidb-using-ansible.md) + [Use TiDB Operator](https://docs.pingcap.com/tidb-in-kubernetes/v1.1/scale-a-tidb-cluster) + Backup and Restore - + [Use Mydumper and TiDB Lightning](/backup-and-restore-using-mydumper-lightning.md) - + [Use Dumpling for Export or Backup](/export-or-backup-using-dumpling.md) + Use BR Tool + [Use BR Tool](/br/backup-and-restore-tool.md) + [BR Use Cases](/br/backup-and-restore-use-cases.md) + [BR storages](/br/backup-and-restore-storages.md) + + [Use Dumpling and TiDB Lightning (Recommended)](/backup-and-restore-using-dumpling-lightning.md) + + [Use Mydumper and TiDB Lightning](/backup-and-restore-using-mydumper-lightning.md) + [Read Historical Data](/read-historical-data.md) + [Configure Time Zone](/configure-time-zone.md) + [Daily Checklist](/daily-check.md) @@ -183,6 +183,7 @@ + [FAQ](/tidb-lightning/tidb-lightning-faq.md) + [Glossary](/tidb-lightning/tidb-lightning-glossary.md) + [TiCDC](/ticdc/ticdc-overview.md) + + [Dumpling](/dumpling-overview.md) + sync-diff-inspector + [Overview](/sync-diff-inspector/sync-diff-inspector-overview.md) + [Data Check for Tables 
with Different Schema/Table Names](/sync-diff-inspector/route-diff.md) diff --git a/backup-and-restore-using-dumpling-lightning.md b/backup-and-restore-using-dumpling-lightning.md new file mode 100644 index 0000000000000..365fb188cd985 --- /dev/null +++ b/backup-and-restore-using-dumpling-lightning.md @@ -0,0 +1,83 @@ +--- +title: 使用 Dumpling/TiDB Lightning 进行备份与恢复 +aliases: ['/docs-cn/dev/export-or-backup-using-dumpling/','/zh/tidb/dev/export-or-backup-using-dumpling'] +--- + +# 使用 Dumpling/TiDB Lightning 进行备份与恢复 + +本文档将详细介绍如何使用 Dumpling/TiDB Lightning 对 TiDB 进行全量备份与恢复。增量备份与恢复可使用 [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md)。 + +这里假定 TiDB 服务信息如下: + +|Name|Address|Port|User|Password| +|----|-------|----|----|--------| +|TiDB|127.0.0.1|4000|root|*| + +在这个备份恢复过程中,会用到下面的工具: + +- [Dumpling](/dumpling-overview.md):从 TiDB 导出数据 +- [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md):导入数据到 TiDB + +## Dumpling/TiDB Lightning 全量备份恢复最佳实践 + +为了快速地备份恢复数据(特别是数据量巨大的库),可以参考以下建议: + +* 导出来的数据文件应当尽可能的小,可以通过设置参数 `-F` 来控制导出来的文件大小。如果后续使用 TiDB Lightning 对备份文件进行恢复,建议把 `dumpling` -F 参数的值设置为 `256m`。 +* 如果导出的表中有数据表的行数非常多,可以通过设置参数 `-r` 来开启表内并发。 + +## 从 TiDB 备份数据 + +使用 `dumpling` 从 TiDB 备份数据的命令如下: + +{{< copyable "shell-regular" >}} + +```bash +./bin/dumpling -h 127.0.0.1 -P 4000 -u root -t 32 -F 256m -T test.t1 -T test.t2 -o ./var/test +``` + +上述命令中,用 `-T test.t1 -T test.t2` 表明只导出 `test`.`t1`,`test`.`t2` 两张表。更多导出数据筛选方式可以参考[筛选导出的数据](/dumpling-overview.md#筛选导出的数据)。 + +`-t 32` 表明使用 32 个线程来导出数据。`-F 256m` 是将实际的表切分成一定大小的 chunk,这里的 chunk 大小为 256MB。 + +从 v4.0.0 版本开始,Dumpling 可以自动延长 GC 时间(Dumpling 需要访问 TiDB 集群的 PD 地址),而 v4.0.0 之前的版本,需要手动调整 GC 时间, 否则 `dumpling` 备份时可能出现以下报错: + +``` +Could not read data from testSchema.testTable: GC life time is shorter than transaction duration, transaction starts at 2019-08-05 21:10:01.451 +0800 CST, GC safe point is 2019-08-05 21:14:53.801 +0800 CST +``` + +手动执行两步命令: + +1. 
执行 `dumpling` 命令前,查询 TiDB 集群的 [GC](/garbage-collection-overview.md) 值并使用 MySQL 客户端将其调整为合适的值: + + {{< copyable "sql" >}} + + ```sql + SELECT * FROM mysql.tidb WHERE VARIABLE_NAME = 'tikv_gc_life_time'; + ``` + + ``` + +-----------------------+------------------------------------------------------------------------------------------------+ + | VARIABLE_NAME | VARIABLE_VALUE | + +-----------------------+------------------------------------------------------------------------------------------------+ + | tikv_gc_life_time | 10m0s | + +-----------------------+------------------------------------------------------------------------------------------------+ + 1 rows in set (0.02 sec) + ``` + + {{< copyable "sql" >}} + + ```sql + update mysql.tidb set VARIABLE_VALUE = '720h' where VARIABLE_NAME = 'tikv_gc_life_time'; + ``` + +2. 执行 `dumpling` 命令后,将 TiDB 集群的 GC 值恢复到第 1 步中的初始值: + + {{< copyable "sql" >}} + + ```sql + update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life_time'; + ``` + +## 向 TiDB 恢复数据 + +使用 TiDB Lightning 将之前导出的数据导入到 TiDB,完成恢复操作。具体的使用方法见 [TiDB Lightning 使用文档](/tidb-lightning/tidb-lightning-tidb-backend.md) diff --git a/download-ecosystem-tools.md b/download-ecosystem-tools.md index e6a25a3aba15e..adab9abcf8bc5 100644 --- a/download-ecosystem-tools.md +++ b/download-ecosystem-tools.md @@ -59,6 +59,19 @@ Download [DM](https://docs.pingcap.com/tidb-data-migration/v1.0/overview) by usi > > `{version}` in the above download link indicates the version number of DM. For example, the download link for `v1.0.1` is `https://download.pingcap.org/dm-v1.0.1-linux-amd64.tar.gz`. You can check the published DM versions in the [DM Release](https://github.com/pingcap/dm/releases) page. 
+## Dumpling + +Download [Dumpling](/dumpling-overview.md) from the links below: + +| Installation package | Operating system | Architecture | SHA256 checksum | +|:---|:---|:---|:---| +| `https://download.pingcap.org/tidb-toolkit-{version}-linux-amd64.tar.gz` | Linux | amd64 | `https://download.pingcap.org/tidb-toolkit-{version}-linux-amd64.sha256` | + +> **Note:** +> +> The `{version}` in the download link is the version number of Dumpling. For example, the link for downloading the `v4.0.2` version of Dumpling is `https://download.pingcap.org/tidb-toolkit-v4.0.2-linux-amd64.tar.gz`. You can view the currently released versions in [Dumpling Releases](https://github.com/pingcap/dumpling/releases). +> Dumpling also supports Linux on arm64. To download the `arm64` version of Dumpling, replace `amd64` in the download link with `arm64`. + ## Syncer, Loader, and Mydumper If you want to download the latest version of [Syncer](/syncer-overview.md), [Loader](/loader-overview.md), or [Mydumper](/mydumper-overview.md), directly download the tidb-enterprise-tools package, because all these tools are included in this package. diff --git a/export-or-backup-using-dumpling.md b/dumpling-overview.md similarity index 100% rename from export-or-backup-using-dumpling.md rename to dumpling-overview.md diff --git a/ecosystem-tool-user-case.md b/ecosystem-tool-user-case.md index 5425a6ec0ee8f..b83629e2324dc 100644 --- a/ecosystem-tool-user-case.md +++ b/ecosystem-tool-user-case.md @@ -14,13 +14,13 @@ If you need to import the compatible CSV files exported by other tools to TiDB, ## Import full data from MySQL/Aurora -If you need to import full data from MySQL or Aurora, use [Dumpling](/export-or-backup-using-dumpling.md) first to export data as SQL dump files, and then use [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) to import data into the TiDB cluster. 
+If you need to import full data from MySQL or Aurora, use [Dumpling](/dumpling-overview.md) first to export data as SQL dump files, and then use [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) to import data into the TiDB cluster. ## Migrate data from MySQL/Aurora If you need to migrate both full data and incremental data from MySQL/Aurora, use [TiDB Data Migration](https://docs.pingcap.com/tidb-data-migration/v1.0/overview) (DM) to perform the full and incremental data migration. -If the full data volume is large (at the TB level), you can first use [Dumpling](/export-or-backup-using-dumpling.md) and [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) to perform the full data migration, and then use DM to perform the incremental data migration. +If the full data volume is large (at the TB level), you can first use [Dumpling](/dumpling-overview.md) and [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) to perform the full data migration, and then use DM to perform the incremental data migration. ## Back up and restore TiDB cluster @@ -30,7 +30,7 @@ In addition, BR can also be used to perform [incremental backup](/br/backup-and- ## Migrate data from TiDB -If you need to migrate data from a TiDB cluster to MySQL or to another TiDB cluster, use [Dumpling](/export-or-backup-using-dumpling.md) to export full data from TiDB as SQL dump files, and then use [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) to import data to MySQL or another TiDB cluster. +If you need to migrate data from a TiDB cluster to MySQL or to another TiDB cluster, use [Dumpling](/dumpling-overview.md) to export full data from TiDB as SQL dump files, and then use [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) to import data to MySQL or another TiDB cluster. If you also need to migrate incremental data, use [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md). 
diff --git a/ecosystem-tool-user-guide.md b/ecosystem-tool-user-guide.md index de8f1f9af7efb..0b0d47f5629da 100644 --- a/ecosystem-tool-user-guide.md +++ b/ecosystem-tool-user-guide.md @@ -9,7 +9,7 @@ This document introduces the functionalities of TiDB ecosystem tools and their r ## Full data export -[Dumpling](/export-or-backup-using-dumpling.md) is a tool for the logical full data export from MySQL or TiDB. +[Dumpling](/dumpling-overview.md) is a tool for the logical full data export from MySQL or TiDB. The following are the basics of Dumpling: @@ -76,7 +76,7 @@ If the data volume is below the TB level, it is recommended to migrate data from If the data volume is at the TB level, take the following steps: -1. Use [Dumpling](/export-or-backup-using-dumpling.md) to export the full data from MySQL/MariaDB. +1. Use [Dumpling](/dumpling-overview.md) to export the full data from MySQL/MariaDB. 2. Use [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) to import the data exported in Step 1 to the TiDB cluster. 3. Use DM to migrate the incremental data from MySQL/MariaDB to TiDB. diff --git a/migrate-from-mysql-mydumper-files.md b/migrate-from-mysql-mydumper-files.md index c7fbc0e7bdbab..2cd17f57a9536 100644 --- a/migrate-from-mysql-mydumper-files.md +++ b/migrate-from-mysql-mydumper-files.md @@ -6,7 +6,7 @@ aliases: ['/docs/dev/migrate-from-mysql-mydumper-files/'] # Migrate Data from MySQL SQL Files -This document describes how to migrate data from MySQL SQL files to TiDB using TiDB Lightning. For details on how to generate MySQL SQL files, refer to [Mydumper](/mydumper-overview.md) or [Dumpling](/export-or-backup-using-dumpling.md). +This document describes how to migrate data from MySQL SQL files to TiDB using TiDB Lightning. For details on how to generate MySQL SQL files, refer to [Mydumper](/mydumper-overview.md) or [Dumpling](/dumpling-overview.md). The data migration process described in this document uses TiDB Lightning. The steps are as follows. 
diff --git a/mydumper-overview.md b/mydumper-overview.md index a13ad939707d1..7a4ec3edf93b5 100644 --- a/mydumper-overview.md +++ b/mydumper-overview.md @@ -1,10 +1,14 @@ --- -title: Mydumper Instructions +title: Mydumper Instruction summary: Use Mydumper to export data from TiDB. aliases: ['/docs/dev/mydumper-overview/','/docs/dev/reference/tools/mydumper/'] --- -# Mydumper Instructions +# Mydumper Instruction + +> **Warning:** +> +> The maintainers have stopped developing new features for Mydumper, and most of its features have been replaced by [Dumpling](/dumpling-overview.md). It is strongly recommended that you switch to Dumpling. ## What is Mydumper? diff --git a/table-filter.md b/table-filter.md index b5120a821b78a..d436df02442d6 100644 --- a/table-filter.md +++ b/table-filter.md @@ -27,7 +27,7 @@ Table filters can be applied to the tools using multiple `-f` or `--filter` comm # ^~~~~~~~~~~~~~~~~~~~~~~ ``` -* [Dumpling](/export-or-backup-using-dumpling.md): +* [Dumpling](/dumpling-overview.md): {{< copyable "shell-regular" >}} From 98a0add4209a3f045620544db12dae2c7e396192 Mon Sep 17 00:00:00 2001 From: yikeke Date: Tue, 28 Jul 2020 17:51:46 +0800 Subject: [PATCH 02/17] revert --- mydumper-overview.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mydumper-overview.md b/mydumper-overview.md index 7a4ec3edf93b5..1ee0d2f77cb12 100644 --- a/mydumper-overview.md +++ b/mydumper-overview.md @@ -1,10 +1,10 @@ --- -title: Mydumper Instruction +title: Mydumper Instructions summary: Use Mydumper to export data from TiDB. 
aliases: ['/docs/dev/mydumper-overview/','/docs/dev/reference/tools/mydumper/'] --- -# Mydumper Instruction +# Mydumper Instructions > **Warning:** > From a42e50f4ff156ec3152e1ed9246521e7e7d52efe Mon Sep 17 00:00:00 2001 From: yikeke Date: Wed, 29 Jul 2020 14:00:26 +0800 Subject: [PATCH 03/17] Update dumpling-overview.md --- dumpling-overview.md | 106 +++++++++++++++++++++++++++++++++++-------- 1 file changed, 88 insertions(+), 18 deletions(-) diff --git a/dumpling-overview.md b/dumpling-overview.md index de429bb178c22..13198b8bbdad2 100644 --- a/dumpling-overview.md +++ b/dumpling-overview.md @@ -1,24 +1,42 @@ --- -title: Export or Backup Data Using Dumpling -summary: Use the Dumpling tool to export or backup data in TiDB. -aliases: ['/docs/dev/export-or-backup-using-dumpling/'] +title: Dumpling Overview +summary: Use the Dumpling tool to export data from TiDB. --- -# Export or Backup Data Using Dumpling +# Dumpling Overview -This document introduces how to use the [Dumpling](https://github.com/pingcap/dumpling) tool to export or backup data in TiDB. Dumpling exports data stored in TiDB as SQL or CSV data files and can be used to make a logical full backup or export. +This document introduces the data export tool - [Dumpling](https://github.com/pingcap/dumpling). Dumpling exports data stored in TiDB/MySQL as SQL or CSV data files and can be used to make a logical full backup or export. -For backups of SST files (KV pairs) or backups of incremental data that are not sensitive to latency, refer to [BR](/br/backup-and-restore-tool.md). For real-time backups of incremental data, refer to [TiCDC](/ticdc/ticdc-overview.md). +For backups of SST files (key-value pairs) or backups of incremental data that are not sensitive to latency, refer to [BR](/br/backup-and-restore-tool.md). For real-time backups of incremental data, refer to [TiCDC](/ticdc/ticdc-overview.md). 
-For detailed usage of Dumpling, use the `--help` command or refer to [Dumpling User Guide](https://github.com/pingcap/dumpling/blob/master/docs/en/user-guide.md). +## Improvements of Dumpling compared with Mydumper + +1. Support exporting data in multiple formats, including SQL and CSV +2. Support the [table-filter](https://github.com/pingcap/tidb-tools/blob/master/pkg/table-filter/README.md) feature, which makes it easier to filter data +3. More optimizations are made for TiDB: + - Support configuring the memory limit of a single TiDB SQL statement + - Support automatic adjustment of TiDB GC time for TiDB v4.0.0 and above + - Use TiDB's hidden column `_tidb_rowid` to optimize the performance of concurrent data export from a single table + - For TiDB, you can set the value of [`tidb_snapshot`](/read-historical-data.md#how-tidb-reads-data-from-history-versions) to specify the time point of the data backup. This ensures the consistency of the backup, instead of using `FLUSH TABLES WITH READ LOCK` to ensure the consistency. + +## Dumpling introduction + +Dumpling is written in Go. The Github project is [pingcap/dumpling](https://github.com/pingcap/dumpling). + +For detailed usage of Dumpling, use the `--help` option or refer to [Parameter list of Dumpling](#parameter-list-of-dumpling). When using Dumpling, you need to execute the export command on a running cluster. This document assumes that there is a TiDB instance on the `127.0.0.1:4000` host and that this TiDB instance has a root user without a password. -## Download Dumpling +Dumpling is included in the tidb-toolkit installation package and can be [downloaded here](/download-ecosystem-tools.md#dumpling). + +## Export data from TiDB/MySQL -To download the latest version of Dumpling, click the [download link](https://download.pingcap.org/dumpling-nightly-linux-amd64.tar.gz). 
+### Required privileges -## Export data from TiDB +- SELECT +- RELOAD +- LOCK TABLES +- REPLICATION CLIENT ### Export to SQL files @@ -65,7 +83,7 @@ For example, you can export all records that match `id < 100` in `test.sbtest1` ### Filter the exported data -#### Use the `--where` command to filter data +#### Use the `--where` option to filter data By default, Dumpling exports the tables of the entire database except the tables in the system databases. You can use `--where ` to select the records to be exported. @@ -82,9 +100,9 @@ By default, Dumpling exports the tables of the entire database except the tables The above command exports the data that matches `id < 100` from each table. -#### Use the `--filter` command to filter data +#### Use the `--filter` option to filter data -Dumpling can filter specific databases or tables by specifying the table filter with the `--filter` command. The syntax of table filters is similar to that of `.gitignore`. For details, see [Table Filter](/table-filter.md). +Dumpling can filter specific databases or tables by specifying the table filter with the `--filter` option. The syntax of table filters is similar to that of `.gitignore`. For details, see [Table Filter](/table-filter.md). {{< copyable "shell-regular" >}} @@ -100,15 +118,15 @@ Dumpling can filter specific databases or tables by specifying the table filter The above command exports all the tables in the `employees` database and the `WorkOrder` tables in all databases. -#### Use the `-B` or `-T` command to filter data +#### Use the `-B` or `-T` option to filter data -Dumpling can also export specific databases with the `-B` command or specific tables with the `-T` command. +Dumpling can also export specific databases with the `-B` option or specific tables with the `-T` option. > **Note:** > -> - The `--filter` command and the `-T` command cannot be used at the same time. +> - The `--filter` option and the `-T` option cannot be used at the same time. 
> -> - The `-T` command can only accept a complete form of inputs like `database-name.table-name`, and inputs with only the table name are not accepted. Example: Dumpling cannot recognize `-T WorkOrder`. +> - The `-T` option can only accept a complete form of inputs like `database-name.table-name`, and inputs with only the table name are not accepted. Example: Dumpling cannot recognize `-T WorkOrder`. Examples: @@ -158,7 +176,26 @@ ls -lh /tmp/test | awk '{print $5 "\t" $9}' 190K test.sbtest3.0.sql ``` -In addition, if the data volume is very large, to avoid export failure due to GC during the export process, you can extend the GC time in advance: +### Export historical data snapshot of TiDB + +Dumpling can export the data of a certain [tidb_snapshot](/read-historical-data.md#how-tidb-reads-data-from-history-versions) with the `--snapshot` option specified. + +The `--snapshot` option can be set to a TSO (the `Position` field output by the `SHOW MASTER STATUS` command) or a valid time of the `datetime` data type, for example: + +{{< copyable "shell-regular" >}} + +```shell +./dumpling --snapshot 417773951312461825 +./dumpling --snapshot "2020-07-02 17:12:45" +``` + +The TiDB historical data snapshots when the TSO is `417773951312461825` and the time is `2020-07-02 17:12:45` are exported. + +### TiDB GC settings when exporting a large volume of data + +When exporting data from TiDB, if the TiDB version is greater than v4.0.0 and Dumpling can access the PD address of the TiDB cluster, Dumpling automatically extends the GC time without affecting the original cluster. But for TiDB earlier than v4.0.0, you need to manually modify the GC time. 
+ +In other scenarios, if the data volume is very large, to avoid export failure due to GC during the export process, you can extend the GC time in advance: {{< copyable "sql" >}} @@ -175,3 +212,36 @@ update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life ``` Finally, all the exported data can be imported back to TiDB using [Lightning](/tidb-lightning/tidb-lightning-tidb-backend.md). + +## Parameter list of Dumpling + +| Parameters | Usage | Default value | +| --------| --- | --- | +| -B 或 --database | 导出指定数据库 | +| -T 或 --tables-list | 导出指定数据表 | +| -f 或 --filter | 导出能匹配模式的表,语法可参考 [table-filter](https://github.com/pingcap/tidb-tools/blob/master/pkg/table-filter/README.md)(只有英文版) | "\*.\*" 不过滤任何库表 | +| --case-sensitive | table-filter 是否大小写敏感 | false,大小写不敏感 | +| -h 或 --host| 链接节点地址 | "127.0.0.1" | +| -t 或 --threads | 备份并发线程数| 4 | +| -r 或 --rows |将 table 划分成 row 行数据,一般针对大表操作并发生成多个文件。| +| --loglevel | 日志级别 {debug,info,warn,error,dpanic,panic,fatal} | "info" | +| -d 或 --no-data | 不导出数据, 适用于只导出 schema 场景 | +| --no-header | 导出 table csv 数据,不生成 header | +| -W 或 --no-views| 不导出 view | true | +| -m 或 --no-schemas | 不导出 schema , 只导出数据 | +| -s 或--statement-size | 控制 Insert Statement 的大小,单位 bytes | +| -F 或 --filesize | 将 table 数据划分出来的文件大小, 需指明单位 (如 `128B`, `64KiB`, `32MiB`, `1.5GiB`) | +| --filetype| 导出文件类型 csv/sql | "sql" | +| -o 或 --output | 设置导出文件路径 | "./export-${time}" | +| -S 或 --sql | 根据指定的 sql 导出数据,该指令不支持并发导出 | +| --consistency | flush: dump 前用 FTWRL
snapshot: 通过 tso 指定 dump 位置
lock: 对需要 dump 的所有表执行 lock tables read
none: 不加锁 dump,无法保证一致性
auto: MySQL flush, TiDB snapshot | "auto" | +| --snapshot | snapshot tso, 只在 consistency=snapshot 下生效 | +| --where | 对备份的数据表通过 where 条件指定范围 | +| -p 或 --password | 链接密码 | +| -P 或 --port | 链接端口 | 4000 | +| -u 或 --user | 用户名 | "root" | +| --dump-empty-database | 导出空数据库的建库语句 | true | +| --tidbMemQuotaQuery | 导出 TiDB 数据库时单条 query 最大使用的内存 | 34359738368(32GB) | +| --ca | 用于 TLS 连接的 certificate authority 文件的地址 | +| --cert | 用于 TLS 连接的 client certificate 文件的地址 | +| --key | 用于 TLS 连接的 client private key 文件的地址 | From fe6f352b5358b79b7d0fb7eba002c791522fc9e0 Mon Sep 17 00:00:00 2001 From: yikeke Date: Wed, 29 Jul 2020 14:58:07 +0800 Subject: [PATCH 04/17] Update dumpling-overview.md --- dumpling-overview.md | 72 +++++++++++++++++++++++++------------------- 1 file changed, 41 insertions(+), 31 deletions(-) diff --git a/dumpling-overview.md b/dumpling-overview.md index 13198b8bbdad2..ab45bbf92acc1 100644 --- a/dumpling-overview.md +++ b/dumpling-overview.md @@ -23,7 +23,7 @@ For backups of SST files (key-value pairs) or backups of incremental data that a Dumpling is written in Go. The Github project is [pingcap/dumpling](https://github.com/pingcap/dumpling). -For detailed usage of Dumpling, use the `--help` option or refer to [Parameter list of Dumpling](#parameter-list-of-dumpling). +For detailed usage of Dumpling, use the `--help` option or refer to [Option list of Dumpling](#option-list-of-dumpling). When using Dumpling, you need to execute the export command on a running cluster. This document assumes that there is a TiDB instance on the `127.0.0.1:4000` host and that this TiDB instance has a root user without a password. @@ -213,35 +213,45 @@ update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life Finally, all the exported data can be imported back to TiDB using [Lightning](/tidb-lightning/tidb-lightning-tidb-backend.md). 
-## Parameter list of Dumpling +## Option list of Dumpling -| Parameters | Usage | Default value | +| Options | Usage | Default value | | --------| --- | --- | -| -B 或 --database | 导出指定数据库 | -| -T 或 --tables-list | 导出指定数据表 | -| -f 或 --filter | 导出能匹配模式的表,语法可参考 [table-filter](https://github.com/pingcap/tidb-tools/blob/master/pkg/table-filter/README.md)(只有英文版) | "\*.\*" 不过滤任何库表 | -| --case-sensitive | table-filter 是否大小写敏感 | false,大小写不敏感 | -| -h 或 --host| 链接节点地址 | "127.0.0.1" | -| -t 或 --threads | 备份并发线程数| 4 | -| -r 或 --rows |将 table 划分成 row 行数据,一般针对大表操作并发生成多个文件。| -| --loglevel | 日志级别 {debug,info,warn,error,dpanic,panic,fatal} | "info" | -| -d 或 --no-data | 不导出数据, 适用于只导出 schema 场景 | -| --no-header | 导出 table csv 数据,不生成 header | -| -W 或 --no-views| 不导出 view | true | -| -m 或 --no-schemas | 不导出 schema , 只导出数据 | -| -s 或--statement-size | 控制 Insert Statement 的大小,单位 bytes | -| -F 或 --filesize | 将 table 数据划分出来的文件大小, 需指明单位 (如 `128B`, `64KiB`, `32MiB`, `1.5GiB`) | -| --filetype| 导出文件类型 csv/sql | "sql" | -| -o 或 --output | 设置导出文件路径 | "./export-${time}" | -| -S 或 --sql | 根据指定的 sql 导出数据,该指令不支持并发导出 | -| --consistency | flush: dump 前用 FTWRL
snapshot: 通过 tso 指定 dump 位置
lock: 对需要 dump 的所有表执行 lock tables read
none: 不加锁 dump,无法保证一致性
 auto: MySQL flush, TiDB snapshot | "auto" | -| --snapshot | snapshot tso, 只在 consistency=snapshot 下生效 | -| --where | 对备份的数据表通过 where 条件指定范围 | -| -p 或 --password | 链接密码 | -| -P 或 --port | 链接端口 | 4000 | -| -u 或 --user | 用户名 | "root" | -| --dump-empty-database | 导出空数据库的建库语句 | true | -| --tidbMemQuotaQuery | 导出 TiDB 数据库时单条 query 最大使用的内存 | 34359738368(32GB) | -| --ca | 用于 TLS 连接的 certificate authority 文件的地址 | -| --cert | 用于 TLS 连接的 client certificate 文件的地址 | -| --key | 用于 TLS 连接的 client private key 文件的地址 | +| -V or --version | Output the Dumpling version and exit directly | +| -B or --database | Export specified databases | +| -T or --tables-list | Export specified tables | +| -f or --filter | Export tables that match the filter pattern. For the filter syntax, see [table-filter](/table-filter.md). | `"*.*"` (export all databases and tables) | +| --case-sensitive | Whether table-filter is case-sensitive | false (case-insensitive) | +| -h or --host| The address of the linked node | "127.0.0.1" | +| -t or --threads | The number of concurrent backup threads | 4 | +| -r or --rows | Divide the table into chunks of the specified number of rows (generally applicable for concurrent operations of splitting a large table into multiple files) | +| -L or --logfile | Log output address. If it is empty, the log will be output to the console | "" | +| --loglevel | Log level {debug,info,warn,error,dpanic,panic,fatal} | "info" | +| --logfmt | Log output format {text,json} | "text" | +| -d or --no-data | Do not export data (suitable for scenarios where only the schema is exported) | +| --no-header | Export CSV files of the tables without generating header | +| -W or --no-views| Do not export the views | true | +| -m or --no-schemas | Do not export the schema; export only the data | +| -s or --statement-size | Control the size of the `INSERT` statements; the unit is bytes | +| -F or --filesize | The file size of the divided tables. 
The unit must be specified such as `128B`, `64KiB`, `32MiB`, and `1.5GiB`. | +| --filetype| Exported file type (csv/sql) | "sql" | +| -o or --output | Exported file path | "./export-${time}" | +| -S or --sql | Export data according to the specified SQL statement. This option does not support concurrent export. | +| --consistency | flush: use FTWRL before the dump
 snapshot: use a TSO to specify the snapshot position of the dump
lock: execute `lock tables read` on all tables to be dumped
none: dump without adding locks, which sacrifices consistency
auto: MySQL flush, TiDB snapshot | "auto" | +| --snapshot | snapshot TSO; valid only when `consistency=snapshot` | +| --where | Specify the scope of the table backup through the `where` condition | +| -p or --password | The password of the linked node | +| -P or --port | The port of the linked node | 4000 | +| -u or --user | The username of the linked node | "root" | +| --dump-empty-database | Export the `CREATE DATABASE` statements of the empty databases | true | +| --tidbMemQuotaQuery | Maximum memory used by a single query when exporting TiDB database | 34359738368(32GB) | +| --ca | The address of the certificate authority file for TLS connection | +| --cert | The address of the client certificate file for TLS connection | +| --key | The address of the client private key file for TLS connection | +| --csv-delimiter | Delimiter of character type variables in CSV files | '"' | +| --csv-separator | Separator of each value in CSV files | ',' | +| --csv-null-value | Representation of null values in CSV files | "\\N" | +| --escape-backslash | Use backslash (`\`) to escape special characters in the export file | true | +| --output-filename-template | The data file names in the [golang arguments](https://golang.org/pkg/text/template/#hdr-Arguments) format
Support the `{{.DB}}`, `{{.Table}}`, and `{{.Index}}` arguments
The three arguments represent the database name, table name, and block ID of the data file | '{{.DB}}.{{.Table}}.{{.Index}}' | +| --status-addr | Dumpling's service address, including the address for Prometheus to pull metrics and pprof debugging | ":8281" | +| --tidb-mem-quota-query | The memory limit of exporting SQL statements by a single Dumpling command, the unit is byte, and the default value is 32 GB | 34359738368 | From 5ca54efb751bd720b78a29ebe932df76d00f2825 Mon Sep 17 00:00:00 2001 From: yikeke Date: Wed, 29 Jul 2020 15:35:52 +0800 Subject: [PATCH 05/17] update backup-and-restore-using-dumpling-lightning.md --- ...up-and-restore-using-dumpling-lightning.md | 50 ++++++++++--------- ...up-and-restore-using-mydumper-lightning.md | 20 ++++---- 2 files changed, 36 insertions(+), 34 deletions(-) diff --git a/backup-and-restore-using-dumpling-lightning.md b/backup-and-restore-using-dumpling-lightning.md index 365fb188cd985..01a873005c8a9 100644 --- a/backup-and-restore-using-dumpling-lightning.md +++ b/backup-and-restore-using-dumpling-lightning.md @@ -1,33 +1,33 @@ --- -title: 使用 Dumpling/TiDB Lightning 进行备份与恢复 +title: Use Dumpling and TiDB Lightning for Data Backup and Restoration aliases: ['/docs-cn/dev/export-or-backup-using-dumpling/','/zh/tidb/dev/export-or-backup-using-dumpling'] --- -# 使用 Dumpling/TiDB Lightning 进行备份与恢复 +# Use Dumpling and TiDB Lightning for Data Backup and Restoration -本文档将详细介绍如何使用 Dumpling/TiDB Lightning 对 TiDB 进行全量备份与恢复。增量备份与恢复可使用 [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md)。 +This document introduces in detail how to use Dumpling and TiDB Lightning to backup and restore full data of TiDB. For incremental backup and restoration, refer to [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md). 
-这里假定 TiDB 服务信息如下: +Suppose that the TiDB server information is as follows: -|Name|Address|Port|User|Password| +|Server Name|Server Address|Port|User|Password| |----|-------|----|----|--------| |TiDB|127.0.0.1|4000|root|*| -在这个备份恢复过程中,会用到下面的工具: +Use the following tools for data backup and restoration: -- [Dumpling](/dumpling-overview.md):从 TiDB 导出数据 -- [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md):导入数据到 TiDB +- [Dumpling](/dumpling-overview.md): to export data from TiDB +- [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md): to import data into TiDB -## Dumpling/TiDB Lightning 全量备份恢复最佳实践 +## Best practices for full backup and restoration using Dumpling/TiDB Lightning -为了快速地备份恢复数据(特别是数据量巨大的库),可以参考以下建议: +To quickly back up and restore data (especially large amounts of data), refer to the following recommendations: -* 导出来的数据文件应当尽可能的小,可以通过设置参数 `-F` 来控制导出来的文件大小。如果后续使用 TiDB Lightning 对备份文件进行恢复,建议把 `dumpling` -F 参数的值设置为 `256m`。 -* 如果导出的表中有数据表的行数非常多,可以通过设置参数 `-r` 来开启表内并发。 +* Keep each exported data file as small as possible. It is recommended to use the `-F` option of Dumpling to set the file size. If you use TiDB Lightning to restore data, it is recommended that you set the value of `-F` to `256m`. +* If some of the exported tables have many rows, you can enable concurrent export within a table by setting the `-r` option. -## 从 TiDB 备份数据 +## Back up data from TiDB -使用 `dumpling` 从 TiDB 备份数据的命令如下: +Use the following `dumpling` command to back up data from TiDB. 
{{< copyable "shell-regular" >}} @@ -35,19 +35,21 @@ aliases: ['/docs-cn/dev/export-or-backup-using-dumpling/','/zh/tidb/dev/export-o ./bin/dumpling -h 127.0.0.1 -P 4000 -u root -t 32 -F 256m -T test.t1 -T test.t2 -o ./var/test ``` -上述命令中,用 `-T test.t1 -T test.t2` 表明只导出 `test`.`t1`,`test`.`t2` 两张表。更多导出数据筛选方式可以参考[筛选导出的数据](/dumpling-overview.md#筛选导出的数据)。 +In this command: -`-t 32` 表明使用 32 个线程来导出数据。`-F 256m` 是将实际的表切分成一定大小的 chunk,这里的 chunk 大小为 256MB。 +- `-T test.t1 -T test.t2` means that only the two tables `test`.`t1` and `test`.`t2` are exported. For more methods to filter exported data, refer to [Filter exported data](/dumpling-overview.md#filter-the0exported-data). +- `-t 32` means that 32 threads are used to export the data. +- `-F 256m` means that a table is partitioned into chunks, and one chunk is 256MB. -从 v4.0.0 版本开始,Dumpling 可以自动延长 GC 时间(Dumpling 需要访问 TiDB 集群的 PD 地址),而 v4.0.0 之前的版本,需要手动调整 GC 时间, 否则 `dumpling` 备份时可能出现以下报错: +Starting from v4.0.0, Dumpling can automatically extend the GC time if it can access the PD address of the TiDB cluster. But for TiDB earlier than v4.0.0, you need to manually modify the GC time. Otherwise, you might encounter the following error: -``` +```log Could not read data from testSchema.testTable: GC life time is shorter than transaction duration, transaction starts at 2019-08-05 21:10:01.451 +0800 CST, GC safe point is 2019-08-05 21:14:53.801 +0800 CST ``` -手动执行两步命令: +The steps to manually modify the GC time are as follows: -1. 执行 `dumpling` 命令前,查询 TiDB 集群的 [GC](/garbage-collection-overview.md) 值并使用 MySQL 客户端将其调整为合适的值: +1.
Before executing the `dumpling` command, query the [GC](/garbage-collection-overview.md) value of the TiDB cluster and execute the following statement in the MySQL client to adjust it to a suitable value: {{< copyable "sql" >}} @@ -55,7 +57,7 @@ Could not read data from testSchema.testTable: GC life time is shorter than tran SELECT * FROM mysql.tidb WHERE VARIABLE_NAME = 'tikv_gc_life_time'; ``` - ``` + ```sql +-----------------------+------------------------------------------------------------------------------------------------+ | VARIABLE_NAME | VARIABLE_VALUE | +-----------------------+------------------------------------------------------------------------------------------------+ @@ -70,7 +72,7 @@ Could not read data from testSchema.testTable: GC life time is shorter than tran update mysql.tidb set VARIABLE_VALUE = '720h' where VARIABLE_NAME = 'tikv_gc_life_time'; ``` -2. 执行 `dumpling` 命令后,将 TiDB 集群的 GC 值恢复到第 1 步中的初始值: +2. After executing the `dumpling` command, restore the GC value of the TiDB cluster to the initial value in step 1: {{< copyable "sql" >}} @@ -78,6 +80,6 @@ Could not read data from testSchema.testTable: GC life time is shorter than tran update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life_time'; ``` -## 向 TiDB 恢复数据 +## Restore data into TiDB -使用 TiDB Lightning 将之前导出的数据导入到 TiDB,完成恢复操作。具体的使用方法见 [TiDB Lightning 使用文档](/tidb-lightning/tidb-lightning-tidb-backend.md) +To restore data into TiDB, use TiDB Lightning to import the exported data. See [TiDB Lightning Tutorial](/tidb-lightning/tidb-lightning-tidb-backend.md). 
diff --git a/backup-and-restore-using-mydumper-lightning.md b/backup-and-restore-using-mydumper-lightning.md index 4bcd3b28c0437..2ae17734ff12e 100644 --- a/backup-and-restore-using-mydumper-lightning.md +++ b/backup-and-restore-using-mydumper-lightning.md @@ -7,9 +7,9 @@ aliases: ['/docs/dev/backup-and-restore-using-mydumper-lightning/','/docs/dev/ho This document describes how to perform full backup and restoration of the TiDB data using Mydumper and TiDB Lightning. For incremental backup and restoration, refer to [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md). -Suppose that the TiDB service information is as follows: +Suppose that the TiDB server information is as follows: -|Name|Address|Port|User|Password| +|Server Name|Server Address|Port|User|Password| |:----|:-------|:----|:----|:--------| |TiDB|127.0.0.1|4000|root|*| @@ -32,11 +32,11 @@ Use [Mydumper](/mydumper-overview.md) to export data from TiDB and use [TiDB Lig To quickly backup and restore data (especially large amounts of data), refer to the following recommendations: -* Keep the exported data file as small as possible. It is recommended to use the `-F` parameter to set the file size. If you use TiDB Lightning to restore data, it is recommended that you set the value of `-F` to `256` (MB). If you use `loader` for restoration, it is recommended to set the value to `64` (MB). +* Keep the exported data file as small as possible. It is recommended to use the `-F` option of Mydumper to set the file size. If you use TiDB Lightning to restore data, it is recommended that you set the value of `-F` to `256` (MB). If you use `loader` for restoration, it is recommended to set the value to `64` (MB). ## Backup data from TiDB -Use `mydumper` to backup data from TiDB. +Use the following `mydumper` command to backup data from TiDB: {{< copyable "shell-regular" >}} @@ -44,13 +44,13 @@ Use `mydumper` to backup data from TiDB. 
./bin/mydumper -h 127.0.0.1 -P 4000 -u root -t 32 -F 256 -B test -T t1,t2 --skip-tz-utc -o ./var/test ``` -In this command, +In this command: -`-B test` means that the data is exported from the `test` database. -`-T t1,t2` means that only the `t1` and `t2` tables are exported. -`-t 32` means that 32 threads are used to export the data. -`-F 256` means that a table is partitioned into chunks, and one chunk is 256MB. -`--skip-tz-utc` means to ignore the inconsistency of time zone setting between MySQL and the data exporting machine and to disable automatic conversion. +- `-B test` means that the data is exported from the `test` database. +- `-T t1,t2` means that only the `t1` and `t2` tables are exported. +- `-t 32` means that 32 threads are used to export the data. +- `-F 256` means that a table is partitioned into chunks, and one chunk is 256MB. +- `--skip-tz-utc` means to ignore the inconsistency of time zone setting between MySQL and the data exporting machine and to disable automatic conversion. If `mydumper` returns the following error: From 8ba56ea0bee7b07cb8661d9a6950f47c2857ded7 Mon Sep 17 00:00:00 2001 From: yikeke Date: Wed, 29 Jul 2020 15:37:45 +0800 Subject: [PATCH 06/17] ADD summary --- backup-and-restore-using-dumpling-lightning.md | 1 + 1 file changed, 1 insertion(+) diff --git a/backup-and-restore-using-dumpling-lightning.md b/backup-and-restore-using-dumpling-lightning.md index 01a873005c8a9..9feccf2e35069 100644 --- a/backup-and-restore-using-dumpling-lightning.md +++ b/backup-and-restore-using-dumpling-lightning.md @@ -1,5 +1,6 @@ --- title: Use Dumpling and TiDB Lightning for Data Backup and Restoration +summary: Introduce how to use Dumpling and TiDB Lightning to backup and restore full data of TiDB. 
aliases: ['/docs-cn/dev/export-or-backup-using-dumpling/','/zh/tidb/dev/export-or-backup-using-dumpling'] --- From e3d1fb95e8f7c4a1abaf72e95e6e60baac678e42 Mon Sep 17 00:00:00 2001 From: yikeke Date: Wed, 29 Jul 2020 17:05:49 +0800 Subject: [PATCH 07/17] fix an anchor --- backup-and-restore-using-dumpling-lightning.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backup-and-restore-using-dumpling-lightning.md b/backup-and-restore-using-dumpling-lightning.md index 9feccf2e35069..e90c5369a37d8 100644 --- a/backup-and-restore-using-dumpling-lightning.md +++ b/backup-and-restore-using-dumpling-lightning.md @@ -38,7 +38,7 @@ Use the following `dumpling` command to backup data from TiDB. In this command: -- `-T test.t1 -T test.t2` means that only the two tables `test`.`t1` and `test`.`t2` are exported. For more methods to filter exported data, refer to [Filter exported data](/dumpling-overview.md#filter-the0exported-data). +- `-T test.t1 -T test.t2` means that only the two tables `test`.`t1` and `test`.`t2` are exported. For more methods to filter exported data, refer to [Filter exported data](/dumpling-overview.md#filter-the-exported-data). - `-t 32` means that 32 threads are used to export the data. - `-F 256m` means that a table is partitioned into chunks, and one chunk is 256MB. 
From 7b3fddb928db183a971c551151fb8c02f951c8ef Mon Sep 17 00:00:00 2001 From: Keke Yi <40977455+yikeke@users.noreply.github.com> Date: Thu, 30 Jul 2020 12:05:07 +0800 Subject: [PATCH 08/17] Update dumpling-overview.md Co-authored-by: Chunzhu Li --- dumpling-overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dumpling-overview.md b/dumpling-overview.md index ab45bbf92acc1..8b1b7be2d058d 100644 --- a/dumpling-overview.md +++ b/dumpling-overview.md @@ -195,7 +195,7 @@ The TiDB historical data snapshots when the TSO is `417773951312461825` and the When exporting data from TiDB, if the TiDB version is greater than v4.0.0 and Dumpling can access the PD address of the TiDB cluster, Dumpling automatically extends the GC time without affecting the original cluster. But for TiDB earlier than v4.0.0, you need to manually modify the GC time. -In other scenarios, if the data volume is very large, to avoid export failure due to GC during the export process, you can extend the GC time in advance: +In other scenarios, if the data size is very large, to avoid export failure due to GC during the export process, you can extend the GC time in advance: {{< copyable "sql" >}} From fbaadcad822718ff11e901c221764fa413b98dbf Mon Sep 17 00:00:00 2001 From: Keke Yi <40977455+yikeke@users.noreply.github.com> Date: Thu, 30 Jul 2020 12:05:36 +0800 Subject: [PATCH 09/17] Apply suggestions from code review --- TOC.md | 2 +- backup-and-restore-using-dumpling-lightning.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/TOC.md b/TOC.md index c5d30bb733685..a3b794259aa92 100644 --- a/TOC.md +++ b/TOC.md @@ -61,7 +61,7 @@ + [Use TiDB Ansible](/scale-tidb-using-ansible.md) + [Use TiDB Operator](https://docs.pingcap.com/tidb-in-kubernetes/v1.1/scale-a-tidb-cluster) + Backup and Restore - + Use BR Tool + + Use BR Tool (Recommended) + [Use BR Tool](/br/backup-and-restore-tool.md) + [BR Use Cases](/br/backup-and-restore-use-cases.md) + [BR 
storages](/br/backup-and-restore-storages.md) diff --git a/backup-and-restore-using-dumpling-lightning.md b/backup-and-restore-using-dumpling-lightning.md index e90c5369a37d8..ecdb947074526 100644 --- a/backup-and-restore-using-dumpling-lightning.md +++ b/backup-and-restore-using-dumpling-lightning.md @@ -6,7 +6,7 @@ aliases: ['/docs-cn/dev/export-or-backup-using-dumpling/','/zh/tidb/dev/export-o # Use Dumpling and TiDB Lightning for Data Backup and Restoration -This document introduces in detail how to use Dumpling and TiDB Lightning to backup and restore full data of TiDB. For incremental backup and restoration, refer to [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md). +This document introduces in detail how to use Dumpling and TiDB Lightning to backup and restore full data of TiDB. For incremental backup and replication to downstream, refer to [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md). Suppose that the TiDB server information is as follows: From 95cc55de0190a0b73c5adfdfa409038b9d43c55a Mon Sep 17 00:00:00 2001 From: Keke Yi <40977455+yikeke@users.noreply.github.com> Date: Thu, 30 Jul 2020 12:12:38 +0800 Subject: [PATCH 10/17] Apply suggestions from code review Co-authored-by: Chunzhu Li --- dumpling-overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dumpling-overview.md b/dumpling-overview.md index 8b1b7be2d058d..ad24b44dbd3ff 100644 --- a/dumpling-overview.md +++ b/dumpling-overview.md @@ -220,7 +220,7 @@ Finally, all the exported data can be imported back to TiDB using [Lightning](/t | -V or --version | Output the Dumpling version and exit directly | | -B or --database | Export specified databases | | -T or --tables-list | Export specified tables | -| -f or --filter | Export tables that match the filter pattern. For the filter syntax, see[table-filter](/table-filter.md). | `"\*.\*"` (export all databases or tables) | +| -f or --filter | Export tables that match the filter pattern. 
For the filter syntax, see [table-filter](/table-filter.md). | `"\*.\*"` (export all databases or tables) | | --case-sensitive | whether table-filter is case-sensitive | false (case-insensitive) | | -h or --host| The address of the linked node | "127.0.0.1" | | -t or --threads | The number of concurrent backup threads | 4 | From b05cfd31c71a5d38c4401c390972baf073e96f36 Mon Sep 17 00:00:00 2001 From: Keke Yi <40977455+yikeke@users.noreply.github.com> Date: Thu, 30 Jul 2020 12:21:13 +0800 Subject: [PATCH 11/17] Update dumpling-overview.md Co-authored-by: Chunzhu Li --- dumpling-overview.md | 1 - 1 file changed, 1 deletion(-) diff --git a/dumpling-overview.md b/dumpling-overview.md index ad24b44dbd3ff..926d470ab7369 100644 --- a/dumpling-overview.md +++ b/dumpling-overview.md @@ -244,7 +244,6 @@ Finally, all the exported data can be imported back to TiDB using [Lightning](/t | -P or --port | The port of the linked node | 4000 | | -u or --user | The username of the linked node | "root" | | --dump-empty-database | Export the `CREATE DATABASE` statements of the empty databases | true | -| --tidbMemQuotaQuery | Maximum memory used by a single query when exporting TiDB database | 34359738368(32GB) | | --ca | The address of the certificate authority file for TLS connection | | --cert | The address of the client certificate file for TLS connection | | --key | The address of the client private key file for TLS connection | From aa671d66630ea00961699535642136f97a9d5abc Mon Sep 17 00:00:00 2001 From: Keke Yi <40977455+yikeke@users.noreply.github.com> Date: Thu, 30 Jul 2020 15:25:41 +0800 Subject: [PATCH 12/17] Apply suggestions from code review Co-authored-by: Chunzhu Li --- dumpling-overview.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/dumpling-overview.md b/dumpling-overview.md index 926d470ab7369..ac9b3f9b0734b 100644 --- a/dumpling-overview.md +++ b/dumpling-overview.md @@ -141,7 +141,7 @@ The exported file is stored in the `./export-` 
directory by - `-F` option is used to specify the maximum size of a single file (the unit here is `MiB`; inputs like `5GiB` or `8KB` are also acceptable). - `-r` option is used to specify the maximum number of records (or the number of rows in the database) for a single file. When it is enabled, Dumpling enables concurrency in the table to improve the speed of exporting large tables. -You can use the above parameters to provide Dumpling with a higher degree of concurrency. +With the above options specified, Dumpling can have a higher degree of parallelism. ### Adjust Dumpling's data consistency options @@ -222,7 +222,7 @@ Finally, all the exported data can be imported back to TiDB using [Lightning](/t | -T or --tables-list | Export specified tables | | -f or --filter | Export tables that match the filter pattern. For the filter syntax, see [table-filter](/table-filter.md). | `"\*.\*"` (export all databases or tables) | | --case-sensitive | whether table-filter is case-sensitive | false (case-insensitive) | -| -h or --host| The address of the linked node | "127.0.0.1" | +| -h or --host| The IP address of the connected database host | "127.0.0.1" | | -t or --threads | The number of concurrent backup threads | 4 | | -r or --rows | Divide the table into specified rows of data (generally applicable for concurrent operations of splitting a large table into multiple files. | | -L or --logfile | Log output address. If it is empty, the log will be output to the console | "" | @@ -240,9 +240,9 @@ Finally, all the exported data can be imported back to TiDB using [Lightning](/t | --consistency | flush: use FTWRL before the dump
snapshot: specify the dump files' location through TSO
lock: execute `lock tables read` on all tables to be dumped
none: dump without adding locks, which sacrifices consistency
auto: MySQL flush, TiDB snapshot | "auto" | | --snapshot | snapshot TSO; valid only when `consistency=snapshot` | | --where | Specify the scope of the table backup through the `where` condition | -| -p or --password | The password of the linked node | -| -P or --port | The port of the linked node | 4000 | -| -u or --user | The username of the linked node | "root" | +| -p or --password | The password of the connected database host | +| -P or --port | The port of the connected database host | 4000 | +| -u or --user | The username of the connected database host | "root" | | --dump-empty-database | Export the `CREATE DATABASE` statements of the empty databases | true | | --ca | The address of the certificate authority file for TLS connection | | --cert | The address of the client certificate file for TLS connection | From aa7b9b0cd60c678ab582bbc267f816b4cfad039b Mon Sep 17 00:00:00 2001 From: Keke Yi <40977455+yikeke@users.noreply.github.com> Date: Thu, 30 Jul 2020 16:12:40 +0800 Subject: [PATCH 13/17] Apply suggestions from code review --- dumpling-overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dumpling-overview.md b/dumpling-overview.md index ac9b3f9b0734b..3dd7e19454fff 100644 --- a/dumpling-overview.md +++ b/dumpling-overview.md @@ -251,6 +251,6 @@ Finally, all the exported data can be imported back to TiDB using [Lightning](/t | --csv-separator | Separator of each value in CSV files | ',' | | --csv-null-value | Representation of null values in CSV files | "\\N" | | --escape-backslash | Use backslash (`\`) to escape special characters in the export file | true | -| --output-filename-template | The data file names in the [golang arguments](https://golang.org/pkg/text/template/#hdr-Arguments) format
Support the `{{.DB}}`, `{{.Table}}`, and `{{.Index}}` arguments
The three arguments represent the database name, table name, and block ID of the data file | '{{.DB}}.{{.Table}}.{{.Index}}' | +| --output-filename-template | The filename templates represented in the format of [golang template](https://golang.org/pkg/text/template/#hdr-Arguments)
Support the `{{.DB}}`, `{{.Table}}`, and `{{.Index}}` arguments
The three arguments represent the database name, table name, and chunk ID of the data file | '{{.DB}}.{{.Table}}.{{.Index}}' | | --status-addr | Dumpling's service address, including the address for Prometheus to pull metrics and pprof debugging | ":8281" | | --tidb-mem-quota-query | The memory limit of exporting SQL statements by a single Dumpling command, the unit is byte, and the default value is 32 GB | 34359738368 | From 8143735b9f5840d287467682e9ef2434a3c44657 Mon Sep 17 00:00:00 2001 From: Keke Yi <40977455+yikeke@users.noreply.github.com> Date: Thu, 30 Jul 2020 17:36:21 +0800 Subject: [PATCH 14/17] Update dumpling-overview.md --- dumpling-overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dumpling-overview.md b/dumpling-overview.md index 3dd7e19454fff..1bfa294c195d4 100644 --- a/dumpling-overview.md +++ b/dumpling-overview.md @@ -237,7 +237,7 @@ Finally, all the exported data can be imported back to TiDB using [Lightning](/t | --filetype| Exported file type (csv/sql) | "sql" | | -o or --output | Exported file path | "./export-${time}" | | -S or --sql | Export data according to the specified SQL statement. This command does not support concurrent export. | -| --consistency | flush: use FTWRL before the dump
snapshot: specify the dump files' location through TSO
lock: execute `lock tables read` on all tables to be dumped
none: dump without adding locks, which sacrifices consistency
auto: MySQL flush, TiDB snapshot | "auto" | +| --consistency | flush: use FTWRL before the dump
snapshot: dump the TiDB data of a specific snapshot of a TSO
lock: execute `lock tables read` on all tables to be dumped
none: dump without adding locks, which cannot guarantee consistency
auto: MySQL defaults to using flush, TiDB defaults to using snapshot | "auto" | | --snapshot | snapshot TSO; valid only when `consistency=snapshot` | | --where | Specify the scope of the table backup through the `where` condition | | -p or --password | The password of the connected database host | From 303f710acb5cb4d6bcb091292f88c81842210b58 Mon Sep 17 00:00:00 2001 From: yikeke Date: Thu, 30 Jul 2020 17:41:12 +0800 Subject: [PATCH 15/17] parameter -> option --- dumpling-overview.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/dumpling-overview.md b/dumpling-overview.md index 1bfa294c195d4..a8fbad076565e 100644 --- a/dumpling-overview.md +++ b/dumpling-overview.md @@ -135,7 +135,7 @@ Examples: ### Improve export efficiency through concurrency -The exported file is stored in the `./export-` directory by default. Commonly used parameters are as follows: +The exported file is stored in the `./export-` directory by default. Commonly used options are as follows: - `-o` is used to select the directory where the exported files are stored. - `-F` option is used to specify the maximum size of a single file (the unit here is `MiB`; inputs like `5GiB` or `8KB` are also acceptable). @@ -149,7 +149,7 @@ With the above options specified, Dumpling can have a higher degree of paralleli > > In most scenarios, you do not need to adjust the default data consistency options of Dumpling. -Dumpling uses the `--consistency ` option to control the way in which data is exported for "consistency assurance". For TiDB, data consistency is guaranteed by getting a snapshot of a certain timestamp by default (i.e. `--consistency snapshot`). When using snapshot for consistency, you can use the `--snapshot` parameter to specify the timestamp to be backed up. You can also use the following levels of consistency: +Dumpling uses the `--consistency ` option to control the way in which data is exported for "consistency assurance". 
For TiDB, data consistency is guaranteed by getting a snapshot of a certain timestamp by default (i.e. `--consistency snapshot`). When using snapshot for consistency, you can use the `--snapshot` option to specify the timestamp to be backed up. You can also use the following levels of consistency: - `flush`: Use [`FLUSH TABLES WITH READ LOCK`](https://dev.mysql.com/doc/refman/8.0/en/flush.html#flush-tables-with-read-lock) to ensure consistency. - `snapshot`: Get a consistent snapshot of the specified timestamp and export it. From dd37a73a27e7da0e5dd60936e2a495156ff9073f Mon Sep 17 00:00:00 2001 From: Lilian Lee Date: Fri, 31 Jul 2020 13:13:13 +0800 Subject: [PATCH 16/17] Update note format and add inline code format --- download-ecosystem-tools.md | 1 + dumpling-overview.md | 79 ++++++++++++++++++------------------- 2 files changed, 40 insertions(+), 40 deletions(-) diff --git a/download-ecosystem-tools.md b/download-ecosystem-tools.md index adab9abcf8bc5..e684159eed32b 100644 --- a/download-ecosystem-tools.md +++ b/download-ecosystem-tools.md @@ -70,6 +70,7 @@ Download [Dumpling](/dumpling-overview.md) from the links below: > **Note:** > > The `{version}` in the download link is the version number of Dumpling. For example, the link for downloading the `v4.0.2` version of Dumpling is `https://download.pingcap.org/tidb-toolkit-v4.0.2-linux-amd64.tar.gz`. You can view the currently released versions in [Dumpling Releases](https://github.com/pingcap/dumpling/releases). +> > Dumpling supports arm64 linux. You can replace `amd64` in the download link with `arm64`, which means the `arm64` version of Dumpling. 
## Syncer, Loader, and Mydumper diff --git a/dumpling-overview.md b/dumpling-overview.md index a8fbad076565e..ddfbc210408ae 100644 --- a/dumpling-overview.md +++ b/dumpling-overview.md @@ -125,13 +125,12 @@ Dumpling can also export specific databases with the `-B` option or specific tab > **Note:** > > - The `--filter` option and the `-T` option cannot be used at the same time. -> > - The `-T` option can only accept a complete form of inputs like `database-name.table-name`, and inputs with only the table name are not accepted. Example: Dumpling cannot recognize `-T WorkOrder`. Examples: --`-B employees` exports the `employees` database --`-T employees.WorkOrder` exports the `employees.WorkOrder` table +- `-B employees` exports the `employees` database. +- `-T employees.WorkOrder` exports the `employees.WorkOrder` table. ### Improve export efficiency through concurrency @@ -217,40 +216,40 @@ Finally, all the exported data can be imported back to TiDB using [Lightning](/t | Options | Usage | Default value | | --------| --- | --- | -| -V or --version | Output the Dumpling version and exit directly | -| -B or --database | Export specified databases | -| -T or --tables-list | Export specified tables | -| -f or --filter | Export tables that match the filter pattern. For the filter syntax, see [table-filter](/table-filter.md). | `"\*.\*"` (export all databases or tables) | -| --case-sensitive | whether table-filter is case-sensitive | false (case-insensitive) | -| -h or --host| The IP address of the connected database host | "127.0.0.1" | -| -t or --threads | The number of concurrent backup threads | 4 | -| -r or --rows | Divide the table into specified rows of data (generally applicable for concurrent operations of splitting a large table into multiple files. | -| -L or --logfile | Log output address. 
If it is empty, the log will be output to the console | "" | -| --loglevel | Log level {debug,info,warn,error,dpanic,panic,fatal} | "info" | -| --logfmt | Log output format {text,json} | "text" | -| -d or --no-data | Do not export data (suitable for scenarios where only the schema is exported) | -| --no-header | Export CSV files of the tables without generating header | -| -W or --no-views| Do not export the views | true | -| -m or --no-schemas | Do not export the schema with only the data exported | -| -s or--statement-size | Control the size of the `INSERT` statements; the unit is bytes | -| -F or --filesize | The file size of the divided tables. The unit must be specified such as `128B`, `64KiB`, `32MiB`, and `1.5GiB`. | -| --filetype| Exported file type (csv/sql) | "sql" | -| -o or --output | Exported file path | "./export-${time}" | -| -S or --sql | Export data according to the specified SQL statement. This command does not support concurrent export. | -| --consistency | flush: use FTWRL before the dump
snapshot: dump the TiDB data of a specific snapshot of a TSO
lock: execute `lock tables read` on all tables to be dumped
none: dump without adding locks, which cannot guarantee consistency
auto: MySQL defaults to using flush, TiDB defaults to using snapshot | "auto" | -| --snapshot | snapshot TSO; valid only when `consistency=snapshot` | -| --where | Specify the scope of the table backup through the `where` condition | -| -p or --password | The password of the connected database host | -| -P or --port | The port of the connected database host | 4000 | -| -u or --user | The username of the connected database host | "root" | -| --dump-empty-database | Export the `CREATE DATABASE` statements of the empty databases | true | -| --ca | The address of the certificate authority file for TLS connection | -| --cert | The address of the client certificate file for TLS connection | -| --key | The address of the client private key file for TLS connection | -| --csv-delimiter | Delimiter of character type variables in CSV files | '"' | -| --csv-separator | Separator of each value in CSV files | ',' | -| --csv-null-value | Representation of null values in CSV files | "\\N" | -| --escape-backslash | Use backslash (`\`) to escape special characters in the export file | true | -| --output-filename-template | The filename templates represented in the format of [golang template](https://golang.org/pkg/text/template/#hdr-Arguments)
Support the `{{.DB}}`, `{{.Table}}`, and `{{.Index}}` arguments
The three arguments represent the database name, table name, and chunk ID of the data file | '{{.DB}}.{{.Table}}.{{.Index}}' | -| --status-addr | Dumpling's service address, including the address for Prometheus to pull metrics and pprof debugging | ":8281" | -| --tidb-mem-quota-query | The memory limit of exporting SQL statements by a single Dumpling command, the unit is byte, and the default value is 32 GB | 34359738368 | +| `-V` or `--version` | Output the Dumpling version and exit directly | +| `-B` or `--database` | Export specified databases | +| `-T` or `--tables-list` | Export specified tables | +| `-f` or `--filter` | Export tables that match the filter pattern. For the filter syntax, see [table-filter](/table-filter.md). | `"\*.\*"` (export all databases or tables) | +| `--case-sensitive` | whether table-filter is case-sensitive | false (case-insensitive) | +| `-h` or `--host` | The IP address of the connected database host | "127.0.0.1" | +| `-t` or `--threads` | The number of concurrent backup threads | 4 | +| `-r` or `--rows` | Divide the table into specified rows of data (generally applicable for concurrent operations of splitting a large table into multiple files. | +| `-L` or `--logfile` | Log output address. If it is empty, the log will be output to the console | "" | +| `--loglevel` | Log level {debug,info,warn,error,dpanic,panic,fatal} | "info" | +| `--logfmt` | Log output format {text,json} | "text" | +| `-d` or `--no-data` | Do not export data (suitable for scenarios where only the schema is exported) | +| `--no-header` | Export CSV files of the tables without generating header | +| `-W` or `--no-views` | Do not export the views | true | +| `-m` or `--no-schemas` | Do not export the schema with only the data exported | +| `-s` or `--statement-size` | Control the size of the `INSERT` statements; the unit is bytes | +| `-F` or `--filesize` | The file size of the divided tables. 
The unit must be specified such as `128B`, `64KiB`, `32MiB`, and `1.5GiB`. | +| `--filetype` | Exported file type (csv/sql) | "sql" | +| `-o` or `--output` | Exported file path | "./export-${time}" | +| `-S` or `--sql` | Export data according to the specified SQL statement. This command does not support concurrent export. | +| `--consistency` | flush: use FTWRL before the dump
snapshot: dump the TiDB data of a specific snapshot of a TSO
lock: execute `lock tables read` on all tables to be dumped
none: dump without adding locks, which cannot guarantee consistency
auto: MySQL defaults to using flush, TiDB defaults to using snapshot | "auto" | +| `--snapshot` | Snapshot TSO; valid only when `consistency=snapshot` | +| `--where` | Specify the scope of the table backup through the `where` condition | +| `-p` or `--password` | The password of the connected database host | +| `-P` or `--port` | The port of the connected database host | 4000 | +| `-u` or `--user` | The username of the connected database host | "root" | +| `--dump-empty-database` | Export the `CREATE DATABASE` statements of the empty databases | true | +| `--ca` | The address of the certificate authority file for TLS connection | +| `--cert` | The address of the client certificate file for TLS connection | +| `--key` | The address of the client private key file for TLS connection | +| `--csv-delimiter` | Delimiter of character type variables in CSV files | '"' | +| `--csv-separator` | Separator of each value in CSV files | ',' | +| `--csv-null-value` | Representation of null values in CSV files | "\\N" | +| `--escape-backslash` | Use backslash (`\`) to escape special characters in the export file | true | +| `--output-filename-template` | The filename templates represented in the format of [golang template](https://golang.org/pkg/text/template/#hdr-Arguments)
Support the `{{.DB}}`, `{{.Table}}`, and `{{.Index}}` arguments
The three arguments represent the database name, table name, and chunk ID of the data file | '{{.DB}}.{{.Table}}.{{.Index}}' | +| `--status-addr` | Dumpling's service address, including the address for Prometheus to pull metrics and pprof debugging | ":8281" | +| `--tidb-mem-quota-query` | The memory limit of exporting SQL statements by a single line of Dumpling command, the unit is byte, and the default value is 32 GB | 34359738368 | From f1ca496c824f37caf7cb561951d678b2de4c3df0 Mon Sep 17 00:00:00 2001 From: yikeke Date: Fri, 31 Jul 2020 16:42:06 +0800 Subject: [PATCH 17/17] fix 2 dead links from upstream --- br/backup-and-restore-tool.md | 2 +- faq/deploy-and-maintain-faq.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/br/backup-and-restore-tool.md b/br/backup-and-restore-tool.md index 732be737d9e67..fdd0425584422 100644 --- a/br/backup-and-restore-tool.md +++ b/br/backup-and-restore-tool.md @@ -6,7 +6,7 @@ aliases: ['/docs/dev/br/backup-and-restore-tool/','/docs/dev/reference/tools/br/ # Use BR to Back up and Restore Data -[Backup & Restore](http://github.com/pingcap/br) (BR) is a command-line tool for distributed backup and restoration of the TiDB cluster data. Compared with [`dumpling`](/export-or-backup-using-dumpling.md) and [`mydumper`/`loader`](/backup-and-restore-using-mydumper-lightning.md), BR is more suitable for scenarios of huge data volume. This document describes the BR command line, detailed use examples, best practices, restrictions, and introduces the implementation principles of BR. +[Backup & Restore](http://github.com/pingcap/br) (BR) is a command-line tool for distributed backup and restoration of the TiDB cluster data. Compared with [`dumpling`](/backup-and-restore-using-dumpling-lightning.md) and [`mydumper`/`loader`](/backup-and-restore-using-mydumper-lightning.md), BR is more suitable for scenarios of huge data volume. 
This document describes the BR command line, detailed use examples, best practices, restrictions, and introduces the implementation principles of BR. ## Usage restrictions diff --git a/faq/deploy-and-maintain-faq.md b/faq/deploy-and-maintain-faq.md index b891264b5fc47..561cbd189c798 100644 --- a/faq/deploy-and-maintain-faq.md +++ b/faq/deploy-and-maintain-faq.md @@ -513,7 +513,7 @@ TiDB is not suitable for tables of small size (such as below ten million level), #### How to back up data in TiDB? -Currently, for the backup of a large volume of data, the preferred method is using [BR](/br/backup-and-restore-tool.md). Otherwise, the recommended tool is [Dumpling](/export-or-backup-using-dumpling.md). Although the official MySQL tool `mysqldump` is also supported in TiDB to back up and restore data, its performance is worse than [BR](/br/backup-and-restore-tool.md) and it needs much more time to back up and restore large volumes of data. +Currently, for the backup of a large volume of data, the preferred method is using [BR](/br/backup-and-restore-tool.md). Otherwise, the recommended tool is [Dumpling](/backup-and-restore-using-dumpling-lightning.md). Although the official MySQL tool `mysqldump` is also supported in TiDB to back up and restore data, its performance is worse than [BR](/br/backup-and-restore-tool.md) and it needs much more time to back up and restore large volumes of data. ## Monitoring