From 2397b667b5c7d39dbc3c15dfd01e1b9955759c54 Mon Sep 17 00:00:00 2001 From: toutdesuite Date: Tue, 17 Mar 2020 14:40:03 +0800 Subject: [PATCH 1/2] cherry pick #1860 to release-2.1 Signed-off-by: sre-bot --- TOC.md | 9 +- .../{from-aurora.md => from-mysql-aurora.md} | 7 +- how-to/migrate/incrementally-from-mysql.md | 192 ------------------ reference/sql/statements/recover-table.md | 107 ++++++++++ reference/tools/user-guide.md | 189 +++++++++++++++++ 5 files changed, 304 insertions(+), 200 deletions(-) rename how-to/migrate/{from-aurora.md => from-mysql-aurora.md} (97%) delete mode 100644 how-to/migrate/incrementally-from-mysql.md create mode 100644 reference/sql/statements/recover-table.md create mode 100644 reference/tools/user-guide.md diff --git a/TOC.md b/TOC.md index 12648fa8bd556..90afb31cd16d4 100644 --- a/TOC.md +++ b/TOC.md @@ -68,11 +68,9 @@ - [Overview](/how-to/monitor/overview.md) - [Monitor a TiDB Cluster](/how-to/monitor/monitor-a-cluster.md) + Migrate - - [Overview](/how-to/migrate/overview.md) + - [Migration Tools User Guide](/reference/tools/user-guide.md) + Migrate from MySQL - - [Migrate the Full Data](/how-to/migrate/from-mysql.md) - - [Migrate the Incremental Data](/how-to/migrate/incrementally-from-mysql.md) - - [Migrate from Aurora](/how-to/migrate/from-aurora.md) + - [Migration Case of Amazon Aurora](/how-to/migrate/from-mysql-aurora.md) - [Migrate from CSV](/reference/tools/tidb-lightning/csv.md) + Maintain - [Common Ansible Operations](/how-to/deploy/orchestrated/ansible-operations.md) @@ -279,6 +277,7 @@ - [Error Handling](/reference/tidb-binlog/troubleshoot/error-handling.md) - [FAQ](/reference/tidb-binlog/faq.md) + Tools + + - [Tools User Guide](/reference/tools/user-guide.md) - [Mydumper](/reference/tools/mydumper.md) - [Syncer](/reference/tools/syncer.md) - [Loader](/reference/tools/loader.md) @@ -317,7 +316,7 @@ - [Skip or Replace Abnormal SQL Statements](/reference/tools/data-migration/skip-replace-sqls.md) - 
[Monitor](/reference/tools/data-migration/monitor.md) + Migrate from MySQL compatible database - - [Migrate from Aurora](/how-to/migrate/from-aurora.md) + - [Migrate from Amazon Aurora](/how-to/migrate/from-mysql-aurora.md) + Troubleshoot - [DM Troubleshooting](/reference/tools/data-migration/troubleshoot/dm.md) - [Error Description](/reference/tools/data-migration/troubleshoot/error-system.md) diff --git a/how-to/migrate/from-aurora.md b/how-to/migrate/from-mysql-aurora.md similarity index 97% rename from how-to/migrate/from-aurora.md rename to how-to/migrate/from-mysql-aurora.md index 99ed07dada529..4e85ffd3e9c9c 100644 --- a/how-to/migrate/from-aurora.md +++ b/how-to/migrate/from-mysql-aurora.md @@ -1,10 +1,11 @@ --- -title: Migrate from Amazon Aurora MySQL to TiDB -summary: Learn how to migrate from Amazon Aurora MySQL to TiDB by using TiDB Data Migration (DM). +title: Migrate from MySQL (Amazon Aurora) +summary: Learn how to migrate from MySQL (using a case of Amazon Aurora) to TiDB by using TiDB Data Migration (DM). category: how-to +aliases: ['/docs/dev/how-to/migrate/from-aurora/'] --- -# Migrate from Amazon Aurora MySQL to TiDB +# Migrate from MySQL (Amazon Aurora) This document describes how to migrate from [Amazon Aurora MySQL](https://aws.amazon.com/rds/aurora/details/mysql-details/?nc1=h_ls) to TiDB by using TiDB Data Migration (DM). diff --git a/how-to/migrate/incrementally-from-mysql.md b/how-to/migrate/incrementally-from-mysql.md deleted file mode 100644 index edd4b0f33a33e..0000000000000 --- a/how-to/migrate/incrementally-from-mysql.md +++ /dev/null @@ -1,192 +0,0 @@ ---- -title: Migrate Incrementally Using Syncer -summary: Use `mydumper`, `loader` and `syncer` tools to migrate data from MySQL to TiDB. -category: how-to ---- - -# Migrate Incrementally Using Syncer - -The [previous guide](/how-to/migrate/from-mysql.md) introduces how to import a full database from MySQL to TiDB using `mydumper`/`loader`. 
This methodology is not recommended for large databases with frequent updates, since it can lead to a larger downtime window during migration. It is instead recommended to use syncer. - -Syncer can be [downloaded as part of Enterprise Tools](/reference/tools/download.md). - -Assuming the data from `t1` and `t2` is already imported to TiDB using `mydumper`/`loader`. Now we hope that any updates to these two tables are replicated to TiDB in real time. - -## Obtain the position to replicate - -The data exported from MySQL contains a metadata file which includes the position information. Take the following metadata information as an example: - -``` -Started dump at: 2017-04-28 10:48:10 -SHOW MASTER STATUS: - Log: mysql-bin.000003 - Pos: 930143241 - GTID: - -Finished dump at: 2017-04-28 10:48:11 -``` - -The position information (`Pos: 930143241`) needs to be stored in the `syncer.meta` file for `syncer` to replicate: - -```bash -# cat syncer.meta -binlog-name = "mysql-bin.000003" -binlog-pos = 930143241 -``` - -> **Note:** -> -> The `syncer.meta` file only needs to be configured once when it is first used. The position will be automatically updated when binlog is replicated. - -## Start `syncer` - -The `config.toml` file for `syncer`: - -```toml -log-level = "info" -log-file = "syncer.log" -log-rotate = "day" - -server-id = 101 - -# The file path for meta: -meta = "./syncer.meta" -worker-count = 16 -batch = 1000 -flavor = "mysql" - -# The testing address for pprof. It can also be used by Prometheus to pull Syncer metrics. -status-addr = ":8271" - -# If you set its value to true, Syncer stops and exits when it encounters the DDL operation. -stop-on-ddl = false - -# max-retry is used for retry during network interruption. -max-retry = 100 - -# Specify the database name to be replicated. Support regular expressions. Start with '~' to use regular expressions. -# replicate-do-db = ["~^b.*","s1"] - -# Specify the database you want to ignore in replication. 
Support regular expressions. Start with '~' to use regular expressions. -# replicate-ignore-db = ["~^b.*","s1"] - -# skip-ddls skips the ddl statements. -# skip-ddls = ["^OPTIMIZE\\s+TABLE"] - -# skip-dmls skips the DML statements. The type value can be 'insert', 'update' and 'delete'. -# The 'delete' statements that skip-dmls skips in the foo.bar table: -# [[skip-dmls]] -# db-name = "foo" -# tbl-name = "bar" -# type = "delete" -# -# The 'delete' statements that skip-dmls skips in all tables: -# [[skip-dmls]] -# type = "delete" -# -# The 'delete' statements that skip-dmls skips in all foo.* tables: -# [[skip-dmls]] -# db-name = "foo" -# type = "delete" - -# Specify the db.table to be replicated. -# db-name and tbl-name do not support the `db-name ="dbname, dbname2"` format. -# [[replicate-do-table]] -# db-name ="dbname" -# tbl-name = "table-name" - -# [[replicate-do-table]] -# db-name ="dbname1" -# tbl-name = "table-name1" - -# Specify the db.table to be replicated. Support regular expressions. Start with '~' to use regular expressions. -# [[replicate-do-table]] -# db-name ="test" -# tbl-name = "~^a.*" - -# Specify the database table you want to ignore in replication. -# db-name and tbl-name do not support the `db-name ="dbname, dbname2"` format. -# [[replicate-ignore-table]] -# db-name = "your_db" -# tbl-name = "your_table" - -# Specify the database table you want to ignore in replication. Support regular expressions. Start with '~' to use regular expressions. -# [[replicate-ignore-table]] -# db-name ="test" -# tbl-name = "~^a.*" - -# The sharding replicating rules support wildcharacter. -# 1. The asterisk character ("*", also called "star") matches zero or more characters, -# For example, "doc*" matches "doc" and "document" but not "dodo"; -# The asterisk character must be in the end of the wildcard word, -# and there is only one asterisk in one wildcard word. -# 2. The question mark ("?") matches any single character. 
-# [[route-rules]] -# pattern-schema = "route_*" -# pattern-table = "abc_*" -# target-schema = "route" -# target-table = "abc" - -# [[route-rules]] -# pattern-schema = "route_*" -# pattern-table = "xyz_*" -# target-schema = "route" -# target-table = "xyz" - -[from] -host = "127.0.0.1" -user = "root" -password = "" -port = 3306 - -[to] -host = "127.0.0.1" -user = "root" -password = "" -port = 4000 -``` - -Start `syncer`: - -```bash -./bin/syncer -config config.toml -2016/10/27 15:22:01 binlogsyncer.go:226: [info] begin to sync binlog from position (mysql-bin.000003, 1280) -2016/10/27 15:22:01 binlogsyncer.go:130: [info] register slave for master server 127.0.0.1:3306 -2016/10/27 15:22:01 binlogsyncer.go:552: [info] rotate to (mysql-bin.000003, 1280) -2016/10/27 15:22:01 syncer.go:549: [info] rotate binlog to (mysql-bin.000003, 1280) -``` - -## Insert data into MySQL - -```bash -INSERT INTO t1 VALUES (4, 4), (5, 5); -``` - -## Log in TiDB and view the data - -```sql -mysql -h127.0.0.1 -P4000 -uroot -p -mysql> select * from t1; -+----+------+ -| id | age | -+----+------+ -| 1 | 1 | -| 2 | 2 | -| 3 | 3 | -| 4 | 4 | -| 5 | 5 | -+----+------+ -``` - -`syncer` outputs the current replicated data statistics every 30 seconds: - -```bash -2017/06/08 01:18:51 syncer.go:934: [info] [syncer]total events = 15, total tps = 130, recent tps = 4, -master-binlog = (ON.000001, 11992), master-binlog-gtid=53ea0ed1-9bf8-11e6-8bea-64006a897c73:1-74, -syncer-binlog = (ON.000001, 2504), syncer-binlog-gtid = 53ea0ed1-9bf8-11e6-8bea-64006a897c73:1-17 -2017/06/08 01:19:21 syncer.go:934: [info] [syncer]total events = 15, total tps = 191, recent tps = 2, -master-binlog = (ON.000001, 11992), master-binlog-gtid=53ea0ed1-9bf8-11e6-8bea-64006a897c73:1-74, -syncer-binlog = (ON.000001, 2504), syncer-binlog-gtid = 53ea0ed1-9bf8-11e6-8bea-64006a897c73:1-35 -``` - -You can see that by using `syncer`, the updates in MySQL are automatically replicated in TiDB. 
diff --git a/reference/sql/statements/recover-table.md b/reference/sql/statements/recover-table.md new file mode 100644 index 0000000000000..4289fe9a6c8ac --- /dev/null +++ b/reference/sql/statements/recover-table.md @@ -0,0 +1,107 @@ +--- +title: RECOVER TABLE +summary: An overview of the usage of RECOVER TABLE for the TiDB database. +category: reference +--- + +# RECOVER TABLE + +`RECOVER TABLE` is used to recover a deleted table and the data on it within the GC (Garbage Collection) life time after the `DROP TABLE` statement is executed. + +## Syntax + +{{< copyable "sql" >}} + +```sql +RECOVER TABLE table_name +``` + +{{< copyable "sql" >}} + +```sql +RECOVER TABLE BY JOB ddl_job_id +``` + +> **Note:** +> +> + If a table is deleted and the GC lifetime is out, the table cannot be recovered with `RECOVER TABLE`. Execution of `RECOVER TABLE` in this scenario returns an error like: `snapshot is older than GC safe point 2019-07-10 13:45:57 +0800 CST`. +> +> + If the TiDB version is 3.0.0 or later, it is not recommended for you to use `RECOVER TABLE` when TiDB Binlog is used. +> +> + `RECOVER TABLE` is supported in the Binlog version 3.0.1, so you can use `RECOVER TABLE` in the following three situations: +> +> - Binglog version is 3.0.1 or later. +> - TiDB 3.0 is used both in the upstream cluster and the downstream cluster. +> - The GC life time of the slave cluster must be longer than that of the master cluster. However, as latency occurs during data replication between upstream and downstream databases, data recovery might fail in the downstream. + +### Troubleshoot errors during TiDB Binlog replication + +When you use `RECOVER TABLE` in the upstream TiDB during TiDB Binlog replication, TiDB Binlog might be interrupted in the following three situations: + ++ The downstream database does not support the `RECOVER TABLE` statement. 
An error instance: `check the manual that corresponds to your MySQL server version for the right syntax to use near 'RECOVER TABLE table_name'`. + ++ The GC life time is not consistent between the upstream database and the downstream database. An error instance: `snapshot is older than GC safe point 2019-07-10 13:45:57 +0800 CST`. + ++ Latency occurs during replication between upstream and downstream databases. An error instance: `snapshot is older than GC safe point 2019-07-10 13:45:57 +0800 CST`. + +For the above three situations, you can resume data replication from TiDB Binlog with a [full import of the deleted table](/reference/tools/user-guide.md#full-backup-and-restore-of-tidb-cluster-data-1). + +## Examples + ++ Recover the deleted table according to the table name. + + {{< copyable "sql" >}} + + ```sql + DROP TABLE t; + ``` + + {{< copyable "sql" >}} + + ```sql + RECOVER TABLE t; + ``` + + This method searches the recent DDL job history and locates the first DDL operation of the `DROP TABLE` type, and then recovers the deleted table with the name identical to the one table name specified in the `RECOVER TABLE` statement. + ++ Recover the deleted table according to the table's `DDL JOB ID` used. + + Suppose that you had deleted the table `t` and created another `t`, and again you deleted the newly created `t`. Then, if you want to recover the `t` deleted in the first place, you must use the method that specifies the `DDL JOB ID`. + + {{< copyable "sql" >}} + + ```sql + DROP TABLE t; + ``` + + {{< copyable "sql" >}} + + ```sql + ADMIN SHOW DDL JOBS 1; + ``` + + The second statement above is used to search for the table's `DDL JOB ID` to delete `t`. In the following example, the ID is `53`. 
+ + ``` + +--------+---------+------------+------------+--------------+-----------+----------+-----------+-----------------------------------+--------+ + | JOB_ID | DB_NAME | TABLE_NAME | JOB_TYPE | SCHEMA_STATE | SCHEMA_ID | TABLE_ID | ROW_COUNT | START_TIME | STATE | + +--------+---------+------------+------------+--------------+-----------+----------+-----------+-----------------------------------+--------+ + | 53 | test | | drop table | none | 1 | 41 | 0 | 2019-07-10 13:23:18.277 +0800 CST | synced | + +--------+---------+------------+------------+--------------+-----------+----------+-----------+-----------------------------------+--------+ + ``` + + {{< copyable "sql" >}} + + ```sql + RECOVER TABLE BY JOB 53; + ``` + + This method recovers the deleted table via the `DDL JOB ID`. If the corresponding DDL job is not of the `DROP TABLE` type, an error occurs. + +## Implementation principle + +When deleting a table, TiDB only deletes the table metadata, and writes the table data (row data and index data) to be deleted to the `mysql.gc_delete_range` table. The GC Worker in the TiDB background periodically removes from the `mysql.gc_delete_range` table the keys that exceed the GC life time. + +Therefore, to recover a table, you only need to recover the table metadata and delete the corresponding row record in the `mysql.gc_delete_range` table before the GC Worker deletes the table data. You can use a snapshot read of TiDB to recover the table metadata. Refer to [Read Historical Data](/how-to/get-started/read-historical-data.md) for details. + +Table recovery is done by TiDB obtaining the table metadata through snapshot read, and then going through the process of table creation similar to `CREATE TABLE`. Therefore, `RECOVER TABLE` itself is, in essence, a kind of DDL operation. 
diff --git a/reference/tools/user-guide.md b/reference/tools/user-guide.md new file mode 100644 index 0000000000000..e28f0efe0deb1 --- /dev/null +++ b/reference/tools/user-guide.md @@ -0,0 +1,189 @@ +--- +title: TiDB Ecosystem Tools User Guide +category: reference +aliases: ['/docs/dev/how-to/migrate/from-mysql/','/docs/dev/how-to/migrate/incrementally-from-mysql/','/docs/dev/how-to/migrate/overview/'] +--- + +# TiDB Ecosystem Tools User Guide + +Currently, TiDB has multiple ecosystem tools. Some of them have overlapping functionality, and some are different versions of the same tool. This document introduces each of these tools, illustrates their relationship, and describes when to use which tool for each TiDB version. + +## TiDB ecosystem tools overview + +TiDB ecosystem tools can be divided into: + +- Data import tools, including full import tools, backup and restore tools, incremental import tools, and so forth. +- Data export tools, including full export tools, incremental export tools, and so forth. + +The two types of tools are discussed in detail below. + +### Data import tools + +#### Full import tool TiDB Lightning + +[TiDB Lightning](/reference/tools/tidb-lightning/overview.md) is a tool used for fast full import of data into a TiDB cluster. + +> **Note:** +> +> When you import data into TiDB using TiDB Lightning, there are two modes: +> +> - The default mode: Use `tikv-importer` as the backend. In this mode, the cluster cannot provide normal services during the data import process. It is used when you import large amounts (TBs) of data. +> - The second mode: Use `TiDB` as the backend (similar to Loader). The import speed is slower than that in the default mode. However, the second mode supports online import. + +The following are the basics of TiDB Lightning: + +- Input: + - Files output by Mydumper; + - CSV files. +- Compatibility: Compatible with TiDB v2.1 and later versions. +- Kubernetes: Supported.
See [Quickly restore data into a TiDB cluster in Kubernetes using TiDB Lightning](/tidb-in-kubernetes/maintain/lightning.md). + +#### Backup and restore tool BR + +[BR](/reference/tools/br/br.md) is a command-line tool used for distributed data backup and restoration for a TiDB cluster. Compared with Mydumper and Loader, BR allows you to finish backup and restore tasks with greater efficiency in scenarios of huge data volume. + +The following are the basics of BR: + +- [Types of backup files](/reference/tools/br/br.md#types-of-backup-files): The SST file and the `backupmeta` file. +- Compatibility: Compatible with TiDB v3.1 and v4.0 versions. +- Kubernetes: Supported. Relevant documents are on the way. + +#### Incremental and full import tool TiDB Data Migration + +[TiDB Data Migration (DM)](/reference/tools/data-migration/overview.md) is a tool used for data migration from MySQL/MariaDB into TiDB. It supports both full and incremental data replication. + +The following are the basics of DM: + +- Input: Full data and binlog data of MySQL/MariaDB. +- Output: SQL statements written to TiDB. +- Compatibility: Compatible with all TiDB versions. +- Kubernetes: In development. + +#### Full import tool Loader (Stop maintenance, not recommended) + +[Loader](/reference/tools/loader.md) is a lightweight full data import tool. Data is imported into TiDB in the form of SQL statements. Currently, this tool is gradually being replaced by [TiDB Lightning](#full-import-tool-tidb-lightning). For details, see [TiDB Lightning TiDB-backend Document](/reference/tools/tidb-lightning/tidb-backend.md#migrating-from-loader-to-tidb-lightning-tidb-backend). + +The following are the basics of Loader: + +- Input: Files output by Mydumper. +- Output: SQL statements written to TiDB. +- Compatibility: Compatible with all TiDB versions. +- Kubernetes: Supported. See [Backup and restore](/tidb-in-kubernetes/maintain/backup-and-restore.md).
+ +#### Incremental import tool Syncer (Stop maintenance, not recommended) + +[Syncer](/reference/tools/syncer.md) is a tool used for incremental import of real-time binlog data from MySQL/MariaDB into TiDB. It is recommended to use [TiDB Data Migration](#incremental-and-full-import-tool-tidb-data-migration) to replace Syncer. + +The following are the basics of Syncer: + +- Input: Binlog data of MySQL/MariaDB. +- Output: SQL statements written to TiDB. +- Compatibility: Compatible with all TiDB versions. +- Kubernetes: Not supported. + +### Data export tools + +#### Full export tool Mydumper + +[Mydumper](/reference/tools/mydumper.md) is a MySQL community tool used for full logical backups of MySQL that also works with TiDB. + +The following are the basics of Mydumper: + +- Input: MySQL/TiDB clusters. +- Output: SQL files. +- Compatibility: Compatible with all TiDB versions. +- Kubernetes: Supported. See [Backup and Restore](/tidb-in-kubernetes/maintain/backup-and-restore.md). + +#### Incremental export tool TiDB Binlog + +[TiDB Binlog](/reference/tidb-binlog/overview.md) is a tool used to collect binlog data from TiDB. It provides near real-time backup and replication to downstream platforms. + +The following are the basics of TiDB Binlog: + +- Input: TiDB clusters. +- Output: MySQL, TiDB, Kafka, or incremental backup files. +- Compatibility: Compatible with TiDB v2.1 and later versions. +- Kubernetes: Supported. See [TiDB Binlog Cluster Operations](/tidb-in-kubernetes/maintain/tidb-binlog.md) and [TiDB Binlog Drainer Configurations in Kubernetes](/tidb-in-kubernetes/reference/configuration/tidb-drainer.md). + +## Tools development roadmap + +To help you understand the relationships between the above tools, here is a brief introduction to the development roadmap of the TiDB ecosystem tools. + +### TiDB backup and restore + +Mydumper and Loader -> BR: + +Mydumper and Loader are inefficient since they back up and restore data at the logical level.
BR is much more efficient because it takes advantage of TiDB features for backup and restore tasks. BR is suited to scenarios with huge data volumes. + +### TiDB full data restore + +Loader -> TiDB Lightning: + +Loader is inefficient since it performs full data restoration using SQL. TiDB Lightning imports data into TiKV directly, so it is much more efficient and can be used for fast full import of large amounts (more than TBs) of data into a new TiDB cluster. + +TiDB Lightning also integrates the logical data import function of Loader and supports online data import. For details, see [TiDB Lightning TiDB-backend Document](/reference/tools/tidb-lightning/tidb-backend.md#migrating-from-loader-to-tidb-lightning-tidb-backend). + +### MySQL data migration + +- Mydumper, Loader and Syncer -> DM: + + It is tedious to migrate MySQL data to TiDB using Mydumper, Loader, and Syncer. DM provides an integrated data migration approach that improves usability. DM can also be used to merge sharded schemas and tables. + +- Loader -> TiDB Lightning: + + TiDB Lightning integrates the logical data import function of Loader. See [TiDB Lightning TiDB-backend document](/reference/tools/tidb-lightning/tidb-backend.md#migrating-from-loader-to-tidb-lightning-tidb-backend) for details. It is used to perform full data restoration. + +## Data migration solutions + +This section introduces data migration solutions in typical application scenarios for TiDB 2.1, 3.0, and 3.1. + +### Full link data migration solutions for v3.0 + +#### Migrating MySQL data to TiDB + +If the volume is more than TBs of data, the recommended migration steps are: + +1. Export full MySQL data using Mydumper; +2. Import full backup data from MySQL into a TiDB cluster using TiDB Lightning; +3. Replicate the incremental data of MySQL into TiDB.
+ +If the volume is less than TBs of data, it is recommended to migrate MySQL data to TiDB using DM (the migrating process includes full data import and incremental data replication). + +#### Replication of TiDB cluster data + +It is recommended that you use TiDB Binlog to replicate TiDB data to downstream TiDB/MySQL. + +#### Full backup and restore of TiDB cluster data + +The recommended steps are: + +1. Back up full data using Mydumper; +2. Restore full data into TiDB/MySQL using TiDB Lightning. + +### Full link data migration solutions for v3.1 + +#### Migrating MySQL data to TiDB + +If the volume is more than TBs of data, the recommended migration steps are: + +1. Export full MySQL data using Mydumper; +2. Import full backup data from MySQL into a TiDB cluster using TiDB Lightning; +3. Replicate the incremental data of MySQL into TiDB. + +If the volume is less than TBs of data, it is recommended to migrate MySQL data to TiDB using DM (the migrating process includes full data import and incremental data replication). + +#### Replication of TiDB cluster data + +It is recommended that you use TiDB Binlog to replicate TiDB data to downstream TiDB/MySQL. + +#### Full backup and restore of TiDB cluster data + +- Restore to TiDB + + - Back up full data using BR; + - Restore full data using BR. + +- Restore to MySQL + + - Back up full data using Mydumper; + - Restore full data using TiDB Lightning. 
From 6d19f8b88502ff78c23d15b1150bfff6ac769ea1 Mon Sep 17 00:00:00 2001 From: TomShawn <1135243111@qq.com> Date: Tue, 17 Mar 2020 15:34:57 +0800 Subject: [PATCH 2/2] resolve conflict and align #2243 --- TOC.md | 9 +- .../{from-mysql-aurora.md => from-aurora.md} | 5 +- how-to/migrate/incrementally-from-mysql.md | 192 ++++++++++++++++++ reference/sql/statements/recover-table.md | 107 ---------- reference/tools/user-guide.md | 189 ----------------- 5 files changed, 199 insertions(+), 303 deletions(-) rename how-to/migrate/{from-mysql-aurora.md => from-aurora.md} (98%) create mode 100644 how-to/migrate/incrementally-from-mysql.md delete mode 100644 reference/sql/statements/recover-table.md delete mode 100644 reference/tools/user-guide.md diff --git a/TOC.md b/TOC.md index 90afb31cd16d4..b714174dca031 100644 --- a/TOC.md +++ b/TOC.md @@ -68,9 +68,11 @@ - [Overview](/how-to/monitor/overview.md) - [Monitor a TiDB Cluster](/how-to/monitor/monitor-a-cluster.md) + Migrate - - [Migration Tools User Guide](/reference/tools/user-guide.md) + - [Overview](/how-to/migrate/overview.md) + Migrate from MySQL - - [Migration Case of Amazon Aurora](/how-to/migrate/from-mysql-aurora.md) + - [Migrate the Full Data](/how-to/migrate/from-mysql.md) + - [Migrate the Incremental Data](/how-to/migrate/incrementally-from-mysql.md) + - [Migrate from MySQL/Aurora](/how-to/migrate/from-aurora.md) - [Migrate from CSV](/reference/tools/tidb-lightning/csv.md) + Maintain - [Common Ansible Operations](/how-to/deploy/orchestrated/ansible-operations.md) @@ -277,7 +279,6 @@ - [Error Handling](/reference/tidb-binlog/troubleshoot/error-handling.md) - [FAQ](/reference/tidb-binlog/faq.md) + Tools - + - [Tools User Guide](/reference/tools/user-guide.md) - [Mydumper](/reference/tools/mydumper.md) - [Syncer](/reference/tools/syncer.md) - [Loader](/reference/tools/loader.md) @@ -316,7 +317,7 @@ - [Skip or Replace Abnormal SQL Statements](/reference/tools/data-migration/skip-replace-sqls.md) - 
[Monitor](/reference/tools/data-migration/monitor.md) + Migrate from MySQL compatible database - - [Migrate from Amazon Aurora](/how-to/migrate/from-mysql-aurora.md) + - [Migrate from Amazon Aurora](/how-to/migrate/from-aurora.md) + Troubleshoot - [DM Troubleshooting](/reference/tools/data-migration/troubleshoot/dm.md) - [Error Description](/reference/tools/data-migration/troubleshoot/error-system.md) diff --git a/how-to/migrate/from-mysql-aurora.md b/how-to/migrate/from-aurora.md similarity index 98% rename from how-to/migrate/from-mysql-aurora.md rename to how-to/migrate/from-aurora.md index 4e85ffd3e9c9c..0c9382062a87d 100644 --- a/how-to/migrate/from-mysql-aurora.md +++ b/how-to/migrate/from-aurora.md @@ -1,11 +1,10 @@ --- -title: Migrate from MySQL (Amazon Aurora) +title: Migrate from MySQL (Amazon Aurora) to TiDB summary: Learn how to migrate from MySQL (using a case of Amazon Aurora) to TiDB by using TiDB Data Migration (DM). category: how-to -aliases: ['/docs/dev/how-to/migrate/from-aurora/'] --- -# Migrate from MySQL (Amazon Aurora) +# Migrate from MySQL (Amazon Aurora) to TiDB This document describes how to migrate from [Amazon Aurora MySQL](https://aws.amazon.com/rds/aurora/details/mysql-details/?nc1=h_ls) to TiDB by using TiDB Data Migration (DM). diff --git a/how-to/migrate/incrementally-from-mysql.md b/how-to/migrate/incrementally-from-mysql.md new file mode 100644 index 0000000000000..edd4b0f33a33e --- /dev/null +++ b/how-to/migrate/incrementally-from-mysql.md @@ -0,0 +1,192 @@ +--- +title: Migrate Incrementally Using Syncer +summary: Use `mydumper`, `loader` and `syncer` tools to migrate data from MySQL to TiDB. +category: how-to +--- + +# Migrate Incrementally Using Syncer + +The [previous guide](/how-to/migrate/from-mysql.md) introduces how to import a full database from MySQL to TiDB using `mydumper`/`loader`. 
This methodology is not recommended for large databases with frequent updates, since it can lead to a larger downtime window during migration. It is instead recommended to use syncer. + +Syncer can be [downloaded as part of Enterprise Tools](/reference/tools/download.md). + +Assuming the data from `t1` and `t2` is already imported to TiDB using `mydumper`/`loader`. Now we hope that any updates to these two tables are replicated to TiDB in real time. + +## Obtain the position to replicate + +The data exported from MySQL contains a metadata file which includes the position information. Take the following metadata information as an example: + +``` +Started dump at: 2017-04-28 10:48:10 +SHOW MASTER STATUS: + Log: mysql-bin.000003 + Pos: 930143241 + GTID: + +Finished dump at: 2017-04-28 10:48:11 +``` + +The position information (`Pos: 930143241`) needs to be stored in the `syncer.meta` file for `syncer` to replicate: + +```bash +# cat syncer.meta +binlog-name = "mysql-bin.000003" +binlog-pos = 930143241 +``` + +> **Note:** +> +> The `syncer.meta` file only needs to be configured once when it is first used. The position will be automatically updated when binlog is replicated. + +## Start `syncer` + +The `config.toml` file for `syncer`: + +```toml +log-level = "info" +log-file = "syncer.log" +log-rotate = "day" + +server-id = 101 + +# The file path for meta: +meta = "./syncer.meta" +worker-count = 16 +batch = 1000 +flavor = "mysql" + +# The testing address for pprof. It can also be used by Prometheus to pull Syncer metrics. +status-addr = ":8271" + +# If you set its value to true, Syncer stops and exits when it encounters the DDL operation. +stop-on-ddl = false + +# max-retry is used for retry during network interruption. +max-retry = 100 + +# Specify the database name to be replicated. Support regular expressions. Start with '~' to use regular expressions. +# replicate-do-db = ["~^b.*","s1"] + +# Specify the database you want to ignore in replication. 
Support regular expressions. Start with '~' to use regular expressions. +# replicate-ignore-db = ["~^b.*","s1"] + +# skip-ddls skips the ddl statements. +# skip-ddls = ["^OPTIMIZE\\s+TABLE"] + +# skip-dmls skips the DML statements. The type value can be 'insert', 'update' and 'delete'. +# The 'delete' statements that skip-dmls skips in the foo.bar table: +# [[skip-dmls]] +# db-name = "foo" +# tbl-name = "bar" +# type = "delete" +# +# The 'delete' statements that skip-dmls skips in all tables: +# [[skip-dmls]] +# type = "delete" +# +# The 'delete' statements that skip-dmls skips in all foo.* tables: +# [[skip-dmls]] +# db-name = "foo" +# type = "delete" + +# Specify the db.table to be replicated. +# db-name and tbl-name do not support the `db-name ="dbname, dbname2"` format. +# [[replicate-do-table]] +# db-name ="dbname" +# tbl-name = "table-name" + +# [[replicate-do-table]] +# db-name ="dbname1" +# tbl-name = "table-name1" + +# Specify the db.table to be replicated. Support regular expressions. Start with '~' to use regular expressions. +# [[replicate-do-table]] +# db-name ="test" +# tbl-name = "~^a.*" + +# Specify the database table you want to ignore in replication. +# db-name and tbl-name do not support the `db-name ="dbname, dbname2"` format. +# [[replicate-ignore-table]] +# db-name = "your_db" +# tbl-name = "your_table" + +# Specify the database table you want to ignore in replication. Support regular expressions. Start with '~' to use regular expressions. +# [[replicate-ignore-table]] +# db-name ="test" +# tbl-name = "~^a.*" + +# The sharding replicating rules support wildcharacter. +# 1. The asterisk character ("*", also called "star") matches zero or more characters, +# For example, "doc*" matches "doc" and "document" but not "dodo"; +# The asterisk character must be in the end of the wildcard word, +# and there is only one asterisk in one wildcard word. +# 2. The question mark ("?") matches any single character. 
+# [[route-rules]] +# pattern-schema = "route_*" +# pattern-table = "abc_*" +# target-schema = "route" +# target-table = "abc" + +# [[route-rules]] +# pattern-schema = "route_*" +# pattern-table = "xyz_*" +# target-schema = "route" +# target-table = "xyz" + +[from] +host = "127.0.0.1" +user = "root" +password = "" +port = 3306 + +[to] +host = "127.0.0.1" +user = "root" +password = "" +port = 4000 +``` + +Start `syncer`: + +```bash +./bin/syncer -config config.toml +2016/10/27 15:22:01 binlogsyncer.go:226: [info] begin to sync binlog from position (mysql-bin.000003, 1280) +2016/10/27 15:22:01 binlogsyncer.go:130: [info] register slave for master server 127.0.0.1:3306 +2016/10/27 15:22:01 binlogsyncer.go:552: [info] rotate to (mysql-bin.000003, 1280) +2016/10/27 15:22:01 syncer.go:549: [info] rotate binlog to (mysql-bin.000003, 1280) +``` + +## Insert data into MySQL + +```sql +INSERT INTO t1 VALUES (4, 4), (5, 5); +``` + +## Log in to TiDB and view the data + +```sql +mysql -h127.0.0.1 -P4000 -uroot -p +mysql> select * from t1; ++----+------+ +| id | age | ++----+------+ +| 1 | 1 | +| 2 | 2 | +| 3 | 3 | +| 4 | 4 | +| 5 | 5 | ++----+------+ +``` + +`syncer` outputs the current replication statistics every 30 seconds: + +```bash +2017/06/08 01:18:51 syncer.go:934: [info] [syncer]total events = 15, total tps = 130, recent tps = 4, +master-binlog = (ON.000001, 11992), master-binlog-gtid=53ea0ed1-9bf8-11e6-8bea-64006a897c73:1-74, +syncer-binlog = (ON.000001, 2504), syncer-binlog-gtid = 53ea0ed1-9bf8-11e6-8bea-64006a897c73:1-17 +2017/06/08 01:19:21 syncer.go:934: [info] [syncer]total events = 15, total tps = 191, recent tps = 2, +master-binlog = (ON.000001, 11992), master-binlog-gtid=53ea0ed1-9bf8-11e6-8bea-64006a897c73:1-74, +syncer-binlog = (ON.000001, 2504), syncer-binlog-gtid = 53ea0ed1-9bf8-11e6-8bea-64006a897c73:1-35 +``` + +You can see that by using `syncer`, the updates in MySQL are automatically replicated to TiDB.
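The step of copying the binlog position from the mydumper metadata file into `syncer.meta` can be scripted. The following is a minimal sketch, not part of Syncer itself: the function name `metadata_to_syncer_meta` is hypothetical, and it assumes the metadata file has the layout shown in the example above, with the first `Log:` and `Pos:` entries belonging to the `SHOW MASTER STATUS` section.

```python
import re


def metadata_to_syncer_meta(metadata_text: str) -> str:
    """Build syncer.meta content from the text of a mydumper metadata file.

    Assumption: the first "Log:" and "Pos:" entries in the file are the
    ones reported under SHOW MASTER STATUS.
    """
    log = re.search(r"^\s*Log:\s*(\S+)\s*$", metadata_text, re.MULTILINE)
    pos = re.search(r"^\s*Pos:\s*(\d+)\s*$", metadata_text, re.MULTILINE)
    if log is None or pos is None:
        raise ValueError("metadata does not contain both Log and Pos entries")
    # Same two keys as the syncer.meta example in this guide.
    return f'binlog-name = "{log.group(1)}"\nbinlog-pos = {int(pos.group(1))}\n'


# Sample metadata copied from the example in this guide.
SAMPLE = """Started dump at: 2017-04-28 10:48:10
SHOW MASTER STATUS:
    Log: mysql-bin.000003
    Pos: 930143241
    GTID:

Finished dump at: 2017-04-28 10:48:11
"""

if __name__ == "__main__":
    print(metadata_to_syncer_meta(SAMPLE), end="")
```

Redirecting the output to `syncer.meta` gives the file that the `meta` option in `config.toml` points to. Because Syncer updates this file itself once replication starts, a script like this would only be run once, before the first start.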
diff --git a/reference/sql/statements/recover-table.md b/reference/sql/statements/recover-table.md deleted file mode 100644 index 4289fe9a6c8ac..0000000000000 --- a/reference/sql/statements/recover-table.md +++ /dev/null @@ -1,107 +0,0 @@ ---- -title: RECOVER TABLE -summary: An overview of the usage of RECOVER TABLE for the TiDB database. -category: reference ---- - -# RECOVER TABLE - -`RECOVER TABLE` is used to recover a deleted table and its data within the GC (Garbage Collection) life time after the `DROP TABLE` statement is executed. - -## Syntax - -{{< copyable "sql" >}} - -```sql -RECOVER TABLE table_name -``` - -{{< copyable "sql" >}} - -```sql -RECOVER TABLE BY JOB ddl_job_id -``` - -> **Note:** -> -> + If a table is deleted and the GC life time has elapsed, the table cannot be recovered with `RECOVER TABLE`. Execution of `RECOVER TABLE` in this scenario returns an error like: `snapshot is older than GC safe point 2019-07-10 13:45:57 +0800 CST`. -> -> + If the TiDB version is 3.0.0 or later, it is not recommended to use `RECOVER TABLE` when TiDB Binlog is used. -> -> + `RECOVER TABLE` is supported in Binlog version 3.0.1, so you can use `RECOVER TABLE` in the following three situations: -> -> - The Binlog version is 3.0.1 or later. -> - TiDB 3.0 is used both in the upstream cluster and the downstream cluster. -> - The GC life time of the downstream cluster must be longer than that of the upstream cluster. However, as latency occurs during data replication between upstream and downstream databases, data recovery might fail in the downstream. - -### Troubleshoot errors during TiDB Binlog replication - -When you use `RECOVER TABLE` in the upstream TiDB during TiDB Binlog replication, TiDB Binlog might be interrupted in the following three situations: - -+ The downstream database does not support the `RECOVER TABLE` statement.
An example error: `check the manual that corresponds to your MySQL server version for the right syntax to use near 'RECOVER TABLE table_name'`. - -+ The GC life time is not consistent between the upstream database and the downstream database. An example error: `snapshot is older than GC safe point 2019-07-10 13:45:57 +0800 CST`. - -+ Latency occurs during replication between upstream and downstream databases. An example error: `snapshot is older than GC safe point 2019-07-10 13:45:57 +0800 CST`. - -For the above three situations, you can resume data replication from TiDB Binlog with a [full import of the deleted table](/reference/tools/user-guide.md#full-backup-and-restore-of-tidb-cluster-data-1). - -## Examples - -+ Recover the deleted table according to the table name. - - {{< copyable "sql" >}} - - ```sql - DROP TABLE t; - ``` - - {{< copyable "sql" >}} - - ```sql - RECOVER TABLE t; - ``` - - This method searches the recent DDL job history and locates the first DDL operation of the `DROP TABLE` type, and then recovers the deleted table whose name is identical to the table name specified in the `RECOVER TABLE` statement. - -+ Recover the deleted table according to the `DDL JOB ID` of the job that deleted it. - - Suppose that you had deleted the table `t` and created another `t`, and again you deleted the newly created `t`. Then, if you want to recover the `t` deleted in the first place, you must use the method that specifies the `DDL JOB ID`. - - {{< copyable "sql" >}} - - ```sql - DROP TABLE t; - ``` - - {{< copyable "sql" >}} - - ```sql - ADMIN SHOW DDL JOBS 1; - ``` - - The second statement above is used to find the `DDL JOB ID` of the job that deleted `t`. In the following example, the ID is `53`.
- - ``` - +--------+---------+------------+------------+--------------+-----------+----------+-----------+-----------------------------------+--------+ - | JOB_ID | DB_NAME | TABLE_NAME | JOB_TYPE | SCHEMA_STATE | SCHEMA_ID | TABLE_ID | ROW_COUNT | START_TIME | STATE | - +--------+---------+------------+------------+--------------+-----------+----------+-----------+-----------------------------------+--------+ - | 53 | test | | drop table | none | 1 | 41 | 0 | 2019-07-10 13:23:18.277 +0800 CST | synced | - +--------+---------+------------+------------+--------------+-----------+----------+-----------+-----------------------------------+--------+ - ``` - - {{< copyable "sql" >}} - - ```sql - RECOVER TABLE BY JOB 53; - ``` - - This method recovers the deleted table via the `DDL JOB ID`. If the corresponding DDL job is not of the `DROP TABLE` type, an error occurs. - -## Implementation principle - -When deleting a table, TiDB only deletes the table metadata, and writes the table data (row data and index data) to be deleted to the `mysql.gc_delete_range` table. The GC Worker in the TiDB background periodically removes from the `mysql.gc_delete_range` table the keys that exceed the GC life time. - -Therefore, to recover a table, you only need to recover the table metadata and delete the corresponding row record in the `mysql.gc_delete_range` table before the GC Worker deletes the table data. You can use a snapshot read of TiDB to recover the table metadata. Refer to [Read Historical Data](/how-to/get-started/read-historical-data.md) for details. - -Table recovery is done by TiDB obtaining the table metadata through snapshot read, and then going through the process of table creation similar to `CREATE TABLE`. Therefore, `RECOVER TABLE` itself is, in essence, a kind of DDL operation. 
diff --git a/reference/tools/user-guide.md b/reference/tools/user-guide.md deleted file mode 100644 index e28f0efe0deb1..0000000000000 --- a/reference/tools/user-guide.md +++ /dev/null @@ -1,189 +0,0 @@ ---- -title: TiDB Ecosystem Tools User Guide -category: reference -aliases: ['/docs/dev/how-to/migrate/from-mysql/','/docs/dev/how-to/migrate/incrementally-from-mysql/','/docs/dev/how-to/migrate/overview/'] ---- - -# TiDB Ecosystem Tools User Guide - -Currently, TiDB has multiple ecosystem tools. Some of them have overlapping functionality, and some are different versions of the same tool. This document introduces each of these tools, illustrates their relationship, and describes when to use which tool for each TiDB version. - -## TiDB ecosystem tools overview - -TiDB ecosystem tools can be divided into: - -- Data import tools, including full import tools, backup and restore tools, incremental import tools, and so forth. -- Data export tools, including full export tools, incremental export tools, and so forth. - -The two types of tools are discussed in detail below. - -### Data import tools - -#### Full import tool TiDB Lightning - -[TiDB Lightning](/reference/tools/tidb-lightning/overview.md) is a tool used for fast full import of data into a TiDB cluster. - -> **Note:** -> -> When you import data into TiDB using TiDB Lightning, there are two modes: -> -> - The default mode: Use `tikv-importer` as the backend. In this mode, the cluster cannot provide normal services during the data import process. It is used when you import large amounts (TBs) of data. -> - The second mode: Use `TiDB` as the backend (similar to Loader). The import speed is slower than that in the default mode. However, the second mode supports online import. - -The following are the basics of TiDB Lightning: - -- Input: - - Files output by Mydumper; - - CSV files. -- Compatibility: Compatible with TiDB v2.1 and later versions. -- Kubernetes: Supported.
See [Quickly restore data into a TiDB cluster in Kubernetes using TiDB Lightning](/tidb-in-kubernetes/maintain/lightning.md). - -#### Backup and restore tool BR - -[BR](/reference/tools/br/br.md) is a command-line tool used for distributed data backup and restoration for a TiDB cluster. Compared with Mydumper and Loader, BR allows you to finish backup and restore tasks with greater efficiency in scenarios with huge data volumes. - -The following are the basics of BR: - -- [Types of backup files](/reference/tools/br/br.md#types-of-backup-files): The SST file and the `backupmeta` file. -- Compatibility: Compatible with TiDB v3.1 and v4.0. -- Kubernetes: Supported. Relevant documents are on the way. - -#### Incremental and full import tool TiDB Data Migration - -[TiDB Data Migration (DM)](/reference/tools/data-migration/overview.md) is a tool used for data migration from MySQL/MariaDB into TiDB. It supports both full and incremental data replication. - -The following are the basics of DM: - -- Input: Full data and binlog data of MySQL/MariaDB. -- Output: SQL statements written to TiDB. -- Compatibility: Compatible with all TiDB versions. -- Kubernetes: In development. - -#### Full import tool Loader (Stop maintenance, not recommended) - -[Loader](/reference/tools/loader.md) is a lightweight full data import tool. Data is imported into TiDB in the form of SQL statements. This tool is gradually being replaced by [TiDB Lightning](#full-import-tool-tidb-lightning); see [TiDB Lightning TiDB-backend Document](/reference/tools/tidb-lightning/tidb-backend.md#migrating-from-loader-to-tidb-lightning-tidb-backend). - -The following are the basics of Loader: - -- Input: Files output by Mydumper. -- Output: SQL statements written to TiDB. -- Compatibility: Compatible with all TiDB versions. -- Kubernetes: Supported. See [Backup and restore](/tidb-in-kubernetes/maintain/backup-and-restore.md).
- -#### Incremental import tool Syncer (Stop maintenance, not recommended) - -[Syncer](/reference/tools/syncer.md) is a tool used for incremental import of real-time binlog data from MySQL/MariaDB into TiDB. It is recommended to use [TiDB Data Migration](#incremental-and-full-import-tool-tidb-data-migration) to replace Syncer. - -The following are the basics of Syncer: - -- Input: Binlog data of MySQL/MariaDB. -- Output: SQL statements written to TiDB. -- Compatibility: Compatible with all TiDB versions. -- Kubernetes: Not supported. - -### Data export tools - -#### Full export tool Mydumper - -[Mydumper](/reference/tools/mydumper.md) is a MySQL community tool used for full logical backups of MySQL that also works with TiDB. - -The following are the basics of Mydumper: - -- Input: MySQL/TiDB clusters. -- Output: SQL files. -- Compatibility: Compatible with all TiDB versions. -- Kubernetes: Supported. See [Backup and Restore](/tidb-in-kubernetes/maintain/backup-and-restore.md). - -#### Incremental export tool TiDB Binlog - -[TiDB Binlog](/reference/tidb-binlog/overview.md) is a tool used to collect binlog data from TiDB. It provides near real-time backup and replication to downstream platforms. - -The following are the basics of TiDB Binlog: - -- Input: TiDB clusters. -- Output: MySQL, TiDB, Kafka or incremental backup files. -- Compatibility: Compatible with TiDB v2.1 and later versions. -- Kubernetes: Supported. See [TiDB Binlog Cluster Operations](/tidb-in-kubernetes/maintain/tidb-binlog.md) and [TiDB Binlog Drainer Configurations in Kubernetes](/tidb-in-kubernetes/reference/configuration/tidb-drainer.md). - -## Tools development roadmap - -To help you understand the relationships between the above tools, here is a brief introduction to the development roadmap of the TiDB ecosystem tools. - -### TiDB backup and restore - -Mydumper and Loader -> BR: - -Mydumper and Loader are inefficient since they back up and restore data on the logical level.
BR is much more efficient because it takes advantage of TiDB features for backup and restore tasks. BR is suitable for scenarios with huge data volumes. - -### TiDB full data restore - -Loader -> TiDB Lightning: - -Loader is inefficient since it performs full data restoration using SQL. TiDB Lightning imports data into TiKV directly, so it is much more efficient and can be used for fast full import of large amounts (TBs or more) of data into a new TiDB cluster. - -TiDB Lightning also integrates the logical data import function of Loader and supports online data import. For details, see [TiDB Lightning TiDB-backend Document](/reference/tools/tidb-lightning/tidb-backend.md#migrating-from-loader-to-tidb-lightning-tidb-backend). - -### MySQL data migration - -- Mydumper, Loader and Syncer -> DM: - - It is tedious to migrate MySQL data to TiDB using Mydumper, Loader, and Syncer. DM provides an integrated data migration approach that improves usability. DM can also be used to merge sharded schemas and tables. - -- Loader -> TiDB Lightning: - - TiDB Lightning integrates the logical data import function of Loader. See [TiDB Lightning TiDB-backend document](/reference/tools/tidb-lightning/tidb-backend.md#migrating-from-loader-to-tidb-lightning-tidb-backend) for details. It is used to perform full data restoration. - -## Data migration solutions - -For TiDB 2.1, 3.0, and 3.1 versions, this section introduces data migration solutions in typical application scenarios. - -### Full link data migration solutions for v3.0 - -#### Migrating MySQL data to TiDB - -If the data volume is in the TBs or more, the recommended migration steps are: - -1. Export full MySQL data using Mydumper; -2. Import full backup data from MySQL into a TiDB cluster using TiDB Lightning; -3. Replicate the incremental data of MySQL into TiDB.
- -If the data volume is below the TB level, it is recommended to migrate MySQL data to TiDB using DM (the migration process includes full data import and incremental data replication). - -#### Replication of TiDB cluster data - -It is recommended that you use TiDB Binlog to replicate TiDB data to downstream TiDB/MySQL. - -#### Full backup and restore of TiDB cluster data - -The recommended steps are: - -1. Back up full data using Mydumper; -2. Restore full data into TiDB/MySQL using TiDB Lightning. - -### Full link data migration solutions for v3.1 - -#### Migrating MySQL data to TiDB - -If the data volume is in the TBs or more, the recommended migration steps are: - -1. Export full MySQL data using Mydumper; -2. Import full backup data from MySQL into a TiDB cluster using TiDB Lightning; -3. Replicate the incremental data of MySQL into TiDB. - -If the data volume is below the TB level, it is recommended to migrate MySQL data to TiDB using DM (the migration process includes full data import and incremental data replication). - -#### Replication of TiDB cluster data - -It is recommended that you use TiDB Binlog to replicate TiDB data to downstream TiDB/MySQL. - -#### Full backup and restore of TiDB cluster data - -- Restore to TiDB - - - Back up full data using BR; - - Restore full data using BR. - -- Restore to MySQL - - - Back up full data using Mydumper; - - Restore full data using TiDB Lightning.