From 9433e7b29d60752cb226ed54fe9339562d0f1499 Mon Sep 17 00:00:00 2001 From: TomShawn <41534398+TomShawn@users.noreply.github.com> Date: Tue, 13 Apr 2021 18:09:51 +0800 Subject: [PATCH 1/4] cherry pick #5286 to release-4.0 Signed-off-by: ti-srebot --- TOC.md | 2 +- br/backup-and-restore-storages.md | 111 ++++++++++++++---- br/backup-and-restore-tool.md | 4 +- dumpling-overview.md | 6 +- faq/migration-tidb-faq.md | 2 +- sql-statements/sql-statement-backup.md | 4 +- sql-statements/sql-statement-restore.md | 4 +- table-filter.md | 4 +- tidb-lightning/monitor-tidb-lightning.md | 8 +- tidb-lightning/tidb-lightning-checkpoints.md | 6 +- .../tidb-lightning-configuration.md | 25 ++-- tidb-lightning/tidb-lightning-faq.md | 16 +-- tidb-lightning/tidb-lightning-glossary.md | 2 +- tidb-lightning/tidb-lightning-overview.md | 7 +- tidb-troubleshooting-map.md | 2 +- 15 files changed, 139 insertions(+), 64 deletions(-) diff --git a/TOC.md b/TOC.md index d9bf343c70bcf..59b4f00b43d10 100644 --- a/TOC.md +++ b/TOC.md @@ -167,7 +167,7 @@ + [BR Tool Overview](/br/backup-and-restore-tool.md) + [Use BR Command-line for Backup and Restoration](/br/backup-and-restore-tool.md) + [BR Use Cases](/br/backup-and-restore-use-cases.md) - + [BR Storages](/br/backup-and-restore-storages.md) + + [External Storages](/br/backup-and-restore-storages.md) + [BR FAQ](/br/backup-and-restore-faq.md) + TiDB Binlog + [Overview](/tidb-binlog/tidb-binlog-overview.md) diff --git a/br/backup-and-restore-storages.md b/br/backup-and-restore-storages.md index 0bae43d6a8cce..64e4c6d3143f7 100644 --- a/br/backup-and-restore-storages.md +++ b/br/backup-and-restore-storages.md @@ -1,12 +1,18 @@ --- +<<<<<<< HEAD title: BR Storages summary: Describes the storage URL format used in BR. aliases: ['/docs/stable/br/backup-and-restore-storages/','/docs/v4.0/br/backup-and-restore-storages/'] +======= +title: External Storages +summary: Describes the storage URL format used in BR, TiDB Lightning, and Dumpling. 
+aliases: ['/docs/dev/br/backup-and-restore-storages/'] +>>>>>>> 3f76f22b... *: generalize and link to the external storage docs from Lightning (#5286) --- -# BR Storages +# External Storages -BR supports reading and writing data on the local filesystem, as well as on Amazon S3 and Google Cloud Storage. These are distinguished by the URL scheme in the `--storage` parameter passed into BR. +Backup & Restore (BR), TiDB Lightning, and Dumpling support reading and writing data on the local filesystem and on Amazon S3. BR also supports reading and writing data on Google Cloud Storage (GCS). These are distinguished by the URL scheme in the `--storage` parameter passed into BR, in the `-d` parameter passed into TiDB Lightning, and in the `--output` (`-o`) parameter passed into Dumpling. ## Schemes @@ -19,19 +25,40 @@ The following services are supported: | Google Cloud Storage (GCS) | gcs, gs | `gcs://bucket-name/prefix/of/dest/` | | Write to nowhere (for benchmarking only) | noop | `noop://` | -## Parameters +## URL parameters Cloud storages such as S3 and GCS sometimes require additional configuration for connection. You can specify parameters for such configuration. 
For example: -{{< copyable "shell-regular" >}} ++ Use Dumpling to export data to S3: -```shell -./br backup full -u 127.0.0.1:2379 -s 's3://bucket-name/prefix?region=us-west-2' -``` + {{< copyable "shell-regular" >}} + + ```bash + ./dumpling -u root -h 127.0.0.1 -P 3306 -B mydb -F 256MiB \ + -o 's3://my-bucket/sql-backup?region=us-west-2' + ``` + ++ Use TiDB Lightning to import data from S3: + + {{< copyable "shell-regular" >}} + + ```bash + ./tidb-lightning --tidb-port=4000 --pd-urls=127.0.0.1:2379 --backend=local --sorted-kv-dir=/tmp/sorted-kvs \ + -d 's3://my-bucket/sql-backup?region=us-west-2' + ``` + ++ Use BR to back up data to GCS: -### S3 parameters + {{< copyable "shell-regular" >}} -| Parameter | Description | + ```bash + ./br backup full -u 127.0.0.1:2379 \ + -s 'gcs://bucket-name/prefix' + ``` + +### S3 URL parameters + +| URL parameter | Description | |----------:|---------| | `access-key` | The access key | | `secret-access-key` | The secret access key | @@ -46,30 +73,64 @@ Cloud storages such as S3 and GCS sometimes require additional configuration for > **Note:** > -> It is not recommended to pass in the access key and secret access key directly in the storage URL, because these keys are logged in plain text. BR tries to infer these keys from the environment in the following order: +> It is not recommended to pass in the access key and secret access key directly in the storage URL, because these keys are logged in plain text. The migration tools try to infer these keys from the environment in the following order: 1. `$AWS_ACCESS_KEY_ID` and `$AWS_SECRET_ACCESS_KEY` environment variables 2. `$AWS_ACCESS_KEY` and `$AWS_SECRET_KEY` environment variables -3. Shared credentials file on the BR node at the path specified by the `$AWS_SHARED_CREDENTIALS_FILE` environment variable -4. Shared credentials file on the BR node at `~/.aws/credentials` +3. 
Shared credentials file on the tool node at the path specified by the `$AWS_SHARED_CREDENTIALS_FILE` environment variable +4. Shared credentials file on the tool node at `~/.aws/credentials` 5. Current IAM role of the Amazon EC2 container 6. Current IAM role of the Amazon ECS task -### GCS parameters +### GCS URL parameters -| Parameter | Description | +| URL parameter | Description | |----------:|---------| -| `credentials-file` | The path to the credentials JSON file on the TiDB node | +| `credentials-file` | The path to the credentials JSON file on the tool node | | `storage-class` | Storage class of the uploaded objects (for example, `STANDARD`, `COLDLINE`) | | `predefined-acl` | Predefined ACL of the uploaded objects (for example, `private`, `project-private`) | -When `credentials-file` is not specified, BR will try to infer the credentials from the environment, in the following order: +When `credentials-file` is not specified, the migration tool will try to infer the credentials from the environment, in the following order: -1. Content of the file on the BR node at the path specified by the `$GOOGLE_APPLICATION_CREDENTIALS` environment variable -2. Content of the file on the BR node at `~/.config/gcloud/application_default_credentials.json` +1. Content of the file on the tool node at the path specified by the `$GOOGLE_APPLICATION_CREDENTIALS` environment variable +2. Content of the file on the tool node at `~/.config/gcloud/application_default_credentials.json` 3. When running in GCE or GAE, the credentials fetched from the metadata server. -## Sending credentials to TiKV +## Command-line parameters + +In addition to the URL parameters, BR and Dumpling also support specifying these configurations using command-line parameters. 
For example: + +{{< copyable "shell-regular" >}} + +```bash +./dumpling -u root -h 127.0.0.1 -P 3306 -B mydb -F 256MiB \ + -o 's3://my-bucket/sql-backup' \ + --s3.region 'us-west-2' +``` + +If you have specified URL parameters and command-line parameters at the same time, the URL parameters are overwritten by the command-line parameters. + +### S3 command-line parameters + +| Command-line parameter | Description | +|----------:|------| +| `--s3.region` | Amazon S3's service region, which defaults to `us-east-1`. | +| `--s3.endpoint` | The URL of custom endpoint for S3-compatible services. For example, `https://s3.example.com/`. | +| `--s3.storage-class` | The storage class of the upload object. For example, `STANDARD` and `STANDARD_IA`. | +| `--s3.sse` | The server-side encryption algorithm used to encrypt the upload. The value options are empty, `AES256` and `aws:kms`. | +| `--s3.sse-kms-key-id` | If `--s3.sse` is configured as `aws:kms`, this parameter is used to specify the KMS ID. | +| `--s3.acl` | The canned ACL of the upload object. For example, `private` and `authenticated-read`. | +| `--s3.provider` | The type of the S3-compatible service. The supported types are `aws`, `alibaba`, `ceph`, `netease` and `other`. | + +### GCS command-line parameters + +| Command-line parameter | Description | +|----------:|---------| +| `--gcs.credentials-file` | The path of the JSON-formatted credential on the tool node. | +| `--gcs.storage-class` | The storage type of the upload object, such as `STANDARD` and `COLDLINE`. | +| `--gcs.predefined-acl` | The pre-defined ACL of the upload object, such as `private` and `project-private`. | + +## BR sending credentials to TiKV By default, when using S3 and GCS destinations, BR will send the credentials to every TiKV nodes to reduce setup complexity. 
@@ -77,6 +138,16 @@ However, this is unsuitable on cloud environment, where every node has their own {{< copyable "shell-regular" >}} -```shell +```bash ./br backup full -c=0 -u pd-service:2379 -s 's3://bucket-name/prefix' ``` + +When using SQL statements to [back up](/sql-statements/sql-statement-backup.md) and [restore](/sql-statements/sql-statement-restore.md) data, you can add the `SEND_CREDENTIALS_TO_TIKV = FALSE` option: + +{{< copyable "sql" >}} + +```sql +BACKUP DATABASE * TO 's3://bucket-name/prefix' SEND_CREDENTIALS_TO_TIKV = FALSE; +``` + +This option is not supported in TiDB Lightning and Dumpling, because the two applications are currently standalone. diff --git a/br/backup-and-restore-tool.md b/br/backup-and-restore-tool.md index 4ad58ba3a979b..55cb5703685e8 100644 --- a/br/backup-and-restore-tool.md +++ b/br/backup-and-restore-tool.md @@ -180,7 +180,7 @@ In the Kubernetes environment, you can use the BR tool to back up TiDB cluster d > **Note:** > -> For Amazon S3 and Google Cloud Storage parameter descriptions, see the [BR Storages](/br/backup-and-restore-storages.md) document. +> For Amazon S3 and Google Cloud Storage parameter descriptions, see the [External Storages](/br/backup-and-restore-storages.md#url-parameters) document. 
- [Back up Data to S3-Compatible Storage Using BR](https://docs.pingcap.com/tidb-in-kubernetes/stable/backup-to-aws-s3-using-br) - [Restore Data from S3-Compatible Storage Using BR](https://docs.pingcap.com/tidb-in-kubernetes/stable/restore-from-aws-s3-using-br) @@ -194,4 +194,4 @@ In the Kubernetes environment, you can use the BR tool to back up TiDB cluster d - [Use BR Command-line](/br/use-br-command-line-tool.md) - [BR Use Cases](/br/backup-and-restore-use-cases.md) - [BR FAQ](/br/backup-and-restore-faq.md) -- [BR Storages](/br/backup-and-restore-storages.md) +- [External Storages](/br/backup-and-restore-storages.md) diff --git a/dumpling-overview.md b/dumpling-overview.md index 9de667a7c0397..2c54d0c52dfa8 100644 --- a/dumpling-overview.md +++ b/dumpling-overview.md @@ -159,7 +159,7 @@ export AWS_ACCESS_KEY_ID=${AccessKey} export AWS_SECRET_ACCESS_KEY=${SecretKey} ``` -Dumpling also supports reading credential files from `~/.aws/credentials`. For more Dumpling configuration, see the configuration of [BR storages](/br/backup-and-restore-storages.md), which is consistent with the Dumpling configuration. +Dumpling also supports reading credential files from `~/.aws/credentials`. For more Dumpling configuration, see the configuration of [External storages](/br/backup-and-restore-storages.md). When you back up data using Dumpling, explicitly specify the `--s3.region` parameter, which means the region of the S3 storage: @@ -311,7 +311,7 @@ After your operation is completed, set the GC time back (the default value is `1 update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life_time'; ``` -Finally, all the exported data can be imported back to TiDB using [Lightning](/tidb-lightning/tidb-lightning-backends.md). +Finally, all the exported data can be imported back to TiDB using [TiDB Lightning](/tidb-lightning/tidb-lightning-backends.md). 
## Option list of Dumpling @@ -335,7 +335,7 @@ Finally, all the exported data can be imported back to TiDB using [Lightning](/t | `-s` or `--statement-size` | Control the size of the `INSERT` statements; the unit is bytes | | `-F` or `--filesize` | The file size of the divided tables. The unit must be specified such as `128B`, `64KiB`, `32MiB`, and `1.5GiB`. | | `--filetype` | Exported file type (csv/sql) | "sql" | -| `-o` or `--output` | Exported file path | "./export-${time}" | +| `-o` or `--output` | The path of exported local files or [the URL of the external storage](/br/backup-and-restore-storages.md) | "./export-${time}" | | `-S` or `--sql` | Export data according to the specified SQL statement. This command does not support concurrent export. | | `--consistency` | flush: use FTWRL before the dump
snapshot: dump the TiDB data of a specific snapshot of a TSO
lock: execute `lock tables read` on all tables to be dumped
none: dump without adding locks, which cannot guarantee consistency
auto: use --consistency flush for MySQL; use --consistency snapshot for TiDB | "auto" | | `--snapshot` | Snapshot TSO; valid only when `consistency=snapshot` | diff --git a/faq/migration-tidb-faq.md b/faq/migration-tidb-faq.md index fce3d14ac41be..4d54fe661587b 100644 --- a/faq/migration-tidb-faq.md +++ b/faq/migration-tidb-faq.md @@ -169,7 +169,7 @@ If the amount of data that needs to be deleted at a time is very large, this loo ### How to improve the data loading speed in TiDB? -- The [Lightning](/tidb-lightning/tidb-lightning-overview.md) tool is developed for distributed data import. It should be noted that the data import process does not perform a complete transaction process for performance reasons. Therefore, the ACID constraint of the data being imported during the import process cannot be guaranteed. The ACID constraint of the imported data can only be guaranteed after the entire import process ends. Therefore, the applicable scenarios mainly include importing new data (such as a new table or a new index) or the full backup and restoring (truncate the original table and then import data). +- The [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) tool is developed for distributed data import. It should be noted that the data import process does not perform a complete transaction process for performance reasons. Therefore, the ACID constraint of the data being imported during the import process cannot be guaranteed. The ACID constraint of the imported data can only be guaranteed after the entire import process ends. Therefore, the applicable scenarios mainly include importing new data (such as a new table or a new index) or the full backup and restoring (truncate the original table and then import data). - Data loading in TiDB is related to the status of disks and the whole cluster. When loading data, pay attention to metrics like the disk usage rate of the host, TiClient Error, Backoff, Thread CPU and so on. 
You can analyze the bottlenecks using these metrics. ### What should I do if it is slow to reclaim storage space after deleting data? diff --git a/sql-statements/sql-statement-backup.md b/sql-statements/sql-statement-backup.md index ce84c08fd5464..ebfb1db92e576 100644 --- a/sql-statements/sql-statement-backup.md +++ b/sql-statements/sql-statement-backup.md @@ -98,7 +98,7 @@ BACKUP DATABASE * TO 'local:///mnt/backup/full/'; Note that the system tables (`mysql.*`, `INFORMATION_SCHEMA.*`, `PERFORMANCE_SCHEMA.*`, …) will not be included into the backup. -### Remote destinations +### External storages BR supports backing up data to S3 or GCS: @@ -108,7 +108,7 @@ BR supports backing up data to S3 or GCS: BACKUP DATABASE `test` TO 's3://example-bucket-2020/backup-05/?region=us-west-2&access-key={YOUR_ACCESS_KEY}&secret-access-key={YOUR_SECRET_KEY}'; ``` -The URL syntax is further explained in [BR storages](/br/backup-and-restore-storages.md). +The URL syntax is further explained in [External Storages](/br/backup-and-restore-storages.md). When running on cloud environment where credentials should not be distributed, set the `SEND_CREDENTIALS_TO_TIKV` option to `FALSE`: diff --git a/sql-statements/sql-statement-restore.md b/sql-statements/sql-statement-restore.md index 4ed0309a3e7d4..75d9efd8db126 100644 --- a/sql-statements/sql-statement-restore.md +++ b/sql-statements/sql-statement-restore.md @@ -89,7 +89,7 @@ RESTORE DATABASE `test` FROM 'local:///mnt/backup/2020/04/'; RESTORE TABLE `test`.`sbtest01`, `test`.`sbtest02` FROM 'local:///mnt/backup/2020/04/'; ``` -### Remote destinations +### External storages BR supports restoring data from S3 or GCS: @@ -99,7 +99,7 @@ BR supports restoring data from S3 or GCS: RESTORE DATABASE * FROM 's3://example-bucket-2020/backup-05/?region=us-west-2'; ``` -The URL syntax is further explained in [BR storages](/br/backup-and-restore-storages.md). 
+The URL syntax is further explained in [External Storages](/br/backup-and-restore-storages.md). When running on cloud environment where credentials should not be distributed, set the `SEND_CREDENTIALS_TO_TIKV` option to `FALSE`: diff --git a/table-filter.md b/table-filter.md index 5ce306ad0c8fb..d3db62a5c568d 100644 --- a/table-filter.md +++ b/table-filter.md @@ -36,7 +36,7 @@ Table filters can be applied to the tools using multiple `-f` or `--filter` comm # ^~~~~~~~~~~~~~~~~~~~~~~ ``` -* [Lightning](/tidb-lightning/tidb-lightning-overview.md): +* [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md): {{< copyable "shell-regular" >}} @@ -49,7 +49,7 @@ Table filters can be applied to the tools using multiple `-f` or `--filter` comm Table filters in TOML files are specified as [array of strings](https://toml.io/en/v1.0.0-rc.1#section-15). The following lists the example usage in each tool. -* Lightning: +* TiDB Lightning: ```toml [mydumper] diff --git a/tidb-lightning/monitor-tidb-lightning.md b/tidb-lightning/monitor-tidb-lightning.md index c64e1cbc9b2f7..b5d6652d80266 100644 --- a/tidb-lightning/monitor-tidb-lightning.md +++ b/tidb-lightning/monitor-tidb-lightning.md @@ -56,7 +56,7 @@ Otherwise, the dashboard JSON can be imported from **TiKV** - 6.3.4 `Checkpoint for … has invalid status:(error code)` - - Cause: Checkpoint is enabled, and Lightning/Importer has previously abnormally exited. To prevent accidental data corruption, Lightning will not start until the error is addressed. The error code is an integer less than 25, with possible values as `0, 3, 6, 9, 12, 14, 15, 17, 18, 20 and 21`. The integer indicates the step where the unexpected exit occurs in the import process. The larger the integer is, the later the exit occurs. + - Cause: Checkpoint is enabled, and Lightning/Importer has previously abnormally exited. To prevent accidental data corruption, TiDB Lightning will not start until the error is addressed. 
The error code is an integer less than 25, with possible values as `0, 3, 6, 9, 12, 14, 15, 17, 18, 20 and 21`. The integer indicates the step where the unexpected exit occurs in the import process. The larger the integer is, the later the exit occurs. - Solution: See [Troubleshooting Solution](/tidb-lightning/tidb-lightning-faq.md#checkpoint-for--has-invalid-status-error-code). From 7306735eaebfac1b0f68f58b26e36deaaddfc9fb Mon Sep 17 00:00:00 2001 From: TomShawn <41534398+TomShawn@users.noreply.github.com> Date: Tue, 13 Apr 2021 19:20:03 +0800 Subject: [PATCH 2/4] resolve conflicts --- benchmark/benchmark-tidb-using-tpcc.md | 20 ++++++++++---------- br/backup-and-restore-storages.md | 8 +------- faq/migration-tidb-faq.md | 4 ++-- 3 files changed, 13 insertions(+), 19 deletions(-) diff --git a/benchmark/benchmark-tidb-using-tpcc.md b/benchmark/benchmark-tidb-using-tpcc.md index 836d4175d0a4c..9d64aeda652eb 100644 --- a/benchmark/benchmark-tidb-using-tpcc.md +++ b/benchmark/benchmark-tidb-using-tpcc.md @@ -182,7 +182,7 @@ This process might last for several hours depending on the machine configuration ### Use TiDB Lightning to load data -The amount of loaded data increases as the number of warehouses increases. When you need to load more than 1000 warehouses of data, you can first use BenchmarkSQL to generate CSV files, and then quickly load the CSV files through TiDB Lightning (hereinafter referred to as Lightning). The CSV files can be reused multiple times, which saves the time required for each generation. +The amount of loaded data increases as the number of warehouses increases. When you need to load more than 1000 warehouses of data, you can first use BenchmarkSQL to generate CSV files, and then quickly load the CSV files through TiDB Lightning. The CSV files can be reused multiple times, which saves the time required for each generation. 
Follow the steps below to use TiDB Lightning to load data: @@ -194,7 +194,7 @@ Follow the steps below to use TiDB Lightning to load data: fileLocation=/home/user/csv/ # The absolute path of the directory where your CSV files are stored ``` - It is recommended that the CSV file names adhere to the naming rules in Lightning, that is, `{database}.{table}.csv`, because eventually you'll use Lightning to load data. Here you can modify the above configuration as follows: + It is recommended that the CSV file names adhere to the naming rules in TiDB Lightning, that is, `{database}.{table}.csv`, because eventually you'll use TiDB Lightning to load data. Here you can modify the above configuration as follows: ```text fileLocation=/home/user/csv/tpcc. # The absolute path of the directory where your CSV files are stored + the file name prefix (database) @@ -210,9 +210,9 @@ Follow the steps below to use TiDB Lightning to load data: ./runLoader.sh props.mysql ``` -3. Use Lightning to load data. +3. Use TiDB Lightning to load data. - To load data using Lightning, see [TiDB Lightning Deployment](/tidb-lightning/deploy-tidb-lightning.md). The following steps introduce how to use TiDB Ansible to deploy Lightning and use Lightning to load data. + To load data using TiDB Lightning, see [TiDB Lightning Deployment](/tidb-lightning/deploy-tidb-lightning.md). The following steps introduce how to use TiDB Ansible to deploy TiDB Lightning and use TiDB Lightning to load data. 1. Edit `inventory.ini`. @@ -240,7 +240,7 @@ Follow the steps below to use TiDB Lightning to load data: trim-last-separator: false ``` - 3. Deploy Lightning and Importer. + 3. Deploy TiDB Lightning and TiKV Importer. {{< copyable "shell-regular" >}} @@ -248,14 +248,14 @@ Follow the steps below to use TiDB Lightning to load data: ansible-playbook deploy.yml --tags=lightning ``` - 4. Start Lightning and Importer. + 4. Start TiDB Lightning and TiKV Importer. 
- * Log into the server where Lightning and Importer are deployed. + * Log into the server where TiDB Lightning and TiKV Importer are deployed. * Enter the deployment directory. - * Execute `scripts/start_importer.sh` under the Importer directory to start Importer. - * Execute `scripts/start_lightning.sh` under the Lightning directory to begin to load data. + * Execute `scripts/start_importer.sh` under the TiKV Importer directory to start Importer. + * Execute `scripts/start_lightning.sh` under the TiDB Lightning directory to begin to load data. - Because you've used TiDB Ansible deployment method, you can see the loading progress of Lightning on the monitoring page, or check whether the loading process is completed through the log. + Because you've used TiDB Ansible deployment method, you can see the loading progress of TiDB Lightning on the monitoring page, or check whether the loading process is completed through the log. Fourth, after successfully loading data, you can run `sql.common/test.sql` to validate the correctness of the data. If all SQL statements return an empty result, then the data is correctly loaded. diff --git a/br/backup-and-restore-storages.md b/br/backup-and-restore-storages.md index 64e4c6d3143f7..c4587cfb4ef33 100644 --- a/br/backup-and-restore-storages.md +++ b/br/backup-and-restore-storages.md @@ -1,13 +1,7 @@ --- -<<<<<<< HEAD -title: BR Storages -summary: Describes the storage URL format used in BR. -aliases: ['/docs/stable/br/backup-and-restore-storages/','/docs/v4.0/br/backup-and-restore-storages/'] -======= title: External Storages summary: Describes the storage URL format used in BR, TiDB Lightning, and Dumpling. -aliases: ['/docs/dev/br/backup-and-restore-storages/'] ->>>>>>> 3f76f22b... 
*: generalize and link to the external storage docs from Lightning (#5286) +aliases: ['/docs/stable/br/backup-and-restore-storages/','/docs/v4.0/br/backup-and-restore-storages/'] --- # External Storages diff --git a/faq/migration-tidb-faq.md b/faq/migration-tidb-faq.md index 4d54fe661587b..4b1c3de5b349a 100644 --- a/faq/migration-tidb-faq.md +++ b/faq/migration-tidb-faq.md @@ -90,9 +90,9 @@ See [Syncer User Guide](/syncer-overview.md). Download and import [Syncer Json](https://github.com/pingcap/docs/blob/master/etc/Syncer.json) to Grafana. Edit the Prometheus configuration file and add the following content: ``` -- job_name: 'syncer_ops' // task name +- job_name: 'syncer_ops' # task name static_configs: - - targets: [’10.10.1.1:10096’] // Syncer monitoring address and port, informing Prometheus to pull the data of Syncer + - targets: ['10.10.1.1:10096'] # Syncer monitoring address and port, informing Prometheus to pull the data of Syncer ``` Restart Prometheus. From 5e9cb920d5f83db39a9ebdc66fbe112ae9dc0b76 Mon Sep 17 00:00:00 2001 From: TomShawn <41534398+TomShawn@users.noreply.github.com> Date: Tue, 13 Apr 2021 19:22:26 +0800 Subject: [PATCH 3/4] Lightning --> TiDB Lightning --- tidb-lightning/migrate-from-csv-using-tidb-lightning.md | 4 ++-- tidb-lightning/tidb-lightning-checkpoints.md | 4 ++-- tidb-lightning/tidb-lightning-faq.md | 2 +- tiflash/use-tiflash.md | 2 +- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a/tidb-lightning/migrate-from-csv-using-tidb-lightning.md b/tidb-lightning/migrate-from-csv-using-tidb-lightning.md index 0f8cffa6134d7..2d4b2f7c15e23 100644 --- a/tidb-lightning/migrate-from-csv-using-tidb-lightning.md +++ b/tidb-lightning/migrate-from-csv-using-tidb-lightning.md @@ -138,9 +138,9 @@ TiDB Lightning does not support every option supported by the `LOAD DATA` statem ## Strict format -Lightning works the best when the input files have uniform size around 256 MB. 
When the input is a single huge CSV file, Lightning can only use one thread to process it, which slows down import speed a lot. +TiDB Lightning works the best when the input files have uniform size around 256 MB. When the input is a single huge CSV file, TiDB Lightning can only use one thread to process it, which slows down import speed a lot. -This can be fixed by splitting the CSV into multiple files first. For the generic CSV format, there is no way to quickly identify when a row starts and ends without reading the whole file. Therefore, Lightning by default does *not* automatically split a CSV file. However, if you are certain that the CSV input adheres to certain restrictions, you can enable the `strict-format` setting to allow Lightning to split the file into multiple 256 MB-sized chunks for parallel processing. +This can be fixed by splitting the CSV into multiple files first. For the generic CSV format, there is no way to quickly identify when a row starts and ends without reading the whole file. Therefore, TiDB Lightning by default does *not* automatically split a CSV file. However, if you are certain that the CSV input adheres to certain restrictions, you can enable the `strict-format` setting to allow TiDB Lightning to split the file into multiple 256 MB-sized chunks for parallel processing. ```toml [mydumper] diff --git a/tidb-lightning/tidb-lightning-checkpoints.md b/tidb-lightning/tidb-lightning-checkpoints.md index f022a976bac64..f0fc42ae5c16e 100644 --- a/tidb-lightning/tidb-lightning-checkpoints.md +++ b/tidb-lightning/tidb-lightning-checkpoints.md @@ -31,7 +31,7 @@ driver = "file" # The data source name (DSN) indicating the location of the checkpoint storage. # -# For the "file" driver, the DSN is a path. If the path is not specified, Lightning would +# For the "file" driver, the DSN is a path. If the path is not specified, TiDB Lightning would # default to "/tmp/CHECKPOINT_SCHEMA.pb". 
# # For the "mysql" driver, the DSN is a URL in the form of "USER:PASS@tcp(HOST:PORT)/". @@ -54,7 +54,7 @@ TiDB Lightning supports two kinds of checkpoint storage: a local file or a remot * With `driver = "mysql"`, checkpoints can be saved in any databases compatible with MySQL 5.7 or later, including MariaDB and TiDB. By default, the checkpoints are saved in the target database. -While using the target database as the checkpoints storage, Lightning is importing large amounts of data at the same time. This puts extra stress on the target database and sometimes leads to communication timeout. Therefore, **it is strongly recommended to install a temporary MySQL server to store these checkpoints**. This server can be installed on the same host as `tidb-lightning` and can be uninstalled after the importer progress is completed. +While using the target database as the checkpoints storage, TiDB Lightning is importing large amounts of data at the same time. This puts extra stress on the target database and sometimes leads to communication timeout. Therefore, **it is strongly recommended to install a temporary MySQL server to store these checkpoints**. This server can be installed on the same host as `tidb-lightning` and can be uninstalled after the importer progress is completed. ## Checkpoints control diff --git a/tidb-lightning/tidb-lightning-faq.md b/tidb-lightning/tidb-lightning-faq.md index 9c1acf3e1d6e6..412e0bb2ee0f8 100644 --- a/tidb-lightning/tidb-lightning-faq.md +++ b/tidb-lightning/tidb-lightning-faq.md @@ -115,7 +115,7 @@ To stop the `tikv-importer` process, you can choose the corresponding operation To stop the `tidb-lightning` process, you can choose the corresponding operation according to your deployment method. -- For deployment using TiDB Ansible: run `scripts/stop_lightning.sh` on the Lightning server. +- For deployment using TiDB Ansible: run `scripts/stop_lightning.sh` on the TiDB Lightning server. 
- For manual deployment: if `tidb-lightning` is running in foreground, press Ctrl+C to exit. Otherwise, obtain the process ID using the `ps aux | grep tidb-lighting` command and then terminate the process using the `kill -2 «pid»` command. diff --git a/tiflash/use-tiflash.md b/tiflash/use-tiflash.md index 1bf8f9168715f..2474bd665c622 100644 --- a/tiflash/use-tiflash.md +++ b/tiflash/use-tiflash.md @@ -56,7 +56,7 @@ ALTER TABLE `tpch50`.`lineitem` SET TIFLASH REPLICA 0 * For versions earlier than v4.0.6, if you create the TiFlash replica before using TiDB Lightning to import the data, the data import will fail. You must import data to the table before creating the TiFlash replica for the table. -* If TiDB and TiDB Lightning are both v4.0.6 or later, no matter a table has TiFlash replica(s) or not, you can import data to that table using TiDB Lightning. Note that this might slow the TiDB Lightning procedure, which depends on the NIC bandwidth on the lightning host, the CPU and disk load of the TiFlash node, and the number of TiFlash replicas. +* If TiDB and TiDB Lightning are both v4.0.6 or later, no matter a table has TiFlash replica(s) or not, you can import data to that table using TiDB Lightning. Note that this might slow the TiDB Lightning procedure, which depends on the NIC bandwidth on the TiDB Lightning host, the CPU and disk load of the TiFlash node, and the number of TiFlash replicas. * It is recommended that you do not replicate more than 1,000 tables because this lowers the PD scheduling performance. This limit will be removed in later versions. 
From cd9e466114e401baccefda02475d8908d6744b3e Mon Sep 17 00:00:00 2001 From: TomShawn <41534398+TomShawn@users.noreply.github.com> Date: Tue, 13 Apr 2021 19:26:11 +0800 Subject: [PATCH 4/4] Update monitor-tidb-lightning.md --- tidb-lightning/monitor-tidb-lightning.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/tidb-lightning/monitor-tidb-lightning.md b/tidb-lightning/monitor-tidb-lightning.md index b5d6652d80266..e968a45304b44 100644 --- a/tidb-lightning/monitor-tidb-lightning.md +++ b/tidb-lightning/monitor-tidb-lightning.md @@ -47,8 +47,7 @@ scrape_configs: [Grafana](https://grafana.com/) is a web interface to visualize Prometheus metrics as dashboards. -If TiDB Lightning is installed using TiDB Ansible, its dashboard is already installed. -Otherwise, the dashboard JSON can be imported from . +If TiDB Lightning is installed using TiDB Ansible, its dashboard is already installed. Otherwise, the dashboard JSON can be imported from . ### Row 1: Speed