From e80055743de8d8458f06cefca25c0237565a537f Mon Sep 17 00:00:00 2001 From: yikeke Date: Fri, 3 Apr 2020 21:59:27 +0800 Subject: [PATCH 1/6] tiflash: add the deploy.md doc --- reference/tiflash/deploy.md | 129 ++++++++++++++++++++++++++++++++++++ 1 file changed, 129 insertions(+) create mode 100644 reference/tiflash/deploy.md diff --git a/reference/tiflash/deploy.md b/reference/tiflash/deploy.md new file mode 100644 index 0000000000000..220bc9fdae80b --- /dev/null +++ b/reference/tiflash/deploy.md @@ -0,0 +1,129 @@ +--- +title: Deploy TiFlash Cluster +summary: Learn the requirements and methods of deploying a TiFlash cluster. +category: reference +--- + +# Deploy TiFlash Cluster + +> **Note:** +> +> If you want to get a first-hand experience on how to use TiFlash RC version, contact [PingCAP](mailto:info@pingcap.com) for more information and assistance. + +This document introduces the environment requirements for deploying a TiFlash cluster and the deployment methods in different scenarios. + +## Recommended hardware configuration + +This section provides hardware configuration recommendations based on different TiFlash deployment methods. + +### TiFlash standalone deployment + +* Minimum configuration: 32 VCore, 64 GB RAM, 1 SSD + n HDD +* Recommended configuration: 48 VCore, 128 GB RAM, 1 NVMe SSD + n SSD + +There is no limit to the number of deployment machines (one at least). A single machine can use multiple disks, but deploying multiple instances on a single machine is not recommended. + +It is recommended to use an SSD disk to buffer the real-time data being replicated and written to TiKV. The performance of this disk is not lower than the hard disk used by TiKV. It is recommended that you use a better performance NVMe SSD and the SSD‘s capacity is not less than 10% of the total capacity. Otherwise, it may become the bottleneck of the amount of data that this node can handle. + +For other hard disks, you can choose to use multiple HDDs or regular SSDs. 
But if you can afford, a better hard disk will bring better performance. + +TiFlash supports multi-directory storage, so there is no need to use RAID. + +### TiFlash and TiKV are deployed on the same node + +See [Hardware recommendations for TiKV server](/how-to/deploy/hardware-recommendations.md#server-recommendations), and increase the memory capacity and the number of CPU cores as needed. + +It is **not** recommended to deploy TiFlash and TiKV on the same disk to prevent mutual interference. + +Hard disk selection criteria are the same as [TiFlash standalone deployment](#tiflash-standalone-deployment). The total capacity of the hard disk is roughly: `the to-be-replicated data amount of the entire TiKV cluster / the number of TiKV replicas / 2`. + +For example, the overall planned capacity of TiKV is three replicas, and the recommended capacity of TiFlash will be one sixth of the TiKV cluster. You can choose to replicate part of tables instead of all. + +## TiDB version requirements + +Currently, the testing of TiFlash is based on the related components of TiDB 3.1 (including TiDB, PD, TiKV, and TiFlash). To download TiDB 3.1, see the installation and deployment steps that follow. + +## Install and deploy TiFlash + +This section describes how to install and deploy TiFlash in the following scenarios: + +- [Fresh TiFlash deployment](#fresh-tiflash-deployment) +- [Add TiFlash component to the existing TiDB cluster](#add-tiflash-component-to-the-existing-tidb-cluster) + +> **Note:** +> +> 1. Before starting the TiFlash process, you must ensure that PD's Placement Rules feature is enabled (For how to enable it, see the **second step** in the [add TiFlash component to the existing TiDB cluster](#add-tiflash-component-to-the-existing-tidb-cluster) section). +> 2. When TiFlash is running, you must ensure that PD's Placement Rules feature remains enabled.
+ +### Fresh TiFlash deployment + +For fresh TiFlash deployment, it is recommended to deploy TiFlash by downloading an offline installation package. The steps are as follows: + +1. Download the offline package of your desired version and unzip it. + + - If you are using TiDB 4.0 beta version, execute the following command: + + {{< copyable "shell-regular" >}} + + ```shell + curl -o tidb-ansible-tiflash-4.0-v3-20200331.tar.gz https://download.pingcap.org/tidb-ansible-tiflash-4.0-v3-20200331.tar.gz && + tar zxvf tidb-ansible-tiflash-4.0-v3-20200331.tar.gz + ``` + + - If you are using TiDB 3.1 rc version, execute the following command: + + {{< copyable "shell-regular" >}} + + ```shell + curl -o tidb-ansible-tiflash-3.1-rc.tar.gz https://download.pingcap.org/tidb-ansible-tiflash-3.1-rc.tar.gz && + tar zxvf tidb-ansible-tiflash-3.1-rc.tar.gz + ``` + +2. Edit the `inventory.ini` configuration file. In addition to [configuring for TiDB cluster deployment](/how-to/deploy/orchestrated/ansible.md#step-9-edit-the-inventoryini-file-to-orchestrate-the-tidb-cluster), you also need to specify the IPs of your TiFlash servers under the `[tiflash_servers]` section (currently only IPs are supported; domain names are not supported). + + If you want to customize the deployment directory, configure the `data_dir` parameter. If you want multi-disk deployment, separate the deployment directories with commas (note that the parent directory of each `data_dir` directory needs to give the tidb user write permissions). For example: + + {{< copyable "" >}} + + ```ini + [tiflash_servers] + 192.168.1.1 data_dir=/data1/tiflash/data,/data2/tiflash/data + ``` + +3. Complete the [remaining steps](/how-to/deploy/orchestrated/ansible.md#step-10-edit-variables-in-the-inventoryini-file) of the TiDB Ansible deployment process. + +4. 
To verify that TiFlash has been successfully deployed: execute the `pd-ctl store http://your-pd-address` command in [pd-ctl](/reference/tools/pd-control.md) (`resources/bin` in the tidb-ansible directory includes the pd-ctl binary file), and you can observe that the status of the deployed TiFlash instance is "Up". + +### Add TiFlash component to the existing TiDB cluster + +1. First, confirm that your current TiDB version supports TiFlash, otherwise you need to upgrade your TiDB cluster to 3.1 rc or higher according to [TiDB Upgrade Guide](/how-to/upgrade/from-previous-version.md). + +2. Execute the `config set enable-placement-rules true` command in [pd-ctl](/reference/tools/pd-control.md) (`resources/bin` in the tidb-ansible directory includes the pd-ctl binary file) to enable PD's Placement Rules feature. + +3. Edit the `inventory.ini` configuration file. You need to specify the IPs of your TiFlash servers under the `[tiflash_servers]` section (currently only IPs are supported; domain names are not supported). + + If you want to customize the deployment directory, configure the `data_dir` parameter. If you want multi-disk deployment, separate the deployment directories with commas (note that the parent directory of each `data_dir` directory needs to give the tidb user write permissions). For example: + + {{< copyable "" >}} + + ```ini + [tiflash_servers] + 192.168.1.1 data_dir=/data1/tiflash/data,/data2/tiflash/data + ``` + + > **Note:** + > + > Even if TiFlash and TiKV are deployed on the same machine, TiFlash uses a different default port from TiKV. TiFlash's default port is 9000. If you want to modify the port, add a new line `tcp_port=xxx` to the `inventory.ini` configuration file. + +4. 
Execute the following ansible-playbook command to deploy TiFlash: + + {{< copyable "shell-regular" >}} + + ```shell + ansible-playbook local_prepare.yml && + ansible-playbook -t tiflash deploy.yml && + ansible-playbook -t tiflash start.yml && + ansible-playbook rolling_update_monitor.yml + ``` + +5. To verify that TiFlash has been successfully deployed: execute the `pd-ctl store http://your-pd-address` command in [pd-ctl](/reference/tools/pd-control.md) (`resources/bin` in the tidb-ansible directory includes the pd-ctl binary file), and you can observe that the status of the deployed TiFlash instance is "Up". From ce63ed2a787a824267d3b593c0ebbbe4af87b17c Mon Sep 17 00:00:00 2001 From: yikeke Date: Fri, 3 Apr 2020 22:11:03 +0800 Subject: [PATCH 2/6] refine format and content --- reference/tiflash/deploy.md | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-) diff --git a/reference/tiflash/deploy.md b/reference/tiflash/deploy.md index 220bc9fdae80b..051215c8a4401 100644 --- a/reference/tiflash/deploy.md +++ b/reference/tiflash/deploy.md @@ -23,9 +23,9 @@ This section provides hardware configuration recommendations based on different There is no limit to the number of deployment machines (one at least). A single machine can use multiple disks, but deploying multiple instances on a single machine is not recommended. -It is recommended to use an SSD disk to buffer the real-time data being replicated and written to TiKV. The performance of this disk is not lower than the hard disk used by TiKV. It is recommended that you use a better performance NVMe SSD and the SSD‘s capacity is not less than 10% of the total capacity. Otherwise, it may become the bottleneck of the amount of data that this node can handle. +It is recommended to use an SSD disk to buffer the real-time data being replicated and written to TiKV. The performance of this disk need to be not lower than the hard disk used by TiKV. 
It is recommended that you use a better performance NVMe SSD and the SSD‘s capacity is not less than 10% of the total capacity. Otherwise, it may become the bottleneck of the amount of data that this node can handle. -For other hard disks, you can choose to use multiple HDDs or regular SSDs. But if you can afford, a better hard disk will bring better performance. +For other hard disks, you can use multiple HDDs or regular SSDs. A better hard disk will surely bring better performance. TiFlash supports multi-directory storage, so there is no need to use RAID. @@ -35,7 +35,7 @@ See [Hardware recommendations for TiKV server](/how-to/deploy/hardware-recommend It is **not** recommended to deploy TiFlash and TiKV on the same disk to prevent mutual interference. -Hard disk selection criteria are the same as [TiFlash standalone deployment](#tiflash-standalone-deployment). The total capacity of the hard disk is roughly: `the to-be-replicated data amount of the entire TiKV cluster / the number of TiKV replicas / 2`. +Hard disk selection criteria are the same as [TiFlash standalone deployment](#tiflash-standalone-deployment). The total capacity of the hard disk is roughly: `the to-be-replicated data capacity of the entire TiKV cluster / the number of TiKV replicas / 2`. For example, the overall planned capacity of TiKV is three replicas, and the recommended capacity of TiFlash will be one sixth of the TiKV cluster. You can choose to replicate part of tables instead of all. @@ -48,11 +48,11 @@ Currently, the testing of TiFlash is based on the related components of TiDB 3.1 This section describes how to install and deploy TiFlash in the following scenarios: - [Fresh TiFlash deployment](#fresh-tiflash-deployment) -- [Add TiFlash component to the existing TiDB cluster](#add-tiflash-component-to-the-existing-tidb-cluster) +- [Add TiFlash component to an existing TiDB cluster](#add-tiflash-component-to-an-existing-tidb-cluster) > **Note:** > -> 1. 
Before starting the TiFlash process, you must ensure that PD's Placement Rules feature is enabled (For how to enable it, see the **second step** in the [add TiFlash component to the existing TiDB cluster](#add-tiflash-component-to-the-existing-tidb-cluster) section). +> 1. Before starting the TiFlash process, you must ensure that PD's Placement Rules feature is enabled (For how to enable it, see the **second step** in the [Add TiFlash component to an existing TiDB cluster](#add-tiflash-component-to-an-existing-tidb-cluster) section). > 2. When TiFlash is running, you must ensure that PD's Placement Rules feature remains enabled. ### Fresh TiFlash deployment @@ -81,7 +81,7 @@ For fresh TiFlash deployment, it is recommended to deploy TiFlash by downloading 2. Edit the `inventory.ini` configuration file. In addition to [configuring for TiDB cluster deployment](/how-to/deploy/orchestrated/ansible.md#step-9-edit-the-inventoryini-file-to-orchestrate-the-tidb-cluster), you also need to specify the IPs of your TiFlash servers under the `[tiflash_servers]` section (currently only IPs are supported; domain names are not supported). - If you want to customize the deployment directory, configure the `data_dir` parameter. If you want multi-disk deployment, separate the deployment directories with commas (note that the parent directory of each `data_dir` directory needs to give the tidb user write permissions). For example: + If you want to customize the deployment directory, configure the `data_dir` parameter. If you want multi-disk deployment, separate the deployment directories with commas (note that the parent directory of each `data_dir` directory needs to give the `tidb` user write permissions). For example: {{< copyable "" >}} @@ -92,9 +92,12 @@ For fresh TiFlash deployment, it is recommended to deploy TiFlash by downloading 3. 
Complete the [remaining steps](/how-to/deploy/orchestrated/ansible.md#step-10-edit-variables-in-the-inventoryini-file) of the TiDB Ansible deployment process. -4. To verify that TiFlash has been successfully deployed: execute the `pd-ctl store http://your-pd-address` command in [pd-ctl](/reference/tools/pd-control.md) (`resources/bin` in the tidb-ansible directory includes the pd-ctl binary file), and you can observe that the status of the deployed TiFlash instance is "Up". +4. To verify that TiFlash has been successfully deployed: + + 1. Execute the `pd-ctl store http://your-pd-address` command in [pd-ctl](/reference/tools/pd-control.md) (`resources/bin` in the tidb-ansible directory includes the pd-ctl binary file). + 2. Observe that the status of the deployed TiFlash instance is "Up". -### Add TiFlash component to the existing TiDB cluster +### Add TiFlash component to an existing TiDB cluster 1. First, confirm that your current TiDB version supports TiFlash, otherwise you need to upgrade your TiDB cluster to 3.1 rc or higher according to [TiDB Upgrade Guide](/how-to/upgrade/from-previous-version.md). @@ -102,7 +105,7 @@ For fresh TiFlash deployment, it is recommended to deploy TiFlash by downloading 3. Edit the `inventory.ini` configuration file. You need to specify the IPs of your TiFlash servers under the `[tiflash_servers]` section (currently only IPs are supported; domain names are not supported). - If you want to customize the deployment directory, configure the `data_dir` parameter. If you want multi-disk deployment, separate the deployment directories with commas (note that the parent directory of each `data_dir` directory needs to give the tidb user write permissions). For example: + If you want to customize the deployment directory, configure the `data_dir` parameter. 
If you want multi-disk deployment, separate the deployment directories with commas (note that the parent directory of each `data_dir` directory needs to give the `tidb` user write permissions). For example: {{< copyable "" >}} @@ -115,7 +118,7 @@ For fresh TiFlash deployment, it is recommended to deploy TiFlash by downloading > > Even if TiFlash and TiKV are deployed on the same machine, TiFlash uses a different default port from TiKV. TiFlash's default port is 9000. If you want to modify the port, add a new line `tcp_port=xxx` to the `inventory.ini` configuration file. -4. Execute the following ansible-playbook command to deploy TiFlash: +4. Execute the following ansible-playbook commands to deploy TiFlash: {{< copyable "shell-regular" >}} @@ -126,4 +129,7 @@ For fresh TiFlash deployment, it is recommended to deploy TiFlash by downloading ansible-playbook rolling_update_monitor.yml ``` -5. To verify that TiFlash has been successfully deployed: execute the `pd-ctl store http://your-pd-address` command in [pd-ctl](/reference/tools/pd-control.md) (`resources/bin` in the tidb-ansible directory includes the pd-ctl binary file), and you can observe that the status of the deployed TiFlash instance is "Up". +5. To verify that TiFlash has been successfully deployed: + + 1. Execute the `pd-ctl store http://your-pd-address` command in [pd-ctl](/reference/tools/pd-control.md) (`resources/bin` in the tidb-ansible directory includes the pd-ctl binary file). + 2. Observe that the status of the deployed TiFlash instance is "Up". 
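
The `data_dir` note in this patch says that the parent directory of each comma-separated `data_dir` entry must be writable by the `tidb` user. A minimal sketch of how that could be checked before running the playbooks is shown here. The helper names `data_dir_parents` and `check_parents_writable` are hypothetical and not part of tidb-ansible, and the check assumes that `sudo` is available and that the `tidb` user already exists on the host.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, and the data_dir value is the
# example from the inventory.ini snippet in this patch.

# Print the parent directory of every comma-separated data_dir entry.
data_dir_parents() {
  local entry
  local -a entries
  IFS=',' read -ra entries <<< "$1"
  for entry in "${entries[@]}"; do
    dirname "$entry"
  done
}

# Report whether the given user can write to each parent directory.
# Assumes sudo is available and the user (tidb here) already exists.
check_parents_writable() {
  local user=$1 parent
  while read -r parent; do
    if sudo -u "$user" test -w "$parent" </dev/null; then
      echo "ok: $parent"
    else
      echo "not writable by $user: $parent"
    fi
  done < <(data_dir_parents "$2")
}

# List the directories that need write permission for the example value.
data_dir_parents "/data1/tiflash/data,/data2/tiflash/data"

# To run the permission check on a TiFlash host:
# check_parents_writable tidb "/data1/tiflash/data,/data2/tiflash/data"
```

The permission check is left commented out because it only makes sense on the target TiFlash machine; the listing step has no side effects and can be run anywhere.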
From dd457d0ab279b2fa51d584ba6ad300b2eee37474 Mon Sep 17 00:00:00 2001 From: Soup Date: Mon, 6 Apr 2020 23:42:27 +0800 Subject: [PATCH 3/6] Update deploy.md --- reference/tiflash/deploy.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/reference/tiflash/deploy.md b/reference/tiflash/deploy.md index 051215c8a4401..11bde3afaa0ec 100644 --- a/reference/tiflash/deploy.md +++ b/reference/tiflash/deploy.md @@ -23,7 +23,7 @@ This section provides hardware configuration recommendations based on different There is no limit to the number of deployment machines (one at least). A single machine can use multiple disks, but deploying multiple instances on a single machine is not recommended. -It is recommended to use an SSD disk to buffer the real-time data being replicated and written to TiKV. The performance of this disk need to be not lower than the hard disk used by TiKV. It is recommended that you use a better performance NVMe SSD and the SSD‘s capacity is not less than 10% of the total capacity. Otherwise, it may become the bottleneck of the amount of data that this node can handle. +It is recommended to use an SSD disk to buffer the real-time data being replicated and written to TiFlash. The performance of this disk need to be not lower than the hard disk used by TiKV. It is recommended that you use a better performance NVMe SSD and the SSD‘s capacity is not less than 10% of the total capacity. Otherwise, it may become the bottleneck of the amount of data that this node can handle. For other hard disks, you can use multiple HDDs or regular SSDs. A better hard disk will surely bring better performance. 
From ef04ec81a810b67a01bc4fd9d639c29cd73dffd3 Mon Sep 17 00:00:00 2001 From: yikeke Date: Tue, 7 Apr 2020 09:19:14 +0800 Subject: [PATCH 4/6] Update TOC.md --- TOC.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/TOC.md b/TOC.md index f71e30abc05e2..4f00ebc974968 100644 --- a/TOC.md +++ b/TOC.md @@ -301,6 +301,8 @@ - [Grafana Best Practices](/reference/best-practices/grafana-monitor.md) - [TiKV Performance Tuning with Massive Regions](/reference/best-practices/massive-regions.md) - [TiSpark](/reference/tispark.md) + + TiFlash + - [Overview](/reference/tiflash/overview.md) + TiDB Binlog - [Overview](/reference/tidb-binlog/overview.md) - [Deploy](/reference/tidb-binlog/deploy.md) From f30d1176acb87198f2f91aaddb4c05933d78010a Mon Sep 17 00:00:00 2001 From: Keke Yi <40977455+yikeke@users.noreply.github.com> Date: Tue, 7 Apr 2020 09:24:00 +0800 Subject: [PATCH 5/6] Update TOC.md --- TOC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/TOC.md b/TOC.md index 676dcd184573e..8ee7cce7f9758 100644 --- a/TOC.md +++ b/TOC.md @@ -302,7 +302,7 @@ - [TiKV Performance Tuning with Massive Regions](/reference/best-practices/massive-regions.md) - [TiSpark](/reference/tispark.md) + TiFlash - - [Overview](/reference/tiflash/overview.md) + - [Deploy TiFlash Cluster](/reference/tiflash/deploy.md) - [Use TiFlash](/reference/tiflash/use-tiflash.md) + TiDB Binlog - [Overview](/reference/tidb-binlog/overview.md) From eb7a49a0f756cc41c33dcdef2d1acdd6f5242140 Mon Sep 17 00:00:00 2001 From: Keke Yi <40977455+yikeke@users.noreply.github.com> Date: Tue, 7 Apr 2020 09:24:58 +0800 Subject: [PATCH 6/6] Apply suggestions from code review Co-Authored-By: Lilian Lee --- TOC.md | 2 +- reference/tiflash/deploy.md | 10 +++++----- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/TOC.md b/TOC.md index 8ee7cce7f9758..bd339d16f8440 100644 --- a/TOC.md +++ b/TOC.md @@ -302,7 +302,7 @@ - [TiKV Performance Tuning with Massive 
Regions](/reference/best-practices/massive-regions.md) - [TiSpark](/reference/tispark.md) + TiFlash - - [Deploy TiFlash Cluster](/reference/tiflash/deploy.md) + - [Deploy a TiFlash Cluster](/reference/tiflash/deploy.md) - [Use TiFlash](/reference/tiflash/use-tiflash.md) + TiDB Binlog - [Overview](/reference/tidb-binlog/overview.md) diff --git a/reference/tiflash/deploy.md b/reference/tiflash/deploy.md index 11bde3afaa0ec..8cfa4141c25f6 100644 --- a/reference/tiflash/deploy.md +++ b/reference/tiflash/deploy.md @@ -1,14 +1,14 @@ --- -title: Deploy TiFlash Cluster +title: Deploy a TiFlash Cluster summary: Learn the requirements and methods of deploying a TiFlash cluster. category: reference --- -# Deploy TiFlash Cluster +# Deploy a TiFlash Cluster > **Note:** > -> If you want to get a first-hand experience on how to use TiFlash RC version, contact [PingCAP](mailto:info@pingcap.com) for more information and assistance. +> If you want to get a first-hand experience on how to use the TiFlash RC version, contact [PingCAP](mailto:info@pingcap.com) for more information and assistance. This document introduces the environment requirements for deploying a TiFlash cluster and the deployment methods in different scenarios. @@ -23,7 +23,7 @@ This section provides hardware configuration recommendations based on different There is no limit to the number of deployment machines (one at least). A single machine can use multiple disks, but deploying multiple instances on a single machine is not recommended. -It is recommended to use an SSD disk to buffer the real-time data being replicated and written to TiFlash. The performance of this disk need to be not lower than the hard disk used by TiKV. It is recommended that you use a better performance NVMe SSD and the SSD‘s capacity is not less than 10% of the total capacity. Otherwise, it may become the bottleneck of the amount of data that this node can handle. 
+It is recommended to use an SSD to buffer the real-time data being replicated and written to TiFlash. The performance of this disk needs to be no lower than that of the hard disk used by TiKV. It is recommended that you use a higher-performance NVMe SSD with a capacity of at least 10% of the total capacity. Otherwise, it might become the bottleneck for the amount of data that this node can handle. For other hard disks, you can use multiple HDDs or regular SSDs. A better hard disk will surely bring better performance. @@ -37,7 +37,7 @@ It is **not** recommended to deploy TiFlash and TiKV on the same disk to prevent Hard disk selection criteria are the same as [TiFlash standalone deployment](#tiflash-standalone-deployment). The total capacity of the hard disk is roughly: `the to-be-replicated data capacity of the entire TiKV cluster / the number of TiKV replicas / 2`. -For example, the overall planned capacity of TiKV is three replicas, and the recommended capacity of TiFlash will be one sixth of the TiKV cluster. You can choose to replicate part of tables instead of all. +For example, if TiKV is planned with three replicas, then the recommended capacity of TiFlash is one sixth of the capacity of the TiKV cluster. You can choose to replicate only some of the tables instead of all of them. ## TiDB version requirements
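
The sizing rule in the last hunk (`the to-be-replicated data capacity of the entire TiKV cluster / the number of TiKV replicas / 2`) can be worked through as plain arithmetic. The sketch below is illustrative only: the function name `tiflash_capacity_gb` and the sample figure of 6000 GB are assumptions, and only the formula itself comes from the document.

```shell
#!/usr/bin/env bash
# Sketch of the sizing rule quoted above. The function name and the sample
# numbers are hypothetical; only the formula comes from the document.

# tiflash_capacity_gb <tikv_cluster_capacity_gb> <tikv_replicas>
# Prints: TiKV cluster capacity / number of TiKV replicas / 2 (integer GB).
tiflash_capacity_gb() {
  local tikv_total_gb=$1
  local tikv_replicas=$2
  echo $(( tikv_total_gb / tikv_replicas / 2 ))
}

# Example: 6000 GB of to-be-replicated TiKV data stored with 3 replicas.
echo "Recommended TiFlash capacity: $(tiflash_capacity_gb 6000 3) GB"
```

With three replicas the divisor is 3 × 2 = 6, which matches the "one sixth" figure in the example sentence. Shell arithmetic truncates to whole numbers, so treat the result as a rough lower bound rather than an exact requirement.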