From b096071ab0a24fb04d100ee31e1011edb33fd2a0 Mon Sep 17 00:00:00 2001 From: TomShawn <41534398+TomShawn@users.noreply.github.com> Date: Wed, 11 Nov 2020 16:46:56 +0800 Subject: [PATCH 1/7] Add synchronous replication docs --- TOC.md | 1 + pd-configuration-file.md | 4 ++ pd-control.md | 2 + synchronous-replication.md | 93 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 100 insertions(+) create mode 100644 synchronous-replication.md diff --git a/TOC.md b/TOC.md index b553535b1f447..649b4456d387e 100644 --- a/TOC.md +++ b/TOC.md @@ -141,6 +141,7 @@ + Tutorials + [Multiple Data Centers in One City Deployment](/multi-data-centers-in-one-city-deployment.md) + [Three Data Centers in Two Cities Deployment](/three-data-centers-in-two-cities-deployment.md) + + [Synchronous Replication for Dual Data Centers](/synchronous-replication.md) + Best Practices + [Use TiDB](/tidb-best-practices.md) + [Java Application Development](/best-practices/java-app-best-practices.md) diff --git a/pd-configuration-file.md b/pd-configuration-file.md index 1034af943989b..8b776d86bc795 100644 --- a/pd-configuration-file.md +++ b/pd-configuration-file.md @@ -375,3 +375,7 @@ Configuration items related to the [TiDB Dashboard](/dashboard/dashboard-intro.m + Determines whether to enable the telemetry collection feature in TiDB Dashboard. + Default value: `true` + See [Telemetry](/telemetry.md) for details. + +## `replication-mode` + +Configuration items related to the replication mode of a single Region. See [Enable synchronous replication in PD configuration file](/synchronous-replication.md#enable-synchronous-replication-in-pd-configuration-file) for details. diff --git a/pd-control.md b/pd-control.md index 33abdb86080a9..c42d3203ba20e 100644 --- a/pd-control.md +++ b/pd-control.md @@ -295,6 +295,8 @@ Usage: config set cluster-version 1.0.8 // Set the version of the cluster to 1.0.8 ``` +- `replication-mode` controls the replication mode of a single Region in the dual data center scenario. 
See [Change replication mode manually](/synchronous-replication.md#change-replication-mode-manually) for details. + - `leader-schedule-policy` is used to select the scheduling strategy for the leader. You can schedule the leader according to `size` or `count`. - `scheduler-max-waiting-operator` is used to control the number of waiting operators in each scheduler. diff --git a/synchronous-replication.md b/synchronous-replication.md new file mode 100644 index 0000000000000..106de57f63628 --- /dev/null +++ b/synchronous-replication.md @@ -0,0 +1,93 @@ +--- +title: Synchronous Replication for Dual Data Centers +summary: Learn how to configure synchronous replication. +--- + +# Synchronous Replication for Dual Data Centers + +This document introduces how to configure synchronous replication for dual data centers. + +> **Warning:** +> +> Synchronous replication is still an experimental feature. Do not use it in a production environment. + +In the scenario of dual data centers, one is the primary center and the other is the DR (data recovery) center. When a Region has an odd number of replicas, more replicas are placed in the primary center. When the DR center is down for more than a specified period of time, the asynchronous mode is used by default for the replication between two centers. In this situation, the primary center will serve requests on its own. + +## Enable synchronous replication in PD configuration file + +The replication mode is controlled by PD. You can configure in the PD configuration file when deploying a cluster. See the following example: + +{{< copyable "" >}} + +```toml +[replication-mode] +replication-mode = "dr-auto-sync" +[replication-mode.dr-auto-sync] +label-key = "zone" +primary = "z1" +dr = "z2" +primary-replicas = 2 +dr-replicas = 1 +wait-store-timeout = "1m" +wait-sync-timeout = "1m" +``` + +In the configuration above: + ++ `dr-auto-sync` is the mode to enable synchronous replication. 
++ The label key `zone` is used to distinguish different data centers. ++ TiKV instances with the `"z1"` value are considered the primary data center, and TiKV instances with `"z2"` are the DR data center. ++ `primary-replicas` is the number of replicas that should be placed in the primary data center. ++ `dr-replicas` is the number of replicas that should be placed in the DR data center. ++ `wait-store-timeout` is the time to wait before falling back to asynchronous replication. ++ `wait-sync-timeout` is the time to wait before forcing TiKV to change replication mode (currently not supported). + +To check the current replication state of the cluster, use the following URL: + +{{< copyable "shell-regular" >}} + +```bash +% curl http://pd_ip:pd_port/pd/api/v1/replication_mode/status +``` + +```bash +{ + "mode": "dr-auto-sync", + "dr-auto-sync": { + "label-key": "zone", + "state": "sync" + } +} +``` + +> **Note:** +> +> The replication mode indicates how a single Region is replicated, either `asynchronous` or `synchronous`. The replication state of the cluster indicates how all Regions are replicated, with the options of `async`, `sync-recover`, and `sync`. + +After the cluster state becomes `sync`, it will not become `async` unless the number of down instances is larger than the specified number of replicas in either data center. Once the cluster state becomes `async`, PD requests TiKV to change the replication mode to `asynchronous` and checks whether TiKV instances are recovered from time to time. When the number of down instances is smaller than the number of replicas in both data centers, the cluster enters the `sync-recover` state, and then requests TiKV to change the replication mode to `synchronous`. After all Regions become `synchronous`, the cluster becomes `sync` again. + +## Change replication mode manually + +You can use [`pd-ctl`](/pd-control.md) to change a cluster from `asynchronous` to `synchronous`. 
+ +{{< copyable "shell-regular" >}} + +```bash +>> config set replication-mode dr-auto-sync +``` + +Or change back to `asynchronous`: + +{{< copyable "shell-regular" >}} + +```bash +>> config set replication-mode majority +``` + +You can also update the label key: + +{{< copyable "shell-regular" >}} + +```bash +>> config set replication-mode dr-auto-sync label-key dc +``` From d64edde2020e9d3fbeb668650c7a924147d40258 Mon Sep 17 00:00:00 2001 From: TomShawn <41534398+TomShawn@users.noreply.github.com> Date: Wed, 11 Nov 2020 17:00:37 +0800 Subject: [PATCH 2/7] add a sentence to connect sections --- synchronous-replication.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/synchronous-replication.md b/synchronous-replication.md index 106de57f63628..c047e4e59a60c 100644 --- a/synchronous-replication.md +++ b/synchronous-replication.md @@ -11,7 +11,9 @@ This document introduces how to configure synchronous replication for dual data > > Synchronous replication is still an experimental feature. Do not use it in a production environment. -In the scenario of dual data centers, one is the primary center and the other is the DR (data recovery) center. When a Region has an odd number of replicas, more replicas are placed in the primary center. When the DR center is down for more than a specified period of time, the asynchronous mode is used by default for the replication between two centers. In this situation, the primary center will serve requests on its own. +In the scenario of dual data centers, one is the primary center and the other is the DR (data recovery) center. When a Region has an odd number of replicas, more replicas are placed in the primary center. When the DR center is down for more than a specified period of time, the asynchronous mode is used by default for the replication between two centers. + +To use the synchronous mode, you can configure in the PD configuration file or change the replication mode manually using pd-ctl. 
## Enable synchronous replication in PD configuration file From 38a98daaf20f68b0ed41644e3cf2edbd61ba91ba Mon Sep 17 00:00:00 2001 From: TomShawn <41534398+TomShawn@users.noreply.github.com> Date: Mon, 16 Nov 2020 10:56:16 +0800 Subject: [PATCH 3/7] address comment from Jay --- pd-configuration-file.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pd-configuration-file.md b/pd-configuration-file.md index 8b776d86bc795..b2de52b6d1b97 100644 --- a/pd-configuration-file.md +++ b/pd-configuration-file.md @@ -378,4 +378,4 @@ Configuration items related to the [TiDB Dashboard](/dashboard/dashboard-intro.m ## `replication-mode` -Configuration items related to the replication mode of a single Region. See [Enable synchronous replication in PD configuration file](/synchronous-replication.md#enable-synchronous-replication-in-pd-configuration-file) for details. +Configuration items related to the replication mode of all Regions. See [Enable synchronous replication in PD configuration file](/synchronous-replication.md#enable-synchronous-replication-in-pd-configuration-file) for details. From 49b7ea9c34816d07cc57de2c2a20fc0adc28541c Mon Sep 17 00:00:00 2001 From: TomShawn <41534398+TomShawn@users.noreply.github.com> Date: Tue, 17 Nov 2020 17:48:08 +0800 Subject: [PATCH 4/7] Apply suggestions from code review Co-authored-by: Lilian Lee --- pd-control.md | 2 +- synchronous-replication.md | 14 +++++++------- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/pd-control.md b/pd-control.md index c42d3203ba20e..8d49eff1d4bfc 100644 --- a/pd-control.md +++ b/pd-control.md @@ -295,7 +295,7 @@ Usage: config set cluster-version 1.0.8 // Set the version of the cluster to 1.0.8 ``` -- `replication-mode` controls the replication mode of a single Region in the dual data center scenario. See [Change replication mode manually](/synchronous-replication.md#change-replication-mode-manually) for details. 
+- `replication-mode` controls the replication mode of Regions in the dual data center scenario. See [Change replication mode manually](/synchronous-replication.md#change-the-replication-mode-manually) for details. - `leader-schedule-policy` is used to select the scheduling strategy for the leader. You can schedule the leader according to `size` or `count`. diff --git a/synchronous-replication.md b/synchronous-replication.md index c047e4e59a60c..7e64954e61ed9 100644 --- a/synchronous-replication.md +++ b/synchronous-replication.md @@ -1,6 +1,6 @@ --- title: Synchronous Replication for Dual Data Centers -summary: Learn how to configure synchronous replication. +summary: Learn how to configure synchronous replication for dual data centers. --- # Synchronous Replication for Dual Data Centers @@ -13,11 +13,11 @@ This document introduces how to configure synchronous replication for dual data In the scenario of dual data centers, one is the primary center and the other is the DR (data recovery) center. When a Region has an odd number of replicas, more replicas are placed in the primary center. When the DR center is down for more than a specified period of time, the asynchronous mode is used by default for the replication between two centers. -To use the synchronous mode, you can configure in the PD configuration file or change the replication mode manually using pd-ctl. +To use the synchronous mode, you can configure it in the PD configuration file or change the replication mode manually using pd-ctl. -## Enable synchronous replication in PD configuration file +## Enable synchronous replication in the PD configuration file -The replication mode is controlled by PD. You can configure in the PD configuration file when deploying a cluster. See the following example: +The replication mode is controlled by PD. You can configure it in the PD configuration file when deploying a cluster. 
See the following example: {{< copyable "" >}} @@ -38,11 +38,11 @@ In the configuration above: + `dr-auto-sync` is the mode to enable synchronous replication. + The label key `zone` is used to distinguish different data centers. -+ TiKV instances with the `"z1"` value are considered the primary data center, and TiKV instances with `"z2"` are the DR data center. ++ TiKV instances with the `"z1"` value are considered in the primary data center, and TiKV instances with `"z2"` are in the DR data center. + `primary-replicas` is the number of replicas that should be placed in the primary data center. + `dr-replicas` is the number of replicas that should be placed in the DR data center. + `wait-store-timeout` is the time to wait before falling back to asynchronous replication. -+ `wait-sync-timeout` is the time to wait before forcing TiKV to change replication mode (currently not supported). ++ `wait-sync-timeout` is the time to wait before forcing TiKV to change the replication mode (currently not supported). To check the current replication state of the cluster, use the following URL: @@ -68,7 +68,7 @@ To check the current replication state of the cluster, use the following URL: After the cluster state becomes `sync`, it will not become `async` unless the number of down instances is larger than the specified number of replicas in either data center. Once the cluster state becomes `async`, PD requests TiKV to change the replication mode to `asynchronous` and checks whether TiKV instances are recovered from time to time. When the number of down instances is smaller than the number of replicas in both data centers, the cluster enters the `sync-recover` state, and then requests TiKV to change the replication mode to `synchronous`. After all Regions become `synchronous`, the cluster becomes `sync` again. 
-## Change replication mode manually +## Change the replication mode manually You can use [`pd-ctl`](/pd-control.md) to change a cluster from `asynchronous` to `synchronous`. From 1a77774df414b5caff9279f47489e281abec8c8a Mon Sep 17 00:00:00 2001 From: TomShawn <41534398+TomShawn@users.noreply.github.com> Date: Tue, 17 Nov 2020 19:45:32 +0800 Subject: [PATCH 5/7] Update pd-configuration-file.md --- pd-configuration-file.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pd-configuration-file.md b/pd-configuration-file.md index b2de52b6d1b97..3d1f2cc9aaa13 100644 --- a/pd-configuration-file.md +++ b/pd-configuration-file.md @@ -378,4 +378,4 @@ Configuration items related to the [TiDB Dashboard](/dashboard/dashboard-intro.m ## `replication-mode` -Configuration items related to the replication mode of all Regions. See [Enable synchronous replication in PD configuration file](/synchronous-replication.md#enable-synchronous-replication-in-pd-configuration-file) for details. +Configuration items related to the replication mode of all Regions. See [Enable synchronous replication in PD configuration file](/synchronous-replication.md#enable-synchronous-replication-in-the-pd-configuration-file) for details. From 681da0f95634d825bc0a409509151f67fbb4d0a8 Mon Sep 17 00:00:00 2001 From: TomShawn <41534398+TomShawn@users.noreply.github.com> Date: Thu, 19 Nov 2020 13:44:19 +0800 Subject: [PATCH 6/7] Update synchronous-replication.md --- synchronous-replication.md | 1 - 1 file changed, 1 deletion(-) diff --git a/synchronous-replication.md b/synchronous-replication.md index 7e64954e61ed9..b2bc60f53e5b6 100644 --- a/synchronous-replication.md +++ b/synchronous-replication.md @@ -42,7 +42,6 @@ In the configuration above: + `primary-replicas` is the number of replicas that should be placed in the primary data center. + `dr-replicas` is the number of replicas that should be placed in the DR data center. 
+ `wait-store-timeout` is the time to wait before falling back to asynchronous replication. -+ `wait-sync-timeout` is the time to wait before forcing TiKV to change the replication mode (currently not supported). To check the current replication state of the cluster, use the following URL: From 1be20b3e9581aed566b666c63e0ee985f3ad6728 Mon Sep 17 00:00:00 2001 From: TomShawn <41534398+TomShawn@users.noreply.github.com> Date: Fri, 20 Nov 2020 16:01:26 +0800 Subject: [PATCH 7/7] delete a sentence --- synchronous-replication.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/synchronous-replication.md b/synchronous-replication.md index b2bc60f53e5b6..16b51c2a31b3b 100644 --- a/synchronous-replication.md +++ b/synchronous-replication.md @@ -63,7 +63,7 @@ To check the current replication state of the cluster, use the following URL: > **Note:** > -> The replication mode indicates how a single Region is replicated, either `asynchronous` or `synchronous`. The replication state of the cluster indicates how all Regions are replicated, with the options of `async`, `sync-recover`, and `sync`. +> The replication state of the cluster indicates how all Regions are replicated, with the options of `async`, `sync-recover`, and `sync`. After the cluster state becomes `sync`, it will not become `async` unless the number of down instances is larger than the specified number of replicas in either data center. Once the cluster state becomes `async`, PD requests TiKV to change the replication mode to `asynchronous` and checks whether TiKV instances are recovered from time to time. When the number of down instances is smaller than the number of replicas in both data centers, the cluster enters the `sync-recover` state, and then requests TiKV to change the replication mode to `synchronous`. After all Regions become `synchronous`, the cluster becomes `sync` again.
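The state transitions described in the paragraph above can be sketched as a small function. This is only an illustrative model of the documented behavior, not PD's implementation: the names `DataCenter` and `next_state` and the `all_regions_synchronous` flag are hypothetical, the comparisons follow the wording of the text literally ("larger than" and "smaller than"), and the `wait-store-timeout` delay is not modelled. The text does not say whether `sync-recover` can fall back to `async`, so the sketch leaves that transition out.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    """Hypothetical view of one data center as seen by the scheduler."""
    replicas: int  # configured primary-replicas or dr-replicas
    down: int      # number of TiKV instances currently down


def next_state(state: str, primary: DataCenter, dr: DataCenter,
               all_regions_synchronous: bool) -> str:
    """Return the next cluster replication state, per the text above."""
    centers = (primary, dr)
    if state == "sync":
        # Leave `sync` only when either data center has more down
        # instances than its configured number of replicas.
        if any(dc.down > dc.replicas for dc in centers):
            return "async"
        return "sync"
    if state == "async":
        # Start recovering once both data centers have fewer down
        # instances than their configured number of replicas.
        if all(dc.down < dc.replicas for dc in centers):
            return "sync-recover"
        return "async"
    if state == "sync-recover":
        # Back to `sync` only after every Region is synchronous again.
        return "sync" if all_regions_synchronous else "sync-recover"
    raise ValueError(f"unknown replication state: {state}")
```

With the example configuration (`primary-replicas = 2`, `dr-replicas = 1`), calling `next_state("sync", DataCenter(2, 0), DataCenter(1, 2), False)` models a DR center with two down instances and moves the cluster out of `sync`.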