diff --git a/TOC.md b/TOC.md index 458251a7ce2f0..0a1e94794805b 100644 --- a/TOC.md +++ b/TOC.md @@ -398,7 +398,13 @@ + Maintain - [Destroy a TiDB cluster](/tidb-in-kubernetes/maintain/destroy-tidb-cluster.md) - [Maintain a Hosting Kubernetes Node](/tidb-in-kubernetes/maintain/kubernetes-node.md) - - [Backup and Restore](/tidb-in-kubernetes/maintain/backup-and-restore.md) + + Backup and Restore + - [Helm Charts-Based Backup and Restoration](/tidb-in-kubernetes/maintain/backup-and-restore/charts.md) + + CRD-Based Backup and Restoration + - [Back up TiDB Cluster Data to GCS](/tidb-in-kubernetes/maintain/backup-and-restore/backup-gcs.md) + - [Restore Data From GCS](/tidb-in-kubernetes/maintain/backup-and-restore/restore-gcs.md) + - [Back up TiDB Cluster Data to S3-Compatible Storage](/tidb-in-kubernetes/maintain/backup-and-restore/backup-s3.md) + - [Restore Data From S3-Compatible Storage](/tidb-in-kubernetes/maintain/backup-and-restore/restore-s3.md) - [Restore Data with TiDB Lightning](/tidb-in-kubernetes/maintain/lightning.md) - [Collect Logs](/tidb-in-kubernetes/maintain/log-collecting.md) - [Automatic Failover](/tidb-in-kubernetes/maintain/auto-failover.md) diff --git a/tidb-in-kubernetes/maintain/backup-and-restore/backup-gcs.md b/tidb-in-kubernetes/maintain/backup-and-restore/backup-gcs.md new file mode 100644 index 0000000000000..b39c216593d5c --- /dev/null +++ b/tidb-in-kubernetes/maintain/backup-and-restore/backup-gcs.md @@ -0,0 +1,204 @@ +--- +title: Back up TiDB Cluster Data to GCS +summary: Learn how to back up the TiDB cluster to GCS. +category: how-to +--- + +# Back up TiDB Cluster Data to GCS + +This document describes how to back up the data of the TiDB cluster in Kubernetes to [Google Cloud Storage (GCS)](https://cloud.google.com/storage/docs/). "Backup" in this document refers to full backup (ad-hoc full backup and scheduled full backup). 
For the underlying implementation, [`mydumper`](/reference/tools/mydumper.md) is used to get the logical backup of the TiDB cluster, and then this backup data is sent to the remote GCS. + +The backup method described in this document is implemented based on CustomResourceDefinition (CRD) in TiDB Operator v1.1 or later versions. For the backup method implemented based on Helm Charts, refer to [Back up and Restore TiDB Cluster Data Based on Helm Charts](/tidb-in-kubernetes/maintain/backup-and-restore/charts.md). + +## Ad-hoc full backup to GCS + +An ad-hoc full backup is performed by creating a `Backup` custom resource (CR) object that describes the backup. TiDB Operator performs the specific backup operation based on this `Backup` object. If an error occurs during the backup process, TiDB Operator does not retry and you need to handle this error manually. + +To better explain how to perform the backup operation, this document shows an example in which the data of the `demo1` TiDB cluster in the `test1` Kubernetes namespace is backed up to GCS. + +### Prerequisites for ad-hoc backup + +1. Download [backup-rbac.yaml](https://github.com/pingcap/tidb-operator/blob/master/manifests/backup/backup-rbac.yaml) and execute the following command to create the role-based access control (RBAC) resources in the `test1` namespace: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl apply -f backup-rbac.yaml -n test1 + ``` + +2. Create the `gcs-secret` secret which stores the credential used to access GCS. The `google-credentials.json` file stores the service account key that you have downloaded from the GCP console. Refer to [GCP documentation](https://cloud.google.com/docs/authentication/getting-started) for details. + + {{< copyable "shell-regular" >}} + + ```shell + kubectl create secret generic gcs-secret --from-file=credentials=./google-credentials.json -n test1 + ``` + +3. 
Create the `backup-demo1-tidb-secret` secret which stores the root account and password needed to access the TiDB cluster: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl create secret generic backup-demo1-tidb-secret --from-literal=password= --namespace=test1 + ``` + +### Ad-hoc backup process + +Create the `Backup` CR and back up data to GCS: + +{{< copyable "shell-regular" >}} + +```shell +kubectl apply -f backup-gcs.yaml +``` + +The `backup-gcs.yaml` file has the following content: + +```yaml +--- +apiVersion: pingcap.com/v1alpha1 +kind: Backup +metadata: + name: demo1-backup-gcs + namespace: test1 +spec: + from: + host: + port: + user: + secretName: backup-demo1-tidb-secret + gcs: + secretName: gcs-secret + projectId: + # location: us-east1 + # storageClass: STANDARD_IA + # objectAcl: private + # bucketAcl: private + storageClassName: local-storage + storageSize: 10Gi +``` + +In the above example, all data of the TiDB cluster is exported and backed up to GCS. The `location`, `objectAcl`, `bucketAcl`, and `storageClass` items in the GCS configuration are optional and can be omitted. + +`projectId` in the configuration is the unique identifier of the user project on GCP. To learn how to get this identifier, refer to the [GCP documentation](https://cloud.google.com/resource-manager/docs/creating-managing-projects). + +GCS supports the following `storageClass` types: + +* `MULTI_REGIONAL` +* `REGIONAL` +* `NEARLINE` +* `COLDLINE` +* `DURABLE_REDUCED_AVAILABILITY` + +If `storageClass` is not configured, `COLDLINE` is used by default. For the detailed description of these storage types, refer to [GCS documentation](https://cloud.google.com/storage/docs/storage-classes). + +GCS supports the following object access-control list (ACL) policies: + +* `authenticatedRead` +* `bucketOwnerFullControl` +* `bucketOwnerRead` +* `private` +* `projectPrivate` +* `publicRead` + +If the object ACL policy is not configured, the `private` policy is used by default. 
For the detailed description of these access control policies, refer to [GCS documentation](https://cloud.google.com/storage/docs/access-control/lists). + +GCS supports the following bucket ACL policies: + +* `authenticatedRead` +* `private` +* `projectPrivate` +* `publicRead` +* `publicReadWrite` + +If the bucket ACL policy is not configured, the `private` policy is used by default. For the detailed description of these access control policies, refer to [GCS documentation](https://cloud.google.com/storage/docs/access-control/lists). + +After creating the `Backup` CR, execute the following command to check the backup status: + +{{< copyable "shell-regular" >}} + + ```shell + kubectl get bk -n test1 -owide + ``` + +More `Backup` CR configurations are described as follows: + +* `.metadata.namespace`: the namespace where the `Backup` CR is located. +* `.spec.from.host`: the address of the TiDB cluster to be backed up. +* `.spec.from.port`: the port of the TiDB cluster to be backed up. +* `.spec.from.user`: the user for accessing the TiDB cluster to be backed up. +* `.spec.from.secretName`: the secret that stores the credential needed to access the TiDB cluster to be backed up. +* `.spec.storageClassName`: the persistent volume (PV) type specified for the backup operation. If this item is not specified, the value of the `default-backup-storage-class-name` parameter (`standard` by default, specified when TiDB Operator is started) is used. +* `.spec.storageSize`: the PV size specified for the backup operation. This value must be greater than the size of the TiDB cluster to be backed up. + +## Scheduled full backup to GCS + +You can set a backup policy to perform scheduled backups of the TiDB cluster, and set a backup retention policy to avoid excessive backup items. A scheduled full backup is described by a custom `BackupSchedule` CR object. A full backup is triggered at each backup time point. Its underlying implementation is the ad-hoc full backup. 
+ +### Prerequisites for scheduled backup + +The prerequisites for the scheduled backup are the same as the [prerequisites for ad-hoc backup](#prerequisites-for-ad-hoc-backup). + +### Scheduled backup process + +Create the `BackupSchedule` CR to enable the scheduled full backup to GCS: + +{{< copyable "shell-regular" >}} + +```shell +kubectl apply -f backup-schedule-gcs.yaml +``` + +The `backup-schedule-gcs.yaml` file has the following content: + +```yaml +--- +apiVersion: pingcap.com/v1alpha1 +kind: BackupSchedule +metadata: + name: demo1-backup-schedule-gcs + namespace: test1 +spec: + #maxBackups: 5 + #pause: true + maxReservedTime: "3h" + schedule: "*/2 * * * *" + backupTemplate: + from: + host: + port: + user: + secretName: backup-demo1-tidb-secret + gcs: + secretName: gcs-secret + projectId: + # location: us-east1 + # storageClass: STANDARD_IA + # objectAcl: private + # bucketAcl: private + storageClassName: local-storage + storageSize: 10Gi +``` + +After creating the scheduled full backup, use the following command to check the backup status: + +{{< copyable "shell-regular" >}} + +```shell +kubectl get bks -n test1 -owide +``` + +Execute the following command to check all the backup items: + +{{< copyable "shell-regular" >}} + + ```shell + kubectl get bk -l tidb.pingcap.com/backup-schedule=demo1-backup-schedule-gcs -n test1 + ``` + +From the above example, you can see that the `backupSchedule` configuration consists of two parts. One is the unique configuration of `backupSchedule` and the other is `backupTemplate`. `backupTemplate` specifies the configuration related to the GCS storage, which is the same as the configuration of the ad-hoc full backup to GCS (refer to [Ad-hoc backup process](#ad-hoc-backup-process) for details). The following are the unique configuration items of `backupSchedule`: + ++ `.spec.maxBackups`: A backup retention policy, which determines the maximum number of backup items to be retained. 
When this value is exceeded, the outdated backup items will be deleted. If you set this configuration item to `0`, all backup items are retained. ++ `.spec.maxReservedTime`: A backup retention policy based on time. For example, if you set the value of this configuration to `24h`, only the backup items from the most recent 24 hours are retained. All older backup items are deleted. For the time format, refer to [`func ParseDuration`](https://golang.org/pkg/time/#ParseDuration). If you have set the maximum number of backup items and the longest retention time of backup items at the same time, the latter setting takes effect. ++ `.spec.schedule`: The backup schedule in Cron format. Refer to [Cron](https://en.wikipedia.org/wiki/Cron) for details. ++ `.spec.pause`: `false` by default. If this parameter is set to `true`, the scheduled backup is paused. In this situation, the backup operation will not be performed even if the scheduling time is reached. During this pause, the backup [Garbage Collection](/reference/garbage-collection/overview.md) (GC) runs normally. If you change `true` to `false`, the full backup process is restarted. diff --git a/tidb-in-kubernetes/maintain/backup-and-restore/backup-s3.md b/tidb-in-kubernetes/maintain/backup-and-restore/backup-s3.md new file mode 100644 index 0000000000000..92dbcc9e550ce --- /dev/null +++ b/tidb-in-kubernetes/maintain/backup-and-restore/backup-s3.md @@ -0,0 +1,271 @@ +--- +title: Back up TiDB Cluster Data to S3-Compatible Storage +summary: Learn how to back up the TiDB cluster to the S3-compatible storage. +category: how-to +--- + +# Back up TiDB Cluster Data to S3-Compatible Storage + +This document describes how to back up the data of the TiDB cluster in Kubernetes to the S3-compatible storage. "Backup" in this document refers to full backup (ad-hoc full backup and scheduled full backup). 
For the underlying implementation, [`mydumper`](/reference/tools/mydumper.md) is used to get the logical backup of the TiDB cluster, and then this backup data is sent to the S3-compatible storage. + +The backup method described in this document is implemented based on CustomResourceDefinition (CRD) in TiDB Operator v1.1 or later versions. For the backup method implemented based on Helm Charts, refer to [Back up and Restore TiDB Cluster Data Based on Helm Charts](/tidb-in-kubernetes/maintain/backup-and-restore/charts.md). + +## Ad-hoc full backup + +An ad-hoc full backup is performed by creating a `Backup` custom resource (CR) object that describes the backup. TiDB Operator performs the specific backup operation based on this `Backup` object. If an error occurs during the backup process, TiDB Operator does not retry and you need to handle this error manually. + +Among the current S3-compatible storage types, Ceph and Amazon S3 have been tested and work normally. Therefore, this document shows examples in which the data of the `demo1` TiDB cluster in the `test1` Kubernetes namespace is backed up to Ceph and Amazon S3 respectively. + +### Prerequisites for ad-hoc backup + +1. Download [backup-rbac.yaml](https://github.com/pingcap/tidb-operator/blob/master/manifests/backup/backup-rbac.yaml) and execute the following command to create the role-based access control (RBAC) resources in the `test1` namespace: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl apply -f backup-rbac.yaml -n test1 + ``` + +2. Create the `s3-secret` secret which stores the credential used to access the S3-compatible storage: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl create secret generic s3-secret --from-literal=access_key=xxx --from-literal=secret_key=yyy --namespace=test1 + ``` + +3. 
Create the `backup-demo1-tidb-secret` secret which stores the root account and password needed to access the TiDB cluster: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl create secret generic backup-demo1-tidb-secret --from-literal=password= --namespace=test1 + ``` + +### Ad-hoc backup process + ++ Create the `Backup` CR and back up data to Amazon S3: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl apply -f backup-s3.yaml + ``` + + The `backup-s3.yaml` file has the following content: + + ```yaml + --- + apiVersion: pingcap.com/v1alpha1 + kind: Backup + metadata: + name: demo1-backup-s3 + namespace: test1 + spec: + from: + host: + port: + user: + secretName: backup-demo1-tidb-secret + s3: + provider: aws + secretName: s3-secret + # region: us-east-1 + # storageClass: STANDARD_IA + # acl: private + # endpoint: + storageClassName: local-storage + storageSize: 10Gi + ``` + ++ Create the `Backup` CR and back up data to Ceph: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl apply -f backup-s3.yaml + ``` + + The `backup-s3.yaml` file has the following content: + + ```yaml + --- + apiVersion: pingcap.com/v1alpha1 + kind: Backup + metadata: + name: demo1-backup-s3 + namespace: test1 + spec: + from: + host: + port: + user: + secretName: backup-demo1-tidb-secret + s3: + provider: ceph + secretName: s3-secret + endpoint: http://10.0.0.1:30074 + storageClassName: local-storage + storageSize: 10Gi + ``` + +In the above two examples, all data of the TiDB cluster is exported and backed up to Amazon S3 and Ceph respectively. You can ignore the `region`, `acl`, `endpoint`, and `storageClass` configuration items in the Amazon S3 configuration. S3-compatible storage types other than Amazon S3 can also use configuration similar to that of Amazon S3. You can also leave the configuration item fields empty if you do not need to configure these items as shown in the above Ceph configuration. 
+ +Amazon S3 supports the following access-control list (ACL) policies: + +* `private` +* `public-read` +* `public-read-write` +* `authenticated-read` +* `bucket-owner-read` +* `bucket-owner-full-control` + +If the ACL policy is not configured, the `private` policy is used by default. For the detailed description of these access control policies, refer to [AWS documentation](https://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html). + +Amazon S3 supports the following `storageClass` types: + +* `STANDARD` +* `REDUCED_REDUNDANCY` +* `STANDARD_IA` +* `ONEZONE_IA` +* `GLACIER` +* `DEEP_ARCHIVE` + +If `storageClass` is not configured, `STANDARD_IA` is used by default. For the detailed description of these storage types, refer to [AWS documentation](https://docs.aws.amazon.com/AmazonS3/latest/dev/storage-class-intro.html). + +After creating the `Backup` CR, use the following command to check the backup status: + +{{< copyable "shell-regular" >}} + + ```shell + kubectl get bk -n test1 -owide + ``` + +More `Backup` CR configurations are described as follows: + +* `.metadata.namespace`: the namespace where the `Backup` CR is located. +* `.spec.from.host`: the address of the TiDB cluster to be backed up. +* `.spec.from.port`: the port of the TiDB cluster to be backed up. +* `.spec.from.user`: the user for accessing the TiDB cluster to be backed up. +* `.spec.from.secretName`: the secret that stores the credential needed to access the TiDB cluster to be backed up. +* `.spec.storageClassName`: the persistent volume (PV) type specified for the backup operation. If this item is not specified, the value of the `default-backup-storage-class-name` parameter (`standard` by default, specified when TiDB Operator is started) is used. +* `.spec.storageSize`: the PV size specified for the backup operation. This value must be greater than the size of the TiDB cluster to be backed up. 
+ +More S3-compatible `provider`s are described as follows: + +* `alibaba`: Alibaba Cloud Object Storage System (OSS), formerly Aliyun +* `digitalocean`: Digital Ocean Spaces +* `dreamhost`: Dreamhost DreamObjects +* `ibmcos`: IBM COS S3 +* `minio`: Minio Object Storage +* `netease`: Netease Object Storage (NOS) +* `wasabi`: Wasabi Object Storage +* `other`: Any other S3-compatible provider + +## Scheduled full backup to S3-compatible storage + +You can set a backup policy to perform scheduled backups of the TiDB cluster, and set a backup retention policy to avoid excessive backup items. A scheduled full backup is described by a custom `BackupSchedule` CR object. A full backup is triggered at each backup time point. Its underlying implementation is the ad-hoc full backup. + +### Prerequisites for scheduled backup + +The prerequisites for the scheduled backup are the same as the [prerequisites for ad-hoc backup](#prerequisites-for-ad-hoc-backup). + +### Scheduled backup process + ++ Create the `BackupSchedule` CR to enable the scheduled full backup to Amazon S3: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl apply -f backup-schedule-s3.yaml + ``` + + The `backup-schedule-s3.yaml` file has the following content: + + ```yaml + --- + apiVersion: pingcap.com/v1alpha1 + kind: BackupSchedule + metadata: + name: demo1-backup-schedule-s3 + namespace: test1 + spec: + #maxBackups: 5 + #pause: true + maxReservedTime: "3h" + schedule: "*/2 * * * *" + backupTemplate: + from: + host: + port: + user: + secretName: backup-demo1-tidb-secret + s3: + provider: aws + secretName: s3-secret + # region: us-east-1 + # storageClass: STANDARD_IA + # acl: private + # endpoint: + storageClassName: local-storage + storageSize: 10Gi + ``` + ++ Create the `BackupSchedule` CR to enable the scheduled full backup to Ceph: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl apply -f backup-schedule-s3.yaml + ``` + + The `backup-schedule-s3.yaml` file has the following content: + + ```yaml + 
--- + apiVersion: pingcap.com/v1alpha1 + kind: BackupSchedule + metadata: + name: demo1-backup-schedule-ceph + namespace: test1 + spec: + #maxBackups: 5 + #pause: true + maxReservedTime: "3h" + schedule: "*/2 * * * *" + backupTemplate: + from: + host: + port: + user: + secretName: backup-demo1-tidb-secret + s3: + provider: ceph + secretName: s3-secret + endpoint: http://10.0.0.1:30074 + storageClassName: local-storage + storageSize: 10Gi + ``` + +After creating the scheduled full backup, use the following command to check the backup status: + +{{< copyable "shell-regular" >}} + +```shell +kubectl get bks -n test1 -owide +``` + +Execute the following command to check all the backup items: + +{{< copyable "shell-regular" >}} + +```shell +kubectl get bk -l tidb.pingcap.com/backup-schedule=demo1-backup-schedule-s3 -n test1 +``` + +From the above two examples, you can see that the `backupSchedule` configuration consists of two parts. One is the unique configuration of `backupSchedule` and the other is `backupTemplate`. `backupTemplate` specifies the configuration related to the S3-compatible storage, which is the same as the configuration of the ad-hoc full backup to the S3-compatible storage (refer to [Ad-hoc backup process](#ad-hoc-backup-process) for details). The following are the unique configuration items of `backupSchedule`: + ++ `.spec.maxBackups`: A backup retention policy, which determines the maximum number of backup items to be retained. When this value is exceeded, the outdated backup items will be deleted. If you set this configuration item to `0`, all backup items are retained. ++ `.spec.maxReservedTime`: A backup retention policy based on time. For example, if you set the value of this configuration to `24h`, only the backup items from the most recent 24 hours are retained. All older backup items are deleted. For the time format, refer to [`func ParseDuration`](https://golang.org/pkg/time/#ParseDuration). 
If you have set the maximum number of backup items and the longest retention time of backup items at the same time, the latter setting takes effect. ++ `.spec.schedule`: The backup schedule in Cron format. Refer to [Cron](https://en.wikipedia.org/wiki/Cron) for details. ++ `.spec.pause`: `false` by default. If this parameter is set to `true`, the scheduled backup is paused. In this situation, the backup operation will not be performed even if the scheduling time is reached. During this pause, the backup [Garbage Collection](/reference/garbage-collection/overview.md) (GC) runs normally. If you change `true` to `false`, the full backup process is restarted. diff --git a/tidb-in-kubernetes/maintain/backup-and-restore.md b/tidb-in-kubernetes/maintain/backup-and-restore/charts.md similarity index 90% rename from tidb-in-kubernetes/maintain/backup-and-restore.md rename to tidb-in-kubernetes/maintain/backup-and-restore/charts.md index 0814ea80a718a..51f54a771fed7 100644 --- a/tidb-in-kubernetes/maintain/backup-and-restore.md +++ b/tidb-in-kubernetes/maintain/backup-and-restore/charts.md @@ -1,12 +1,20 @@ --- -title: Backup and Restore -summary: Learn how to back up and restore the data of TiDB cluster in Kubernetes. +title: Helm Charts-Based Backup and Restoration in Kubernetes +summary: Learn how to back up and restore the data of TiDB cluster in Kubernetes based on Helm Charts. category: how-to +aliases: ['/docs/dev/tidb-in-kubernetes/maintain/backup-and-store/'] --- -# Backup and Restore +# Helm Charts-Based Backup and Restoration in Kubernetes -This document describes how to back up and restore the data of a TiDB cluster in Kubernetes. +This document describes how to back up and restore the data of a TiDB cluster in Kubernetes based on Helm Charts. + +For TiDB Operator 1.1 or later versions, it is recommended that you use the backup and restoration methods based on CustomResourceDefinition (CRD). 
Refer to the following documents for details: + +- [Back up TiDB Cluster Data to GCS](/tidb-in-kubernetes/maintain/backup-and-restore/backup-gcs.md) +- [Restore Data From GCS](/tidb-in-kubernetes/maintain/backup-and-restore/restore-gcs.md) +- [Back up TiDB Cluster Data to S3-Compatible Storage](/tidb-in-kubernetes/maintain/backup-and-restore/backup-s3.md) +- [Restore Data From S3-Compatible Storage](/tidb-in-kubernetes/maintain/backup-and-restore/restore-s3.md) TiDB in Kubernetes supports two kinds of backup strategies: diff --git a/tidb-in-kubernetes/maintain/backup-and-restore/restore-gcs.md b/tidb-in-kubernetes/maintain/backup-and-restore/restore-gcs.md new file mode 100644 index 0000000000000..2205dd381defc --- /dev/null +++ b/tidb-in-kubernetes/maintain/backup-and-restore/restore-gcs.md @@ -0,0 +1,84 @@ +--- +title: Restore Data From GCS +summary: Learn how to restore the backup data from GCS. +category: how-to +--- + +# Restore Data From GCS + +This document describes how to restore the TiDB cluster data backed up using TiDB Operator in Kubernetes. For the underlying implementation, [`loader`](/reference/tools/loader.md) is used to perform the restoration. + +The restoration method described in this document is implemented based on CustomResourceDefinition (CRD) in TiDB Operator v1.1 or later versions. For the restoration method implemented based on Helm Charts, refer to [Back up and Restore TiDB Cluster Data Based on Helm Charts](/tidb-in-kubernetes/maintain/backup-and-restore/charts.md). + +This document shows an example in which the backup data stored in the specified path on [Google Cloud Storage (GCS)](https://cloud.google.com/storage/docs/) is restored to the TiDB cluster. + +## Prerequisites + +1. 
Download [`backup-rbac.yaml`](https://github.com/pingcap/tidb-operator/blob/master/manifests/backup/backup-rbac.yaml) and execute the following command to create the role-based access control (RBAC) resources in the `test2` namespace: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl apply -f backup-rbac.yaml -n test2 + ``` + +2. Create the `restore-demo2-tidb-secret` secret which stores the root account and password needed to access the TiDB cluster: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl create secret generic restore-demo2-tidb-secret --from-literal=user=root --from-literal=password= --namespace=test2 + ``` + +## Restoration process + +1. Create the `Restore` custom resource (CR) and restore the backup data to the TiDB cluster: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl apply -f restore.yaml + ``` + + The `restore.yaml` file has the following content: + + ```yaml + --- + apiVersion: pingcap.com/v1alpha1 + kind: Restore + metadata: + name: demo2-restore + namespace: test2 + spec: + to: + host: + port: + user: + secretName: restore-demo2-tidb-secret + gcs: + projectId: + secretName: gcs-secret + path: gcs:// + storageClassName: local-storage + storageSize: 1Gi + ``` + +2. After creating the `Restore` CR, execute the following command to check the restoration status: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl get rt -n test2 -owide + ``` + +In the above example, the backup data stored in the specified `spec.gcs.path` path on GCS is restored to the `spec.to.host` TiDB cluster. For the configuration of GCS, refer to [backup-gcs.yaml](/tidb-in-kubernetes/maintain/backup-and-restore/backup-gcs.md#ad-hoc-backup-process). + +More `Restore` CR configurations are described as follows: + +* `.metadata.namespace`: the namespace where the `Restore` CR is located. +* `.spec.to.host`: the address of the TiDB cluster to be restored. +* `.spec.to.port`: the port of the TiDB cluster to be restored. 
+* `.spec.to.user`: the user for accessing the TiDB cluster to be restored. +* `.spec.to.secretName`: the secret that stores the credential needed to access the TiDB cluster to be restored. +* `.spec.storageClassName`: the persistent volume (PV) type specified for the restoration. If this item is not specified, the value of the `default-backup-storage-class-name` parameter (`standard` by default, specified when TiDB Operator is started) is used. +* `.spec.storageSize`: the PV size specified for the restoration. This value must be greater than the size of the backed up TiDB cluster. diff --git a/tidb-in-kubernetes/maintain/backup-and-restore/restore-s3.md b/tidb-in-kubernetes/maintain/backup-and-restore/restore-s3.md new file mode 100644 index 0000000000000..b0d2b2bc46970 --- /dev/null +++ b/tidb-in-kubernetes/maintain/backup-and-restore/restore-s3.md @@ -0,0 +1,85 @@ +--- +title: Restore Data From S3-Compatible Storage +summary: Learn how to restore data from the S3-compatible storage. +category: how-to +--- + +# Restore Data From S3-Compatible Storage + +This document describes how to restore the TiDB cluster data backed up using TiDB Operator in Kubernetes. For the underlying implementation, [`loader`](/reference/tools/loader.md) is used to perform the restoration. + +The restoration method described in this document is implemented based on CustomResourceDefinition (CRD) in TiDB Operator v1.1 or later versions. For the restoration method implemented based on Helm Charts, refer to [Back up and Restore TiDB Cluster Data Based on Helm Charts](/tidb-in-kubernetes/maintain/backup-and-restore/charts.md). + +This document shows an example in which the backup data stored in the specified path on the S3-compatible storage is restored to the TiDB cluster. + +## Prerequisites + +1. 
Download [`backup-rbac.yaml`](https://github.com/pingcap/tidb-operator/blob/master/manifests/backup/backup-rbac.yaml) and execute the following command to create the role-based access control (RBAC) resources in the `test2` namespace: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl apply -f backup-rbac.yaml -n test2 + ``` + +2. Create the `restore-demo2-tidb-secret` secret which stores the root account and password needed to access the TiDB cluster: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl create secret generic restore-demo2-tidb-secret --from-literal=user=root --from-literal=password= --namespace=test2 + ``` + +## Restoration process + +1. Create the `Restore` custom resource (CR) and restore the backup data to the TiDB cluster: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl apply -f restore.yaml + ``` + + The `restore.yaml` file has the following content: + + ```yaml + --- + apiVersion: pingcap.com/v1alpha1 + kind: Restore + metadata: + name: demo2-restore + namespace: test2 + spec: + to: + host: + port: + user: + secretName: restore-demo2-tidb-secret + s3: + provider: ceph + endpoint: http://10.233.2.161 + secretName: ceph-secret + path: s3:// + storageClassName: local-storage + storageSize: 1Gi + ``` + +2. After creating the `Restore` CR, execute the following command to check the restoration status: + + {{< copyable "shell-regular" >}} + + ```shell + kubectl get rt -n test2 -owide + ``` + +In the above example, the backup data stored in the `spec.s3.path` path on the S3-compatible storage is restored to the `spec.to.host` TiDB cluster. For the configuration of the S3-compatible storage, refer to [backup-s3.yaml](/tidb-in-kubernetes/maintain/backup-and-restore/backup-s3.md#ad-hoc-backup-process). + +More `Restore` CR configurations are described as follows: + +* `.metadata.namespace`: the namespace where the `Restore` CR is located. +* `.spec.to.host`: the address of the TiDB cluster to be restored. 
+* `.spec.to.port`: the port of the TiDB cluster to be restored. +* `.spec.to.user`: the user for accessing the TiDB cluster to be restored. +* `.spec.to.secretName`: the secret that stores the credential needed to access the TiDB cluster to be restored. +* `.spec.storageClassName`: the persistent volume (PV) type specified for the restoration. If this item is not specified, the value of the `default-backup-storage-class-name` parameter (`standard` by default, specified when TiDB Operator is started) is used. +* `.spec.storageSize`: the PV size specified for the restoration. This value must be greater than the size of the backed up TiDB cluster. diff --git a/tidb-in-kubernetes/reference/configuration/storage-class.md b/tidb-in-kubernetes/reference/configuration/storage-class.md index 1d435b40c03d5..9c1d00fe4188b 100644 --- a/tidb-in-kubernetes/reference/configuration/storage-class.md +++ b/tidb-in-kubernetes/reference/configuration/storage-class.md @@ -114,7 +114,7 @@ If the components such as monitoring, TiDB Binlog, and `tidb-backup` use local d >**Note:** > - > The number of directories you create depends on the planned number of TiDB clusters, the number of Pumps in each cluster, and your backup method. For each directory, a corresponding PV will be created. Each Pump uses one PV and each Drainer uses one PV. Each [Ad-hoc full backup](/tidb-in-kubernetes/maintain/backup-and-restore.md#ad-hoc-full-backup) task uses one PV, and all [scheduled full backup](/tidb-in-kubernetes/maintain/backup-and-restore.md#scheduled-full-backup) tasks share one PV. + > The number of directories you create depends on the planned number of TiDB clusters, the number of Pumps in each cluster, and your backup method. For each directory, a corresponding PV will be created. Each Pump uses one PV and each Drainer uses one PV. 
Each [Ad-hoc full backup](/tidb-in-kubernetes/maintain/backup-and-restore/charts.md#ad-hoc-full-backup) task uses one PV, and all [scheduled full backup](/tidb-in-kubernetes/maintain/backup-and-restore/charts.md#scheduled-full-backup) tasks share one PV. - For a disk storing data in PD, follow the [steps](https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner/blob/master/docs/operations.md#sharing-a-disk-filesystem-by-multiple-filesystem-pvs) to mount the disk. First, create multiple directories in disk, and bind mount them into `/mnt/sharedssd` directory. Then, create `shared-ssd-storage` `StorageClass` for them to use. diff --git a/tidb-in-kubernetes/tidb-operator-overview.md b/tidb-in-kubernetes/tidb-operator-overview.md index ff9dd643b6720..12157b0a62fc5 100644 --- a/tidb-in-kubernetes/tidb-operator-overview.md +++ b/tidb-in-kubernetes/tidb-operator-overview.md @@ -65,7 +65,7 @@ After the deployment is complete, see the following documents to use, operate, a + [Scale TiDB Cluster](/tidb-in-kubernetes/scale-in-kubernetes.md) + [Upgrade TiDB Cluster](/tidb-in-kubernetes/upgrade/tidb-cluster.md#upgrade-the-version-of-tidb-cluster) + [Change the Configuration of TiDB Cluster](/tidb-in-kubernetes/upgrade/tidb-cluster.md#change-the-configuration-of-tidb-cluster) -+ [Backup and Restore](/tidb-in-kubernetes/maintain/backup-and-restore.md) ++ [Backup and Restore](/tidb-in-kubernetes/maintain/backup-and-restore/charts.md) + [Automatic Failover](/tidb-in-kubernetes/maintain/auto-failover.md) + [Monitor a TiDB Cluster in Kubernetes](/tidb-in-kubernetes/monitor/tidb-in-kubernetes.md) + [Collect TiDB Logs in Kubernetes](/tidb-in-kubernetes/maintain/log-collecting.md)