This repository was archived by the owner on Jul 24, 2024. It is now read-only.

Update GC safePoint with TTL failed due to DeadlineExceeded #324

@overvenus


Please answer these questions before submitting your issue. Thanks!

  1. What did you do?

A full backup with BR failed with:

[2020-05-28T06:31:24.127Z] [2020/05/28 14:31:23.948 +08:00] [ERROR] [client.go:408] ["update GC safePoint with TTL failed"] [error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"] [errorVerbose="rpc error: code = DeadlineExceeded desc = context deadline exceeded\ngithub.com/pingcap/pd/v4/client.(*client).UpdateServiceGCSafePoint\n\t/go/pkg/mod/github.com/pingcap/pd/v4@v4.0.0-rc.2.0.20200520083007-2c251bd8f181/client/client.go:662\ngithub.com/pingcap/br/pkg/backup.UpdateServiceSafePoint\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/backup/safe_point.go:51\ngithub.com/pingcap/br/pkg/backup.(*Client).BackupRanges\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/backup/client.go:406\ngithub.com/pingcap/br/pkg/task.RunBackup\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/task/backup.go:188\ngithub.com/pingcap/br/cmd.runBackupCommand\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/cmd/backup.go:22\ngithub.com/pingcap/br/cmd.newTableBackupCommand.func1\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/cmd/backup.go:99\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:842\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887\ngithub.com/pingcap/br.main\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/main.go:54\ngithub.com/pingcap/br.TestRunMain.func1\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/main_test.go:39\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"] 
[stack="github.com/pingcap/log.Error\n\t/go/pkg/mod/github.com/pingcap/log@v0.0.0-20200511115504-543df19646ad/global.go:42\ngithub.com/pingcap/br/pkg/backup.(*Client).BackupRanges\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/backup/client.go:408\ngithub.com/pingcap/br/pkg/task.RunBackup\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/task/backup.go:188\ngithub.com/pingcap/br/cmd.runBackupCommand\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/cmd/backup.go:22\ngithub.com/pingcap/br/cmd.newTableBackupCommand.func1\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/cmd/backup.go:99\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:842\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887\ngithub.com/pingcap/br.main\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/main.go:54\ngithub.com/pingcap/br.TestRunMain.func1\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/main_test.go:39"]
[2020-05-28T06:31:24.403Z] [2020/05/28 14:31:24.213 +08:00] [INFO] [collector.go:172] ["Table backup Failed summary : total backup ranges: 1, total success: 0, total failed: 1"] ["backup total regions"=2] [unitName="range start:74800000000000002f5f720000000000000000 end:74800000000000002f5f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/pingcap/errors.AddStack\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/juju_adaptor.go:15\ngithub.com/pingcap/br/pkg/backup.SendBackup\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/backup/client.go:792\ngithub.com/pingcap/br/pkg/backup.(*pushDown).pushBackup.func1\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/backup/push.go:61\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"]

[2020-05-28T06:31:24.403Z] Error: rpc error: code = DeadlineExceeded desc = context deadline exceeded

The PD client sets an internal context timeout on every RPC call. The default is 3 seconds, which is too short in some cases.

Detail log: https://internal.pingcap.net/idc-jenkins/blue/organizations/jenkins/tikv_ghpr_integration_br_test/detail/tikv_ghpr_integration_br_test/47/pipeline/

We should extend the timeout in the PD client.

  2. What version of BR and TiDB/TiKV/PD are you using?

All on the master branch.
