server/schedulers: introduce evict-slow-store-scheduler #3922
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
This reverts commit e7f8021. Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Codecov Report
@@ Coverage Diff @@
## master #3922 +/- ##
==========================================
- Coverage 74.65% 74.58% -0.07%
==========================================
Files 246 247 +1
Lines 25135 25276 +141
==========================================
+ Hits 18764 18852 +88
- Misses 4714 4748 +34
- Partials 1657 1676 +19
Hi, I have some concerns about it:
Since we can evict leaders from only one slow store, why use a vector to store the ID?
I thought we might introduce a more complex strategy for this scheduler later, so I used a vector here to keep that flexibility.
I don't think it is a good idea to use another scheduler as a toolkit, because each scheduler has its own context. For example, scheduler names are recorded in the filter metrics, so we may want to record evict-slow-store-scheduler and end up recording evict-leader-scheduler instead.
There are two possible ways to improve this: a) reimplement the similar code, or b) extract the common parts and let the different schedulers call them.
The logic is basically the same, so I think it would be better to reuse it directly. As for the metrics, I saw that they are based on the name of the scheduler; maybe setting a customized name would be a good solution?
Consider extracting the common part?
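A minimal sketch of option b), assuming a hypothetical shared helper that takes the calling scheduler's name as a parameter. The names `filterMetrics` and `scheduleEvictLeader` are made up for illustration and are not PD's actual API; the point is only that the shared logic records metrics under whichever scheduler invoked it.

```go
package main

import "fmt"

// filterMetrics is a hypothetical stand-in for the filter metrics,
// counting events per scheduler name.
type filterMetrics map[string]int

// scheduleEvictLeader is a hypothetical shared helper holding the logic that
// both schedulers need. It records the event under schedulerName instead of
// a hard-coded name, so metrics are attributed to the real caller.
func scheduleEvictLeader(schedulerName string, storeID uint64, m filterMetrics) string {
	m[schedulerName]++
	return fmt.Sprintf("%s: transfer leaders away from store %d", schedulerName, storeID)
}

func main() {
	m := filterMetrics{}
	fmt.Println(scheduleEvictLeader("evict-leader-scheduler", 1, m))
	fmt.Println(scheduleEvictLeader("evict-slow-store-scheduler", 2, m))
	// Each scheduler's counter is tracked separately.
	fmt.Println(m["evict-leader-scheduler"], m["evict-slow-store-scheduler"])
}
```

With this shape neither scheduler needs to embed the other, which avoids the context mix-up raised in the review.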
This PR will be released in sprint4, PTAL.
// NewEvictSlowStoreSchedulerCommand returns a command to add a evict-slow-store-scheduler.
func NewEvictSlowStoreSchedulerCommand() *cobra.Command {
	c := &cobra.Command{
		Use: "evict-slow-store-scheduler",
Should the scheduler take a store_id parameter?
No, it only needs the ID of the previously evicted store, which is stored in etcd.
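The reply above can be sketched as follows, assuming a hypothetical config shape that is serialized as JSON. The type `slowStoreConfig`, its field name, and the `evicted-stores` key are assumptions for illustration; a byte slice stands in for the value persisted in etcd. The slice mirrors the "vector" discussed earlier in the review, even though only one store is evicted at a time.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// slowStoreConfig is a hypothetical config; the field name is an assumption.
type slowStoreConfig struct {
	EvictedStores []uint64 `json:"evicted-stores"`
}

// evictedStore returns the store currently being evicted, or 0 if there is none.
func (c *slowStoreConfig) evictedStore() uint64 {
	if len(c.EvictedStores) == 0 {
		return 0
	}
	return c.EvictedStores[0]
}

// loadConfig decodes a previously persisted config (etcd in the PR, a byte slice here).
func loadConfig(data []byte) (*slowStoreConfig, error) {
	conf := &slowStoreConfig{}
	if err := json.Unmarshal(data, conf); err != nil {
		return nil, err
	}
	return conf, nil
}

func main() {
	// Simulate a config that was saved before a restart.
	saved, err := json.Marshal(&slowStoreConfig{EvictedStores: []uint64{4}})
	if err != nil {
		panic(err)
	}
	conf, err := loadConfig(saved)
	if err != nil {
		panic(err)
	}
	// The scheduler recovers the evicted store without any command-line argument.
	fmt.Println(conf.evictedStore())
}
```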
	PauseLeaderTransfer(id uint64) error
	ResumeLeaderTransfer(id uint64)

	SlowStoreEvicted(id uint64) error
Can we reuse PauseLeaderTransfer/ResumeLeaderTransfer?
That would make this scheduler conflict with evict-leader-scheduler.
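A rough sketch of why a separate flag avoids the conflict, assuming a simplified in-memory cluster. The three method signatures follow the diff under review; `storeFlags`, `cluster`, `SlowStoreRecovered`, and `AllowLeaderTransferIn` are hypothetical names added for illustration. If both schedulers shared one flag, one scheduler resuming a store would silently undo the other's pause.

```go
package main

import (
	"errors"
	"fmt"
)

// storeFlags is a simplified stand-in for per-store state.
type storeFlags struct {
	pauseLeaderTransfer bool // owned by evict-leader-scheduler
	slowStoreEvicted    bool // owned by evict-slow-store-scheduler
}

// cluster is a hypothetical in-memory cluster keyed by store ID.
type cluster struct {
	stores map[uint64]*storeFlags
}

func (c *cluster) PauseLeaderTransfer(id uint64) error {
	s, ok := c.stores[id]
	if !ok {
		return errors.New("store not found")
	}
	if s.pauseLeaderTransfer {
		return errors.New("leader transfer already paused")
	}
	s.pauseLeaderTransfer = true
	return nil
}

func (c *cluster) ResumeLeaderTransfer(id uint64) {
	if s, ok := c.stores[id]; ok {
		s.pauseLeaderTransfer = false
	}
}

func (c *cluster) SlowStoreEvicted(id uint64) error {
	s, ok := c.stores[id]
	if !ok {
		return errors.New("store not found")
	}
	if s.slowStoreEvicted {
		return errors.New("store already evicted as slow")
	}
	s.slowStoreEvicted = true
	return nil
}

// SlowStoreRecovered is a hypothetical counterpart that clears the slow flag.
func (c *cluster) SlowStoreRecovered(id uint64) {
	if s, ok := c.stores[id]; ok {
		s.slowStoreEvicted = false
	}
}

// AllowLeaderTransferIn reports whether the store may receive leaders:
// only when neither scheduler has marked it.
func (c *cluster) AllowLeaderTransferIn(id uint64) bool {
	s, ok := c.stores[id]
	return ok && !s.pauseLeaderTransfer && !s.slowStoreEvicted
}

func main() {
	c := &cluster{stores: map[uint64]*storeFlags{1: {}}}
	fmt.Println(c.PauseLeaderTransfer(1)) // evict-leader-scheduler marks store 1
	fmt.Println(c.SlowStoreEvicted(1))    // evict-slow-store-scheduler marks it too
	c.SlowStoreRecovered(1)               // the slow store recovers
	// Store 1 is still blocked, because evict-leader-scheduler has not resumed it.
	fmt.Println(c.AllowLeaderTransferIn(1))
}
```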
/merge
/merge
This pull request has been accepted and is ready to merge. Commit hash: 6451ef4
What problem does this PR solve?
To address tikv/tikv#10539, we will introduce a slow score concept into TiKV and report it to PD in store heartbeats. We then implement evict-slow-store-scheduler to detect and evict all leaders from one store that is considered slow.
Blocked by pingcap/kvproto#789
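As a rough illustration of the detection step, here is a sketch that picks at most one store to evict based on reported slow scores. Everything in it is an assumption made for illustration: the threshold values, the `storeInfo` type, the `nextEvicted` function, and the policy of acting only when exactly one store is slow are not taken from this PR.

```go
package main

import "fmt"

const (
	evictThreshold   = 100 // assumed: a score at or above this marks a store as slow
	recoverThreshold = 1   // assumed: a score at or below this marks it as recovered
)

// storeInfo is a hypothetical view of what a store heartbeat reports.
type storeInfo struct {
	ID        uint64
	SlowScore uint64
}

// nextEvicted returns the store that should be evicted after this round.
// current is the store being evicted now; 0 means none.
func nextEvicted(current uint64, stores []storeInfo) uint64 {
	if current != 0 {
		for _, s := range stores {
			if s.ID == current {
				if s.SlowScore <= recoverThreshold {
					return 0 // recovered: stop evicting
				}
				return current // still slow: keep evicting
			}
		}
		return 0 // the store no longer exists
	}
	var slow []uint64
	for _, s := range stores {
		if s.SlowScore >= evictThreshold {
			slow = append(slow, s.ID)
		}
	}
	// Assumed policy: evict only when exactly one store is slow.
	if len(slow) == 1 {
		return slow[0]
	}
	return 0
}

func main() {
	stores := []storeInfo{{ID: 1, SlowScore: 1}, {ID: 2, SlowScore: 100}, {ID: 3, SlowScore: 5}}
	evicted := nextEvicted(0, stores)
	fmt.Println("evicting store:", evicted)

	stores[1].SlowScore = 1 // store 2 reports a healthy score again
	fmt.Println("evicting store:", nextEvicted(evicted, stores))
}
```

Keeping a gap between the two thresholds is a design choice in this sketch: it prevents a store whose score hovers near a single cutoff from being evicted and restored repeatedly.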
What is changed and how it works?
- evict-slow-store-scheduler
- slow score in store info
Check List
Tests
Code changes
Related changes
Release note