From 38caf95dc02cf4dcbd8735126a3041acbb60fb3d Mon Sep 17 00:00:00 2001
From: Win-Man <825895587@qq.com>
Date: Fri, 26 Jun 2020 23:30:12 +0800
Subject: [PATCH 01/13] Update grafana tikv dashboard doc

---
 grafana-tikv-dashboard.md | 575 +++++++++++++++++++++++++-------------
 1 file changed, 378 insertions(+), 197 deletions(-)

diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md
index dff06b60d0c36..f98774a9e5aa8 100644
--- a/grafana-tikv-dashboard.md
+++ b/grafana-tikv-dashboard.md
@@ -5,236 +5,417 @@ category: reference
 aliases: ['/docs/dev/grafana-tikv-dashboard/','/docs/dev/reference/key-monitoring-metrics/tikv-dashboard/']
 ---
 
-# Key Monitoring Metrics of TiKV
-
-If you use TiDB Ansible to deploy the TiDB cluster, the monitoring system is deployed at the same time. For more information, see [Overview of the Monitoring Framework](/tidb-monitoring-framework.md).
-
-The Grafana dashboard is divided into a series of sub dashboards which include Overview, PD, TiDB, TiKV, Node\_exporter, Disk Performance, and so on. A lot of metrics are there to help you diagnose.
-
-You can get an overview of the component TiKV status from the TiKV dashboard, where the key metrics are displayed. This document provides a detailed description of these key metrics.
-
-## Key metrics description
-
-To understand the key metrics displayed on the Overview dashboard, check the following table:
-
-Service | Panel name | Description | Normal range
----------------- | ---------------- | ---------------------------------- | --------------
-Cluster | Store size | The storage size per TiKV instance |
-Cluster | Available size | The available capacity per TiKV instance |
-Cluster | Capacity size | The capacity size per TiKV instance |
-Cluster | CPU | The CPU usage per TiKV instance |
-Cluster | Memory | The memory usage per TiKV instance |
-Cluster | IO utilization | The I/O utilization per TiKV instance |
-Cluster | MBps | The total bytes of read and write in each TiKV instance |
-Cluster | QPS | The QPS per command in each TiKV instance |
-Cluster | Errors-gRPC | The total number of gRPC message failures |
-Cluster | Leaders | The number of leaders per TiKV instance |
-Cluster | Regions | The number of Regions per TiKV instance |
-Errors | Server is busy | Indicates occurrences of events that make the TiKV instance unavailable temporarily, such as Write Stall, Channel Full, Scheduler Busy, and Coprocessor Full|
-Errors | Server message failures | The number of failed messages between TiKV instances | It should be `0` in normal case.
-Errors | Raftstore errors | The number of Raftstore errors per type on each TiKV instance |
-Errors | Scheduler errors | The number of scheduler errors per type on each TiKV instance |
-Errors | Coprocessor errors | The number of coprocessor errors per type on each TiKV instance |
-Errors | gRPC message errors | The number of gRPC message errors per type on each TiKV instance |
-Errors | Leader drop | The count of dropped leaders per TiKV instance |
-Errors | Leader missing | The count of missing leaders per TiKV instance |
-Server | Leaders | The number of leaders per TiKV instance |
-Server | Regions | The number of Regions per TiKV instance |
-Server | CF size | The size of each column family |
-Server | Store size | The storage size per TiKV instance |
-Server | Channel full | The number of Channel Full errors per TiKV instance | It should be `0` in normal case.
-Server | Server message failures  | The number of failed messages between TiKV instances |
-Server | Average Region written keys | The average rate of written keys to Regions per TiKV instance |
-Server | Average Region written bytes | The average rate of writing bytes to Regions per TiKV instance |
-Server | Active written leaders | The number of leaders being written on each TiKV instance |
-Server | Approximate Region size | The approximate Region size |
-Raft IO | Apply log duration | The time consumed for Raft to apply logs |
-Raft IO | Apply log duration per server | The time consumed for Raft to apply logs per TiKV instance |
-Raft IO | Append log duration | The time consumed for Raft to append logs |
-Raft IO | Append log duration per server | The time consumed for Raft to append logs per TiKV instance |
-Raft process | Ready handled | The count of handled ready buckets per region |
-Raft process | Process ready duration per server | The time consumed for peer processes to be ready in Raft | It should be less than `2s` (P99.99).
-Raft process | Process tick duration per server | The peer processes in Raft |
-Raft process | 99% Duration of raftstore events | The time consumed by raftstore events (P99) |
-Raft message | Sent messages per server | The number of Raft messages sent by each TiKV instance |
-Raft message | Flush messages per server | The number of Raft messages flushed by each TiKV instance |
-Raft message | Receive messages per server | The number of Raft messages received by each TiKV instance |
-Raft message | Messages | The number of Raft messages sent per type |
-Raft message | Vote | The number of Vote messages sent in Raft |
-Raft message | Raft dropped messages | The number of dropped Raft messages per type|
-Raft proposal | Raft proposals per ready | The number of Raft proposals of all Regions per ready handled bucket|
-Raft proposal | Raft read/write proposals | The number of proposals per type|
-Raft proposal | Raft read proposals per server | The number of read proposals made by each TiKV instance |
-Raft proposal | Raft write proposals per server | The number of write proposals made by each TiKV instance |
-Raft proposal | Proposal wait duration | The wait time of each proposal |
-Raft proposal | Proposal wait duration per server | The wait time of each proposal per TiKV instance |
-Raft proposal | Raft log speed | The rate at which peers propose logs |
-Raft admin | Admin proposals | The number of admin proposals |
-Raft admin | Admin apply | The number of processed apply commands |
-Raft admin | Check split | The number of raftstore split checks |
-Raft admin | 99.99% Check split duration | The time consumed when running split checks (P99.99) |
-Local reader | Local reader requests | The number of total requests and the number of rejections from the local read thread |
-Local reader | Local read requests duration | The wait time of local read requests |
-Local reader | Local read requests batch size | The batch size of local read requests |
-Storage | Storage command total | The total number of received commands per type |
-Storage | Storage async request error | The total number of engine asynchronous request errors |
-Storage | Storage async snapshot duration | The time consumed by processing asynchronous snapshot requests | It should be less than `1s` in `.99`.
-Storage | Storage async write duration | The time consumed by processing asynchronous write requests | It should be less than `1s` in `.99`.
-Scheduler | Scheduler stage total | The total number of commands at each stage | There should not be lots of errors in a short time.
-Scheduler | Scheduler priority commands | The count of different priority commands |
-Scheduler | Scheduler pending commands | The count of pending commands per TiKV instance |
-Scheduler - XX | Scheduler stage total | The total number of commands at each stage when executing the batch_get command | There should not be lots of errors in a short time.
-Scheduler - XX | Scheduler command duration | The time consumed when executing the batch_get command | It should be less than `1s`.
-Scheduler - XX | Scheduler latch wait duration | The wait time caused by latch when executing the batch_get command | It should be less than `1s`.
-Scheduler - XX | Scheduler keys read | The count of keys read by a batch_get command |
-Scheduler - XX | Scheduler keys written | The count of keys written by a batch_get command |
-Scheduler - XX | Scheduler scan details | The keys scan details of each CF when executing the batch_get command |
-Scheduler - XX | Scheduler scan details [lock] | The keys scan details of lock CF when executing the batch_get command |
-Scheduler - XX | Scheduler scan details [write] | The keys scan details of write CF when executing the batch_get command |
-Scheduler - XX | Scheduler scan details [default] | The keys scan details of default CF when executing the batch_get command |
-Coprocessor | Request duration | The time consumed to handle coprocessor read requests |
-Coprocessor | Wait duration | The time consumed when coprocessor requests are waiting to be handled | It should be less than `10s` (P99.99).
-Coprocessor | Processing duration | The time consumed to handle coprocessor requests |
-Coprocessor | 95% Request duration by store | The time consumed to handle coprocessor read requests per TiKV instance (P95) |
-Coprocessor | 95% Wait duration by store | The time consumed when coprocessor requests are waiting to be handled per TiKV instance (P95)|
-Coprocessor | 95% Handling duration by store | The time consumed to handle coprocessor requests per TiKV instance (P95) |
-Coprocessor | Request errors | The total number of the push down request errors | There should not be lots of errors in a short time.
-Coprocessor | DAG executors | The total number of DAG executors |
-Coprocessor | Scan keys | The number of keys that each request scans |
-Coprocessor | Scan details | The scan details for each CF |
-Coprocessor | Table Scan - Details by CF | The table scan details for each CF |
-Coprocessor | Index Scan - Details by CF | The index scan details for each CF |
-Coprocessor | Table Scan - Perf Statistics | The total number of RocksDB internal operations from PerfContext when executing table scan |
-Coprocessor | Index Scan - Perf Statistics | The total number of RocksDB internal operations from PerfContext when executing index scan |
-GC | MVCC versions | The number of versions for each key |
-GC | MVCC deleted versions | The number of versions deleted by GC for each key |
-GC | GC tasks | The count of GC tasks processed by gc_worker |
-GC | GC tasks Duration | The time consumed when executing GC tasks |
-GC | GC keys (write CF) | The count of keys in write CF affected during GC |
-GC | TiDB GC actions result | The TiDB GC action result on Region level |
-GC | TiDB GC worker actions | The count of TiDB GC worker actions |
-GC | TiDB GC seconds | The GC duration |
-GC | TiDB GC failure | The count of failed TiDB GC jobs |
-GC | GC lifetime | The lifetime of TiDB GC |
-GC | GC interval | The interval of TiDB GC |
-Snapshot | Rate snapshot message | The rate at which Raft snapshot messages are sent |
-Snapshot | 99% Handle snapshot duration | The time consumed to handle snapshots (P99) |
-Snapshot | Snapshot state count | The number of snapshots per state |
-Snapshot | 99.99% Snapshot size | The snapshot size (P99.99)  |
-Snapshot | 99.99% Snapshot KV count | The number of KV within a snapshot (P99.99)  |
-Task | Worker handled tasks | The number of tasks handled by worker |
-Task | Worker pending tasks | Current number of pending and running tasks of worker | It should be less than `1000`.
-Task | FuturePool handled tasks | The number of tasks handled by future_pool |
-Task | FuturePool pending tasks | Current number of pending and running tasks of future_pool |
-Thread CPU | Raft store CPU | The CPU utilization of the raftstore thread | The CPU usage should be less than `80%`.
-Thread CPU | Async apply CPU | The CPU utilization of async apply | The CPU usage should be less than `90%`.
-Thread CPU | Scheduler CPU | The CPU utilization of scheduler | The CPU usage should be less than `80%`.
-Thread CPU | Scheduler Worker CPU | The CPU utilization of scheduler worker |
-Thread CPU | Storage ReadPool CPU | The CPU utilization of readpool |
-Thread CPU | Coprocessor CPU | The CPU utilization of coprocessor |
-Thread CPU | Snapshot worker CPU | The CPU utilization of snapshot worker |
-Thread CPU | Split check CPU | The CPU utilization of split check |
-Thread CPU | RocksDB CPU | The CPU utilization of RocksDB |
-Thread CPU | gRPC poll CPU | The CPU utilization of gRPC | The CPU usage should be less than `80%`.
-RocksDB - XX | Get operations | The count of get operations |
-RocksDB - XX | Get duration | The time consumed when executing get operations |
-RocksDB - XX | Seek operations | The count of seek operations |
-RocksDB - XX | Seek duration | The time consumed when executing seek operations |
-RocksDB - XX | Write operations | The count of write operations |
-RocksDB - XX | Write duration | The time consumed when executing write operations |
-RocksDB - XX | WAL sync operations | The count of WAL sync operations |
-RocksDB - XX | WAL sync duration | The time consumed when executing WAL sync operations |
-RocksDB - XX | Compaction operations | The count of compaction and flush operations |
-RocksDB - XX | Compaction duration | The time consumed when executing the compaction and flush operations |
-RocksDB - XX | SST read duration | The time consumed when reading SST files |
-RocksDB - XX | Write stall duration | Write stall duration | It should be `0` in normal case.
-RocksDB - XX | Memtable size | The memtable size of each column family |
-RocksDB - XX | Memtable hit | The hit rate of memtable |
-RocksDB - XX | Block cache size | The block cache size. Broken down by column family if shared block cache is disabled. |
-RocksDB - XX | Block cache hit | The hit rate of block cache |
-RocksDB - XX | Block cache flow | The flow rate of block cache operations per type |
-RocksDB - XX | Block cache operations | The count of block cache operations per type |
-RocksDB - XX | Keys flow | The flow rate of operations on keys per type |
-RocksDB - XX | Total keys | The count of keys in each column family |
-RocksDB - XX | Read flow | The flow rate of read operations per type |
-RocksDB - XX | Bytes / Read | The bytes per read operation|
-RocksDB - XX | Write flow | The flow rate of write operations per type|
-RocksDB - XX | Bytes / Write | The bytes per write operation |
-RocksDB - XX | Compaction flow | The flow rate of compaction operations per type |
-RocksDB - XX | Compaction pending bytes | The pending bytes to be compacted |
-RocksDB - XX | Read amplification | The read amplification per TiKV instance |
-RocksDB - XX | Compression ratio | The compression ratio of each level |
-RocksDB - XX | Number of snapshots | The number of snapshots per TiKV instance |
-RocksDB - XX | Oldest snapshots duration | The time that the oldest unreleased snapshot survivals |
-RocksDB - XX | Number files at each level | The number of SST files for different column families in each level |
-RocksDB - XX | Ingest SST duration seconds | The time consumed to ingest SST files |
-RocksDB - XX | Stall conditions changed of each CF | Stall conditions changed of each column family |
-gRPC | gRPC messages | The count of gRPC messages per type |
-gRPC | gRPC message failed | The count of failed gRPC messages per type|
-gRPC | 99% gRPC message duration | The gRPC message duration per message type (P99) |
-gRPC | gRPC GC message count | The count of gRPC GC messages |
-gRPC | 99% gRPC KV GC message duration | The execution time of gRPC GC messages (P99) |
-PD | PD requests | The count of requests that TiKV sends to PD |
-PD | PD request duration (average) | The time consumed by requests that TiKV sends to PD |
-PD | PD heartbeats | The total number of PD heartbeat messages |
-PD | PD validated peers | The total number of peers validated by the PD worker |
-
-## TiKV dashboard interface
-
-This section shows images of the service panels on the TiKV dashboard.
-
-### Cluster
+# TiKV Monitoring Metrics
+
+If you use TiUP to deploy the TiDB cluster, the monitoring system (Prometheus/Grafana) is deployed at the same time. For more information, see [Overview of the Monitoring Framework](/tidb-monitoring-framework.md).
+
+The Grafana dashboard is divided into a series of sub-dashboards, including Overview, PD, TiDB, TiKV, Node\_exporter, and so on. Many metrics are available to help you diagnose issues.
+
+You can get an overview of the TiKV component status from the TiKV dashboard, where the key metrics are displayed. You can use the [Performance Map](https://asktug.com/_/tidb-performance-map/#/) to check whether the status of the cluster is as expected.
+
+This document provides a detailed description of these key metrics.
+
+## Cluster
+
+- Store size: The storage size per TiKV instance
+- Available size: The available capacity per TiKV instance
+- Capacity size: The capacity size per TiKV instance
+- CPU: The CPU usage per TiKV instance
+- Memory: The memory usage per TiKV instance
+- IO utilization: The I/O utilization per TiKV instance
+- MBps: The total bytes of read and write in each TiKV instance
+- QPS: The QPS per command in each TiKV instance
+- Errps: The total number of gRPC message failures
+- Leader: The number of leaders per TiKV instance
+- Region: The number of Regions per TiKV instance
+- Uptime: The runtime of TiKV since the last restart
 
 ![TiKV Dashboard - Cluster metrics](/media/tikv-dashboard-cluster.png)
 
-### Errors
+## Errors
+
+- Critical error: The number of critical errors
+- Server is busy: Indicates occurrences of events that make the TiKV instance temporarily unavailable, such as Write Stall, Channel Full, and so on. It should be `0` in normal cases.
+- Server report failures: The number of error messages reported by the server. It should be `0` in normal cases.
+- Raftstore error: The number of Raftstore errors per type on each TiKV instance
+- Scheduler error: The number of scheduler errors per type on each TiKV instance
+- Coprocessor error: The number of coprocessor errors per type on each TiKV instance
+- gRPC message error: The number of gRPC message errors per type on each TiKV instance
+- Leader drop: The count of dropped leaders per TiKV instance
+- Leader missing: The count of missing leaders per TiKV instance
 
 ![TiKV Dashboard - Errors metrics](/media/tikv-dashboard-errors.png)
 
-### Server
+## Server
+
+- CF size: The size of each column family
+- Store size: The storage size per TiKV instance
+- Channel full: The number of Channel Full errors per TiKV instance. It should be `0` in normal cases.
+- Active written leaders: The number of leaders being written on each TiKV instance
+- Approximate Region size: The approximate Region size
+- Approximate Region size Histogram: The histogram of approximate Region size
+- Region average written keys: The average rate of written keys to Regions per TiKV instance
+- Region average written bytes: The average rate of writing bytes to Regions per TiKV instance
 
 ![TiKV Dashboard - Server metrics](/media/tikv-dashboard-server.png)
 
-### Raft IO
+## gRPC
+
+- gRPC message count: The number of gRPC messages
+- gRPC message failed: The number of failed gRPC messages
+- 99% gRPC message duration: The duration of gRPC messages (P99)
+- Average gRPC message duration: The average duration of gRPC messages
+- gRPC batch size: The batch size of gRPC messages between TiDB and TiKV
+- raft message batch size: The batch size of Raft messages
+
+## Thread CPU
+
+- Raft store CPU: The CPU utilization of the raftstore thread. The CPU usage should be less than 80% * `raftstore.store-pool-size` in normal cases.
+- Async apply CPU: The CPU utilization of async apply. The CPU usage should be less than 80% * `raftstore.apply-pool-size` in normal cases.
+- Scheduler worker CPU: The CPU utilization of the scheduler worker. The CPU usage should be less than 90% * `storage.scheduler-worker-pool-size` in normal cases.
+- gRPC poll CPU: The CPU utilization of gRPC. The CPU usage should be less than 80% * `server.grpc-concurrency` in normal cases.
+- Unified read pool CPU: The CPU utilization of the unified read pool
+- Storage ReadPool CPU: The CPU utilization of the storage read pool
+- Coprocessor CPU: The CPU utilization of the coprocessor
+- RocksDB CPU: The CPU utilization of RocksDB
+- Split check CPU: The CPU utilization of split check
+- GC worker CPU: The CPU utilization of the GC worker
+- Snapshot worker CPU: The CPU utilization of the snapshot worker
+
+## PD
+
+- PD requests: The count of requests that TiKV sends to PD
+- PD request duration (average): The time consumed by requests that TiKV sends to PD
+- PD heartbeats: The total number of PD heartbeat messages
+- PD validate peers: The total number of peers validated by the PD worker
+
+## Raft IO
+
+- Apply log duration: The time consumed for Raft to apply logs
+- Apply log duration per server: The time consumed for Raft to apply logs per TiKV instance
+- Append log duration: The time consumed for Raft to append logs
+- Append log duration per server: The time consumed for Raft to append logs per TiKV instance
+- Commit log duration: The time consumed for Raft to commit logs
+- Commit log duration per server: The time consumed for Raft to commit logs per TiKV instance
 
 ![TiKV Dashboard - Raft IO metrics](/media/tikv-dashboard-raftio.png)
 
-### Raft process
+## Raft process
+
+- Ready handled: The count of handled ready buckets per Region
+- 0.99 Duration of Raft store events: The time consumed by raftstore events (P99)
+- Process ready duration: The time consumed for processes to be ready in Raft
+- Process ready duration per server: The time consumed for peer processes to be ready in Raft. It should be less than `2s` (P99.99).
 
 ![TiKV Dashboard - Raft process metrics](/media/tikv-dashboard-raft-process.png)
 
-### Raft message
+## Raft message
+
+- Sent messages per server: The number of Raft messages sent by each TiKV instance
+- Flush messages per server: The number of Raft messages flushed by each TiKV instance
+- Receive messages per server: The number of Raft messages received by each TiKV instance
+- Messages: The number of Raft messages sent per type
+- Vote: The number of Vote messages sent in Raft
+- Raft dropped messages: The number of dropped Raft messages per type
 
 ![TiKV Dashboard - Raft message metrics](/media/tikv-dashboard-raft-message.png)
 
-### Raft proposal
+## Raft propose
 
-![TiKV Dashboard - Raft proposal metrics](/media/tikv-dashboard-raft-propose.png)
+- Raft apply proposals per ready: The number of Raft proposals of all Regions per ready handled bucket
+- Raft read/write proposals: The number of proposals per type
+- Raft read proposals per server: The number of read proposals made by each TiKV instance
+- Raft write proposals per server: The number of write proposals made by each TiKV instance
+- Propose wait duration: The wait time of each proposal
+- Propose wait duration per server: The wait time of each proposal per TiKV instance
+- Apply wait duration: The apply time of each proposal
+- Apply wait duration per server: The apply time of each proposal per TiKV instance
+- Raft log speed: The rate at which peers propose logs
 
-### Raft admin
+![TiKV Dashboard - Raft propose metrics](/media/tikv-dashboard-raft-propose.png)
+
+## Raft admin
+
+- Admin proposals: The number of admin proposals
+- Admin apply: The number of processed apply commands
+- Check split: The number of raftstore split checks
+- 99.99% Check split duration: The time consumed when running split checks (P99.99)
 
 ![TiKV Dashboard - Raft admin metrics](/media/tikv-dashboard-raft-admin.png)
 
-### Local reader
+## Local reader
+
+- Local reader requests: The number of total requests and the number of rejections from the local read thread
 
 ![TiKV Dashboard - Local reader metrics](/media/tikv-dashboard-local-reader.png)
 
-### Storage
+## Unified Read Pool
 
-![TiKV Dashboard - Storage metrics](/media/tikv-dashboard-storage.png)
+- Time used by level: The time consumed for each level in the unified read pool. Level 0 indicates small queries.
+- Level 0 chance: The proportion of level 0 tasks in the unified read pool
+- Running tasks: The number of tasks running concurrently in the unified read pool
 
-### Scheduler
+## Storage
 
-![TiKV Dashboard - Scheduler metrics](/media/tikv-dashboard-scheduler.png)
+- Storage command total: The total number of received commands per type
+- Storage async request error: The total number of engine asynchronous request errors
+- Storage async snapshot duration: The time consumed by processing asynchronous snapshot requests. It should be less than `1s` (P99).
+- Storage async write duration: The time consumed by processing asynchronous write requests. It should be less than `1s` (P99).
 
-### Scheduler - batch_get
+![TiKV Dashboard - Storage metrics](/media/tikv-dashboard-storage.png)
+
+## Scheduler
 
-![TiKV Dashboard - Scheduler - batch_get metrics](/media/tikv-dashboard-scheduler-batch-get.png)
+- Scheduler stage total: The total number of commands at each stage. There should not be lots of errors in a short time.
+- Scheduler writing bytes: The total number of bytes being written on each TiKV instance
+- Scheduler priority commands: The count of different priority commands
+- Scheduler pending commands: The count of pending commands per TiKV instance
 
-### Scheduler - cleanup
+![TiKV Dashboard - Scheduler metrics](/media/tikv-dashboard-scheduler.png)
 
-![TiKV Dashboard - Scheduler - cleanup metrics](/media/tikv-dashboard-scheduler-cleanup.png)
+## Scheduler - commit
 
-### Scheduler - commit
+- Scheduler stage total: The total number of commands at each stage when executing the commit command. There should not be lots of errors in a short time.
+- Scheduler command duration: The time consumed when executing the commit command. It should be less than `1s`.
+- Scheduler latch wait duration: The wait time caused by latch when executing the commit command. It should be less than `1s`.
+- Scheduler keys read: The count of keys read by a commit command
+- Scheduler keys written: The count of keys written by a commit command
+- Scheduler scan details: The keys scan details of each CF when executing the commit command
+- Scheduler scan details [lock]: The keys scan details of lock CF when executing the commit command
+- Scheduler scan details [write]: The keys scan details of write CF when executing the commit command
+- Scheduler scan details [default]: The keys scan details of default CF when executing the commit command
 
 ![TiKV Dashboard - Scheduler commit metrics](/media/tikv-dashboard-scheduler-commit.png)
+
+## Scheduler - pessimistic_rollback
+
+- Scheduler stage total: The total number of commands at each stage when executing the pessimistic_rollback command. There should not be lots of errors in a short time.
+- Scheduler command duration: The time consumed when executing the pessimistic_rollback command. It should be less than `1s`.
+- Scheduler latch wait duration: The wait time caused by latch when executing the pessimistic_rollback command. It should be less than `1s`.
+- Scheduler keys read: The count of keys read by a pessimistic_rollback command
+- Scheduler keys written: The count of keys written by a pessimistic_rollback command
+- Scheduler scan details: The keys scan details of each CF when executing the pessimistic_rollback command
+- Scheduler scan details [lock]: The keys scan details of lock CF when executing the pessimistic_rollback command
+- Scheduler scan details [write]: The keys scan details of write CF when executing the pessimistic_rollback command
+- Scheduler scan details [default]: The keys scan details of default CF when executing the pessimistic_rollback command
+
+## Scheduler - prewrite
+
+- Scheduler stage total: The total number of commands at each stage when executing the prewrite command. There should not be lots of errors in a short time.
+- Scheduler command duration: The time consumed when executing the prewrite command. It should be less than `1s`.
+- Scheduler latch wait duration: The wait time caused by latch when executing the prewrite command. It should be less than `1s`.
+- Scheduler keys read: The count of keys read by a prewrite command
+- Scheduler keys written: The count of keys written by a prewrite command
+- Scheduler scan details: The keys scan details of each CF when executing the prewrite command
+- Scheduler scan details [lock]: The keys scan details of lock CF when executing the prewrite command
+- Scheduler scan details [write]: The keys scan details of write CF when executing the prewrite command
+- Scheduler scan details [default]: The keys scan details of default CF when executing the prewrite command
+
+## Scheduler - rollback
+
+- Scheduler stage total: The total number of commands at each stage when executing the rollback command. There should not be lots of errors in a short time.
+- Scheduler command duration: The time consumed when executing the rollback command. It should be less than `1s`.
+- Scheduler latch wait duration: The wait time caused by latch when executing the rollback command. It should be less than `1s`.
+- Scheduler keys read: The count of keys read by a rollback command
+- Scheduler keys written: The count of keys written by a rollback command
+- Scheduler scan details: The keys scan details of each CF when executing the rollback command
+- Scheduler scan details [lock]: The keys scan details of lock CF when executing the rollback command
+- Scheduler scan details [write]: The keys scan details of write CF when executing the rollback command
+- Scheduler scan details [default]: The keys scan details of default CF when executing the rollback command
+
+## GC
+
+- MVCC versions: The number of versions for each key
+- MVCC delete versions: The number of versions deleted by GC for each key
+- GC tasks: The count of GC tasks processed by gc_worker
+- GC tasks Duration: The time consumed when executing GC tasks
+- GC keys (write CF): The count of keys in write CF affected during GC
+- TiDB GC worker actions: The count of TiDB GC worker actions
+- TiDB GC seconds: The GC duration
+- GC speed: The number of keys deleted by GC per second
+- TiKV AutoGC Working: The status of Auto GC
+- ResolveLocks Progress: The progress of the first phase of GC (ResolveLocks)
+- TiKV Auto GC Progress: The progress of the second phase of GC
+- TiKV Auto GC SafePoint: The value of the TiKV GC safe point. The safe point is the current GC timestamp.
+- GC lifetime: The lifetime of TiDB GC
+- GC interval: The interval of TiDB GC
+
+## Snapshot
+
+- Rate snapshot message: The rate at which Raft snapshot messages are sent
+- 99% Handle snapshot duration: The time consumed to handle snapshots (P99)
+- Snapshot state count: The number of snapshots per state
+- 99.99% Snapshot size: The snapshot size (P99.99)
+- 99.99% Snapshot KV count: The number of KV within a snapshot (P99.99)
+
+## Task
+
+- Worker handled tasks: The number of tasks handled by worker
+- Worker pending tasks: Current number of pending and running tasks of worker. It should be less than `1000` in normal cases.
+- FuturePool handled tasks: The number of tasks handled by future_pool
+- FuturePool pending tasks: Current number of pending and running tasks of future_pool
+
+## Coprocessor Overview
+
+- Request duration: The time consumed to handle coprocessor read requests
+- Total Requests: The total number of coprocessor requests
+- Handle duration: The histogram of time spent actually processing coprocessor requests per minute
+- Total Request Errors: The total number of coprocessor request errors
+- Total KV Cursor Operations: The total number of KV cursor operations, such as select, index, analyze_table, analyze_index, checksum_table, checksum_index, and so on
+- KV Cursor Operations: The histogram of KV cursor operations
+- Total RocksDB Perf Statistics: The performance statistics of RocksDB
+- Total Response Size: The total size of coprocessor responses
+
+## Coprocessor Detail
+
+- Handle duration: The histogram of time spent actually processing coprocessor requests per minute
+- 95% Handle duration by store: The time consumed to handle coprocessor requests per TiKV instance (P95)
+- Wait duration: The time consumed when coprocessor requests are waiting to be handled. It should be less than `10s` (P99.99).
+- 95% Wait duration by store: The time consumed when coprocessor requests are waiting to be handled per TiKV instance (P95)
+- Total DAG Requests: The total number of DAG requests
+- Total DAG Executors: The total number of DAG executors
+- Total Ops Details (Table Scan): The total number of RocksDB internal operations when executing select scan
+- Total Ops Details (Index Scan): The total number of RocksDB internal operations when executing index scan
+- Total Ops Details by CF (Table Scan): The select scan details for each CF
+- Total Ops Details by CF (Index Scan): The index scan details for each CF
+
+## Threads
+
+- Threads state：The state of TiKV threads
+- Threads IO：The I/O traffic of each TiKV thread
+- Thread Voluntary Context Switches：The number of voluntary context switches of TiKV threads
+- Thread Nonvoluntary Context Switches：The number of nonvoluntary context switches of TiKV threads
+
+## RocksDB - kv/raft
+
+- Get operations：The count of get operations
+- Get duration：The time consumed when executing get operations
+- Seek operations：The count of seek operations
+- Seek duration：The time consumed when executing seek operations
+- Write operations：The count of write operations
+- Write duration：The time consumed when executing write operations
+- WAL sync operations：The count of WAL sync operations
+- Write WAL duration：The time consumed for writing WAL
+- WAL sync duration：The time consumed when executing WAL sync operations
+- Compaction operations：The count of compaction and flush operations
+- Compaction duration：The time consumed when executing the compaction and flush operations
+- SST read duration：The time consumed when reading SST files
+- Write stall duration：Write stall duration. It should be `0` in normal case.
+- Memtable size：The memtable size of each column family
+- Memtable hit：The hit rate of memtable
+- Block cache size：The block cache size. Broken down by column family if shared block cache is disabled.
+- Block cache hit：The hit rate of block cache
+- Block cache flow：The flow rate of block cache operations per type
+- Block cache operations: The count of block cache operations per type
+- Keys flow：The flow rate of operations on keys per type
+- Total keys：The count of keys in each column family
+- Read flow：The flow rate of read operations per type
+- Bytes / Read：The bytes per read operation
+- Write flow：The flow rate of write operations per type
+- Bytes / Write：The bytes per write operation
+- Compaction flow：The flow rate of compaction operations per type
+- Compaction pending bytes：The pending bytes to be compacted
+- Read amplification：The read amplification per TiKV instance
+- Compression ratio：The compression ratio of each level
+- Number of snapshots：The number of snapshots per TiKV instance
+- Oldest snapshots duration：The time that the oldest unreleased snapshot has survived
+- Number files at each level：The number of SST files for different column families in each level
+- Ingest SST duration seconds：The time consumed to ingest SST files
+- Stall conditions changed of each CF：Stall conditions changed of each column family
+
+## Titan - All
+
+- Blob file count：The number of Titan blob files
+- Blob file size：The total size of Titan blob files
+- Live blob size：The total size of valid blob records
+- Blob cache hit：The hit rate of Titan block cache
+- Iter touched blob file count：The number of blob files involved in a single iterator
+- Blob file discardable ratio distribution：The distribution of the ratio of discardable (invalid) blob records in blob files
+- Blob key size：The size of Titan blob keys
+- Blob value size：The size of Titan blob values
+- Blob get operations：The count of get operations in Titan blob
+- Blob get duration：The time consumed when executing get operations in Titan blob
+- Blob iter operations：The count of iter operations in Titan blob
+- Blob seek duration：The time consumed when executing seek operations in Titan blob
+- Blob next duration：The time consumed when executing next operations in Titan blob
+- Blob prev duration：The time consumed when executing prev operations in Titan blob
+- Blob keys flow：The flow rate of operations on Titan blob keys
+- Blob bytes flow：The flow rate of bytes on Titan blob keys
+- Blob file read duration：The time consumed when reading Titan blob files
+- Blob file write duration：The time consumed when writing Titan blob files
+- Blob file sync operations：The count of blob file sync operations
+- Blob file sync duration：The time consumed when syncing blob files
+- Blob GC action：The count of Titan GC actions
+- Blob GC duration：The Titan GC duration
+- Blob GC keys flow：The flow rate of keys read and written by Titan GC
+- Blob GC bytes flow：The flow rate of bytes read and written by Titan GC
+- Blob GC input file size：The size of Titan GC input files
+- Blob GC output file size：The size of Titan GC output files
+- Blob GC file count：The count of blob files involved in Titan GC
+
+## Lock manager
+
+- Thread CPU：The CPU utilization of the lock manager thread
+- Handled tasks：The number of tasks handled by lock manager
+- Waiter lifetime duration：The time that a transaction waits for the lock to be released
+- Wait table：The status information of wait table, including the number of locks and the number of transactions waiting for the lock
+- Deadlock detect duration：The time consumed for detecting deadlock
+- Detect error：The number of errors encountered when detecting deadlock, including the number of deadlocks
+- Deadlock detector leader：The information about the node where the deadlock detector leader is located
+
+## Memory
+
+- Allocator Stats：The statistics of the memory allocator
+
+## Backup
+
+- Backup CPU：The CPU utilization of the backup thread
+- Range Size：The histogram of backup range size
+- Backup Duration：The time consumed for backup
+- Backup Flow：The total bytes of backup
+- Disk Throughput：The disk throughput per instance
+- Backup Range Duration：The time consumed for range backup
+- Backup Errors：The number of errors encountered when making a backup
+
+## Encryption
+
+- Encryption data keys：The total number of encrypted data keys
+- Encrypted files：The number of encrypted files
+- Encryption initialized：Shows whether encryption is enabled. `1` means enabled.
+- Encryption meta files size：The size of the encryption meta file
+- Encrypt/decrypt data nanos：The histogram of time on encrypting/decrypting data ecch time
+- Read/write encryption meta duration：The time consumed for reading/writing encryption meta file
+
+## 面板常见参数的解释
+
+### gRPC 消息类型
+
+1. 使用事务型接口的命令：
+
+    - kv_get：The command of getting the latest version of data specified by ts
+    - kv_scan：The command of scanning a continuous piece of data
+    - kv_prewrite：The command of prewriting the data to be committed at first phase of 2PC
+    - kv_pessimistic_lock：The command of adding a pessimistic lock to the key to prevent other transactions from modifying it
+    - kv_pessimistic_rollback：The command of deleting the pessimistic lock on the key
+    - kv_txn_heart_beat：The command of updating `lock_ttl` for pessimistic transactions or large transactions to prevent them from rolling back
+    - kv_check_txn_status：The command of checking the status of the transaction
+    - kv_commit：The command of committing the data written by prewrite command
+    - kv_cleanup：The command of rolling back a transaction. It will be deprecated in 4.0.
+    - kv_batch_get：The command of getting the values of a batch of keys at once, similar to `kv_get`
+    - kv_batch_rollback：The command of rolling back multiple prewrite transactions in a batch
+    - kv_scan_lock：The command of scanning all locks with a version number before `max_version` to clean up expired transactions
+    - kv_resolve_lock：The command of committing or rolling back the transaction lock, according to the transaction status
+    - kv_gc：The command of GC
+    - kv_delete_range：The command of deleting a continuous piece of data from TiKV
+
+2. 非事务型的裸命令：
+
+    - raw_get：The command of getting the value of key
+    - raw_batch_get：The command of getting the value of batch keys
+    - raw_scan：The command of scanning a continuous piece of data
+    - raw_batch_scan：The command of scanning multiple continuous pieces of data
+    - raw_put：The command of writing a key/value pair
+    - raw_batch_put：The command of writing a batch of key/value pairs
+    - raw_delete：The command of deleting a key/value pair
+    - raw_batch_delete：The command of deleting a batch of key/value pairs
+    - raw_delete_range：The command of deleting a continuous interval

From e06f91c0ecf8808e7fba410d8c4031096ffec3fb Mon Sep 17 00:00:00 2001
From: Win-Man <825895587@qq.com>
Date: Fri, 26 Jun 2020 23:36:12 +0800
Subject: [PATCH 02/13] fix lint

---
 grafana-tikv-dashboard.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md
index f98774a9e5aa8..465b6c9bdc1e8 100644
--- a/grafana-tikv-dashboard.md
+++ b/grafana-tikv-dashboard.md
@@ -24,8 +24,8 @@ This document provides a detailed description of these key metrics.
 - Memory：The memory usage per TiKV instance
 - IO utilization：The I/O utilization per TiKV instance
 - MBps：The total bytes of read and write in each TiKV instance
-- QPS：	The QPS per command in each TiKV instance
-- Errps：	The total number of gRPC message failures
+- QPS：The QPS per command in each TiKV instance
+- Errps：The total number of gRPC message failures
 - leader：The number of leaders per TiKV instance
 - Region：The number of Regions per TiKV instance
 - Uptime：The runtime of TiKV since last restart

From 763e69d44b17259e473db23ec12ee39d72e9bd7a Mon Sep 17 00:00:00 2001
From: Win-Man <825895587@qq.com>
Date: Sun, 28 Jun 2020 19:18:58 +0800
Subject: [PATCH 03/13] Fix first round of review

---
 grafana-tikv-dashboard.md | 72 +++++++++++++++++++--------------------
 1 file changed, 36 insertions(+), 36 deletions(-)

diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md
index 465b6c9bdc1e8..0c1908d6cdeee 100644
--- a/grafana-tikv-dashboard.md
+++ b/grafana-tikv-dashboard.md
@@ -5,15 +5,15 @@ category: reference
 aliases: ['/docs/dev/grafana-tikv-dashboard/','/docs/dev/reference/key-monitoring-metrics/tikv-dashboard/']
 ---
 
-# The metrics description of TiKV
+# Description of TiKV Monitoring Metrics
 
 If you use TiUP to deploy the TiDB cluster, the monitoring system (Prometheus/Grafana) is deployed at the same time. For more information, see [Overview of the Monitoring Framework](/tidb-monitoring-framework.md).
 
-The Grafana dashboard is divided into a series of sub dashboards which include Overview, PD, TiDB, TiKV, Node\_exporter, and so on. A lot of metrics are there to help you diagnose.
+The Grafana dashboard is divided into a series of sub dashboards which include Overview, PD, TiDB, TiKV, Node_exporter, and so on. A lot of metrics are there to help you diagnose.
 
-You can get an overview of the component TiKV status from the TiKV dashboard, where the key metrics are displayed. According to the [Performance Map](https://asktug.com/_/tidb-performance-map/#/), you can check whether the status of the cluster is as expected.
+You can get an overview of the component TiKV status from the **TiKV-Details** dashboard, where the key metrics are displayed. According to the [Performance Map](https://asktug.com/_/tidb-performance-map/#/), you can check whether the status of the cluster is as expected.
 
-This document provides a detailed description of these key metrics.
+This document provides a detailed description of these key metrics on the **TiKV-Details** dashboard.
 
 ## Cluster
 
@@ -25,7 +25,7 @@ This document provides a detailed description of these key metrics.
 - IO utilization：The I/O utilization per TiKV instance
 - MBps：The total bytes of read and write in each TiKV instance
 - QPS：The QPS per command in each TiKV instance
-- Errps：The total number of gRPC message failures
+- Errps：The rate of gRPC message failures
 - leader：The number of leaders per TiKV instance
 - Region：The number of Regions per TiKV instance
 - Uptime：The runtime of TiKV since last restart
@@ -53,27 +53,27 @@ This document provides a detailed description of these key metrics.
 - Channel full：The number of Channel Full errors per TiKV instance. It should be `0` in normal case.
 - Active written leaders：The number of leaders being written on each TiKV instance
 - Approximate Region size：The approximate Region size
-- Approximate Region size Histogram：The histogram of approximate Region size
-- Region average written keys：The average rate of written keys to Regions per TiKV instance
-- Region average written bytes：The average rate of writing bytes to Regions per TiKV instance
+- Approximate Region size Histogram：The histogram of each approximate Region size
+- Region average written keys：The average number of written keys to Regions per TiKV instance
+- Region average written bytes: The average written bytes to Regions per TiKV instance
 
 ![TiKV Dashboard - Server metrics](/media/tikv-dashboard-server.png)
 
 ## gRPC
 
-- gRPC message count：The number of gRPC messages
+- gRPC message count: The number of gRPC messages per type
 - gRPC message failed：The number of failed gRPC messages
-- 99% gRPC message duration：99% duration of gRPC messages
-- Average gRPC message duration：Average duration of gRPC messages
+- 99% gRPC message duration: The gRPC message duration per message type (P99)
+- Average gRPC message duration: The average execution time of gRPC messages
 - gRPC batch size：The batch size of gRPC messages between TiDB and TiKV
-- raft message batch size：The batch size of raft messages
+- Raft message batch size：The batch size of Raft messages between TiKV instances
 
 ## Thread CPU
 
 - Raft store CPU：The CPU utilization of the raftstore thread. The CPU usage should be less than 80% * `raftstore.store-pool-size` in normal case.
-- Async apply CPU：The CPU utilization of async apply. The CPU usage should be less than 80% * `raftstore.apply-pool-size` in normal case.
-- Scheduler worker CPU：The CPU utilization of scheduler. The CPU usage should be less than 90% * `storage.scheduler-worker-pool-size` in normal case.
-- gRPC poll CPU：The CPU utilization of gRPC. The CPU usage should be less than 80% * `server.grpc-concurrency` in normal case.
+- Async apply CPU：The CPU utilization of the `async apply` thread. The CPU usage should be less than 90% * `raftstore.apply-pool-size` in normal cases.
+- Scheduler worker CPU：The CPU utilization of the `scheduler worker` thread. The CPU usage should be less than 90% * `storage.scheduler-worker-pool-size` in normal cases.
+- gRPC poll CPU：The CPU utilization of the `gRPC` thread. The CPU usage should be less than 80% * `server.grpc-concurrency` in normal cases.
 - Unified read pool CPU：The CPU utilization of unified read pool
 - Storage ReadPool CPU：The CPU utilization of readpool
 - Coprocessor CPU：The CPU utilization of coprocessor
@@ -85,13 +85,13 @@ This document provides a detailed description of these key metrics.
 ## PD
 
 - PD requests：The count of requests that TiKV sends to PD
-- PD request duration (average)：The time consumed by requests that TiKV sends to PD
+- PD request duration (average)：The average time consumed by requests that TiKV sends to PD
 - PD heartbeats：The total number of PD heartbeat messages
 - PD validate peers：The total number of peers validated by the PD worker
 
 ## Raft IO
 
-- Apply log duration：Raft apply The time consumed for Raft to apply logs
+- Apply log duration：The time consumed for Raft to apply logs
 - Apply log duration per server：The time consumed for Raft to apply logs per TiKV instance
 - Append log duration：The time consumed for Raft to append logs
 - Append log duration per server：The time consumed for Raft to append logs per TiKV instance
@@ -102,35 +102,35 @@ This document provides a detailed description of these key metrics.
 
 ## Raft process
 
-- Ready handled：The count of handled ready buckets per region
+- Ready handled：The count of handled ready operations per second
 - 0.99 Duration of Raft store events：The time consumed by raftstore events (P99)
 - Process ready duration：The time consumed for processes to be ready in Raft
-- Process ready duration per server：The time consumed for peer processes to be ready in Raft. It should be less than 2s(P99.99).
+- Process ready duration per server：The time consumed for peer processes to be ready in Raft. It should be less than 2 seconds (P99.99).
 
 ![TiKV Dashboard - Raft process metrics](/media/tikv-dashboard-raft-process.png)
 
 ## Raft message
 
-- Sent messages per server：The number of Raft messages sent by each TiKV instance
-- Flush messages per server：The number of Raft messages flushed by each TiKV instance
-- Receive messages per server：The number of Raft messages received by each TiKV instance
-- Messages：The number of Raft messages sent per type
-- Vote：The number of Vote messages sent in Raft
+- Sent messages per server：The number of Raft messages sent per second by each TiKV instance
+- Flush messages per server：The number of Raft messages flushed per second by the Raft client in each TiKV instance
+- Receive messages per server：The number of Raft messages received per second by each TiKV instance
+- Messages：The number of Raft messages sent per type per second
+- Vote：The number of Vote messages sent in Raft per second
 - Raft dropped messages：The number of dropped Raft messages per type
 
 ![TiKV Dashboard - Raft message metrics](/media/tikv-dashboard-raft-message.png)
 
 ## Raft propose
 
-- Raft apply proposals per ready：The number of Raft proposals of all Regions per ready handled bucket
+- Raft apply proposals per ready：The histogram of the number of proposals that each ready operation containes in a batch while applying proposal.
 - Raft read/write proposals：The number of proposals per type
 - Raft read proposals per server：The number of read proposals made by each TiKV instance
 - Raft write proposals per server：The number of write proposals made by each TiKV instance
-- Propose wait duration：The wait time of each proposal
-- Propose wait duration per server：The wait time of each proposal per TiKV instance
-- Apply wait duration：The apply time of each proposal
-- Apply wait duration per server：The apply time of each proposal per TiKV instance
-- Raft log speed：The rate at which peers propose logs
+- Propose wait duration：The histogram of wait time of each proposal
+- Propose wait duration per server：The histogram of wait time of each proposal per TiKV instance
+- Apply wait duration：The histogram of apply time of each proposal
+- Apply wait duration per server：The histogram of apply time of each proposal per TiKV instance
+- Raft log speed：The average rate at which peers propose logs
 
 ![TiKV Dashboard - Raft propose metrics](/media/tikv-dashboard-raft-propose.png)
 
@@ -138,8 +138,8 @@ This document provides a detailed description of these key metrics.
 
 - Admin proposals：The number of admin proposals
 - Admin apply：The number of processed apply commands
-- Check split：The number of raftstore split checks
-- 99.99% Check split duration：The time consumed when running split checks (P99.99)
+- Check split：The number of raftstore split check commands
+- 99.99% Check split duration：The time consumed when running split check commands (P99.99)
 
 ![TiKV Dashboard - Raft admin metrics](/media/tikv-dashboard-raft-admin.png)
 
@@ -386,11 +386,11 @@ This document provides a detailed description of these key metrics.
 - Encrypt/decrypt data nanos：The histogram of time on encrypting/decrypting data ecch time
 - Read/write encryption meta duration：The time consumed for reading/writing encryption meta file
 
-## 面板常见参数的解释
+## Explanation of Common Parameters
 
-### gRPC 消息类型
+### gRPC Message Type
 
-1. 使用事务型接口的命令：
+1. Transactional API：
 
     - kv_get：The command of getting the latest version of data specified by ts
     - kv_scan：The command of scanning a continuous piece of data
@@ -408,7 +408,7 @@ This document provides a detailed description of these key metrics.
     - kv_gc：The command of GC
     - kv_delete_range：The command of deleting a continuous piece of data from TiKV
 
-2. 非事务型的裸命令：
+2. Raw API：
 
     - raw_get：The command of getting the value of key
     - raw_batch_get：The command of getting the value of batch keys

From ba0a764c28f763e6bea0c460eeaaf8561933c92b Mon Sep 17 00:00:00 2001
From: Win-Man <825895587@qq.com>
Date: Sun, 28 Jun 2020 19:23:33 +0800
Subject: [PATCH 04/13] fix

---
 grafana-tikv-dashboard.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md
index 0c1908d6cdeee..295fef15267a9 100644
--- a/grafana-tikv-dashboard.md
+++ b/grafana-tikv-dashboard.md
@@ -116,7 +116,7 @@ This document provides a detailed description of these key metrics on the **TiKV
 - Receive messages per server：The number of Raft messages received per second by each TiKV instance
 - Messages：The number of Raft messages sent per type per second
 - Vote：The number of Vote messages sent in Raft per second
-- Raft dropped messages：The number of dropped Raft messages per type
+- Raft dropped messages：The number of dropped Raft messages per type per second
 
 ![TiKV Dashboard - Raft message metrics](/media/tikv-dashboard-raft-message.png)
 

From ffe3c1be5366fb70f4ace057853857dfe576befc Mon Sep 17 00:00:00 2001
From: Win-Man <825895587@qq.com>
Date: Sun, 28 Jun 2020 23:04:28 +0800
Subject: [PATCH 05/13] Fix some errors like ops

---
 grafana-tikv-dashboard.md | 102 +++++++++++++++++++-------------------
 1 file changed, 51 insertions(+), 51 deletions(-)

diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md
index 295fef15267a9..3931030223e2f 100644
--- a/grafana-tikv-dashboard.md
+++ b/grafana-tikv-dashboard.md
@@ -61,8 +61,8 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## gRPC
 
-- gRPC message count: The number of gRPC messages per type
-- gRPC message failed：The number of failed gRPC messages
+- gRPC message count: The rate of gRPC messages per type
+- gRPC message failed：The rate of failed gRPC messages
 - 99% gRPC message duration: The gRPC message duration per message type (P99)
 - Average gRPC message duration: The average execution time of gRPC messages
 - gRPC batch size：The batch size of gRPC messages between TiDB and TiKV
@@ -70,24 +70,24 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## Thread CPU
 
-- Raft store CPU：The CPU utilization of the raftstore thread. The CPU usage should be less than 80% * `raftstore.store-pool-size` in normal case.
+- Raft store CPU：The CPU utilization of the `raftstore` thread. The CPU usage should be less than 80% * `raftstore.store-pool-size` in normal cases.
 - Async apply CPU：The CPU utilization of the `async apply` thread. The CPU usage should be less than 90% * `raftstore.apply-pool-size` in normal cases.
 - Scheduler worker CPU：The CPU utilization of the `scheduler worker` thread. The CPU usage should be less than 90% * `storage.scheduler-worker-pool-size` in normal cases.
 - gRPC poll CPU：The CPU utilization of the `gRPC` thread. The CPU usage should be less than 80% * `server.grpc-concurrency` in normal cases.
-- Unified read pool CPU：The CPU utilization of unified read pool
-- Storage ReadPool CPU：The CPU utilization of readpool
-- Coprocessor CPU：The CPU utilization of coprocessor
-- RocksDB CPU：The CPU utilization of RocksDB
-- Split check CPU：The CPU utilization of split check
-- GC worker CPU：The CPU utilization of GC worker 
-- Snapshot worker CPU：The CPU utilization of snapshot worker
+- Unified read pool CPU：The CPU utilization of the `unified read pool` thread
+- Storage ReadPool CPU：The CPU utilization of the `storage read pool` thread
+- Coprocessor CPU：The CPU utilization of the `coprocessor` thread
+- RocksDB CPU：The CPU utilization of the RocksDB thread
+- Split check CPU：The CPU utilization of the `split check` thread
+- GC worker CPU：The CPU utilization of the `GC worker` thread
+- Snapshot worker CPU：The CPU utilization of the `snapshot worker` thread
 
 ## PD
 
-- PD requests：The count of requests that TiKV sends to PD
+- PD requests：The rate of requests that TiKV sends to PD
 - PD request duration (average)：The average time consumed by requests that TiKV sends to PD
-- PD heartbeats：The total number of PD heartbeat messages
-- PD validate peers：The total number of peers validated by the PD worker
+- PD heartbeats：The rate of heartbeat messages sent from TiKV to PD
+- PD validate peers：The rate of messages sent from TiKV to PD to validate peers
 
 ## Raft IO
 
@@ -105,15 +105,15 @@ This document provides a detailed description of these key metrics on the **TiKV
 - Ready handled：The count of handled ready operations per second
 - 0.99 Duration of Raft store events：The time consumed by raftstore events (P99)
 - Process ready duration：The time consumed for processes to be ready in Raft
-- Process ready duration per server：The time consumed for peer processes to be ready in Raft. It should be less than 2 seconds (P99.99).
+- Process ready duration per server：The time consumed for peer processes to be ready in Raft per TiKV instance. It should be less than 2 seconds (P99.99).
 
 ![TiKV Dashboard - Raft process metrics](/media/tikv-dashboard-raft-process.png)
 
 ## Raft message
 
-- Sent messages per server：The number of Raft messages sent per second by each TiKV instance
-- Flush messages per server：The number of Raft messages flushed per second by the Raft client in each TiKV instance
-- Receive messages per server：The number of Raft messages received per second by each TiKV instance
+- Sent messages per server：The number of Raft messages sent by each TiKV instance per second
+- Flush messages per server：The number of Raft messages flushed by the Raft client in each TiKV instance per second
+- Receive messages per server：The number of Raft messages received by each TiKV instance per second
 - Messages：The number of Raft messages sent per type per second
 - Vote：The number of Vote messages sent in Raft per second
 - Raft dropped messages：The number of dropped Raft messages per type per second
@@ -123,9 +123,9 @@ This document provides a detailed description of these key metrics on the **TiKV
 ## Raft propose
 
 - Raft apply proposals per ready：The histogram of the number of proposals that each ready operation containes in a batch while applying proposal.
-- Raft read/write proposals：The number of proposals per type
-- Raft read proposals per server：The number of read proposals made by each TiKV instance
-- Raft write proposals per server：The number of write proposals made by each TiKV instance
+- Raft read/write proposals：The number of proposals per type per second
+- Raft read proposals per server：The number of read proposals made by each TiKV instance per second
+- Raft write proposals per server：The number of write proposals made by each TiKV instance per second
 - Propose wait duration：The histogram of wait time of each proposal
 - Propose wait duration per server：The histogram of wait time of each proposal per TiKV instance
 - Apply wait duration：The histogram of apply time of each proposal
@@ -136,9 +136,9 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## Raft admin
 
-- Admin proposals：The number of admin proposals
-- Admin apply：The number of processed apply commands
-- Check split：The number of raftstore split check commands
+- Admin proposals：The number of admin proposals per second
+- Admin apply：The number of processed apply commands per second
+- Check split：The number of raftstore split check commands per second
 - 99.99% Check split duration：The time consumed when running split check commands (P99.99)
 
 ![TiKV Dashboard - Raft admin metrics](/media/tikv-dashboard-raft-admin.png)
@@ -157,8 +157,8 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## Storage
 
-- Storage command total：The total number of received commands per type
-- Storage async request error：The total number of engine asynchronous request errors
+- Storage command total：The number of received commands by type per second
+- Storage async request error：The number of engine asynchronous request errors per second
 - Storage async snapshot duration：The time consumed by processing asynchronous snapshot requests. It should be less than `1s` in `.99`.
 - Storage async write duration：The time consumed by processing asynchronous write requests. It should be less than `1s` in `.99`.
 
@@ -166,7 +166,7 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## Scheduler
 
-- Scheduler stage total：The total number of commands at each stage. There should not be lots of errors in a short time.
+- Scheduler stage total：The number of commands at each stage per second. There should not be lots of errors in a short time.
 - Scheduler writing bytes：The total bytes of writing bytes per TiKV instance
 - Scheduler priority commands：The count of different priority commands
 - Scheduler pending commands：The count of pending commands per TiKV instance
@@ -175,7 +175,7 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## Scheduler - commit
 
-- Scheduler stage total：The total number of commands at each stage when executing the commit command. There should not be lots of errors in a short time.
+- Scheduler stage total：The number of commands at each stage per second when executing the commit command. There should not be lots of errors in a short time.
 - Scheduler command duration：The time consumed when executing the commit command. It should be less than `1s`.
 - Scheduler latch wait duration：The wait time caused by latch when executing the commit command. It should be less than `1s`.
 - Scheduler keys read：The count of keys read by a commit command
@@ -189,7 +189,7 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## Scheduler - pessimistic_rollback
 
-- Scheduler stage total：The total number of commands at each stage when executing the pessimistic_rollback command. There should not be lots of errors in a short time.
+- Scheduler stage total：The number of commands at each stage per second when executing the pessimistic_rollback command. There should not be lots of errors in a short time.
 - Scheduler command duration：The time consumed when executing the pessimistic_rollback command. It should be less than `1s`.
 - Scheduler latch wait duration：The wait time caused by latch when executing the pessimistic_rollback command. It should be less than `1s`.
 - Scheduler keys read：The count of keys read by a pessimistic_rollback command
@@ -201,7 +201,7 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## Scheduler - prewrite
 
-- Scheduler stage total：The total number of commands at each stage when executing the prewrite command. There should not be lots of errors in a short time.
+- Scheduler stage total：The number of commands at each stage per second when executing the prewrite command. There should not be lots of errors in a short time.
 - Scheduler command duration：The time consumed when executing the prewrite command. It should be less than `1s`.
 - Scheduler latch wait duration：The wait time caused by latch when executing the prewrite command. It should be less than `1s`.
 - Scheduler keys read：The count of keys read by a prewrite command
@@ -213,7 +213,7 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## Scheduler - rollback
 
-- Scheduler stage total：The total number of commands at each stage when executing the rollback command. There should not be lots of errors in a short time.
+- Scheduler stage total：The number of commands at each stage per second when executing the rollback command. There should not be lots of errors in a short time.
 - Scheduler command duration：The time consumed when executing the rollback command. It should be less than `1s`.
 - Scheduler latch wait duration：The wait time caused by latch when executing the rollback command. It should be less than `1s`.
 - Scheduler keys read：The count of keys read by a rollback command
@@ -236,7 +236,7 @@ This document provides a detailed description of these key metrics on the **TiKV
 - TiKV AutoGC Working：The status of Auto GC 
 - ResolveLocks Progress：The progress of the first phase of GC(ResolveLocks)
 - TiKV Auto GC Progress：The progress of the second phase of GC
-- TiKV Auto GC SafePoint：TiKV GC safr point value, safe point is the current GC timestamp
+- TiKV Auto GC SafePoint：TiKV GC safe point value, safe point is the current GC timestamp
 - GC lifetime：The lifetime of TiDB GC
 - GC interval：The interval of TiDB GC
 
@@ -250,19 +250,19 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## Task
 
-- Worker handled tasks：The number of tasks handled by worker
-- Worker pending tasks：Current number of pending and running tasks of worker. It should be less than `1000` in normal case.
-- FuturePool handled tasks：The number of tasks handled by future_pool
-- FuturePool pending tasks：Current number of pending and running tasks of future_pool
+- Worker handled tasks：The number of tasks handled by worker persecond
+- Worker pending tasks：Current number of pending and running tasks of worker per second. It should be less than `1000` in normal case.
+- FuturePool handled tasks：The number of tasks handled by future_pool per second
+- FuturePool pending tasks：Current number of pending and running tasks of future_pool per second
 
 ## Coprocessor Overview
 
-- Request duration：The time consumed to handle coprocessor read requests
-- Total Requests：The number of total coprocessor request
+- Request duration：The total time spent from receiving the coprocessor request to the end of processing
+- Total Requests：The number of requests by type per second
 - Handle duration：The histogram of time spent actually processing coprocessor requests per minute
-- Total Request Errors：The total number of the coprocessor request errors
-- Total KV Cursor Operations：The total number of the KV cursor operations, such as select, index, analyze_table, analyze_index, checksum_table, checksum_index, and so on.
-- KV Cursor Operations：The histogram of KV cursor operations
+- Total Request Errors：The number of request error of Coprocessor.  There should not be lots of errors in a short time.
+- Total KV Cursor Operations：The total number of the KV cursor operations by type per second, such as select, index, analyze_table, analyze_index, checksum_table, checksum_index, and so on.
+- KV Cursor Operations：The histogram of KV cursor operations by type per second
 - Total RocksDB Perf Statistics：The performance statistics of RocksDB
 - Total Response Size：The total size of coprocessor response
 
@@ -272,12 +272,12 @@ This document provides a detailed description of these key metrics on the **TiKV
 - 95% Handle duration by store：The time consumed to handle coprocessor requests per TiKV instance (P95)
 - Wait duration：The time consumed when coprocessor requests are waiting to be handled. It should be less than `10s`(P99.99).
 - 95% Wait duration by store：The time consumed when coprocessor requests are waiting to be handled per TiKV instance (P95)
-- Total DAG Requests：The total number of DAG requests
-- Total DAG Executors：The total number of DAG executors
-- Total Ops Details (Table Scan)：The total number of RocksDB internal operations when executing select scan
-- Total Ops Details (Index Scan)：The total number of RocksDB internal operations when executing index scan
-- Total Ops Details by CF (Table Scan)：The select scan details for each CF
-- Total Ops Details by CF (Index Scan)：The index scan details for each CF
+- Total DAG Requests：The number of DAG requests per second
+- Total DAG Executors：The number of DAG executors per second
+- Total Ops Details (Table Scan)：The number of RocksDB internal operations per second when executing select scan in coprocessor
+- Total Ops Details (Index Scan)：The number of RocksDB internal operations per second when executing index scan in coprocessor
+- Total Ops Details by CF (Table Scan)：The number of RocksDB internal operations for each CF per second when executing select scan in coprocessor
+- Total Ops Details by CF (Index Scan)：The number of RocksDB internal operations for each CF per second when executing index scan in coprocessor
 
 ## Threads
 
@@ -288,16 +288,16 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## RocksDB - kv/raft
 
-- Get operations：The count of get operations
+- Get operations：The count of get operations per second
 - Get duration：The time consumed when executing get operations
-- Seek operations：The count of seek operations
+- Seek operations：The count of seek operations per second
 - Seek duration：The time consumed when executing seek operations
-- Write operations：The count of write operations
+- Write operations：The count of write operations per second
 - Write duration：The time consumed when executing write operations
-- WAL sync operations：The count of WAL sync operations
+- WAL sync operations：The count of WAL sync operations per second
 - Write WAL duration：The time consumed for writing WAL
 - WAL sync duration：The time consumed when executing WAL sync operations
-- Compaction operationsThe count of compaction and flush operations
+- Compaction operationsThe count of compaction and flush operations per second
 - Compaction duration：The time consumed when executing the compaction and flush operations
 - SST read duration：The time consumed when reading SST files
 - Write stall duration：Write stall duration. It should be `0` in normal case.

From a85ad8c3fc6676120a750c5da7089e843bc6b50d Mon Sep 17 00:00:00 2001
From: Win-Man <825895587@qq.com>
Date: Fri, 3 Jul 2020 23:41:35 +0800
Subject: [PATCH 06/13] fix ditto

---
 grafana-tikv-dashboard.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md
index 3931030223e2f..944706ae49fca 100644
--- a/grafana-tikv-dashboard.md
+++ b/grafana-tikv-dashboard.md
@@ -393,7 +393,7 @@ This document provides a detailed description of these key metrics on the **TiKV
 1. Transactional API：
 
     - kv_get：The command of getting the latest version of data specified by ts
-    - kv_scan：The command of scanning a continuous piece of data
+    - kv_scan：The command of scanning a range of data
     - kv_prewrite：The command of prewriting the data to be committed at first phase of 2PC
     - kv_pessimistic_lock：The command of adding a pessimistic lock to the key to prevent other transaction from modifying
     - kv_pessimistic_rollback：The command of deleting the pessimistic lock on the key
@@ -406,16 +406,16 @@ This document provides a detailed description of these key metrics on the **TiKV
     - kv_scan_lock：The command of scanning all locks with a version number before `max_version` to clean up expired transactions
     - kv_resolve_lock：The command of committing or rollback the transaction lock, according to the transaction status.
     - kv_gc：The command of GC
-    - kv_delete_range：The command of deleting a continuous piece of data from TiKV
+    - kv_delete_range：The command of deleting a range of data from TiKV
 
 2. Raw API：
 
     - raw_get：The command of getting the value of key
     - raw_batch_get：The command of getting the value of batch keys
-    - raw_scan：The command of scanning a continuous piece of data
+    - raw_scan：The command of scanning a range of data
     - raw_batch_scan：The command of scanning multiple consecutive data
     - raw_put：The command of writing a key/value pair
     - raw_batch_put：The command of writing a batch of key/value pairs
     - raw_delete：The command of deleting a key/value pair
     - raw_batch_delete：The command of a batch of key/value pairs
-    - raw_delete_range：The command of deleting a continuous interval
+    - raw_delete_range：The command of deleting a range of data

From bf2881a501242eb6c72474c3323640d0b2be5bb9 Mon Sep 17 00:00:00 2001
From: Win-Man <825895587@qq.com>
Date: Wed, 8 Jul 2020 12:53:57 +0800
Subject: [PATCH 07/13] Update grafana-tikv-dashboard.md

Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com>
---
 grafana-tikv-dashboard.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md
index 944706ae49fca..c8779b3b4c3f4 100644
--- a/grafana-tikv-dashboard.md
+++ b/grafana-tikv-dashboard.md
@@ -151,7 +151,7 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## Unified Read Pool
 
-- Time used by level：The time consumed for each level in unified read pool, level 0 means small query 
+- Time used by level：The time consumed for each level in the unified read pool. Level 0 means small queries.
 - Level 0 chance：The proportion of level 0 tasks in unified read pool
 - Running tasks：The number of tasks running concurrently in the unified read pool
 

From 2d21f2031386627752508dd05db48848c4067562 Mon Sep 17 00:00:00 2001
From: Win-Man <825895587@qq.com>
Date: Wed, 8 Jul 2020 12:54:14 +0800
Subject: [PATCH 08/13] Update grafana-tikv-dashboard.md

Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com>
---
 grafana-tikv-dashboard.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md
index c8779b3b4c3f4..42532c1e1c5d3 100644
--- a/grafana-tikv-dashboard.md
+++ b/grafana-tikv-dashboard.md
@@ -167,7 +167,7 @@ This document provides a detailed description of these key metrics on the **TiKV
 ## Scheduler
 
 - Scheduler stage total：The number of commands at each stage per second. There should not be lots of errors in a short time.
-- Scheduler writing bytes：The total bytes of writing bytes per TiKV instance
+- Scheduler writing bytes：The total written bytes of commands processed by each TiKV instance
 - Scheduler priority commands：The count of different priority commands
 - Scheduler pending commands：The count of pending commands per TiKV instance
 

From 0e9613dfa2ec8a354d7716f9b100eee0d52cf35a Mon Sep 17 00:00:00 2001
From: Win-Man <825895587@qq.com>
Date: Wed, 8 Jul 2020 14:13:21 +0800
Subject: [PATCH 09/13] Update grafana-tikv-dashboard

---
 grafana-tikv-dashboard.md | 58 +++++++++++++++++++--------------------
 1 file changed, 29 insertions(+), 29 deletions(-)

diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md
index 42532c1e1c5d3..cc15ae462435b 100644
--- a/grafana-tikv-dashboard.md
+++ b/grafana-tikv-dashboard.md
@@ -168,8 +168,8 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 - Scheduler stage total：The number of commands at each stage per second. There should not be lots of errors in a short time.
 - Scheduler writing bytes：The total written bytes of commands processed by each TiKV instance
-- Scheduler priority commands：The count of different priority commands
-- Scheduler pending commands：The count of pending commands per TiKV instance
+- Scheduler priority commands：The count of different priority commands per second
+- Scheduler pending commands：The count of pending commands per TiKV instance per second
 
 ![TiKV Dashboard - Scheduler metrics](/media/tikv-dashboard-scheduler.png)
 
@@ -234,9 +234,9 @@ This document provides a detailed description of these key metrics on the **TiKV
 - TiDB GC seconds：The GC duration
 - GC speed：The number of keys deleted by GC per second
 - TiKV AutoGC Working：The status of Auto GC 
-- ResolveLocks Progress：The progress of the first phase of GC(ResolveLocks)
+- ResolveLocks Progress：The progress of the first phase of GC(Resolve Locks)
 - TiKV Auto GC Progress：The progress of the second phase of GC
-- TiKV Auto GC SafePoint：TiKV GC safe point value, safe point is the current GC timestamp
+- TiKV Auto GC SafePoint：The value of TiKV GC safe point. The safe point is the current GC timestamp
 - GC lifetime：The lifetime of TiDB GC
 - GC interval：The interval of TiDB GC
 
@@ -250,17 +250,17 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## Task
 
-- Worker handled tasks：The number of tasks handled by worker persecond
+- Worker handled tasks：The number of tasks handled by worker per second
 - Worker pending tasks：Current number of pending and running tasks of worker per second. It should be less than `1000` in normal case.
-- FuturePool handled tasks：The number of tasks handled by future_pool per second
-- FuturePool pending tasks：Current number of pending and running tasks of future_pool per second
+- FuturePool handled tasks：The number of tasks handled by future pool per second
+- FuturePool pending tasks：Current number of pending and running tasks of future pool per second
 
 ## Coprocessor Overview
 
-- Request duration：The total time spent from receiving the coprocessor request to the end of processing
+- Request duration：The total time spent from receiving the coprocessor request to the end of request processing
 - Total Requests：The number of requests by type per second
 - Handle duration：The histogram of time spent actually processing coprocessor requests per minute
-- Total Request Errors：The number of request error of Coprocessor.  There should not be lots of errors in a short time.
+- Total Request Errors：The number of request errors of Coprocessor. There should not be lots of errors in a short time.
 - Total KV Cursor Operations：The total number of the KV cursor operations by type per second, such as select, index, analyze_table, analyze_index, checksum_table, checksum_index, and so on.
 - KV Cursor Operations：The histogram of KV cursor operations by type per second
 - Total RocksDB Perf Statistics：The performance statistics of RocksDB
@@ -269,11 +269,11 @@ This document provides a detailed description of these key metrics on the **TiKV
 ## Coprocessor Detail
 
 - Handle duration：The histogram of time spent actually processing coprocessor requests per minute
-- 95% Handle duration by store：The time consumed to handle coprocessor requests per TiKV instance (P95)
-- Wait duration：The time consumed when coprocessor requests are waiting to be handled. It should be less than `10s`(P99.99).
-- 95% Wait duration by store：The time consumed when coprocessor requests are waiting to be handled per TiKV instance (P95)
-- Total DAG Requests：The number of DAG requests per second
-- Total DAG Executors：The number of DAG executors per second
+- 95% Handle duration by store：The time consumed to handle coprocessor requests per TiKV instance per second (P95)
+- Wait duration：The time consumed when coprocessor requests are waiting to be handled. It should be less than `10s` (P99.99).
+- 95% Wait duration by store：The time consumed when coprocessor requests are waiting to be handled per TiKV instance per second (P95)
+- Total DAG Requests：The total number of DAG requests per second
+- Total DAG Executors：The total number of DAG executors per second
 - Total Ops Details (Table Scan)：The number of RocksDB internal operations per second when executing select scan in coprocessor
 - Total Ops Details (Index Scan)：The number of RocksDB internal operations per second when executing index scan in coprocessor
 - Total Ops Details by CF (Table Scan)：The number of RocksDB internal operations for each CF per second when executing select scan in coprocessor
@@ -297,7 +297,7 @@ This document provides a detailed description of these key metrics on the **TiKV
 - WAL sync operations：The count of WAL sync operations per second
 - Write WAL duration：The time consumed for writing WAL
 - WAL sync duration：The time consumed when executing WAL sync operations
-- Compaction operationsThe count of compaction and flush operations per second
+- Compaction operations: The count of compaction and flush operations per second
 - Compaction duration：The time consumed when executing the compaction and flush operations
 - SST read duration：The time consumed when reading SST files
 - Write stall duration：Write stall duration. It should be `0` in normal case.
@@ -325,12 +325,12 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 ## Titan - All
 
-- Blob file count：The number of Titan blob file
+- Blob file count：The number of Titan blob files
 - Blob file size：The total size of Titan blob file
 - Live blob size：The total size of valid blob record
 - Blob cache hit：The hit rate of Titan block cache
 - Iter touched blob file count：The number of blob file involved in a single iterator
-- Blob file discardable ratio distribution：The distribution of blob file failure blob record ratio
+- Blob file discardable ratio distribution：The ratio distribution of blob record failure of blob files
 - Blob key size：The size of Titan blob keys
 - Blob value size：The size of Titan blob values
 - Blob get operations：The count of get operations in Titan blob
@@ -344,7 +344,7 @@ This document provides a detailed description of these key metrics on the **TiKV
 - Blob file read duration：The time consumed when reading Titan blob file
 - Blob file write duration：The time consumed when writing Titan blob file
 - Blob file sync operations：The count of blob file sync operations
-- Blob file sync duration：The time consumed when sync blob file
+- Blob file sync duration：The time consumed when synchronizing blob file
 - Blob GC action：The count of Titan GC actions
 - Blob GC duration：The Titan GC duration
 - Blob GC keys flow：The flow rate of keys read and written by Titan GC
@@ -357,11 +357,11 @@ This document provides a detailed description of these key metrics on the **TiKV
 
 - Thread CPU：The CPU utilization of the lock manager thread
 - Handled tasks：The number of tasks handled by lock manager
-- Waiter lifetime duration：The time consumed for the transaction waitting for the lock to be released
-- Wait table：The status information of wait table, including the number of locks and the number of transactions waitting for the lock
+- Waiter lifetime duration：The waiting time of the transaction for the lock to be released
+- Wait table：The status information of wait table, including the number of locks and the number of transactions waiting for the lock
 - Deadlock detect duration：The time consumed for detecting deadlock
 - Detect error：The number of errors encountered when detecting deadlock, including the number of deadlocks
-- Deadlock detector leader：The information about the node where the deadlock detector leader is located
+- Deadlock detector leader：The information of the node where the deadlock detector leader is located
 
 ## Memory
 
@@ -374,16 +374,16 @@ This document provides a detailed description of these key metrics on the **TiKV
 - Backup Duration：The time consumed for backup
 - Backup Flow：The total bytes of backup
 - Disk Throughput：The disk throughput per instance
-- Backup Range Duration：The time consumed for range backup
-- Backup Errors：The number of errors encountered when making a backup
+- Backup Range Duration：The time consumed for backing up a range
+- Backup Errors：The number of errors encountered during a backup
 
 ## Encryption
 
 - Encryption data keys：The total number of encrypted data keys
 - Encrypted files：The number of encrypted files
-- Encryption initialized：It shows whether encryption is enabled, `1` means enabled.
-- Encryption meta files size：The size of meta file about encrpytion
-- Encrypt/decrypt data nanos：The histogram of time on encrypting/decrypting data ecch time
+- Encryption initialized：Shows whether encryption is enabled, `1` means enabled.
+- Encryption meta files size：The size of the encryption meta file
+- Encrypt/decrypt data nanos：The histogram of duration on encrypting/decrypting data each time
 - Read/write encryption meta duration：The time consumed for reading/writing encryption meta file
 
 ## Explanation of Common Parameters
@@ -395,12 +395,12 @@ This document provides a detailed description of these key metrics on the **TiKV
     - kv_get：The command of getting the latest version of data specified by ts
     - kv_scan：The command of scanning a range of data
     - kv_prewrite：The command of prewriting the data to be committed at first phase of 2PC
-    - kv_pessimistic_lock：The command of adding a pessimistic lock to the key to prevent other transaction from modifying
+    - kv_pessimistic_lock：The command of adding a pessimistic lock to the key to prevent other transaction from modifying this key
     - kv_pessimistic_rollback：The command of deleting the pessimistic lock on the key
     - kv_txn_heart_beat：The command of updating `lock_ttl` for pessimistic transactions or large transactions to prevent them from rolling back
     - kv_check_txn_status：The command of checking the status of the transaction
     - kv_commit：The command of committing the data written by prewrite command
-    - kv_cleanup：The command of rolling back a transaction, it will abolished in 4.0
+    - kv_cleanup：The command of rolling back a transaction, which is deprecated in v4.0
     - kv_batch_get：The command of getting the value of batch key at once, similar to `kv_get`.
     - kv_batch_rollback：The command of batch rollback of multiple prewrite transaction
     - kv_scan_lock：The command of scanning all locks with a version number before `max_version` to clean up expired transactions
@@ -413,7 +413,7 @@ This document provides a detailed description of these key metrics on the **TiKV
     - raw_get：The command of getting the value of key
     - raw_batch_get：The command of getting the value of batch keys
     - raw_scan：The command of scanning a range of data
-    - raw_batch_scan：The command of scanning multiple consecutive data
+    - raw_batch_scan：The command of scanning multiple consecutive data range
     - raw_put：The command of writing a key/value pair
     - raw_batch_put：The command of writing a batch of key/value pairs
     - raw_delete：The command of deleting a key/value pair

From b672b2a2d356ae0ddd84333324027c60f482bacb Mon Sep 17 00:00:00 2001
From: TomShawn <41534398+TomShawn@users.noreply.github.com>
Date: Tue, 14 Jul 2020 14:34:07 +0800
Subject: [PATCH 10/13] Update grafana-tikv-dashboard.md

---
 grafana-tikv-dashboard.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md
index cc15ae462435b..4d801c3b589fe 100644
--- a/grafana-tikv-dashboard.md
+++ b/grafana-tikv-dashboard.md
@@ -11,7 +11,7 @@ If you use TiUP to deploy the TiDB cluster, the monitoring system (Prometheus/Gr
 
 The Grafana dashboard is divided into a series of sub dashboards which include Overview, PD, TiDB, TiKV, Node_exporter, and so on. A lot of metrics are there to help you diagnose.
 
-You can get an overview of the component TiKV status from the **TiKV-Details** dashboard, where the key metrics are displayed. According to the [Performance Map](https://asktug.com/_/tidb-performance-map/#/), you can check whether the status of the cluster is as expected.
+You can get an overview of the component TiKV status from the **TiKV-Details** dashboard, where the key metrics are displayed.
 
 This document provides a detailed description of these key metrics on the **TiKV-Details** dashboard.
 

From 0546baf3e0d33c5c1621b814be6574db4ab83749 Mon Sep 17 00:00:00 2001
From: TomShawn <41534398+TomShawn@users.noreply.github.com>
Date: Tue, 14 Jul 2020 15:25:40 +0800
Subject: [PATCH 11/13] fix typos and refine format

---
 grafana-tikv-dashboard.md | 552 +++++++++++++++++++-------------------
 1 file changed, 276 insertions(+), 276 deletions(-)

diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md
index 4d801c3b589fe..5b75e3107b93f 100644
--- a/grafana-tikv-dashboard.md
+++ b/grafana-tikv-dashboard.md
@@ -18,43 +18,43 @@ This document provides a detailed description of these key metrics on the **TiKV
 ## Cluster
 
 - Store size: The storage size per TiKV instance
-- Available size：The available capacity per TiKV instance
-- Capacity size：The capacity size per TiKV instance
-- CPU：The CPU usage per TiKV instance
-- Memory：The memory usage per TiKV instance
-- IO utilization：The I/O utilization per TiKV instance
-- MBps：The total bytes of read and write in each TiKV instance
-- QPS：The QPS per command in each TiKV instance
-- Errps：The rate of gRPC message failures
-- leader：The number of leaders per TiKV instance
-- Region：The number of Regions per TiKV instance
-- Uptime：The runtime of TiKV since last restart
+- Available size: The available capacity per TiKV instance
+- Capacity size: The capacity size per TiKV instance
+- CPU: The CPU utilization per TiKV instance
+- Memory: The memory usage per TiKV instance
+- IO utilization: The I/O utilization per TiKV instance
+- MBps: The total bytes of read and write in each TiKV instance
+- QPS: The QPS per command in each TiKV instance
+- Errps: The rate of gRPC message failures
+- leader: The number of leaders per TiKV instance
+- Region: The number of Regions per TiKV instance
+- Uptime: The runtime of TiKV since last restart
 
 ![TiKV Dashboard - Cluster metrics](/media/tikv-dashboard-cluster.png)
 
 ## Errors
 
-- Critical error：The number of critical errors
-- Server is busy：Indicates occurrences of events that make the TiKV instance unavailable temporarily, such as Write Stall, Channel Full, and so on. It should be `0` in normal case.
-- Server report failures：The number of error messages reported by server. It should be `0` in normal case.
-- Raftstore error：The number of Raftstore errors per type on each TiKV instance
-- Scheduler error：The number of scheduler errors per type on each TiKV instance
-- Coprocessor error：The number of coprocessor errors per type on each TiKV instance
-- gRPC message error：The number of gRPC message errors per type on each TiKV instance
-- Leader drop：The count of dropped leaders per TiKV instance
-- Leader missing：The count of missing leaders per TiKV instance
+- Critical error: The number of critical errors
+- Server is busy: Indicates occurrences of events that make the TiKV instance unavailable temporarily, such as Write Stall, Channel Full, and so on. It should be `0` in normal case.
+- Server report failures: The number of error messages reported by server. It should be `0` in normal case.
+- Raftstore error: The number of Raftstore errors per type on each TiKV instance
+- Scheduler error: The number of scheduler errors per type on each TiKV instance
+- Coprocessor error: The number of coprocessor errors per type on each TiKV instance
+- gRPC message error: The number of gRPC message errors per type on each TiKV instance
+- Leader drop: The count of dropped leaders per TiKV instance
+- Leader missing: The count of missing leaders per TiKV instance
 
 ![TiKV Dashboard - Errors metrics](/media/tikv-dashboard-errors.png)
 
 ## Server
 
-- CF size：The size of each column family
-- Store size：The storage size per TiKV instance
-- Channel full：The number of Channel Full errors per TiKV instance. It should be `0` in normal case.
-- Active written leaders：The number of leaders being written on each TiKV instance
-- Approximate Region size：The approximate Region size
-- Approximate Region size Histogram：The histogram of each approximate Region size
-- Region average written keys：The average number of written keys to Regions per TiKV instance
+- CF size: The size of each column family
+- Store size: The storage size per TiKV instance
+- Channel full: The number of Channel Full errors per TiKV instance. It should be `0` in normal case.
+- Active written leaders: The number of leaders being written on each TiKV instance
+- Approximate Region size: The approximate Region size
+- Approximate Region size Histogram: The histogram of each approximate Region size
+- Region average written keys: The average number of written keys to Regions per TiKV instance
 - Region average written bytes: The average written bytes to Regions per TiKV instance
 
 ![TiKV Dashboard - Server metrics](/media/tikv-dashboard-server.png)
@@ -62,360 +62,360 @@ This document provides a detailed description of these key metrics on the **TiKV
 ## gRPC
 
 - gRPC message count: The rate of gRPC messages per type
-- gRPC message failed：The rate of failed gRPC messages
+- gRPC message failed: The rate of failed gRPC messages
 - 99% gRPC message duration: The gRPC message duration per message type (P99)
 - Average gRPC message duration: The average execution time of gRPC messages
-- gRPC batch size：The batch size of gRPC messages between TiDB and TiKV
-- Raft message batch size：The batch size of Raft messages between TiKV instances
+- gRPC batch size: The batch size of gRPC messages between TiDB and TiKV
+- Raft message batch size: The batch size of Raft messages between TiKV instances
 
 ## Thread CPU
 
-- Raft store CPU：The CPU utilization of the `raftstore` thread. The CPU usage should be less than 80% * `raftstore.store-pool-size` in normal case.
-- Async apply CPU：The CPU utilization of the `async apply` thread. The CPU usage should be less than 90% * `raftstore.apply-pool-size` in normal cases.
-- Scheduler worker CPU：The CPU utilization of the `scheduler worker` thread. The CPU usage should be less than 90% * `storage.scheduler-worker-pool-size` in normal cases.
-- gRPC poll CPU：The CPU utilization of the `gRPC` thread. The CPU usage should be less than 80% * `server.grpc-concurrency` in normal cases.
-- Unified read pool CPU：The CPU utilization of `unified read pool` thread
-- Storage ReadPool CPU：The CPU utilization of `storage read pool` thread
-- Coprocessor CPU：The CPU utilization of `coprocessor` thread
-- RocksDB CPU：The CPU utilization of RocksDB thread
-- Split check CPU：The CPU utilization of `split check` thread
-- GC worker CPU：The CPU utilization of `GC worker` thread 
-- Snapshot worker CPU：The CPU utilization of `snapshot worker` thread
+- Raft store CPU: The CPU utilization of the `raftstore` thread. The CPU utilization should be less than 80% * `raftstore.store-pool-size` in normal cases.
+- Async apply CPU: The CPU utilization of the `async apply` thread. The CPU utilization should be less than 90% * `raftstore.apply-pool-size` in normal cases.
+- Scheduler worker CPU: The CPU utilization of the `scheduler worker` thread. The CPU utilization should be less than 90% * `storage.scheduler-worker-pool-size` in normal cases.
+- gRPC poll CPU: The CPU utilization of the `gRPC` thread. The CPU utilization should be less than 80% * `server.grpc-concurrency` in normal cases.
+- Unified read pool CPU: The CPU utilization of the `unified read pool` thread
+- Storage ReadPool CPU: The CPU utilization of the `storage read pool` thread
+- Coprocessor CPU: The CPU utilization of the `coprocessor` thread
+- RocksDB CPU: The CPU utilization of the RocksDB thread
+- Split check CPU: The CPU utilization of the `split check` thread
+- GC worker CPU: The CPU utilization of the `GC worker` thread
+- Snapshot worker CPU: The CPU utilization of the `snapshot worker` thread
 
 ## PD
 
-- PD requests：The rate of requests that TiKV sends to PD
-- PD request duration (average)：The average time consumed by requests that TiKV sends to PD
-- PD heartbeats：The rate of heartbeat messages sended from TiKV to PD
-- PD validate peers：The rate of messages that sended from TiKV to PD to validate peer
+- PD requests: The rate at which TiKV sends requests to PD
+- PD request duration (average): The average duration of processing requests that TiKV sends to PD
+- PD heartbeats: The rate at which heartbeat messages are sent from TiKV to PD
+- PD validate peers: The rate at which messages are sent from TiKV to PD to validate TiKV peers
 
 ## Raft IO
 
-- Apply log duration：The time consumed for Raft to apply logs
-- Apply log duration per server：The time consumed for Raft to apply logs per TiKV instance
-- Append log duration：The time consumed for Raft to append logs
-- Append log duration per server：The time consumed for Raft to append logs per TiKV instance
-- Commit log duration：The time consumed for Raft to commit logs
-- Commit log duration per server：The time consumed for Raft to commit logs per TiKV instance
+- Apply log duration: The time consumed for Raft to apply logs
+- Apply log duration per server: The time consumed for Raft to apply logs per TiKV instance
+- Append log duration: The time consumed for Raft to append logs
+- Append log duration per server: The time consumed for Raft to append logs per TiKV instance
+- Commit log duration: The time consumed by Raft to commit logs
+- Commit log duration per server: The time consumed by Raft to commit logs per TiKV instance
 
 ![TiKV Dashboard - Raft IO metrics](/media/tikv-dashboard-raftio.png)
 
 ## Raft process
 
-- Ready handled：The count of handled ready operations per second
-- 0.99 Duration of Raft store events：The time consumed by raftstore events (P99)
-- Process ready duration：The time consumed for processes to be ready in Raft
-- Process ready duration per server：The time consumed for peer processes to be ready in Raft per TiKV instance. It should be less than 2 seconds (P99.99).
+- Ready handled: The count of handled ready operations per second
+- 0.99 Duration of Raft store events: The time consumed by Raftstore events (P99)
+- Process ready duration: The time consumed for processes to be ready in Raft
+- Process ready duration per server: The time consumed for peer processes to be ready in Raft per TiKV instance. It should be less than 2 seconds (P99.99).
 
 ![TiKV Dashboard - Raft process metrics](/media/tikv-dashboard-raft-process.png)
 
 ## Raft message
 
-- Sent messages per server：The number of Raft messages sent by each TiKV instance per second
-- Flush messages per server：The number of Raft messages flushed by the Raft client in each TiKV instance per second
-- Receive messages per server：The number of Raft messages received by each TiKV instance per second
-- Messages：The number of Raft messages sent per type per second
-- Vote：The number of Vote messages sent in Raft per second
-- Raft dropped messages：The number of dropped Raft messages per type per second
+- Sent messages per server: The number of Raft messages sent by each TiKV instance per second
+- Flush messages per server: The number of Raft messages flushed by the Raft client in each TiKV instance per second
+- Receive messages per server: The number of Raft messages received by each TiKV instance per second
+- Messages: The number of Raft messages sent per type per second
+- Vote: The number of Vote messages sent in Raft per second
+- Raft dropped messages: The number of dropped Raft messages per type per second
 
 ![TiKV Dashboard - Raft message metrics](/media/tikv-dashboard-raft-message.png)
 
 ## Raft propose
 
-- Raft apply proposals per ready：The histogram of the number of proposals that each ready operation containes in a batch while applying proposal.
-- Raft read/write proposals：The number of proposals per type per second
-- Raft read proposals per server：The number of read proposals made by each TiKV instance per second
-- Raft write proposals per server：The number of write proposals made by each TiKV instance per second
-- Propose wait duration：The histogram of wait time of each proposal
-- Propose wait duration per server：The histogram of wait time of each proposal per TiKV instance
-- Apply wait duration：The histogram of apply time of each proposal
-- Apply wait duration per server：The histogram of apply time of each proposal per TiKV instance
-- Raft log speed：The average rate at which peers propose logs
+- Raft apply proposals per ready: The histogram of the number of proposals that each ready operation contains in a batch while applying proposals
+- Raft read/write proposals: The number of proposals per type per second
+- Raft read proposals per server: The number of read proposals made by each TiKV instance per second
+- Raft write proposals per server: The number of write proposals made by each TiKV instance per second
+- Propose wait duration: The histogram of waiting time of each proposal
+- Propose wait duration per server: The histogram of waiting time of each proposal per TiKV instance
+- Apply wait duration: The histogram of apply time of each proposal
+- Apply wait duration per server: The histogram of apply time of each proposal per TiKV instance
+- Raft log speed: The average rate at which peers propose logs
 
 ![TiKV Dashboard - Raft propose metrics](/media/tikv-dashboard-raft-propose.png)
 
 ## Raft admin
 
-- Admin proposals：The number of admin proposals per second
-- Admin apply：The number of processed apply commands per second
-- Check split：The number of raftstore split check commands per second
-- 99.99% Check split duration：The time consumed when running split check commands (P99.99)
+- Admin proposals: The number of admin proposals per second
+- Admin apply: The number of processed apply commands per second
+- Check split: The number of Raftstore split check commands per second
+- 99.99% Check split duration: The time consumed when running split check commands (P99.99)
 
 ![TiKV Dashboard - Raft admin metrics](/media/tikv-dashboard-raft-admin.png)
 
 ## Local reader
 
-- Local reader requests：The number of total requests and the number of rejections from the local read thread
+- Local reader requests: The number of total requests and the number of rejections from the local read thread
 
 ![TiKV Dashboard - Local reader metrics](/media/tikv-dashboard-local-reader.png)
 
 ## Unified Read Pool
 
-- Time used by level：The time consumed for each level in the unified read pool. Level 0 means small queries.
-- Level 0 chance：The proportion of level 0 tasks in unified read pool
-- Running tasks：The number of tasks running concurrently in the unified read pool
+- Time used by level: The time consumed for each level in the unified read pool. Level 0 means small queries.
+- Level 0 chance: The proportion of level 0 tasks in the unified read pool
+- Running tasks: The number of tasks running concurrently in the unified read pool
 
 ## Storage
 
-- Storage command total：The number of received command by type per second
-- Storage async request error：The number of engine asynchronous request errors per second
-- Storage async snapshot duration：The time consumed by processing asynchronous snapshot requests. It should be less than `1s` in `.99`.
-- Storage async write duration：The time consumed by processing asynchronous write requests. It should be less than `1s` in `.99`.
+- Storage command total: The number of received commands by type per second
+- Storage async request error: The number of engine asynchronous request errors per second
+- Storage async snapshot duration: The time consumed by processing asynchronous snapshot requests. It should be less than `1s` (P99).
+- Storage async write duration: The time consumed by processing asynchronous write requests. It should be less than `1s` (P99).
 
 ![TiKV Dashboard - Storage metrics](/media/tikv-dashboard-storage.png)
 
 ## Scheduler
 
-- Scheduler stage total：The number of commands at each stage per second. There should not be lots of errors in a short time.
-- Scheduler writing bytes：The total written bytes of commands processed by each TiKV instance
-- Scheduler priority commands：The count of different priority commands per second
-- Scheduler pending commands：The count of pending commands per TiKV instance per second
+- Scheduler stage total: The number of commands at each stage per second. There should not be a lot of errors in a short time.
+- Scheduler writing bytes: The total bytes written by commands processed on each TiKV instance
+- Scheduler priority commands: The count of different priority commands per second
+- Scheduler pending commands: The count of pending commands per TiKV instance per second
 
 ![TiKV Dashboard - Scheduler metrics](/media/tikv-dashboard-scheduler.png)
 
 ## Scheduler - commit
 
-- Scheduler stage total：The number of commands at each stage per second when executing the commit command. There should not be lots of errors in a short time.
-- Scheduler command duration：The time consumed when executing the commit command. It should be less than `1s`.
-- Scheduler latch wait duration：The wait time caused by latch when executing the commit command. It should be less than `1s`.
-- Scheduler keys read：The count of keys read by a commit command
-- Scheduler keys written：The count of keys written by a commit command
-- Scheduler scan details：The keys scan details of each CF when executing the commit command.
-- Scheduler scan details [lock]：The keys scan details of lock CF when executing the commit command
-- Scheduler scan details [write]：The keys scan details of write CF when executing the commit command
-- Scheduler scan details [default]：The keys scan details of default CF when executing the commit command
+- Scheduler stage total: The number of commands at each stage per second when executing the commit command. There should not be a lot of errors in a short time.
+- Scheduler command duration: The time consumed when executing the commit command. It should be less than `1s`.
+- Scheduler latch wait duration: The waiting time caused by latch when executing the commit command. It should be less than `1s`.
+- Scheduler keys read: The count of keys read by a commit command
+- Scheduler keys written: The count of keys written by a commit command
+- Scheduler scan details: The keys scan details of each CF when executing the commit command.
+- Scheduler scan details [lock]: The keys scan details of lock CF when executing the commit command
+- Scheduler scan details [write]: The keys scan details of write CF when executing the commit command
+- Scheduler scan details [default]: The keys scan details of default CF when executing the commit command
 
 ![TiKV Dashboard - Scheduler commit metrics](/media/tikv-dashboard-scheduler-commit.png)
 
 ## Scheduler - pessimistic_rollback
 
-- Scheduler stage total：The number of commands at each stage per second when executing the pessimistic_rollback command. There should not be lots of errors in a short time.
-- Scheduler command duration：The time consumed when executing the pessimistic_rollback command. It should be less than `1s`.
-- Scheduler latch wait duration：The wait time caused by latch when executing the pessimistic_rollback command. It should be less than `1s`.
-- Scheduler keys read：The count of keys read by a pessimistic_rollback command
-- Scheduler keys written：The count of keys written by a pessimistic_rollback command
-- Scheduler scan details：The keys scan details of each CF when executing the pessimistic_rollback command.
-- Scheduler scan details [lock]：The keys scan details of lock CF when executing the pessimistic_rollback command
-- Scheduler scan details [write]：The keys scan details of write CF when executing the pessimistic_rollback command
-- Scheduler scan details [default]：The keys scan details of default CF when executing the pessimistic_rollback command
+- Scheduler stage total: The number of commands at each stage per second when executing the `pessimistic_rollback` command. There should not be a lot of errors in a short time.
+- Scheduler command duration: The time consumed when executing the `pessimistic_rollback` command. It should be less than `1s`.
+- Scheduler latch wait duration: The waiting time caused by latch when executing the `pessimistic_rollback` command. It should be less than `1s`.
+- Scheduler keys read: The count of keys read by a `pessimistic_rollback` command
+- Scheduler keys written: The count of keys written by a `pessimistic_rollback` command
+- Scheduler scan details: The keys scan details of each CF when executing the `pessimistic_rollback` command.
+- Scheduler scan details [lock]: The keys scan details of lock CF when executing the `pessimistic_rollback` command
+- Scheduler scan details [write]: The keys scan details of write CF when executing the `pessimistic_rollback` command
+- Scheduler scan details [default]: The keys scan details of default CF when executing the `pessimistic_rollback` command
 
 ## Scheduler - prewrite
 
-- Scheduler stage total：The number of commands at each stage per second when executing the prewrite command. There should not be lots of errors in a short time.
-- Scheduler command duration：The time consumed when executing the prewrite command. It should be less than `1s`.
-- Scheduler latch wait duration：The wait time caused by latch when executing the prewrite command. It should be less than `1s`.
-- Scheduler keys read：The count of keys read by a prewrite command
-- Scheduler keys written：The count of keys written by a prewrite command
-- Scheduler scan details：The keys scan details of each CF when executing the prewrite command.
-- Scheduler scan details [lock]：The keys scan details of lock CF when executing the prewrite command
-- Scheduler scan details [write]：The keys scan details of write CF when executing the prewrite command
-- Scheduler scan details [default]：The keys scan details of default CF when executing the prewrite command
+- Scheduler stage total: The number of commands at each stage per second when executing the prewrite command. There should not be a lot of errors in a short time.
+- Scheduler command duration: The time consumed when executing the prewrite command. It should be less than `1s`.
+- Scheduler latch wait duration: The waiting time caused by latch when executing the prewrite command. It should be less than `1s`.
+- Scheduler keys read: The count of keys read by a prewrite command
+- Scheduler keys written: The count of keys written by a prewrite command
+- Scheduler scan details: The keys scan details of each CF when executing the prewrite command.
+- Scheduler scan details [lock]: The keys scan details of lock CF when executing the prewrite command
+- Scheduler scan details [write]: The keys scan details of write CF when executing the prewrite command
+- Scheduler scan details [default]: The keys scan details of default CF when executing the prewrite command
 
 ## Scheduler - rollback
 
-- Scheduler stage total：The number of commands at each stage per second when executing the rollback command. There should not be lots of errors in a short time.
-- Scheduler command duration：The time consumed when executing the rollback command. It should be less than `1s`.
-- Scheduler latch wait duration：The wait time caused by latch when executing the rollback command. It should be less than `1s`.
-- Scheduler keys read：The count of keys read by a rollback command
-- Scheduler keys written：The count of keys written by a rollback command
-- Scheduler scan details：The keys scan details of each CF when executing the rollback command.
-- Scheduler scan details [lock]：The keys scan details of lock CF when executing the rollback command
-- Scheduler scan details [write]：The keys scan details of write CF when executing the rollback command
-- Scheduler scan details [default]：The keys scan details of default CF when executing the rollback command
+- Scheduler stage total: The number of commands at each stage per second when executing the rollback command. There should not be a lot of errors in a short time.
+- Scheduler command duration: The time consumed when executing the rollback command. It should be less than `1s`.
+- Scheduler latch wait duration: The waiting time caused by latch when executing the rollback command. It should be less than `1s`.
+- Scheduler keys read: The count of keys read by a rollback command
+- Scheduler keys written: The count of keys written by a rollback command
+- Scheduler scan details: The keys scan details of each CF when executing the rollback command.
+- Scheduler scan details [lock]: The keys scan details of lock CF when executing the rollback command
+- Scheduler scan details [write]: The keys scan details of write CF when executing the rollback command
+- Scheduler scan details [default]: The keys scan details of default CF when executing the rollback command
 
 ## GC
 
-- MVCC versions：The number of versions for each key
-- MVCC delete versions：The number of versions deleted by GC for each key
-- GC tasks：The count of GC tasks processed by gc_worker
-- GC tasks Duration：The time consumed when executing GC tasks
-- GC keys (write CF)：The count of keys in write CF affected during GC
-- TiDB GC worker actions：The count of TiDB GC worker actions
-- TiDB GC seconds：The GC duration
-- GC speed：The number of keys deleted by GC per second
-- TiKV AutoGC Working：The status of Auto GC 
-- ResolveLocks Progress：The progress of the first phase of GC(Resolve Locks)
-- TiKV Auto GC Progress：The progress of the second phase of GC
-- TiKV Auto GC SafePoint：The value of TiKV GC safe point. The safe point is the current GC timestamp
-- GC lifetime：The lifetime of TiDB GC
-- GC interval：The interval of TiDB GC
+- MVCC versions: The number of versions for each key
+- MVCC delete versions: The number of versions deleted by GC for each key
+- GC tasks: The count of GC tasks processed by `gc_worker`
+- GC tasks Duration: The time consumed when executing GC tasks
+- GC keys (write CF): The count of keys in write CF affected during GC
+- TiDB GC worker actions: The count of TiDB GC worker actions
+- TiDB GC seconds: The GC duration
+- GC speed: The number of keys deleted by GC per second
+- TiKV AutoGC Working: The status of Auto GC
+- ResolveLocks Progress: The progress of the first phase of GC (Resolve Locks)
+- TiKV Auto GC Progress: The progress of the second phase of GC
+- TiKV Auto GC SafePoint: The value of TiKV GC safe point. The safe point is the current GC timestamp
+- GC lifetime: The lifetime of TiDB GC
+- GC interval: The interval of TiDB GC
 
 ## Snapshot
 
-- Rate snapshot message：The rate at which Raft snapshot messages are sent
-- 99% Handle snapshot duration：The time consumed to handle snapshots (P99)
-- Snapshot state count：The number of snapshots per state
-- 99.99% Snapshot size：The snapshot size (P99.99)
-- 99.99% Snapshot KV count：The number of KV within a snapshot (P99.99)
+- Rate snapshot message: The rate at which Raft snapshot messages are sent
+- 99% Handle snapshot duration: The time consumed to handle snapshots (P99)
+- Snapshot state count: The number of snapshots per state
+- 99.99% Snapshot size: The snapshot size (P99.99)
+- 99.99% Snapshot KV count: The number of KV pairs in a snapshot (P99.99)
 
 ## Task
 
-- Worker handled tasks：The number of tasks handled by worker per second
-- Worker pending tasks：Current number of pending and running tasks of worker per second. It should be less than `1000` in normal case.
-- FuturePool handled tasks：The number of tasks handled by future pool per second
-- FuturePool pending tasks：Current number of pending and running tasks of future pool per second
+- Worker handled tasks: The number of tasks handled by worker per second
+- Worker pending tasks: The current number of pending and running tasks of worker per second. It should be less than `1000` in normal cases.
+- FuturePool handled tasks: The number of tasks handled by future pool per second
+- FuturePool pending tasks: The current number of pending and running tasks of future pool per second
 
 ## Coprocessor Overview
 
-- Request duration：The total time spent from receiving the coprocessor request to the end of request processing
-- Total Requests：The number of requests by type per second
-- Handle duration：The histogram of time spent actually processing coprocessor requests per minute
-- Total Request Errors：The number of request errors of Coprocessor. There should not be lots of errors in a short time.
-- Total KV Cursor Operations：The total number of the KV cursor operations by type per second, such as select, index, analyze_table, analyze_index, checksum_table, checksum_index, and so on.
-- KV Cursor Operations：The histogram of KV cursor operations by type per second
-- Total RocksDB Perf Statistics：The performance statistics of RocksDB
-- Total Response Size：The total size of coprocessor response
+- Request duration: The total duration from the time of receiving the coprocessor request to the time of finishing processing the request
+- Total Requests: The number of requests by type per second
+- Handle duration: The histogram of time spent actually processing coprocessor requests per minute
+- Total Request Errors: The number of request errors of Coprocessor per second. There should not be a lot of errors in a short time.
+- Total KV Cursor Operations: The total number of the KV cursor operations by type per second, such as `select`, `index`, `analyze_table`, `analyze_index`, `checksum_table`, `checksum_index`, and so on.
+- KV Cursor Operations: The histogram of KV cursor operations by type per second
+- Total RocksDB Perf Statistics: The statistics of RocksDB performance
+- Total Response Size: The total size of coprocessor response
 
 ## Coprocessor Detail
 
-- Handle duration：The histogram of time spent actually processing coprocessor requests per minute
-- 95% Handle duration by store：The time consumed to handle coprocessor requests per TiKV instance per second (P95)
-- Wait duration：The time consumed when coprocessor requests are waiting to be handled. It should be less than `10s` (P99.99).
-- 95% Wait duration by store：The time consumed when coprocessor requests are waiting to be handled per TiKV instance per second (P95)
-- Total DAG Requests：The total number of DAG requests per second
-- Total DAG Executors：The total number of DAG executors per second
-- Total Ops Details (Table Scan)：The number of RocksDB internal operations per second when executing select scan in coprocessor
-- Total Ops Details (Index Scan)：The number of RocksDB internal operations per second when executing index scan in coprocessor
-- Total Ops Details by CF (Table Scan)：The number of RocksDB internal operations for each CF per second when executing select scan in coprocessor
-- Total Ops Details by CF (Index Scan)：The number of RocksDB internal operations for each CF per second when executing index scan in coprocessor
+- Handle duration: The histogram of time spent actually processing coprocessor requests per minute
+- 95% Handle duration by store: The time consumed to handle coprocessor requests per TiKV instance per second (P95)
+- Wait duration: The time consumed when coprocessor requests are waiting to be handled. It should be less than `10s` (P99.99).
+- 95% Wait duration by store: The time consumed when coprocessor requests are waiting to be handled per TiKV instance per second (P95)
+- Total DAG Requests: The total number of DAG requests per second
+- Total DAG Executors: The total number of DAG executors per second
+- Total Ops Details (Table Scan): The number of RocksDB internal operations per second when executing select scan in coprocessor
+- Total Ops Details (Index Scan): The number of RocksDB internal operations per second when executing index scan in coprocessor
+- Total Ops Details by CF (Table Scan): The number of RocksDB internal operations for each CF per second when executing select scan in coprocessor
+- Total Ops Details by CF (Index Scan): The number of RocksDB internal operations for each CF per second when executing index scan in coprocessor
 
 ## Threads
 
-- Threads state：The state of TiKV threads
-- Threads IO：The I/O traffic of each TiKV thread
-- Thread Voluntary Context Switches：The number of TiKV threads voluntary context switches
-- Thread Nonvoluntary Context Switches：The number of TiKV threads nonvoluntary context switches
+- Threads state: The state of TiKV threads
+- Threads IO: The I/O traffic of each TiKV thread
+- Thread Voluntary Context Switches: The number of voluntary context switches of TiKV threads
+- Thread Nonvoluntary Context Switches: The number of nonvoluntary context switches of TiKV threads
 
 ## RocksDB - kv/raft
 
-- Get operations：The count of get operations per second
-- Get duration：The time consumed when executing get operations
-- Seek operations：The count of seek operations per second
-- Seek duration：The time consumed when executing seek operations
-- Write operations：The count of write operations per second
-- Write duration：The time consumed when executing write operations
-- WAL sync operations：The count of WAL sync operations per second
-- Write WAL duration：The time consumed for writing WAL
-- WAL sync duration：The time consumed when executing WAL sync operations
+- Get operations: The count of get operations per second
+- Get duration: The time consumed when executing get operations
+- Seek operations: The count of seek operations per second
+- Seek duration: The time consumed when executing seek operations
+- Write operations: The count of write operations per second
+- Write duration: The time consumed when executing write operations
+- WAL sync operations: The count of WAL sync operations per second
+- Write WAL duration: The time consumed for writing WAL
+- WAL sync duration: The time consumed when executing WAL sync operations
 - Compaction operations: The count of compaction and flush operations per second
-- Compaction duration：The time consumed when executing the compaction and flush operations
-- SST read duration：The time consumed when reading SST files
-- Write stall duration：Write stall duration. It should be `0` in normal case.
-- Memtable size：The memtable size of each column family
-- Memtable hit：The hit rate of memtable
-- Block cache size：The block cache size. Broken down by column family if shared block cache is disabled.
-- Block cache hit：The hit rate of block cache
-- Block cache flow：The flow rate of block cache operations per type
+- Compaction duration: The time consumed when executing the compaction and flush operations
+- SST read duration: The time consumed when reading SST files
+- Write stall duration: The duration of write stalls. It should be `0` in normal cases.
+- Memtable size: The memtable size of each column family
+- Memtable hit: The hit rate of memtable
+- Block cache size: The block cache size. Broken down by column family if shared block cache is disabled.
+- Block cache hit: The hit rate of block cache
+- Block cache flow: The flow rate of block cache operations per type
 - Block cache operations: The count of block cache operations per type
-- Keys flow：The flow rate of operations on keys per type
-- Total keys：The count of keys in each column family
-- Read flow：The flow rate of read operations per type
-- Bytes / Read：The bytes per read operation
-- Write flow：The flow rate of write operations per type
-- Bytes / Write：The bytes per write operation
-- Compaction flow：The flow rate of compaction operations per type
-- Compaction pending bytes：The pending bytes to be compacted
-- Read amplification：The read amplification per TiKV instance
-- Compression ratio：The compression ratio of each level
-- Number of snapshots：The number of snapshots per TiKV instance
-- Oldest snapshots duration：The time that the oldest unreleased snapshot survivals
-- Number files at each level：The number of SST files for different column families in each level
-- Ingest SST duration seconds：The time consumed to ingest SST files
-- Stall conditions changed of each CF：Stall conditions changed of each column family
+- Keys flow: The flow rate of operations on keys per type
+- Total keys: The count of keys in each column family
+- Read flow: The flow rate of read operations per type
+- Bytes / Read: The bytes per read operation
+- Write flow: The flow rate of write operations per type
+- Bytes / Write: The bytes per write operation
+- Compaction flow: The flow rate of compaction operations per type
+- Compaction pending bytes: The pending bytes to be compacted
+- Read amplification: The read amplification per TiKV instance
+- Compression ratio: The compression ratio of each level
+- Number of snapshots: The number of snapshots per TiKV instance
+- Oldest snapshots duration: The time that the oldest unreleased snapshot survives
+- Number files at each level: The number of SST files for different column families in each level
+- Ingest SST duration seconds: The time consumed to ingest SST files
+- Stall conditions changed of each CF: The changes in stall conditions of each column family
 
 ## Titan - All
 
-- Blob file count：The number of Titan blob files
-- Blob file size：The total size of Titan blob file
-- Live blob size：The total size of valid blob record
-- Blob cache hit：The hit rate of Titan block cache
-- Iter touched blob file count：The number of blob file involved in a single iterator
-- Blob file discardable ratio distribution：The ratio distribution of blob record failure of blob files
-- Blob key size：The size of Titan blob keys
-- Blob value size：The size of Titan blob values
-- Blob get operations：The count of get operations in Titan blob
-- Blob get duration：The time consumed when executing get operations in Titan blob
-- Blob iter operations：The time consumed when executing iter operations in Titan blob
-- Blob seek duration：The time consumed when executing seek operations in Titan blob
-- Blob next duration：The time consumed when executing next operations in Titan blob
-- Blob prev duration：The time consumed when executing prev operations in Titan blob
-- Blob keys flow：The flow rate of operations on Titan blob keys
-- Blob bytes flow：The flow rate of bytes on Titan blob keys
-- Blob file read duration：The time consumed when reading Titan blob file
-- Blob file write duration：The time consumed when writing Titan blob file
-- Blob file sync operations：The count of blob file sync operations
-- Blob file sync duration：The time consumed when synchronizing blob file
-- Blob GC action：The count of Titan GC actions
-- Blob GC duration：The Titan GC duration
-- Blob GC keys flow：The flow rate of keys read and written by Titan GC
-- Blob GC bytes flow：The flow rate of bytes read and written by Titan GC
-- Blob GC input file size：The size of Titan GC input file 
-- Blob GC output file size：The size of Titan GC output file
-- Blob GC file count：The count of blob files involved in Titan GC
+- Blob file count: The number of Titan blob files
+- Blob file size: The total size of Titan blob files
+- Live blob size: The total size of valid blob records
+- Blob cache hit: The hit rate of Titan block cache
+- Iter touched blob file count: The number of blob files involved in a single iterator
+- Blob file discardable ratio distribution: The ratio distribution of blob record failure of blob files
+- Blob key size: The size of Titan blob keys
+- Blob value size: The size of Titan blob values
+- Blob get operations: The count of get operations in Titan blob
+- Blob get duration: The time consumed when executing get operations in Titan blob
+- Blob iter operations: The count of iter operations in Titan blob
+- Blob seek duration: The time consumed when executing seek operations in Titan blob
+- Blob next duration: The time consumed when executing next operations in Titan blob
+- Blob prev duration: The time consumed when executing prev operations in Titan blob
+- Blob keys flow: The flow rate of operations on Titan blob keys
+- Blob bytes flow: The flow rate of bytes on Titan blob keys
+- Blob file read duration: The time consumed when reading Titan blob file
+- Blob file write duration: The time consumed when writing Titan blob file
+- Blob file sync operations: The count of blob file sync operations
+- Blob file sync duration: The time consumed when synchronizing blob file
+- Blob GC action: The count of Titan GC actions
+- Blob GC duration: The Titan GC duration
+- Blob GC keys flow: The flow rate of keys read and written by Titan GC
+- Blob GC bytes flow: The flow rate of bytes read and written by Titan GC
+- Blob GC input file size: The size of Titan GC input file
+- Blob GC output file size: The size of Titan GC output file
+- Blob GC file count: The count of blob files involved in Titan GC
 
 ## Lock manager
 
-- Thread CPU：The CPU utilization of the lock manager thread
-- Handled tasks：The number of taks handled by lock manager
-- Waiter lifetime duration：The waiting time of the transaction for the lock to be released
-- Wait table：The status information of wait table, including the number of locks and the number of transactions waiting for the lock
-- Deadlock detect duration：The time consumed for detecting deadlock
-- Detect error：The number of errors encountered when detecting deadlock, including the number of deadlocks
-- Deadlock detector leader：The information of the node where the deadlock detector leader is located
+- Thread CPU: The CPU utilization of the lock manager thread
+- Handled tasks: The number of tasks handled by the lock manager
+- Waiter lifetime duration: The waiting time of the transaction for the lock to be released
+- Wait table: The status information of the wait table, including the number of locks and the number of transactions waiting for the lock
+- Deadlock detect duration: The time consumed for detecting deadlocks
+- Detect error: The number of errors encountered when detecting deadlocks, including the number of deadlocks
+- Deadlock detector leader: The information of the node where the deadlock detector leader is located
 
 ## Memory
 
-- Allocator Stats：The statistics of the memory allocator
+- Allocator Stats: The statistics of the memory allocator
 
 ## Backup
 
-- Backup CPU：The CPU utilization of the backup thread
-- Range Size：The histogram of backup range size
-- Backup Duration：The time consumed for backup
-- Backup Flow：The total bytes of backup
-- Disk Throughput：The disk throughput per instance
-- Backup Range Duration：The time consumed for backing up a range
-- Backup Errors：The number of errors encountered during a backup
+- Backup CPU: The CPU utilization of the backup thread
+- Range Size: The histogram of backup range size
+- Backup Duration: The time consumed for backup
+- Backup Flow: The total bytes of backup
+- Disk Throughput: The disk throughput per instance
+- Backup Range Duration: The time consumed for backing up a range
+- Backup Errors: The number of errors encountered during a backup
 
 ## Encryption
 
-- Encryption data keys：The total number of encrypted data keys
-- Encrypted files：The number of encrypted files
-- Encryption initialized：Shows whether encryption is enabled, `1` means enabled.
-- Encryption meta files size：The size of the encryption meta file
-- Encrypt/decrypt data nanos：The histogram of duration on encrypting/decrypting data each time
-- Read/write encryption meta duration：The time consumed for reading/writing encryption meta file
+- Encryption data keys: The total number of encrypted data keys
+- Encrypted files: The number of encrypted files
+- Encryption initialized: Shows whether encryption is enabled. `1` means enabled.
+- Encryption meta files size: The size of the encryption meta files
+- Encrypt/decrypt data nanos: The histogram of the duration of each data encryption/decryption operation
+- Read/write encryption meta duration: The time consumed for reading/writing encryption meta files
 
 ## Explanation of Common Parameters
 
 ### gRPC Message Type
 
-1. Transactional API：
-
-    - kv_get：The command of getting the latest version of data specified by ts
-    - kv_scan：The command of scanning a range of data
-    - kv_prewrite：The command of prewriting the data to be committed at first phase of 2PC
-    - kv_pessimistic_lock：The command of adding a pessimistic lock to the key to prevent other transaction from modifying this key
-    - kv_pessimistic_rollback：The command of deleting the pessimistic lock on the key
-    - kv_txn_heart_beat：The command of updating `lock_ttl` for pessimistic transactions or large transactions to prevent them from rolling back
-    - kv_check_txn_status：The command of checking the status of the transaction
-    - kv_commit：The command of committing the data written by prewrite command
-    - kv_cleanup：The command of rolling back a transaction, which is deprecated in v4.0
-    - kv_batch_get：The command of getting the value of batch key at once, similar to `kv_get`.
-    - kv_batch_rollback：The command of batch rollback of multiple prewrite transaction
-    - kv_scan_lock：The command of scanning all locks with a version number before `max_version` to clean up expired transactions
-    - kv_resolve_lock：The command of committing or rollback the transaction lock, according to the transaction status.
-    - kv_gc：The command of GC
-    - kv_delete_range：The command of deleting a range of data from TiKV
-
-2. Raw API：
-
-    - raw_get：The command of getting the value of key
-    - raw_batch_get：The command of getting the value of batch keys
-    - raw_scan：The command of scanning a range of data
-    - raw_batch_scan：The command of scanning multiple consecutive data range
-    - raw_put：The command of writing a key/value pair
-    - raw_batch_put：The command of writing a batch of key/value pairs
-    - raw_delete：The command of deleting a key/value pair
-    - raw_batch_delete：The command of a batch of key/value pairs
-    - raw_delete_range：The command of deleting a range of data
+1. Transactional API:
+
+    - kv_get: The command of getting the latest version of data specified by `ts`
+    - kv_scan: The command of scanning a range of data
+    - kv_prewrite: The command of prewriting the data to be committed in the first phase of 2PC
+    - kv_pessimistic_lock: The command of adding a pessimistic lock to the key to prevent other transactions from modifying this key
+    - kv_pessimistic_rollback: The command of deleting the pessimistic lock on the key
+    - kv_txn_heart_beat: The command of updating `lock_ttl` for pessimistic transactions or large transactions to prevent them from rolling back
+    - kv_check_txn_status: The command of checking the status of the transaction
+    - kv_commit: The command of committing the data written by the prewrite command
+    - kv_cleanup: The command of rolling back a transaction, which is deprecated in v4.0
+    - kv_batch_get: The command of getting the values of a batch of keys at once, similar to `kv_get`
+    - kv_batch_rollback: The command of rolling back multiple prewritten transactions in a batch
+    - kv_scan_lock: The command of scanning all locks with a version number before `max_version` to clean up expired transactions
+    - kv_resolve_lock: The command of committing or rolling back the transaction lock according to the transaction status
+    - kv_gc: The command of GC
+    - kv_delete_range: The command of deleting a range of data from TiKV
+
+2. Raw API:
+
+    - raw_get: The command of getting the value of a key
+    - raw_batch_get: The command of getting the values of a batch of keys
+    - raw_scan: The command of scanning a range of data
+    - raw_batch_scan: The command of scanning multiple consecutive data ranges
+    - raw_put: The command of writing a key/value pair
+    - raw_batch_put: The command of writing a batch of key/value pairs
+    - raw_delete: The command of deleting a key/value pair
+    - raw_batch_delete: The command of deleting a batch of key/value pairs
+    - raw_delete_range: The command of deleting a range of data

From 3c91884adec1364f0d19992ba929c4221b221dc2 Mon Sep 17 00:00:00 2001
From: TomShawn <41534398+TomShawn@users.noreply.github.com>
Date: Tue, 14 Jul 2020 15:48:06 +0800
Subject: [PATCH 12/13] Update grafana-tikv-dashboard.md

---
 grafana-tikv-dashboard.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md
index 5b75e3107b93f..c31e498d86ee7 100644
--- a/grafana-tikv-dashboard.md
+++ b/grafana-tikv-dashboard.md
@@ -5,7 +5,7 @@ category: reference
 aliases: ['/docs/dev/grafana-tikv-dashboard/','/docs/dev/reference/key-monitoring-metrics/tikv-dashboard/']
 ---
 
-# Description of TiKV Monitoring Metrics
+# Key Monitoring Metrics of TiKV
 
 If you use TiUP to deploy the TiDB cluster, the monitoring system (Prometheus/Grafana) is deployed at the same time. For more information, see [Overview of the Monitoring Framework](/tidb-monitoring-framework.md).
 

From 9b8d7f23eb44b93652d410f2652a4ec460ccd99e Mon Sep 17 00:00:00 2001
From: Win-Man <825895587@qq.com>
Date: Sat, 18 Jul 2020 21:57:17 +0800
Subject: [PATCH 13/13] add content

---
 grafana-tikv-dashboard.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grafana-tikv-dashboard.md b/grafana-tikv-dashboard.md
index c31e498d86ee7..9fb469c24a191 100644
--- a/grafana-tikv-dashboard.md
+++ b/grafana-tikv-dashboard.md
@@ -11,7 +11,7 @@ If you use TiUP to deploy the TiDB cluster, the monitoring system (Prometheus/Gr
 
 The Grafana dashboard is divided into a series of sub dashboards which include Overview, PD, TiDB, TiKV, Node_exporter, and so on. A lot of metrics are there to help you diagnose.
 
-You can get an overview of the component TiKV status from the **TiKV-Details** dashboard, where the key metrics are displayed.
+You can get an overview of the TiKV component status from the **TiKV-Details** dashboard, where the key metrics are displayed. You can also check whether the cluster status is as expected against the [Performance Map](https://asktug.com/_/tidb-performance-map/#/).
 
 This document provides a detailed description of these key metrics on the **TiKV-Details** dashboard.