From f2b1aeec10e3805c2ac40ae1eecbf4ce2bf45e3e Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Tue, 4 Jun 2024 15:20:57 +0800
Subject: [PATCH] statistics: update the note for sampling rate (#17662)

---
 statistics.md | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/statistics.md b/statistics.md
index 905c411418bcc..aba2f64083e3f 100644
--- a/statistics.md
+++ b/statistics.md
@@ -116,21 +116,9 @@ Before v5.3.0, TiDB uses the reservoir sampling method to collect statistics. Si
 
 The current sampling rate is calculated based on an adaptive algorithm. When you can observe the number of rows in a table using [`SHOW STATS_META`](/sql-statements/sql-statement-show-stats-meta.md), you can use this number of rows to calculate the sampling rate corresponding to 100,000 rows. If you cannot observe this number, you can use the sum of all the values in the `APPROXIMATE_KEYS` column in the results of [`SHOW TABLE REGIONS`](/sql-statements/sql-statement-show-table-regions.md) of the table as another reference to calculate the sampling rate.
 
-
-
 > **Note:**
 >
-> Normally, `STATS_META` is more credible than `APPROXIMATE_KEYS`. However, after importing data through the methods like [TiDB Lightning](https://docs.pingcap.com/tidb/stable/tidb-lightning-overview), the result of `STATS_META` is `0`. To handle this situation, you can use `APPROXIMATE_KEYS` to calculate the sampling rate when the result of `STATS_META` is much smaller than the result of `APPROXIMATE_KEYS`.
-
-
-
-
-
-> **Note:**
->
-> Normally, `STATS_META` is more credible than `APPROXIMATE_KEYS`. However, after importing data through TiDB Cloud console (see [Import Sample Data](/tidb-cloud/import-sample-data.md)), the result of `STATS_META` is `0`. To handle this situation, you can use `APPROXIMATE_KEYS` to calculate the sampling rate when the result of `STATS_META` is much smaller than the result of `APPROXIMATE_KEYS`.
-
-
+> Normally, `STATS_META` is more credible than `APPROXIMATE_KEYS`. However, when the result of `STATS_META` is much smaller than the result of `APPROXIMATE_KEYS`, it is recommended that you use `APPROXIMATE_KEYS` to calculate the sampling rate.
 
 #### Collect statistics on some columns
 