From f71863d8a88f9b961eee5fef57886eba46412193 Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Tue, 4 Jun 2024 15:20:57 +0800
Subject: [PATCH] statistics: update the note for sampling rate (#17662)

---
 statistics.md | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/statistics.md b/statistics.md
index 9378c44e55414..54f29baf1843b 100644
--- statistics.md
+++ statistics.md
@@ -116,21 +116,9 @@ Before v5.3.0, TiDB uses the reservoir sampling method to collect statistics. Si
 
 The current sampling rate is calculated based on an adaptive algorithm. When you can observe the number of rows in a table using [`SHOW STATS_META`](/sql-statements/sql-statement-show-stats-meta.md), you can use this number of rows to calculate the sampling rate corresponding to 100,000 rows. If you cannot observe this number, you can use the sum of all the values in the `APPROXIMATE_KEYS` column in the results of [`SHOW TABLE REGIONS`](/sql-statements/sql-statement-show-table-regions.md) of the table as another reference to calculate the sampling rate.
 
-
-
 > **Note:**
 >
-> Normally, `STATS_META` is more credible than `APPROXIMATE_KEYS`. However, after importing data through the methods like [TiDB Lightning](https://docs.pingcap.com/tidb/stable/tidb-lightning-overview), the result of `STATS_META` is `0`. To handle this situation, you can use `APPROXIMATE_KEYS` to calculate the sampling rate when the result of `STATS_META` is much smaller than the result of `APPROXIMATE_KEYS`.
-
-
-
-
-
-> **Note:**
->
-> Normally, `STATS_META` is more credible than `APPROXIMATE_KEYS`. However, after importing data through TiDB Cloud console (see [Import Sample Data](/tidb-cloud/import-sample-data.md)), the result of `STATS_META` is `0`. To handle this situation, you can use `APPROXIMATE_KEYS` to calculate the sampling rate when the result of `STATS_META` is much smaller than the result of `APPROXIMATE_KEYS`.
-
-
+> Normally, `STATS_META` is more credible than `APPROXIMATE_KEYS`. However, when the result of `STATS_META` is much smaller than the result of `APPROXIMATE_KEYS`, it is recommended that you use `APPROXIMATE_KEYS` to calculate the sampling rate.
 
 #### Collect statistics on some columns
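
Reviewer sketch (not part of the patch): the sampling-rate note changed in this hunk can be illustrated with a small calculation. This is a minimal sketch, assuming the rate is simply the 100,000-row target divided by the chosen row count and capped at 1; the patch does not state TiDB's adaptive algorithm, so it may differ. The function name `estimate_sample_rate` and the `trust_ratio` threshold for "much smaller" are hypothetical, introduced only for illustration.

```python
def estimate_sample_rate(stats_meta_rows, approximate_keys,
                         target_rows=100_000, trust_ratio=0.5):
    """Estimate a sampling rate that yields roughly `target_rows` sampled rows.

    stats_meta_rows:  row count observed via SHOW STATS_META
    approximate_keys: sum of APPROXIMATE_KEYS from SHOW TABLE REGIONS
    trust_ratio:      hypothetical cutoff for "much smaller"; the docs give
                      no number, so 0.5 is an assumption for this sketch
    """
    # Prefer STATS_META, unless it is much smaller than APPROXIMATE_KEYS.
    if stats_meta_rows < approximate_keys * trust_ratio:
        rows = approximate_keys
    else:
        rows = stats_meta_rows

    # An empty or tiny table is sampled in full.
    if rows <= target_rows:
        return 1.0

    # Assumed formula: the rate that corresponds to `target_rows` rows.
    return target_rows / rows
```

For example, under these assumptions a table whose `STATS_META` count is `0` but whose regions report about 2,000,000 approximate keys would fall back to `APPROXIMATE_KEYS` and get a rate of 0.05.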