add two HTAP documents (#6206) #6321
Merged: ti-chi-bot merged 129 commits into pingcap:release-5.2 from ti-chi-bot:cherry-pick-6206-to-release-5.2 on Aug 27, 2021
Commits
8fe89ce
Remove all aliases from release-5.1 (#5818)
TomShawn ff81e85
add docs for enforce mpp (#5811) (#5819)
ti-chi-bot ed00c06
Update tidb-configuration-file.md (#5070) (#5821)
ti-chi-bot b7263ce
Add Lock View documents (#5798) (#5822)
ti-chi-bot 92ffdf7
Fix typos (#5827) (#5828)
ti-chi-bot a9838d9
add document about stale read transaction (#5809) (#5829)
ti-chi-bot 10e91bf
Delete docker deployment docs (#5830) (#5832)
ti-chi-bot 41af2b5
Fix ebnf display (#5833) (#5834)
ti-chi-bot a966936
tidb-configuration-file: add graceful-wait-before-shutdown (#5825) (#…
ti-chi-bot 57be14d
lightning: fix a lightning config bug (#5820) (#5837)
ti-chi-bot 55f61ac
change docs configs to release-5.1 (#5817)
TomShawn 3f5cb7f
Update content about upgrade (#5813) (#5841)
ti-chi-bot 0ccfa07
add v510 benchmark doc (#5842) (#5845)
ti-chi-bot fcd26c2
system variables: add tidb_analyze_version (#5824) (#5846)
ti-chi-bot 4642cd0
Add note about stale read (#5843) (#5850)
ti-chi-bot cca3435
Add v5.1 mpp tpch test report (#5844) (#5851)
ti-chi-bot f58f3ea
Bump version for v5.1.0 (#5849) (#5852)
ti-chi-bot 670fe11
Add the release note for TiDB v5.1 (#5840) (#5853)
ti-chi-bot 695e8ab
Update v5.1 release notes for lint issues (#5854) (#5855)
ti-chi-bot af53b3e
Fix the display issues of two lists (#5857) (#5858)
ti-chi-bot c601037
Update TOC.md (#5862)
qiancai 74861a1
Add a missing sentence in 5.1 rn (#5863) (#5864)
ti-chi-bot 247b380
Overhauling TiKV RocksDB configuration file (#5746) (#5873)
ti-chi-bot 14815c5
Not support to set Tombstone (#5878) (#5885)
ti-chi-bot 98811bf
Update tiup-bench.md (#5880) (#5883)
ti-chi-bot a9c84a2
tidb-functions: extend tidb_decode_key docs (#5877) (#5887)
ti-chi-bot fb04272
*: upadate roadmap (#5888) (#5889)
ti-chi-bot b718358
system-variables: add datadir and license (#5761) (#5890)
ti-chi-bot af1cf83
Add v5.1.0 / release-5.1 where it is missing (#5893) (#5902)
ti-chi-bot 9fb760c
Clarify tidb lightning backend description (#5904) (#5907)
ti-chi-bot f5baca7
system-variables: Add charset documentation (#5867) (#5909)
ti-chi-bot 0849f75
fix partition table doc error (#5903) (#5912)
ti-chi-bot 7a7da4d
releases: add 5.0.3 release notes (#5874) (#5917)
ti-chi-bot 30e06c6
Add workaround about using Stale Read with TiFlash (#5875) (#5922)
ti-chi-bot 0cc435a
Fix ticdc canal-json related doc (#5774) (#5925)
ti-chi-bot 4e7dad5
release-5.1.0: update CTE description (#5918) (#5923)
ti-chi-bot 97335d6
remove some useless configurations (#5558) (#5926)
ti-chi-bot 8b39e57
Remove a blank line that causes display failure (#5934) (#5935)
ti-chi-bot 94b3436
configure-memory-usage: update default value (#5913) (#5939)
ti-chi-bot 0cae595
quick-start-with-tidb: add a note to clarify the example ip (#5865) (…
ti-chi-bot 2dfc3fa
TiKV configuration: defaultcf.titan config should not apply to other …
ti-chi-bot 9b6059f
added config example for s3.region (#5919) (#5949)
ti-chi-bot 48fc0a2
partitioned-table: adding the correction to incorrect examples (#5931…
ti-chi-bot 98989b5
fix ctc doc (#5914) (#5953)
ti-chi-bot 55e8034
*: make the scene of stale read clearer (#5932) (#5954)
ti-chi-bot b508e67
clustered-index.md: nonclusterd -> nonclustered (#5956) (#5958)
ti-chi-bot 2b43641
Add the default value description for tidb_enforce_mpp (#5955) (#5961)
ti-chi-bot 0f67d4b
Add a note to clarify the purpose of the quick start guide (#5876) (#…
ti-chi-bot a8c3b4f
fix-broken-external-links (#5971) (#5977)
ti-chi-bot 68b2ee5
Fix some default configurations for RocksDB (#5969) (#5979)
ti-chi-bot afe2e92
Added a note about grpc-compression-type (#5960) (#5981)
ti-chi-bot 2ef0ca7
Updated TiUP version (#5970) (#5983)
ti-chi-bot 4472c8f
Remove the swappiness parameter (#5987) (#5990)
ti-chi-bot b196fe6
Change tidb_memory_usage_alarm_ratio scope to instance (#5988) (#5994)
ti-chi-bot c153c26
system-variables: update for consistency (#5826) (#5991)
ti-chi-bot 2a0f15a
index page: updated the phase of TiDB Cloud from Beta to Public Previ…
ti-chi-bot 595dcca
TiUP cluster: update data_dir (#6009) (#6019)
ti-chi-bot e67c799
update docs related to partition table dynamic mode (#5997) (#6025)
ti-chi-bot c8bc874
partitioning: Corrected partition management (#5498) (#6027)
ti-chi-bot 852173f
TiDB binlog: update descriptions about commit ts and passoword (#5986…
ti-chi-bot f1f00b1
releases: add tidb 4.0.14 release notes (#5996) (#6039)
ti-chi-bot b0374ce
fix a typo for sync_diff_inspector (#6041) (#6043)
ti-chi-bot 18ff150
Update dashboard FAQ (#5895) (#6051)
ti-chi-bot 68eb8ec
correct document of using br backup and restore system tables (#6057)…
ti-chi-bot 01e9b66
Add TiDB Dashboard session docs (#6058) (#6063)
ti-chi-bot 2c75adc
Update the default value of tidb_stmt_summary_max_stmt_count (#6021) …
ti-chi-bot f0ff847
update br faq (#6060) (#6064)
ti-chi-bot 4d27b79
releases: add TiDB 5.1.1 release notes (#6030) (#6065)
ti-chi-bot bc5e698
update stale read doc for ga (#6047) (#6052)
ti-chi-bot 8f72414
grafana-overview-dashboard: update the monitoring item for CPS (#6066…
ti-chi-bot e049aa2
deploy-tidb-binlog: make the expression on deployment clearer (#6073)…
ti-chi-bot 11d7adb
Add documentation on how to modify gcttl by tiup (#6071) (#6077)
ti-chi-bot e204267
remove useless variable (#6076) (#6078)
ti-chi-bot d919ebe
deleted roadmap.md (#6079) (#6082)
ti-chi-bot 2eadade
TiDB Monitoring Metrics: remove a line (#6081) (#6083)
ti-chi-bot 7c80eda
chore: lock plugin versions (#6089) (#6095)
ti-chi-bot 878e03e
cdc: add compatibility notes for sort-dir (#6086) (#6096)
ti-chi-bot 4854222
Add gc ttl (#6102) (#6104)
ti-chi-bot afa6b40
5.1.0 release notes: Fix link to telemetry docs (#6106) (#6107)
ti-chi-bot 441738e
TiCDC: update a golang demo link (#6055) (#6114)
ti-chi-bot affee15
br/use-br-command-line-tool: supplement br note (#6000) (#6116)
ti-chi-bot f11a139
pr_template: Provides tips for cherry-pick (#6042) (#6122)
ti-chi-bot 6141cd2
ticdc: add explicit_defaults_for_timestamp compatibility troubleshoot…
ti-chi-bot bf74697
TiFlash: remove outdated tune advise (#6133) (#6135)
ti-chi-bot fa2b427
tidb-scheduling: fix typo (#6140) (#6143)
ti-chi-bot 431277a
tiup: fix dead links (#6153) (#6155)
ti-chi-bot ebdda11
chore: update pdf version tag (#6150)
YiniXu9506 bb85ae0
high-concurrency-best-practices: fix the support info of follower rea…
ti-chi-bot 543d588
Fix broken link in error codes doc and support doc (#6196) (#6200)
ti-chi-bot 0c57caa
docs: fix format for TiKV and PD configuration file template invalid.…
ti-chi-bot a280ce8
adopters: add zhihu case study (#6207) (#6213)
ti-chi-bot 0e94de9
add notice about scaling in pd node (#6099) (#6221)
ti-chi-bot 51e576a
adopters: remove 404 links (#6214) (#6225)
ti-chi-bot 05e4084
update PR template for v5.2 (#6166) (#6241)
ti-chi-bot c9633c0
alert rules: update some descriptions (#6250) (#6256)
ti-chi-bot c3da29e
Fix image display error (#6169)
TomShawn 0cf5b85
fix typo: relaod -> reload (#6235) (#6260)
ti-chi-bot 02e2408
br: add restore to systables (#6004) (#6271)
ti-chi-bot df1c98a
Require process privilege for dumpling (#6187) (#6274)
ti-chi-bot 56ff335
statement summary: update statement summary doc (#6084) (#6270)
ti-chi-bot 278ada5
Add description about table name/alias specifying for read_from_stora…
ti-chi-bot 6c4befb
TiKV configuration: remove redundant instructions (#6218) (#6288)
ti-chi-bot 69fca38
alert rules: remove some descriptions (#6223) (#6292)
ti-chi-bot 0358567
sql: improve kill's description (#6233) (#6306)
ti-chi-bot 97713d9
system variables.md: add a warning message (#6298) (#6312)
ti-chi-bot f4687a3
tiup: add notice about importing cluster (#6108) (#6316)
ti-chi-bot 1d2f500
chore: update PDF setting for 5.1 (#6307)
TomShawn 6343c3e
add two HTAP documents
en-jin19 5b781fb
fix CI error
en-jin19 56b9e4c
fix CI error
en-jin19 8c81066
fix CI error
en-jin19 2a30b5f
fix CI errors
en-jin19 0249468
Update explore-htap.md
en-jin19 7ede85b
Update quick-start-with-htap.md
en-jin19 107b808
Update quick-start-with-htap.md
en-jin19 b2e4842
Update quick-start-with-htap.md
en-jin19 c92428c
Apply suggestions from code review
en-jin19 aa7eb2a
Apply suggestions from code review
en-jin19 06d34ea
fix CI error
en-jin19 949da58
Update explore-htap.md
qiancai 2181340
Apply suggestions from code review
qiancai c84bb42
Apply suggestions from code review
en-jin19 5b40ad5
Apply suggestions from code review
en-jin19 d6046b4
Update quick-start-with-htap.md
qiancai aeff6a7
Update explore-htap.md
qiancai cb016e4
Update explore-htap.md
qiancai 0c7cb0e
Update explore-htap.md
qiancai ae5235b
Merge branch 'release-5.1' into pr/6321
qiancai 996aacc
Merge branch 'release-5.2' into pr/6321
qiancai
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,104 @@ | ||
| --- | ||
| title: Explore HTAP | ||
| summary: Learn how to explore and use the features of TiDB HTAP. | ||
| --- | ||
|
|
||
| # Explore HTAP | ||
|
|
||
| This guide describes how to explore and use the features of TiDB Hybrid Transactional and Analytical Processing (HTAP). | ||
|
|
||
| > **Note:** | ||
| > | ||
| > If you are new to TiDB HTAP and want to start using it quickly, see [Quick start with HTAP](/quick-start-with-htap.md). | ||
|
|
||
| ## Use cases | ||
|
|
||
| TiDB HTAP can handle massive, rapidly growing data, reduce the cost of DevOps, and be deployed easily in either on-premises or cloud environments, which delivers the value of data assets in real time. | ||
|
|
||
| The following are the typical use cases of HTAP: | ||
|
|
||
| - Hybrid workload | ||
|
|
||
| When using TiDB for real-time Online Analytical Processing (OLAP) in hybrid workload scenarios, you only need to provide a single TiDB entry point to your data. TiDB automatically selects different processing engines based on the specific business needs. | ||
|
|
||
| - Real-time stream processing | ||
|
|
||
| When using TiDB in real-time stream processing scenarios, TiDB ensures that data flowing in continuously can be queried in real time. At the same time, TiDB can also handle highly concurrent data workloads and Business Intelligence (BI) queries. | ||
|
|
||
| - Data hub | ||
|
|
||
| When using TiDB as a data hub, TiDB can meet specific business needs by seamlessly connecting the data for the application and the data warehouse. | ||
|
|
||
| For more information about use cases of TiDB HTAP, see [blogs about HTAP on the PingCAP website](https://en.pingcap.com/blog/tag/HTAP). | ||
|
|
||
| ## Architecture | ||
|
|
||
| In TiDB, a row-based storage engine [TiKV](/tikv-overview.md) for Online Transactional Processing (OLTP) and a columnar storage engine [TiFlash](/tiflash/tiflash-overview.md) for Online Analytical Processing (OLAP) co-exist, replicate data automatically, and keep strong consistency. | ||
|
|
||
| For more information about the architecture, see [architecture of TiDB HTAP](/tiflash/tiflash-overview.md#architecture). | ||
|
|
||
| ## Environment preparation | ||
|
|
||
| Before exploring the features of TiDB HTAP, you need to deploy TiDB and the corresponding storage engines according to the data volume. If the data volume is large (for example, 100 TB), it is recommended to use TiFlash Massively Parallel Processing (MPP) as the primary solution and TiSpark as the supplementary solution. | ||
|
|
||
| - TiFlash | ||
|
|
||
| - If you have deployed a TiDB cluster with no TiFlash node, add the TiFlash nodes in the current TiDB cluster. For detailed information, see [Scale out a TiFlash cluster](/scale-tidb-using-tiup.md#scale-out-a-tiflash-cluster). | ||
| - If you have not deployed a TiDB cluster, see [Deploy a TiDB Cluster using TiUP](/production-deployment-using-tiup.md). Based on the minimal TiDB topology, you also need to deploy the [topology of TiFlash](/tiflash-deployment-topology.md). | ||
| - When deciding how to choose the number of TiFlash nodes, consider the following scenarios: | ||
|
|
||
| - If your use case requires OLTP with small-scale analytical processing and Ad-Hoc queries, deploy one or several TiFlash nodes. They can dramatically increase the speed of analytic queries. | ||
| - If the OLTP throughput does not put significant pressure on the I/O usage of the TiFlash nodes, each TiFlash node uses more resources for computation, and thus the TiFlash cluster can have near-linear scalability. The number of TiFlash nodes should be tuned based on expected performance and response time. | ||
| - If the OLTP throughput is relatively high (for example, the write or update throughput is higher than 10 million rows per hour), due to the limited write capacity of the network and physical disks, the I/O between TiKV and TiFlash becomes a bottleneck, and the cluster is also prone to read and write hotspots. In this case, the number of TiFlash nodes has a complex non-linear relationship with the computation volume of analytical processing, so you need to tune the number of TiFlash nodes based on the actual status of the system. | ||
|
|
||
| - TiSpark | ||
|
|
||
| - If your data needs to be analyzed with Spark, deploy TiSpark (Spark 3.x is not currently supported). For the specific process, see [TiSpark User Guide](/tispark-overview.md). | ||
|
|
||
| <!-- - Real-time stream processing | ||
| - If you want to build an efficient and easy-to-use real-time data warehouse with TiDB and Flink, you are welcome to participate in Apache Flink x TiDB meetups.--> | ||
|
|
||
| ## Data preparation | ||
|
|
||
| After TiFlash is deployed, TiKV does not replicate data to TiFlash automatically. You need to manually specify which tables need to be replicated to TiFlash. After that, TiDB creates the corresponding TiFlash replicas. | ||
|
|
||
| - If there is no data in the TiDB Cluster, migrate the data to TiDB first. For detailed information, see [data migration](/migration-overview.md). | ||
| - If the TiDB cluster already has the replicated data from upstream, after TiFlash is deployed, data replication does not automatically begin. You need to manually specify the tables to be replicated to TiFlash. For detailed information, see [Use TiFlash](/tiflash/use-tiflash.md). | ||
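
The manual step described above can be sketched as follows. This is a minimal illustration, assuming a table named `test.orders` already exists in the cluster; the schema and table names are placeholders for your own.

```sql
-- Ask TiDB to create one TiFlash replica of the table (table name is illustrative).
ALTER TABLE test.orders SET TIFLASH REPLICA 1;

-- List every table that currently has a TiFlash replica configured.
SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
FROM information_schema.tiflash_replica;

-- Setting the replica count back to 0 removes the TiFlash replica again.
ALTER TABLE test.orders SET TIFLASH REPLICA 0;
```

Replication runs in the background, so a freshly configured table may be listed as unavailable for a while before queries can read from TiFlash.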
|
|
||
| ## Data processing | ||
|
|
||
| With TiDB, you can simply enter SQL statements for query or write requests. For the tables with TiFlash replicas, TiDB uses the front-end optimizer to automatically choose the optimal execution plan. | ||
|
|
||
| > **Note:** | ||
| > | ||
| > The MPP mode of TiFlash is enabled by default. When an SQL statement is executed, TiDB automatically determines whether to run in the MPP mode through the optimizer. | ||
| > | ||
| > - To disable the MPP mode of TiFlash, set the value of the [tidb_allow_mpp](/system-variables.md#tidb_allow_mpp-new-in-v50) system variable to `OFF`. | ||
| > - To forcibly enable MPP mode of TiFlash for query execution, set the values of [tidb_allow_mpp](/system-variables.md#tidb_allow_mpp-new-in-v50) and [tidb_enforce_mpp](/system-variables.md#tidb_enforce_mpp-new-in-v51) to `ON`. | ||
| > - To check whether TiDB chooses the MPP mode to execute a specific query, see [Explain Statements in the MPP Mode](/explain-mpp.md#explain-statements-in-the-mpp-mode). If the output of `EXPLAIN` statement includes the `ExchangeSender` and `ExchangeReceiver` operators, the MPP mode is in use. | ||
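
The settings in the note above might be applied at the session level roughly as follows. This is a sketch only: it assumes a table `test.lineitem` with a TiFlash replica (an illustrative name), and even with `tidb_enforce_mpp` enabled, a query that MPP cannot execute may still run without it.

```sql
-- Disable the MPP mode for the current session.
SET SESSION tidb_allow_mpp = OFF;

-- Re-enable MPP and ask the optimizer to prefer it for this session.
SET SESSION tidb_allow_mpp = ON;
SET SESSION tidb_enforce_mpp = ON;

-- Inspect the plan; ExchangeSender and ExchangeReceiver operators indicate MPP.
EXPLAIN SELECT COUNT(*) FROM test.lineitem;
```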
|
|
||
| ## Performance monitoring | ||
|
|
||
| When using TiDB, you can monitor the TiDB cluster status and performance metrics in either of the following ways: | ||
|
|
||
| - [TiDB Dashboard](/dashboard/dashboard-intro.md): you can see the overall running status of the TiDB cluster, analyze the distribution and trends of read and write traffic, and learn the detailed execution information of slow queries. | ||
| - [Monitoring system (Prometheus & Grafana)](/grafana-overview-dashboard.md): you can see the monitoring parameters of TiDB cluster-related components including PD, TiDB, TiKV, TiFlash, TiCDC, and Node_exporter. | ||
|
|
||
| To see the alert rules of TiDB cluster and TiFlash cluster, see [TiDB cluster alert rules](/alert-rules.md) and [TiFlash alert rules](/tiflash/tiflash-alert-rules.md). | ||
|
|
||
| ## Troubleshooting | ||
|
|
||
| If any issue occurs while using TiDB, refer to the following documents: | ||
|
|
||
| - [Analyze slow queries](/analyze-slow-queries.md) | ||
| - [Identify expensive queries](/identify-expensive-queries.md) | ||
| - [Troubleshoot hotspot issues](/troubleshoot-hot-spot-issues.md) | ||
| - [TiDB cluster troubleshooting guide](/troubleshoot-tidb-cluster.md) | ||
| - [Troubleshoot a TiFlash Cluster](/tiflash/troubleshoot-tiflash.md) | ||
|
|
||
| You are also welcome to create [GitHub Issues](https://github.com/pingcap/tiflash/issues) or submit your questions on [AskTUG](https://asktug.com/). | ||
|
|
||
| ## What's next | ||
|
|
||
| - To check the TiFlash version, critical logs, and system tables, see [Maintain a TiFlash cluster](/tiflash/maintain-tiflash.md). | ||
| - To remove a specific TiFlash node, see [Scale in a TiFlash cluster](/scale-tidb-using-tiup.md#scale-in-a-tiflash-cluster). | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,213 @@ | ||
| --- | ||
| title: Quick start with HTAP | ||
| summary: Learn how to quickly get started with the TiDB HTAP. | ||
| --- | ||
|
|
||
| # Quick Start Guide for TiDB HTAP | ||
|
|
||
| This guide walks you through the quickest way to get started with TiDB's one-stop solution of Hybrid Transactional and Analytical Processing (HTAP). | ||
|
|
||
| > **Note:** | ||
| > | ||
| > The steps provided in this guide are ONLY for a quick start in a test environment. For production environments, [explore HTAP](/explore-htap.md) is recommended. | ||
|
|
||
| ## Basic concepts | ||
|
|
||
| Before using TiDB HTAP, you need to have some basic knowledge about [TiKV](/tikv-overview.md), a row-based storage engine for TiDB Online Transactional Processing (OLTP), and [TiFlash](/tiflash/tiflash-overview.md), a columnar storage engine for TiDB Online Analytical Processing (OLAP). | ||
|
|
||
| - Storage engines of HTAP: The row-based storage engine and the columnar storage engine co-exist for HTAP. Both storage engines can replicate data automatically and keep strong consistency. The row-based storage engine optimizes OLTP performance, and the columnar storage engine optimizes OLAP performance. | ||
| - Data consistency of HTAP: As a distributed and transactional key-value database, TiKV provides transactional interfaces with ACID compliance, and guarantees data consistency between multiple replicas and high availability with the implementation of the [Raft consensus algorithm](https://raft.github.io/raft.pdf). As a columnar storage extension of TiKV, TiFlash replicates data from TiKV in real time according to the Raft Learner consensus algorithm, which ensures that data is strongly consistent between TiKV and TiFlash. | ||
| - Data isolation of HTAP: TiKV and TiFlash can be deployed on different machines as needed to solve the problem of HTAP resource isolation. | ||
| - MPP computing engine: [MPP](/tiflash/use-tiflash.md#control-whether-to-select-the-mpp-mode) is a distributed computing framework provided by the TiFlash engine since TiDB 5.0, which allows data exchange between nodes and provides high-performance, high-throughput SQL algorithms. In the MPP mode, the run time of the analytic queries can be significantly reduced. | ||
|
|
||
| ## Steps | ||
|
|
||
| In this document, you can experience the convenience and high performance of TiDB HTAP by querying an example table in a popular [TPC-H](http://www.tpc.org/tpch/) dataset. | ||
|
|
||
| ### Step 1. Deploy a local test environment | ||
|
|
||
| Before using TiDB HTAP, follow the steps in the [Quick Start Guide for the TiDB Database Platform](/quick-start-with-tidb.md) to prepare a local test environment, and run the following command to deploy a TiDB cluster: | ||
|
|
||
| {{< copyable "shell-regular" >}} | ||
|
|
||
| ```shell | ||
| tiup playground | ||
| ``` | ||
|
|
||
| > **Note:** | ||
| > | ||
| > The `tiup playground` command is ONLY for quick start, NOT for production. | ||
|
|
||
| ### Step 2. Prepare test data | ||
|
|
||
| In the following steps, you can create a [TPC-H](http://www.tpc.org/tpch/) dataset as the test data to use TiDB HTAP. If you are interested in TPC-H, see [General Implementation Guidelines](http://tpc.org/tpc_documents_current_versions/pdf/tpc-h_v3.0.0.pdf). | ||
|
|
||
| > **Note:** | ||
| > | ||
| > If you want to use your existing data for analytic queries, you can [migrate your data to TiDB](/migration-overview.md). If you want to design and create your own test data, you can create it by executing SQL statements or using related tools. | ||
|
|
||
| 1. Install the test data generation tool by running the following command: | ||
|
|
||
| {{< copyable "shell-regular" >}} | ||
|
|
||
| ```shell | ||
| tiup install bench | ||
| ``` | ||
|
|
||
| 2. Generate the test data by running the following command: | ||
|
|
||
| {{< copyable "shell-regular" >}} | ||
|
|
||
| ```shell | ||
| tiup bench tpch --sf=1 prepare | ||
| ``` | ||
|
|
||
| If the output of this command shows `Finished`, it indicates that the data is created. | ||
|
|
||
| 3. Execute the following SQL statement to view the generated data: | ||
|
|
||
| {{< copyable "sql" >}} | ||
|
|
||
| ```sql | ||
| SELECT | ||
| CONCAT(table_schema,'.',table_name) AS 'Table Name', | ||
| table_rows AS 'Number of Rows', | ||
| FORMAT_BYTES(data_length) AS 'Data Size', | ||
| FORMAT_BYTES(index_length) AS 'Index Size', | ||
| FORMAT_BYTES(data_length+index_length) AS 'Total' | ||
| FROM | ||
| information_schema.TABLES | ||
| WHERE | ||
| table_schema='test'; | ||
| ``` | ||
|
|
||
| As you can see from the output, eight tables are created in total, and the largest table has 6.5 million rows (the number of rows created by the tool depends on the actual SQL query result because the data is randomly generated). | ||
|
|
||
| ```sql | ||
| +---------------+----------------+-----------+------------+-----------+ | ||
| | Table Name | Number of Rows | Data Size | Index Size | Total | | ||
| +---------------+----------------+-----------+------------+-----------+ | ||
| | test.nation | 25 | 2.44 KiB | 0 bytes | 2.44 KiB | | ||
| | test.region | 5 | 416 bytes | 0 bytes | 416 bytes | | ||
| | test.part | 200000 | 25.07 MiB | 0 bytes | 25.07 MiB | | ||
| | test.supplier | 10000 | 1.45 MiB | 0 bytes | 1.45 MiB | | ||
| | test.partsupp | 800000 | 120.17 MiB| 12.21 MiB | 132.38 MiB| | ||
| | test.customer | 150000 | 24.77 MiB | 0 bytes | 24.77 MiB | | ||
| | test.orders | 1527648 | 174.40 MiB| 0 bytes | 174.40 MiB| | ||
| | test.lineitem | 6491711 | 849.07 MiB| 99.06 MiB | 948.13 MiB| | ||
| +---------------+----------------+-----------+------------+-----------+ | ||
| 8 rows in set (0.06 sec) | ||
| ``` | ||
|
|
||
| This is a database of a commercial ordering system, in which the `test.nation` table stores information about countries, the `test.region` table stores information about regions, the `test.part` table stores information about parts, the `test.supplier` table stores information about suppliers, the `test.partsupp` table stores information about the parts supplied by suppliers, the `test.customer` table stores information about customers, the `test.orders` table stores information about orders, and the `test.lineitem` table stores information about line items. | ||
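
To confirm the row counts for a given run, an exact count can be taken directly from a table. The sketch below assumes that the `table_rows` value reported by `information_schema.TABLES` is an estimate derived from statistics, which is worth verifying for your TiDB version.

```sql
-- Exact row count of the largest table; this scans the table, so it can take a moment.
SELECT COUNT(*) AS exact_rows FROM test.lineitem;
```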
|
|
||
| ### Step 3. Query data with the row-based storage engine | ||
|
|
||
| To see the performance of TiDB with only the row-based storage engine, execute the following SQL statement: | ||
|
|
||
| {{< copyable "sql" >}} | ||
|
|
||
| ```sql | ||
| SELECT | ||
| l_orderkey, | ||
| SUM( | ||
| l_extendedprice * (1 - l_discount) | ||
| ) AS revenue, | ||
| o_orderdate, | ||
| o_shippriority | ||
| FROM | ||
| customer, | ||
| orders, | ||
| lineitem | ||
| WHERE | ||
| c_mktsegment = 'BUILDING' | ||
| AND c_custkey = o_custkey | ||
| AND l_orderkey = o_orderkey | ||
| AND o_orderdate < DATE '1996-01-01' | ||
| AND l_shipdate > DATE '1996-02-01' | ||
| GROUP BY | ||
| l_orderkey, | ||
| o_orderdate, | ||
| o_shippriority | ||
| ORDER BY | ||
| revenue DESC, | ||
| o_orderdate | ||
| limit 10; | ||
| ``` | ||
|
|
||
| This is a shipping priority query, which provides the shipping priority and potential revenue of the highest-revenue orders that have not been shipped as of a specified date. The potential revenue is defined as the sum of `l_extendedprice * (1 - l_discount)`. The orders are listed in descending order of revenue. In this example, the query lists the top 10 unshipped orders by potential revenue. | ||
|
|
||
| ### Step 4. Replicate the test data to the columnar storage engine | ||
|
|
||
| After TiFlash is deployed, TiKV does not replicate data to TiFlash immediately. You need to execute the following DDL statements in a MySQL client of TiDB to specify which tables need to be replicated. After that, TiDB will create the specified replicas in TiFlash accordingly. | ||
|
|
||
| {{< copyable "sql" >}} | ||
|
|
||
| ```sql | ||
| ALTER TABLE test.customer SET TIFLASH REPLICA 1; | ||
| ALTER TABLE test.orders SET TIFLASH REPLICA 1; | ||
| ALTER TABLE test.lineitem SET TIFLASH REPLICA 1; | ||
| ``` | ||
|
|
||
| To check the replication status of the specific tables, execute the following statements: | ||
|
|
||
| {{< copyable "sql" >}} | ||
|
|
||
| ```sql | ||
| SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'customer'; | ||
| SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'orders'; | ||
| SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'lineitem'; | ||
| ``` | ||
|
|
||
| In the result of the above statements: | ||
|
|
||
| - `AVAILABLE` indicates whether the TiFlash replica of a specific table is available or not. `1` means available and `0` means unavailable. Once a replica becomes available, this status does not change any more. If you use DDL statements to modify the number of replicas, the replication status will be recalculated. | ||
| - `PROGRESS` indicates the progress of the replication. The value is between `0.0` and `1.0`. `1.0` means that at least one replica has been replicated. | ||
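
As a convenience, the three status checks can also be combined into a single statement. This is a sketch that selects only the columns discussed above, plus `REPLICA_COUNT`:

```sql
-- Show the replication status of all three tables at once.
SELECT TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
FROM information_schema.tiflash_replica
WHERE TABLE_SCHEMA = 'test'
  AND TABLE_NAME IN ('customer', 'orders', 'lineitem')
ORDER BY TABLE_NAME;
```

Rerun the statement until `AVAILABLE` is `1` for every table before moving on to the next step.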
|
|
||
| ### Step 5. Analyze data faster using HTAP | ||
|
|
||
| Execute the SQL statements in [Step 3](#step-3-query-data-with-the-row-based-storage-engine) again, and you can see the performance of TiDB HTAP. | ||
|
|
||
| For tables with TiFlash replicas, the TiDB optimizer automatically determines whether to use TiFlash replicas based on the cost estimation. To check whether or not a TiFlash replica is selected, you can use the `desc` or `explain analyze` statement. For example: | ||
|
|
||
| {{< copyable "sql" >}} | ||
|
|
||
| ```sql | ||
| explain analyze SELECT | ||
| l_orderkey, | ||
| SUM( | ||
| l_extendedprice * (1 - l_discount) | ||
| ) AS revenue, | ||
| o_orderdate, | ||
| o_shippriority | ||
| FROM | ||
| customer, | ||
| orders, | ||
| lineitem | ||
| WHERE | ||
| c_mktsegment = 'BUILDING' | ||
| AND c_custkey = o_custkey | ||
| AND l_orderkey = o_orderkey | ||
| AND o_orderdate < DATE '1996-01-01' | ||
| AND l_shipdate > DATE '1996-02-01' | ||
| GROUP BY | ||
| l_orderkey, | ||
| o_orderdate, | ||
| o_shippriority | ||
| ORDER BY | ||
| revenue DESC, | ||
| o_orderdate | ||
| limit 10; | ||
| ``` | ||
|
|
||
| If the result of the `EXPLAIN` statement shows `ExchangeSender` and `ExchangeReceiver` operators, it indicates that the MPP mode has taken effect. | ||
|
|
||
| In addition, you can specify that each part of the entire query is computed using only the TiFlash engine. For detailed information, see [Use TiDB to read TiFlash replicas](/tiflash/use-tiflash.md#use-tidb-to-read-tiflash-replicas). | ||
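
One way to steer the engine choice for a single table is the `READ_FROM_STORAGE` optimizer hint. The sketch below is illustrative: it assumes the TPC-H tables from Step 2 live in the `test` database and that `lineitem` already has an available TiFlash replica. Some MySQL clients strip comments (and therefore hints) unless started with the `--comments` option, so check the plan output to confirm that the hint took effect.

```sql
USE test;

-- Read lineitem from the columnar engine (TiFlash) only.
EXPLAIN SELECT /*+ READ_FROM_STORAGE(TIFLASH[lineitem]) */ COUNT(*) FROM lineitem;

-- Read the same table from the row-based engine (TiKV) for comparison.
EXPLAIN SELECT /*+ READ_FROM_STORAGE(TIKV[lineitem]) */ COUNT(*) FROM lineitem;
```

Replacing `EXPLAIN` with `EXPLAIN ANALYZE` executes each statement and reports the actual execution time, which makes the two engines easier to compare.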
|
|
||
| You can compare query results and query performance of these two methods. | ||
|
|
||
| ## What's next | ||
|
|
||
| - [Architecture of TiDB HTAP](/tiflash/tiflash-overview.md#architecture) | ||
| - [Explore HTAP](/explore-htap.md) | ||
| - [Use TiFlash](/tiflash/use-tiflash.md#use-tiflash) |