Merged
129 commits
8fe89ce
Remove all aliases from release-5.1 (#5818)
TomShawn Jun 22, 2021
ff81e85
add docs for enforce mpp (#5811) (#5819)
ti-chi-bot Jun 22, 2021
ed00c06
Update tidb-configuration-file.md (#5070) (#5821)
ti-chi-bot Jun 22, 2021
b7263ce
Add Lock View documents (#5798) (#5822)
ti-chi-bot Jun 22, 2021
92ffdf7
Fix typos (#5827) (#5828)
ti-chi-bot Jun 23, 2021
a9838d9
add document about stale read transaction (#5809) (#5829)
ti-chi-bot Jun 23, 2021
10e91bf
Delete docker deployment docs (#5830) (#5832)
ti-chi-bot Jun 23, 2021
41af2b5
Fix ebnf display (#5833) (#5834)
ti-chi-bot Jun 24, 2021
a966936
tidb-configuration-file: add graceful-wait-before-shutdown (#5825) (#…
ti-chi-bot Jun 24, 2021
57be14d
lightning: fix a lightning config bug (#5820) (#5837)
ti-chi-bot Jun 24, 2021
55f61ac
change docs configs to release-5.1 (#5817)
TomShawn Jun 24, 2021
3f5cb7f
Update content about upgrade (#5813) (#5841)
ti-chi-bot Jun 24, 2021
0ccfa07
add v510 benchmark doc (#5842) (#5845)
ti-chi-bot Jun 24, 2021
fcd26c2
system variables: add tidb_analyze_version (#5824) (#5846)
ti-chi-bot Jun 24, 2021
4642cd0
Add note about stale read (#5843) (#5850)
ti-chi-bot Jun 24, 2021
cca3435
Add v5.1 mpp tpch test report (#5844) (#5851)
ti-chi-bot Jun 24, 2021
f58f3ea
Bump version for v5.1.0 (#5849) (#5852)
ti-chi-bot Jun 24, 2021
670fe11
Add the release note for TiDB v5.1 (#5840) (#5853)
ti-chi-bot Jun 24, 2021
695e8ab
Update v5.1 release notes for lint issues (#5854) (#5855)
ti-chi-bot Jun 24, 2021
af53b3e
Fix the display issues of two lists (#5857) (#5858)
ti-chi-bot Jun 24, 2021
c601037
Update TOC.md (#5862)
qiancai Jun 24, 2021
74861a1
Add a missing sentence in 5.1 rn (#5863) (#5864)
ti-chi-bot Jun 25, 2021
247b380
Overhauling TiKV RocksDB configuration file (#5746) (#5873)
ti-chi-bot Jun 28, 2021
14815c5
Not support to set Tombstone (#5878) (#5885)
ti-chi-bot Jun 29, 2021
98811bf
Update tiup-bench.md (#5880) (#5883)
ti-chi-bot Jun 29, 2021
a9c84a2
tidb-functions: extend tidb_decode_key docs (#5877) (#5887)
ti-chi-bot Jun 29, 2021
fb04272
*: upadate roadmap (#5888) (#5889)
ti-chi-bot Jun 29, 2021
b718358
system-variables: add datadir and license (#5761) (#5890)
ti-chi-bot Jun 29, 2021
af1cf83
Add v5.1.0 / release-5.1 where it is missing (#5893) (#5902)
ti-chi-bot Jul 1, 2021
9fb760c
Clarify tidb lightning backend description (#5904) (#5907)
ti-chi-bot Jul 1, 2021
f5baca7
system-variables: Add charset documentation (#5867) (#5909)
ti-chi-bot Jul 1, 2021
0849f75
fix partition table doc error (#5903) (#5912)
ti-chi-bot Jul 1, 2021
7a7da4d
releases: add 5.0.3 release notes (#5874) (#5917)
ti-chi-bot Jul 2, 2021
30e06c6
Add workaround about using Stale Read with TiFlash (#5875) (#5922)
ti-chi-bot Jul 5, 2021
0cc435a
Fix ticdc canal-json related doc (#5774) (#5925)
ti-chi-bot Jul 5, 2021
4e7dad5
release-5.1.0: update CTE description (#5918) (#5923)
ti-chi-bot Jul 5, 2021
97335d6
remove some useless configurations (#5558) (#5926)
ti-chi-bot Jul 5, 2021
8b39e57
Remove a blank line that causes display failure (#5934) (#5935)
ti-chi-bot Jul 7, 2021
94b3436
configure-memory-usage: update default value (#5913) (#5939)
ti-chi-bot Jul 8, 2021
0cae595
quick-start-with-tidb: add a note to clarify the example ip (#5865) (…
ti-chi-bot Jul 8, 2021
2dfc3fa
TiKV configuration: defaultcf.titan config should not apply to other …
ti-chi-bot Jul 8, 2021
9b6059f
added config example for s3.region (#5919) (#5949)
ti-chi-bot Jul 8, 2021
48fc0a2
partitioned-table: adding the correction to incorrect examples (#5931…
ti-chi-bot Jul 8, 2021
98989b5
fix ctc doc (#5914) (#5953)
ti-chi-bot Jul 8, 2021
55e8034
*: make the scene of stale read clearer (#5932) (#5954)
ti-chi-bot Jul 8, 2021
b508e67
clustered-index.md: nonclusterd -> nonclustered (#5956) (#5958)
ti-chi-bot Jul 8, 2021
2b43641
Add the default value description for tidb_enforce_mpp (#5955) (#5961)
ti-chi-bot Jul 9, 2021
0f67d4b
Add a note to clarify the purpose of the quick start guide (#5876) (#…
ti-chi-bot Jul 12, 2021
a8c3b4f
fix-broken-external-links (#5971) (#5977)
ti-chi-bot Jul 13, 2021
68b2ee5
Fix some default configurations for RocksDB (#5969) (#5979)
ti-chi-bot Jul 13, 2021
afe2e92
Added a note about grpc-compression-type (#5960) (#5981)
ti-chi-bot Jul 13, 2021
2ef0ca7
Updated TiUP version (#5970) (#5983)
ti-chi-bot Jul 14, 2021
4472c8f
Remove the swappiness parameter (#5987) (#5990)
ti-chi-bot Jul 16, 2021
b196fe6
Change tidb_memory_usage_alarm_ratio scope to instance (#5988) (#5994)
ti-chi-bot Jul 16, 2021
c153c26
system-variables: update for consistency (#5826) (#5991)
ti-chi-bot Jul 16, 2021
2a0f15a
index page: updated the phase of TiDB Cloud from Beta to Public Previ…
ti-chi-bot Jul 23, 2021
595dcca
TiUP cluster: update data_dir (#6009) (#6019)
ti-chi-bot Jul 23, 2021
e67c799
update docs related to partition table dynamic mode (#5997) (#6025)
ti-chi-bot Jul 23, 2021
c8bc874
partitioning: Corrected partition management (#5498) (#6027)
ti-chi-bot Jul 23, 2021
852173f
TiDB binlog: update descriptions about commit ts and passoword (#5986…
ti-chi-bot Jul 27, 2021
f1f00b1
releases: add tidb 4.0.14 release notes (#5996) (#6039)
ti-chi-bot Jul 27, 2021
b0374ce
fix a typo for sync_diff_inspector (#6041) (#6043)
ti-chi-bot Jul 28, 2021
18ff150
Update dashboard FAQ (#5895) (#6051)
ti-chi-bot Jul 29, 2021
68eb8ec
correct document of using br backup and restore system tables (#6057)…
ti-chi-bot Jul 30, 2021
01e9b66
Add TiDB Dashboard session docs (#6058) (#6063)
ti-chi-bot Jul 30, 2021
2c75adc
Update the default value of tidb_stmt_summary_max_stmt_count (#6021) …
ti-chi-bot Jul 30, 2021
f0ff847
update br faq (#6060) (#6064)
ti-chi-bot Jul 30, 2021
4d27b79
releases: add TiDB 5.1.1 release notes (#6030) (#6065)
ti-chi-bot Jul 30, 2021
bc5e698
update stale read doc for ga (#6047) (#6052)
ti-chi-bot Jul 30, 2021
8f72414
grafana-overview-dashboard: update the monitoring item for CPS (#6066…
ti-chi-bot Jul 30, 2021
e049aa2
deploy-tidb-binlog: make the expression on deployment clearer (#6073)…
ti-chi-bot Aug 2, 2021
11d7adb
Add documentation on how to modify gcttl by tiup (#6071) (#6077)
ti-chi-bot Aug 2, 2021
e204267
remove useless variable (#6076) (#6078)
ti-chi-bot Aug 2, 2021
d919ebe
deleted roadmap.md (#6079) (#6082)
ti-chi-bot Aug 2, 2021
2eadade
TiDB Monitoring Metrics: remove a line (#6081) (#6083)
ti-chi-bot Aug 2, 2021
7c80eda
chore: lock plugin versions (#6089) (#6095)
ti-chi-bot Aug 4, 2021
878e03e
cdc: add compatibility notes for sort-dir (#6086) (#6096)
ti-chi-bot Aug 4, 2021
4854222
Add gc ttl (#6102) (#6104)
ti-chi-bot Aug 5, 2021
afa6b40
5.1.0 release notes: Fix link to telemetry docs (#6106) (#6107)
ti-chi-bot Aug 5, 2021
441738e
TiCDC: update a golang demo link (#6055) (#6114)
ti-chi-bot Aug 5, 2021
affee15
br/use-br-command-line-tool: supplement br note (#6000) (#6116)
ti-chi-bot Aug 5, 2021
f11a139
pr_template: Provides tips for cherry-pick (#6042) (#6122)
ti-chi-bot Aug 5, 2021
6141cd2
ticdc: add explicit_defaults_for_timestamp compatibility troubleshoot…
ti-chi-bot Aug 6, 2021
bf74697
TiFlash: remove outdated tune advise (#6133) (#6135)
ti-chi-bot Aug 10, 2021
fa2b427
tidb-scheduling: fix typo (#6140) (#6143)
ti-chi-bot Aug 11, 2021
431277a
tiup: fix dead links (#6153) (#6155)
ti-chi-bot Aug 12, 2021
ebdda11
chore: update pdf version tag (#6150)
YiniXu9506 Aug 13, 2021
bb85ae0
high-concurrency-best-practices: fix the support info of follower rea…
ti-chi-bot Aug 13, 2021
543d588
Fix broken link in error codes doc and support doc (#6196) (#6200)
ti-chi-bot Aug 20, 2021
0c57caa
docs: fix format for TiKV and PD configuration file template invalid.…
ti-chi-bot Aug 20, 2021
a280ce8
adopters: add zhihu case study (#6207) (#6213)
ti-chi-bot Aug 23, 2021
0e94de9
add notice about scaling in pd node (#6099) (#6221)
ti-chi-bot Aug 23, 2021
51e576a
adopters: remove 404 links (#6214) (#6225)
ti-chi-bot Aug 24, 2021
05e4084
update PR template for v5.2 (#6166) (#6241)
ti-chi-bot Aug 25, 2021
c9633c0
alert rules: update some descriptions (#6250) (#6256)
ti-chi-bot Aug 25, 2021
c3da29e
Fix image display error (#6169)
TomShawn Aug 25, 2021
0cf5b85
fix typo: relaod -> reload (#6235) (#6260)
ti-chi-bot Aug 25, 2021
02e2408
br: add restore to systables (#6004) (#6271)
ti-chi-bot Aug 26, 2021
df1c98a
Require process privilege for dumpling (#6187) (#6274)
ti-chi-bot Aug 26, 2021
56ff335
statement summary: update statement summary doc (#6084) (#6270)
ti-chi-bot Aug 26, 2021
278ada5
Add description about table name/alias specifying for read_from_stora…
ti-chi-bot Aug 26, 2021
6c4befb
TiKV configuration: remove redundant instructions (#6218) (#6288)
ti-chi-bot Aug 26, 2021
69fca38
alert rules: remove some descriptions (#6223) (#6292)
ti-chi-bot Aug 26, 2021
0358567
sql: improve kill's description (#6233) (#6306)
ti-chi-bot Aug 27, 2021
97713d9
system variables.md: add a warning message (#6298) (#6312)
ti-chi-bot Aug 27, 2021
f4687a3
tiup: add notice about importing cluster (#6108) (#6316)
ti-chi-bot Aug 27, 2021
1d2f500
chore: update PDF setting for 5.1 (#6307)
TomShawn Aug 27, 2021
6343c3e
add two HTAP documents
en-jin19 Aug 22, 2021
5b781fb
fix CI error
en-jin19 Aug 22, 2021
56b9e4c
fix CI error
en-jin19 Aug 22, 2021
8c81066
fix CI error
en-jin19 Aug 22, 2021
2a30b5f
fix CI errors
en-jin19 Aug 22, 2021
0249468
Update explore-htap.md
en-jin19 Aug 23, 2021
7ede85b
Update quick-start-with-htap.md
en-jin19 Aug 23, 2021
107b808
Update quick-start-with-htap.md
en-jin19 Aug 24, 2021
b2e4842
Update quick-start-with-htap.md
en-jin19 Aug 24, 2021
c92428c
Apply suggestions from code review
en-jin19 Aug 26, 2021
aa7eb2a
Apply suggestions from code review
en-jin19 Aug 26, 2021
06d34ea
fix CI error
en-jin19 Aug 26, 2021
949da58
Update explore-htap.md
qiancai Aug 26, 2021
2181340
Apply suggestions from code review
qiancai Aug 26, 2021
c84bb42
Apply suggestions from code review
en-jin19 Aug 27, 2021
5b40ad5
Apply suggestions from code review
en-jin19 Aug 27, 2021
d6046b4
Update quick-start-with-htap.md
qiancai Aug 27, 2021
aeff6a7
Update explore-htap.md
qiancai Aug 27, 2021
cb016e4
Update explore-htap.md
qiancai Aug 27, 2021
0c7cb0e
Update explore-htap.md
qiancai Aug 27, 2021
ae5235b
Merge branch 'release-5.1' into pr/6321
qiancai Aug 27, 2021
996aacc
Merge branch 'release-5.2' into pr/6321
qiancai Aug 27, 2021
2 changes: 2 additions & 0 deletions TOC.md
@@ -20,7 +20,9 @@
+ [Credits](/credits.md)
+ Quick Start
+ [Try Out TiDB](/quick-start-with-tidb.md)
+ [Try Out HTAP](/quick-start-with-htap.md)
+ [Learn TiDB SQL](/basic-sql-operations.md)
+ [Learn HTAP](/explore-htap.md)
+ [Import Example Database](/import-example-data.md)
+ Deploy
+ [Software and Hardware Requirements](/hardware-and-software-requirements.md)
4 changes: 3 additions & 1 deletion _index.md
@@ -25,8 +25,10 @@ Designed for the cloud, TiDB provides flexible scalability, reliability and secu
<NavColumn>
<ColumnTitle>Quick Start</ColumnTitle>

- [Quick Start Guide](/quick-start-with-tidb.md)
- [Quick Start with TiDB](/quick-start-with-tidb.md)
- [Quick Start with HTAP](/quick-start-with-htap.md)
- [Explore SQL with TiDB](/basic-sql-operations.md)
- [Explore HTAP](/explore-htap.md)

</NavColumn>

104 changes: 104 additions & 0 deletions explore-htap.md
@@ -0,0 +1,104 @@
---
title: Explore HTAP
summary: Learn how to explore and use the features of TiDB HTAP.
---

# Explore HTAP

This guide describes how to explore and use the features of TiDB Hybrid Transactional and Analytical Processing (HTAP).

> **Note:**
>
> If you are new to TiDB HTAP and want to start using it quickly, see [Quick start with HTAP](/quick-start-with-htap.md).

## Use cases

TiDB HTAP can handle massive, rapidly growing data, reduce the cost of DevOps, and be deployed easily in either on-premises or cloud environments, which delivers the value of data assets in real time.

The following are the typical use cases of HTAP:

- Hybrid workload

In hybrid load scenarios that use TiDB for real-time Online Analytical Processing (OLAP), you only need to provide a single TiDB entry point to your data. TiDB automatically selects the processing engine based on the specific business workload.

- Real-time stream processing

In real-time stream processing scenarios, TiDB ensures that all data that continuously flows in can be queried in real time. At the same time, TiDB can also handle highly concurrent data workloads and Business Intelligence (BI) queries.

- Data hub

When used as a data hub, TiDB can meet specific business needs by seamlessly connecting the data for the application and the data warehouse.

For more information about use cases of TiDB HTAP, see [blogs about HTAP on the PingCAP website](https://en.pingcap.com/blog/tag/HTAP).

## Architecture

In TiDB, a row-based storage engine [TiKV](/tikv-overview.md) for Online Transactional Processing (OLTP) and a columnar storage engine [TiFlash](/tiflash/tiflash-overview.md) for Online Analytical Processing (OLAP) co-exist, replicate data automatically, and keep strong consistency.

For more information about the architecture, see [architecture of TiDB HTAP](/tiflash/tiflash-overview.md#architecture).

## Environment preparation

Before exploring the features of TiDB HTAP, you need to deploy TiDB and the corresponding storage engines according to the data volume. If the data volume is large (for example, 100 TB), it is recommended to use TiFlash Massively Parallel Processing (MPP) as the primary solution and TiSpark as the supplementary solution.

- TiFlash

- If you have deployed a TiDB cluster with no TiFlash node, add the TiFlash nodes in the current TiDB cluster. For detailed information, see [Scale out a TiFlash cluster](/scale-tidb-using-tiup.md#scale-out-a-tiflash-cluster).
- If you have not deployed a TiDB cluster, see [Deploy a TiDB Cluster using TiUP](/production-deployment-using-tiup.md). Based on the minimal TiDB topology, you also need to deploy the [topology of TiFlash](/tiflash-deployment-topology.md).
- When deciding how to choose the number of TiFlash nodes, consider the following scenarios:

- If your use case requires OLTP with small-scale analytical processing and ad hoc queries, deploy one or several TiFlash nodes. They can dramatically increase the speed of analytic queries.
- If the OLTP throughput does not put significant pressure on the I/O usage of the TiFlash nodes, each TiFlash node uses more resources for computation, and thus the TiFlash cluster can have near-linear scalability. The number of TiFlash nodes should be tuned based on expected performance and response time.
- If the OLTP throughput is relatively high (for example, the write or update throughput is higher than 10 million rows per hour), the limited write capacity of the network and physical disks makes the I/O between TiKV and TiFlash a bottleneck, and the cluster is also prone to read and write hotspots. In this case, the number of TiFlash nodes has a complex non-linear relationship with the computation volume of analytical processing, so you need to tune the number of TiFlash nodes based on the actual status of the system.

- TiSpark

- If your data needs to be analyzed with Spark, deploy TiSpark (Spark 3.x is not currently supported). For the specific process, see [TiSpark User Guide](/tispark-overview.md).

<!-- - Real-time stream processing
- If you want to build an efficient and easy-to-use real-time data warehouse with TiDB and Flink, you are welcome to participate in Apache Flink x TiDB meetups.-->

## Data preparation

After TiFlash is deployed, TiKV does not replicate data to TiFlash automatically. You need to manually specify which tables need to be replicated to TiFlash. After that, TiDB creates the corresponding TiFlash replicas.

- If there is no data in the TiDB cluster, migrate the data to TiDB first. For detailed information, see [data migration](/migration-overview.md).
- If the TiDB cluster already has the replicated data from upstream, after TiFlash is deployed, data replication does not automatically begin. You need to manually specify the tables to be replicated to TiFlash. For detailed information, see [Use TiFlash](/tiflash/use-tiflash.md).
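
As a minimal sketch of that manual step, the following statements request one TiFlash replica for a table and later remove it. The table name `test.t` is a placeholder that does not come from this guide. The `SET TIFLASH REPLICA` syntax is the one described in [Use TiFlash](/tiflash/use-tiflash.md).

{{< copyable "sql" >}}

```sql
-- `test.t` is a hypothetical table; substitute one of your own.
-- Ask TiDB to create one TiFlash replica of the table.
ALTER TABLE test.t SET TIFLASH REPLICA 1;

-- Setting the replica count back to 0 removes the TiFlash replica.
ALTER TABLE test.t SET TIFLASH REPLICA 0;
```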

## Data processing

With TiDB, you can simply enter SQL statements for query or write requests. For the tables with TiFlash replicas, TiDB uses the front-end optimizer to automatically choose the optimal execution plan.

> **Note:**
>
> The MPP mode of TiFlash is enabled by default. When an SQL statement is executed, TiDB automatically determines whether to run in the MPP mode through the optimizer.
>
> - To disable the MPP mode of TiFlash, set the value of the [tidb_allow_mpp](/system-variables.md#tidb_allow_mpp-new-in-v50) system variable to `OFF`.
> - To forcibly enable MPP mode of TiFlash for query execution, set the values of [tidb_allow_mpp](/system-variables.md#tidb_allow_mpp-new-in-v50) and [tidb_enforce_mpp](/system-variables.md#tidb_enforce_mpp-new-in-v51) to `ON`.
> - To check whether TiDB chooses the MPP mode to execute a specific query, see [Explain Statements in the MPP Mode](/explain-mpp.md#explain-statements-in-the-mpp-mode). If the output of `EXPLAIN` statement includes the `ExchangeSender` and `ExchangeReceiver` operators, the MPP mode is in use.
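
The settings in the note above can be sketched as session-level statements. This assumes that you only want to affect the current session; whether to change the variables more broadly depends on your deployment.

{{< copyable "sql" >}}

```sql
-- Inspect the current values of the MPP-related variables.
SHOW VARIABLES LIKE 'tidb_%mpp';

-- Disable the MPP mode for the current session.
SET @@session.tidb_allow_mpp = OFF;

-- Forcibly enable the MPP mode for the current session.
SET @@session.tidb_allow_mpp = ON;
SET @@session.tidb_enforce_mpp = ON;
```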

## Performance monitoring

When using TiDB, you can monitor the TiDB cluster status and performance metrics in either of the following ways:

- [TiDB Dashboard](/dashboard/dashboard-intro.md): you can see the overall running status of the TiDB cluster, analyze distribution and trends of read and write traffic, and learn the detailed execution information of slow queries.
- [Monitoring system (Prometheus & Grafana)](/grafana-overview-dashboard.md): you can see the monitoring parameters of TiDB cluster-related components including PD, TiDB, TiKV, TiFlash, TiCDC, and Node_exporter.

To see the alert rules of the TiDB cluster and the TiFlash cluster, see [TiDB cluster alert rules](/alert-rules.md) and [TiFlash alert rules](/tiflash/tiflash-alert-rules.md).

## Troubleshooting

If any issue occurs while using TiDB, refer to the following documents:

- [Analyze slow queries](/analyze-slow-queries.md)
- [Identify expensive queries](/identify-expensive-queries.md)
- [Troubleshoot hotspot issues](/troubleshoot-hot-spot-issues.md)
- [TiDB cluster troubleshooting guide](/troubleshoot-tidb-cluster.md)
- [Troubleshoot a TiFlash Cluster](/tiflash/troubleshoot-tiflash.md)

You are also welcome to create [GitHub Issues](https://github.com/pingcap/tiflash/issues) or submit your questions on [AskTUG](https://asktug.com/).

## What's next

- To check the TiFlash version, critical logs, and system tables, see [Maintain a TiFlash cluster](/tiflash/maintain-tiflash.md).
- To remove a specific TiFlash node, see [Scale in a TiFlash cluster](/scale-tidb-using-tiup.md#scale-in-a-tiflash-cluster).
213 changes: 213 additions & 0 deletions quick-start-with-htap.md
@@ -0,0 +1,213 @@
---
title: Quick start with HTAP
summary: Learn how to quickly get started with the TiDB HTAP.
---

# Quick Start Guide for TiDB HTAP

This guide walks you through the quickest way to get started with TiDB's one-stop solution of Hybrid Transactional and Analytical Processing (HTAP).

> **Note:**
>
> The steps provided in this guide are ONLY for a quick start in a test environment. For production environments, see [Explore HTAP](/explore-htap.md).

## Basic concepts

Before using TiDB HTAP, you need to have some basic knowledge about [TiKV](/tikv-overview.md), a row-based storage engine for TiDB Online Transactional Processing (OLTP), and [TiFlash](/tiflash/tiflash-overview.md), a columnar storage engine for TiDB Online Analytical Processing (OLAP).

- Storage engines of HTAP: The row-based storage engine and the columnar storage engine co-exist for HTAP. Both storage engines can replicate data automatically and keep strong consistency. The row-based storage engine optimizes OLTP performance, and the columnar storage engine optimizes OLAP performance.
- Data consistency of HTAP: As a distributed and transactional key-value database, TiKV provides transactional interfaces with ACID compliance, and guarantees data consistency between multiple replicas and high availability with the implementation of the [Raft consensus algorithm](https://raft.github.io/raft.pdf). As a columnar storage extension of TiKV, TiFlash replicates data from TiKV in real time according to the Raft Learner consensus algorithm, which ensures that data is strongly consistent between TiKV and TiFlash.
- Data isolation of HTAP: TiKV and TiFlash can be deployed on different machines as needed to solve the problem of HTAP resource isolation.
- MPP computing engine: [MPP](/tiflash/use-tiflash.md#control-whether-to-select-the-mpp-mode) is a distributed computing framework provided by the TiFlash engine since TiDB 5.0, which allows data exchange between nodes and provides high-performance, high-throughput SQL algorithms. In the MPP mode, the run time of the analytic queries can be significantly reduced.

## Steps

In this document, you can experience the convenience and high performance of TiDB HTAP by querying example tables in the popular [TPC-H](http://www.tpc.org/tpch/) dataset.

### Step 1. Deploy a local test environment

Before using TiDB HTAP, follow the steps in the [Quick Start Guide for the TiDB Database Platform](/quick-start-with-tidb.md) to prepare a local test environment, and run the following command to deploy a TiDB cluster:

{{< copyable "shell-regular" >}}

```shell
tiup playground
```

> **Note:**
>
> The `tiup playground` command is ONLY for quick start, NOT for production.

### Step 2. Prepare test data

In the following steps, you can create a [TPC-H](http://www.tpc.org/tpch/) dataset as the test data to use TiDB HTAP. If you are interested in TPC-H, see [General Implementation Guidelines](http://tpc.org/tpc_documents_current_versions/pdf/tpc-h_v3.0.0.pdf).

> **Note:**
>
> If you want to use your existing data for analytic queries, you can [migrate your data to TiDB](/migration-overview.md). If you want to design and create your own test data, you can create it by executing SQL statements or using related tools.

1. Install the test data generation tool by running the following command:

{{< copyable "shell-regular" >}}

```shell
tiup install bench
```

2. Generate the test data by running the following command:

{{< copyable "shell-regular" >}}

```shell
tiup bench tpch --sf=1 prepare
```

If the output of this command shows `Finished`, it indicates that the data is created.

3. Execute the following SQL statement to view the generated data:

{{< copyable "sql" >}}

```sql
SELECT
    CONCAT(table_schema,'.',table_name) AS 'Table Name',
    table_rows AS 'Number of Rows',
    FORMAT_BYTES(data_length) AS 'Data Size',
    FORMAT_BYTES(index_length) AS 'Index Size',
    FORMAT_BYTES(data_length+index_length) AS 'Total'
FROM
    information_schema.TABLES
WHERE
    table_schema='test';
```

As you can see from the output, eight tables are created in total, and the largest table has 6.5 million rows. The exact number of rows in your output might differ, because the data is randomly generated.

```sql
+---------------+----------------+------------+------------+------------+
| Table Name    | Number of Rows | Data Size  | Index Size | Total      |
+---------------+----------------+------------+------------+------------+
| test.nation   |             25 | 2.44 KiB   | 0 bytes    | 2.44 KiB   |
| test.region   |              5 | 416 bytes  | 0 bytes    | 416 bytes  |
| test.part     |         200000 | 25.07 MiB  | 0 bytes    | 25.07 MiB  |
| test.supplier |          10000 | 1.45 MiB   | 0 bytes    | 1.45 MiB   |
| test.partsupp |         800000 | 120.17 MiB | 12.21 MiB  | 132.38 MiB |
| test.customer |         150000 | 24.77 MiB  | 0 bytes    | 24.77 MiB  |
| test.orders   |        1527648 | 174.40 MiB | 0 bytes    | 174.40 MiB |
| test.lineitem |        6491711 | 849.07 MiB | 99.06 MiB  | 948.13 MiB |
+---------------+----------------+------------+------------+------------+
8 rows in set (0.06 sec)
```

This is a database of a commercial ordering system, in which the `test.nation` table contains information about countries, the `test.region` table about regions, the `test.part` table about parts, the `test.supplier` table about suppliers, the `test.partsupp` table about the parts of suppliers, the `test.customer` table about customers, the `test.orders` table about orders, and the `test.lineitem` table about line items.

### Step 3. Query data with the row-based storage engine

To see the performance of TiDB with only the row-based storage engine, execute the following SQL statement:

{{< copyable "sql" >}}

```sql
SELECT
    l_orderkey,
    SUM(
        l_extendedprice * (1 - l_discount)
    ) AS revenue,
    o_orderdate,
    o_shippriority
FROM
    customer,
    orders,
    lineitem
WHERE
    c_mktsegment = 'BUILDING'
    AND c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND o_orderdate < DATE '1996-01-01'
    AND l_shipdate > DATE '1996-02-01'
GROUP BY
    l_orderkey,
    o_orderdate,
    o_shippriority
ORDER BY
    revenue DESC,
    o_orderdate
LIMIT 10;
```

This is a shipping priority query, which returns the shipping priority and potential revenue of the highest-revenue orders that had not been shipped as of a specified date. The potential revenue is defined as the sum of `l_extendedprice * (1 - l_discount)`. The orders are listed in descending order of revenue. In this example, the query lists the top 10 unshipped orders by potential revenue.
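
To make the revenue definition concrete, here is the calculation for a single line item with made-up values (an extended price of 1000.00 and a 5% discount, neither of which is taken from the dataset):

{{< copyable "sql" >}}

```sql
-- Illustrative values only: 1000.00 * (1 - 0.05) evaluates to 950.
SELECT 1000.00 * (1 - 0.05) AS revenue;
```

The query in this step sums this per-line-item value over all line items that belong to the same order.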

### Step 4. Replicate the test data to the columnar storage engine

After TiFlash is deployed, TiKV does not replicate data to TiFlash immediately. You need to execute the following DDL statements in a MySQL client connected to TiDB to specify which tables need to be replicated. After that, TiDB creates the specified replicas in TiFlash accordingly.

{{< copyable "sql" >}}

```sql
ALTER TABLE test.customer SET TIFLASH REPLICA 1;
ALTER TABLE test.orders SET TIFLASH REPLICA 1;
ALTER TABLE test.lineitem SET TIFLASH REPLICA 1;
```

To check the replication status of the specific tables, execute the following statements:

{{< copyable "sql" >}}

```sql
SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'customer';
SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'orders';
SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'lineitem';
```

In the result of the above statements:

- `AVAILABLE` indicates whether the TiFlash replica of a specific table is available or not. `1` means available and `0` means unavailable. Once a replica becomes available, this status does not change any more. If you use DDL statements to modify the number of replicas, the replication status is recalculated.
- `PROGRESS` indicates the progress of the replication. The value is between `0.0` and `1.0`. `1` means that at least one replica is replicated.
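
A single query can report both columns for all three tables at once. This is only a convenience sketch over the same `information_schema.tiflash_replica` table that the statements above use:

{{< copyable "sql" >}}

```sql
-- Show replication availability and progress for the three replicated tables.
SELECT TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
FROM information_schema.tiflash_replica
WHERE TABLE_SCHEMA = 'test'
  AND TABLE_NAME IN ('customer', 'orders', 'lineitem');
```

When `AVAILABLE` is `1` in every returned row, the TiFlash replicas of all three tables can be used.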

### Step 5. Analyze data faster using HTAP

Execute the SQL statements in [Step 3](#step-3-query-data-with-the-row-based-storage-engine) again, and you can see the performance of TiDB HTAP.

For tables with TiFlash replicas, the TiDB optimizer automatically determines whether to use TiFlash replicas based on the cost estimation. To check whether or not a TiFlash replica is selected, you can use the `desc` or `explain analyze` statement. For example:

{{< copyable "sql" >}}

```sql
EXPLAIN ANALYZE SELECT
    l_orderkey,
    SUM(
        l_extendedprice * (1 - l_discount)
    ) AS revenue,
    o_orderdate,
    o_shippriority
FROM
    customer,
    orders,
    lineitem
WHERE
    c_mktsegment = 'BUILDING'
    AND c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND o_orderdate < DATE '1996-01-01'
    AND l_shipdate > DATE '1996-02-01'
GROUP BY
    l_orderkey,
    o_orderdate,
    o_shippriority
ORDER BY
    revenue DESC,
    o_orderdate
LIMIT 10;
```

If the result of the `EXPLAIN` statement shows `ExchangeSender` and `ExchangeReceiver` operators, it indicates that the MPP mode has taken effect.

In addition, you can specify that each part of the entire query is computed using only the TiFlash engine. For detailed information, see [Use TiDB to read TiFlash replicas](/tiflash/use-tiflash.md#use-tidb-to-read-tiflash-replicas).
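
One way to pin a table to a specific engine is the `READ_FROM_STORAGE` hint described in the linked document. The sketch below uses a simple count over `lineitem` rather than the full query from Step 3, purely to keep the example short. It assumes that the TiFlash replica of `lineitem` is already available.

{{< copyable "sql" >}}

```sql
USE test;

-- Read this table from the row-based storage engine (TiKV).
EXPLAIN ANALYZE SELECT /*+ READ_FROM_STORAGE(TIKV[lineitem]) */ COUNT(*) FROM lineitem;

-- Read the same table from the columnar storage engine (TiFlash).
EXPLAIN ANALYZE SELECT /*+ READ_FROM_STORAGE(TIFLASH[lineitem]) */ COUNT(*) FROM lineitem;
```

If the hints seem to have no effect, check whether your MySQL client strips comments before sending statements. Older `mysql` command-line clients do so unless they are started with the `--comments` option.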

You can compare query results and query performance of these two methods.

## What's next

- [Architecture of TiDB HTAP](/tiflash/tiflash-overview.md#architecture)
- [Explore HTAP](/explore-htap.md)
- [Use TiFlash](/tiflash/use-tiflash.md#use-tiflash)