reference, media: add tiflash overview doc #2155
Merged
22 commits
dd8461c
reference, media: add tiflash overview doc
ran-huang e7d7951
update pic
ran-huang b2a163d
update reference link
ran-huang 0831967
Update reference/tiflash/overview.md
ilovesoup 2a6bab9
Update reference/tiflash/overview.md
ilovesoup 9cb8a3c
Update reference/tiflash/overview.md
ilovesoup 358a9b3
Update reference/tiflash/overview.md
ilovesoup 30f77f4
Update overview.md
ilovesoup 549f3e9
Update reference/tiflash/overview.md
ilovesoup b45dbe1
Update reference/tiflash/overview.md
ilovesoup eb214f4
Update reference/tiflash/overview.md
ilovesoup 15a78fe
Update reference/tiflash/overview.md
ilovesoup e8f21e6
Update reference/tiflash/overview.md
ilovesoup 6eba9c0
Update reference/tiflash/overview.md
ilovesoup 0cc91aa
Update reference/tiflash/overview.md
ilovesoup 9bcd5c2
Update reference/tiflash/overview.md
ilovesoup 198f014
Update reference/tiflash/overview.md
ilovesoup 2227662
Update overview.md
ilovesoup 227ed2e
Update reference/tiflash/overview.md
yikeke 83fa9b7
Merge branch 'master' into tiflash-overview
yikeke d9726c9
Merge branch 'master' into tiflash-overview
TomShawn 702c3d9
Update TOC.md
ran-huang
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,70 @@ | ||
| --- | ||
| title: TiFlash Overview | ||
| summary: Learn the architecture and key features of TiFlash. | ||
| category: reference | ||
| --- | ||
|
|
||
| # TiFlash Overview | ||
|
|
||
| TiFlash is the key component that makes TiDB essentially a Hybrid Transactional/Analytical Processing (HTAP) database. As a columnar storage extension of TiKV, TiFlash provides both good isolation and a strong consistency guarantee. | ||
|
|
||
| In TiFlash, the columnar replicas are replicated asynchronously in the Learner role of the Raft consensus algorithm. When these replicas are read, the Snapshot Isolation level of consistency is achieved by validating the Raft index and using multi-version concurrency control (MVCC). | ||
|
|
||
| ## Architecture | ||
|
|
||
|  | ||
|
|
||
| The figure above shows the architecture of TiDB in its HTAP form, including TiFlash nodes. | ||
|
|
||
| TiFlash provides columnar storage, with a layer of coprocessors efficiently implemented by ClickHouse. Like TiKV, TiFlash has a Multi-Raft system, which supports replicating and distributing data in units of Regions (see [Data Storage](https://pingcap.com/blog/2017-07-11-tidbinternal1/) for details). | ||
|
|
||
| TiFlash replicates data from the TiKV nodes in real time, at a low cost and without blocking writes in TiKV. Meanwhile, it provides the same read consistency as TiKV and ensures that the latest data is read. A Region replica in TiFlash is logically identical to its counterparts in TiKV, and is split and merged together with the Leader replica in TiKV. | ||
|
|
||
| TiFlash is compatible with both TiDB and TiSpark, which enables you to freely choose between these two computing engines. | ||
|
|
||
| It is recommended that you deploy TiFlash on different nodes from TiKV to ensure workload isolation. It is also acceptable to deploy TiFlash and TiKV on the same node if workload isolation is not required. | ||
|
|
||
| Currently, data cannot be written directly into TiFlash. You need to write data into TiKV and then replicate it to TiFlash, because TiFlash connects to the TiDB cluster in the Learner role. TiFlash supports data replication in units of tables, but no data is replicated by default after deployment. To replicate data of a specified table, see [Create TiFlash replicas for tables](/reference/tiflash/use-tiflash.md#create-tiflash-replicas-for-tables). | ||
|
|
||
| TiFlash has three components: the columnar storage module, `tiflash proxy`, and `pd buddy`. `tiflash proxy` handles communication through the Multi-Raft consensus algorithm. `pd buddy` works with PD to replicate data from TiKV to TiFlash in units of tables. | ||
|
|
||
| When TiDB receives the DDL command to create replicas in TiFlash, the `pd buddy` component obtains information about the table to be replicated through the status port of TiDB, and sends that information to PD. PD then schedules the corresponding data according to the information provided by `pd buddy`. | ||
|
|
||
| ## Key features | ||
|
|
||
| TiFlash has the following key features: | ||
|
|
||
| - [Asynchronous replication](#asynchronous-replication) | ||
| - [Consistency](#consistency) | ||
| - [Intelligent choice](#intelligent-choice) | ||
| - [Computing acceleration](#computing-acceleration) | ||
|
|
||
| ### Asynchronous replication | ||
|
|
||
| A replica in TiFlash is replicated asynchronously in a special role, the Raft Learner. This means that when a TiFlash node is down or network latency is high, applications in TiKV can still proceed normally. | ||
|
|
||
| This replication mechanism inherits two advantages of TiKV: automatic load balancing and high availability. | ||
|
|
||
| - TiFlash does not rely on additional replication channels, but directly receives data from TiKV in a many-to-many manner. | ||
| - As long as the data is not lost in TiKV, you can restore the replica in TiFlash at any time. | ||
|
|
||
| ### Consistency | ||
|
|
||
| TiFlash provides the same Snapshot Isolation level of consistency as TiKV, and ensures that the latest data is read, which means that you can read the data previously written in TiKV. Such consistency is achieved by validating the data replication progress. | ||
|
|
||
| Every time TiFlash receives a read request, the Region replica sends a progress validation request (a lightweight RPC request) to the Leader replica. TiFlash performs the read operation only after the current replication progress includes the data covered by the timestamp of the read request. | ||
|
|
||
| ### Intelligent choice | ||
|
|
||
| TiDB can automatically choose to use TiFlash (column-wise) or TiKV (row-wise), or use both of them in one query to ensure the best performance. | ||
|
|
||
| This selection mechanism is similar to the way TiDB chooses among different indexes to execute a query. The TiDB optimizer makes the appropriate choice based on statistics of the read cost. | ||
|
|
||
| ### Computing acceleration | ||
|
|
||
| TiFlash accelerates the computing of TiDB in two ways: | ||
|
|
||
| - The columnar storage engine is more efficient at performing read operations. | ||
| - TiFlash shares part of the computing workload of TiDB. | ||
|
|
||
| TiFlash shares the computing workload in the same way as the TiKV Coprocessor does: TiDB pushes down the computing that can be completed in the storage layer. Whether the computing can be pushed down depends on the support of TiFlash. For details, see [Supported pushdown calculations](/reference/tiflash/use-tiflash.md#supported-push-down-calculations). | ||
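
The read path described under "Consistency" in the diff above (validate replication progress with the Leader, then serve an MVCC read at the request timestamp) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not TiFlash's implementation: the `Leader` and `LearnerReplica` classes, their methods, and the in-memory log are all hypothetical names invented for illustration, and the real system waits for asynchronous replication over RPC instead of pulling entries in a loop.

```python
from dataclasses import dataclass, field


@dataclass
class Leader:
    """Hypothetical stand-in for a TiKV Region Leader with an append-only log."""
    log: list = field(default_factory=list)  # entries: (commit_ts, key, value)

    def write(self, commit_ts, key, value):
        self.log.append((commit_ts, key, value))

    def commit_index(self):
        # Reply to a progress validation request: index of the latest entry.
        return len(self.log)


@dataclass
class LearnerReplica:
    """Hypothetical stand-in for a TiFlash Region replica (Raft Learner)."""
    leader: Leader
    applied_index: int = 0
    versions: dict = field(default_factory=dict)  # key -> [(commit_ts, value)]

    def replicate(self, max_entries=None):
        # Apply up to max_entries pending log entries from the Leader.
        end = len(self.leader.log)
        if max_entries is not None:
            end = min(end, self.applied_index + max_entries)
        for commit_ts, key, value in self.leader.log[self.applied_index:end]:
            self.versions.setdefault(key, []).append((commit_ts, value))
        self.applied_index = end

    def read(self, key, read_ts):
        # 1. Progress validation: ask the Leader how far the log has advanced.
        required = self.leader.commit_index()
        # 2. Do not read until local progress covers that index.
        #    (Simplification: the toy pulls entries instead of waiting.)
        while self.applied_index < required:
            self.replicate(max_entries=1)
        # 3. MVCC: return the newest version with commit_ts <= read_ts.
        visible = [(ts, v) for ts, v in self.versions.get(key, []) if ts <= read_ts]
        return max(visible, key=lambda pair: pair[0])[1] if visible else None
```

In this model a replica that has applied only part of the log still returns the latest committed value, because `read` catches up to the Leader's index before consulting its local versions. An older `read_ts` sees the older version, which is the snapshot behavior the document describes.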