7 changes: 6 additions & 1 deletion TOC.md
@@ -64,7 +64,6 @@
+ [BR Use Cases](/br/backup-and-restore-use-cases.md)
+ [External Storages](/br/backup-and-restore-storages.md)
+ [BR FAQ](/br/backup-and-restore-faq.md)
+ [Read Historical Data](/read-historical-data.md)
+ [Configure Time Zone](/configure-time-zone.md)
+ [Daily Checklist](/daily-check.md)
+ [Maintain TiFlash](/tiflash/maintain-tiflash.md)
@@ -141,6 +140,11 @@
+ Tutorials
+ [Multiple Data Centers in One City Deployment](/multi-data-centers-in-one-city-deployment.md)
+ [Three Data Centers in Two Cities Deployment](/three-data-centers-in-two-cities-deployment.md)
+ Read Historical Data
+ Use Stale Read (Recommended)
+ [Usage Scenarios of Stale Read](/stale-read.md)
+ [Perform Stale Read Using `AS OF TIMESTAMP`](/as-of-timestamp.md)
+ [Use the `tidb_snapshot` System Variable](/read-historical-data.md)
+ Best Practices
+ [Use TiDB](/best-practices/tidb-best-practices.md)
+ [Java Application Development](/best-practices/java-app-best-practices.md)
@@ -150,6 +154,7 @@
+ [PD Scheduling](/best-practices/pd-scheduling-best-practices.md)
+ [TiKV Performance Tuning with Massive Regions](/best-practices/massive-regions-best-practices.md)
+ [Three-node Hybrid Deployment](/best-practices/three-nodes-hybrid-deployment.md)
+ [Local Read Under Three Data Centers Deployment](/best-practices/three-dc-local-read.md)
+ [Use Placement Rules](/configure-placement-rules.md)
+ [Use Load Base Split](/configure-load-base-split.md)
+ [Use Store Limit](/configure-store-limit.md)
261 changes: 261 additions & 0 deletions as-of-timestamp.md
@@ -0,0 +1,261 @@
---
title: Read Historical Data Using the `AS OF TIMESTAMP` Clause
summary: Learn how to read historical data using the `AS OF TIMESTAMP` clause.
---

# Read Historical Data Using the `AS OF TIMESTAMP` Clause

This document describes how to use the [Stale Read](/stale-read.md) feature with the `AS OF TIMESTAMP` clause to read historical data in TiDB, including the syntax and specific usage examples.

TiDB supports reading historical data through a standard SQL interface, which is the `AS OF TIMESTAMP` SQL clause, without the need for special clients or drivers. After data is updated or deleted, you can read the historical data before the update or deletion using this SQL interface.

> **Note:**
>
> When reading historical data, TiDB returns the data with the old table structure even if the current table structure is different.

## Syntax

You can use the `AS OF TIMESTAMP` clause in the following three ways:

- [`SELECT ... FROM ... AS OF TIMESTAMP`](/sql-statements/sql-statement-select.md)
- [`START TRANSACTION READ ONLY AS OF TIMESTAMP`](/sql-statements/sql-statement-start-transaction.md)
- [`SET TRANSACTION READ ONLY AS OF TIMESTAMP`](/sql-statements/sql-statement-set-transaction.md)

To specify an exact point in time, use a datetime value or a time function in the `AS OF TIMESTAMP` clause. A datetime value looks like "2016-10-08 16:45:26.999", with millisecond as the smallest time unit. In most cases, second-level precision, such as "2016-10-08 16:45:26", is enough. You can also get the current time to the millisecond using the `NOW(3)` function. To read the data as of several seconds ago, it is **recommended** to use an expression such as `NOW() - INTERVAL 10 SECOND`.

To specify a time range, use the `TIDB_BOUNDED_STALENESS(t1, t2)` function in the clause, where `t1` and `t2` are the two ends of the time range, each specified using either a datetime value or a time function. When this function is used, TiDB selects a suitable timestamp within the specified time range. "Suitable" means that no transaction that started before this timestamp is still uncommitted on the accessed replica, so TiDB can perform read operations on that replica without being blocked.

Here are some examples of the `AS OF TIMESTAMP` clause:

- `AS OF TIMESTAMP '2016-10-08 16:45:26'`: Tells TiDB to read the latest data stored at 16:45:26 on October 8, 2016.
- `AS OF TIMESTAMP NOW() - INTERVAL 10 SECOND`: Tells TiDB to read the latest data stored 10 seconds ago.
- `AS OF TIMESTAMP TIDB_BOUNDED_STALENESS('2016-10-08 16:45:26', '2016-10-08 16:45:29')`: Tells TiDB to read the data as new as possible within the time range of 16:45:26 to 16:45:29 on October 8, 2016.
- `AS OF TIMESTAMP TIDB_BOUNDED_STALENESS(NOW() - INTERVAL 20 SECOND, NOW())`: Tells TiDB to read the data as new as possible within the time range of 20 seconds ago to the present.

Besides reading data as of a specific timestamp, the most common use of the `AS OF TIMESTAMP` clause is to read data that is several seconds old. In this case, it is recommended to read historical data that is older than 5 seconds.
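
The clause forms listed above can be placed in complete statements. The following is a minimal sketch rather than a definitive reference: `t` is a placeholder table name, and the sketch assumes that the history for the requested times is still available in the cluster.

```sql
-- Read table t (a placeholder name) as of a fixed point in time.
SELECT * FROM t AS OF TIMESTAMP '2016-10-08 16:45:26';

-- Read table t as of about 10 seconds ago, using a time expression.
SELECT * FROM t AS OF TIMESTAMP NOW() - INTERVAL 10 SECOND;

-- Let TiDB choose a suitable timestamp within the last 20 seconds.
SELECT * FROM t AS OF TIMESTAMP TIDB_BOUNDED_STALENESS(NOW() - INTERVAL 20 SECOND, NOW());
```

The first two statements pin the read to one point in time, while the last one trades an exact timestamp for a read that is not blocked on the accessed replica.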

## Usage examples

This section describes different ways to use the `AS OF TIMESTAMP` clause with several examples. It first prepares sample data, and then shows how to use `AS OF TIMESTAMP` in the `SELECT`, `START TRANSACTION READ ONLY AS OF TIMESTAMP`, and `SET TRANSACTION READ ONLY AS OF TIMESTAMP` statements respectively.

### Prepare sample data

To prepare the sample data, create a table first and insert several rows of data:

```sql
create table t (c int);
```

```
Query OK, 0 rows affected (0.01 sec)
```

```sql
insert into t values (1), (2), (3);
```

```
Query OK, 3 rows affected (0.00 sec)
```

View the data in the table:

```sql
select * from t;
```

```
+------+
| c    |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

View the current time:

```sql
select now();
```

```
+---------------------+
| now()               |
+---------------------+
| 2021-05-26 16:45:26 |
+---------------------+
1 row in set (0.00 sec)
```

Update the data in a row:

```sql
update t set c=22 where c=2;
```

```
Query OK, 1 row affected (0.00 sec)
```

Confirm that the data of the row is updated:

```sql
select * from t;
```

```
+------+
| c    |
+------+
|    1 |
|   22 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

### Read historical data using the `SELECT` statement

You can use the [`SELECT ... FROM ... AS OF TIMESTAMP`](/sql-statements/sql-statement-select.md) statement to read data from a time point in the past.

```sql
select * from t as of timestamp '2021-05-26 16:45:26';
```

```
+------+
| c    |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

> **Note:**
>
> When reading multiple tables in one `SELECT` statement, make sure that the timestamp expressions are in a consistent format. For example, `select * from t as of timestamp NOW() - INTERVAL 2 SECOND, c as of timestamp NOW() - INTERVAL 2 SECOND;`. In addition, you must specify the `AS OF` clause for each relevant table in the `SELECT` statement; otherwise, the `SELECT` statement reads the latest data for that table by default.
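
The multi-table case in the note can also be written with a fixed timestamp. This is only a sketch: `t2` is a hypothetical second table that is assumed to have existed at the given time, and it is not part of the sample data prepared earlier.

```sql
-- t2 is a hypothetical second table, assumed to exist alongside t.
-- Both table references carry the same timestamp expression, so the
-- statement reads the two tables as of the same point in time.
SELECT *
FROM t AS OF TIMESTAMP '2021-05-26 16:45:26',
     t2 AS OF TIMESTAMP '2021-05-26 16:45:26';
```

Repeating the identical expression on every table reference keeps the format consistent, as the note requires.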

### Read historical data using the `START TRANSACTION READ ONLY AS OF TIMESTAMP` statement

You can use the [`START TRANSACTION READ ONLY AS OF TIMESTAMP`](/sql-statements/sql-statement-start-transaction.md) statement to start a read-only transaction based on a time point in the past. The transaction reads historical data of the given time.

```sql
start transaction read only as of timestamp '2021-05-26 16:45:26';
```

```
Query OK, 0 rows affected (0.00 sec)
```

```sql
select * from t;
```

```
+------+
| c    |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

```sql
commit;
```

```
Query OK, 0 rows affected (0.00 sec)
```

After the transaction is committed, you can read the latest data.

```sql
select * from t;
```

```
+------+
| c    |
+------+
|    1 |
|   22 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

> **Note:**
>
> If you start a transaction with the statement `START TRANSACTION READ ONLY AS OF TIMESTAMP`, it is a read-only transaction. Write operations are rejected in this transaction.
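
The read-only behavior described in the note can be observed with a short session such as the following. This is a sketch that reuses the sample table `t`. The exact error text that TiDB returns for the write is not shown here because it is not given in this document.

```sql
-- Start a read-only transaction based on a time point about 10 seconds ago.
START TRANSACTION READ ONLY AS OF TIMESTAMP NOW() - INTERVAL 10 SECOND;

-- Reads inside the transaction see the historical data.
SELECT * FROM t;

-- This write is expected to be rejected, because the transaction is read-only.
UPDATE t SET c = 33 WHERE c = 3;

-- End the transaction. Later statements read the latest data again.
COMMIT;
```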

### Read historical data using the `SET TRANSACTION READ ONLY AS OF TIMESTAMP` statement

You can use the [`SET TRANSACTION READ ONLY AS OF TIMESTAMP`](/sql-statements/sql-statement-set-transaction.md) statement to set the next transaction as a read-only transaction based on a specified time point in the past. The transaction reads historical data of the given time.

```sql
set transaction read only as of timestamp '2021-05-26 16:45:26';
```

```
Query OK, 0 rows affected (0.00 sec)
```

```sql
begin;
```

```
Query OK, 0 rows affected (0.00 sec)
```

```sql
select * from t;
```

```
+------+
| c    |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

```sql
commit;
```

```
Query OK, 0 rows affected (0.00 sec)
```

After the transaction is committed, you can read the latest data.

```sql
select * from t;
```

```
+------+
| c    |
+------+
|    1 |
|   22 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

> **Note:**
>
> A transaction that is configured using the `SET TRANSACTION READ ONLY AS OF TIMESTAMP` statement is a read-only transaction. Write operations are rejected in this transaction.
29 changes: 29 additions & 0 deletions best-practices/three-dc-local-read.md
@@ -0,0 +1,29 @@
---
title: Local Read under Three Data Centers Deployment
summary: Learn how to use the Stale Read feature to read local data under three DCs deployment and thus reduce cross-center requests.
---

# Local Read under Three Data Centers Deployment

In the three-data-center model, a Region has three replicas, one in each data center. However, because strongly consistent reads are required, TiDB must access the Leader replica of the corresponding data for every query. If a query originates in a data center other than the one that hosts the Leader replica, TiDB needs to read data from another data center, which increases the access latency.

This document describes how to use the [Stale Read](/stale-read.md) feature to avoid cross-center access and reduce the access latency, at the cost of reading data that is not the latest.

## Deploy a TiDB cluster of three data centers

For the three-data-center deployment method, refer to [Multiple Data Centers in One City Deployment](/multi-data-centers-in-one-city-deployment.md).

Note that if both the TiKV and TiDB nodes have the configuration item `labels` configured, the TiKV and TiDB nodes in the same data center must have the same value for the `zone` label. For example, if a TiKV node and a TiDB node are both in the data center `dc-1`, then the two nodes need to be configured with the following label:

```
[labels]
zone=dc-1
```

## Perform local read using Stale Read

[Stale Read](/stale-read.md) is a mechanism that TiDB provides for reading historical data. Using this mechanism, you can read the historical data as of a specific point in time or within a specified time range, and thus avoid the latency brought by data replication between storage nodes. When you use Stale Read in some geo-distributed deployment scenarios, TiDB accesses the replica in the current data center to read the corresponding data at the expense of some real-time performance, which avoids the network latency of cross-center connections and reduces the access latency of the entire query process.

When TiDB receives a Stale Read query, if the `zone` label of that TiDB node is configured, then TiDB sends the request to the TiKV node with the same `zone` label where the corresponding data replica resides.
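
As an illustration, a Stale Read query sent to a TiDB node might look like the following sketch. The table name `t` and the staleness values are placeholders chosen for this example. Whether the request is served by a replica in the local data center depends on the `zone` labels being configured as described above.

```sql
-- Read table t (a placeholder name) as of about 5 seconds ago.
SELECT * FROM t AS OF TIMESTAMP NOW() - INTERVAL 5 SECOND;

-- Alternatively, let TiDB choose a suitable timestamp within the last 20 seconds.
SELECT * FROM t AS OF TIMESTAMP TIDB_BOUNDED_STALENESS(NOW() - INTERVAL 20 SECOND, NOW());
```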

For how to perform Stale Read, see [Perform Stale Read using the `AS OF TIMESTAMP` clause](/as-of-timestamp.md).
2 changes: 1 addition & 1 deletion pd-control.md
@@ -326,7 +326,7 @@ Usage:

- `enable-debug-metrics` is used to enable the metrics for debugging. When you set it to `true`, PD enables some metrics such as `balance-tolerant-size`.

- `enable-placement-rules` is used to enable placement rules.
- `enable-placement-rules` is used to enable placement rules. Placement rules are enabled by default in v5.0 and later versions.

- `store-limit-mode` is used to control the mode of limiting the store speed. The optional modes are `auto` and `manual`. In `auto` mode, the stores are automatically balanced according to the load (experimental).

20 changes: 13 additions & 7 deletions read-historical-data.md
@@ -1,19 +1,25 @@
---
title: Read Historical Data
summary: Learn about how TiDB reads data from history versions.
title: Read Historical Data Using the System Variable `tidb_snapshot`
summary: Learn about how TiDB reads data from history versions using the system variable `tidb_snapshot`.
aliases: ['/docs/dev/read-historical-data/','/docs/dev/how-to/get-started/read-historical-data/']
---

# Read Historical Data
# Read Historical Data Using the System Variable `tidb_snapshot`

This document describes how TiDB reads data from the history versions, how TiDB manages the data versions, as well as an example to show how to use the feature.
This document describes how to read data from the history versions using the system variable `tidb_snapshot`, including specific usage examples and strategies for saving historical data.

> **Note:**
>
> You can also use the [Stale Read](/stale-read.md) feature to read historical data, which is the recommended method.

## Feature description

TiDB implements a feature to read history data using the standard SQL interface directly without special clients or drivers. By using this feature:
TiDB implements a feature to read history data using the standard SQL interface directly without special clients or drivers.

- Even when data is updated or removed, its history versions can be read using the SQL interface.
- Even if the table structure changes after the data is updated, TiDB can use the old structure to read the history data.
> **Note:**
>
> - Even when data is updated or removed, its history versions can be read using the SQL interface.
> - When reading historical data, TiDB returns the data with the old table structure even if the current table structure is different.

## How TiDB reads data from history versions

8 changes: 7 additions & 1 deletion sql-statements/sql-statement-select.md
@@ -32,7 +32,13 @@ The `SELECT` statement is used to read data from TiDB.

**TableRefsClause:**

![TableRefsClause](/media/sqlgram/TableRefsClause.png)
```ebnf+diagram
TableRefsClause ::=
TableRef AsOfClause? ( ',' TableRef AsOfClause? )*

AsOfClause ::=
'AS' 'OF' 'TIMESTAMP' Expression
```

**WhereClauseOptional:**
