Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
117 changes: 29 additions & 88 deletions sql-statements/sql-statement-split-region.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,42 +17,33 @@ To solve the hotspot problem in the above scenario, TiDB introduces the pre-spli

## Synopsis

**SplitRegionStmt:**
```ebnf+diagram
SplitRegionStmt ::=
"SPLIT" SplitSyntaxOption "TABLE" TableName PartitionNameList? ("INDEX" IndexName)? SplitOption

![SplitRegionStmt](/media/sqlgram/SplitRegionStmt.png)
SplitSyntaxOption ::=
("REGION" "FOR")? "PARTITION"?

**SplitSyntaxOption:**
TableName ::=
(SchemaName ".")? Identifier

![SplitSyntaxOption](/media/sqlgram/SplitSyntaxOption.png)
PartitionNameList ::=
"PARTITION" "(" PartitionName ("," PartitionName)* ")"

**TableName:**
SplitOption ::=
("BETWEEN" RowValue "AND" RowValue "REGIONS" NUM
| "BY" RowValue ("," RowValue)* )

![TableName](/media/sqlgram/TableName.png)

**PartitionNameListOpt:**

![PartitionNameListOpt](/media/sqlgram/PartitionNameListOpt.png)

**SplitOption:**

![SplitOption](/media/sqlgram/SplitOption.png)

**RowValue:**

![RowValue](/media/sqlgram/RowValue.png)

**Int64Num:**

![Int64Num](/media/sqlgram/Int64Num.png)
RowValue ::=
"(" ValuesOpt ")"
```

## Usage of Split Region

There are two types of Split Region syntax:

- The syntax of even split:

{{< copyable "sql" >}}

```sql
SPLIT TABLE table_name [INDEX index_name] BETWEEN (lower_value) AND (upper_value) REGIONS region_num
```
Expand All @@ -61,8 +52,6 @@ There are two types of Split Region syntax:

- The syntax of uneven split:

{{< copyable "sql" >}}

```sql
SPLIT TABLE table_name [INDEX index_name] BY (value_list) [, (value_list)] ...
```
Expand Down Expand Up @@ -111,16 +100,12 @@ Because `row_id` is an integer, the value of the key to be split can be calculat

For example, if you want 16 evenly split Regions split from key range`minInt64`~`maxInt64` for table t, you can use this statement:

{{< copyable "sql" >}}

```sql
SPLIT TABLE t BETWEEN (-9223372036854775808) AND (9223372036854775807) REGIONS 16;
```

This statement splits table t into 16 Regions between minInt64 and maxInt64. If the given primary key range is smaller than the specified one, for example, 0~1000000000, you can use 0 and 1000000000 take place of minInt64 and maxInt64 respectively to split Regions.

{{< copyable "sql" >}}

```sql
SPLIT TABLE t BETWEEN (0) AND (1000000000) REGIONS 16;
```
Expand All @@ -129,8 +114,6 @@ SPLIT TABLE t BETWEEN (0) AND (1000000000) REGIONS 16;

If the known data is unevenly distributed, and you want a Region to be split respectively in key ranges -inf ~ 10000, 10000 ~ 90000, and 90000 ~ +inf, you can achieve this by setting fixed points, as shown below:

{{< copyable "sql" >}}

```sql
SPLIT TABLE t BY (10000), (90000);
```
Expand Down Expand Up @@ -159,8 +142,6 @@ The values of `upper` and `lower` are encoded into a byte array firstly. After r

If the column of the `idx` index is of the integer type, you can use the following SQL statement to split index data:

{{< copyable "sql" >}}

```sql
SPLIT TABLE t INDEX idx BETWEEN (-9223372036854775808) AND (9223372036854775807) REGIONS 16;
```
Expand All @@ -169,8 +150,6 @@ This statement splits the Region of index idx in table t into 16 Regions from `m

If the column of index idx1 is of varchar type, and you want to split index data by prefix letters.

{{< copyable "sql" >}}

```sql
SPLIT TABLE t INDEX idx1 BETWEEN ("a") AND ("z") REGIONS 25;
```
Expand All @@ -179,8 +158,6 @@ This statement splits index idx1 into 25 Regions from a~z. The range of Region 1

In the split method above, both data with the `y` and `z` prefixes are written into Region 25, because the upper bound is not `z`, but `{` (the character next to `z` in ASCII). Therefore, a more accurate split method is as follows:

{{< copyable "sql" >}}

```sql
SPLIT TABLE t INDEX idx1 BETWEEN ("a") AND ("{") REGIONS 26;
```
Expand All @@ -189,8 +166,6 @@ This statement splits index idx1 of the table `t` into 26 Regions from a~`{`. Th

If the column of index `idx2` is of time type like timestamp/datetime, and you want to split index Region by year:

{{< copyable "sql" >}}

```sql
SPLIT TABLE t INDEX idx2 BETWEEN ("2010-01-01 00:00:00") AND ("2020-01-01 00:00:00") REGIONS 10;
```
Expand All @@ -199,8 +174,6 @@ This statement splits the Region of index `idx2` in table `t` into 10 Regions fr

If you want to split the index Region by day, see the following example:

{{< copyable "sql" >}}

```sql
SPLIT TABLE t INDEX idx2 BETWEEN ("2020-06-01 00:00:00") AND ("2020-07-01 00:00:00") REGIONS 30;
```
Expand All @@ -213,16 +186,12 @@ For data Region split of joint indexes, the only difference is that you can spec

For example, index `idx3 (a, b)` contains 2 columns, with column `a` of timestamp type and column `b` int. If you just want to do a time range split according to column `a`, you can use the SQL statement for splitting time index of a single column. In this case, do not specify the value of column `b` in `lower_value` and `upper_velue`.

{{< copyable "sql" >}}

```sql
SPLIT TABLE t INDEX idx3 BETWEEN ("2010-01-01 00:00:00") AND ("2020-01-01 00:00:00") REGIONS 10;
```

Within the same range of time, if you want to do one more split according to column b column. Just specify the value for column b when splitting.

{{< copyable "sql" >}}

```sql
SPLIT TABLE t INDEX idx3 BETWEEN ("2010-01-01 00:00:00", "a") AND ("2010-01-01 00:00:00", "z") REGIONS 10;
```
Expand All @@ -241,8 +210,6 @@ Index data can also be split by specified index values.

For example, there is `idx4 (a,b)`, with column `a` of the varchar type and column `b` of the timestamp type.

{{< copyable "sql" >}}

```sql
SPLIT TABLE t1 INDEX idx4 BY ("a", "2000-01-01 00:00:01"), ("b", "2019-04-17 14:26:19"), ("c", "");
```
Expand All @@ -262,16 +229,12 @@ Splitting Regions for partitioned tables is the same as splitting Regions for or

+ The syntax of even split:

{{< copyable "sql" >}}

```sql
SPLIT [PARTITION] TABLE t [PARTITION] [(partition_name_list...)] [INDEX index_name] BETWEEN (lower_value) AND (upper_value) REGIONS region_num
```

+ The syntax of uneven split:

{{< copyable "sql" >}}

```sql
SPLIT [PARTITION] TABLE table_name [PARTITION (partition_name_list...)] [INDEX index_name] BY (value_list) [, (value_list)] ...
```
Expand All @@ -280,18 +243,14 @@ Splitting Regions for partitioned tables is the same as splitting Regions for or

1. Create a partitioned table `t`. Suppose that you want to create a Hash table divided into two partitions. The example statement is as follows:

{{< copyable "sql" >}}

```sql
create table t (a int,b int,index idx(a)) partition by hash(a) partitions 2;
CREATE TABLE t (a INT, b INT, INDEX idx(a)) PARTITION BY HASH(a) PARTITIONS 2;
```

After creating the table `t`, a Region is split for each partition. Use the `SHOW TABLE REGIONS` syntax to view the Regions of this table:

{{< copyable "sql" >}}
After creating the table `t`, a Region is split for each partition. Use the [`SHOW TABLE REGIONS`](/sql-statements/sql-statement-show-table-regions.md) syntax to view the Regions of this table:

```sql
show table t regions;
SHOW TABLE t REGIONS;
```

```sql
Expand All @@ -305,8 +264,6 @@ Splitting Regions for partitioned tables is the same as splitting Regions for or

2. Use the `SPLIT` syntax to split a Region for each partition. Suppose that you want to split the data in the `[0,10000]` range of each partition into four Regions. The example statement is as follows:

{{< copyable "sql" >}}

```sql
split partition table t between (0) and (10000) regions 4;
```
Expand All @@ -319,10 +276,8 @@ Splitting Regions for partitioned tables is the same as splitting Regions for or

3. Use the `SHOW TABLE REGIONS` syntax to view the Regions of this table again. You can see that this table now has ten Regions, each partition with five Regions, four of which are the row data and one is the index data.

{{< copyable "sql" >}}

```sql
show table t regions;
SHOW TABLE t REGIONS;
```

```sql
Expand All @@ -344,10 +299,8 @@ Splitting Regions for partitioned tables is the same as splitting Regions for or

4. You can also split Regions for the index of each partition. For example, you can split the `[1000,10000]` range of the `idx` index into two Regions. The example statement is as follows:

{{< copyable "sql" >}}

```sql
split partition table t index idx between (1000) and (10000) regions 2;
SPLIT PARTITION TABLE t INDEX idx BETWEEN (1000) AND (10000) REGIONS 2;
```

#### Examples of Split Region for a single partition
Expand All @@ -356,37 +309,29 @@ You can specify the partition to be split.

1. Create a partitioned table. Suppose that you want to create a Range partitioned table split into three partitions. The example statement is as follows:

{{< copyable "sql" >}}

```sql
create table t ( a int, b int, index idx(b)) partition by range( a ) (
partition p1 values less than (10000),
partition p2 values less than (20000),
partition p3 values less than (MAXVALUE) );
CREATE TABLE t ( a INT, b INT, INDEX idx(b)) PARTITION BY RANGE( a ) (
PARTITION p1 VALUES LESS THAN (10000),
PARTITION p2 VALUES LESS THAN (20000),
PARTITION p3 VALUES LESS THAN (MAXVALUE) );
```

2. Suppose that you want to split the data in the `[0,10000]` range of the `p1` partition into two Regions. The example statement is as follows:

{{< copyable "sql" >}}

```sql
split partition table t partition (p1) between (0) and (10000) regions 2;
SPLIT PARTITION TABLE t PARTITION (p1) BETWEEN (0) AND (10000) REGIONS 2;
```

3. Suppose that you want to split the data in the `[10000,20000]` range of the `p2` partition into two Regions. The example statement is as follows:

{{< copyable "sql" >}}

```sql
split partition table t partition (p2) between (10000) and (20000) regions 2;
SPLIT PARTITION TABLE t PARTITION (p2) BETWEEN (10000) AND (20000) REGIONS 2;
```

4. You can use the `SHOW TABLE REGIONS` syntax to view the Regions of this table:

{{< copyable "sql" >}}

```sql
show table t regions;
SHOW TABLE t REGIONS;
```

```sql
Expand All @@ -403,10 +348,8 @@ You can specify the partition to be split.

5. Suppose that you want to split the `[0,20000]` range of the `idx` index of the `p1` and `p2` partitions into two Regions. The example statement is as follows:

{{< copyable "sql" >}}

```sql
split partition table t partition (p1,p2) index idx between (0) and (20000) regions 2;
SPLIT PARTITION TABLE t PARTITION (p1,p2) INDEX idx BETWEEN (0) AND (20000) REGIONS 2;
```

## pre_split_regions
Expand All @@ -421,10 +364,8 @@ The `tidb_scatter_region` global variable affects the behavior of `PRE_SPLIT_REG

### Examples of pre_split_regions

{{< copyable "sql" >}}

```sql
create table t (a int, b int,index idx1(a)) shard_row_id_bits = 4 pre_split_regions=2;
CREATE TABLE t (a INT, b INT, INDEX idx1(a)) SHARD_ROW_ID_BITS = 4 PRE_SPLIT_REGIONS=2;
```

After building the table, this statement splits `4 + 1` Regions for table t. `4 (2^2)` Regions are used to save table row data, and 1 Region is for saving the index data of `idx1`.
Expand Down