@@ -32,7 +32,7 @@ This document focuses on how to create an index job, as well as some considerations
* bitmap index: a fast data structure that speeds up queries

## Basic Principles
- Creating and droping index is essentially a schema change job. For details, please refer to
+ Creating and dropping index is essentially a schema change job. For details, please refer to
[Schema Change](alter-table-schema-change.html).

## Syntax
@@ -53,12 +53,12 @@ create/drop index syntax
Please refer to [DROP INDEX](../../sql-reference/sql-statements/Data%20Definition/DROP%20INDEX.html) or [ALTER TABLE](../../sql-reference/sql-statements/Data%20Definition/ALTER%20TABLE.html)
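For a quick illustration, a minimal sketch of both statements (the database, table, and column names here are hypothetical):

```sql
-- Create a bitmap index on a single column of an existing table
CREATE INDEX siteid_idx ON example_db.example_tbl (siteid) USING BITMAP COMMENT 'bitmap index on siteid';

-- Drop the index once it is no longer needed
DROP INDEX siteid_idx ON example_db.example_tbl;
```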

## Create Job
- Please refer to [Scheam Change](alter-table-schema-change.html)
+ Please refer to [Schema Change](alter-table-schema-change.html)
## View Job
- Please refer to [Scheam Change](alter-table-schema-change.html)
+ Please refer to [Schema Change](alter-table-schema-change.html)

## Cancel Job
- Please refer to [Scheam Change](alter-table-schema-change.html)
+ Please refer to [Schema Change](alter-table-schema-change.html)

## Notice
* Currently only indexes of the bitmap type are supported.
@@ -1,6 +1,6 @@
---
{
"title": "Scheam Change",
"title": "Schema Change",
"language": "en"
}
---
@@ -24,17 +24,17 @@ specific language governing permissions and limitations
under the License.
-->

- # Scheam Change
+ # Schema Change

- Users can modify the schema of existing tables through the Scheam Change operation. Doris currently supports the following modifications:
+ Users can modify the schema of existing tables through the Schema Change operation. Doris currently supports the following modifications:

* Add and delete columns
* Modify column type
* Adjust column order
* Add and modify Bloom Filter
* Add and delete bitmap index

- This document mainly describes how to create a Scheam Change job, as well as some considerations and frequently asked questions about Scheam Change.
+ This document mainly describes how to create a Schema Change job, as well as some considerations and frequently asked questions about Schema Change.
## Glossary

* Base Table: When each table is created, it corresponds to a base table. The base table stores the complete data of the table. Rollups are usually created based on the data in the base table (and can also be created from other rollups).
@@ -68,9 +68,9 @@ The basic process of executing a Schema Change is to generate a copy of the index
Before starting the conversion of historical data, Doris obtains the latest transaction ID and waits for all import transactions before this transaction ID to complete. This transaction ID becomes a watershed: Doris guarantees that all import tasks after the watershed generate data for both the original index and the new index, so that when the historical data conversion is completed, the data in the new index is guaranteed to be complete.
## Create Job

- The specific syntax for creating a Scheam Change can be found in the description of the Scheam Change section in the help `HELP ALTER TABLE`.
+ The specific syntax for creating a Schema Change can be found in the description of the Schema Change section in the help `HELP ALTER TABLE`.

- The creation of Scheam Change is an asynchronous process. After the job is submitted successfully, the user needs to view the job progress through the `SHOW ALTER TABLE COLUMN` command.
+ The creation of Schema Change is an asynchronous process. After the job is submitted successfully, the user needs to view the job progress through the `SHOW ALTER TABLE COLUMN` command.
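As a minimal sketch (database, table, and column names are hypothetical):

```sql
-- Submit an asynchronous Schema Change job: add a nullable INT column
ALTER TABLE example_db.example_tbl ADD COLUMN new_col INT DEFAULT "0" AFTER siteid;

-- Later, check the progress of the job
SHOW ALTER TABLE COLUMN FROM example_db;
```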
## View Job

You can use `SHOW ALTER TABLE COLUMN` to view Schema Change jobs that are currently executing or have completed. When a Schema Change job involves multiple indexes, the command displays multiple lines, each corresponding to one index. For example:
docs/en/administrator-guide/backup-restore.md (4 changes: 2 additions & 2 deletions)
@@ -121,7 +121,7 @@ The commands related to the backup recovery function are as follows. The following
* Snapshot Finished Time: Snapshot completion time.
* Upload Finished Time: Snapshot upload completion time.
* FinishedTime: The completion time of this job.
- * Unfinished Tasks: In the `SNAPSHOTTING', `UPLOADING'and other stages, there will be multiple sub-tasks at the same time, the current stage shown here, the task ID of the unfinished sub-tasks.
+ * Unfinished Tasks: In stages such as `SNAPSHOTTING` and `UPLOADING`, multiple sub-tasks run at the same time; this field shows the task IDs of the sub-tasks in the current stage that are still unfinished.
* TaskErrMsg: If there is a sub-task execution error, the error message corresponding to the sub-task will be displayed here.
* Status: It is used to record some status information that may appear during the whole operation.
* Timeout: The timeout of the job, in seconds.
@@ -139,7 +139,7 @@ The commands related to the backup recovery function are as follows. The following
* Database: The database corresponding to backup.
* Details: Shows the complete data directory structure of the backup.

- 5. RESTOR
+ 5. RESTORE

Perform a recovery operation.

docs/en/administrator-guide/colocation-join.md (6 changes: 3 additions & 3 deletions)
@@ -57,7 +57,7 @@ In order for a table to have the same data distribution, the table in the same CG

Tables in the same CG do not require consistency in the number, scope, and type of partition columns.

- After fixing the number of bucket columns and buckets, the tables in the same CG will have the same Buckets Sequnce. The number of replicas determines the number of replicas of Tablets in each bucket, which BE they are stored on. Suppose that Buckets Sequnce is `[0, 1, 2, 3, 4, 5, 6, 7] `, and that BE nodes have `[A, B, C, D] `4. A possible distribution of data is as follows:
+ After fixing the number of bucket columns and buckets, the tables in the same CG will have the same Buckets Sequence. The number of replicas determines the number of replicas of the Tablets in each bucket and which BEs they are stored on. Suppose that the Buckets Sequence is `[0, 1, 2, 3, 4, 5, 6, 7]` and that there are 4 BE nodes `[A, B, C, D]`. A possible distribution of data is as follows:

```
+---+ +---+ +---+ +---+ +---+ +---+ +---+ +---+
...
```

@@ -141,7 +141,7 @@ SHOW PROC '/colocation_group/10005.10008';
* BucketIndex: Subscript to the bucket sequence.
* Backend Ids: A list of BE node IDs where data fragments are located in buckets.

- > The above commands require AMDIN privileges. Normal user view is not supported at this time.
+ > The above commands require ADMIN privileges. Normal user view is not supported at this time.

### Modify Colocate Group

@@ -172,7 +172,7 @@ Copies can only be stored on specified BE nodes. So when a BE is unavailable (down

### Replica Balancing

- Doris will try to distribute the fragments of the Collocation table evenly across all BE nodes. For the replica balancing of common tables, the granularity is single replica, that is to say, it is enough to find BE nodes with lower load for each replica alone. The equilibrium of the Colocation table is at the Bucket level, where all replicas within a Bucket migrate together. We adopt a simple equalization algorithm, which distributes Buckets Sequnce evenly on all BEs, regardless of the actual size of the replicas, but only according to the number of replicas. Specific algorithms can be referred to the code annotations in `ColocateTableBalancer.java`.
+ Doris will try to distribute the fragments of the Colocation table evenly across all BE nodes. For the replica balancing of ordinary tables, the granularity is a single replica; that is, it is enough to find a BE node with lower load for each replica alone. Balancing of the Colocation table is at the Bucket level, where all replicas within a Bucket migrate together. We adopt a simple balancing algorithm that distributes the Buckets Sequence evenly on all BEs, regardless of the actual size of the replicas, considering only the number of replicas. For the specific algorithm, refer to the code comments in `ColocateTableBalancer.java`.

> Note 1: The current Colocation replica balancing and repair algorithms may not work well for heterogeneously deployed Doris clusters, i.e. clusters where the BE nodes' disk capacity, disk count, or disk type (SSD and HDD) is inconsistent. In such heterogeneous deployments, small BE nodes and large BE nodes may store the same number of replicas.
>
docs/en/administrator-guide/config/be_config.md (4 changes: 2 additions & 2 deletions)
@@ -199,7 +199,7 @@ Similar to `base_compaction_trace_threshold`.
* Description: Configure the merge policy of the cumulative compaction stage. Currently, two merge policies have been implemented, num_based and size_based.
* Default value: size_based

- In detail, ordinary is the initial version of the cumulative compaction merge policy. After a cumulative compaction, the base compaction process is directly performed. The size_based policy is an optimized version of the ordinary strategy. Versions are merged only when the disk volume of the rowset is of the same order of magnitude. After the compaction, the output rowset which satifies the conditions is promoted to the base compaction stage. In the case of a large number of small batch imports: reduce the write magnification of base compact, trade-off between read magnification and space magnification, and reducing file version data.
+ In detail, the ordinary (num_based) policy is the initial version of the cumulative compaction merge policy: after a cumulative compaction, the base compaction process is performed directly. The size_based policy is an optimized version of that strategy: versions are merged only when the disk volume of the rowsets is of the same order of magnitude, and after the compaction, an output rowset that satisfies the conditions is promoted to the base compaction stage. In the case of a large number of small batch imports, this reduces the write amplification of base compaction, trades off between read amplification and space amplification, and reduces the number of file versions.
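As a sketch, selecting the policy in `be.conf` might look like this (the key name `cumulative_compaction_policy` is an assumption; verify it against your Doris version's BE config reference):

```
# be.conf: choose the cumulative compaction merge policy (num_based or size_based)
cumulative_compaction_policy = size_based
```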

### `cumulative_size_based_promotion_size_mbytes`

@@ -337,7 +337,7 @@ The default value is `false`.
* Default: false

The merged expired rowset version paths will be deleted after half an hour. In abnormal situations, deleting these versions can make it impossible to construct a consistent version path for a query. When this configuration is false, the program check is strict and the program will directly report an error and exit.
- When configured as true, the program will run normally and ignore this error. In general, ignoring this error will not affect the query, only when the merged version is dispathed by fe, -230 error will appear.
+ When configured as true, the program will run normally and ignore this error. In general, ignoring this error does not affect queries; only when the merged version is dispatched by the FE will a -230 error appear.

### inc_rowset_expired_sec

@@ -24,7 +24,7 @@ specific language governing permissions and limitations
under the License.
-->

- # Conection Action
+ # Connection Action

## Request

docs/en/administrator-guide/load-data/broker-load-manual.md (4 changes: 2 additions & 2 deletions)
@@ -164,7 +164,7 @@ The following is a detailed explanation of some parameters of the data description

+ negative

- ```data_desc``` can also set up data fetching and anti-importing. This function is mainly used when aggregated columns in data tables are of SUM type. If you want to revoke a batch of imported data. The `negative'parameter can be used as a batch of data. Doris automatically retrieves this batch of data on aggregated columns to eliminate the same batch of data.
+ `data_desc` can also be used to mark a batch of data as a negative import. This function is mainly used when the aggregate columns of the table are of SUM type: to revoke a previously imported batch, import the same batch again with the `negative` parameter set, and Doris automatically negates this batch on the aggregate columns, eliminating the earlier batch. A sketch follows.
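A minimal sketch of a negative broker load (the label, file path, table name, and broker settings are all hypothetical):

```sql
LOAD LABEL example_db.label_revoke_batch1
(
    DATA INFILE("hdfs://host:port/user/data/batch1.txt")
    NEGATIVE
    INTO TABLE `my_table`
)
WITH BROKER hdfs_broker
(
    "username" = "hdfs_user",
    "password" = "hdfs_passwd"
);
```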

+ partition

@@ -377,7 +377,7 @@ The following configurations belong to the Broker load system-level configuration

+ min\_bytes\_per\_broker\_scanner/max\_bytes\_per\_broker\_scanner/max\_broker\_concurrency

- The first two configurations limit the minimum and maximum amount of data processed by a single BE. The third configuration limits the maximum number of concurrent imports for a job. The minimum amount of data processed, the maximum number of concurrencies, the size of source files and the number of BEs in the current cluster **together determine the concurrency of this import**.
+ The first two configurations limit the minimum and maximum amount of data processed by a single BE. The third configuration limits the maximum number of concurrent instances for a job. The minimum amount of data processed, the maximum concurrency, the size of the source files and the number of BEs in the current cluster **together determine the concurrency of this import**.

```
The number of concurrent imports = Math.min(source file size / min_bytes_per_broker_scanner, max_broker_concurrency, current number of BE nodes)
```
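As a worked example with assumed values: a 3 GB (3072 MB) source file, `min_bytes_per_broker_scanner` = 64 MB, `max_broker_concurrency` = 10, and 5 BE nodes give Math.min(3072 / 64, 10, 5) = Math.min(48, 10, 5) = 5 concurrent instances.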
@@ -62,7 +62,7 @@ Usually used to troubleshoot network problems.

### `doris_be_snmp{name="tcp_in_segs"}`

- Value of the `Tcp: InSegs` field in `/proc/net/snmp`. Represents the number of receivied TCP packets.
+ Value of the `Tcp: InSegs` field in `/proc/net/snmp`. Represents the number of received TCP packets.

Using `(NEW_tcp_in_errs - OLD_tcp_in_errs) / (NEW_tcp_in_segs - OLD_tcp_in_segs)`, you can calculate the error rate of received TCP packets.

docs/en/community/committer-guide.md (2 changes: 1 addition & 1 deletion)
@@ -76,7 +76,7 @@ and you will be able to manage issues and pull request directly through our Github

5. Once a reviewer has commented on a PR, they need to keep following up on subsequent changes to that PR.

- 6. A PR must get at least a +1 appove from committer who is not the author.
+ 6. A PR must get at least one +1 approval from a committer who is not the author.

7. After the first +1 to the PR, wait at least one working day before merging. The main purpose is to wait for the rest of the community to come to review.

docs/en/getting-started/advance-usage.md (2 changes: 1 addition & 1 deletion)
@@ -114,7 +114,7 @@ After successful submission, you can view the progress of the job by the following command

When the job state is FINISHED, the job is completed.

- When Rollup is established, you can use `DESC table1 ALL'to view the Rollup information of the table.
+ When Rollup is established, you can use `DESC table1 ALL` to view the Rollup information of the table.

```
mysql> desc table1 all;
...
```
docs/en/getting-started/basic-usage.md (8 changes: 4 additions & 4 deletions)
@@ -72,9 +72,9 @@ Initially, a database can be created through root or admin users:

`CREATE DATABASE example_db;`

- > All commands can use'HELP command;'to see detailed grammar help. For example: `HELP CREATE DATABASE;'`
+ > All commands can use `HELP command;` to see detailed grammar help. For example: `HELP CREATE DATABASE;`

- > If you don't know the full name of the command, you can use "help command a field" for fuzzy query. If you type'HELP CREATE', you can match commands like `CREATE DATABASE', `CREATE TABLE', `CREATE USER', etc.
+ > If you don't know the full name of the command, you can use "help" plus a field of the command for fuzzy matching. If you type `HELP CREATE`, you can match commands like `CREATE DATABASE`, `CREATE TABLE`, `CREATE USER`, etc.

After the database is created, you can view the database information through `SHOW DATABASES`.
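Putting the two commands mentioned above together:

```sql
HELP CREATE DATABASE;   -- detailed syntax help for one command
SHOW DATABASES;         -- list the databases visible to the current user
```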

@@ -99,7 +99,7 @@ After the example_db is created, the read and write permissions of example_db can

### 2.3 Table Creation

- Create a table using the `CREATE TABLE'command. More detailed parameters can be seen:
+ Create a table using the `CREATE TABLE` command. More detailed parameters can be seen:

`HELP CREATE TABLE;`
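A minimal sketch of a table creation (an aggregate-model table; all names and values are illustrative):

```sql
CREATE TABLE example_db.table1
(
    siteid   INT         DEFAULT '10',
    citycode SMALLINT,
    username VARCHAR(32) DEFAULT '',
    pv       BIGINT SUM  DEFAULT '0'   -- SUM aggregate column: imports accumulate pv
)
AGGREGATE KEY(siteid, citycode, username)
DISTRIBUTED BY HASH(siteid) BUCKETS 10;
```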

@@ -315,7 +315,7 @@ Broker imports are asynchronous commands. Successful execution of the above commands

`SHOW LOAD WHERE LABEL = "table1_20170708";`

- In the return result, FINISHED in the `State'field indicates that the import was successful.
+ In the return result, FINISHED in the `State` field indicates that the import was successful.

For more instructions on `SHOW LOAD`, see `HELP SHOW LOAD;`

docs/en/getting-started/best-practice.md (4 changes: 2 additions & 2 deletions)
@@ -53,7 +53,7 @@ DISTRIBUTED BY HASH(siteid) BUCKETS 10;

1.1.2. UNIQUE KEY

- When UNIQUE KEY is the same, the new record covers the old record. At present, UNIQUE KEY implements the same RPLACE aggregation method as GGREGATE KEY, and they are essentially the same. Suitable for analytical business with updated requirements.
+ When UNIQUE KEY is the same, the new record covers the old record. At present, UNIQUE KEY implements the same REPLACE aggregation method as AGGREGATE KEY, and they are essentially the same. Suitable for analytical business with update requirements.

```
CREATE TABLE sales_order
...
```
@@ -141,7 +141,7 @@ For the `site_visit` table:
```
site_visit(siteid, city, username, pv)
```

- Siteid may lead to a low degree of data aggregation. If business parties often base their PV needs on city statistics, they can build a city-only, PV-based ollup:
+ Siteid may lead to a low degree of data aggregation. If business parties often base their PV needs on city statistics, they can build a city-only, PV-based rollup:

```
ALTER TABLE site_visit ADD ROLLUP rollup_city(city, pv);
```
docs/en/installing/compilation.md (2 changes: 1 addition & 1 deletion)
@@ -84,7 +84,7 @@ Note: For different versions of Doris, you need to download the corresponding mirror

### Self-compiling Development Environment Mirror

- You can also create a Doris development environment mirror yourself, referring specifically to the `docker/README.md'file.
+ You can also create a Doris development environment mirror yourself, referring specifically to the `docker/README.md` file.
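A sketch of the typical flow with the prebuilt image (the image tag is an assumption; check `docker/README.md` for the exact name):

```
docker pull apache/incubator-doris:build-env
docker run -it apache/incubator-doris:build-env
```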


## Direct Compilation (CentOS/Ubuntu)
@@ -28,7 +28,7 @@ under the License.
## Description

This statement is used to set the configuration items for the cluster (currently only the configuration items for setting FE are supported).
- Settable configuration items can be viewed through AMDIN SHOW FRONTEND CONFIG; commands.
+ Settable configuration items can be viewed with the `ADMIN SHOW FRONTEND CONFIG;` command.
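For example (a sketch; the config key and value shown here are illustrative):

```sql
ADMIN SHOW FRONTEND CONFIG;
ADMIN SET FRONTEND CONFIG ("disable_balance" = "true");
```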

Grammar:

@@ -33,11 +33,7 @@ grammar

ALTER CLUSTER cluster_name PROPERTIES ("key"="value", ...);

- 1. Scaling, scaling (according to the number of be existing in the cluster, large is scaling, small is scaling), scaling for synchronous operation, scaling for asynchronous operation, through the state of backend can be known whether the scaling is completed.
-
- Proerties ("Instrume = Unum"= "3")
-
- Instancefn Microsoft Yahei
+ 1. Scale out / scale in (judged against the number of BEs already in the cluster: a larger number means scaling out, a smaller one scaling in). Scaling out is a synchronous operation and scaling in is asynchronous; the state of the backends indicates whether scaling in has completed.
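A sketch of such a statement (the property key `instance_num` is an assumption reconstructed from the garbled removed line above):

```sql
-- Scale the cluster to 3 BE instances
ALTER CLUSTER test_cluster PROPERTIES ("instance_num" = "3");
```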

## example

@@ -106,7 +106,7 @@ ALTER SYSTEM SET LOAD ERRORS HUB PROPERTIES
("type"= "broker",
"Name" = BOS,
"path" = "bos://backup-cmy/logs",
"bosu endpoint" ="http://gz.bcebos.com",
"bos_endpoint" ="http://gz.bcebos.com",
"bos_accesskey" = "069fc278xxxxxx24ddb522",
"bos_secret_accesskey"="700adb0c6xxxxxx74d59eaa980a"
);
@@ -31,7 +31,7 @@ Grammar:
BACKUP SNAPSHOT [db_name].{snapshot_name}
TO `repository_name`
ON (
"`Table `U name'[Distriction (`P1',...)],
`Table_name` [partition (`P1',...)],
...
)
PROPERTIES ("key"="value", ...);
@@ -61,7 +61,7 @@ PROPERTIES (
1. Colocate Table must be an OLAP-type table
2. The BUCKET number of tables with the same colocate_with attribute must be the same
3. The replication number of tables with the same colocate_with attribute must be the same
- 4. Data types of DISTRIBUTTED Columns for tables with the same colocate_with attribute must be the same
+ 4. Data types of DISTRIBUTED Columns for tables with the same colocate_with attribute must be the same
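A sketch of two tables placed in the same colocate group, satisfying the constraints above (names and schema are illustrative):

```sql
CREATE TABLE t1 (k1 INT, v1 INT SUM) AGGREGATE KEY(k1)
DISTRIBUTED BY HASH(k1) BUCKETS 8
PROPERTIES ("colocate_with" = "group1");

CREATE TABLE t2 (k1 INT, v2 BIGINT SUM) AGGREGATE KEY(k1)
DISTRIBUTED BY HASH(k1) BUCKETS 8
PROPERTIES ("colocate_with" = "group1");
```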

3. Colocate Join's applicable scenarios:

@@ -50,4 +50,4 @@ ERRORS
curl -u root -XPOST http://host:port/api/testDb/testLabel/_cancel

## keyword
- Cancel, Rabel
+ Cancel, Label