Merged
2 changes: 2 additions & 0 deletions docs/.vuepress/sidebar/en.js
Original file line number Diff line number Diff line change
@@ -237,6 +237,7 @@ module.exports = [
"logstash",
"odbc-of-doris",
"hive-of-doris",
"iceberg-of-doris",
"plugin-development-manual",
"spark-doris-connector",
"flink-doris-connector",
@@ -629,6 +630,7 @@ module.exports = [
"SHOW SNAPSHOT",
"SHOW SYNC JOB",
"SHOW TABLES",
"SHOW TABLE CREATION",
"SHOW TABLET",
"SHOW TRANSACTION",
"STOP ROUTINE LOAD",
2 changes: 2 additions & 0 deletions docs/.vuepress/sidebar/zh-CN.js
Original file line number Diff line number Diff line change
@@ -238,6 +238,7 @@ module.exports = [
"logstash",
"odbc-of-doris",
"hive-of-doris",
"iceberg-of-doris",
"plugin-development-manual",
"spark-doris-connector",
"flink-doris-connector",
@@ -631,6 +632,7 @@ module.exports = [
"SHOW SNAPSHOT",
"SHOW SYNC JOB",
"SHOW TABLES",
"SHOW TABLE CREATION",
"SHOW TABLET",
"SHOW TRANSACTION",
"SPARK LOAD",
146 changes: 146 additions & 0 deletions docs/en/extending-doris/iceberg-of-doris.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,146 @@
---
{
"title": "Iceberg of Doris",
"language": "en"
}
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Iceberg External Table of Doris

Iceberg External Table of Doris lets Doris access Iceberg tables directly, without importing the data first, so that Doris's own OLAP capabilities can be applied to data stored in Iceberg. It provides:

1. Access to Iceberg data sources from Doris
2. Joint queries between Doris tables and Iceberg tables for more complex analysis

This document describes how to use this feature and what to keep in mind.

## Glossary

### Terms in Doris

* FE: Frontend, the front-end node of Doris, responsible for metadata management and for receiving client requests
* BE: Backend, the back-end node of Doris, responsible for query execution and data storage

## How to use

### Create Iceberg External Table

An Iceberg table can be mounted in Doris in two ways. You do not need to declare column definitions when creating the external table; Doris derives them automatically from the column definitions of the table in Iceberg.

1. Create a separate external table to mount the Iceberg table.
The syntax can be viewed in `HELP CREATE TABLE`.

```sql
-- Syntax
CREATE [EXTERNAL] TABLE table_name
ENGINE = ICEBERG
[COMMENT "comment"]
PROPERTIES (
"iceberg.database" = "iceberg_db_name",
"iceberg.table" = "iceberg_table_name",
"iceberg.hive.metastore.uris" = "thrift://192.168.0.1:9083",
"iceberg.catalog.type" = "HIVE_CATALOG"
);


-- Example: Mount iceberg_table under iceberg_db in Iceberg
CREATE TABLE `t_iceberg`
ENGINE = ICEBERG
PROPERTIES (
"iceberg.database" = "iceberg_db",
"iceberg.table" = "iceberg_table",
"iceberg.hive.metastore.uris" = "thrift://192.168.0.1:9083",
"iceberg.catalog.type" = "HIVE_CATALOG"
);
```

2. Create an Iceberg database in Doris that mounts the corresponding remote Iceberg database together with all the tables in it.
The syntax can be viewed in `HELP CREATE DATABASE`.

```sql
-- Syntax
CREATE DATABASE db_name
[COMMENT "comment"]
PROPERTIES (
"iceberg.database" = "iceberg_db_name",
"iceberg.hive.metastore.uris" = "thrift://192.168.0.1:9083",
"iceberg.catalog.type" = "HIVE_CATALOG"
);

-- Example: mount the iceberg_db in Iceberg and mount all tables under that db
CREATE DATABASE `iceberg_test_db`
PROPERTIES (
"iceberg.database" = "iceberg_db",
"iceberg.hive.metastore.uris" = "thrift://192.168.0.1:9083",
"iceberg.catalog.type" = "HIVE_CATALOG"
);
```

The progress of table creation in `iceberg_test_db` can be checked with `SHOW TABLE CREATION`; see `HELP SHOW TABLE CREATION` for the syntax.

#### Parameter Description

- `ENGINE` must be set to `ICEBERG`.
- `PROPERTIES`:
    - `iceberg.hive.metastore.uris`: the address of the Hive Metastore service
    - `iceberg.database`: the name of the Iceberg database to mount
    - `iceberg.table`: the name of the Iceberg table to mount; not required when mounting an Iceberg database
    - `iceberg.catalog.type`: the catalog type used by Iceberg. The default is `HIVE_CATALOG`, which is currently the only supported type; more catalog types will be supported in the future.
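
The parameter rules above can be sketched as a small Python helper that assembles the `PROPERTIES` clause. This is only an illustration of which keys are needed in which case: the function names `iceberg_properties`, `render_properties`, and `create_table_statement` are hypothetical and not part of Doris, and the values are inserted without any escaping.

```python
# Hypothetical helpers (not part of Doris) that assemble the statements shown above.
SUPPORTED_CATALOG_TYPES = {"HIVE_CATALOG"}  # the only type this document lists as supported


def iceberg_properties(database, metastore_uris, table=None, catalog_type="HIVE_CATALOG"):
    """Return the PROPERTIES entries for mounting an Iceberg table or database."""
    if catalog_type not in SUPPORTED_CATALOG_TYPES:
        raise ValueError(f"unsupported iceberg.catalog.type: {catalog_type}")
    props = {"iceberg.database": database}
    if table is not None:
        # iceberg.table is left out when a whole database is mounted
        props["iceberg.table"] = table
    props["iceberg.hive.metastore.uris"] = metastore_uris
    props["iceberg.catalog.type"] = catalog_type
    return props


def render_properties(props):
    """Format the entries as a PROPERTIES (...) clause (values are not escaped)."""
    body = ",\n".join(f'    "{key}" = "{value}"' for key, value in props.items())
    return f"PROPERTIES (\n{body}\n)"


def create_table_statement(doris_table, props):
    """Build a CREATE TABLE ... ENGINE = ICEBERG statement for one table."""
    return f"CREATE TABLE `{doris_table}`\nENGINE = ICEBERG\n{render_properties(props)};"


if __name__ == "__main__":
    table_props = iceberg_properties(
        "iceberg_db", "thrift://192.168.0.1:9083", table="iceberg_table"
    )
    print(create_table_statement("t_iceberg", table_props))
```

Calling `iceberg_properties` without `table` yields the three entries used by the `CREATE DATABASE` form.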

### Show table structure

The table structure can be viewed with `SHOW CREATE TABLE`; see `HELP SHOW CREATE TABLE` for the syntax.

## Data Type Matching

The following table shows how the supported Iceberg column types map to Doris types.

| Iceberg | Doris | Description |
| :------: | :----: | :-------------------------------: |
| BOOLEAN | BOOLEAN | |
| INTEGER | INT | |
| LONG | BIGINT | |
| FLOAT | FLOAT | |
| DOUBLE | DOUBLE | |
| DATE | DATE | |
| TIMESTAMP | DATETIME | Converted to DATETIME with loss of precision |
| STRING | STRING | |
| UUID | VARCHAR | Use VARCHAR instead |
| DECIMAL | DECIMAL | |
| TIME | - | Not supported |
| FIXED | - | Not supported |
| BINARY | - | Not supported |
| STRUCT | - | Not supported |
| LIST | - | Not supported |
| MAP | - | Not supported |

**Note:**
- Schema changes to an Iceberg table **are not synchronized automatically**; the Iceberg external table or database must be rebuilt in Doris.
- The default supported Iceberg version is currently 0.12.0; other versions have not been tested. More versions will be supported in the future.
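
As a rough illustration of the mapping table, the lookup can be written as a dictionary with an error for unsupported types. This is a sketch, not the Doris implementation: `ICEBERG_TO_DORIS` and `to_doris_type` are hypothetical names, the type names are taken from the table in lower case, the error text imitates the message shown by `SHOW TABLE CREATION`, and precision or scale handling for `DECIMAL` is not covered.

```python
import re

# Iceberg type -> Doris type, copied from the table above (names lower-cased).
ICEBERG_TO_DORIS = {
    "boolean": "BOOLEAN",
    "integer": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "date": "DATE",
    "timestamp": "DATETIME",  # precision is lost in the conversion
    "string": "STRING",
    "uuid": "VARCHAR",
    "decimal": "DECIMAL",
}


def to_doris_type(iceberg_type):
    """Map an Iceberg type name to a Doris type, or raise for unsupported types."""
    # Drop parameters such as "(10, 2)", "<string>" or "[16]" to get the base name.
    base = re.split(r"[(<\[]", iceberg_type.strip().lower(), maxsplit=1)[0].strip()
    if base not in ICEBERG_TO_DORIS:
        raise ValueError(f"Cannot convert Iceberg type[{iceberg_type}] to Doris type.")
    return ICEBERG_TO_DORIS[base]


if __name__ == "__main__":
    for name in ("long", "decimal(10, 2)", "list<string>"):
        try:
            print(name, "->", to_doris_type(name))
        except ValueError as err:
            print(name, "->", err)
```

A table containing even one unsupported column type, such as `list<string>`, cannot be mounted, which is why checking the schema against the table before creating the external table can save a failed attempt.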

### Query Usage

Once the Iceberg external table has been created in Doris, it can be queried like a normal Doris OLAP table, except that the Doris data models (rollup, pre-aggregation, materialized views, etc.) cannot be used.

```sql
select * from t_iceberg where k1 > 1000 and k3 = 'term' or k4 like '%doris';
```
Original file line number Diff line number Diff line change
@@ -25,14 +25,44 @@ under the License.
-->

# CREATE DATABASE

## Description
This statement is used to create a new database
Grammar:
CREATE DATABASE [IF NOT EXISTS] db_name;

This statement is used to create a new database.
Syntax:
CREATE DATABASE [IF NOT EXISTS] db_name
[PROPERTIES ("key"="value", ...)] ;

1. PROPERTIES
Additional information about the database; may be omitted.
1) For Iceberg, the following information must be provided in the properties:
```
PROPERTIES (
"iceberg.database" = "iceberg_db_name",
"iceberg.hive.metastore.uris" = "thrift://127.0.0.1:9083",
"iceberg.catalog.type" = "HIVE_CATALOG"
)

```
`iceberg.database` is the name of the database corresponding to Iceberg.
`iceberg.hive.metastore.uris` is the address of the hive metastore service.
`iceberg.catalog.type` defaults to `HIVE_CATALOG`. Currently, only `HIVE_CATALOG` is supported, more Iceberg catalog types will be supported later.

## example
1. New database db_test
CREATE DATABASE db_test;
1. Create a new database db_test
```
CREATE DATABASE db_test;
```

2. Create a new Iceberg database iceberg_test
```
CREATE DATABASE `iceberg_test`
PROPERTIES (
"iceberg.database" = "doris",
"iceberg.hive.metastore.uris" = "thrift://127.0.0.1:9083",
"iceberg.catalog.type" = "HIVE_CATALOG"
);
```

## keyword
CREATE,DATABASE
Original file line number Diff line number Diff line change
@@ -35,7 +35,7 @@ Syntax:
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [database.]table_name
(column_definition1[, column_definition2, ...]
[, index_definition1[, index_definition2, ...]])
[ENGINE = [olap|mysql|broker|hive]]
[ENGINE = [olap|mysql|broker|hive|iceberg]]
[key_desc]
[COMMENT "table comment"]
[partition_desc]
@@ -106,7 +106,7 @@ Syntax:
Notice:
Only support BITMAP index in current version, BITMAP can only apply to single column
3. ENGINE type
Default is olap. Options are: olap, mysql, broker, hive
Default is olap. Options are: olap, mysql, broker, hive, iceberg
1) For mysql, properties should include:

```
@@ -156,6 +156,21 @@ Syntax:
)
```
"database" is the name of the database corresponding to the hive table, "table" is the name of the hive table, and "hive.metastore.uris" is the hive metastore service address.

4) For iceberg, properties should include:
```
PROPERTIES (
"iceberg.database" = "iceberg_db_name",
"iceberg.table" = "iceberg_table_name",
"iceberg.hive.metastore.uris" = "thrift://127.0.0.1:9083",
"iceberg.catalog.type" = "HIVE_CATALOG"
)

```
`iceberg.database` is the name of the database corresponding to Iceberg.
`iceberg.table` is the name of the table corresponding to Iceberg.
`iceberg.hive.metastore.uris` is the address of the hive metastore service.
`iceberg.catalog.type` defaults to `HIVE_CATALOG`. Currently, only `HIVE_CATALOG` is supported; more Iceberg catalog types will be supported later.

4. key_desc
Syntax:
@@ -788,6 +803,19 @@ Syntax:
);
```

17. Create an Iceberg external table

```
CREATE TABLE example_db.t_iceberg
ENGINE=ICEBERG
PROPERTIES (
"iceberg.database" = "iceberg_db",
"iceberg.table" = "iceberg_table",
"iceberg.hive.metastore.uris" = "thrift://127.0.0.1:9083",
"iceberg.catalog.type" = "HIVE_CATALOG"
);
```

## keyword

CREATE,TABLE
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
---
{
"title": "SHOW TABLE CREATION",
"language": "en"
}
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# SHOW TABLE CREATION

## Description

This statement shows the execution status of the table creation tasks in the specified Iceberg database.
Syntax:
SHOW TABLE CREATION [FROM db_name] [LIKE table_name_wild];

Description:
1. Usage Notes
1) If db_name is not specified, the current default db is used.
2) If LIKE is used, only the table creation tasks whose table name matches table_name_wild are shown.
2. The meaning of each column
1) Database: the name of the database
2) Table: the name of the table to be created
3) Status: the creation status of the table, `success` or `fail`
4) Create Time: the time at which the table creation task was executed
5) Error Msg: the error message of a failed table creation; empty if the creation succeeded
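
The columns described above can be consumed programmatically once the rows have been fetched by any MySQL-protocol client. The following Python sketch only shows the row handling: `CreationTask` and `failed_tasks` are hypothetical names, and the rows are assumed to arrive as five-element tuples in the column order listed above.

```python
from typing import NamedTuple


class CreationTask(NamedTuple):
    """One row of SHOW TABLE CREATION, in the column order described above."""
    database: str
    table: str
    status: str       # "success" or "fail"
    create_time: str
    error_msg: str    # empty when the table was created successfully


def failed_tasks(rows):
    """Return {table name: error message} for every task whose status is 'fail'."""
    tasks = [CreationTask(*row) for row in rows]
    return {task.table: task.error_msg for task in tasks if task.status == "fail"}


if __name__ == "__main__":
    # Sample rows copied from the example output in this document.
    rows = [
        ("default_cluster:iceberg_db", "logs_1", "success", "2022-01-24 19:42:45", ""),
        ("default_cluster:iceberg_db", "logs", "fail", "2022-01-24 19:42:45",
         "Cannot convert Iceberg type[list<string>] to Doris type."),
    ]
    for table, error in failed_tasks(rows).items():
        print(f"{table}: {error}")
```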

## example

1. Show all the table creation tasks in the default Iceberg db
SHOW TABLE CREATION;

mysql> show table creation;
+----------------------------+--------+---------+---------------------+----------------------------------------------------------+
| Database                   | Table  | Status  | Create Time         | Error Msg                                                |
+----------------------------+--------+---------+---------------------+----------------------------------------------------------+
| default_cluster:iceberg_db | logs_1 | success | 2022-01-24 19:42:45 |                                                          |
| default_cluster:iceberg_db | logs   | fail    | 2022-01-24 19:42:45 | Cannot convert Iceberg type[list<string>] to Doris type. |
+----------------------------+--------+---------+---------------------+----------------------------------------------------------+

2. Show the table creation tasks in the specified Iceberg db
SHOW TABLE CREATION FROM iceberg_db;

mysql> show table creation from iceberg_db;
+----------------------------+--------+---------+---------------------+----------------------------------------------------------+
| Database                   | Table  | Status  | Create Time         | Error Msg                                                |
+----------------------------+--------+---------+---------------------+----------------------------------------------------------+
| default_cluster:iceberg_db | logs_1 | success | 2022-01-24 19:42:45 |                                                          |
| default_cluster:iceberg_db | logs   | fail    | 2022-01-24 19:42:45 | Cannot convert Iceberg type[list<string>] to Doris type. |
+----------------------------+--------+---------+---------------------+----------------------------------------------------------+

3. Show the table creation tasks in the specified Iceberg db whose table name ends with "1"
SHOW TABLE CREATION FROM iceberg_db LIKE '%1';

mysql> show table creation from iceberg_db like "%1";
+----------------------------+--------+---------+---------------------+-----------+
| Database                   | Table  | Status  | Create Time         | Error Msg |
+----------------------------+--------+---------+---------------------+-----------+
| default_cluster:iceberg_db | logs_1 | success | 2022-01-24 19:42:45 |           |
+----------------------------+--------+---------+---------------------+-----------+
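
The session above with `like "%1"` returns only `logs_1`. A minimal Python sketch of standard SQL `LIKE` matching shows why: `%` stands for any run of characters and `_` for exactly one. The helper names `like_to_regex` and `filter_tables` are hypothetical, and whether Doris applies exactly these rules here (case sensitivity, escape characters) is an assumption rather than something this document states.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # any run of characters, including none
        elif ch == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def filter_tables(names, pattern):
    """Keep the table names that match the LIKE pattern as a whole."""
    regex = like_to_regex(pattern)
    return [name for name in names if regex.fullmatch(name)]


if __name__ == "__main__":
    tables = ["logs_1", "logs"]
    print(filter_tables(tables, "%1"))     # names ending in "1"
    print(filter_tables(tables, "%log%"))  # names containing "log"
```

Because the whole name must match, a pattern without `%` behaves like an equality test on the table name.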

## keyword

SHOW,TABLE CREATION