4 changes: 4 additions & 0 deletions docs/en/administrator-guide/load-data/broker-load-manual.md
@@ -174,6 +174,10 @@ The following is a detailed explanation of some parameters of the data description.

The where statement in ```data_desc``` filters the data that has already been transformed. Rows filtered out by the where predicate are not counted toward ```max_filter_ratio```. If the same table has where predicates in more than one ```data_desc```, the predicates from the different ```data_desc``` are merged with AND semantics.

+ merge\_type
The merge type of the data. Three types are supported: APPEND, DELETE, and MERGE. APPEND is the default and means this batch of data is simply appended to the existing data. DELETE means that all rows whose keys match the keys in this batch are deleted. MERGE must be used together with a delete condition: rows that satisfy the delete condition are handled with DELETE semantics, and the remaining rows with APPEND semantics.
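
  For illustration, a minimal sketch of a Broker load that uses MERGE with a DELETE ON condition (the label, path, table, and column names are assumptions for the example), mirroring the syntax in the Broker load reference of this change:

  ```
  -- Rows in this batch with v2 > 100 delete the existing rows with the same keys;
  -- all other rows are appended.
  LOAD LABEL example_db.label_merge_example
  (
      MERGE DATA INFILE("hdfs://hdfs_host:hdfs_port/user/palo/data/input/file")
      INTO TABLE `my_table`
      COLUMNS TERMINATED BY "\t"
      (k1, k2, k3, v2, v1)
  )
  DELETE ON v2 > 100
  WITH BROKER my_hdfs_broker
  (
      "username" = "hdfs_user",
      "password" = "hdfs_passwd"
  );
  ```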


#### Import job parameters

Import job parameters mainly refer to the parameters of the Broker load creation statement that belong to ``opt_properties``. These parameters apply to the whole import job.
4 changes: 4 additions & 0 deletions docs/en/administrator-guide/load-data/routine-load-manual.md
@@ -167,6 +167,10 @@ The detailed syntax for creating a routine load task can be connected to Doris a

3. For a loaded column whose type has a range limit, if the original data can pass type conversion but cannot pass the range limit, strict mode does not affect it. For example, if the type is decimal(1,0) and the original data is 10, it can pass type conversion but is outside the declared range. Strict mode has no effect on such data.

* merge\_type
The merge type of the data. Three types are supported: APPEND, DELETE, and MERGE. APPEND is the default and means this batch of data is simply appended to the existing data. DELETE means that all rows whose keys match the keys in this batch are deleted. MERGE must be used together with a delete condition: rows that satisfy the delete condition are handled with DELETE semantics, and the remaining rows with APPEND semantics.
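
  As a sketch only (the table, columns, and Kafka settings are assumptions for the example), a routine load that applies MERGE with a DELETE ON condition can be declared as follows, mirroring the CREATE ROUTINE LOAD syntax extended by this change:

  ```
  -- Consumed rows with v3 > 100 delete existing rows with the same keys;
  -- all other rows are appended.
  CREATE ROUTINE LOAD example_db.merge_example_job ON example_tbl
  WITH MERGE
  COLUMNS(k1, k2, k3, v1, v2, v3),
  WHERE k1 > 100 and k2 like "%doris%"
  DELETE ON v3 > 100
  PROPERTIES
  (
      "desired_concurrent_number" = "3",
      "max_batch_interval" = "20"
  )
  FROM KAFKA
  (
      "kafka_broker_list" = "broker1:9092,broker2:9092,broker3:9092",
      "kafka_topic" = "my_topic",
      "kafka_partitions" = "0,1,2,3",
      "kafka_offsets" = "101,0,0,200"
  );
  ```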


#### Relationship between strict mode and source data

Here is an example of a column type of TinyInt.
4 changes: 4 additions & 0 deletions docs/en/administrator-guide/load-data/stream-load-manual.md
@@ -143,6 +143,10 @@ The number of rows in the original file = `dpp.abnorm.ALL + dpp.norm.ALL`

Memory limit. The default is 2GB; the unit is bytes.

+ merge\_type
The merge type of the data. Three types are supported: APPEND, DELETE, and MERGE. APPEND is the default and means this batch of data is simply appended to the existing data. DELETE means that all rows whose keys match the keys in this batch are deleted. MERGE must be used together with a delete condition: rows that satisfy the delete condition are handled with DELETE semantics, and the remaining rows with APPEND semantics.
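
  As a sketch only (the column names, the flag condition, and the target table are assumptions for the example), a Stream load request that applies MERGE passes the merge type and the delete condition as HTTP headers, as in the examples added to the Stream load reference by this change:

  ```
  # Rows in testData with flag = 1 delete existing rows with the same keys;
  # all other rows are appended.
  curl --location-trusted -u user:passwd \
      -H "column_separator:," \
      -H "columns: siteid, citycode, username, pv, flag" \
      -H "merge_type: MERGE" \
      -H "delete: flag=1" \
      -T testData http://host:port/api/testDb/testTbl/_stream_load
  ```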


### Return results

Since Stream load is a synchronous import method, the result of the load is returned to the user directly in the response of the load request.
@@ -178,6 +178,13 @@ under the License.
PROPERTIES ("key"="value")
note:
This can also be merged into the schema change operation above; see the example below

7. Enable batch delete support
syntax:
ENABLE FEATURE "BATCH_DELETE"
note:
Only supported on Unique tables. This is used to add batch delete support to existing tables; newly created tables already support it.
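
A minimal sketch, assuming an existing Unique table example_db.my_table:

```
ALTER TABLE example_db.my_table ENABLE FEATURE "BATCH_DELETE";
```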



Rename supports modification of the following names:
@@ -57,6 +57,7 @@ under the License.

To describe the data source.
syntax:
[MERGE|APPEND|DELETE]
DATA INFILE
(
"file_path1"[, file_path2, ...]
@@ -68,7 +69,8 @@ under the License.
[FORMAT AS "file_type"]
[(column_list)]
[SET (k1 = func(k2))]
[WHERE predicate]
[WHERE predicate]
[DELETE ON label=true]

Explanation:
file_path:
@@ -116,6 +118,14 @@ under the License.
WHERE:

Filters the transformed data; only rows that satisfy the WHERE predicate are loaded. Only column names of the table can be referenced in the WHERE clause.

merge_type:

The merge type of the data. Three types are supported: APPEND, DELETE, and MERGE. APPEND is the default and means this batch of data is simply appended to the existing data. DELETE means that all rows whose keys match the keys in this batch are deleted. MERGE must be used together with a DELETE ON condition: rows that satisfy the DELETE ON condition are handled with DELETE semantics, and the remaining rows with APPEND semantics.

delete_on_predicates:

The delete condition. It is only meaningful when the merge type is MERGE, and its syntax is the same as WHERE.

3. broker_name

@@ -190,7 +200,7 @@ under the License.

## example

1. Load a batch of data from HDFS, specify timeout and filtering ratio. Use the broker with the inscription my_hdfs_broker. Simple authentication.
1. Load a batch of data from HDFS, specifying the timeout and filter ratio. Use the broker my_hdfs_broker with plaintext credentials (simple authentication).

LOAD LABEL example_db.label1
(
@@ -422,6 +432,27 @@ under the License.
SET (data_time=str_to_date(data_time, '%Y-%m-%d %H%%3A%i%%3A%s'))
)
WITH BROKER "hdfs" ("username"="user", "password"="pass");

13. Load a batch of data from HDFS, specifying the timeout and filter ratio. Use the broker my_hdfs_broker with plaintext credentials (simple authentication). Rows in the batch with v2 > 100 delete the existing rows with the same keys; the rest are appended.

LOAD LABEL example_db.label1
(
MERGE DATA INFILE("hdfs://hdfs_host:hdfs_port/user/palo/data/input/file")
INTO TABLE `my_table`
COLUMNS TERMINATED BY "\t"
(k1, k2, k3, v2, v1)
)
DELETE ON v2 >100
WITH BROKER my_hdfs_broker
(
"username" = "hdfs_user",
"password" = "hdfs_passwd"
)
PROPERTIES
(
"timeout" = "3600",
"max_filter_ratio" = "0.1"
);

## keyword

@@ -52,9 +52,11 @@ FROM data_source
Used to describe the load data. Syntax:

```
[merge_type],
[column_separator],
[columns_mapping],
[where_predicates],
[delete_on_predicates],
[partitions]
```

@@ -106,6 +108,14 @@ FROM data_source

`PARTITION(p1, p2, p3)`

5. merge_type:

The merge type of the data. Three types are supported: APPEND, DELETE, and MERGE. APPEND is the default and means this batch of data is simply appended to the existing data. DELETE means that all rows whose keys match the keys in this batch are deleted. MERGE must be used together with a DELETE ON condition: rows that satisfy the DELETE ON condition are handled with DELETE semantics, and the remaining rows with APPEND semantics. The syntax is [WITH MERGE|APPEND|DELETE].

6. delete_on_predicates:

The delete condition. It is only meaningful when the merge type is MERGE, and its syntax is the same as WHERE.

4. job_properties

A generic parameter that specifies a routine load job.
@@ -494,6 +504,29 @@ FROM data_source
{"category":"33","author":"3avc","title":"SayingsoftheCentury","timestamp":1589191387}
]
}

7. Create a Kafka routine load task named test1 for example_tbl of example_db. Loaded rows with v3 > 100 delete the existing rows whose key columns match; the remaining rows are appended.

CREATE ROUTINE LOAD example_db.test1 ON example_tbl
WITH MERGE
COLUMNS(k1, k2, k3, v1, v2, v3),
WHERE k1 > 100 and k2 like "%doris%"
DELETE ON v3 >100
PROPERTIES
(
"desired_concurrent_number"="3",
"max_batch_interval" = "20",
"max_batch_rows" = "300000",
"max_batch_size" = "209715200",
"strict_mode" = "false"
)
FROM KAFKA
(
"kafka_broker_list" = "broker1:9092,broker2:9092,broker3:9092",
"kafka_topic" = "my_topic",
"kafka_partitions" = "0,1,2,3",
"kafka_offsets" = "101,0,0,200"
);
## keyword

CREATE, ROUTINE, LOAD
@@ -123,6 +123,10 @@ Boolean type, true to indicate that json data starts with an array object and fl
`json_root`
json_root is a valid JSONPATH string that specifies the root node of the JSON Document. The default value is "".

`merge_type`

The merge type of the data. Three types are supported: APPEND, DELETE, and MERGE. APPEND is the default and means this batch of data is simply appended to the existing data. DELETE means that all rows whose keys match the keys in this batch are deleted. MERGE must be used together with the delete condition: rows that satisfy the delete condition are handled with DELETE semantics, and the remaining rows with APPEND semantics.

RETURN VALUES

After the load completes, the results of this load are returned in JSON format. The fields currently included are
@@ -240,6 +244,11 @@ Where url is the url given by ErrorURL.
Columns are matched by specifying the jsonpaths parameter, such as `category`, `author`, and `price`. For example:
curl --location-trusted -u root -H "columns: category, price, author" -H "label:123" -H "format: json" -H "jsonpaths: [\"$.category\",\"$.price\",\"$.author\"]" -H "strip_outer_array: true" -H "json_root: $.RECORDS" -T testData http://host:port/api/testDb/testTbl/_stream_load

13. Delete all existing rows whose key columns match the loaded data
curl --location-trusted -u root -H "merge_type: DELETE" -T testData http://host:port/api/testDb/testTbl/_stream_load
14. Delete the existing rows whose keys match loaded rows where flag is true; append the other rows
curl --location-trusted -u root: -H "column_separator:," -H "columns: siteid, citycode, username, pv, flag" -H "merge_type: MERGE" -H "delete: flag=1" -T testData http://host:port/api/testDb/testTbl/_stream_load

## keyword

STREAM, LOAD
@@ -231,6 +231,8 @@ Another purpose of the Label is to prevent users from repeatedly loading the same data.
2. When a loaded column is generated by a function transformation, strict mode has no effect on it.

3. For a loaded column whose type has a range limit, if the original data can pass type conversion but cannot pass the range limit, strict mode does not affect it either. For example, if the type is decimal(1,0) and the original data is 10, it can pass type conversion but is outside the declared range. Strict mode has no effect on such data.
+ merge\_type
The merge type of the data. Three types are supported: APPEND, DELETE, and MERGE. APPEND is the default and means this batch of data is simply appended to the existing data. DELETE means that all rows whose keys match the keys in this batch are deleted. MERGE must be used together with a delete condition: rows that satisfy the delete condition are handled with DELETE semantics, and the remaining rows with APPEND semantics.

#### Relationship between strict mode and source data

@@ -166,6 +166,8 @@ Based on the reported results, the JobScheduler in the FE continues to generate subsequent new Tasks, or
2. When a loaded column is generated by a function transformation, strict mode has no effect on it.

3. For a loaded column whose type has a range limit, if the original data can pass type conversion but cannot pass the range limit, strict mode does not affect it either. For example, if the type is decimal(1,0) and the original data is 10, it can pass type conversion but is outside the declared range. Strict mode has no effect on such data.
* merge\_type
The merge type of the data. Three types are supported: APPEND, DELETE, and MERGE. APPEND is the default and means this batch of data is simply appended to the existing data. DELETE means that all rows whose keys match the keys in this batch are deleted. MERGE must be used together with a delete condition: rows that satisfy the delete condition are handled with DELETE semantics, and the remaining rows with APPEND semantics.

#### Relationship between strict mode and source data

@@ -154,6 +154,8 @@ Since Stream load uses the HTTP protocol, everything related to the load task
2. When a loaded column is generated by a function transformation, strict mode has no effect on it.

3. For a loaded column whose type has a range limit, if the original data can pass type conversion but cannot pass the range limit, strict mode does not affect it either. For example, if the type is decimal(1,0) and the original data is 10, it can pass type conversion but is outside the declared range. Strict mode has no effect on such data.
+ merge\_type
The merge type of the data. Three types are supported: APPEND, DELETE, and MERGE. APPEND is the default and means this batch of data is simply appended to the existing data. DELETE means that all rows whose keys match the keys in this batch are deleted. MERGE must be used together with a delete condition: rows that satisfy the delete condition are handled with DELETE semantics, and the remaining rows with APPEND semantics.

#### Relationship between strict mode and source data

@@ -170,6 +170,13 @@ under the License.
note:
1) All columns of the index must be listed
2) Value columns come after key columns

6. Enable batch delete support
syntax:
ENABLE FEATURE "BATCH_DELETE"
note:
1) Can only be used on Unique tables
2) Used to add batch delete support to existing tables; newly created tables already support it

6. Modify table properties. Currently supports modifying the bloom filter columns, the colocate_with property, the dynamic_partition property, and the replication_num and default.replication_num properties
syntax:
@@ -343,6 +350,8 @@ under the License.
15. Modify the table's in_memory property

ALTER TABLE example_db.my_table set ("in_memory" = "true");
16. Enable the batch delete feature
ALTER TABLE example_db.my_table ENABLE FEATURE "BATCH_DELETE"


[rename]
@@ -57,6 +57,7 @@ under the License.

Used to describe a batch of load data.
syntax:
[MERGE|APPEND|DELETE]
DATA INFILE
(
"file_path1"[, file_path2, ...]
@@ -68,7 +69,8 @@ under the License.
[FORMAT AS "file_type"]
[(column_list)]
[SET (k1 = func(k2))]
[WHERE predicate]
[WHERE predicate]
[DELETE ON label=true]

Explanation:
file_path:
@@ -111,6 +113,14 @@ under the License.
WHERE:

Filters the transformed data; only rows that satisfy the WHERE condition are loaded. Only column names of the table can be referenced in the WHERE clause.

merge_type:

The merge type of the data. Three types are supported: APPEND, DELETE, and MERGE. APPEND is the default and means this batch of data is simply appended to the existing data. DELETE means that all rows whose keys match the keys in this batch are deleted. MERGE must be used together with a DELETE ON condition: rows that satisfy the DELETE ON condition are handled with DELETE semantics, and the remaining rows with APPEND semantics.

delete_on_predicates:

The delete condition. It is only meaningful when the merge type is MERGE, and its syntax is the same as WHERE.
3. broker_name

The name of the broker to use; it can be viewed with the show broker command.
@@ -184,7 +194,7 @@ under the License.

## example

1. 从 HDFS 导入一批数据,指定超时时间和过滤比例。使用铭文 my_hdfs_broker 的 broker。简单认证。
1. Load a batch of data from HDFS, specifying the timeout and filter ratio. Use the broker my_hdfs_broker with plaintext credentials (simple authentication).

LOAD LABEL example_db.label1
(
@@ -429,7 +439,28 @@ under the License.
SET (data_time=str_to_date(data_time, '%Y-%m-%d %H%%3A%i%%3A%s'))
)
WITH BROKER "hdfs" ("username"="user", "password"="pass");


13. Load a batch of data from HDFS, specifying the timeout and filter ratio. Use the broker my_hdfs_broker with plaintext credentials (simple authentication). Existing rows whose keys match loaded rows with v2 > 100 are deleted; the other rows are loaded normally.

LOAD LABEL example_db.label1
(
MERGE DATA INFILE("hdfs://hdfs_host:hdfs_port/user/palo/data/input/file")
INTO TABLE `my_table`
COLUMNS TERMINATED BY "\t"
(k1, k2, k3, v2, v1)
)
DELETE ON v2 >100
WITH BROKER my_hdfs_broker
(
"username" = "hdfs_user",
"password" = "hdfs_passwd"
)
PROPERTIES
(
"timeout" = "3600",
"max_filter_ratio" = "0.1"
);

## keyword

BROKER,LOAD
@@ -49,10 +49,11 @@ under the License.
3. load_properties

Used to describe the load data. Syntax:

[merge_type],
[column_separator],
[columns_mapping],
[where_predicates],
[delete_on_predicates],
[partitions]

1. column_separator:
@@ -97,6 +98,10 @@ under the License.
Example:

PARTITION(p1, p2, p3)
5. merge_type
The merge type of the data. Three types are supported: APPEND, DELETE, and MERGE. APPEND is the default and means this batch of data is simply appended to the existing data. DELETE means that all rows whose keys match the keys in this batch are deleted. MERGE must be used together with a DELETE ON condition: rows that satisfy the DELETE ON condition are handled with DELETE semantics, and the remaining rows with APPEND semantics. The syntax is [WITH MERGE|APPEND|DELETE].
6. delete_on_predicates
The delete condition. It is only meaningful when the merge type is MERGE, and its syntax is the same as WHERE.

4. job_properties

@@ -432,6 +437,28 @@ under the License.
{"category":"33","author":"3avc","title":"SayingsoftheCentury","timestamp":1589191387}
]
}
7. Create a Kafka routine load task named test1 for example_tbl of example_db. Loaded rows with v3 > 100 delete the existing rows whose key columns match; the remaining rows are appended.

CREATE ROUTINE LOAD example_db.test1 ON example_tbl
WITH MERGE
COLUMNS(k1, k2, k3, v1, v2, v3),
WHERE k1 > 100 and k2 like "%doris%"
DELETE ON v3 >100
PROPERTIES
(
"desired_concurrent_number"="3",
"max_batch_interval" = "20",
"max_batch_rows" = "300000",
"max_batch_size" = "209715200",
"strict_mode" = "false"
)
FROM KAFKA
(
"kafka_broker_list" = "broker1:9092,broker2:9092,broker3:9092",
"kafka_topic" = "my_topic",
"kafka_partitions" = "0,1,2,3",
"kafka_offsets" = "101,0,0,200"
);
## keyword

CREATE,ROUTINE,LOAD