Behavior Changed
- Non-existent files will be ignored when querying external tables such as Hive. ([opt](external) ignore not find files #35319)
  The file list is obtained from the meta cache, and it may not be consistent with the actual file list. Ignoring non-existent files helps to avoid query errors.
- By default, creating a Bitmap Index will no longer be automatically changed to an Inverted Index. ([chore](index) add config enable_create_bitmap_index_as_inverted_index default true #33434 #35521)
  This behavior is controlled by the FE configuration item `enable_create_bitmap_index_as_inverted_index`, which defaults to false.
- When starting FE and BE processes using `--console`, all logs will be output to the standard output and differentiated by prefixes indicating the log type. ([opt](log) refine the FE logger #35679)
- If no table comment is provided when creating a table, the default comment will be empty instead of using the table type as the default comment. ([fix](DDL) not set table type as default comment when create table #36025)
- The default precision of decimalv3 has been adjusted from (9, 0) to (38, 9) to maintain compatibility with the version in which this feature was initially released. ([fix](nereids)change the decimal's precision and scale for cast(xx as decimal) #36316)
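To make the effect of the decimalv3 default change concrete, the sketch below uses Python's `decimal` module to compare a value held at (9, 0) with the same value held at (38, 9). It is only an illustration of what precision and scale mean: the helper name `cast_to_decimal` is hypothetical, and the half-up rounding mode is an assumption made for the example, not a statement about how Doris rounds.

```python
from decimal import Context, Decimal, ROUND_HALF_UP


def cast_to_decimal(value: str, precision: int, scale: int) -> Decimal:
    """Keep `scale` fractional digits within `precision` total digits.

    Hypothetical helper for illustration; rounding mode is an assumption.
    """
    ctx = Context(prec=precision, rounding=ROUND_HALF_UP)
    quantum = Decimal(1).scaleb(-scale)  # scale=9 gives Decimal('1E-9')
    return Decimal(value).quantize(quantum, context=ctx)


# Old default (9, 0): the fractional part is dropped.
old_default = cast_to_decimal("123.456789", 9, 0)
# New default (38, 9): nine fractional digits are kept.
new_default = cast_to_decimal("123.456789", 38, 9)

print(old_default)
print(new_default)
```

Under these assumptions, a `cast(xx as decimal)` that previously lost its fractional digits would now retain up to nine of them.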
New Features
Query Optimizer
- Support FE flame graph tool (https://doris.apache.org/community/developer-guide/fe-profiler/)
- Support `SELECT DISTINCT` to be used with aggregation.
- Support single table query rewrite without GROUP BY. This is useful for complex filters or expressions. ([feature](mtmv)Support single table rewrite cherry21 #35242)
- The new optimizer fully supports point query functionality ([Feature](Point Query) fully support in nereids #36205).
Lakehouse
- Support native reader of Apache Paimon deletion vector ([feature](Paimon) support deletion vector for Paimon naive reader (#34743) #35241)
- Support using Resource in Table Valued Functions ([Enhencement](tvf) select tvf supports using resource #35139)
- Access controller with Hive ranger plugin supports Data Mask
Asynchronous Materialized Views
- Support partition roll-up during construction. ([enhance](mtmv)Mtmv rollup #31812)
- Support triggered updates during construction. ([enhance](mtmv)Mv refresh on commit #34548)
- Support specifying the `store_row_column` and `storage_medium` attributes during construction ([fix](mtmv)Mtmv support row column #35860)
- Transparent rewrite supports single table asynchronous materialized views. ([improvement](mtmv) Split the expression mapping in LogicalCompatibilityContext for performance #34646)
- Transparent rewrite supports agg_state type aggregation roll-up. ([feature](mtmv)Support agg state roll up and optimize the roll up code #35026)
Others
- Added function `replace_empty`. (https://doris.apache.org/docs/sql-manual/sql-functions/string-functions/replace_empty)
- Support `show storage policy using` statement. (https://doris.apache.org/docs/sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING/)
- Support JVM metrics on the BE side. Set `enable_jvm_monitor=true` in `be.conf` to enable this feature.
Optimization
- Estimate memory consumed by segment cache more accurately so that unused memory can be released more quickly. ([fix](segment cache) estimate momory consumed by segment (#35647) #35751)
- Filter empty partitions before exporting tables to remote storage. ([enhancement](export) filter empty partition before export table to remote storage (#35389) #35542)
- Optimize routine load task allocation algorithm to balance the load among Backends. ([opt](routine-load) optimize routine load task allocation algorithm #34778)
- Provide hints when a related variable is not found during a set operation. ([improve](session) print more error msg when set a wrong session variable name #35775)
- Support placing Java UDF jar files in the FE's `custom_lib` directory for default loading. ([fix](multi-catalog)put java udf to custom lib #35984)
- Add a timeout global variable `audit_plugin_load_timeout` for audit log load jobs.
- Optimize the performance of transparent rewrite planning for asynchronous materialized views.
- Optimize the INSERT operation so that the BE does not execute when the source is empty. ([Feat](nereids)when dealing insert into stmt with empty table source, fe returns directly #34418)
- Support fetching file lists of Hive/Hudi tables in batches. ([opt](split) get file splits in batch mode #35107)
Bugfix
Query Optimizer
- Fixed the issue where SQL cache returns old results after truncating a partition. ([fix](Nereids) fix sql cache return old value when truncate partition #34698)
- Fixed the issue where casting from JSON to other types did not correctly handle nullable attributes. ([fix](Nereids) cast from json should always nullable #34707)
- Fixed occasional datetimev2 literal simplification errors. ([fix](Nereids) DatetimeV2 round floor was incorrectly implemented as round ceil #35153)
- Fixed the issue where `count(*)` could not be used in window functions. ([fix](Nereids) remove restrict for count(*) in window #35220)
- Fixed the issue where nullable attributes could be incorrect when all `SELECT` statements under `UNION ALL` have no `FROM` clause. ([fix](nereids)AdjustNullable rule should handle union node with no children #35074)
- Fixed the issue where `bitmap in join` and subquery unnesting could not be used simultaneously. ([fix](nereids)set mark join reference for bitmap-in-apply #35435)
- Fixed the performance issue where filter conditions could not be pushed down to the CTE producer in specific situations. ([fix](Nereids) could not push down filter through cte producer sometimes #35463)
- Fixed the issue where aggregate combinators written in uppercase could not be found. ([fix](Nereids) aggregate combinator should be case-insensitive #35540)
- Fixed the performance issue where window functions were not properly pruned by column pruning. ([fix](Nereids) prune not required window expressions on window operator #35504)
- Fixed the issue where queries might parse incorrectly leading to wrong results when multiple tables with the same name but in different databases appeared simultaneously in the query. ([fix](Nereids) remove getTableInMinidumpCache temporary #35571)
- Fixed the query error caused by generating runtime filters during schema table scans. ([fix](nereids) do not generate runtime filter on schema-scan #35655)
- Fixed the issue where nested correlated subqueries could not execute because the join condition was folded into a null literal. ([fix](nereids)keep equal predicate as join conjunct even if it can be fold to null literal #35811)
- Fixed the occasional issue where decimal literals were set with incorrect precision during planning. ([fix](nereids)decimal and datetime literal comparison should compare datatype too #36055)
- Fixed the occasional issue where multiple layers of aggregation were merged incorrectly during planning. ([Fix](nereids) fix merge aggregate rule, rules should not have mutable members #36145)
- Fixed the occasional issue where the input-output mismatch error occurred after aggregate expansion planning. ([Fix](nereids)make agg output unchanged after normalized repeat #36207)
- Fixed the occasional issue where `<=>` was incorrectly converted to `=`. ([fix](nereids)NullSafeEqualToEqual rule should keep <=> unchanged if it has none-literal child #36521)
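The `<=>` fix above matters because the two operators disagree whenever a NULL is involved. The sketch below models the standard SQL semantics of `=` and the null-safe `<=>` in Python, with `None` standing in for NULL. The function names are hypothetical and this is not the optimizer's code; it only shows why rewriting `<=>` to `=` is unsafe when an operand may be NULL.

```python
from typing import Optional


def sql_eq(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """SQL `=`: the result is NULL (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b


def sql_null_safe_eq(a: Optional[int], b: Optional[int]) -> bool:
    """SQL `<=>`: never NULL; two NULLs compare as equal."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


# The operators agree on non-NULL operands and differ as soon as a NULL appears.
for left, right in [(1, 1), (1, 2), (1, None), (None, None)]:
    print(left, right, sql_eq(left, right), sql_null_safe_eq(left, right))
```

Because a filter keeps only rows whose predicate is true, a row with NULL on both sides passes `a <=> b` but is dropped by `a = b`.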
Query Execution
- Fixed the issue where, on the pipeline engine, a query hung and memory was not released once the row limit was reached. ([fix](pipeline) Fix query hang up if limited rows is reached (#35513) #35746)
- Fixed the BE core dump when `enable_decimal256` is true but the query falls back to the old planner. ([fix](decimal256) fix coredump when enable decimal256 but fallback to old planner #35731)
Asynchronous Materialized Views
- Fixed the issue where asynchronous materialized views caused backup and restore exceptions. ([fix](mtmv) ignore MTMV when backup and restore (#35586) #35703)
- Fixed the issue where partition rewrite could lead to incorrect results. ([fix](mtmv) Fix partition mv rewrite result wrong #35236)
Semi-structured
- Fixed the core dump problem when a variant with an empty key is used. ([Fix](Variant) fix variant with empty key #35671)
- Bitmap and bloomfilter index should not perform light index changes. ([fix](index) bitmap and bloomfilter index should not do light index change #35225)
- Supported creating inverted indexes for columns with Chinese names. ([fix](inverted index)Support Chinese column name with inverted index #36321)
Primary Key
- Fixed the issue where an abnormal BE restart during an import with partial column updates could result in duplicate keys. ([branch-2.1](partial-update) duplicate key occurred when BE restart #35678)
- Fixed the issue where BE might core dump during clone operations when memory is tight. ([cherry-pick](branch-2.1) remove some CHECKs in Tablet::revise_tablet_meta (#31268) #34702)
Lakehouse
- Fixed the issue where a Hive table could not be created with a fully qualified name such as `ctl.db.tbl` ([fix](hive-ctas) allow use qualified name when create hive table #34984)
- Fixed the issue where the Hive metastore connection did not close when refreshing ([fix](catalog) close connection on refresh #35426)
- Fixed a potential meta replay issue when upgrading from 2.0.x to 2.1.x ([fix](meta) fix catalog replay error #35532)
- Fixed the issue where the table-valued function could not read an empty snappy compressed file ([Fix](tvf) Fix that tvf reading empty files in compressed formats. #34926)
- Fixed the issue where Parquet files with invalid min-max column statistics could not be read ([Fix](parquet-reader) Fix INT96 timestamp min-max statistics is incorrect when was written by some old parquet writers by disable it. #35041)
- Fixed the issue where pushdown predicates with null-aware functions could not be handled in the Parquet/ORC reader ([Fix](multi-catalog) Fix string dictionary filtering when using null related functions in parquet and orc reader by disabling dictionary filtering when predicates contain functions. #35335)
- Fixed the partition column order issue when creating a Hive table ([Fix](hive-writer) Fix partition column orders issue when the partition fields inserted into the target table are inconsistent with the field order of the query source table and the schema field order of the query source table. #35347)
- Fixed the issue where writing to a Hive table on S3 failed when partition values contained spaces ((Fix)[hive-writer] Fixed the issue when partition values contain spaces when writing to s3. #35645)
- Fixed the incorrect scheme of the Aliyun OSS endpoint ([fix](multi-catalog)remove http scheme in oss endpoint #34907)
- Fixed the issue where the parquet format Hive table written by Doris could not be read by Hive ([bugfix](hive)Misspelling of class names #34981)
- Fixed the issue where ORC files could not be read after a schema change of a Hive table ([fix](orc)fix orc reader missing column and filter missing column. #35583)
- Fixed the issue where Paimon tables could not be read via JNI after a schema change of the Paimon table ([fix](paimon)fix paimon cache bug #35309)
- Fixed the issue where Row Groups in written Parquet files were too small ([opt](parquet-writer) Specify the row group size when writing data to Parquet files. (#35081) #36042) ([Fix](outfile) Add a configuration for exporting data in Parquet format using `select into outfile` #36143)
- Fixed the issue where Paimon tables could not be read after schema changes ([bugfix](paimon)paimon's field length judgment error for 2.1 #36049)
- Fixed the issue where Hive Parquet format tables could not be read after schema changes ([fix](parquet) fix parquet reader missing column and filter missing column #36182)
- Fixed the FE OOM issue caused by Hadoop FS cache ([fix](hudi) disable fs.impl.cache to avoid FE OOM (#36402) #36403)
- Fixed the issue where FE could not start after enabling the Hive Metastore Listener ([fix](catalog) fix wrong check when using "use_meta_cache=true" #36533)
- Fixed the issue of query performance degradation with a large number of files ([fix](split) FileSystemCacheKey are always different in overload equals #36431)
- Fixed the timezone issue when reading the timestamp column type in Iceberg ([bugfix](iceberg)Read error when timestamp does not have time zone for 2.1 #36435)
- Fixed datetime conversion error and data path error on Iceberg Table. ([bugfix](iceberg)fix datetime conversion error and data path error #35708)
- Support retaining and passing additional user-defined properties of table-valued functions to the S3 SDK. ([Fix](tvf) Pass through user-defined properties #35515)
Data Import
- Fixed the issue where `CANCEL LOAD` did not work ([fix](load) fix wrong assert and cancel load error #35352)
- Fixed the issue where a null pointer error in the Publish phase of load transactions prevented the load from completing ([fix](statistics) NPE when drop partition during publish (pick #35475) #35977)
- Fixed the issue with brpc serializing large data files when sent via HTTP ([fix](rpc) fix transfer large data and enable transfer_large_data_by_brpc by default #35770 #36169)
Data Management
- Fixed the issue where the resource tag in ConnectionContext was not set after forwarding DDL or DML to the master FE. ([fix](resource-tag) missing resource tag after forwarding to master #35618)
- Fixed the issue where the restored table name was incorrect when `lower_case_table_names` was enabled ([fix](restore) Fix restore table name when lower_case_table_names enabled #35508)
- Fixed the issue where `admin clean trash` did not work ([fix](clean trash) Fix clean trash lost submit task #35271)
- Fixed the issue where a storage policy could not be deleted from a partition ([fix](coldhot) fix cannot cancel storage policy of partition #35874)
- Fixed the issue of data loss when importing into a multi-replica automatic partition table ([branch-2.1](auto-partition) Fix auto partition load failure in multi replica #36586)
- Fixed the issue where the partition column of a table changed when querying or inserting into an automatic partition table using the old optimizer ([branch-2.1](auto-partition) fix auto partition expr change unexpected (#36345) #36514)
Memory Management
- Fixed the issue of frequent errors in the logs due to failure in obtaining Cgroup meminfo. ([branch-2.1](memory) Fix BE memory info compatible with Cgroup #35425)
- Fixed the issue where the Segment cache size was uncontrolled when using Bloom filters, leading to abnormal process memory growth. ([Fix](bloom filter) Fix bloom filter memory leak #34871)
Permissions
- Fixed the issue where permission settings were ineffective after enabling case-insensitive table names. ([fix](auth)Auth support case insensitive (#36381) #36557)
- Fixed the issue where setting LDAP passwords through non-Master FE nodes did not take effect. ([fix](auth)ldap set passwd need forward to master (#36436) #36598)
- Fixed the issue where authorization could not be checked for the `SELECT COUNT(*)` statement. ([fix](auth)Fix no auth,but can select count(*) #35465)
Others
- Fixed the issue where the client JDBC program could not close the connection if the MySQL connection was broken. ([fix](connection) kill connection when meeting Write mysql packet failed error #36559 #36616)
- Fixed the MySQL protocol compatibility issue with the `SHOW PROCEDURE STATUS` statement. ([fix](Nereids) fix ShowProcedureStatusCommand sendResultSet #35350)
- `libevent` now forces Keepalive to resolve connection leaks in certain situations. ([fix](third-party) enable keepalive on socket created by libevent #36088)
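For readers unfamiliar with the Keepalive option mentioned in the last item, the sketch below shows the same socket-level switch using Python's standard `socket` module. It is an analogy only: the actual fix is applied to sockets created by libevent, and the helper name `enable_keepalive` is hypothetical.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Ask the OS to probe idle connections so that dead peers are detected."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


# Create an unconnected TCP socket, turn keepalive on, and read the option back.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    enable_keepalive(sock)
    keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0

print(keepalive_on)
```

With keepalive enabled, the operating system eventually notices a peer that vanished without closing the connection, so the server side can release it instead of holding it open indefinitely.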
Big Thanks
Thanks to everyone who contributed to this release.
@133tosakarin
@924060929
@airborne12
@amorynan
@AshinGau
@BePPPower
@BiteTheDDDDt
@ByteYue
@caiconghui
@CalvinKirs
@cambyzju
@catpineapple
@cjj2010
@csun5285
@DarvenDuan
@dataroaring
@deardeng
@Doris-Extras
@eldenmoon
@englefly
@feiniaofeiafei
@felixwluo
@freemandealer
@Gabriel39
@gavinchou
@GoGoWen
@HappenLee
@hello-stephen
@hubgeter
@hust-hhb
@jacktengg
@jackwener
@jeffreys-cat
@Jibing-Li
@kaijchen
@kaka11chen
@Lchangliang
@liaoxin01
@LiBinfeng-01
@lide-reed
@luennng
@luwei16
@mongo360
@morningman
@morrySnow
@mrhhsg
@Mryange
@mymeiyi
@nextdreamblue
@platoneko
@qidaye
@qzsee
@seawinde
@shuke987
@sollhui
@starocean999
@suxiaogang223
@TangSiyang2001
@Thearas
@Vallishp
@w41ter
@wangbo
@whutpencil
@wsjz
@wuwenchi
@xiaokang
@xiedeyantu
@XieJiann
@xinyiZzz
@XuPengfei-1020
@xy720
@xzj7019
@yiguolei
@yongjinhou
@yujun777
@Yukang-Lian
@Yulei-Yang
@zclllyybb
@zddr
@zfr9527
@zgxme
@zhangbutao
@zhangstar333
@zhannngchen
@zhiqiang-hhhh
@zy-kkk
@zzzxl1993