Behavior Changes
- For delete, the SELECT_PRIV permission of the target table is no longer wrongly required. branch-2.1:[fix](auth)Delete from should not check select_priv #49794
- For insert overwrite, the concurrency limit of 1 for the same table is removed. [enhance](mtmv)Only restrict MTMV to not allow concurrent insert overwrite execution #48673
- For Unique Key tables with merge-on-write (MOW) enabled, the time_series compaction policy is now prohibited. [Opt](mow) Forbid time_series compaction policy on unique table #49905
New Features
Query Execution Engine
- Supports more GEO calculation functions, such as ST_CONTAINS, ST_INTERSECTS, ST_TOUCHES, ST_DISJOINT, and GeometryFromText. [Enhancement] (GEO) Support MultiPolygon for Geometry functions #49665 [Enhancement] Support some spatial functions #48695
- Supports the year_of_week function. [Feature](function) support year of week #48870
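As a rough illustration of what a year-of-week calculation returns, the sketch below uses Python's standard library to compute the ISO 8601 week-numbering year. It assumes the new function follows ISO 8601 week semantics, which the note above does not state; the helper name `iso_year_of_week` is hypothetical and is not a Doris API.

```python
from datetime import date


def iso_year_of_week(d: date) -> int:
    """Return the ISO 8601 week-numbering year of d (assumed semantics)."""
    # isocalendar() yields (ISO year, ISO week number, ISO weekday).
    return d.isocalendar()[0]


# 2005-01-01 is a Saturday and belongs to ISO week 53 of 2004.
print(iso_year_of_week(date(2005, 1, 1)))    # 2004
# 2008-12-29 is a Monday and belongs to ISO week 1 of 2009.
print(iso_year_of_week(date(2008, 12, 29)))  # 2009
```

The two dates show why the week-numbering year can differ from the calendar year near a year boundary.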
Lake-House Integration
- Hive Catalog now supports enabling or disabling the partition cache at the Catalog level. ([feat](hive) add catalog level partition cache property #50724)
  Documentation: https://doris.apache.org/zh-CN/docs/dev/lakehouse/meta-cache#%E5%85%B3%E9%97%AD-hive-catalog-%E5%85%83%E6%95%B0%E6%8D%AE%E7%BC%93%E5%AD%98
- The Paimon dependency version is upgraded to 1.0.1.
- The Iceberg dependency version is upgraded to 1.6.1.
- The memory overhead of the Parquet Footer is included in the Memory Tracker for management to avoid potential OOM issues. ([fix](Parquet) add a memory tracker to parquet meta #49037)
- Optimize the predicate pushdown logic of the JDBC Catalog to support pushing down compound predicates such as AND and OR ([fix](jdbc catalog) Improve conjunct expression handling in JdbcScanNode #50542).
- The pre-compiled version by default comes with the Jindofs extension package to support access to Alibaba Cloud OSS-HDFS.
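To make the JDBC Catalog item above more concrete, here is a minimal sketch of how compound AND/OR predicates might be translated into a remote WHERE clause. It is not the JdbcScanNode implementation: the classes `Compare`, `Compound`, `Unsupported` and the function `to_remote_sql` are hypothetical, and the sketch assumes the engine re-applies the full predicate locally, so pushing down only part of an AND is safe.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Compare:
    column: str
    op: str
    value: Union[int, str]


@dataclass
class Compound:
    op: str  # "AND" or "OR"
    left: "Expr"
    right: "Expr"


@dataclass
class Unsupported:
    description: str  # e.g. a function the remote database lacks


Expr = Union[Compare, Compound, Unsupported]


def to_remote_sql(expr: Expr) -> Optional[str]:
    """Return SQL text for the pushable part of expr, or None if nothing is pushable."""
    if isinstance(expr, Compare):
        value = expr.value
        if isinstance(value, str):
            # Quote string literals and escape embedded single quotes.
            value = "'" + value.replace("'", "''") + "'"
        return f"{expr.column} {expr.op} {value}"
    if isinstance(expr, Compound):
        left = to_remote_sql(expr.left)
        right = to_remote_sql(expr.right)
        if expr.op == "AND":
            # One side of an AND is a weaker but still valid remote filter.
            parts = [p for p in (left, right) if p is not None]
            if not parts:
                return None
            return parts[0] if len(parts) == 1 else f"({parts[0]} AND {parts[1]})"
        if expr.op == "OR":
            # An OR can only be pushed down when both sides are pushable.
            if left is None or right is None:
                return None
            return f"({left} OR {right})"
    return None


pred = Compound("AND", Compare("a", ">", 1),
                Compound("OR", Compare("b", "=", "x"), Compare("c", "<", 5)))
print(to_remote_sql(pred))  # (a > 1 AND (b = 'x' OR c < 5))
```

The asymmetry between AND and OR is the main design point: dropping one branch of an OR would filter out rows that should be returned, so the whole OR is kept local in that case.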
Semi-structured Management
- The ANY function supports the JSON type ([feat](func) any function supports json #50311).
- Functions like JSON_REPLACE, JSON_INSERT, JSON_SET, JSON_ARRAY support JSON data types and complex data types ([fix](json-functions)fix json-replace/insert/set/array behavior with complex type #50308).
Query Optimizer
- When an IN expression has more options than Config.max_distribution_pruner_recursion_depth, bucket pruning is skipped to improve planning speed (branch-2.1: [opt](nereids) skip run PruneOlapScanTablet when exists lots of InPredicate #49387).
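The trade-off behind this threshold can be sketched as follows. This is only an illustration under stated assumptions: `prune_buckets`, the CRC32-based bucket hash, and the limit value of 100 are hypothetical stand-ins, not the Doris planner's code or its configured default.

```python
import zlib
from typing import Iterable, Optional, Set

# Hypothetical stand-in for Config.max_distribution_pruner_recursion_depth.
MAX_IN_OPTIONS = 100


def prune_buckets(in_values: Iterable[object], num_buckets: int,
                  max_in_options: int = MAX_IN_OPTIONS) -> Optional[Set[int]]:
    """Return the buckets that can contain the IN values, or None to scan all buckets."""
    values = list(in_values)
    if len(values) > max_in_options:
        # Too many options: skip pruning so planning stays cheap.
        return None
    # Illustrative hash only; the real distribution hash is not modeled here.
    return {zlib.crc32(str(v).encode("utf-8")) % num_buckets for v in values}


print(prune_buckets([1, 2, 3], num_buckets=16))      # a small set of bucket ids
print(prune_buckets(range(1000), num_buckets=16))    # None, so every bucket is scanned
```

Skipping pruning never changes query results; it only gives up a possible reduction in scanned buckets in exchange for a faster plan.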
Storage Management
- Reduce logs and improve some logs. [log](mow) reduce log for data load on mow table #47647 [Fix](mow) Fix some logs for mow #48523
Others
- Avoid the thrift rpc END_OF_FILE exception [fix](thrift) Pick THRIFT-5492: Add readEnd to TBufferedTransport #49649
Bug Fixes
Lake-House Integration
- Fix the issue that when a table is newly created on the Hive side, it cannot be immediately viewed on the Doris side in some cases ([fix](catalog) do cache load when cache value is not present #50188).
- Fix the issue of the error "Storage schema reading not supported" when accessing some Text format Hive tables ([opt](hive) add option to get schema from table object #50038).
  Documentation: search for "get_schema_from_table" in https://doris.apache.org/zh-CN/docs/dev/lakehouse/catalogs/hive-catalog
- Fix the concurrency issue of metadata submission when writing to Hive/Iceberg tables in some cases ([fix](multi-catalog) Fix multi-thread issue in hive/iceberg writer commit meta-info to fe. #49842).
- Fix the issue of failed writing to Hive tables stored on oss-hdfs in some cases ([fix](oss) the write to hive table on oss-hdfs may fail #49754).
- Fix the issue of failed access when the Hive partition key value contains a comma in some cases ([fix](multi-catalog) Fix bug: "Can not create a Path from an empty string" #49382).
- Fix the issue of uneven Split distribution for Paimon tables in some cases ([fix](paimon)Set the target size of the split. #50083).
- Fix the issue of incorrect processing of Delete files when reading Paimon tables stored on oss in some cases ([fix](paimon) Covert Paimon DeletionFile Path to StoragePath in fe #49645).
- Fix the issue of inaccessible columns with high-precision Timestamps when reading in the MaxCompute Catalog ([fix](mc)Fixed the issue that maxcompute catalog can only read part of the timestamp data #49600).
- Fix the issue of potential partial resource leakage when deleting a Catalog in some cases ([Fix](Catalog) Close system resources when dropping catalog #49621).
- Fix the issue of failed reading of data in LZO compression format in some cases ([fix](lzo) fix lzo decompression failed #49538).
- Fix the issue of incorrect reading of complex types caused by the ORC lazy materialization feature in some cases ([fix](orc) Should not pass selection vector when decode child column of List or Map #50136).
- Fix the issue of errors when reading ORC files generated by the pyorc-0.3 version in some cases ([fix](orc-reader) Fixed issue with top level struct column having present stream failing to access repeatedly when late materialization occurs. #50358).
- Fix the issue of metadata deadlock caused by the EXPORT operation in some cases ([fix](Export) fix the lock leak issue of Export #50088).
Indexes
- Fix the error in building the inverted index after multiple operations of adding, deleting, and renaming columns ([Fix](inverted index) fix rename column build index bug #50056).
- Validate the unique ID of the columns corresponding to the index during index compaction to avoid potential data exceptions and system errors ([fix](index compaction)Add column unique id check before use #47562).
Semi-structured Data Types
- Fix the issue of returning NULL when converting the VARIANT type to the JSON type in some cases ([Fix](Variant) fix variant cast to jsonb into wrong NULL values #50180).
- Fix the crash caused by JSONB CAST in some cases ([Bug][function] fix the string cast jsonb cause null map have not init value #49810).
- Prohibit building indexes on the VARIANT type ([fix](variant) building index on the variant column is prohibited #49159).
- Fix the correctness of the precision of the decimal type in the named_struct function ([fix](named_struct) fix named_struct signature which deduce wrong for nested decimal precision #48964).
Query Optimizer
- Fix some issues in constant folding [fix](Nereids) use StringLikeLiteral as parameter type in constant folding #49413 [fix](constant fold)Make sure FE cast double to varchar generate identical result with BE. #50425 [fix](Nereids) fix unix_timestamp #49686 [fix](Nereids) fix regression framework compare issue and fix code point count #49575 [fix](nereids) fix fold constant return wrong scale of datetime type #50142
- Fix the issue that common subexpression extraction may work incorrectly on lambda expressions [fix](Nereids) cse extract wrong expression from lambda expressions #49166
- Fix the issue that the elimination of constants in the group by key may not work properly [fix](nereids) do eliminate constant group by key in normalizeagg #49589
- Fix the issue that the planning cannot be executed normally due to incorrect derivation of statistical information in extreme scenarios [opt](nereids) catch all exceptions in StatsCalculator #49415
- Fix the issue that some information_schema tables relying on metadata in BE cannot obtain complete data [fix](information_schema) fix backend_active_tasks table only return one backend's data #50721
Query Execution Engine
- Fix the issue of not finding the explode_json_array_json_outer function. [Bug](function) fix Could not find function explode_json_array_json_outer #50164
- Fix the issue that the substring_index function does not support dynamic parameters. [feat](function) SUBSTRING_INDEX function delimiter supports dynamic #50149
- Fix the issue that the calculation results of the st_contains function are incorrect in many cases. [fix](geo) Fix ST_Contains behavior #50115
- Fix the issue that the array_range function may cause a crash (core dump). [Fix](function) fix wrong length check of function array_range #49993
- Fix the issue of incorrect calculation results of the date_diff function. [Fix](function) fix wrong floor of function date_diff when unit less than day #49429
- Fix a series of issues of garbled characters or incorrect results of string functions under non-ASCII encoding. [feature](function) upper lower support utf8 input #49231 [feature](function) support utf8 input in initcap #49846 [fix](function) fix error result when input utf8 in url_encode, strright, append_trailing_char_if_absent #49127 [fix](function) fix error result in split_by_string with utf8 chars #40710
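The class of problem behind the non-ASCII string fixes can be illustrated with a short sketch. It is not the previous or current Doris implementation; `ascii_only_upper` is a hypothetical function that mimics a byte-wise case conversion which only understands ASCII letters, compared with Python's Unicode-aware `str.upper`.

```python
def ascii_only_upper(text: str) -> str:
    """Uppercase only the ASCII letters a-z, working on raw UTF-8 bytes."""
    data = bytearray(text.encode("utf-8"))
    for i, b in enumerate(data):
        if 0x61 <= b <= 0x7A:      # ASCII 'a'..'z'
            data[i] = b - 0x20     # shift to 'A'..'Z'
    # Bytes of multi-byte UTF-8 sequences are all >= 0x80, so they are untouched.
    return data.decode("utf-8")


print(ascii_only_upper("café"))  # CAFé  (the accented letter is left in lowercase)
print("café".upper())            # CAFÉ  (Unicode-aware conversion)
```

A byte-oriented routine gives correct results for pure ASCII input and silently wrong results once multi-byte characters appear, which matches the symptoms described in the item above.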
Storage Management
- Fix the issue of failed metadata replay for dynamic partition tables in some cases ([fix](meta) do not check replica allocation when replay #49569).
- Fix the issue of potential data loss due to the operation sequence in streamload under the arm architecture [fix] (streamload) fixed the issue of data loss due to concurrency wh… #48948
- Fix the issue of errors in full compaction and the potential problem of duplicate data in mow [Fix](full compaction) Fix problems for full compaction #49825 [Fix](compaction) Fix full compaction error when compaction size is too large #48958
- Fix the issue of the lack of persistence of partition storage policies. [fix](gson) Missing the serialization of the partition's storage policy #49721
- Fix the issue that, with extremely low probability, files may be missing after a load. [fix](path gc) Fix path gc race with publish task #50343
- Fix the issue of missing files that may be caused by CCR running concurrently with disk balance. [fix](binlog) Acquire migration lock before ingesting binlog #50663
- Fix the issue of connection reset that may occur when backing up and restoring large snapshots. [fix](thrift) Pick THRIFT-5492: Add readEnd to TBufferedTransport #49649
- Fix the issue of local backup snapshot loss for FE followers. [fix](backup) Save snapshot meta during replay #49550
Others
- Fix the issue of potential loss of audit logs in some scenarios ([fix](audit) fix potential audit log missing issue #50357).
- Fix the issue that the isQuery flag in audit logs may be incorrect Set isQuery audit log correctly when fallback to old planner. #49959
- Fix the issue that the sqlhash of some queries in audit logs is incorrect [fix](auditlog)Set sqlHash in executeInternalQuery #49984