Behavior changes
- The following global variables will be forcibly set to the following default values:
- enable_nereids_dml: true
- enable_nereids_dml_with_pipeline: true
- enable_nereids_planner: true
- enable_fallback_to_original_planner: true
- enable_pipeline_x_engine: true
- New columns have been added to the audit log. [fix](auditlog) add missing audit log fields and duplicate audit log error #42262
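The forced values in the variable list above can be checked after an upgrade. The following is a minimal sketch, not part of the release itself: it only builds `SHOW VARIABLES LIKE` statements and compares values that the caller has already collected. The helper names `verification_statements` and `find_mismatches` are hypothetical, and the sketch assumes boolean variables are reported as the strings `true` or `false`.

```python
# Forced values copied from the "Behavior changes" list in these notes.
FORCED_GLOBALS = {
    "enable_nereids_dml": "true",
    "enable_nereids_dml_with_pipeline": "true",
    "enable_nereids_planner": "true",
    "enable_fallback_to_original_planner": "true",
    "enable_pipeline_x_engine": "true",
}


def verification_statements(expected=FORCED_GLOBALS):
    """Build one SHOW VARIABLES statement per variable name."""
    return [f"SHOW VARIABLES LIKE '{name}';" for name in expected]


def find_mismatches(observed, expected=FORCED_GLOBALS):
    """Return {name: observed_value} for variables that differ from `expected`.

    `observed` is a hypothetical name -> value mapping gathered by the caller,
    for example from the rows returned by the statements above.
    Assumption: values are reported as the strings "true" / "false".
    """
    mismatches = {}
    for name, want in expected.items():
        got = observed.get(name)
        if got is None or str(got).lower() != want:
            mismatches[name] = got
    return mismatches
```

Run the generated statements with any MySQL-protocol client, pass the resulting name/value pairs to `find_mismatches`, and an empty result means every listed variable has the forced value.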
New features
Lakehouse
Async Materialized View
- Asynchronous materialized views now have a use_for_rewrite property that controls whether they participate in transparent rewriting. [improvement](mtmv) Support to add use_for_rewrite property when create materialized view #40332
Query Execution
- The list of changed session variables is now output in the Profile. Link to the pull request
- Support for trim_in, ltrim_in, and rtrim_in functions has been added. Link to the pull request
- Support for several URL functions (top_level_domain, first_significant_subdomain, cut_to_first_significant_subdomain) has been added. Link to the pull request
- The bit_set function has been added. Link to the pull request
- The count_substrings function has been added. Link to the pull request
- The translate and url_encode functions have been added. Link to the pull request
- The normal_cdf, to_iso8601, and from_iso8601_date functions have been added. Link to the pull request
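To give a feel for the trim_in family listed above, here is a small Python reference model. It assumes, without confirmation from these notes, that the second argument is treated as a set of characters and that any of them are stripped from the ends of the string. Under that assumption the behavior matches Python's own `str.strip`, `str.lstrip`, and `str.rstrip`. The Python function names simply mirror the SQL names.

```python
def trim_in(s: str, chars: str) -> str:
    """Strip any character found in `chars` from both ends of `s`."""
    return s.strip(chars)


def ltrim_in(s: str, chars: str) -> str:
    """Strip any character found in `chars` from the left end of `s`."""
    return s.lstrip(chars)


def rtrim_in(s: str, chars: str) -> str:
    """Strip any character found in `chars` from the right end of `s`."""
    return s.rstrip(chars)


if __name__ == "__main__":
    # Characters are matched individually, so "xy" also removes "yx".
    print(trim_in("xyhelloyx", "xy"))
    print(ltrim_in("xyhelloyx", "xy"))
    print(rtrim_in("xyhelloyx", "xy"))
```

The model is only meant for reasoning about expected results; verify edge cases such as NULL or empty arguments against the actual functions.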
Storage Management
- The information_schema.table_options and table_properties system tables have been added, supporting the querying of attributes set during table creation. [Enhancement] add information_schema.table_options(#32572) #34384
- Support for bitmap_empty as a default value has been implemented. [Featrue](default value) Support bitmap_empty default value #40364
- A new session variable require_sequence_in_insert has been introduced to control whether a sequence column must be provided when performing INSERT INTO SELECT writes to a unique key table. https://github.com/apache/doris/pull/41655/files
Others
- Flame graphs can now be generated on the BE WebUI page. [opt](cpu-profile) enable cpu profile in BE webui (#40330) #41044
Improvements
Lakehouse
- Support for writing data to Hive Text format tables: [cherry-pick](branch-2.1) pick hive text write from master #40537
- Access MaxCompute data using MaxCompute Open Storage API: [Enhancement](MaxCompute)Refactoring maxCompute catalog using Storage API.(#40225 , #40888 ,#41386 ) #41610
- Documentation: https://doris.apache.org/docs/lakehouse/database/max-compute
- Support for Paimon DLF Catalog: [feature](paimon)support paimon with dlf for 2.1 (#41247) #41694
- Added table$partitions syntax to directly query Hive partition information: [feat](metatable) support table$partitions for hive table (#40774) #41230
- Support for reading Parquet files in brotli compression format: [cherry-pick](branch-2.1) support reading brotli compressed parquet file #42162
- Support for reading decimal 256 types in Parquet files: [cherry-pick](branch-2.1) support decimal256 for parquet reader #42241
- Support for reading Hive tables in OpenCsvSerde format: [Configuration](transactional-hive) Add skip_checking_acid_version_file session var to skip checking acid version file in some hive envs. (#42111)(#42225) #42939
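The table$partitions syntax mentioned in the list above appends a `$partitions` suffix to a Hive table reference. As a rough sketch, a client-side helper could build such a query as shown below. The helper name `partitions_query`, the optional `LIMIT`, and the example table name are hypothetical, and the sketch assumes the suffix can follow a plain or qualified table name without extra quoting.

```python
def partitions_query(table_ref, limit=None):
    """Build a query against the partition metadata of a Hive table.

    `table_ref` is a table name as it would appear in a FROM clause.
    Assumption: the "$partitions" suffix is appended directly to the name.
    """
    stmt = f"SELECT * FROM {table_ref}$partitions"
    if limit is not None:
        # int() guards against non-numeric input reaching the statement.
        stmt += f" LIMIT {int(limit)}"
    return stmt + ";"


if __name__ == "__main__":
    print(partitions_query("hive_db.sales"))
    print(partitions_query("hive_db.sales", limit=10))
```

The helper does no escaping of the table name, so it should only be used with trusted identifiers.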
Async Materialized View
- Refined the granularity of lock holding during the build process for asynchronous materialized views. [enhance](mtmv)During cache generation, no longer hold the write lock for mtmv #40402 and [enhance](mtmv)Optimize the logic of mtmv lock #41010.
Query optimizer
- Improved the accuracy of statistic information collection and usage in extreme cases to enhance planning stability. [improvement](statistics)Return -1 to neredis if report olap table row count for new table is not done for all tablets. #40457
- Runtime filters can now be generated in more scenarios to improve query performance. [opt](nereids) enable runtime filter use cte as target #40815
- Enhanced constant folding capabilities for numerical, date, and string functions to boost query performance. [Fix](Nereids) fix append_trailing_char_if_absent function return null #40820
- Optimized the column pruning algorithm to enhance query performance. [opt](Nereids) use 1 instead narrowest column when do column pruning #41548
Query Execution
- Supported parallel preparation to reduce the time consumed by short queries. [PipelineX](improvement) Prepare tasks in parallel #40270
- Corrected the names of some counters in the profile to match the audit logs. [opt](scanner profile) Rename some filed name to keep consistent with audit log. #41993
- Added new local shuffle rules to speed up certain queries. [opt](pipeline) Distribute data evenly for passthrough local exchange #40637
Storage Management
- The SHOW PARTITIONS command now supports displaying the commit version. [chore](show partitions) show partitions print commit version #28274
- Table creation now checks for illegal partition expressions. [Enhancement](DDL) check illegal partition exprs #40158
- Optimized the scheduling logic when encountering EOF in Routine Load. [improve](routine load) delay schedule EOF tasks to avoid too many small transactions (#39975) #40509
- Made Routine Load aware of schema changes. [opt](routine load) support routine load perceived schema change (#39412) #40508
- Improved the timeout logic for Routine Load tasks. [opt](routine load) optimize routine load timeout logic (#40818) #41135
Others
- Allowed closing the built-in service port of BRPC via BE configuration. [Enhancement](brpc)Added enable_brpc_builtin_services parameter in be.conf (#40718) #41047
- Fixed issues with missing fields and duplicate records in audit logs. [fix](auditlog) add missing audit log fields and duplicate audit log error #42262 #43015
Bug fixes
Lakehouse
- Fixed the inconsistency in the behavior of INSERT OVERWRITE with Hive. [bugfix](hive/iceberg)align with Hive insert overwrite table functionality #39840
- Cleaned up temporarily created folders to address the issue of too many empty folders on HDFS. [bugfix](hive)Delete the temporarily created folder #40424
- Resolved memory leaks in FE caused by using the JDBC Catalog in some cases. [branch-2.1][fix](jdbc catalog) Fixed FE memory leak by enabling weak references in HikariCP #40923
- Resolved memory leaks in BE caused by using the JDBC Catalog in some cases. [Fix](jdbc-scanner) Fix jdbc scanner memory leak because it didn't close outputTable. #41266
- Fixed errors in reading Snappy compressed formats in certain scenarios. [branch-2.1](fix) fix snappy decompressor bug #40862
- Addressed potential FileSystem leaks on the FE side in certain scenarios. [branch-2.1][Fix](hdfs-fs)The cache expiration should explicitly release the held fs (#38610) #41108
- Resolved issues where using EXPLAIN VERBOSE to view external table execution plans could cause null pointer exceptions in some cases. [fix](explain) fix NPE when explain verbose with partition batch mode (#40969) #41231
- Fixed the inability to read tables in Paimon parquet format. [bugfix](paimon)Get the file format by file name (#41020) #41487
- Addressed performance issues introduced by compatibility changes in the JDBC Oracle Catalog. [fix](oracle scan) Fix performance issues caused by version judgment #41407
- Disabled predicate pushing down after implicit conversion to resolve incorrect query results in some cases with JDBC Catalog. [2.1][improvement](jdbc catalog) Disallow non-constant type conversion pushdown and implicit conversion pushdown #42242
- Fixed issues with case-sensitive access to table names in the External Catalog. [2.1][opt](Catalog) Remove unnecessary conjuncts handling on External Scan #42261
Async Materialized View
- Fixed the issue where user-specified start times were not effective. [fix](mtmv)Mtmv support set both immediate and starttime #39573
- Resolved the issue of nested materialized views not refreshing. [fix](mtmv)fix nested mtmv not refresh #40433
- Fixed the issue where materialized views might not refresh after the base table was deleted and recreated. [fix](mtmv)fix in the scenario of recreating a table, the materialized view may assume that the data has not changed #41762
- Addressed issues where partition compensation rewrites could lead to incorrect results. [fix](mtmv) Fix compensate union all wrongly when query rewrite by materialized view #40803
- Fixed potential errors in rewrite results when sql_select_limit was set. [fix](mtmv) Disable sql_limit variable when query rewrite by materialize view #40106
Semi-Structured Data Management
- Fixed the issue of index file handle leaks. [fix](index compaction) fix fd leak and mem leak while index compaction #41915
- Addressed inaccuracies in the count() function of inverted indexes in special cases. [Fix](inverted index) fix wrong opt for count_on_index #41127
- Fixed exceptions with variant when light schema change was not enabled. [fix](Variant) check enable light_schema_change when create table with variant type #40908
- Resolved memory leaks when variant returns arrays. [Fix](Serde-2.1) fix potential mem leak in array serde write_one_cell_to_json #41339
Query optimizer
- Corrected nullable calculations for filter conditions in external table queries, which could cause execution exceptions. [fix](nereids)adjust conjunct's nullable info in LogicalExternalRelation #41014
- Fixed potential errors in optimizing range comparison expressions. [fix](Nereids) simplify range result wrong when reference is nullable #41356
Query Execution
- Fixed the issue where the match_regexp function could not correctly handle empty strings. [fix](inverted index) Fix match_regexp to correctly handle empty string patterns #39503
- Resolved issues where the scanner thread pool could become stuck in high-concurrency scenarios. [fix](scanner) Fix deadlock when scanner submit failed #40495
- Fixed errors in the results of the date_floor function. [bug](function)fix date_floor function return wrong result #41948
- Addressed incorrect cancel messages in some scenarios. [fix](cancel) Fix cancel msg on branch-2.1 #41798
- Fixed issues with excessive warning logs printed by arrow flight. [fix](arrow-flight-sql) Fix kill timeout FlightSqlConnection and FlightSqlConnectProcessor close #41770
- Resolved issues where runtime filters failed to send in some scenarios. [Improvement](runtime-filter) set some rf brpc request to ignore_eovercrowded #41698
- Fixed problems where some system table queries could not end normally or became stuck. [fix](schema scan) Finish schema scanner if limitation is reached #41592
- Addressed incorrect results from window functions. [Bug](partition_topn) fix partition_topn not reset output rows after do_partition_topn_sort #40761
- Fixed issues where a wrong mode argument to the encrypt and decrypt functions caused BE to crash. [fix](encrypt) wrong mode arg of encrypt and decrypt function make BE crash #40726
- Resolved errors in the results of the conv function. [Bug](conv) fix conv function parser string failure return wrong result #40530
Storage Management
- Fixed import failures when Memtable migration was used in multi-replica scenarios with machine crashes. [fix](move-memtable) multi replica tables should tolerate minority failures #38003
- Addressed inaccurate memory statistics during the Memtable flush phase during imports. [refactor](loadmemlimit) remove load memlimit since it is never used #39536
- Fixed fault tolerance issues with Memtable migration in multi-replica scenarios. [fix](move-memtable) multi replica tables should tolerate minority failures (#38003) #40477
- Resolved inaccurate bvar statistics with Memtable migration. [fix](move-memtable) fix bvar g_load_stream_file_writer_cnt (#39075) #40985
- Fixed inaccurate progress reporting for S3 loads. [fix](pipelinex) fix fragment instance progress reports (#40325) #40987
Permissions
- Fixed permission issues related to show columns, show sync, and show data from db.table. [fix](auth)Fix some issues with incorrect permission verification #39726
Others
- Fixed the issue where the audit log plugin for version 2.0 could not be used in version 2.1. [fix](audit_loader) fix that old external audit loader plugin not work because of incompatibility with new audit plugin (#40565) #41400