Since the release of 0.13, Apache Doris (incubating) has received around 390 new features, bug fixes, performance enhancements, documentation improvements, and code refactors from 60+ contributors. We are ready to release Apache Doris (incubating) 0.14.
New Feature
Import and delete
Support deleting multiple rows at one time through the import method, to avoid the performance degradation caused by many separate delete operations. For tables of the UniqueKey model, support specifying a Sequence column when importing. Doris determines the order of the data according to the value of the Sequence column, so that rows are applied in the intended order regardless of the order in which they are imported.
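The interaction between the Sequence column and batch delete can be illustrated with a minimal Python sketch. It is not Doris code: the function names, the tuple layout, and the rule that a row with an equal sequence value replaces the stored one are assumptions made for illustration only.

```python
def merge_rows(table, rows):
    """Merge imported rows into a unique-key table (illustrative model only).

    table: dict mapping key -> (sequence, value, deleted)
    rows:  iterable of (key, sequence, value, deleted) in arrival order
    """
    for key, seq, value, deleted in rows:
        current = table.get(key)
        # Keep the row with the highest sequence value, whatever the arrival order.
        # Assumption: on equal sequence values the later arrival wins.
        if current is None or seq >= current[0]:
            table[key] = (seq, value, deleted)
    return table


def visible(table):
    """Return the rows a query would see: rows marked as deleted are hidden."""
    return {k: v for k, (s, v, d) in table.items() if not d}


table = {}
merge_rows(table, [
    ("a", 2, "new", False),
    ("a", 1, "old", False),   # arrives later but has a smaller sequence value
    ("b", 1, "x", False),
    ("b", 2, None, True),     # delete marker imported in the same batch
])
print(visible(table))
```

In this model a delete is just another imported row carrying a delete marker, which is why many deletions can be applied in one import.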
Support database backup
The backup statement supports specifying the backup content (metadata and data).
The backup and restore statements support excluding some tables. When backing up the entire database, you can exclude tables that are very large and unimportant.
Support backing up and restoring the entire database without declaring each table name in the backup and restore statements.
[#5314]
ODBC external table support
Support access to external tables such as MySQL, PostgreSQL, and Oracle through the ODBC protocol
[#4798] [#4438] [#4559] [#4699]
Support SQL level and Partition level result Cache
Support caching query results to improve the efficiency of repeated queries. Both SQL-level and Partition-level result caches are supported [#4330]
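The idea behind a partition-level result cache can be sketched in Python as follows. This is a simplified model, not the Doris implementation: the class name, the per-partition version number, and the assumption that partial results can simply be added together are all hypothetical.

```python
class PartitionResultCache:
    """Cache partial query results per partition (illustrative model only)."""

    def __init__(self):
        self._entries = {}   # (sql_key, partition) -> (version, partial result)
        self.computed = 0    # how many partitions had to be scanned

    def query(self, sql_key, partitions, compute):
        """partitions: dict of partition name -> current data version.
        compute(partition) returns the partial (additive) result for one partition."""
        total = 0
        for name, version in partitions.items():
            entry = self._entries.get((sql_key, name))
            # Recompute only when the partition is new or its data has changed.
            if entry is None or entry[0] != version:
                self.computed += 1
                entry = (version, compute(name))
                self._entries[(sql_key, name)] = entry
            total += entry[1]
        return total


data = {"p1": [1, 2], "p2": [3]}
cache = PartitionResultCache()
first = cache.query("sum(v)", {"p1": 1, "p2": 1}, lambda p: sum(data[p]))
data["p2"].append(4)  # only partition p2 receives new data
second = cache.query("sum(v)", {"p1": 1, "p2": 2}, lambda p: sum(data[p]))
print(first, second, cache.computed)
```

A SQL-level cache would correspond to keying on the whole statement only, which is simpler but is invalidated by any change to the underlying data.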
Built-in functions
- Support bitmap_xor function [add BE udf bitmap xor #5098]
- Add replace() function [udf: replace function #4347]
- Add the time_round function to support time alignment according to multiple time granularities [[Feature] Add time_round builtin functions #4640]
FE interface and HTTP interface
-
The new FE UI interface can be enabled by setting the FE configuration item enable_http_server_v2 [[UI Part 5] Enable HTTP Server 2 by FE config #4684]
-
BE adds an http interface to show the distribution of all tablets in a partition among different disks in a BE [[Feature] Add a http interface to acquire the tablets distribution between different disks #5096]
-
BE adds an http interface to manually migrate a tablet to other disks on the same node [[Feature] Add a http interface for single tablet migration between different disks #5101]
-
Support to modify the configuration items of FE and BE through http, and persist these modifications [[Feature][Config] Support persistence of configuration items modified at runtime #4704]
Compatibility with MySQL
- Added support for views table in the information_schema database [Error happened when open a view in DBeaver #4778]
- Added table_privileges, schema_privileges and user_privileges to the information_schema library for compatibility with certain MySQL applications [Added #4897: Add table_privileges, schema_privileges and user_privile… #4899]
- A new statistic table is added to the information_schema meta-database for compatibility with some MySQL tools [[Enhance] Add "statistics" meta table and fix some mysql compatibility problem #4991]
Monitoring
-
BE added tablet-level monitoring indicators, including scanned data volume and row number, written data volume and row number, to help locate hot tablets [[Metrics] Support tablet level metrics #4428]
-
BE added metrics to view the usage of various LRU caches [[LRUCache] Expose LRU Cache status to metrics #4688]
Table building related
- Added CREATE TABLE LIKE statement to facilitate the creation of a table metadata copy [[CREATE TABLE]Support new syntax CREATE TABLE LIKE to clone an existe… #4705]
- Support atomic replacement of two tables through replace statement [[Feature] Support REPLACE TABLE operation. #4669]
Support connecting directly to S3 for backup, restore, load, and export [#5399]
Other
-
Support adding Optimizer Hints of type SET_VAR in the Select statement to set session variables [[Feature]Support SELECT Optimizer Hints SET_VAR #4504]
-
Support to repair damaged tablets by filling in empty tablets [[Tablet][Recovery] Support using empty tablet to repair the damaged or missing tablet #4255]
-
Support Bucket Shuffle Join (when the Join condition column is a subset of the table's bucket columns, the right table is shuffled to the nodes where the data of the left table is located, which can significantly reduce the network overhead caused by Shuffle Join and improve query speed) [#4677]
-
Support batch cancel import tasks through cancel load statement [[Feature] Support cancel load jobs in batch #4515]
-
Add a Session variable to set whether to allow the partition column to be NULL [[Repair] Add an option whether to allow the partition column to be NULL #5013]
-
Support TopN aggregation function [[Feature] Add Topn udaf #4803]
-
Support a new data balancing logic based on the number of partitions and buckets [[Rebalancer] support partition rebalancer #5010]
-
Support creating indexes on the value column of unique table [Support create index on unique value column #5305]
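The Bucket Shuffle Join item above can be illustrated with a small Python sketch. It is a conceptual model rather than Doris code: the function name, the `key % num_buckets` bucketing rule, and the node names are assumptions chosen for illustration.

```python
from collections import defaultdict


def bucket_shuffle_join(left_buckets, bucket_to_node, right_rows, num_buckets):
    """Join right_rows against a bucketed left table without moving left data.

    left_buckets:   dict bucket id -> list of (key, left_value), already stored
                    on the node given by bucket_to_node[bucket id]
    right_rows:     list of (key, right_value)
    Returns (joined rows, number of right rows sent to each node).
    """
    # Route every right row only to the node that owns the matching left bucket.
    inbox = defaultdict(list)
    for key, rval in right_rows:
        bucket = key % num_buckets  # assumed bucketing rule for this sketch
        inbox[bucket_to_node[bucket]].append((bucket, key, rval))

    # Each node joins the rows it received with its local left buckets.
    joined = []
    for node, rows in inbox.items():
        for bucket, key, rval in rows:
            for lkey, lval in left_buckets.get(bucket, []):
                if lkey == key:
                    joined.append((key, lval, rval))
    sent = {node: len(rows) for node, rows in inbox.items()}
    return joined, sent


left = {0: [(2, "l2"), (4, "l4")], 1: [(1, "l1")]}
nodes = {0: "be1", 1: "be2"}
right = [(1, "r1"), (2, "r2"), (3, "r3")]
print(bucket_shuffle_join(left, nodes, right, num_buckets=2))
```

Only the right table crosses the network, and each of its rows is sent to exactly one node, which is where the saving over a regular shuffle join comes from.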
Enhancement
Performance improvement
- Implemented a new compaction selection algorithm, providing lower write amplification and a more reasonable compaction strategy [Compaction rules optimization #4212]
- Optimize bit operation efficiency in variable length coding [Optimise coding bit operation in BE #4366]
- Improve the execution efficiency of the money_format function [[UDF] Improve performance of function money_format #4672]
- Optimize query execution plan: When the bucket column of the table is a subset of the GroupBy column in SQL, reduce the data shuffle step [Do not add exchange when table's distributioin satisfy the distribution requirements #4482]
- Improve the efficiency of column name search on BE [[Optimize] Add an unordered_map for TabletSchema to speed up column name lookup #4779]
- Improve the performance of the BE side LRU Cache [[Optimize] Improve LRU cache's performance #4781]
- Optimized the tablet selection strategy of Compaction, reducing the number of invalid selections [[Bug][Compaction] Fix bug that output rowset is not deleted after compaction failure. #4964]
- Optimized the reading efficiency of Unique Key table [Optimized the read performance of the table when have multi versions #4958]
- Optimized the memory usage of LoadJob on the FE side and reduced the memory overhead on the FE side [[Bug] fix finished load jobs cost too much heap #4993]
- Reduce the lock granularity in FE metadata from Database level to Table level to support more fine-grained concurrent access to metadata [Support read and write lock in table level to reduce lock competition #3775]
- Avoid unnecessary memory copy when creating hash table [[optimization] avoid extra memory copy while build hash table #5301]
- Remove the path check when BE starts to speed up BE startup speed [(#5267) remove path check when start BE #5268]
- Optimize the import performance of Json data [Add fuzzy_parse option to speed up json import #5114]
Functional improvements
- SQL supports collate utf8_general_ci syntax to improve MySQL syntax compatibility [support collate field option in compare predicate sql from datagrip #4365]
- Improve the function of Batch delete, improve and optimize the related compaction process [Support batch delete[part 2] #4425]
- Enhance the function of parse_url() function, support lowercase, support parsing port [Fix parse url bug #4429]
- When SQL execution specifies the execution mode of join (Join Hint), the Colocation Join function will be disabled by default [[Bug] Support disable colocate join where join clause has join hint #4497]
- Dynamic partitioning supports hour-level granularity [[Feature] support hour time unit with dynamic parition #4514]
- HTTP interface on BE side supports gzip compression [[Enhance] Support gzip compression for http response #4533]
- Optimized the use of threads on the BE side [[refactor] Optimize threads usage mode in BE #4440]
- Optimize the checking process and error message of the rand() function in the query analysis stage [Validate the param of rand function in compile step #4439]
- Optimize the compaction triggering and execution logic to better limit the resource overhead (mainly memory overhead) of the compaction operation, and trigger the compaction operation more reasonably [[Optimize] Optimize the execution model of compaction to limit memory consumption #4670]
- Support pushing Limit conditions to ODBC/MySQL external tables [[ODBC/MySQL] Support Limit Clause Push Down For ODBC Table And MySQL Table(#4706) #4707]
- Add a limit on the number of tablet versions on the BE side to prevent excessive data versions from causing abnormal cluster load [[Config] Limit the version number of tablet #4687]
- When an RPC error occurs in a query, it can quickly return specific error information to prevent the query from being stuck [[Enhance][Log] Make RPC error log more clear #4702]
- Support automatic mapping of count(distinct if(bool, bitmap, null)) to bitmap_union_count function [Rewrite count(distinct if(bool, bitmap, null)) to bitmap_union_count #4201]
- Support set sql_mode = concat(@@sql_mode, "STRICT_TRANS_TABLES") statement [[MySQL Compatibility 1/4][Bug] Fix bug that set sql_mode with concat() function failed #4359]
- Support all stream load features in multiload [Support full StreamLoad feature in multiload #4717]
- Optimize BE's strategy for selecting disks when creating tablets, using the "two random choices" algorithm to distribute tablet replicas more evenly [[Optimize]Optimize the disk selection strategy on BE for tablet creation #4373]
- When creating a materialized view, the bitmap_union aggregation method only supports integer columns, and hll_union does not support decimal columns [Forbidden the illegal column types on BITMAP_UNION OR HLL_UNION mv #4432]
- Adjust the log level of some FE logs to avoid log writing becoming a bottleneck [ Make some debug log settings configurable and change some log level from info to debug to avoid performance bottlenecks #4766]
- In the describe table statement, display the definition expression of the aggregate column of the materialized view [Show column display name on Show Proc stmt #4446]
- Support convert() function [[Mysql Compatibility 3/4] Support convert() and signed/unsigned interger cast #4364]
- Support cast (expr as signed/unsigned int) syntax to be compatible with the MySQL ecosystem
- Add more columns to the information_schema.columns table to be compatible with the MySQL ecosystem
- In Spark Load, use the yarn command line instead of the yarn-client API to kill a job or get its status [[SparkLoad]Use the yarn command to get status and kill the application #4383]
- Persistence of stale rowset meta-information to ensure that this information will not be lost after BE restarts [Persistence stale rowsets meta #4454]
- Return an error code in the schema change result to more clearly inform the user of the specific error [Add OLAP_ERR_DATE_QUALITY_ERR error status to display schema change failure #4388]
- Optimize the rowset selection logic of some compactions to make the selection strategy more accurate [(#5151) An already merged rowset should skip window check #5152]
- Optimize the Page Cache on the BE side, divide Page into data cache and index cache [[Optimize][Cache]Implementation of Separated Page Cache #5008]
- Optimized the accuracy of functions such as variance and standard deviation on Decimal type [[Bug] Fix the loss of precision when Decimal calculates variance/stddev #4959]
- Optimized the processing logic of predicates pushed down to ScanNode to avoid repeated filtering of predicate conditions at the query layer and improve query efficiency [[Performance Optimization] Remove push down conjuncts in olap scan node #4999]
- Optimized the predicate push-down logic of Unique Key table, and supports push-down the conditions of non-primary key columns [Push down predicate on value column of unique table to base rowset #5022]
- Support pushing down "not in" and "!=" to the storage layer to improve query efficiency [[Performance Improve] Push Down _conjunctf of 'not in' and '!=' to Storage Engine. #5207]
- Support writing multiple memtables of a tablet in parallel during import. Improve import efficiency [[Load Parallel][2/3] Support parallel flushing memtable during load #5163]
- Optimize the creation logic of ZoneMap. When the number of rows on a page is too small, ZoneMap will not be created anymore [Optimize Zone map create policy #5260]
- Added histogram monitoring indicator class on BE [[Feature] Implementation of histogram metric #5148]
- When importing Parquet files, if there is a parsing error, the specific file name will be displayed in the error message [[Improvement] Add parquet file name to the error message #4954]
- Optimize the creation logic of dynamic partitions: creating a table now triggers the creation of its dynamic partitions immediately, without waiting for the scheduler [[Feature] Support Create Dynamic Partition Immediately FirstTime Without Wating Schedule. #5209]
- In the result of the SHOW BACKENDS command, display the real start time of BE [[Bug] Make 'LastStartTime' in backends list as the actual BE start time #4872]
- Support column names that start with the @ symbol, mainly used to support mapping ES tables [[Doris][Doris On ES] support @leading column name #5006]
- Optimize the mapping and conversion logic for the columns declared in the import statement to make its use clearer [Improve the processing logic of Load statement derived columns #5140]
- Optimize the execution logic of colocation join to make the query plan more evenly executed on multiple BE nodes [[Enhancement]Make Cholocate table join more load balance #5104]
- Optimize the predicate pushdown logic, and support pushdown of is null and is not null to the storage engine [[Performance Improve] Push Down _conjunct of 'A is NULL' and 'B is not NULL' to Storage Engine. #5092]
- Optimize the BE node selection logic in bucket join [[Enhancement] Optimize the algorithm of selecting host for a bucket scan task when a backend not alive #5133]
- Support UDF in import operation [[New Feature]Support udf when loading data #4863]
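The "two random choices" algorithm mentioned for disk selection above can be sketched in a few lines of Python. This is a generic illustration of the technique, not the BE implementation: the function names, the use of the tablet count as the load measure, and the tie-breaking rule are assumptions.

```python
import random


def pick_disk(tablet_counts, rng=random):
    """Pick two distinct disks at random and return the less loaded one."""
    disks = list(tablet_counts)
    if len(disks) < 2:
        return disks[0]
    a, b = rng.sample(disks, 2)
    # Assumption: ties go to the first sampled disk.
    return a if tablet_counts[a] <= tablet_counts[b] else b


def place_tablets(disks, n, seed=0):
    """Place n tablets one by one and return the resulting count per disk."""
    rng = random.Random(seed)
    counts = {d: 0 for d in disks}
    for _ in range(n):
        counts[pick_disk(counts, rng)] += 1
    return counts


print(place_tablets(["d1", "d2", "d3", "d4"], 400))
```

Compared with picking a single disk at random, comparing two candidates keeps the load much closer to even while still avoiding a scan of every disk for each new tablet.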
Other
- Added support for IN Predicate in delete statement [[Doc] Add in predicate support content in delete-manual.md #4404]
- Update the Dockerfile of the development image and add some new dependencies [[Dockerfile] Update Dockerfile #4474]
- Fix various spelling errors in the code and documentation [[Refactor] Fixes some be typo #4714] [[Docs] update data types doc and fix some typo #4712] [[documentation] Fix some typo in en docs part 1 #4722] [[documentation] Fix some typo in en docs part 2 #4723] [[documentation] Fix some typo in en docs part 3 #4724] [[documentation] Fix some typo in en docs part 4 #4725] [[documentation] Fix some typo in zh_cn docs part 1 #4726] [[documentation] Fix some typo in zh_cn docs part 2 #4727]
- Added two segment-related indicators in the OlapScanNode of the query profile to display the total number of segments and the number of filtered segments [[Profile] Add 2 Segment related metrics in query profile #4348]
- Add batch delete function description document [[DOCS] Add batch delete docs #4435]
- Added Spark Load syntax manual [[Doc] Add spark load sql statement doc and update manual #4463]
- Added the display of cumulative compaction strategy name and rowset data size in BE's /api/compaction/show API [Compaction show policy type and disk format #4466]
- Redirect the Spark Launcher log in Spark Load to a separate log file for easy viewing [[Spark Load] Redirect the spark launcher's log to a separated log file #4470]
- The BE configuration item streaming_load_max_batch_size_mb was renamed streaming_load_json_max_mb to make its meaning more clear [Change config name 'streaming_load_max_batch_size_mb' to 'streaming_load_json_max_mb' #4791]
- Adjust the default value of the FE configuration item thrift_client_timeout_ms to solve the problem of too long access to the information_schema library [[optimize] optimize default value for thriftserver's config key "thrift_client_timeout_ms" #4808]
- CPU or memory sampling of BE process is supported on BE web page to facilitate performance debugging [[Feature] Add CPU and Heap profile in BE webserver #4632]
- Refactor the tablet balancing class on the FE side so that more balancing logic can be added [[LoadBalance] make BeLoadRebalancer extends from base class Rebalancer #4771]
- Reorganized the OLAP_SCAN_NODE profile information to make the profile clearer and easier to read [[Feature] Running Profile OLAP_SCAN_NODE layering and enhance readability #4825]
- Added monitoring indicators on the BE side to monitor cancelled Query Fragment [[Metrics] Add metric to monitor timeout canceled fragment count #4862]
- Reorganized the profile information of HASH_JOIN_NODE, CROSS_JOIN_NODE, UNION_NODE, ANALYTIC_EVAL_NODE to make the Profile more clear and easy to read [[Doc] Running Profile document add HASH_JOIN_NODE, etc. #4878]
- Modify the default value of query_colocate_join_memory_limit_penalty_factor to 1 to ensure that the default memory limit of the execution plan fragment is consistent with the user setting during the colocation join operation [[BUG] Fix colocate join memory limit problem (#4894) #4895]
- Added consideration of tablet scanning frequency in the selection of compaction strategy on the BE side [[Optimize] Take 'tablet scan frequency' into consideration when selecting a tablet for compaction #4837]
- Optimize the strategy for sending Query Fragments by avoiding repeated sending of common attributes, to improve query plan scheduling performance [[Optimize] Avoid repeated sending of common components in Fragments #4904]
- Optimized the accuracy of load statistics for unavailable nodes when the query scheduler is scheduling query plans [【Optimize】optimize host selection strategy #4914]
- Add the code version information of the FE node in the result of the SHOW FRONTENDS statement [[Feature] Show FE commit hash on proc #4943]
- Support more column type conversion, such as support conversion from CHAR to numeric type, etc. [[Schema change] Support More column type in schema change #4938]
- Support recognizing complex types (map, list, struct) in Parquet files during import [[Bug]Parquet map/list/struct structure recognize #4968]
- In the BE monitoring indicators, increase the monitoring of used permits and waiting permits in the compaction logic [[Metrics][LOG] Add metrics for compaction permits and log for merge rowsets #4893]
- Reduce the execution time of BE unit tests [[UT] Speed up BE unit test #5131]
- Added more JVM-related monitoring items on the FE side [[Enhancement] Add more comprehensive prometheus jvm thread metrics on fe #5112]
- Add a session variable to control the timeout for publishing the transaction (making it take effect) in insert operations [For #5169 Add publish timout param when exec insert #5170]
- Optimize the logic of selecting scan nodes for query execution plans, and consider all ScanNode nodes in a query [[Optimize]Take all scan nodes of one sql into consideration when select host for a tablet #4984]
- Add more system monitoring indicators for FE nodes [[Enhancement] Add system memory metrics for fe #5149]
- Standardize the use of VLOG in BE code [Standardize the use of VLOG in code #5264]
BugFix
-
Fix a bug that may cause FE restart to fail when replaying Erase Table metadata operations [Fix fe restart failed bug when replay erase table log #5221]
-
Fix the problem that the BE process crashes due to the orc::TimezoneError not being caught when importing ORC format files [[Bug]Fix bug that BE crash when load ORC file #4350]
-
Fix the problem that the result of the Except operator is incorrect [[BUG] Fix except wrong answer bug #4369]
-
Fix the problem that the query always route to the same BE node when querying ES data [[Doris On ES][Bug-Fix] ES queries always route at same 3 BE nodes (#4351) #4352]
-
Fix the problem that the operation is not correctly persisted when setting the Global Variable [[Bug] Fix bug that modification of global variable can not be persisted. #4324]
-
Fixed the problem that the MemTracker was not constructed correctly in PushHandler which caused the BE process to crash [[Bug][MemTracker] Cleanup the mem tracker's constructor to avoid wrong usage #4345]
-
Fix the problem of importing blank lines when importing Json data format [[JsonLoad] Fix bug that row num stat is not correct when loading json #4379]
-
Fix the problem that the SQL rewriting rules failed to correctly handle count distinct [Modify mv rewrite rule on 'Count distinct' #4382]
-
Fix the problem that the data model type of the materialized view is not set correctly when creating the materialized view [Fix errors when alter materialized view which based on dup table #4375]
-
Fix the problem of wrong query result of left semi/anti join [[BUG] Remove the deduplication of LEFT SEMI/ANTI JOIN with not equal … #4417]
-
Prioritize the join method specified by the user [Fix explicit broadcast join bug #4424]
-
Fix the problem of incorrect results when Inline view is included in the Left join operation [FixTupleIsNull miss in SelectStmt resultExpr #4279]
-
[[MySQL Compatibility 2/4][Bug] Fix bug and improve compatibility with mysql protocol #4362]
select database() no longer returns the cluster qualified name, and fix the problem that select user() does not display the user ip
-
Fix the problem that the number of table copies displayed by show create table is incorrect for tables that use the dynamic partition function [FIX: fix dynamic partition replicationNum error #4393]
-
Fix the inconsistent precision of decimal, char and varchar columns in the base table and the materialized view in the materialized view [Keep the scale and precision of type when creating mv #4436]
-
Fix the problem of wild pointer in PlanFragmentExecutor, fix the problem of null pointer when importing in json format [Fix core issue of 4447 and change declare order for compatibility #4448]
-
Fixed the problem that some remaining tablet directories on BE were not cleared [[Bug-Fix] Some deleted tablets are not recycled on BE #4401]
-
Fix some issues with Spark Load [[Spark load] Fix dpp and submit push task bugs #4464]
-
Fix the problem that the balance of the colocation table cannot be completed [[Colocation] Fix Colocation balance endless loop bug #4471]
-
Fix MemIndex::load_segment possible memory copy exception problem [[Bug] Fix bug that memory copy may overflow in MemIndex::load_segment #4458]
-
Fix the problem of BE crashing when using Load Error Hub function when WITH_MYSQL compilation option is not added [[Bug] Fix bug of load error hub and schema change #4486]
-
Fix the problem of execution error when using @@sql_mode environment variable in SQL [[Bug] Fix bug of select @@sql_mode #4484]
-
Fix the problem that Spark Load and Broker Load split the same columns inconsistently [[Spark load][Bug] Fix column terminator for spark load #4491]
-
Fix the problem of BE downtime caused by querying the information_schema.columns table [[Bug] Fix bug that BE will crash when querying information_schema.columns #4511]
-
Fix some issues in the persistence of rowset metadata in historical versions [[BUG] Fix recover persistent stale rowsets bug from multi-single version rowsets in stale rowsets #4513]
-
Fix the problem of inconsistent behavior of the str_to_date() function on the FE side and the BE side [[Bug] function str_to_date()'s behavior on BE and FE is inconsistent #4495]
-
Fixed the issue where BE was down due to some historical data conversion when performing linked schema change [[BUG] Fix segment group add zone map bug when schema change. #4526]
-
Fix the problem that Spark Load stays in the ETL stage after FE restart [[Spark Load] [Bug] Load job's state will stay in ETL state all the time after FE restart #4528]
-
Fixed an issue that caused unreadable data when the delete condition contained "\n" [[BUG] Tablet is not readable and delete handler report -1903 error, when condition value contains \n #4531]
-
Fix the problem that a Spark Load job in the PENDING state cannot be cancelled [[Spark load][Bug] fix that cancelling a spark load in the PENDING phase will not succeed #4536]
-
Fix the problem of inconsistent behavior when splitting columns between Spark Load and other import methods [[Spark load][Bug] fix that cancelling a spark load in the PENDING phase will not succeed #4536]
-
Fix the problem that net.sourceforge.czt.dev cannot be found when compiling the FE module [[Compile] Add pluginRepository for java-cup-plugins #4636]
-
Fix the problem that the statement parsing fails when the cast function exists in the case when statement [[Bug] Fix analysis error when there are different types in case-when-then-else with group by clause #4646]
-
Fix the problem that all queries will fail when there is a problem with the RPC of a certain BE [Fix all queries failed when one BE network or disk has issue #4651]
-
Fixed the issue that related import transactions were not cleaned up after the BE node went down [[BUG] Fix transaction not be cleared after BE down. #4661]
-
Fix the problem that the column types of the columns table of information_schema are not compatible with MySQL [Fix DATA_TYPE in information_schema.columns is not compatible to mysql meta #4648]
-
Fix the problem of SQL Cache access out of bounds [[Docs] Supply BE config docs of setting and examples #4641]
-
Fix the problem that import throws a null pointer exception when there is no partition in the table [[Bug] Fix that the partitions of a dynamic-partitioned table has not been created at the time of load or insert #4658]
-
Fix an error when tools/show_segment_status access external tables [[Bug]External engines(e.g. ES) don't have segments, ignore those tables #4671]
-
Fix the problem that the columns table of information_schema does not display column comments [[Bug]Fix information_schema.columns table column_comment does not show #4676]
-
Fix the issue that the delete on clause may not take effect in Routine Load [Fix delete on clause may not work in routineLoad #4683]
-
Fix the problem that hidden columns (delete flag column, etc.) may be lost after schema change [Fix hidden cloumn may disappeared #4686]
-
Fix the problem that the window function lag()/lead() reports an error when matching the decimal type [Fix Windows function lag()/lead() function throw AnalysisException. #4666]
-
Fix the problem that the client is stuck in high concurrency scenarios when using MySQL NIO Server [Fix mysqlslap hang under high concurrent #4680]
-
Fix the problem of always reporting out of date in tablet report [[Bug] Fix bug that tablet report always out of date #4695]
-
Fix the problem of duplicate columns in case when statement after query planning [[Bug] Fix duplicate columns in case when statement #4693]
-
Fix the problem that the rand() function generates the same random value every time [Fix rand() function return same value #4709]
-
Fix the problem of query error caused by incorrect column cardinality statistics [[Bug] Fix hard cardinality check which makes queries fail #4678]
-
Fix the problem of BE crashing due to an error in the split_part function [[Bug] Fix the core problem of function split_part and add the UT of core case #4721]
-
Fix the problem of query execution error when SQL statement contains constant subquery [[Bug] Add regular column when materialized slot is empty in tuple #4719]
-
Fix the problem of join query error when the table contains the delete tag column [[BUG] Fix join error when the table has enbale batch delete #4734]
-
Fix the problem of syntax parsing errors when the CTE statement contains nested subqueries [[Bug]Fix bug CTE statement with nested select #4731]
-
Fix the problem of lead/lag type matching error in window function [[BUG] Ensure that the correct lead/lag function is selected #4732]
-
Fix the problem that tablet cannot be selected correctly when selecting tablet for compaction [[Bug] Fix bug of cumulative compaction and deletion of stale version #4593]
-
Fix the problem that limit conditions are incorrectly pushed down to the odbc external table and Es external table [[Bug] Do not push down limit operation when ODBC table do not push all conjunct as filter. #4764] [[Doris On ES][Bug-Fix] Can not pushdown limit when some predicate not processed by ES #4768]
-
Fix the problem that the compaction thread stops working [[Bug][Compaction] Fix bug that compaction may be blocked #4750]
-
Fix the problem that the timeout idle connection is not automatically killed in some cases [[Bug] Fix Bug that fe's connection which is timed out can't be released #4774]
-
Fix the problem of error when querying tables with delete flag column when SQL contains join [Fix delete_sign predicate assing to join node #4770]
-
Fix the calculation results of some time functions in FE to keep the results consistent with BE calculations [[Bug] Fix some date functions to make their result same as MySQL #4786]
-
Fix the issue that BE crashes when displaying tablet information on BE web page [[Bug] Fix bug and optimize implementation logic of tablets web page #4775]
-
Fix the type conversion problem of filter conditions on time-type columns, so that the condition value is correctly converted to the corresponding time type [[BUG] Cast int type to date type #4806]
-
Fixed the problem of repeatedly creating hidden columns when creating Rollup [Fix create rollup may duplicate hidden column #4816]
-
Fix the problem of hidden sequence column not displaying [[Bug] Sequence column should be visible when show_hidden_columns = true #4818]
-
Fix the problem of incorrect query results of some union statements [[Bug] Fix union bug (#4772) #4807]
-
Fixed an issue where offline node tasks could not be completed in some cases [[TabletScheduler] Fix some bug where decommission operations cannot be completed #4804]
-
Intelligently identify illegal date constants during SQL parsing to avoid query scanning all partitions [[FEATURE]Check date type to avoid scan all partitions #4756]
-
Fix the problem that BE crashes when the BE side selects the tablet for compaction without locking [[Compaction][Bug-Fix] Fix bug that meta lock need to be held when calculating compaction score #4829]
-
Fix some front-end display issues and back-end cookie processing logic issues in the new version of the UI [[FE UI] Fix some bugs about new FE UI #4830]
-
Fix the query error in which a tablet could not be found when the SQL contains both UNION and Colocation Join [[Bug][SQL] Fix bug that query failed when SQL contains Union and Colocate join #4842]
-
Fix the problem that the exception was not caught correctly when submitting an import task failed because the task queue was full [[BUG] Catch retry submit exception #4796]
-
Fix the problem of Broker Load job scheduling. Avoid the problem that some jobs cannot be scheduled after submission [[Bug] Fix some bugs of load job scheduler #4869]
-
Avoid forwarding commands to the Master FE before the Master FE has started [【Improvement】Avoid null host when forward to master #4844]
-
Ignore Parquet and ORC format empty files when importing to avoid reading errors [[Broker Load] Ignore empty file when file format is parquet or orc. #4810]
-
Fix the problem that the materialized view name conflict is not checked when renaming the OLAP table [[Bug] Rename table logic error #4870]
-
Fix the problem that the creation fails when using complex SQL to create a logical view [[Bug] Fix bug that failed to create view with complex select stmt #4840]
-
Fixed an issue where Routine Load could not end the task correctly due to reading empty messages when consuming Kafka data [[Bug] Fix bug that routine load blocked with TOO_MANY_TASKS error #4861]
-
Fix the problem that some column names are not recognized when using CTE syntax [[Bug] Fix bug #4886 and #4586 by refactoring code of method 'getDbs' #4887]
-
Fix the problem that the content of the columns table of the Information_schema library is incorrect [[BUG] Fix field error in information_schema.columns #4858]
-
Fix the problem that BitmapValue serialization fails when only 32-bit integers are included in the implementation of BitmapValue on the FE side [(#4883) Java Version BitmapValue deserialized failed #4884]
-
Fix that when calculating BE disk usage, all disk space not used by Doris in the node is incorrectly included. This will cause calculation errors during the Decommission operation [[BUG] modify isDecommissioned be capacity calculate rule #4889]
-
Fix the problem that an additional column may be added incorrectly when only constant expressions are included in the SELECT list [Avoid duplicate column when adding slot in empty tuple #4901]
-
Fix the problem that the Thrift Server type on the FE side and the BE side are inconsistent and cause communication failure [[Bug] Fix bug that be thrift client cannot connect to fe thrift server when fe thrift server use TThreadedSelectorServer model #4908]
-
During partition pruning, ignore the filter conditions on non-columns [[Bug] Fix partition prune (#4833) #4921]
-
Fix the problem that the log directory is created incorrectly in the start_fe.sh startup script [fix the FE logs dir create issue #4929]
-
Fix the problem that some NULL values are not displayed when using CTE syntax [[Bug] Fix the bug of NULL do not show in CTE statement. #4932]
-
Fix the problem that Colocation Group is always in unstable state when some BE nodes are down [[BUG] Fix Colocate table balance bug #4936]
-
It is forbidden to create a table in Segment V1 format [disable the creation of segment v1 table #4913]
-
Fix the problem that Bool type conditions are processed incorrectly when Doris queries ES data [[Doris On ES][Bug-fix] fix boolean predicate pushdown manner #4990]
-
Fix a problem of Tablet Shard lock on BE side [[Bug] Fix concurrent access of _tablets_under_clone in TabletManager #5000]
-
Fix the problem of ConcurrentModificationException that may appear on the FE side when deleting a table that is being imported [(#5002) ConcurrentModificationException when finish transaction #5003]
-
Fix the problem of incorrect return type of str_to_date function [[Function] Let "str_to_date" return correct type #5004]
-
Fix the problem that the precision of some floating point types is lost when importing Json format data [[Bug] Fix the bug of Largetint and Decimal json load failed. #4983]
-
Fix the problem of incomplete query results when using Union to connect multiple external tables to query [[Bug] Fix bug that query multi mysql external table with union will get incomplete result #5067]
-
Fix the problem that the query result is incorrect when the SQL contains multiple in conditions [[Bug] Fix the bug of where condition a in ('A', 'B', 'V') and a in ('A') return error result #5072]
-
Fix a problem that the order of Profile destruction caused BE downtime [[Bug] Fix a core dump of counter in BE #5078]
-
Fix the problem of memory leakage when importing Json format data [[Bug] Fix Memory Leak in Json Load #5073]
-
Fix the problem that Colocation balance logic occupies 100% CPU when there is no BE node [[BUG] Fix colocate balance bug when no available BE #5079]
-
Fix the issue that creating a new tablet may cause BE downtime [[Bug] Fix coredump bug when create new tablets #5089]
-
Fixed the problem that the shared pointer circular reference caused the tablet to be unable to be cleared and occupied disk space [[Bug] Fix tablet shared ptr circular reference causing the tablet not to be cleared #5100]
-
Fix the issue that the BE will crash when the is null condition is included in the delete condition [[Bug] Fix bug when delete condition is null but zonemap is not null #5109]
-
Fix a problem with Partition Cache hit strategy [[Bug-Fix] Fix partition cache match bug #5060]
-
Optimize the strategy of Spark Load to read Hive tables to avoid full scanning of Hive tables [avoid to read whole hive table when spark load from hive table #5047]
-
Added support for the Ninja build system to speed up the compilation of BE [support ninja build system #5076]
-
Optimize the efficiency of importing data in Json format [[enhancement]improve performance of json load #5055]
-
Support FE to directly use thrift protocol to transmit heartbeat information to avoid heartbeat blocking failure that may be caused by http communication model [ Support fe heartbeat use thrift protocol to get stable response #5027]
-
Simplify the opening logic of the dynamic partition function, and prohibit hourly partitioning for date type columns [Forbidden creating table with dynamic partition when FE.config dynamic_partition_enable=false #5043]
-
Support to view Broker Load Profile through FE Web page [[Enhance] Add profile for load job #5052]
-
When viewing Resource information, clear text password is no longer displayed [[ODBC] ODBC Catalog do not show password in 'show resource' #5088]
-
The BE side adds trace information for tablet creation to help locate the problem of slow tablet creation [[Trace] Add trace for create tablet tasks #5091]
-
Fix the issue that may cause data loss when Routine Load consumes Kafka data in some cases [[Bug] Fix bug that routine load may lost some data #5093]
-
Fix the problem that desc statement to view all materialized views may return Malformed packet [[Bug-Fix] Fix 'Malformed packet' error when desc OlapTable with Rollup #5115]
-
Fix the issue that may cause BE to crash when BE starts loading the data directory [[Bug] Fix old tablet inserting bug #5113]
-
Fix the problem that non-Master FE repeatedly sends non-query requests to Master FE [[BUG] Follower shouldn't forward non-query statement to master repeatedly #5160]
-
Fix the problem of partition cache hit logic error [[Bug] Hit none partition cache, but hit range is still right #5065]
-
Fixed an error when bucket join was executed on an empty table [[Bug-Fix] Bucket shuffle join executes failed when two tables have no data #5145]
-
Fix the problem that the percentile_approx function returns the wrong result [[Bug-Fix] Fix the bug of PERCENTILE_APPROX return error result nan and add PERCENTILE_APPROX UT #5172]
-
Fix the problem of the calling sequence of Olap Scanner thread ending [5111]
-
Fixed an error when creating the colocation attribute for an empty partitioned table [Fix create colocate table bug #5139]
-
Fixed an error when querying materialized views in CTE statement [Fix MaterializedView select with CTE bug #5165]
-
Fix the problem that the min/max functions do not handle null values of string type columns correctly [[Bug] Fix bug that the min/max function has an error in handling string null values #5189]
-
Modify the string encoding in Spark-Doris-Connector to utf8 [[Spark on Doris] fix the encode of varchar when convertArrowToRowBatch #5202]
-
Fix the problem that delete column may be added repeatedly in routine load [Fix duplicated add delete condition when run routine load #5222]
-
Fix bucket shuffle join bug [[Bug] Fix bucket shuffle join bug of query failed #5228]
-
Fix the issue that the ALTER ROUTINE LOAD operation is invalid for some parameters [[BUG] fix alter routine load not work #5257]
-
Fixed an issue where metadata signatures of different tables may be the same during backup and recovery operations [[Bug] Remove schema hash and fix bug of calculating table signature #5254]
-
Fix the problem that Colocate Join and Buckets shuffle join may cause data to be scanned repeatedly [[Bug] Colocate Join and Bucket shuffle join may scan some tablet twice time. #5256]
-
Fix the issue of metadata errors caused by unchecked log id when FE pushes metadata [Add some consistency check in image put api #5219]
-
Fix the problem that aggregate queries handle -0.0 incorrectly [[Bug] Fix row_number and group by have inconsistent partition results for (0.0, -0.0) #5226]
-
Fix outer join query error [[Bug] Fix bug of outer join cause error result #5285]
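The -0.0 item in the list above (#5226) touches a general floating-point pitfall: under IEEE 754, 0.0 and -0.0 compare equal but have different bit patterns. If one operator groups rows by value comparison and another by hashing the raw bytes, the two can partition the same rows differently. The Python sketch below only illustrates that pitfall and one common remedy (normalizing the zero before hashing). The helper names are hypothetical, and this is not a description of how Doris implements the fix.

```python
import struct

def raw_bits(x: float) -> bytes:
    """Return the 8-byte big-endian IEEE 754 encoding of x."""
    return struct.pack(">d", x)

def normalize_zero(x: float) -> float:
    """Map -0.0 to 0.0 so both zeros share one encoding; other values pass through."""
    return 0.0 if x == 0.0 else x

def group_by_bits(values, normalize=False):
    """Group values by raw encoding, optionally normalizing zeros first."""
    groups = {}
    for v in values:
        key = raw_bits(normalize_zero(v) if normalize else v)
        groups.setdefault(key, []).append(v)
    return groups

values = [0.0, -0.0, 1.5]
zeros_compare_equal = (0.0 == -0.0)                        # value comparison sees one zero
naive_groups = len(group_by_bits(values))                  # byte-wise grouping sees two zeros
fixed_groups = len(group_by_bits(values, normalize=True))  # normalized grouping sees one zero
```

With the three sample values, byte-wise grouping yields one more group than a comparison-based grouping would, and normalizing the zero removes the discrepancy.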
Other
-
Add license declarations for some non-Apache-licensed code to the NOTICE file [Add other license declare in NOTICE #4831]
-
Reformatted the code of BE using clang-format [Clang-format cpp sources #4965]
-
Added clang-format checking and formatting scripts to unify the C++ code style of BE before submission [Add clang-format script #4934]
-
The third-party library adds the AWS S3 SDK, which can be used to directly read the data in the object storage through the SDK [add aws sdk to thirdparty #5234]
-
Fixed some issues related to License [[License] Organize and modify the license of the code #4371]
-
The dependencies on two third-party libraries, MySQL client and LZO, are no longer enabled in the default compilation options. Users who need the MySQL external table function must enable it explicitly
-
Removed the js and css code from the code repository and introduced it as a third-party library dependency
-
Updated the Docker development environment image build-env-1.2
-
Updated the compilation method of the UnixODBC third-party library, so that the BE process no longer depends on the system's libltdl.so dynamic library at runtime
-
Added third-party UDF to support more efficient set calculation of orthogonal bitmap data [Add bitmap longitudinal cutting udaf #4198]
-
Added UnixODBC third-party library dependency to support ODBC external table function [[ODBC SCAN NODE] 1/4 Add unix odbc library. #4377]
API Change
- Prohibit the creation of segment v1 tables [disable the creation of segment v1 table #4913]
- Rename the configuration item streaming_load_max_batch_size_mb to streaming_load_json_max_mb [Change config name 'streaming_load_max_batch_size_mb' to 'streaming_load_json_max_mb' #4791]
- Support column reference passing in column definition of load statement [Improve the processing logic of Load statement derived columns #5140]
- Support creating indexes on the value column of unique table [Support create index on unique value column #5305]
- Support atomic replacement of two tables through replace statement [[Feature] Support REPLACE TABLE operation. #4669]
- Support CREATE TABLE LIKE statement
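Because the configuration item rename listed above changes a key name, an existing configuration file that still sets the old key may need updating when upgrading. The following is a minimal Python sketch of such a rewrite. It assumes the file uses plain `key = value` lines with `#` comments; the helper name `migrate_conf` and the sample content are hypothetical and not part of Doris.

```python
import re

OLD_KEY = "streaming_load_max_batch_size_mb"
NEW_KEY = "streaming_load_json_max_mb"

def migrate_conf(text: str) -> str:
    """Rename OLD_KEY to NEW_KEY on 'key = value' lines; comments and other keys are untouched."""
    pattern = re.compile(
        r"^([ \t]*)" + re.escape(OLD_KEY) + r"([ \t]*=)",
        re.MULTILINE,
    )
    return pattern.sub(lambda m: m.group(1) + NEW_KEY + m.group(2), text)

# Hypothetical sample content, for illustration only.
sample = (
    "# streaming_load_max_batch_size_mb = 50\n"
    "streaming_load_max_batch_size_mb = 100\n"
    "other_key = 1\n"
)
migrated = migrate_conf(sample)
```

The commented-out line is deliberately left alone, so the rewrite only affects settings that are actually active.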
Credits
924060929
acelyc111
Astralidea
benbiti
blueChild
caiconghui
caoyang10
ccoffline
coalchan
Dam1029
e0c9
EmmyMiao87
gengjun-git
HangyuanLiu
HappenLee
hffariel
jollykingCN
kangkaisen
killxdcj
lihuigang
liutang123
luozenglin
marising
mengqinghuan
morningman
nimuyuhan
Nivane
pengxiangyu
px-l
qidaye
sduzh
Skysheepwang
songchuangyuan
stalary
stdpain
Sunt-ing
vagetablechicken
vergilchiu
wangbo
wangxiaobaidu11
weizuo93
WingsGo
wutiangan
wuyunfeng
xinghuayu007
xinyiZzz
Xpray
xy720
yangzhg
Youngwb
yxqweasd
zh0122
ZhangYu0123
zhaojintaozhao
xxiao2018
bookeezhou
JNSimba
yuliangwan