[fix](Nereids) cherry pick some pr to branch-2.1 #33464
…ache#32991) * The default delete bitmap cache is set to 100MB, which can be insufficient and cause performance issues when the amount of user data is large. To mitigate the problem of an inadequate cache, we will take the larger of 5% of the total memory and 100MB as the delete bitmap cache size.
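The sizing rule described above can be sketched as follows. This is a minimal illustration, not the actual Doris code: the function name `delete_bitmap_cache_bytes` and the constant `kMinDeleteBitmapCacheBytes` are hypothetical, and 100MB is assumed to mean 100 * 1024 * 1024 bytes.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical constant: the 100MB floor mentioned in the commit message.
constexpr int64_t kMinDeleteBitmapCacheBytes = 100LL * 1024 * 1024;

// Hypothetical helper: take the larger of 5% of total memory and 100MB.
int64_t delete_bitmap_cache_bytes(int64_t total_memory_bytes) {
    int64_t five_percent = total_memory_bytes / 20;  // 5% of total memory
    return std::max(five_percent, kMinDeleteBitmapCacheBytes);
}
```

On a machine with 1 GiB of memory the floor of 100MB applies; on a machine with 64 GiB the 5% share (about 3.2 GiB) is used instead.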
…ctions (fe part) (apache#33087) * cse fe part
Problem: when ntile is called with 0 as its parameter, the BE crashes (core dump) because the parameter is never checked. Solution: check the parameter during FE analysis.
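A minimal sketch of this kind of up-front check is shown below. It is not the Doris implementation: `check_ntile_buckets` and `ntile_bucket` are hypothetical names, and the assumption that the bucket count ends up as a divisor is only one plausible reason why an unchecked 0 would be fatal.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical validation step: reject a non-positive bucket count early,
// before it can reach the code that divides by it.
void check_ntile_buckets(int64_t num_buckets) {
    if (num_buckets <= 0) {
        throw std::invalid_argument("ntile requires a positive bucket count");
    }
}

// Map a 0-based row index to a 1-based bucket. The first
// (num_rows % num_buckets) buckets each receive one extra row.
// Assumes 0 <= row_index < num_rows.
int64_t ntile_bucket(int64_t row_index, int64_t num_rows, int64_t num_buckets) {
    check_ntile_buckets(num_buckets);
    int64_t base = num_rows / num_buckets;   // minimum rows per bucket
    int64_t extra = num_rows % num_buckets;  // number of larger buckets
    int64_t boundary = extra * (base + 1);   // rows covered by larger buckets
    if (row_index < boundary) {
        return row_index / (base + 1) + 1;
    }
    return extra + (row_index - boundary) / base + 1;
}
```

With 10 rows and 3 buckets, rows 0 to 3 land in bucket 1, rows 4 to 6 in bucket 2, and rows 7 to 9 in bucket 3, while a bucket count of 0 is rejected with an exception instead of reaching the division.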
…tself (apache#33089) Previously, strings_pool was allocated within each tree node. However, due to the Arena's alignment of allocated chunks to at least 4K, this allocation size was excessively large for a single tree node. Consequently, when there are numerous nodes within the SubcolumnTree, a significant portion of memory was wasted. Moving strings_pool to the tree itself optimizes memory usage and reduces wastage, improving overall efficiency.
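The memory effect can be estimated with a rough model. The sketch below assumes that every pool holding any data reserves whole 4K chunks, which simplifies the real Arena growth policy; `reserved_bytes`, `per_node_pools`, and `shared_pool` are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Assumed minimum chunk size, taken from the "at least 4K" in the description.
constexpr std::size_t kMinChunkBytes = 4096;

// Bytes reserved by one pool that stores `used` bytes, rounded up to whole chunks.
std::size_t reserved_bytes(std::size_t used) {
    return (used + kMinChunkBytes - 1) / kMinChunkBytes * kMinChunkBytes;
}

// One pool per tree node: every node pays the rounding overhead separately.
std::size_t per_node_pools(const std::vector<std::size_t>& bytes_per_node) {
    std::size_t total = 0;
    for (std::size_t used : bytes_per_node) {
        total += reserved_bytes(used);
    }
    return total;
}

// One pool owned by the tree: the rounding overhead is paid once.
std::size_t shared_pool(const std::vector<std::size_t>& bytes_per_node) {
    std::size_t used = 0;
    for (std::size_t bytes : bytes_per_node) {
        used += bytes;
    }
    return reserved_bytes(used);
}
```

Under this model, 1000 nodes that each store 16 bytes reserve 4,096,000 bytes with per-node pools but only 16,384 bytes with a single shared pool.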
…ument (apache#32746) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
…pache#32617) This PR improves the performance of the Nereids planner in the plan stage.

1. Refactor the expression rewriter to use pattern matching, so that the many expression rewrite rules can be applied in an interleaved fashion within one big bottom-up iteration, rewriting until the expression becomes stable. This handles more cases, because originally there was no loop and some rules, such as `SimplifyArithmeticRule`, processed only the top expression.
2. Replace `Collection.stream()` with `ImmutableXxx.Builder` to avoid useless method calls.
3. Unroll loops in some code, such as `Expression.<init>` and `PlanTreeRewriteBottomUpJob.pushChildrenJobs`.
4. Use type/arity-specific code, such as `OneRangePartitionEvaluator.toNereidsLiterals()`, `PartitionRangeExpander.tryExpandRange()`, and `PartitionRangeExpander.enumerableCount()`.
5. Refactor `ExtractCommonFactorRule` so that it can extract more cases, and fix the dead loop that occurred when `ExtractCommonFactorRule` and `SimplifyRange` were used in the same iteration, because `SimplifyRange` generates a right-deep tree but `ExtractCommonFactorRule` generates a left-deep tree.
6. Refactor `FoldConstantRuleOnFE` to support both visitor and pattern-match modes: in ExpressionNormalization, pattern matching can be interleaved with other rules; in PartitionPruner, the visitor can evaluate expressions faster.
7. Lazily compute and cache some operations.
8. Use an int field to compare dates.
9. Use a BitSet to find disableNereidsRules.
10. A two-level loop is usually faster than building a Multimap when binding slots in Scope, so that code is reverted.
11. `PlanTreeRewriteBottomUpJob` no longer needs clearStatePhase.

### test case

100 threads continuously send the following SQL in parallel; it queries an empty table. Tested on my Mac (M2 chip, 8 cores) with the SQL cache enabled.

```sql
select count(1), date_format(time_col,'%Y%m%d'), varchar_col1
from tbl
where partition_date>'2024-02-15'
  and (varchar_col2 ='73130' or varchar_col3='73130')
  and time_col>'2024-03-04' and time_col<'2024-03-05'
group by date_format(time_col,'%Y%m%d'), varchar_col1
order by date_format(time_col,'%Y%m%d') desc, varchar_col1 desc, count(1) asc
limit 1000
```

before this PR: 3100 peak QPS, about 2700 avg QPS
after this PR: 4800 peak QPS, about 4400 avg QPS

(cherry picked from commit 7338683)
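The idea behind item 1, applying several rewrite rules bottom-up until the expression stops changing, can be sketched on a toy expression tree. Nereids itself is written in Java; this C++ sketch is not its implementation, and every name in it (`Expr`, `Rule`, `fold_constants`, `drop_add_zero`, `drop_mul_one`, `rewrite_bottom_up`) is hypothetical. The three rules only ever shrink the tree, which is what guarantees termination here.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <vector>

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr {
    enum Kind { Literal, Variable, Add, Mul };
    Kind kind = Literal;
    int value = 0;        // used by Literal
    std::string name;     // used by Variable
    ExprPtr left, right;  // used by Add and Mul
};

ExprPtr lit(int v) { auto e = std::make_shared<Expr>(); e->value = v; return e; }
ExprPtr var(const std::string& n) {
    auto e = std::make_shared<Expr>(); e->kind = Expr::Variable; e->name = n; return e;
}
ExprPtr binary(Expr::Kind k, ExprPtr l, ExprPtr r) {
    auto e = std::make_shared<Expr>(); e->kind = k; e->left = l; e->right = r; return e;
}

// A rule returns the rewritten expression, or nullptr if it does not match.
using Rule = std::function<ExprPtr(const ExprPtr&)>;

ExprPtr fold_constants(const ExprPtr& e) {  // 2 + 3 -> 5, 2 * 3 -> 6
    bool is_binary = e->kind == Expr::Add || e->kind == Expr::Mul;
    if (!is_binary || e->left->kind != Expr::Literal || e->right->kind != Expr::Literal) {
        return nullptr;
    }
    return lit(e->kind == Expr::Add ? e->left->value + e->right->value
                                    : e->left->value * e->right->value);
}
ExprPtr drop_add_zero(const ExprPtr& e) {  // x + 0 -> x
    bool match = e->kind == Expr::Add && e->right->kind == Expr::Literal && e->right->value == 0;
    return match ? e->left : nullptr;
}
ExprPtr drop_mul_one(const ExprPtr& e) {  // x * 1 -> x
    bool match = e->kind == Expr::Mul && e->right->kind == Expr::Literal && e->right->value == 1;
    return match ? e->left : nullptr;
}

// Rewrite the children first, then try every rule on the current node.
// Whenever a rule fires, rewrite the result again, so the process only
// stops once no rule matches anywhere (the expression is stable).
ExprPtr rewrite_bottom_up(const ExprPtr& e, const std::vector<Rule>& rules) {
    ExprPtr current = e;
    if (e->kind == Expr::Add || e->kind == Expr::Mul) {
        current = binary(e->kind, rewrite_bottom_up(e->left, rules),
                         rewrite_bottom_up(e->right, rules));
    }
    for (const Rule& rule : rules) {
        if (ExprPtr next = rule(current)) {
            return rewrite_bottom_up(next, rules);
        }
    }
    return current;
}
```

For `(x * 1) + (2 + -2)`, no rule matches the top-level node at first, so a single pass over the top expression would leave it unchanged. The bottom-up loop rewrites the children to `x` and `0`, after which `drop_add_zero` fires at the top and the result is just `x`.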
apache#32617 introduced a bug: the rewrite may not work when the plan's arity is >= 3. This PR fixes it. (cherry picked from commit 8b070d1)
Fix the failure in regression_test/suites/query_p0/group_concat/test_group_concat.groovy. The query select group_concat( distinct b1, '?'), group_concat( distinct b3, '?') from table_group_concat group by b2 throws the exception: lowestCostPlans with physicalProperties(GATHER) doesn't exist in root group. The root cause is that '?' is pushed down to a slot by NormalizeAggregate; AggregateStrategies then treats the slot as a distinct parameter and generates an invalid PhysicalHashAggregate, which is rejected by ChildOutputPropertyDeriver. I fix this bug by not pushing literals down to slots in NormalizeAggregate, and by forbidding the generation of a stream aggregate node when the group by slots are empty.
This SQL fails because:
2 in the group by is bound to 1 as col2 in BindExpression
ResolveOrdinalInOrderByAndGroupBy then replaces 1 with MIN (LENGTH (cast(age as varchar)))
CheckAnalysis throws an exception because group by cannot contain an aggregate function
select MIN (LENGTH (cast(age as varchar))), 1 AS col2
from test_bind_groupby_slots
group by 2
We should move ResolveOrdinalInOrderByAndGroupBy into BindExpression.
(cherry picked from commit 3fab449)
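The failure comes from resolving an ordinal twice. A minimal sketch of resolving each group-by key exactly once is shown below; it is not the Nereids code, and `SelectItem`, `GroupKey`, and `resolve_group_key` are hypothetical names. The key point is that whether a key is an ordinal is decided from what the user wrote, so the literal 1 produced by resolving ordinal 2 is never treated as an ordinal again.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model of one item in the select list.
struct SelectItem {
    std::string expr;   // expression text
    bool is_aggregate;  // true if the expression contains an aggregate function
};

// Hypothetical model of a group-by key as written by the user.
struct GroupKey {
    bool is_ordinal;      // true for keys such as "group by 2"
    std::size_t ordinal;  // 1-based position, valid when is_ordinal
    std::string expr;     // expression text, valid when !is_ordinal
};

// Resolve a key in a single step. The returned expression is final and is
// not inspected for ordinals a second time.
std::string resolve_group_key(const GroupKey& key,
                              const std::vector<SelectItem>& select_list) {
    if (!key.is_ordinal) {
        return key.expr;
    }
    if (key.ordinal == 0 || key.ordinal > select_list.size()) {
        throw std::out_of_range("GROUP BY position is not in the select list");
    }
    const SelectItem& item = select_list[key.ordinal - 1];
    if (item.is_aggregate) {
        throw std::invalid_argument("GROUP BY cannot contain an aggregate function");
    }
    return item.expr;
}
```

For the query above, `group by 2` resolves to the literal `1` from `1 AS col2` and stops there, whereas `group by 1` would be rejected because the first select item is an aggregate.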
9b424bf to 693690a
clang-tidy review says "All clean, LGTM! 👍"
clang-tidy made some suggestions
| // should NOT behave like two column arguments, so we can not use const column default implementation | ||
| bool use_default_implementation_for_constants() const override { return false; } | ||
|
|
||
| Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments, |
warning: function 'execute_impl' exceeds recommended size/complexity thresholds [readability-function-size]
Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
^
Additional context
be/src/vec/functions/function_truncate.h:79: 161 lines including whitespace and comments (threshold 80)
Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
^
| res = Dispatcher<FieldType, RoundingMode::Trunc, | ||
| TieBreakingMode::Auto>::apply_vec_const(col_general, | ||
| scale_arg); | ||
| return true; |
warning: redundant boolean literal in conditional return statement [readability-simplify-boolean-expr]
be/src/vec/functions/function_truncate.h:103:
- if constexpr (IsDataTypeNumber<DataType> || IsDataTypeDecimal<DataType>) {
- using FieldType = typename DataType::FieldType;
- res = Dispatcher<FieldType, RoundingMode::Trunc,
- TieBreakingMode::Auto>::apply_vec_const(col_general,
- scale_arg);
- return true;
- }
-
- return false;
+ return IsDataTypeNumber<DataType> || IsDataTypeDecimal<DataType>;
| using FieldType = typename DataType::FieldType; | ||
| res = Dispatcher<FieldType, RoundingMode::Trunc, TieBreakingMode::Auto>:: | ||
| apply_vec_const(column_general.column.get(), scale_arg); | ||
| return true; |
warning: redundant boolean literal in conditional return statement [readability-simplify-boolean-expr]
be/src/vec/functions/function_truncate.h:145:
- if constexpr (IsDataTypeNumber<DataType> || IsDataTypeDecimal<DataType>) {
- using FieldType = typename DataType::FieldType;
- res = Dispatcher<FieldType, RoundingMode::Trunc, TieBreakingMode::Auto>::
- apply_vec_const(column_general.column.get(), scale_arg);
- return true;
- }
-
- return false;
+ return IsDataTypeNumber<DataType> || IsDataTypeDecimal<DataType>;
| using FieldType = typename DataType::FieldType; | ||
| res = Dispatcher<FieldType, RoundingMode::Trunc, TieBreakingMode::Auto>:: | ||
| apply_const_vec(&const_col_general, column_scale.column.get()); | ||
| return true; |
warning: redundant boolean literal in conditional return statement [readability-simplify-boolean-expr]
be/src/vec/functions/function_truncate.h:180:
- if constexpr (IsDataTypeNumber<DataType> || IsDataTypeDecimal<DataType>) {
- using FieldType = typename DataType::FieldType;
- res = Dispatcher<FieldType, RoundingMode::Trunc, TieBreakingMode::Auto>::
- apply_const_vec(&const_col_general, column_scale.column.get());
- return true;
- }
-
- return false;
+ return IsDataTypeNumber<DataType> || IsDataTypeDecimal<DataType>;
| using FieldType = typename DataType::FieldType; | ||
| res = Dispatcher<FieldType, RoundingMode::Trunc, TieBreakingMode::Auto>:: | ||
| apply_vec_vec(column_general.column.get(), column_scale.column.get()); | ||
| return true; |
warning: redundant boolean literal in conditional return statement [readability-simplify-boolean-expr]
be/src/vec/functions/function_truncate.h:213:
- if constexpr (IsDataTypeNumber<DataType> || IsDataTypeDecimal<DataType>) {
- using FieldType = typename DataType::FieldType;
- res = Dispatcher<FieldType, RoundingMode::Trunc, TieBreakingMode::Auto>::
- apply_vec_vec(column_general.column.get(), column_scale.column.get());
- return true;
- }
- return false;
+ return IsDataTypeNumber<DataType> || IsDataTypeDecimal<DataType>;
| Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments, | ||
| size_t result, size_t input_rows_count) const override { |
warning: method 'execute_impl' can be made static [readability-convert-member-functions-to-static]
| Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments, | |
| size_t result, size_t input_rows_count) const override { | |
| static Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments, | |
| size_t result, size_t input_rows_count) override { |
|
|
||
| size_t get_number_of_arguments() const override { return 1; } | ||
|
|
||
| DataTypePtr get_return_type_impl(const DataTypes& arguments) const override { |
warning: method 'get_return_type_impl' can be made static [readability-convert-member-functions-to-static]
| DataTypePtr get_return_type_impl(const DataTypes& arguments) const override { | |
| static DataTypePtr get_return_type_impl(const DataTypes& arguments) override { |
| Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments, | ||
| size_t result, size_t input_rows_count) const override { |
warning: method 'execute_impl' can be made static [readability-convert-member-functions-to-static]
| Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments, | |
| size_t result, size_t input_rows_count) const override { | |
| static Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments, | |
| size_t result, size_t input_rows_count) override { |
|
|
||
| // NOTE: This function is only tested for truncate | ||
| // DO NOT USE THIS METHOD FOR OTHER ROUNDING BASED FUNCTION UNTIL YOU KNOW EXACTLY WHAT YOU ARE DOING !!! only test for truncate | ||
| static ColumnPtr apply_const_vec(const ColumnConst* const_col_general, |
warning: function 'apply_const_vec' exceeds recommended size/complexity thresholds [readability-function-size]
static ColumnPtr apply_const_vec(const ColumnConst* const_col_general,
^
Additional context
be/src/vec/functions/round.h:568: 84 lines including whitespace and comments (threshold 80)
static ColumnPtr apply_const_vec(const ColumnConst* const_col_general,
^
| // specific language governing permissions and limitations | ||
| // under the License. | ||
|
|
||
| #include <gtest/gtest-message.h> |
warning: 'gtest/gtest-message.h' file not found [clang-diagnostic-error]
#include <gtest/gtest-message.h>
^
Proposed changes
cherry pick from #32617, #33134, #33091, #33117