branch-3.0: [enhance](orc) Optimize ORC Predicate Pushdown for OR-connected Predicate #43255 #44436
…cate (#43255)

### What problem does this PR solve?

Problem Summary: This change addresses a limitation in Apache Doris: only predicates joined by AND are pushed down to the ORC reader, so OR-connected predicates are evaluated without any reader-side filtering. Extending pushdown to OR conditions lets Doris make better use of ORC's predicate pushdown, which reduces the amount of data read and improves query performance.
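To make the difference between AND and OR pushdown concrete, here is a minimal sketch of the decision rule. It is not the Doris implementation: `Node`, `NodeKind`, `leaf`, `make`, and `can_push_down` are hypothetical names introduced only for illustration. The assumption it encodes is that dropping an unsupported conjunct from an AND only weakens the filter (still safe, because the engine re-checks rows), whereas an OR can be pushed down only if every branch can be expressed as a reader-side filter.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified predicate tree; not the Doris VExpr class.
enum class NodeKind { And, Or, Leaf };

struct Node {
    NodeKind kind = NodeKind::Leaf;
    bool leaf_pushable = false;  // only meaningful for Leaf nodes
    std::vector<std::shared_ptr<Node>> children;
};

// Illustrative helper: a leaf predicate that is (or is not) convertible.
std::shared_ptr<Node> leaf(bool pushable) {
    auto n = std::make_shared<Node>();
    n->kind = NodeKind::Leaf;
    n->leaf_pushable = pushable;
    return n;
}

// Illustrative helper: an AND/OR node over the given children.
std::shared_ptr<Node> make(NodeKind kind, std::vector<std::shared_ptr<Node>> children) {
    auto n = std::make_shared<Node>();
    n->kind = kind;
    n->children = std::move(children);
    return n;
}

// Returns true when the subtree can contribute a reader-side filter
// that never rejects a row the full predicate would accept.
bool can_push_down(const Node& n) {
    switch (n.kind) {
    case NodeKind::Leaf:
        return n.leaf_pushable;
    case NodeKind::And:
        // Any pushable conjunct is enough: the others are simply left out
        // of the reader-side filter, which only makes it less selective.
        for (const auto& c : n.children) {
            if (can_push_down(*c)) return true;
        }
        return false;
    case NodeKind::Or:
        // Every disjunct must be pushable; omitting one would filter out
        // rows that only that branch matches.
        if (n.children.empty()) return false;
        for (const auto& c : n.children) {
            if (!can_push_down(*c)) return false;
        }
        return true;
    }
    return false;
}
```

With this rule, `a AND <unsupported>` can still push down `a`, while `a OR <unsupported>` must be left entirely to the engine.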
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
run buildall
clang-tidy made some suggestions
template <PrimitiveType primitive_type>
std::tuple<bool, orc::Literal> convert_to_orc_literal(const orc::Type* type, const void* value,
                                                      int precision, int scale) {
std::tuple<bool, orc::Literal> convert_to_orc_literal(const orc::Type* type,
warning: function 'convert_to_orc_literal' exceeds recommended size/complexity thresholds [readability-function-size]
std::tuple<bool, orc::Literal> convert_to_orc_literal(const orc::Type* type,
^
Additional context
be/src/vec/exec/format/orc/vorc_reader.cpp:469: 94 lines including whitespace and comments (threshold 80)
std::tuple<bool, orc::Literal> convert_to_orc_literal(const orc::Type* type,
^
// check if there are rest children of expr can be pushed down to orc reader
bool OrcReader::_check_rest_children_can_push_down(const VExprSPtr& expr) {
    if (expr->children().size() < 2) {
        return false;
warning: redundant boolean literal in conditional return statement [readability-simplify-boolean-expr]
be/src/vec/exec/format/orc/vorc_reader.cpp:639:
- if (expr->children().size() < 2) {
- return false;
- }
-
- for (size_t i = 1; i < expr->children().size(); ++i) {
- if (!expr->children()[i]->is_literal()) {
- return false;
- }
- }
- return true;
+ return !expr->children().size() < 2;

    return true;
}
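The suggested one-line replacement is worth checking before applying it. The sketch below uses a hypothetical stand-in type (`Expr`, with only the two members the check needs; it is not the Doris `VExpr` class) to compare the removed lines with the one-liner. In C++, `!x < 2` parses as `(!x) < 2`, so the negation yields 0 or 1 and the comparison is always true; the one-liner also drops the loop over the remaining children.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the expression type; for illustration only.
struct Expr {
    bool literal = false;
    std::vector<std::shared_ptr<Expr>> kids;

    bool is_literal() const { return literal; }
    const std::vector<std::shared_ptr<Expr>>& children() const { return kids; }
};

// Same logic as the lines removed in the suggestion: at least two children,
// and every child after the first must be a literal.
bool check_rest_children_can_push_down(const Expr& expr) {
    if (expr.children().size() < 2) {
        return false;
    }
    for (std::size_t i = 1; i < expr.children().size(); ++i) {
        if (!expr.children()[i]->is_literal()) {
            return false;
        }
    }
    return true;
}

// What `!expr->children().size() < 2` evaluates to, spelled out step by step:
// the logical NOT is applied first and produces 0 or 1, which is then
// compared with 2. The result is true for every input.
bool suggested_one_liner(const Expr& expr) {
    const int negated = expr.children().empty() ? 1 : 0;  // value of !size()
    return negated < 2;
}
```

Under these assumptions the two functions disagree on an expression with no children and on one whose second child is not a literal, so the suggestion is not a behavior-preserving simplification.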
bool OrcReader::_build_search_argument(const VExprSPtr& expr,
warning: function '_build_search_argument' exceeds recommended size/complexity thresholds [readability-function-size]
bool OrcReader::_build_search_argument(const VExprSPtr& expr,
^
Additional context
be/src/vec/exec/format/orc/vorc_reader.cpp:772: 115 lines including whitespace and comments (threshold 80)
bool OrcReader::_build_search_argument(const VExprSPtr& expr,
^
#pragma once

#include <cctz/time_zone.h>
warning: 'cctz/time_zone.h' file not found [clang-diagnostic-error]
#include <cctz/time_zone.h>
^
// specific language governing permissions and limitations
// under the License.

#include "testutil/desc_tbl_builder.h"
warning: 'testutil/desc_tbl_builder.h' file not found [clang-diagnostic-error]
#include "testutil/desc_tbl_builder.h"
^
#ifndef DORIS_BE_SRC_TESTUTIL_DESC_TBL_BUILDER_H
#define DORIS_BE_SRC_TESTUTIL_DESC_TBL_BUILDER_H

#include <gen_cpp/Descriptors_types.h>
warning: 'gen_cpp/Descriptors_types.h' file not found [clang-diagnostic-error]
#include <gen_cpp/Descriptors_types.h>
^
// specific language governing permissions and limitations
// under the License.

#include <glog/logging.h>
warning: 'glog/logging.h' file not found [clang-diagnostic-error]
#include <glog/logging.h>
^
…r OR-connected Predicate apache#43255 (apache#44436)" This reverts commit 705012e.
revert: branch-3.0: [fix](orc) ignore null values when the literals of in_predicate contains #45104 (#45586)
[fix](orc) check all the cases before build_search_argument (#44615) (#44802)
branch-3.0: [enhance](orc) Optimize ORC Predicate Pushdown for OR-connected Predicate #43255 (#44436)
re-pick: branch-3.0: [Fix](ORC) Not push down fixed char type in orc reader #45484 (#45525)
---------
Co-authored-by: Socrates <suyiteng@selectdb.com>
Cherry-picked from #43255