@github-actions
Contributor

Cherry-picked from #43255

…cate (#43255)

### What problem does this PR solve?

Problem Summary:
Apache Doris currently pushes down only AND-connected predicates to the
ORC reader, so OR-connected predicates are left unoptimized. This PR
extends pushdown to OR-connected predicates to make fuller use of ORC’s
predicate pushdown, aiming to reduce the amount of data read and improve
query performance.
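
A minimal sketch of why OR-connected predicates can still prune data, assuming simple per-row-group min/max statistics on a single integer column. `ColumnStats`, `Predicate`, `make_equals`, `make_node`, and `might_match` are hypothetical names used only for illustration; they are not Doris or ORC APIs, and the PR itself goes through `_build_search_argument` in `vorc_reader.cpp`.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical min/max statistics for one column in one row group.
struct ColumnStats {
    int64_t min;
    int64_t max;
};

// Hypothetical predicate tree: leaves are "col == value", inner nodes are AND / OR.
struct Predicate {
    enum class Kind { Equals, And, Or };
    Kind kind = Kind::Equals;
    int64_t value = 0;                                 // used by Equals only
    std::vector<std::shared_ptr<Predicate>> children;  // used by And / Or
};
using PredicatePtr = std::shared_ptr<Predicate>;

PredicatePtr make_equals(int64_t v) {
    auto p = std::make_shared<Predicate>();
    p->kind = Predicate::Kind::Equals;
    p->value = v;
    return p;
}

PredicatePtr make_node(Predicate::Kind kind, std::vector<PredicatePtr> children) {
    auto p = std::make_shared<Predicate>();
    p->kind = kind;
    p->children = std::move(children);
    return p;
}

// Returns true if the row group might contain a matching row and must be read.
// Returning false means the whole row group can be skipped.
bool might_match(const Predicate& p, const ColumnStats& stats) {
    assert(stats.min <= stats.max);
    switch (p.kind) {
    case Predicate::Kind::Equals:
        return stats.min <= p.value && p.value <= stats.max;
    case Predicate::Kind::And:
        // AND: skip as soon as any child rules the row group out.
        for (const auto& c : p.children) {
            if (!might_match(*c, stats)) return false;
        }
        return true;
    case Predicate::Kind::Or:
        // OR: skip only when every child rules the row group out.
        for (const auto& c : p.children) {
            if (might_match(*c, stats)) return true;
        }
        return false;
    }
    return true;  // conservative fallback: read the row group
}
```

With statistics `{10, 20}`, the predicate `col = 5 OR col = 30` lets the row group be skipped, while `col = 5 OR col = 15` does not. If the OR node is never pushed down, neither case can be pruned.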
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Nov 22, 2024
@doris-robot

run buildall

Contributor Author

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

template <PrimitiveType primitive_type>
std::tuple<bool, orc::Literal> convert_to_orc_literal(const orc::Type* type, const void* value,
int precision, int scale) {
std::tuple<bool, orc::Literal> convert_to_orc_literal(const orc::Type* type,

warning: function 'convert_to_orc_literal' exceeds recommended size/complexity thresholds [readability-function-size]

std::tuple<bool, orc::Literal> convert_to_orc_literal(const orc::Type* type,
                               ^
Additional context

be/src/vec/exec/format/orc/vorc_reader.cpp:469: 94 lines including whitespace and comments (threshold 80)

std::tuple<bool, orc::Literal> convert_to_orc_literal(const orc::Type* type,
                               ^

// check if there are rest children of expr can be pushed down to orc reader
bool OrcReader::_check_rest_children_can_push_down(const VExprSPtr& expr) {
if (expr->children().size() < 2) {
return false;

warning: redundant boolean literal in conditional return statement [readability-simplify-boolean-expr]

be/src/vec/exec/format/orc/vorc_reader.cpp:639:

-     if (expr->children().size() < 2) {
-         return false;
-     }
- 
-     for (size_t i = 1; i < expr->children().size(); ++i) {
-         if (!expr->children()[i]->is_literal()) {
-             return false;
-         }
-     }
-     return true;
+     return !expr->children().size() < 2;

return true;
}
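
A hedged sketch of one way to shorten the check above without changing what it tests. The suggested one-liner `return !expr->children().size() < 2;` would not preserve the original behavior: it drops the literal loop, and `!x < 2` parses as `(!x) < 2`, which is always true. The `Expr` struct below is a hypothetical stand-in for Doris's `VExpr` with only the two members the check uses, and `rest_children_are_literals` is an illustrative name, not code from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for VExpr; only is_literal() and children() are modeled.
struct Expr {
    bool literal = false;
    std::vector<std::shared_ptr<Expr>> kids;

    bool is_literal() const { return literal; }
    const std::vector<std::shared_ptr<Expr>>& children() const { return kids; }
};

// True when the expression has at least two children and every child after
// the first is a literal, mirroring the loop in the original function.
bool rest_children_are_literals(const Expr& expr) {
    const auto& children = expr.children();
    if (children.size() < 2) {
        return false;
    }
    return std::all_of(children.begin() + 1, children.end(),
                       [](const std::shared_ptr<Expr>& child) {
                           assert(child != nullptr);
                           return child->is_literal();
                       });
}
```

The early return for fewer than two children is kept explicit because `std::all_of` returns true on an empty range, which would otherwise accept a single-child expression.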

bool OrcReader::_build_search_argument(const VExprSPtr& expr,

warning: function '_build_search_argument' exceeds recommended size/complexity thresholds [readability-function-size]

bool OrcReader::_build_search_argument(const VExprSPtr& expr,
                ^
Additional context

be/src/vec/exec/format/orc/vorc_reader.cpp:772: 115 lines including whitespace and comments (threshold 80)

bool OrcReader::_build_search_argument(const VExprSPtr& expr,
                ^


#pragma once

#include <cctz/time_zone.h>

warning: 'cctz/time_zone.h' file not found [clang-diagnostic-error]

#include <cctz/time_zone.h>
         ^

// specific language governing permissions and limitations
// under the License.

#include "testutil/desc_tbl_builder.h"

warning: 'testutil/desc_tbl_builder.h' file not found [clang-diagnostic-error]

#include "testutil/desc_tbl_builder.h"
         ^

#ifndef DORIS_BE_SRC_TESTUTIL_DESC_TBL_BUILDER_H
#define DORIS_BE_SRC_TESTUTIL_DESC_TBL_BUILDER_H

#include <gen_cpp/Descriptors_types.h>

warning: 'gen_cpp/Descriptors_types.h' file not found [clang-diagnostic-error]

#include <gen_cpp/Descriptors_types.h>
         ^

// specific language governing permissions and limitations
// under the License.

#include <glog/logging.h>

warning: 'glog/logging.h' file not found [clang-diagnostic-error]

#include <glog/logging.h>
         ^

@morningman morningman merged commit 705012e into branch-3.0 Nov 28, 2024
@github-actions github-actions bot deleted the auto-pick-43255-branch-3.0 branch November 28, 2024 14:53
morningman added a commit to morningman/doris that referenced this pull request Feb 8, 2025
morningman added a commit that referenced this pull request Feb 9, 2025
revert:
branch-3.0: [fix](orc) ignore null values when the literals of
in_predicate contains #45104 (#45586)
[fix](orc) check all the cases before build_search_argument (#44615)
(#44802)
branch-3.0: [enhance](orc) Optimize ORC Predicate Pushdown for
OR-connected Predicate #43255 (#44436)

re-pick:
branch-3.0: [Fix](ORC) Not push down fixed char type in orc reader
#45484 (#45525)

---------

Co-authored-by: Socrates <suyiteng@selectdb.com>
5 participants