[Bug] When querying the Hive external table, an Invalid data error occurred. #30106

@qian0817

Search before asking

  • I have searched the issues and found no similar ones.

Version

doris-2.0.4-rc02-e529646

What's Wrong?

Querying a Hive external table fails with an "Invalid data" error.

java.sql.SQLException: errCode = 2, detailMessage = (10.196.150.125)[CANCELLED][INTERNAL_ERROR]Couldn't deserialize thrift msg:
TProtocolException: Invalid data

        0#  doris::Status doris::deserialize_thrift_msg<tparquet::ColumnIndex>(unsigned char const*, unsigned int*, bool, tparquet::ColumnIndex*) at /root/src/doris-2.0/be/src/util/thrift_util.h:153
        1#  doris::vectorized::PageIndex::parse_column_index(tparquet::ColumnChunk const&, unsigned char const*, tparquet::ColumnIndex*) at /root/src/doris-2.0/be/src/common/status.h:443
        2#  doris::vectorized::ParquetReader::_process_page_index(tparquet::RowGroup const&, std::vector<doris::vectorized::RowRange, std::allocator<doris::vectorized::RowRange> >&) at /root/src/doris-2.0/be/src/common/status.h:443
        3#  doris::vectorized::ParquetReader::_next_row_group_reader() at /root/src/doris-2.0/be/src/common/status.h:443
        4#  doris::vectorized::ParquetReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /root/src/doris-2.0/be/src/common/status.h:443
        5#  doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/src/doris-2.0/be/src/common/status.h:443
        6#  doris::vectorized::VScanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/src/doris-2.0/be/src/vec/exec/scan/vscanner.cpp:0
        7#  doris::vectorized::ScannerScheduler::_scanner_scan(doris::vectorized::ScannerScheduler*, doris::vectorized::ScannerContext*, std::shared_ptr<doris::vectorized::VScanner>) at /root/src/doris-2.0/be/src/common/status.h:355
        8#  std::_Function_handler<void (), doris::vectorized::ScannerScheduler::_schedule_scanners(doris::vectorized::ScannerContext*)::$_1::operator()() const::{lambda()#4}>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
        9#  doris::WorkThreadPool<true>::work_thread(int) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/atomic_base.h:646
        10# execute_native_thread_routine at /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:85
        11# start_thread
        12# clone

        at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:130) ~[mysql-connector-j-8.0.33.jar!/:8.0.33]
        at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122) ~[mysql-connector-j-8.0.33.jar!/:8.0.33]
        at com.mysql.cj.jdbc.StatementImpl.executeQuery(StatementImpl.java:1200) ~[mysql-connector-j-8.0.33.jar!/:8.0.33]

The structure of the SQL is as follows:

select col1, col2
from tbl1
where col3 = 'xxx'
group by col1, col2;

The error appears to occur only on some tables; similar SQL against other tables runs without error.

What You Expected?

The query should complete without errors.

How to Reproduce?

No response

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
