[Bug] vectorized compaction failed #9604

@hello-stephen

Search before asking

  • I searched the issues and found no similar issues.

Version

branch: master
commit id: 953429e

What's Wrong?

The cluster is deployed on 4 servers: 1 FE and 3 BEs.
I used tools/tpch-tools to load 100 GB of data. About 1 hour after the load finished, I checked the cluster and found that cumulative compaction had failed on the lineitem table. I then found the following errors in be.WARNING:

W0516 13:37:26.612212 801433 compaction.cpp:252] row_num does not match between cumulative input and output! input_row_num=18742298, merged_row_num=0, filtered_row_num=0, output_row_num=0
W0516 13:37:26.612402 801433 tablet.cpp:1410] failed to do cumulative compaction. res=Internal error: :  0# doris::Status::ConstructErrorStatus(short, doris::Slice const&) at /mnt/hdd01/ldy/incubator-doris/be/src/common/status.cpp:80
 1# doris::Compaction::check_correctness(doris::Merger::Statistics const&) at /mnt/hdd01/ldy/incubator-doris/be/src/olap/compaction.cpp:276
 2# doris::Compaction::do_compaction_impl(long) at /mnt/hdd01/ldy/incubator-doris/be/src/olap/compaction.cpp:121
 3# doris::Compaction::do_compaction(long) at /mnt/hdd01/ldy/incubator-doris/be/src/olap/compaction.cpp:57
 4# doris::CumulativeCompaction::execute_compact_impl() at /mnt/hdd01/ldy/incubator-doris/be/src/olap/cumulative_compaction.cpp:75
 5# doris::Compaction::execute_compact() at /mnt/hdd01/ldy/incubator-doris/be/src/olap/compaction.cpp:46
 6# doris::Tablet::execute_compaction(doris::CompactionType) at /mnt/hdd01/ldy/incubator-doris/be/src/olap/tablet.cpp:1407
 7# std::_Function_handler<void (), doris::StorageEngine::_submit_compaction_task(std::shared_ptr<doris::Tablet>, doris::CompactionType)::{lambda()#1}>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/include/c++/11/bits/std_function.h:291
 8# doris::ThreadPool::dispatch_thread() at /mnt/hdd01/ldy/incubator-doris/be/src/util/threadpool.cpp:548
 9# doris::Thread::supervise_thread(void*) at /mnt/hdd01/ldy/incubator-doris/be/src/util/thread.cpp:409
10# start_thread in /lib/x86_64-linux-gnu/libpthread.so.0
11# __clone in /lib/x86_64-linux-gnu/libc.so.6
 (error -224), tablet=1007524.1617673526.414fcdf78976e781-857a7348cc73c68e
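
The counters in the first warning line can be pulled out and compared with a short script. This is a minimal sketch, not Doris code: the helper names are hypothetical, and the invariant it checks (input_row_num should equal merged_row_num + filtered_row_num + output_row_num) is an assumption inferred from the fields named in the message.

```python
import re

# Field names are copied from the warning message in be.WARNING.
ROW_COUNT_PATTERN = re.compile(
    r"input_row_num=(\d+), merged_row_num=(\d+), "
    r"filtered_row_num=(\d+), output_row_num=(\d+)"
)


def parse_row_counts(line):
    """Return the four row counters from a warning line, or None if absent."""
    match = ROW_COUNT_PATTERN.search(line)
    if match is None:
        return None
    keys = ("input", "merged", "filtered", "output")
    return dict(zip(keys, map(int, match.groups())))


def missing_rows(counts):
    """Rows unaccounted for, ASSUMING input == merged + filtered + output."""
    accounted = counts["merged"] + counts["filtered"] + counts["output"]
    return counts["input"] - accounted


line = (
    "W0516 13:37:26.612212 801433 compaction.cpp:252] row_num does not match "
    "between cumulative input and output! input_row_num=18742298, "
    "merged_row_num=0, filtered_row_num=0, output_row_num=0"
)
counts = parse_row_counts(line)
print(counts, missing_rows(counts))
```

For the line above, every input row is unaccounted for: the output rowset reports zero rows, and no rows were merged or filtered.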

This error kept being logged even after I dropped the lineitem table and then the database.
After I set enable_vectorized_compaction=false in be.conf and restarted the cluster, the error stopped.
The cluster then appeared to have recovered, so I ran the load again, and no errors were logged.
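
Before restarting, it can help to confirm which value of the flag a be.conf file actually contains. The sketch below is a hypothetical helper, not a Doris tool. It assumes be.conf uses `key = value` lines with `#` comments and that the last assignment wins; both are assumptions about the file format, and the sample text is made up for illustration.

```python
def read_flag(conf_text, key):
    """Return the last value assigned to `key` in key=value text, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        name, _, val = line.partition("=")
        if name.strip() == key:
            value = val.strip()  # assumption: a later assignment overrides
    return value


# Hypothetical excerpt; in practice, read the text from the real be.conf.
sample = """
# be.conf excerpt
enable_vectorized_compaction = false
"""
print(read_flag(sample, "enable_vectorized_compaction"))
```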

To summarize:
Loading 100 GB of TPC-H data with enable_vectorized_compaction=true: NOT OK
Loading 100 GB of TPC-H data with enable_vectorized_compaction=false: OK

What You Expected?

Cumulative compaction and base compaction should complete successfully.

How to Reproduce?

Use tools/tpch-tools to load 100 GB of data.

Anything Else?

If more information is required, I will provide it.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
