
[Bug] BE crash when doing insert stmt #3871

@morningman

Description


Describe the bug
The BE crashes with a coredump like:

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000000001045667 in release (bytes=8192, this=0x633b7d40) at /var/doris/palo-core/be/src/runtime/mem_tracker.h:221
#2  doris::RowBatch::clear (this=0x8fc6bce0) at /var/doris/palo-core/be/src/runtime/row_batch.cpp:294
#3  0x00000000010459dd in clear (this=0x8fc6bce0) at /var/doris/palo-core/be/src/common/object_pool.h:54
#4  doris::RowBatch::~RowBatch (this=0x8fc6bce0, __in_chrg=<optimized out>) at /var/doris/palo-core/be/src/runtime/row_batch.cpp:301
#5  0x0000000001045a01 in doris::RowBatch::~RowBatch (this=0x8fc6bce0, __in_chrg=<optimized out>) at /var/doris/palo-core/be/src/runtime/row_batch.cpp:302
#6  0x00000000014a0299 in operator() (this=<optimized out>, __ptr=<optimized out>) at /var/doris/doris-toolchain/gcc730/include/c++/7.3.0/bits/unique_ptr.h:78
#7  ~unique_ptr (this=0x7070ec98, __in_chrg=<optimized out>) at /var/doris/doris-toolchain/gcc730/include/c++/7.3.0/bits/unique_ptr.h:268
#8  doris::stream_load::NodeChannel::~NodeChannel (this=0x7070ec00, __in_chrg=<optimized out>) at /var/doris/palo-core/be/src/exec/tablet_sink.cpp:41
#9  0x00000000014a53f2 in ~SpecificElement (this=0x77645500, __in_chrg=<optimized out>) at /var/doris/palo-core/be/src/common/object_pool.h:75
#10 doris::ObjectPool::SpecificElement<doris::stream_load::NodeChannel>::~SpecificElement (this=0x77645500, __in_chrg=<optimized out>) at /var/doris/palo-core/be/src/common/object_pool.h:76
#11 0x0000000001052efd in clear (this=0x6a213220) at /var/doris/palo-core/be/src/common/object_pool.h:54
#12 ~ObjectPool (this=0x6a213220, __in_chrg=<optimized out>) at /var/doris/palo-core/be/src/common/object_pool.h:37
#13 std::_Sp_counted_ptr<doris::ObjectPool*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=<optimized out>) at /var/doris/doris-toolchain/gcc730/include/c++/7.3.0/bits/shared_ptr_base.h:376
#14 0x000000000104b591 in _M_release (this=0x6a212fa0) at /var/doris/doris-toolchain/gcc730/include/c++/7.3.0/bits/shared_ptr_base.h:154
#15 ~__shared_count (this=0x891de310, __in_chrg=<optimized out>) at /var/doris/doris-toolchain/gcc730/include/c++/7.3.0/bits/shared_ptr_base.h:684
#16 ~__shared_ptr (this=0x891de308, __in_chrg=<optimized out>) at /var/doris/doris-toolchain/gcc730/include/c++/7.3.0/bits/shared_ptr_base.h:1123
#17 ~shared_ptr (this=0x891de308, __in_chrg=<optimized out>) at /var/doris/doris-toolchain/gcc730/include/c++/7.3.0/bits/shared_ptr.h:93
#18 doris::RuntimeState::~RuntimeState (this=0x891de300, __in_chrg=<optimized out>) at /var/doris/palo-core/be/src/runtime/runtime_state.cpp:128
#19 0x000000000103dcc6 in checked_delete<doris::RuntimeState> (x=0x891de300) at /var/doris/thirdparty/installed/include/boost/core/checked_delete.hpp:34
#20 ~scoped_ptr (this=0x88082ce0, __in_chrg=<optimized out>) at /var/doris/thirdparty/installed/include/boost/smart_ptr/scoped_ptr.hpp:89
#21 doris::PlanFragmentExecutor::~PlanFragmentExecutor (this=0x88082b70, __in_chrg=<optimized out>) at /var/doris/palo-core/be/src/runtime/plan_fragment_executor.cpp:62
#22 0x0000000000fd39ed in doris::FragmentExecState::~FragmentExecState (this=0x88082b00, __in_chrg=<optimized out>) at /var/doris/palo-core/be/src/runtime/fragment_mgr.cpp:175

Debug

In some abnormal situations, loading data with an Insert statement can crash the BE. The troubleshooting findings are as follows:

  • FE side:

https://github.com/apache/incubator-doris/blob/2211cb0ee0fcd23d4fd2445494aba6cf1a020987/fe/src/main/java/org/apache/doris/qe/Coordinator.java#L475-L489

During the execution of execState.execRemoteFragmentAsync(), if an RPC error occurs (for example, the corresponding BE is down), an exception is thrown directly instead of the error being returned via the Future<PExecPlanFragmentResult>. In this case, the Coordinator does not proceed with the subsequent Cancel operation.

After the exception is thrown, the stack unwinds back to handleInsertStmt(), which directly returns the error message "Insert failed" to the user. At this point, the FE performs no further processing.
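To make the intended behavior concrete, here is a minimal sketch of delivering the RPC failure through the returned future instead of throwing, so the caller's normal flow (including Cancel) can still run. The FE is Java; this C++ sketch is kept in the same language as the BE sketches below, and all names in it are illustrative, not the actual Doris API.

#include <future>
#include <iostream>
#include <stdexcept>

// Illustrative stand-in for the FE's execRemoteFragmentAsync().
std::future<int> exec_remote_fragment_async(bool be_is_down) {
    std::promise<int> p;
    if (be_is_down) {
        // Deliver the failure through the future instead of throwing,
        // so the caller can inspect it and still run its cancel path.
        p.set_exception(std::make_exception_ptr(
                std::runtime_error("RPC error: backend unreachable")));
    } else {
        p.set_value(0);  // OK
    }
    return p.get_future();
}

int main() {
    auto fut = exec_remote_fragment_async(/*be_is_down=*/true);
    try {
        fut.get();
    } catch (const std::exception& e) {
        std::cout << "fragment failed: " << e.what() << "\n";
        // ... the Coordinator can now proceed to cancel the other fragments ...
    }
    return 0;
}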

  • BE side:

The BE receives the Insert execution plan and calls _sink->open in PlanFragmentExecutor::open_internal() to open the TabletSink.

https://github.com/apache/incubator-doris/blob/2211cb0ee0fcd23d4fd2445494aba6cf1a020987/be/src/runtime/plan_fragment_executor.cpp#L272-L292

The Open method of TabletSink opens all related NodeChannels via RPC, but because some BEs have problems, some of the NodeChannels fail to open. An error message appears in the BE log:

tablet open failed, load_id=4e2f4d6076cc4692-b8f26cc924b6be0d, txn_id144942945, node=10.xx.xx.xx:8060, errmsg=failed to open tablet writer

However, because the majority of NodeChannels opened successfully, the TabletSink Open operation still returns success.
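A minimal sketch of this quorum behavior, with illustrative names only (the actual logic lives in tablet_sink.cpp): the sink's open succeeds as long as a majority of channels opened, so the failed minority is left half-initialized without failing the sink.

#include <vector>

struct NodeChannel {
    bool opened = false;
    bool open() {
        // RPC: open the tablet writers on one BE; may fail with
        // "failed to open tablet writer" as in the log above.
        opened = true;  // assume success in this sketch
        return opened;
    }
};

bool open_sink(std::vector<NodeChannel>& channels) {
    int failed = 0;
    for (auto& ch : channels) {
        if (!ch.open()) {
            ++failed;
        }
    }
    // Success as long as a majority of node channels opened.
    return failed * 2 < static_cast<int>(channels.size());
}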

Next, PlanFragmentExecutor::open_internal() calls get_next_internal() to start fetching data. Because the Insert has already failed at this point, the call returns a failure status. There is also a bug in the subsequent update_status(status) method:

https://github.com/apache/incubator-doris/blob/2211cb0ee0fcd23d4fd2445494aba6cf1a020987/be/src/runtime/plan_fragment_executor.cpp#L485-L503

On line 493, if (_status.ok()) should be if (!_status.ok()). This error causes the _status variable not to be updated, which in turn causes the NodeChannel to be closed instead of canceled when the TabletSink is finally closed. A sketch of the corrected logic follows.
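A self-contained sketch of the fix, assuming the guard is an early return as described above; Status here is a stand-in for doris::Status.

#include <mutex>

struct Status {
    bool is_ok = true;
    bool ok() const { return is_ok; }
};

struct PlanFragmentExecutor {
    std::mutex _status_lock;
    Status _status;  // starts out OK

    Status update_status(Status status) {
        if (status.ok()) {
            return status;  // nothing to record
        }
        std::lock_guard<std::mutex> l(_status_lock);
        // Fixed guard: only ignore the new error if one is already recorded.
        // The buggy version tested `_status.ok()` and returned early, so the
        // first error was never stored.
        if (!_status.ok()) {
            return _status;
        }
        _status = status;  // record the first error
        return _status;
    }
};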

A normal NodeChannel close operation sends the last RowBatch it holds and then destroys the RowBatch object. But because some NodeChannels were never opened normally, they are not closed normally either, so the RowBatch held by these NodeChannels is never destroyed at close time.
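A sketch of this close/cancel asymmetry under simplified semantics (not the actual tablet_sink.cpp code):

#include <memory>

struct RowBatch {
    // holds a pointer to a MemTracker in the real code
};

struct NodeChannel {
    bool opened = false;
    std::unique_ptr<RowBatch> cur_batch = std::make_unique<RowBatch>();

    void close() {
        if (!opened) {
            return;  // never-opened channel: the batch stays alive
        }
        // ... send cur_batch as the final RPC payload ...
        cur_batch.reset();  // happy path: batch destroyed here
    }

    void cancel() {
        cur_batch.reset();  // cancel also releases the batch early
    }

    // ~NodeChannel(): if neither close() nor cancel() freed cur_batch, the
    // unique_ptr destroys it here, during ObjectPool teardown, which is
    // exactly where the crash occurs.
};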

After the execution of the entire plan is completed, the PlanFragmentExecutor's destruction process begins. The destructor call chain is as follows:

PlanFragmentExecutor
|--- RuntimeState
     |--- RuntimeProfile
     |--- ObjectPool
          |--- NodeChannel
               |--- RowBatch
                    |--- MemTracker->release()
                         |--- profile->_consumption->add(-bytes)

Note that the RuntimeProfile is destructed first within RuntimeState. As a result, when the RowBatch in the NodeChannel is destructed later, its MemTracker calls into the already-destructed _consumption object, which ultimately crashes the BE.
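The same shape can be reproduced in a few lines of standalone C++ (an illustration, not Doris code): class members are destructed in reverse declaration order, so the member declared last dies first, and a destructor holding a raw pointer into it then touches freed memory.

#include <memory>

struct Counter {
    long v = 0;
    void add(long x) { v += x; }
};

struct Profile {          // stands in for RuntimeProfile
    Counter consumption;  // stands in for _consumption
};

struct Batch {            // stands in for RowBatch and its MemTracker
    Counter* consumption = nullptr;
    ~Batch() { consumption->add(-8192); }  // MemTracker::release(bytes)
};

struct State {            // stands in for RuntimeState
    std::unique_ptr<Batch> batch;      // declared first  => destroyed LAST
    std::unique_ptr<Profile> profile;  // declared second => destroyed FIRST

    State() : batch(std::make_unique<Batch>()),
              profile(std::make_unique<Profile>()) {
        batch->consumption = &profile->consumption;
    }
};

int main() {
    State s;
    // Leaving scope destroys `profile` before `batch`, so ~Batch
    // dereferences freed memory: undefined behavior, a crash in practice.
    return 0;
}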

The whole process has the following problems:

  1. The bug in update_status(status) prevents the NodeChannel from being canceled correctly, which causes problems during the final destruction. (If the NodeChannel's Cancel were called in advance, the RowBatch would be destructed in advance as well.)
  2. When the FE Coordinator executes execRemoteFragmentAsync() and encounters an RPC error, it should return a Future carrying the error code, continue the subsequent process, and actively call Cancel().
  3. The _status in RuntimeState has no lock protection, which may cause potential race conditions; see the sketch after this list.
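For problem 3, a minimal sketch of guarding the status with a mutex; the member names are illustrative, based only on the description above.

#include <mutex>

struct Status {
    bool is_ok = true;
    bool ok() const { return is_ok; }
};

class RuntimeState {
public:
    void set_status(const Status& st) {
        std::lock_guard<std::mutex> l(_status_lock);
        if (_status.ok()) {
            _status = st;  // keep only the first error
        }
    }

    Status status() {
        std::lock_guard<std::mutex> l(_status_lock);
        return _status;  // return a copy taken under the lock
    }

private:
    std::mutex _status_lock;
    Status _status;
};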

Labels

area/load (Issues or PRs related to all kinds of load), kind/fix (Categorizes issue or PR as related to a bug)
