[Bug] MetricEntity will cause BE crash #4396

@morningman

Describe the bug

#0  std::_Hash_bytes (ptr=<optimized out>, len=892612151, seed=seed@entry=3339675911) at ../../../../gcc-7.3.0/libstdc++-v3/libsupc++/hash_bytes.cc:147
#1  0x0000000001277bc1 in hash (__seed=3339675911, __clength=<optimized out>, __ptr=<optimized out>) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/functional_hash.h:192
#2  operator() (this=0x384f168 <doris::DorisMetrics::instance()::instance+3784>, __s=...) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/basic_string.h:6591
#3  _M_hash_code (this=0x384f168 <doris::DorisMetrics::instance()::instance+3784>, __k=...) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/hashtable_policy.h:1368
#4  _M_erase (__k=..., this=0x384f168 <doris::DorisMetrics::instance()::instance+3784>) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/hashtable.h:1903
#5  erase (__k=..., this=0x384f168 <doris::DorisMetrics::instance()::instance+3784>) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/hashtable.h:759
#6  erase (__x=..., this=0x384f168 <doris::DorisMetrics::instance()::instance+3784>) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/unordered_map.h:817
#7  doris::MetricRegistry::deregister_entity (this=this@entry=0x384f140 <doris::DorisMetrics::instance()::instance+3744>, name=...) at /home/work/teamcity/workspace/doris_daily_compile/core/be/src/util/metrics.cpp:170
#8  0x0000000000f62000 in doris::BaseTablet::~BaseTablet (this=0x108a9690, __in_chrg=<optimized out>) at /home/work/teamcity/workspace/doris_daily_compile/core/be/src/olap/base_tablet.cpp:46
#9  0x0000000000d793da in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x108a9680) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/shared_ptr_base.h:154
#10 0x0000000000f04693 in ~__shared_count (this=<optimized out>, __in_chrg=<optimized out>) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/shared_ptr_base.h:684
#11 ~__shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/shared_ptr_base.h:1123
#12 operator= (__r=..., this=0x14ad0000) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/shared_ptr_base.h:1213
#13 operator= (__r=..., this=0x14ad0000) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/shared_ptr.h:319
#14 __copy_m<std::shared_ptr<doris::Tablet>*, std::shared_ptr<doris::Tablet>*> (__result=0x14ad0000, __last=<optimized out>, __first=0x14ad0010) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/stl_algobase.h:343
#15 __copy_move_a<true, std::shared_ptr<doris::Tablet>*, std::shared_ptr<doris::Tablet>*> (__result=<optimized out>, __last=<optimized out>, __first=<optimized out>) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/stl_algobase.h:386
#16 __copy_move_a2<true, __gnu_cxx::__normal_iterator<std::shared_ptr<doris::Tablet>*, std::vector<std::shared_ptr<doris::Tablet> > >, __gnu_cxx::__normal_iterator<std::shared_ptr<doris::Tablet>*, std::vector<std::shared_ptr<doris::Tablet> > > > (__result=..., __last=..., __first=...)
    at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/stl_algobase.h:422
#17 move<__gnu_cxx::__normal_iterator<std::shared_ptr<doris::Tablet>*, std::vector<std::shared_ptr<doris::Tablet> > >, __gnu_cxx::__normal_iterator<std::shared_ptr<doris::Tablet>*, std::vector<std::shared_ptr<doris::Tablet> > > > (__result=..., __last=..., __first=...)
    at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/stl_algobase.h:488
#18 _M_erase (__position=..., this=0x545a338) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/vector.tcc:157
#19 erase (__position=..., this=0x545a338) at /home/work/cov_tools/gcc730/include/c++/7.3.0/bits/stl_vector.h:1180
#20 doris::TabletManager::start_trash_sweep (this=0x545a280) at /home/work/teamcity/workspace/doris_daily_compile/core/be/src/olap/tablet_manager.cpp:1061
#21 0x0000000000ecf279 in doris::StorageEngine::_start_trash_sweep (this=this@entry=0x545c780, usage=usage@entry=0x7f8b8b136068) at /home/work/teamcity/workspace/doris_daily_compile/core/be/src/olap/storage_engine.cpp:656
#22 0x0000000000ebf21f in doris::StorageEngine::_garbage_sweeper_thread_callback (this=0x545c780, arg=<optimized out>) at /home/work/teamcity/workspace/doris_daily_compile/core/be/src/olap/olap_server.cpp:256
#23 0x000000000291690f in std::execute_native_thread_routine (__p=0x5556840) at ../../../../../gcc-7.3.0/libstdc++-v3/src/c++11/thread.cc:83
#24 0x00007f8bf90b11c3 in start_thread () from /opt/compiler/gcc-4.8.2/lib64/libpthread.so.0
#25 0x00007f8bf8ae012d in clone () from /opt/compiler/gcc-4.8.2/lib64/libc.so.6

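This happens because every time a tablet object is constructed, a MetricEntity instance is constructed and registered in the MetricRegistry, but the MetricRegistry uses only the tablet id as the key under which the MetricEntity is stored. When two tablet objects with the same tablet id are constructed at the same time, their registrations share a single key, and an invalid memory access results, as in the backtrace above, where `deregister_entity` is called from `~BaseTablet` and crashes while hashing the key.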
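To illustrate one way the key collision could be made harmless, here is a minimal sketch of a registry that reference-counts entries per key. It is not the Doris code and not necessarily the fix that was adopted: `Entity`, `RefCountedRegistry`, `FakeTablet`, `size()` and `ref_count()` are hypothetical names invented for this example. Only the idea of registering in the constructor and deregistering in the destructor, keyed by tablet id, is taken from the description above.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical, simplified stand-in for a metric entity (not doris::MetricEntity).
struct Entity {
    explicit Entity(std::string n) : name(std::move(n)) {}
    std::string name;
};

// Hypothetical registry keyed by name only, as in the issue, but each key
// carries a reference count so that duplicate owners do not erase each other.
class RefCountedRegistry {
public:
    // Returns the shared entity for `name`, creating it on first registration.
    std::shared_ptr<Entity> register_entity(const std::string& name) {
        std::lock_guard<std::mutex> guard(_lock);
        Slot& slot = _entities[name];
        if (slot.entity == nullptr) {
            slot.entity = std::make_shared<Entity>(name);
        }
        ++slot.refs;
        return slot.entity;
    }

    // Drops one reference; the entry is erased only when the last owner leaves.
    void deregister_entity(const std::string& name) {
        std::lock_guard<std::mutex> guard(_lock);
        auto it = _entities.find(name);
        if (it == _entities.end()) {
            return;  // already gone, nothing to do
        }
        if (--it->second.refs == 0) {
            _entities.erase(it);
        }
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(_lock);
        return _entities.size();
    }

    // Number of live owners for `name`, or 0 if it is not registered.
    int ref_count(const std::string& name) const {
        std::lock_guard<std::mutex> guard(_lock);
        auto it = _entities.find(name);
        return it == _entities.end() ? 0 : it->second.refs;
    }

private:
    struct Slot {
        std::shared_ptr<Entity> entity;
        int refs = 0;
    };

    mutable std::mutex _lock;
    std::unordered_map<std::string, Slot> _entities;
};

// Hypothetical tablet-like owner: registers on construction, deregisters on
// destruction, using only the tablet id as the key.
class FakeTablet {
public:
    FakeTablet(RefCountedRegistry* registry, long tablet_id)
            : _registry(registry),
              _name(std::to_string(tablet_id)),
              _entity(registry->register_entity(_name)) {}

    ~FakeTablet() { _registry->deregister_entity(_name); }

    FakeTablet(const FakeTablet&) = delete;
    FakeTablet& operator=(const FakeTablet&) = delete;

private:
    RefCountedRegistry* _registry;
    std::string _name;  // owned copy of the key, valid for the whole lifetime
    std::shared_ptr<Entity> _entity;
};
```

In this sketch, two `FakeTablet` objects with the same id share one entry, and destroying either one leaves the entry intact for the other. An alternative with the same effect would be to make the key unique per object rather than per tablet id.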

Labels: Stale, kind/fix (categorizes issue or PR as related to a bug)