Closed
Labels
Stale, help wanted, kind/fix (categorizes issue or PR as related to a bug)
Description
Describe the bug
The BE crashes occasionally, and be.out shows:
palo_be: ../nptl/pthread_mutex_lock.c:80: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.
*** Aborted at 1592890873 (unix time) try "date -d @1592890873" if you are using GNU date ***
PC: @ 0x7f599a2af3f7 __GI_raise
*** SIGABRT (@0x1f4000081a6) received by PID 33190 (TID 0x7f59797b6700) from PID 33190; stack trace: ***
@ 0x7f599a2af470 (unknown)
@ 0x7f599a2af3f7 __GI_raise
@ 0x7f599a2b07d8 __GI_abort
@ 0x7f599a2a8516 __assert_fail_base
@ 0x7f599a2a85c2 __GI___assert_fail
@ 0x7f599a06658c __GI___pthread_mutex_lock
@ 0x1ba34d6 pthread_mutex_lock
@ 0x145f4ac doris::OlapScanNode::scanner_thread()
@ 0xfa8a35 doris::PriorityThreadPool::work_thread()
@ 0x1a5bbed thread_proxy
@ 0x7f599a0641c3 start_thread
@ 0x7f599a36112d __clone
The abort comes from an assertion that fails while locking a mutex: pthread_mutex_lock asserts mutex->__data.__owner == 0, so __owner was nonzero when the check ran.
However, when I inspect the core dump, the __owner field of that mutex is 0:
#0 0x00007f599a2af3f7 in raise () from /opt/compiler/gcc-4.8.2/lib64/libc.so.6
#1 0x00007f599a2b07d8 in abort () from /opt/compiler/gcc-4.8.2/lib64/libc.so.6
#2 0x00007f599a2a8516 in __assert_fail_base () from /opt/compiler/gcc-4.8.2/lib64/libc.so.6
#3 0x00007f599a2a85c2 in __assert_fail () from /opt/compiler/gcc-4.8.2/lib64/libc.so.6
#4 0x00007f599a06658c in pthread_mutex_lock () from /opt/compiler/gcc-4.8.2/lib64/libpthread.so.0
#5 0x0000000001ba34d6 in pthread_mutex_lock_impl (mutex=0x67521610) at /home/palo/thirdparty/src/incubator-brpc-0.9.5/src/bthread/mutex.cpp:551
#6 pthread_mutex_lock (__mutex=__mutex@entry=0x67521610) at /home/palo/thirdparty/src/incubator-brpc-0.9.5/src/bthread/mutex.cpp:809
#7 0x000000000145f4ac in pthread_mutex_scoped_lock (m_=0x67521610, this=<synthetic pointer>) at /home/palo/thirdparty/installed/include/boost/thread/pthread/pthread_mutex_scoped_lock.hpp:26
#8 notify_one (this=0x67521610) at /home/palo/thirdparty/installed/include/boost/thread/pthread/condition_variable.hpp:126
#9 doris::OlapScanNode::scanner_thread (this=0x67521000, scanner=0x20200bd40) at /home/palo/be/src/exec/olap_scan_node.cpp:1322
#10 0x0000000000fa8a35 in operator() (this=0x7f59797b2828) at /home/palo/thirdparty/installed/include/boost/function/function_template.hpp:759
#11 doris::PriorityThreadPool::work_thread (this=0x50ac300, thread_id=<optimized out>) at /home/palo/be/src/util/priority_thread_pool.hpp:138
#12 0x0000000001a5bbed in thread_proxy ()
#13 0x00007f599a0641c3 in start_thread () from /opt/compiler/gcc-4.8.2/lib64/libpthread.so.0
#14 0x00007f599a36112d in clone () from /opt/compiler/gcc-4.8.2/lib64/libc.so.6
(gdb) f 5
#5 0x0000000001ba34d6 in pthread_mutex_lock_impl (mutex=0x67521610) at /home/palo/thirdparty/src/incubator-brpc-0.9.5/src/bthread/mutex.cpp:551
551 /home/palo/thirdparty/src/incubator-brpc-0.9.5/src/bthread/mutex.cpp: No such file or directory.
(gdb) p mutex
$1 = (pthread_mutex_t *) 0x67521610
(gdb) p *mutex
$2 = {
__data = {
__lock = 0,
__count = 0,
__owner = 0,
__nusers = 4294967295,
__kind = 0,
__spins = 0,
__elision = 0,
__list = {
__prev = 0x0,
__next = 0x0
}
},
__size = '\000' <repeats 12 times>, "\377\377\377\377", '\000' <repeats 23 times>,
__align = 0
}
That mutex is the internal mutex of a boost::condition_variable (frame #8, notify_one, has the same address 0x67521610). I have no idea why the assertion failed.
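One detail in the dump worth noting is __nusers = 4294967295, which is 0xFFFFFFFF: the value of an unsigned 32-bit counter decremented once below zero. The sketch below is a toy model, not glibc code. `ToyMutex`, `toy_lock`, `toy_unlock`, and `run_unbalanced_unlock` are hypothetical names, and the bookkeeping order is an assumption made for illustration. It only shows that one unlock more than the number of locks could leave the fields as they appear in the dump (__lock = 0, __owner = 0, __nusers = 4294967295). It does not establish that this is what happened here.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the bookkeeping fields printed by gdb.
// NOT the real pthread_mutex_t; names only mirror the dump.
struct ToyMutex {
    int lock = 0;              // mirrors __lock
    int owner = 0;             // mirrors __owner
    std::uint32_t nusers = 0;  // mirrors __nusers (unsigned, 32 bits)
};

// Hypothetical helper: models only the bookkeeping of a lock call,
// including a check like the one that failed in be.out.
void toy_lock(ToyMutex& m, int tid) {
    m.lock = 1;
    assert(m.owner == 0);  // a freshly acquired mutex should have no owner
    m.owner = tid;
    ++m.nusers;
}

// Hypothetical helper: models only the bookkeeping of an unlock call.
void toy_unlock(ToyMutex& m) {
    m.owner = 0;
    --m.nusers;  // wraps to 4294967295 if nusers was already 0
    m.lock = 0;
}

// One balanced lock/unlock pair followed by one extra unlock.
ToyMutex run_unbalanced_unlock() {
    ToyMutex m;
    toy_lock(m, 33190);
    toy_unlock(m);
    toy_unlock(m);  // unbalanced: no matching lock
    return m;
}
```

Calling `run_unbalanced_unlock()` returns a state with lock == 0, owner == 0, and nusers == 4294967295. An extra unlock, or an unlock or notify on a condition variable that was already destroyed, is therefore one possible thing to look for around olap_scan_node.cpp:1322, though other explanations such as memory corruption are not ruled out.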