Description
Describe the bug
I hit the same situation as in #3840 and found the stack trace below. BE crashes because we do not catch the exception orc::TimezoneError. Core dumps were not enabled, so I had to read the source code to find the cause. By the way, I think enabling core dumps is important; it would save a lot of time.
4002 terminate called after throwing an instance of 'orc::TimezoneError'
4003 rc::TimezoneErro
4004 what(): Can't open /usr/share/zoneinfo/GMT+08:00
4005 *** Aborted at 1597225209 (unix time) try "date -d @1597225209" if you are using GNU date ***
4006 PC: @ 0x7f9cd9e1a1f7 __GI_raise
4007 *** SIGABRT (@0x1ac58) received by PID 109656 (TID 0x7f9c0202a700) from PID 109656; stack trace: ***
4008 @ 0x7f9cd9e1a270 (unknown)
4009 @ 0x7f9cd9e1a1f7 __GI_raise
4010 @ 0x7f9cd9e1b8e8 __GI_abort
4011 @ 0x2f21645 __gnu_cxx::__verbose_terminate_handler()
4012 @ 0x2e8d706 __cxxabiv1::__terminate()
4013 @ 0x2e8d751 std::terminate()
4014 @ 0x2ed1c6e execute_native_thread_routine
4015 @ 0x7f9cd9bd0e25 start_thread
4016 @ 0x7f9cd9edd34d __clone
Looking at ORC's source code, I found that when writing timestamps, the ORC library now records the time zone in the stripe footer. We use RowReaderImpl::next (see ORC's Reader.hh) to get the data from the ORC file, and this function is called from our code.
RowReaderImpl::next calls startNextStripe(), which checks whether the ORC file has a writerTimezone in the stripe footer. The relevant code is in Reader.cc, line 829: const Timezone& writerTimezone = currentStripeFooter.has_writertimezone() ? getTimezoneByName(currentStripeFooter.writertimezone()) : localTimezone;. So if has_writertimezone() returns true for the ORC file, getTimezoneByName is called internally.
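The selection logic in that line can be sketched in isolation. This is only an illustration: StripeFooterSketch and zoneinfo_path_for are hypothetical names, not ORC types, and the /usr/share/zoneinfo directory is taken from the error message in the log above.

```cpp
#include <string>

// Hypothetical stand-in for the stripe footer; only the two accessors
// quoted from Reader.cc are modeled.
struct StripeFooterSketch {
    bool has_tz = false;
    std::string tz;
    bool has_writertimezone() const { return has_tz; }
    const std::string& writertimezone() const { return tz; }
};

// Returns the zoneinfo path a reader would try to open for this stripe:
// the writer's timezone if the footer records one, else the local one.
std::string zoneinfo_path_for(const StripeFooterSketch& footer,
                              const std::string& local_name) {
    const std::string& name =
        footer.has_writertimezone() ? footer.writertimezone() : local_name;
    // Directory assumed from the "Can't open ..." message in the log.
    return "/usr/share/zoneinfo/" + name;
}
```

With a footer that records GMT+08:00, the sketch yields exactly the path that failed to open in the log, which is why the crash depends on the file being written with a timezone that the BE machine does not have.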
getTimezoneByName calls getTimezoneByFilename, which opens the file under /usr/share/zoneinfo for the specified time zone. If the file is not found, FileInputStream's constructor (OrcFile.cc, line 51) throws an orc::ParseError. getTimezoneByFilename catches the orc::ParseError and throws another error; the relevant code is in Timezone.cc, line 689:
try {
} catch (ParseError& err) {
throw TimezoneError(err.what());
}
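The translation from one exception type to the other can be reproduced with a small self-contained sketch. ParseErrorSketch, TimezoneErrorSketch, open_or_throw and load_timezone_file are hypothetical stand-ins written for this issue, not the ORC implementation; the only assumption carried over is that the message is passed through unchanged.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for orc::ParseError and orc::TimezoneError.
struct ParseErrorSketch : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct TimezoneErrorSketch : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Mimics the failing open: throws when the file cannot be opened.
void open_or_throw(const std::string& filename) {
    std::ifstream in(filename, std::ios::binary);
    if (!in.is_open()) {
        throw ParseErrorSketch("Can't open " + filename);
    }
}

// Mimics the try/catch quoted above: the low-level error is rethrown
// as a timezone-specific error carrying the same message.
void load_timezone_file(const std::string& filename) {
    try {
        open_or_throw(filename);
    } catch (const ParseErrorSketch& err) {
        throw TimezoneErrorSketch(err.what());
    }
}
```

A caller that only guards against the parse-error type would therefore still miss this failure, because what escapes is the timezone error.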
Now the reason for the BE crash is clear: if the BE machine does not have the relevant zoneinfo file, ORC throws an orc::TimezoneError, and if we forget to catch it, BE crashes. The call stack (innermost first) is:
throw an orc::TimezoneError
throw an orc::ParseError
FileInputStream::FileInputStream
orc::readLocalFile()
readFile()
Timezone::getTimezoneByFilename()
Timezone::getTimezoneByName()
RowReaderImpl::startNextStripe()
RowReaderImpl::next()
RowReader::next()
Expected behavior
BE should not crash.
Solutions
When calling reader->next(), we should catch the orc::TimezoneError exception and return an InternalError to the user, so that BE does not crash.
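A minimal sketch of that idea, under stated assumptions: StatusSketch is a hypothetical status type (the real BE status class is not reproduced here), FailingReaderSketch is a fake reader whose next() always throws, and next() is simplified to take no batch argument.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for orc::TimezoneError.
struct TimezoneErrorSketch : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical, minimal status type used only for this sketch.
struct StatusSketch {
    bool ok;
    std::string msg;
    static StatusSketch OK() { return {true, ""}; }
    static StatusSketch InternalError(const std::string& m) { return {false, m}; }
};

// Fake reader that fails the way the issue describes when the
// zoneinfo file is missing on the machine.
struct FailingReaderSketch {
    bool next() {
        throw TimezoneErrorSketch("Can't open /usr/share/zoneinfo/GMT+08:00");
    }
};

// Wraps reader.next() so the exception becomes an error status instead
// of escaping the thread and reaching std::terminate().
template <typename Reader>
StatusSketch read_next_batch(Reader& reader, bool* eof) {
    try {
        *eof = !reader.next();
    } catch (const TimezoneErrorSketch& e) {
        return StatusSketch::InternalError(
            std::string("ORC timezone error: ") + e.what());
    }
    return StatusSketch::OK();
}
```

In the real scanner it may be worth catching a broader base type as well, since other ORC errors could escape next() in the same way; that is a design choice for the fix and not something the trace above confirms.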