@liaoxin01
Contributor

@liaoxin01 liaoxin01 commented Feb 13, 2025

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #47610

Problem Summary:
SIGSEGV address not mapped to object (@0x0) received by PID 340906 (TID 341622 OR 0x7f7f38784640) from PID 0; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/common/signal_handler.h:421
1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
3# 0x00007F80B37D0520 in /lib/x86_64-linux-gnu/libc.so.6
4# doris::MemTableWriter::_flush_memtable_async() at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/olap/memtable_writer.cpp:157
5# doris::MemTableWriter::flush_async() at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/olap/memtable_writer.cpp:187
6# doris::MemTableMemoryLimiter::_flush_active_memtables(long) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/olap/memtable_memory_limiter.cpp:190
7# doris::MemTableMemoryLimiter::handle_memtable_flush() at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/olap/memtable_memory_limiter.cpp:144
8# doris::LoadChannelMgr::add_batch(doris::PTabletWriterAddBlockRequest const&, doris::PTabletWriterAddBlockResult*) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/runtime/load_channel_mgr.cpp:154
9# std::_Function_handler<void (), doris::PInternalService::tablet_writer_add_block(google::protobuf::RpcController*, doris::PTabletWriterAddBlockRequest const*, doris::PTabletWriterAddBlockResult*, google::protobuf::Closure*)::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
10# doris::WorkThreadPool<false>::work_thread(int) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/util/work_thread_pool.hpp:159
11# execute_native_thread_routine at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
12# start_thread at ./nptl/pthread_create.c:442
13# 0x00007F80B38B4850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83

This PR fixes a potential null pointer dereference that can occur when a write operation fails and the memtable is reset. It adds defensive null checks so that the _mem_table state is handled safely during memtable flush.
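
For readers who have not looked at the diff, the failure mode and the guard can be sketched roughly as follows. This is a minimal illustration and not the actual patch: `SketchMemTable`, `SketchMemTableWriter`, the `simulate_failure` flag, and the row-count return value are all hypothetical names made up for this example. Only the idea is taken from the description above: a failed write may leave `_mem_table` null, so the flush path has to check it before dereferencing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memtable; NOT the Doris class.
struct SketchMemTable {
    std::vector<int> rows;
};

// Hypothetical writer that only illustrates the null guard (assumed names).
class SketchMemTableWriter {
public:
    SketchMemTableWriter() : _mem_table(std::make_unique<SketchMemTable>()) {}

    // Simulates a write. On failure the memtable is reset, which leaves
    // _mem_table null until a later successful write recreates it.
    bool write(int row, bool simulate_failure = false) {
        std::lock_guard<std::mutex> lock(_lock);
        if (simulate_failure) {
            _mem_table.reset();
            return false;
        }
        if (_mem_table == nullptr) {
            _mem_table = std::make_unique<SketchMemTable>();
        }
        _mem_table->rows.push_back(row);
        return true;
    }

    // Flush path that another thread (for example a memory limiter) may call.
    // Returns the number of rows handed off, or 0 when there is nothing to flush.
    std::size_t flush_async() {
        std::lock_guard<std::mutex> lock(_lock);
        // Defensive check: a failed write may have reset the memtable.
        if (_mem_table == nullptr) {
            return 0;
        }
        std::unique_ptr<SketchMemTable> to_flush = std::move(_mem_table);
        assert(_mem_table == nullptr);  // a moved-from unique_ptr is null
        return to_flush->rows.size();
    }

private:
    std::mutex _lock;
    std::unique_ptr<SketchMemTable> _mem_table;
};
```

In this sketch, calling `flush_async()` right after a failed `write(...)` returns 0 instead of dereferencing a null pointer. Without the check, the same call would read through a null `_mem_table`, which matches the `@0x0` address in the trace above.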

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@liaoxin01
Contributor Author

run buildall

@liaoxin01
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31841 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 14c845996ac05ed6e9a2d8a87a8d9e01474a5afc, data reload: false

------ Round 1 ----------------------------------
q1	18610	5237	5094	5094
q2	2045	317	177	177
q3	11808	1265	734	734
q4	10221	995	534	534
q5	7596	2349	2375	2349
q6	191	173	134	134
q7	906	761	607	607
q8	9794	1288	1180	1180
q9	6588	4798	4805	4798
q10	7010	2311	1921	1921
q11	494	279	258	258
q12	348	357	217	217
q13	19169	3758	3136	3136
q14	228	227	214	214
q15	517	473	464	464
q16	620	618	581	581
q17	566	859	343	343
q18	6606	6199	6159	6159
q19	1533	933	530	530
q20	311	325	191	191
q21	2809	2185	1913	1913
q22	358	329	307	307
Total cold run time: 108328 ms
Total hot run time: 31841 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5179	5128	5120	5120
q2	231	329	226	226
q3	2136	2697	2316	2316
q4	1488	1807	1401	1401
q5	4372	4296	4286	4286
q6	212	175	129	129
q7	1998	1933	1818	1818
q8	2638	2616	2612	2612
q9	7361	7217	7341	7217
q10	3008	3227	2757	2757
q11	572	499	486	486
q12	684	757	647	647
q13	3604	3900	3239	3239
q14	289	289	273	273
q15	499	454	457	454
q16	639	693	644	644
q17	1124	1571	1375	1375
q18	7581	7502	7212	7212
q19	763	782	963	782
q20	1981	2027	1946	1946
q21	5420	4661	4762	4661
q22	614	582	560	560
Total cold run time: 52393 ms
Total hot run time: 50161 ms

@doris-robot

TPC-DS: Total hot run time: 189717 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 14c845996ac05ed6e9a2d8a87a8d9e01474a5afc, data reload: false

query1	1313	961	957	957
query2	6211	1832	1852	1832
query3	10950	4464	4288	4288
query4	54858	25485	23364	23364
query5	4998	567	495	495
query6	338	203	181	181
query7	4918	507	297	297
query8	312	245	235	235
query9	5705	2605	2592	2592
query10	406	294	251	251
query11	15073	15045	14850	14850
query12	152	109	101	101
query13	1059	521	384	384
query14	10492	6408	6239	6239
query15	209	199	173	173
query16	7000	652	494	494
query17	1044	709	574	574
query18	1508	419	334	334
query19	215	204	175	175
query20	131	126	127	126
query21	207	125	107	107
query22	4510	4664	4415	4415
query23	33952	33389	33241	33241
query24	5750	2426	2439	2426
query25	479	490	443	443
query26	677	295	166	166
query27	1824	484	342	342
query28	2825	2443	2437	2437
query29	549	579	419	419
query30	215	194	159	159
query31	859	881	808	808
query32	70	64	65	64
query33	441	348	312	312
query34	766	856	488	488
query35	864	849	762	762
query36	993	1001	904	904
query37	114	98	73	73
query38	4319	4424	4249	4249
query39	1483	1458	1448	1448
query40	215	113	101	101
query41	54	48	50	48
query42	124	107	104	104
query43	510	532	476	476
query44	1307	841	833	833
query45	186	176	163	163
query46	861	1063	666	666
query47	1852	1892	1826	1826
query48	396	439	320	320
query49	691	494	415	415
query50	696	755	434	434
query51	4326	4294	4256	4256
query52	108	101	96	96
query53	233	274	202	202
query54	490	489	414	414
query55	85	80	79	79
query56	257	280	258	258
query57	1150	1198	1139	1139
query58	240	235	250	235
query59	2831	2953	2763	2763
query60	321	282	258	258
query61	123	121	116	116
query62	749	735	698	698
query63	225	189	195	189
query64	1431	1044	669	669
query65	3307	3277	3257	3257
query66	732	383	291	291
query67	16113	15729	15422	15422
query68	5527	775	512	512
query69	523	311	266	266
query70	1203	1089	1101	1089
query71	455	299	265	265
query72	5989	3556	3808	3556
query73	1341	742	348	348
query74	9022	9141	8766	8766
query75	3204	3140	2686	2686
query76	3806	1167	749	749
query77	564	370	282	282
query78	10030	10222	9221	9221
query79	1797	825	593	593
query80	668	541	465	465
query81	503	274	239	239
query82	238	131	101	101
query83	179	174	154	154
query84	324	100	75	75
query85	764	355	300	300
query86	338	315	310	310
query87	4451	4541	4365	4365
query88	2768	2190	2163	2163
query89	389	316	290	290
query90	1720	191	193	191
query91	135	139	112	112
query92	63	60	56	56
query93	1613	1016	582	582
query94	630	418	307	307
query95	335	263	253	253
query96	489	554	270	270
query97	2744	2838	2732	2732
query98	222	212	203	203
query99	1325	1410	1283	1283
Total cold run time: 290761 ms
Total hot run time: 189717 ms

@wm1581066 wm1581066 added the p0_c label Feb 13, 2025
@doris-robot

ClickBench: Total hot run time: 30.46 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 14c845996ac05ed6e9a2d8a87a8d9e01474a5afc, data reload: false

query1	0.03	0.03	0.04
query2	0.07	0.03	0.03
query3	0.24	0.08	0.06
query4	1.61	0.11	0.10
query5	0.42	0.42	0.42
query6	1.16	0.65	0.66
query7	0.02	0.01	0.01
query8	0.04	0.03	0.04
query9	0.61	0.51	0.53
query10	0.59	0.59	0.57
query11	0.16	0.10	0.10
query12	0.15	0.11	0.11
query13	0.62	0.60	0.60
query14	2.68	2.80	2.79
query15	0.92	0.84	0.84
query16	0.38	0.37	0.37
query17	1.05	1.03	1.04
query18	0.22	0.20	0.19
query19	1.99	1.78	2.00
query20	0.01	0.01	0.01
query21	15.35	0.97	0.56
query22	0.76	1.21	0.62
query23	15.01	1.38	0.66
query24	7.18	0.87	0.63
query25	0.46	0.29	0.13
query26	0.62	0.16	0.15
query27	0.06	0.05	0.05
query28	9.08	0.91	0.43
query29	12.55	3.90	3.27
query30	0.25	0.09	0.06
query31	2.84	0.59	0.39
query32	3.22	0.53	0.47
query33	2.99	3.02	3.14
query34	15.78	5.09	4.50
query35	4.55	4.53	4.49
query36	0.65	0.50	0.48
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.16	0.13	0.13
query41	0.08	0.02	0.02
query42	0.03	0.02	0.03
query43	0.04	0.03	0.03
Total cold run time: 104.8 s
Total hot run time: 30.46 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 42.98% (11309/26312)
Line Coverage: 32.91% (94976/288561)
Region Coverage: 32.07% (48687/151832)
Branch Coverage: 27.92% (24555/87948)
Coverage Report: http://coverage.selectdb-in.cc/coverage/14c845996ac05ed6e9a2d8a87a8d9e01474a5afc_14c845996ac05ed6e9a2d8a87a8d9e01474a5afc/report/index.html

Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Feb 13, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@dataroaring dataroaring merged commit 7490ea8 into apache:master Feb 13, 2025
25 of 28 checks passed
github-actions bot pushed a commit that referenced this pull request Feb 13, 2025
…7860)

Related PR: #47610

Problem Summary:
SIGSEGV address not mapped to object (@0x0) received by PID 340906 (TID
341622 OR 0x7f7f38784640) from PID 0; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int,
siginfo_t*, void*) at
/home/zcp/repo_center/doris_branch-3.0/doris/be/src/common/signal_handler.h:421
1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0]
in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
2# JVM_handle_linux_signal in
/usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
 3# 0x00007F80B37D0520 in /lib/x86_64-linux-gnu/libc.so.6
4# doris::MemTableWriter::_flush_memtable_async() at
/home/zcp/repo_center/doris_branch-3.0/doris/be/src/olap/memtable_writer.cpp:157
5# doris::MemTableWriter::flush_async() at
/home/zcp/repo_center/doris_branch-3.0/doris/be/src/olap/memtable_writer.cpp:187
6# doris::MemTableMemoryLimiter::_flush_active_memtables(long) at
/home/zcp/repo_center/doris_branch-3.0/doris/be/src/olap/memtable_memory_limiter.cpp:190
7# doris::MemTableMemoryLimiter::handle_memtable_flush() at
/home/zcp/repo_center/doris_branch-3.0/doris/be/src/olap/memtable_memory_limiter.cpp:144
8# doris::LoadChannelMgr::add_batch(doris::PTabletWriterAddBlockRequest
const&, doris::PTabletWriterAddBlockResult*) at
/home/zcp/repo_center/doris_branch-3.0/doris/be/src/runtime/load_channel_mgr.cpp:154
9# std::_Function_handler<void (),
doris::PInternalService::tablet_writer_add_block(google::protobuf::RpcController*,
doris::PTabletWriterAddBlockRequest const*,
doris::PTabletWriterAddBlockResult*,
google::protobuf::Closure*)::$_0>::_M_invoke(std::_Any_data const&) at
/var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
10# doris::WorkThreadPool<false>::work_thread(int) at
/home/zcp/repo_center/doris_branch-3.0/doris/be/src/util/work_thread_pool.hpp:159
11# execute_native_thread_routine at
../../../../../libstdc++-v3/src/c++11/thread.cc:84
12# start_thread at ./nptl/pthread_create.c:442
13# 0x00007F80B38B4850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83

This PR addresses potential null pointer dereference crashes that could
occur when write operations fail and the memtable is reset. The changes
add defensive null checks to ensure safe handling of the _mem_table
state during flush memtable.
dataroaring pushed a commit that referenced this pull request Feb 13, 2025
…re reset #47860 (#47869)

Cherry-picked from #47860

Co-authored-by: Xin Liao <liaoxin@selectdb.com>
lzyy2024 pushed a commit to lzyy2024/doris that referenced this pull request Feb 21, 2025
…ache#47860)

kaijchen added a commit to kaijchen/doris that referenced this pull request Feb 28, 2025
dataroaring pushed a commit that referenced this pull request Mar 4, 2025
### What problem does this PR solve?

Issue Number: DORIS-18927

Related PR: #47860 and #47610
github-actions bot pushed a commit that referenced this pull request Mar 4, 2025
dataroaring pushed a commit that referenced this pull request Mar 10, 2025
Cherry-picked from #48489

Co-authored-by: Kaijie Chen <chenkaijie@selectdb.com>
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…ache#47860)

koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
Labels

approved Indicates a PR has been approved by one committer. dev/3.0.4-merged p0_c reviewed
