@github-actions
Contributor

@github-actions github-actions bot commented Feb 8, 2025

Cherry-picked from #47610

…t crash (#47610)

### What problem does this PR solve?

*** Query id: 5447701417c13e4e-cea25b10f284c6a5 ***
*** is nereids: 0 ***
*** tablet id: 1738818748602 ***
*** Aborted at 1738820047 (unix time) try "date -d @1738820047" if you are using GNU date ***
*** Current BE git commitID: 512681c ***
*** SIGSEGV invalid permissions for mapped object (@0x7f112a5df53f) received by PID 6310 (TID 6765 OR 0x7f1384ed3640) from PID 710800703; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/common/signal_handler.h:421
1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
3# 0x00007F14815CC520 in /lib/x86_64-linux-gnu/libc.so.6
4# doris::vectorized::ColumnVector<unsigned char>::insert_indices_from(doris::vectorized::IColumn const&, unsigned int const*, unsigned int const*) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/vec/columns/column_vector.cpp:323
5# doris::vectorized::MutableBlock::add_rows(doris::vectorized::Block const*, unsigned int const*, unsigned int const*, std::vector<int, std::allocator<int> > const*) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/vec/core/block.cpp:1036
6# doris::MemTable::_put_into_output(doris::vectorized::Block&) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/olap/memtable.cpp:257
7# doris::MemTable::_to_block(std::unique_ptr<doris::vectorized::Block, std::default_delete<doris::vectorized::Block> >*) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/olap/memtable.cpp:513
8# doris::MemTable::to_block(std::unique_ptr<doris::vectorized::Block, std::default_delete<doris::vectorized::Block> >*) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/olap/memtable.cpp:532
9# doris::FlushToken::_do_flush_memtable(doris::MemTable*, int, long*) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/olap/memtable_flush_executor.cpp:144
10# doris::FlushToken::_flush_memtable(std::shared_ptr<doris::MemTable>, int, long) in /mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be
11# doris::MemtableFlushTask::run() at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/olap/memtable_flush_executor.cpp:60
12# doris::ThreadPool::dispatch_thread() in /mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be
13# doris::Thread::supervise_thread(void*) at /home/zcp/repo_center/doris_branch-3.0/doris/be/src/util/thread.cpp:499
14# start_thread at ./nptl/pthread_create.c:442
15# 0x00007F14816B0850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83

Problem Summary:
- When a memtable insert fails (e.g., due to a memory allocation failure during add_rows), the memtable is left in an inconsistent state.
- Under memory pressure, the system might trigger a flush operation on this failed memtable, leading to crashes.

Solution:
- Reset the memtable immediately after an insert failure.
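
The failure mode and the fix described above can be illustrated with a small, self-contained sketch. This is not the Doris code: `ToyMemTable`, `ToyWriter`, and the `capacity` limit are hypothetical names, and the capacity check merely stands in for a memory allocation failure partway through an insert. The sketch also assumes the caller treats the load as failed, so discarding rows already buffered in the memtable is acceptable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a memtable: two parallel columns that must
// always have the same number of rows.
struct ToyMemTable {
    std::vector<int> keys;
    std::vector<int> values;
    std::size_t capacity = 1024;  // assumed limit that simulates an allocation failure

    // Appends a batch. On failure, the key column has already grown while the
    // value column has not, so the table is left inconsistent.
    bool insert(const std::vector<int>& ks, const std::vector<int>& vs) {
        keys.insert(keys.end(), ks.begin(), ks.end());
        if (values.size() + vs.size() > capacity) {
            return false;  // simulated failure in the middle of the insert
        }
        values.insert(values.end(), vs.begin(), vs.end());
        return true;
    }

    bool consistent() const { return keys.size() == values.size(); }

    // Drops all buffered rows and returns the table to a clean state.
    void reset() {
        keys.clear();
        values.clear();
    }

    // A flush reads both columns in lockstep, so it requires a consistent table.
    std::size_t flush() {
        assert(consistent());
        std::size_t rows = keys.size();
        reset();
        return rows;
    }
};

// Hypothetical writer that applies the fix: reset right after a failed insert,
// so a later flush never sees a half-written table.
struct ToyWriter {
    ToyMemTable mem;

    bool write(const std::vector<int>& ks, const std::vector<int>& vs) {
        if (!mem.insert(ks, vs)) {
            mem.reset();
            return false;
        }
        return true;
    }
};
```

Without the `reset()` call in `ToyWriter::write`, a flush triggered after the failed insert would operate on columns of different lengths, which is the kind of mismatch that can surface as an out-of-bounds access in a real columnar engine.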
@github-actions github-actions bot requested a review from dataroaring as a code owner February 8, 2025 06:26
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring closed this Feb 8, 2025
@dataroaring dataroaring reopened this Feb 8, 2025
@hello-stephen
Contributor

run buildall

@doris-robot

TPC-H: Total hot run time: 40960 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 875d9fbafc0c934d0a870223f5da410c65756023, data reload: false

------ Round 1 ----------------------------------
q1	17578	7352	7219	7219
q2	2054	163	179	163
q3	10744	1086	1172	1086
q4	10467	728	821	728
q5	7762	2832	2772	2772
q6	230	147	142	142
q7	970	620	602	602
q8	9349	1969	2025	1969
q9	6552	6418	6413	6413
q10	7029	2314	2286	2286
q11	471	273	271	271
q12	399	215	213	213
q13	17806	2953	2953	2953
q14	235	216	223	216
q15	570	529	520	520
q16	666	582	585	582
q17	977	538	562	538
q18	7278	6763	6776	6763
q19	1397	1107	1034	1034
q20	481	213	204	204
q21	3978	3348	3288	3288
q22	1099	998	1013	998
Total cold run time: 108092 ms
Total hot run time: 40960 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7310	7271	7249	7249
q2	334	250	229	229
q3	2935	2991	2992	2991
q4	2034	1890	1875	1875
q5	5832	5758	5789	5758
q6	226	138	139	138
q7	2318	1786	1870	1786
q8	3353	3563	3562	3562
q9	8919	9236	9131	9131
q10	3722	3593	3590	3590
q11	635	531	500	500
q12	824	658	677	658
q13	14316	3189	3185	3185
q14	305	272	282	272
q15	574	528	521	521
q16	720	650	649	649
q17	1870	1626	1600	1600
q18	8369	7746	7911	7746
q19	1704	1498	1567	1498
q20	2050	1896	1887	1887
q21	5419	5424	5315	5315
q22	1111	1073	1000	1000
Total cold run time: 74880 ms
Total hot run time: 61140 ms

@doris-robot

TPC-DS: Total hot run time: 197881 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 875d9fbafc0c934d0a870223f5da410c65756023, data reload: false

query1	1291	905	902	902
query2	6230	2119	2056	2056
query3	10828	4452	4293	4293
query4	61474	29791	23694	23694
query5	5141	456	457	456
query6	391	175	175	175
query7	5462	309	309	309
query8	298	225	217	217
query9	8616	2668	2655	2655
query10	451	274	259	259
query11	17679	15427	15999	15427
query12	162	107	108	107
query13	1443	431	429	429
query14	10924	6874	6764	6764
query15	201	185	183	183
query16	7170	502	534	502
query17	1140	598	584	584
query18	1885	325	325	325
query19	223	162	156	156
query20	116	113	114	113
query21	207	112	109	109
query22	4775	4866	4489	4489
query23	34688	34159	34207	34159
query24	6221	2916	2865	2865
query25	566	435	436	435
query26	659	172	169	169
query27	1854	356	361	356
query28	4278	2437	2406	2406
query29	710	469	427	427
query30	240	161	159	159
query31	970	825	855	825
query32	67	58	56	56
query33	471	295	286	286
query34	927	497	515	497
query35	845	728	729	728
query36	1135	974	988	974
query37	123	74	73	73
query38	4077	4162	4179	4162
query39	1505	1451	1494	1451
query40	199	96	101	96
query41	49	55	50	50
query42	112	100	99	99
query43	565	507	495	495
query44	1165	827	822	822
query45	189	182	170	170
query46	1145	733	749	733
query47	2055	1952	1966	1952
query48	473	375	377	375
query49	712	386	396	386
query50	845	435	424	424
query51	7313	7310	7050	7050
query52	106	86	88	86
query53	258	182	184	182
query54	586	454	461	454
query55	79	79	80	79
query56	275	262	257	257
query57	1262	1159	1173	1159
query58	215	202	205	202
query59	3082	3082	2917	2917
query60	267	255	253	253
query61	107	105	105	105
query62	836	735	732	732
query63	212	188	181	181
query64	1384	681	627	627
query65	3264	3221	3216	3216
query66	634	301	295	295
query67	15926	15879	15531	15531
query68	4158	567	560	560
query69	414	268	271	268
query70	1189	1161	1163	1161
query71	332	253	254	253
query72	6316	3954	4036	3954
query73	744	345	343	343
query74	10425	8995	9070	8995
query75	3393	2621	2677	2621
query76	1963	1084	1093	1084
query77	484	275	271	271
query78	10697	9654	9609	9609
query79	2086	586	584	584
query80	1321	419	426	419
query81	516	247	238	238
query82	1220	125	122	122
query83	182	145	144	144
query84	284	75	80	75
query85	986	290	283	283
query86	385	305	298	298
query87	4430	4256	4315	4256
query88	3927	2364	2349	2349
query89	420	294	292	292
query90	1859	186	184	184
query91	176	150	146	146
query92	64	51	50	50
query93	2599	537	533	533
query94	729	287	301	287
query95	350	254	253	253
query96	622	276	274	274
query97	3374	3189	3219	3189
query98	227	216	202	202
query99	1696	1424	1429	1424
Total cold run time: 316326 ms
Total hot run time: 197881 ms

@doris-robot

ClickBench: Total hot run time: 32.89 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 875d9fbafc0c934d0a870223f5da410c65756023, data reload: false

query1	0.04	0.03	0.03
query2	0.07	0.03	0.03
query3	0.23	0.07	0.07
query4	1.62	0.10	0.10
query5	0.52	0.52	0.53
query6	1.13	0.73	0.73
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.56	0.50	0.50
query10	0.55	0.55	0.55
query11	0.14	0.10	0.10
query12	0.14	0.11	0.11
query13	0.62	0.58	0.60
query14	2.86	2.84	2.79
query15	0.88	0.83	0.83
query16	0.37	0.38	0.38
query17	1.07	1.03	1.05
query18	0.23	0.22	0.22
query19	1.87	1.85	1.97
query20	0.01	0.01	0.01
query21	15.35	0.58	0.55
query22	2.60	1.80	1.64
query23	17.13	0.92	0.83
query24	2.76	0.98	1.76
query25	0.14	0.27	0.25
query26	0.34	0.13	0.14
query27	0.04	0.03	0.05
query28	10.56	1.10	1.07
query29	12.61	3.23	3.25
query30	0.25	0.06	0.06
query31	2.86	0.40	0.38
query32	3.24	0.46	0.45
query33	2.96	3.05	3.02
query34	16.98	4.48	4.49
query35	4.52	4.52	4.49
query36	0.67	0.50	0.50
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.17	0.13	0.13
query41	0.08	0.03	0.02
query42	0.04	0.03	0.02
query43	0.03	0.04	0.03
Total cold run time: 106.47 s
Total hot run time: 32.89 s

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Feb 8, 2025
@github-actions
Contributor Author

github-actions bot commented Feb 8, 2025

PR approved by at least one committer and no changes requested.

@github-actions
Contributor Author

github-actions bot commented Feb 8, 2025

PR approved by anyone and no changes requested.

@dataroaring dataroaring merged commit ef5864d into branch-3.0 Feb 10, 2025
22 of 24 checks passed
@github-actions github-actions bot deleted the auto-pick-47610-branch-3.0 branch February 10, 2025 02:08
Labels

approved (Indicates a PR has been approved by one committer.), reviewed

5 participants