@zhangstar333
Contributor

@zhangstar333 zhangstar333 commented Aug 4, 2025

picked from #54137

…pache#54137)

Problem Summary:
```
doris_be: /mnt/disk8/zhangsida/doris/thirdparty/installed/include/parallel_hashmap/phmap.h:1511: void phmap::priv::raw_hash_set<phmap::priv::FlatHashMapPolicy<unsigned char, char *>, HashCRC32<doris::vectorized::UInt8>, phmap::EqualTo<unsigned char>, doris::vectorized::Allocator_<std::pair<const unsigned char, char *>>>::constructor::operator()(Args &&...) const [Policy = phmap::priv::FlatHashMapPolicy<unsigned char, char *>, Hash = HashCRC32<doris::vectorized::UInt8>, Eq = phmap::EqualTo<unsigned char>, Alloc = doris::vectorized::Allocator_<std::pair<const unsigned char, char *>>, Args = <unsigned char &, std::nullptr_t>]: Assertion `*slot_' failed.
*** Query id: 6bda59ac672d4496-bcf117ae8ce3f894 ***
*** is nereids: 1 ***
*** tablet id: 0 ***
*** Aborted at 1753928330 (unix time) try "date -d @1753928330" if you are using GNU date ***
*** Current BE git commitID: 9634891 ***
*** SIGABRT unknown detail explain (@0x3ef003cad02) received by PID 3976450 (TID 3979645 OR 0x7fad84c15700) from PID 3976450; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /mnt/disk8/zhangsida/doris/be/src/common/signal_handler.h:420
 1# 0x00007FAF620E25B0 in /lib64/libc.so.6
 2# __GI_raise in /lib64/libc.so.6
 3# __GI_abort in /lib64/libc.so.6
 4# _nl_load_domain.cold.0 in /lib64/libc.so.6
 5# 0x00007FAF620DAE86 in /lib64/libc.so.6
 6# void phmap::priv::raw_hash_set<phmap::priv::FlatHashMapPolicy<unsigned char, char*>, HashCRC32<unsigned char>, phmap::EqualTo<unsigned char>, doris::vectorized::Allocator_<std::pair<unsigned char const, char*> > >::constructor::operator()<unsigned char&, decltype(nullptr)>(unsigned char&, decltype(nullptr)&&) const at /mnt/disk8/zhangsida/doris/thirdparty/installed/include/parallel_hashmap/phmap.h:1511
 7# _ZNSt8__detail9__variant17__gen_vtable_implINS0_12_Multi_arrayIPFNS0_21__deduce_visit_resultIbEEON5doris10vectorized8OverloadIJZNS5_8pipeline17AggSinkLocalState30_emplace_into_hash_table_limitEPPcPNS6_5BlockERKSt6vectorIiSaIiEERSE_IPKNS6_7IColumnESaISL_EEjE3$_0ZNS9_30_emplace_into_hash_table_limitESB_SD_SI_SO_jE3$_1EEERSt7variantIJSt9monostateNS6_16MethodSerializedI9PHHashMapINS5_9StringRefESA_11DefaultHashISX_vEEEENS6_15MethodOneNumberIhSW_IhSA_9HashCRC32IhEEEENS12_ItSW_ItSA_S13_ItEEEENS12_IjSW_IjSA_S13_IjEEEENS12_ImSW_ImSA_S13_ImEEEENS6_19MethodStringNoCacheINS5_13StringHashMapISA_NS5_9AllocatorILb1ELb1ELb0ENS5_22DefaultMemoryAllocatorELb1EEEEEEENS12_IN4wide7integerILm128EjEESW_IS1P_SA_S13_IS1P_EEEENS12_INS1O_ILm256EjEESW_IS1T_SA_S13_IS1T_EEEENS12_IjSW_IjSA_14HashMixWrapperIjS1A_EEEENS12_ImSW_ImSA_S1X_ImS1D_EEEENS6_26MethodSingleNullableColumnINS12_IhNS6_15DataWithNullKeyIS15_EEEEEENS24_INS12_ItNS25_IS18_EEEEEENS24_INS12_IjNS25_IS1B_EEEEEENS24_INS12_ImNS25_IS1E_EEEEEENS24_INS12_IjNS25_IS1Z_EEEEEENS24_INS12_ImNS25_IS22_EEEEEENS24_INS12_IS1P_NS25_IS1R_EEEEEENS24_INS12_IS1T_NS25_IS1V_EEEEEENS24_INS1G_INS25_IS1L_EEEEEENS6_15MethodKeysFixedIS1E_EENS2X_IS1R_EENS2X_IS1V_EENS2X_ISW_INS6_7UInt136ESA_S13_IS31_EEEEEEEJEEESt16integer_sequenceImJLm11EEEE14__visit_invokeESS_S36_ at /mnt/disk8/zhangsida/install_data/ldb_toolchain_robin/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/variant:1032
 8# doris::pipeline::AggSinkLocalState::_emplace_into_hash_table_limit(char**, doris::vectorized::Block*, std::vector<int, std::allocator<int> > const&, std::vector<doris::vectorized::IColumn const*, std::allocator<doris::vectorized::IColumn const*> >&, unsigned int) at /mnt/disk8/zhangsida/doris/be/src/pipeline/exec/aggregation_sink_operator.cpp:584
 9# doris::Status doris::pipeline::AggSinkLocalState::_execute_with_serialized_key_helper<true>(doris::vectorized::Block*) at /mnt/disk8/zhangsida/doris/be/src/pipeline/exec/aggregation_sink_operator.cpp:496
10# doris::pipeline::AggSinkLocalState::_execute_with_serialized_key(doris::vectorized::Block*) at /mnt/disk8/zhangsida/doris/be/src/pipeline/exec/aggregation_sink_operator.cpp:443
11# doris::pipeline::AggSinkLocalState::Executor<false, false>::execute(doris::pipeline::AggSinkLocalState*, doris::vectorized::Block*) at /mnt/disk8/zhangsida/doris/be/src/pipeline/exec/aggregation_sink_operator.h:62
```

```
                                try {
                                    HashMethodType::try_presis_key_and_origin(key, origin,
                                                                              _agg_arena_pool);
                                    auto mapped =
                                            _shared_state->aggregate_data_container->append_data(
                                                    origin);
                                    auto st = _create_agg_status(mapped);
                                    if (!st) {
                                        throw Exception(st.code(), st.to_string());
                                    }
                                    ctor(key, mapped);
                                    _shared_state->refresh_top_limit(i, key_columns);
                                } catch (...) {
                                    // Exception-safety - if it can not allocate memory or create status,
                                    // the destructors will not be called.
                                    ctor(key, nullptr);
                                    throw;
                                }
```

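When `_shared_state->refresh_top_limit(i, key_columns);` throws an exception,
the catch block executes `ctor(key, nullptr);`. By that point `ctor(key, mapped)`
has already been executed for the same key, so the slot is constructed twice and
the `*slot_` assertion in phmap.h fails.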
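
The failure mode can be reproduced outside Doris with a minimal sketch. Everything below is a hypothetical stand-in: `SlotCtor`, the free function `refresh_top_limit(bool)`, `emplace_buggy`, and `emplace_guarded` are illustrative names, not Doris or phmap APIs. The real phmap constructor asserts on a second call, whereas this sketch only counts calls so it can run to completion. The guard shown is one possible way to avoid the double construction and is not necessarily the change made in this PR.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the slot constructor handed to the emplace
// callback. It only counts invocations; phmap would abort on the second one.
struct SlotCtor {
    int calls = 0;
    void operator()(char* /*mapped*/) { ++calls; }
};

// Hypothetical stand-in for refresh_top_limit(); throws on demand.
inline void refresh_top_limit(bool fail) {
    if (fail) throw std::runtime_error("refresh_top_limit failed");
}

// Same shape as the snippet above: the throwing call sits inside the try
// block, after the slot has already been constructed.
inline int emplace_buggy(bool fail) {
    SlotCtor ctor;
    char value = 0;
    try {
        try {
            ctor(&value);             // slot constructed here
            refresh_top_limit(fail);  // may throw afterwards
        } catch (...) {
            ctor(nullptr);            // second construction of the same slot
            throw;
        }
    } catch (const std::exception&) {
        // Swallowed only so the sketch can report the call count.
    }
    return ctor.calls;
}

// One possible guard (an assumption, not the actual patch): remember whether
// the slot was constructed and skip the fallback if it was.
inline int emplace_guarded(bool fail) {
    SlotCtor ctor;
    char value = 0;
    bool constructed = false;
    try {
        try {
            ctor(&value);
            constructed = true;
            refresh_top_limit(fail);
        } catch (...) {
            if (!constructed) ctor(nullptr);
            throw;
        }
    } catch (const std::exception&) {
        // Swallowed only so the sketch can report the call count.
    }
    assert(ctor.calls <= 1);  // each slot is constructed at most once
    return ctor.calls;
}
```

With `fail` set to `true`, `emplace_buggy` invokes the constructor twice while `emplace_guarded` invokes it once. Moving the throwing call out of the try block entirely would be another way to get the same effect.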

None

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
@zhangstar333 zhangstar333 requested a review from morrySnow as a code owner August 4, 2025 06:14
@Thearas
Contributor

Thearas commented Aug 4, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@zhangstar333
Contributor Author

run buildall

@morrySnow morrySnow changed the title from "branch-31: [Bug](agg) fix agg with topn limit coredump with ctor same key twice (#54137)" to "branch-31: [Bug](agg) fix agg with topn limit coredump with ctor same key twice #54137" Aug 4, 2025
@morrySnow morrySnow changed the title from "branch-31: [Bug](agg) fix agg with topn limit coredump with ctor same key twice #54137" to "branch-3.1: [Bug](agg) fix agg with topn limit coredump with ctor same key twice #54137" Aug 4, 2025
@doris-robot

TPC-H: Total hot run time: 32395 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 1336115b02a68509537c81ca76d0354b479bce5f, data reload: false

------ Round 1 ----------------------------------
q1	17590	5431	5341	5341
q2	2055	276	169	169
q3	10575	1264	750	750
q4	10307	879	448	448
q5	9282	2339	2132	2132
q6	188	165	137	137
q7	910	757	617	617
q8	9335	1410	1158	1158
q9	5223	4898	4881	4881
q10	6788	2273	1836	1836
q11	521	280	260	260
q12	335	358	210	210
q13	17759	3569	3022	3022
q14	228	234	206	206
q15	534	466	469	466
q16	411	421	376	376
q17	593	843	351	351
q18	6800	6441	6364	6364
q19	1371	946	545	545
q20	316	340	214	214
q21	3040	2121	1942	1942
q22	1023	1028	970	970
Total cold run time: 105184 ms
Total hot run time: 32395 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5540	5547	5443	5443
q2	236	332	232	232
q3	2203	2602	2313	2313
q4	1344	1774	1349	1349
q5	4384	4855	4968	4855
q6	174	167	128	128
q7	2002	2003	1803	1803
q8	2580	2770	2675	2675
q9	7204	7217	7166	7166
q10	2989	3339	2738	2738
q11	568	493	481	481
q12	625	730	630	630
q13	3427	3842	3169	3169
q14	285	299	276	276
q15	524	477	473	473
q16	435	507	432	432
q17	1225	1719	1257	1257
q18	7616	7481	7342	7342
q19	785	963	1111	963
q20	1999	2052	1869	1869
q21	5267	4855	4557	4557
q22	1080	1057	1000	1000
Total cold run time: 52492 ms
Total hot run time: 51151 ms

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/1) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 45.39% (12636/27838)
Line Coverage 36.25% (112647/310736)
Region Coverage 35.30% (58226/164955)
Branch Coverage 32.49% (31667/97456)

@doris-robot

TPC-DS: Total hot run time: 196748 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1336115b02a68509537c81ca76d0354b479bce5f, data reload: false

query1	1288	907	903	903
query2	6326	1885	1870	1870
query3	10923	4582	4442	4442
query4	33458	23827	23533	23533
query5	3729	619	456	456
query6	268	199	182	182
query7	3993	488	310	310
query8	305	244	242	242
query9	9341	2570	2562	2562
query10	467	332	248	248
query11	18228	15361	15229	15229
query12	164	109	104	104
query13	1586	539	420	420
query14	10046	6578	7672	6578
query15	229	205	190	190
query16	7802	636	499	499
query17	1569	750	589	589
query18	2045	403	308	308
query19	230	183	171	171
query20	126	117	119	117
query21	207	125	111	111
query22	4562	4496	4413	4413
query23	35317	34995	34165	34165
query24	7416	2736	2725	2725
query25	474	490	443	443
query26	1099	274	176	176
query27	2092	481	369	369
query28	5033	2206	2174	2174
query29	580	598	489	489
query30	254	210	164	164
query31	1035	901	832	832
query32	70	62	63	62
query33	531	384	318	318
query34	765	848	528	528
query35	819	800	729	729
query36	1016	1037	976	976
query37	117	101	73	73
query38	4008	3976	4024	3976
query39	1558	1467	1463	1463
query40	214	120	114	114
query41	65	62	56	56
query42	125	108	108	108
query43	509	530	477	477
query44	1325	826	822	822
query45	185	185	180	180
query46	883	1049	667	667
query47	1982	2005	1915	1915
query48	412	431	386	386
query49	789	506	455	455
query50	685	708	422	422
query51	7349	7156	7303	7156
query52	108	106	90	90
query53	237	252	192	192
query54	552	557	459	459
query55	80	80	78	78
query56	272	279	265	265
query57	1263	1260	1217	1217
query58	238	218	215	215
query59	3108	3270	2989	2989
query60	314	286	269	269
query61	124	117	118	117
query62	783	729	693	693
query63	232	192	186	186
query64	3885	1032	675	675
query65	3324	3248	3278	3248
query66	967	406	310	310
query67	16012	15757	15489	15489
query68	6650	814	541	541
query69	491	303	264	264
query70	1196	1098	1088	1088
query71	391	295	264	264
query72	5655	3675	3798	3675
query73	642	734	356	356
query74	10311	9309	9339	9309
query75	3165	3143	2656	2656
query76	3135	1182	786	786
query77	479	347	306	306
query78	10383	10394	9718	9718
query79	3716	913	590	590
query80	726	522	434	434
query81	507	259	218	218
query82	641	117	88	88
query83	168	159	142	142
query84	247	98	85	85
query85	787	356	307	307
query86	395	327	291	291
query87	4324	4365	4227	4227
query88	5000	2398	2368	2368
query89	416	334	295	295
query90	1795	189	190	189
query91	141	140	112	112
query92	70	56	53	53
query93	2623	894	530	530
query94	702	415	295	295
query95	337	277	278	277
query96	490	598	281	281
query97	3178	3295	3127	3127
query98	218	215	204	204
query99	1453	1401	1283	1283
Total cold run time: 295530 ms
Total hot run time: 196748 ms

@doris-robot

ClickBench: Total hot run time: 28.86 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 1336115b02a68509537c81ca76d0354b479bce5f, data reload: false

query1	0.03	0.04	0.03
query2	0.07	0.03	0.03
query3	0.23	0.07	0.06
query4	1.60	0.11	0.11
query5	0.54	0.53	0.51
query6	1.13	0.74	0.72
query7	0.03	0.02	0.01
query8	0.04	0.03	0.03
query9	0.58	0.52	0.51
query10	0.55	0.54	0.55
query11	0.15	0.11	0.10
query12	0.14	0.11	0.11
query13	0.62	0.60	0.59
query14	0.76	0.80	0.78
query15	0.83	0.85	0.82
query16	0.37	0.39	0.39
query17	1.05	1.05	1.06
query18	0.23	0.22	0.22
query19	1.93	1.90	1.76
query20	0.02	0.01	0.02
query21	15.37	0.90	0.58
query22	0.76	0.76	0.63
query23	15.14	1.44	0.50
query24	2.79	1.00	1.50
query25	0.11	0.20	0.24
query26	0.27	0.14	0.13
query27	0.06	0.04	0.05
query28	14.04	1.00	0.44
query29	12.57	3.93	3.26
query30	0.25	0.09	0.06
query31	2.83	0.60	0.38
query32	3.22	0.53	0.45
query33	3.00	2.99	2.96
query34	16.83	5.20	4.54
query35	4.60	4.53	4.56
query36	0.67	0.49	0.48
query37	0.08	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.02
query40	0.16	0.14	0.12
query41	0.08	0.02	0.02
query42	0.03	0.02	0.03
query43	0.03	0.03	0.04
Total cold run time: 103.86 s
Total hot run time: 28.86 s

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 76.22% (20965/27506)
Line Coverage 69.63% (216212/310529)
Region Coverage 67.64% (129397/191294)
Branch Coverage 61.20% (67311/109982)

@morrySnow morrySnow merged commit a7ac92b into apache:branch-3.1 Aug 4, 2025
26 of 27 checks passed

5 participants