@amorynan
Contributor

FIX:

  1. An array column containing empty rows made the doc ID incorrect.
  2. For an array column with an inverted index on a large dataset, a triggered compaction could crash with a core dump like the following:
=================================================================
==1006012==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f4170e7f800 at pc 0x5564999e655a bp 0x7f46478a19d0 sp 0x7f46478a19c8
READ of size 1 at 0x7f4170e7f800 thread T1281 (CumuCompactionT)
    #0 0x5564999e6559 in doris::segment_v2::InvertedIndexColumnWriterImpl<(doris::FieldType)5>::add_array_values(unsigned long, void const*, unsigned char const*, unsigned char const*, unsigned long) /mnt/disk1/wangqiannan/amory/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:418:25
    #1 0x55649995254f in doris::segment_v2::ArrayColumnWriter::append_data(unsigned char const**, unsigned long) /mnt/disk1/wangqiannan/amory/doris/be/src/olap/rowset/segment_v2/column_writer.cpp:960:13
    #2 0x556499952a7d in doris::segment_v2::ArrayColumnWriter::append_nullable(unsigned char const*, unsigned char const**, unsigned long) /mnt/disk1/wangqiannan/amory/doris/be/src/olap/rowset/segment_v2/column_writer.cpp:978:5
    #3 0x55649994568c in doris::segment_v2::ColumnWriter::append(unsigned char const*, void const*, unsigned long) /mnt/disk1/wangqiannan/amory/doris/be/src/olap/rowset/segment_v2/column_writer.cpp:419:16
    #4 0x5564998fdfc0 in doris::segment_v2::SegmentWriter::append_block(doris::vectorized::Block const*, unsigned long, unsigned long) /mnt/disk1/wangqiannan/amory/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:842:9
    #5 0x556499ad46df in doris::VerticalBetaRowsetWriter<doris::BetaRowsetWriter>::add_columns(doris::vectorized::Block const*, std::vector<unsigned int, std::allocator<unsigned int>> const&, bool, unsigned int) /mnt/disk1/wangqiannan/amory/doris/be/src/olap/rowset/vertical_beta_rowset_writer.cpp:112:13
    #6 0x556499146a2a in doris::Merger::vertical_compact_one_group(std::shared_ptr<doris::BaseTablet>, doris::ReaderType, doris::TabletSchema const&, bool, std::vector<unsigned int, std::allocator<unsigned int>> const&, doris::vectorized::RowSourcesBuffer*, std::vector<std::shared_ptr<doris::RowsetReader>, std::allocator<std::shared_ptr<doris::RowsetReader>>> const&, doris::RowsetWriter*, long, doris::Merger::Statistics*, std::vector<unsigned int, std::allocator<unsigned int>>) /mnt/disk1/wangqiannan/amory/doris/be/src/olap/merger.cpp:297:9
    #7 0x556499149b6f in doris::Merger::vertical_merge_rowsets(std::shared_ptr<doris::BaseTablet>, doris::ReaderType, doris::TabletSchema const&, std::vector<std::shared_ptr<doris::RowsetReader>, std::allocator<std::shared_ptr<doris::RowsetReader>>> const&, doris::RowsetWriter*, long, doris::Merger::Statistics*) /mnt/disk1/wangqiannan/amory/doris/be/src/olap/merger.cpp:406:9
    #8 0x5564990accf9 in doris::Compaction::merge_input_rowsets() /mnt/disk1/wangqiannan/amory/doris/be/src/olap/compaction.cpp:175:19
    #9 0x5564990b4cc3 in doris::CompactionMixin::execute_compact_impl(long) /mnt/disk1/wangqiannan/amory/doris/be/src/olap/compaction.cpp:435:5
    #10 0x5564990b33ff in doris::CompactionMixin::execute_compact() /mnt/disk1/wangqiannan/amory/doris/be/src/olap/compaction.cpp:388:17
    #11 0x556499df7514 in doris::CumulativeCompaction::execute_compact() /mnt/disk1/wangqiannan/amory/doris/be/src/olap/cumulative_compaction.cpp:101:5
    #12 0x556499d94e2f in doris::Tablet::execute_compaction(doris::CompactionMixin&) /mnt/disk1/wangqiannan/amory/doris/be/src/olap/tablet.cpp:1662:29
    #13 0x5564990097d8 in doris::StorageEngine::_submit_compaction_task(std::shared_ptr<doris::Tablet>, doris::CompactionType, bool)::$_1::operator()() const /mnt/disk1/wangqiannan/amory/doris/be/src/olap/olap_server.cpp:1001:25
    #14 0x556499009304 in void std::__invoke_impl<void, doris::StorageEngine::_submit_compaction_task(std::shared_ptr<doris::Tablet>, doris::CompactionType, bool)::$_1&>(std::__invoke_other, doris::StorageEngine::_submit_compaction_task(std::shared_ptr<doris::Tablet>, doris::CompactionType, bool)::$_1&) /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14
    #15 0x5564990092a4 in std::enable_if<is_invocable_r_v<void, doris::StorageEngine::_submit_compaction_task(std::shared_ptr<doris::Tablet>, doris::CompactionType, bool)::$_1&>, void>::type std::__invoke_r<void, doris::StorageEngine::_submit_compaction_task(std::shared_ptr<doris::Tablet>, doris::CompactionType, bool)::$_1&>(doris::StorageEngine::_submit_compaction_task(std::shared_ptr<doris::Tablet>, doris::CompactionType, bool)::$_1&) /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:111:2
    #16 0x5564990090ac in std::_Function_handler<void (), doris::StorageEngine::_submit_compaction_task(std::shared_ptr<doris::Tablet>, doris::CompactionType, bool)::$_1>::_M_invoke(std::_Any_data const&) /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291:9
    #17 0x556497352fb2 in std::function<void ()>::operator()() const /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:560:9
    #18 0x55649b1a6018 in doris::FunctionRunnable::run() /mnt/disk1/wangqiannan/amory/doris/be/src/util/threadpool.cpp:48:27
    #19 0x55649b191dcd in doris::ThreadPool::dispatch_thread() /mnt/disk1/wangqiannan/amory/doris/be/src/util/threadpool.cpp:543:24
    #20 0x55649b1b8ef3 in void std::__invoke_impl<void, void (doris::ThreadPool::*&)(), doris::ThreadPool*&>(std::__invoke_memfun_deref, void (doris::ThreadPool::*&)(), doris::ThreadPool*&) /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    #21 0x55649b1b8dcc in std::__invoke_result<void (doris::ThreadPool::*&)(), doris::ThreadPool*&>::type std::__invoke<void (doris::ThreadPool::*&)(), doris::ThreadPool*&>(void (doris::ThreadPool::*&)(), doris::ThreadPool*&) /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    #22 0x55649b1b8d54 in void std::_Bind<void (doris::ThreadPool::* (doris::ThreadPool*))()>::__call<void, 0ul>(std::tuple<>&&, std::_Index_tuple<0ul>) /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/functional:420:11
    #23 0x55649b1b8bfd in void std::_Bind<void (doris::ThreadPool::* (doris::ThreadPool*))()>::operator()<void>() /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/functional:503:17
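
The two failure modes described above can be illustrated with a small, self-contained sketch. This is not the Doris implementation: `ToyArrayIndexWriter`, `add_array_rows`, and the offsets/null-map layout are hypothetical names and assumptions chosen only to show the idea. Each array row should consume exactly one doc ID even when it holds no items, and per-item reads should stay inside the range given by the row offsets.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an inverted index writer (not Doris code).
// It records the doc id (row id) under which each item was indexed.
struct ToyArrayIndexWriter {
    std::vector<std::pair<std::uint32_t, std::string>> postings;
    std::uint32_t next_doc_id = 0;

    // Assumed layout: offsets has row_count + 1 entries and row i owns the
    // items in [offsets[i], offsets[i + 1]); item_nulls holds one byte per
    // nested item (non-zero means null).
    void add_array_rows(const std::vector<std::string>& items,
                        const std::vector<std::uint8_t>& item_nulls,
                        const std::vector<std::size_t>& offsets) {
        if (offsets.empty() || item_nulls.size() != items.size()) {
            throw std::invalid_argument("malformed array column");
        }
        const std::size_t row_count = offsets.size() - 1;
        for (std::size_t row = 0; row < row_count; ++row) {
            const std::size_t begin = offsets[row];
            const std::size_t end = offsets[row + 1];
            // Reject ranges that would read past the item and null-map buffers.
            if (begin > end || end > items.size()) {
                throw std::out_of_range("offsets exceed item buffer");
            }
            for (std::size_t i = begin; i < end; ++i) {
                if (item_nulls[i] != 0) continue;  // skip null items
                postings.emplace_back(next_doc_id, items[i]);
            }
            // An empty row indexes nothing but still consumes one doc id,
            // so later rows keep the correct id.
            ++next_doc_id;
        }
    }
};
```

With items `a`, `b`, `c` and offsets `0, 2, 2, 3`, the second row is empty: `a` and `b` are recorded under doc ID 0 and `c` under doc ID 2, and the writer ends with `next_doc_id == 3`.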

Proposed changes

Issue Number: close #xxx


@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Member

@airborne12 airborne12 left a comment

This may need a p1 test case.

@amorynan
Contributor Author

This may need a p1 test case.

Yes, I will add it later.

Contributor

@zzzxl1993 zzzxl1993 left a comment

LGTM

@github-actions
Contributor

PR approved by anyone and no changes requested.

@amorynan
Contributor Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.19% (8919/25342)
Line Coverage: 26.98% (73346/271879)
Region Coverage: 26.16% (37898/144890)
Branch Coverage: 22.98% (19298/83994)
Coverage Report: http://coverage.selectdb-in.cc/coverage/d6b412da3318cc26ae30b40480e9fbc537cd2322_d6b412da3318cc26ae30b40480e9fbc537cd2322/report/index.html

@amorynan
Contributor Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.19% (8921/25351)
Line Coverage: 26.97% (73364/271977)
Region Coverage: 26.15% (37896/144939)
Branch Coverage: 22.96% (19293/84012)
Coverage Report: http://coverage.selectdb-in.cc/coverage/eb5599fe2d1efb2630a6f3541b34dda588d6eed6_eb5599fe2d1efb2630a6f3541b34dda588d6eed6/report/index.html

@doris-robot

TPC-DS: Total hot run time: 186174 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit eb5599fe2d1efb2630a6f3541b34dda588d6eed6, data reload: false

query1	912	365	340	340
query2	6463	2246	2242	2242
query3	6648	202	213	202
query4	24504	21316	21498	21316
query5	4143	414	409	409
query6	274	187	181	181
query7	4583	288	288	288
query8	257	206	191	191
query9	8644	2308	2292	2292
query10	442	250	253	250
query11	14720	14263	14332	14263
query12	139	96	87	87
query13	1649	365	375	365
query14	9940	7462	6693	6693
query15	229	181	180	180
query16	8055	259	265	259
query17	1857	584	556	556
query18	2079	283	272	272
query19	211	156	152	152
query20	92	86	86	86
query21	200	122	124	122
query22	5068	4861	4840	4840
query23	33643	33167	33091	33091
query24	11926	2909	3032	2909
query25	649	379	372	372
query26	1730	151	145	145
query27	3049	311	324	311
query28	7632	1998	1987	1987
query29	995	616	589	589
query30	280	153	153	153
query31	986	739	717	717
query32	89	54	55	54
query33	753	248	240	240
query34	1054	479	463	463
query35	823	693	692	692
query36	1061	866	912	866
query37	136	65	68	65
query38	3367	3218	3225	3218
query39	1593	1551	1520	1520
query40	276	123	124	123
query41	45	38	38	38
query42	102	93	94	93
query43	568	546	532	532
query44	1181	713	732	713
query45	264	263	265	263
query46	1079	728	713	713
query47	1961	1855	1861	1855
query48	368	299	286	286
query49	1182	394	387	387
query50	776	384	371	371
query51	6656	6685	6628	6628
query52	108	92	95	92
query53	343	283	277	277
query54	317	239	237	237
query55	78	73	74	73
query56	235	227	220	220
query57	1228	1145	1153	1145
query58	230	197	198	197
query59	3268	3223	3291	3223
query60	256	231	231	231
query61	94	87	88	87
query62	656	455	446	446
query63	301	272	270	270
query64	9512	7132	7166	7132
query65	3067	3072	3055	3055
query66	1375	343	334	334
query67	15431	15123	15167	15123
query68	7348	524	524	524
query69	522	305	292	292
query70	1160	1111	1094	1094
query71	499	264	264	264
query72	7852	2603	2449	2449
query73	726	317	333	317
query74	6873	6455	6565	6455
query75	4190	2654	2680	2654
query76	5128	1023	1029	1023
query77	685	270	262	262
query78	10967	10197	10145	10145
query79	8084	510	508	508
query80	1201	423	438	423
query81	488	222	232	222
query82	833	94	88	88
query83	205	164	165	164
query84	261	85	82	82
query85	1385	271	266	266
query86	407	285	299	285
query87	3466	3267	3291	3267
query88	4571	2334	2342	2334
query89	526	378	414	378
query90	2052	182	183	182
query91	123	98	96	96
query92	62	51	46	46
query93	6087	507	495	495
query94	1125	182	178	178
query95	383	291	292	291
query96	588	262	261	261
query97	3131	2946	2918	2918
query98	239	225	220	220
query99	1280	888	869	869
Total cold run time: 304279 ms
Total hot run time: 186174 ms

@amorynan
Contributor Author

run external

Contributor

@zzzxl1993 zzzxl1993 left a comment

LGTM

@amorynan
Contributor Author

run p0

Member

@eldenmoon eldenmoon left a comment

lgtm

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) Apr 27, 2024
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@eldenmoon eldenmoon merged commit a80adab into apache:master Apr 27, 2024
dataroaring pushed a commit that referenced this pull request Apr 27, 2024
…ataset which will make core (#34076)

* fix for array inverted index writer with large dataset which will make core

* add cases

* change p1 to p2

* updated
xiaokang pushed a commit that referenced this pull request Jun 6, 2024
This includes several bug fixes for arrays with inverted indexes.
See also:
#34766
#35086
#34683
#34076
xiaokang pushed a commit that referenced this pull request Jun 12, 2024
mongo360 pushed a commit to mongo360/doris that referenced this pull request Aug 16, 2024

Labels

approved (Indicates a PR has been approved by one committer.), dev/2.0.12-merged, dev/2.1.4-merged, reviewed

7 participants