@zhannngchen
Contributor

Proposed changes

Issue Number: close #xxx

To speed up syncing the latest delete bitmap, #35856 tries to get the delete bitmap from CloudTxnDeleteBitmapCache first.
In the following situation, compaction may get an empty delete bitmap, which causes duplicate keys:

  1. Compaction starts.
  2. Several loads succeed while the compaction is running.
  3. Compaction finishes merging data and starts to calculate the delete bitmap generated by the latest load tasks.
  4. Compaction tries to sync rowsets and the delete bitmap, and it reads the delete bitmap from CloudTxnDeleteBitmapCache first.
  5. CloudTxnDeleteBitmapCache::get_delete_bitmap() finds the txn info in its inner map, but the lookup for the delete bitmap in the LRU cache misses. Instead of reporting an error, it returns an empty delete bitmap.
  6. Compaction uses the wrong (empty) delete bitmap, and duplicate keys occur.
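
The failure mode in step 5 can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: `ToyTxnDeleteBitmapCache`, `CacheStatus`, `get_unchecked`, `get_checked`, and `sync_delete_bitmap` are hypothetical names invented for illustration and do not mirror the real Doris classes or the actual patch. It contrasts a lookup that hides an LRU miss behind an empty bitmap with one that surfaces the miss so the caller can fall back to an authoritative copy.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Toy model only: every name below is hypothetical and does not mirror Doris code.
// A delete bitmap here maps a rowset id to the set of row ids marked deleted.
using DeleteBitmap = std::map<std::string, std::set<uint32_t>>;
using DeleteBitmapPtr = std::shared_ptr<DeleteBitmap>;

enum class CacheStatus { kOk, kTxnNotFound, kBitmapMissing };

class ToyTxnDeleteBitmapCache {
public:
    // Record the txn info and cache its delete bitmap.
    void put(int64_t txn_id, DeleteBitmapPtr bitmap) {
        txn_infos_.insert(txn_id);
        lru_[txn_id] = std::move(bitmap);
    }

    // Simulate LRU eviction: the txn info survives, the bitmap does not.
    void evict_bitmap(int64_t txn_id) { lru_.erase(txn_id); }

    // Problematic shape: an LRU miss is reported as success with an empty bitmap.
    CacheStatus get_unchecked(int64_t txn_id, DeleteBitmapPtr* out) const {
        if (txn_infos_.count(txn_id) == 0) return CacheStatus::kTxnNotFound;
        auto it = lru_.find(txn_id);
        if (it != lru_.end()) {
            *out = it->second;
        } else {
            *out = std::make_shared<DeleteBitmap>();  // silently empty
        }
        return CacheStatus::kOk;
    }

    // Safer shape: an LRU miss is surfaced so the caller can fall back.
    CacheStatus get_checked(int64_t txn_id, DeleteBitmapPtr* out) const {
        if (txn_infos_.count(txn_id) == 0) return CacheStatus::kTxnNotFound;
        auto it = lru_.find(txn_id);
        if (it == lru_.end()) return CacheStatus::kBitmapMissing;
        *out = it->second;
        return CacheStatus::kOk;
    }

private:
    std::unordered_set<int64_t> txn_infos_;
    std::unordered_map<int64_t, DeleteBitmapPtr> lru_;
};

// Caller side: trust the cache only on a real hit, otherwise read the
// authoritative copy (a plain map stands in for the remote store here).
DeleteBitmapPtr sync_delete_bitmap(
        const ToyTxnDeleteBitmapCache& cache,
        const std::unordered_map<int64_t, DeleteBitmapPtr>& store,
        int64_t txn_id) {
    DeleteBitmapPtr bitmap;
    if (cache.get_checked(txn_id, &bitmap) == CacheStatus::kOk) return bitmap;
    auto it = store.find(txn_id);
    if (it == store.end()) return nullptr;
    return it->second;
}
```

In this model, after `evict_bitmap(txn_id)` the unchecked lookup still reports `kOk` with an empty bitmap, which is the situation compaction ran into, while the checked lookup reports `kBitmapMissing` and `sync_delete_bitmap` reads the stored copy instead.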

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@zhannngchen
Contributor Author

run buildall

@zhannngchen
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.28% (9622/25811)
Line Coverage: 28.69% (79620/277525)
Region Coverage: 28.11% (41167/146455)
Branch Coverage: 24.75% (20974/84748)
Coverage Report: http://coverage.selectdb-in.cc/coverage/09978221b4add3434992638b88596b3bc0e4bf56_09978221b4add3434992638b88596b3bc0e4bf56/report/index.html

@doris-robot

TPC-H: Total hot run time: 40650 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 09978221b4add3434992638b88596b3bc0e4bf56, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	17587	7358	7257	7257
q2	2022	286	273	273
q3	12261	1044	1137	1044
q4	10567	766	737	737
q5	7765	2875	2817	2817
q6	239	151	144	144
q7	964	610	603	603
q8	9570	1918	1933	1918
q9	7706	6436	6370	6370
q10	6948	2301	2276	2276
q11	434	238	248	238
q12	432	226	217	217
q13	17791	2986	3009	2986
q14	249	212	208	208
q15	571	523	532	523
q16	730	599	621	599
q17	966	577	604	577
q18	7132	6830	6665	6665
q19	1400	911	1018	911
q20	487	205	195	195
q21	4009	3351	3097	3097
q22	1102	995	1007	995
Total cold run time: 110932 ms
Total hot run time: 40650 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	7253	7220	7204	7204
q2	338	224	238	224
q3	3085	2934	2902	2902
q4	2080	1834	1779	1779
q5	5662	5647	5745	5647
q6	227	144	143	143
q7	2214	1809	1785	1785
q8	3342	3540	3513	3513
q9	8859	9065	8896	8896
q10	3563	3574	3565	3565
q11	566	478	484	478
q12	827	611	625	611
q13	7553	3125	3191	3125
q14	311	282	274	274
q15	573	547	524	524
q16	743	681	689	681
q17	1824	1646	1591	1591
q18	8158	7771	7769	7769
q19	1725	1564	1677	1564
q20	2158	1868	1869	1868
q21	5671	5245	5487	5245
q22	1159	1034	1060	1034
Total cold run time: 67891 ms
Total hot run time: 60422 ms

@doris-robot

TPC-DS: Total hot run time: 191145 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 09978221b4add3434992638b88596b3bc0e4bf56, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
query1	978	380	382	380
query2	6380	2075	1984	1984
query3	8691	198	204	198
query4	33875	23622	23567	23567
query5	3500	474	457	457
query6	274	164	159	159
query7	4193	303	296	296
query8	276	217	215	215
query9	9489	2685	2674	2674
query10	466	283	281	281
query11	18009	15289	15242	15242
query12	138	98	98	98
query13	1522	439	420	420
query14	10239	6672	6953	6672
query15	237	168	171	168
query16	7702	441	490	441
query17	1640	623	578	578
query18	2103	351	341	341
query19	235	164	151	151
query20	125	116	122	116
query21	214	109	108	108
query22	4825	4391	4248	4248
query23	35156	34399	34656	34399
query24	11183	2857	2731	2731
query25	626	415	410	410
query26	1201	170	159	159
query27	2289	300	297	297
query28	7266	2483	2430	2430
query29	816	438	438	438
query30	264	160	151	151
query31	1042	799	813	799
query32	95	54	57	54
query33	757	291	299	291
query34	910	495	486	486
query35	865	743	717	717
query36	1075	954	935	935
query37	159	88	92	88
query38	4082	3892	3883	3883
query39	1495	1413	1435	1413
query40	216	99	100	99
query41	50	48	49	48
query42	113	98	96	96
query43	531	481	475	475
query44	1223	820	790	790
query45	198	162	161	161
query46	1172	710	702	702
query47	1901	1795	1853	1795
query48	460	382	376	376
query49	910	424	400	400
query50	830	413	408	408
query51	7066	6908	6863	6863
query52	110	90	91	90
query53	255	182	185	182
query54	1223	464	458	458
query55	79	78	72	72
query56	272	273	263	263
query57	1203	1080	1112	1080
query58	226	240	245	240
query59	3182	2936	2985	2936
query60	288	264	276	264
query61	104	104	104	104
query62	858	671	690	671
query63	217	188	183	183
query64	4018	640	617	617
query65	3281	3214	3197	3197
query66	798	296	315	296
query67	15997	15641	15686	15641
query68	4546	573	552	552
query69	592	303	291	291
query70	1182	1126	1125	1125
query71	412	280	280	280
query72	7590	4075	3834	3834
query73	779	353	346	346
query74	10046	8973	8951	8951
query75	4302	2652	2680	2652
query76	3605	952	881	881
query77	739	290	292	290
query78	10443	9607	9535	9535
query79	3093	600	621	600
query80	2714	471	442	442
query81	598	241	241	241
query82	688	143	138	138
query83	310	146	133	133
query84	288	82	74	74
query85	1744	295	285	285
query86	467	313	280	280
query87	4465	4279	4305	4279
query88	3944	2412	2364	2364
query89	412	291	314	291
query90	2133	185	201	185
query91	177	143	141	141
query92	58	51	50	50
query93	2429	532	539	532
query94	1217	298	298	298
query95	350	254	257	254
query96	619	278	285	278
query97	3220	3078	3175	3078
query98	215	202	193	193
query99	1585	1318	1290	1290
Total cold run time: 306135 ms
Total hot run time: 191145 ms

@doris-robot

ClickBench: Total hot run time: 32.86 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 09978221b4add3434992638b88596b3bc0e4bf56, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.04	0.04
query2	0.06	0.03	0.03
query3	0.23	0.07	0.07
query4	1.65	0.10	0.10
query5	0.51	0.52	0.50
query6	1.13	0.73	0.72
query7	0.02	0.02	0.01
query8	0.04	0.03	0.04
query9	0.58	0.49	0.52
query10	0.55	0.55	0.53
query11	0.14	0.11	0.10
query12	0.14	0.11	0.11
query13	0.61	0.60	0.60
query14	3.05	2.99	3.01
query15	0.89	0.82	0.83
query16	0.39	0.39	0.36
query17	1.02	1.07	0.96
query18	0.20	0.19	0.20
query19	1.86	1.77	1.95
query20	0.01	0.02	0.01
query21	15.37	0.61	0.61
query22	2.63	3.04	1.99
query23	17.07	0.83	0.83
query24	2.63	1.05	0.91
query25	0.30	0.13	0.19
query26	0.40	0.15	0.13
query27	0.04	0.05	0.04
query28	11.25	1.08	1.06
query29	12.53	3.18	3.24
query30	0.24	0.06	0.05
query31	2.89	0.37	0.37
query32	3.29	0.48	0.45
query33	2.96	3.03	3.01
query34	17.17	4.43	4.48
query35	4.51	4.50	4.46
query36	0.68	0.49	0.48
query37	0.09	0.06	0.06
query38	0.04	0.03	0.03
query39	0.03	0.02	0.02
query40	0.16	0.12	0.12
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 107.54 s
Total hot run time: 32.86 s

Contributor

@hust-hhb hust-hhb left a comment

LGTM

@github-actions
Contributor

PR approved by anyone and no changes requested.

Contributor

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit 3c93a40 into apache:master Sep 26, 2024
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Sep 26, 2024
dataroaring pushed a commit that referenced this pull request Sep 26, 2024
…map from cache failed (#41309)

cjj2010 pushed a commit to cjj2010/doris that referenced this pull request Oct 12, 2024
…map from cache failed (apache#41309)

@gavinchou gavinchou mentioned this pull request Oct 13, 2024
Labels

approved, dev/3.0.2-merged, p0_w, reviewed

5 participants