@mymeiyi
Contributor

@mymeiyi mymeiyi commented Feb 17, 2025

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

With enable_memtable_on_sink_node enabled globally (set global enable_memtable_on_sink_node = true), replaying the WAL fails:

failed to replay wal=/data3/storage/wal/118291/1739470368974/1_40199330_47385148733034496_group_commit_104d6db2bc9875fe_f7d1b359986d9c83, st=[INTERNAL_ERROR][INTERNAL_ERROR]close wait failed coz rpc error. VNodeChannel[1739470368975-40199330], load_id=ca40e6a07491aacf-a1aa8bb9dc1c31a5, txn_id=47401455637994497, node=192.168.35.80:8060, add batch req success but status isn't ok, err: [INVALID_ARGUMENT]PStatus: (192.168.35.80)[INVALID_ARGUMENT]illegal partial update block columns: 61, num key columns: 2, total schema columns: 61, host: 192.168.35.80
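
For triage, the three counts at the end of that message can be extracted mechanically. The following is a minimal Python sketch, not part of Doris: the function name `parse_partial_update_error` and the regular expression are illustrative assumptions built only from the message text quoted in this description.

```python
import re
from typing import Dict, Optional

# Tail of the error quoted in the problem summary.
ERR = ("illegal partial update block columns: 61, "
       "num key columns: 2, total schema columns: 61")

# Hypothetical pattern, derived only from the message wording above.
PATTERN = re.compile(
    r"illegal partial update block columns: (?P<block>\d+), "
    r"num key columns: (?P<keys>\d+), "
    r"total schema columns: (?P<total>\d+)"
)


def parse_partial_update_error(msg: str) -> Optional[Dict[str, int]]:
    """Return the three column counts, or None if msg does not match."""
    m = PATTERN.search(msg)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}


counts = parse_partial_update_error(ERR)
print(counts)  # {'block': 61, 'keys': 2, 'total': 61}
```

In the quoted failure the block column count equals the total schema column count (61 and 61), so the rejected block carried every column of the schema.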

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Feb 17, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@mymeiyi
Contributor Author

mymeiyi commented Feb 17, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 31414 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 901860a7ec5816ef94446a424a5a3437b4f76609, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17584	5214	5056	5056
q2	2055	310	169	169
q3	10394	1296	712	712
q4	10218	1007	534	534
q5	7529	2412	2760	2412
q6	193	164	131	131
q7	893	755	615	615
q8	9308	1291	1057	1057
q9	4924	4789	4623	4623
q10	6825	2315	1876	1876
q11	477	282	255	255
q12	344	352	211	211
q13	17770	3677	3074	3074
q14	224	218	204	204
q15	519	473	451	451
q16	623	622	576	576
q17	571	863	347	347
q18	6614	6239	6113	6113
q19	1206	949	556	556
q20	317	327	189	189
q21	2801	2267	1945	1945
q22	367	328	308	308
Total cold run time: 101756 ms
Total hot run time: 31414 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5114	5117	5121	5117
q2	227	330	232	232
q3	2120	2650	2294	2294
q4	1411	1859	1365	1365
q5	4235	4087	4149	4087
q6	209	161	129	129
q7	1839	1790	1655	1655
q8	2607	2694	2543	2543
q9	7324	7187	7058	7058
q10	3034	3217	2750	2750
q11	562	502	483	483
q12	686	798	636	636
q13	3480	3890	3236	3236
q14	275	289	274	274
q15	508	463	439	439
q16	642	682	636	636
q17	1130	1592	1322	1322
q18	7551	7404	7228	7228
q19	794	779	856	779
q20	1953	2027	1851	1851
q21	5440	4970	4739	4739
q22	645	603	542	542
Total cold run time: 51786 ms
Total hot run time: 49395 ms
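
The two totals in each round can be reproduced from the per-query rows: the cold total is the sum of the first timing column, and the hot total is the sum of the smaller of the two hot runs for each query. The sketch below is a minimal Python illustration of that arithmetic. The function name `summarize` is hypothetical, and the sample uses only three of the Round 1 rows, so its totals are for those rows alone.

```python
from typing import Tuple


def summarize(rows: str) -> Tuple[float, float]:
    """Return (total cold, total hot) for whitespace-separated benchmark rows.

    Each row is: query name, cold run, then one or more hot-run values.
    Assumption: a trailing "best hot" column, when present, is the minimum
    of the hot runs, so min() over everything after the cold column gives
    the same result with or without it.
    """
    cold_total = 0.0
    hot_total = 0.0
    for line in rows.strip().splitlines():
        _name, cold, *hot = line.split()
        cold_total += float(cold)
        hot_total += min(float(h) for h in hot)
    return cold_total, hot_total


# Three rows copied from Round 1 above (q1, q2, q5).
sample = """
q1 17584 5214 5056 5056
q2 2055 310 169 169
q5 7529 2412 2760 2412
"""

cold, hot = summarize(sample)
print(cold, hot)  # 27168.0 7637.0
```

Applied to all 22 Round 1 rows, the same sums give the reported 101756 ms cold and 31414 ms hot.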

@doris-robot

TPC-DS: Total hot run time: 182923 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 901860a7ec5816ef94446a424a5a3437b4f76609, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	974	379	370	370
query2	6516	1914	1911	1911
query3	6797	224	215	215
query4	26832	23785	22997	22997
query5	4361	681	536	536
query6	314	210	180	180
query7	4617	504	296	296
query8	300	254	233	233
query9	8621	2496	2483	2483
query10	460	315	250	250
query11	15661	15023	14847	14847
query12	156	107	103	103
query13	1656	517	405	405
query14	10345	6314	6326	6314
query15	215	202	181	181
query16	7701	635	490	490
query17	1182	707	536	536
query18	1986	398	303	303
query19	190	187	151	151
query20	121	112	114	112
query21	209	117	101	101
query22	4185	4115	4296	4115
query23	33831	32953	33002	32953
query24	7707	2431	2403	2403
query25	515	471	392	392
query26	1236	265	150	150
query27	2115	484	334	334
query28	3946	2372	2369	2369
query29	678	549	417	417
query30	234	180	151	151
query31	932	866	793	793
query32	74	70	62	62
query33	551	348	309	309
query34	776	830	516	516
query35	791	823	731	731
query36	974	974	902	902
query37	117	97	79	79
query38	4029	4137	4320	4137
query39	1428	1375	1397	1375
query40	210	114	104	104
query41	55	53	56	53
query42	125	108	109	108
query43	495	520	468	468
query44	1308	792	787	787
query45	187	171	161	161
query46	872	1041	658	658
query47	1740	1786	1741	1741
query48	375	422	303	303
query49	815	533	434	434
query50	704	757	421	421
query51	4198	4142	4110	4110
query52	110	103	96	96
query53	225	259	188	188
query54	485	497	415	415
query55	82	81	119	81
query56	270	274	248	248
query57	1121	1115	1075	1075
query58	245	240	239	239
query59	2746	2819	2862	2819
query60	273	285	266	266
query61	122	119	119	119
query62	817	730	653	653
query63	231	202	181	181
query64	4192	1054	685	685
query65	3189	3164	3156	3156
query66	1045	423	306	306
query67	15911	15733	15291	15291
query68	5311	784	493	493
query69	478	290	253	253
query70	1204	1103	1160	1103
query71	393	291	275	275
query72	5748	3556	3746	3556
query73	744	746	351	351
query74	8841	9150	8615	8615
query75	3140	3131	2680	2680
query76	3253	1161	765	765
query77	478	376	281	281
query78	9850	9951	9244	9244
query79	2670	863	576	576
query80	634	523	446	446
query81	489	271	238	238
query82	495	125	91	91
query83	166	168	149	149
query84	245	99	83	83
query85	806	351	307	307
query86	381	301	296	296
query87	4520	4507	4326	4326
query88	3982	2174	2185	2174
query89	388	323	294	294
query90	1861	241	197	197
query91	138	141	112	112
query92	74	60	55	55
query93	2342	999	567	567
query94	684	409	301	301
query95	354	273	258	258
query96	470	563	264	264
query97	2782	2833	2742	2742
query98	243	205	198	198
query99	1340	1437	1265	1265
Total cold run time: 269985 ms
Total hot run time: 182923 ms

@doris-robot

ClickBench: Total hot run time: 30.37 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 901860a7ec5816ef94446a424a5a3437b4f76609, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.03	0.03	0.03
query2	0.07	0.03	0.04
query3	0.24	0.07	0.07
query4	1.62	0.10	0.10
query5	0.43	0.41	0.41
query6	1.18	0.66	0.65
query7	0.02	0.02	0.02
query8	0.04	0.03	0.04
query9	0.60	0.52	0.51
query10	0.58	0.59	0.58
query11	0.15	0.10	0.10
query12	0.15	0.11	0.12
query13	0.62	0.59	0.61
query14	2.74	2.68	2.73
query15	0.93	0.84	0.85
query16	0.38	0.38	0.38
query17	1.02	1.02	1.04
query18	0.22	0.20	0.19
query19	1.86	2.05	1.84
query20	0.01	0.01	0.01
query21	15.74	0.88	0.55
query22	0.85	1.05	0.78
query23	15.10	1.35	0.65
query24	11.29	1.17	0.42
query25	0.43	0.34	0.08
query26	0.84	0.18	0.13
query27	0.04	0.05	0.04
query28	6.16	0.77	0.43
query29	12.60	3.96	3.27
query30	0.25	0.09	0.07
query31	2.82	0.57	0.38
query32	3.23	0.54	0.46
query33	2.97	3.05	3.03
query34	15.84	5.14	4.53
query35	4.56	4.56	4.55
query36	0.66	0.50	0.48
query37	0.09	0.07	0.06
query38	0.05	0.04	0.04
query39	0.02	0.02	0.02
query40	0.17	0.14	0.14
query41	0.08	0.02	0.02
query42	0.03	0.02	0.03
query43	0.04	0.03	0.03
Total cold run time: 106.75 s
Total hot run time: 30.37 s

@mymeiyi
Contributor Author

mymeiyi commented Feb 17, 2025

run p0

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 43.81% (11447/26126)
Line Coverage: 33.81% (96376/285071)
Region Coverage: 32.52% (49302/151604)
Branch Coverage: 28.19% (24773/87874)
Coverage Report: http://coverage.selectdb-in.cc/coverage/901860a7ec5816ef94446a424a5a3437b4f76609_901860a7ec5816ef94446a424a5a3437b4f76609/report/index.html

Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Feb 18, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

Contributor

@zhannngchen zhannngchen left a comment

LGTM

@zhannngchen zhannngchen merged commit 0036cca into apache:master Feb 18, 2025
25 of 27 checks passed
github-actions bot pushed a commit that referenced this pull request Feb 18, 2025
…ble_on_sink_node (#47968)

Problem Summary:

With `enable_memtable_on_sink_node` enabled globally (`set global enable_memtable_on_sink_node = true`), replaying the WAL fails:
```
failed to replay wal=/data3/storage/wal/118291/1739470368974/1_40199330_47385148733034496_group_commit_104d6db2bc9875fe_f7d1b359986d9c83, st=[INTERNAL_ERROR][INTERNAL_ERROR]close wait failed coz rpc error. VNodeChannel[1739470368975-40199330], load_id=ca40e6a07491aacf-a1aa8bb9dc1c31a5, txn_id=47401455637994497, node=192.168.35.80:8060, add batch req success but status isn't ok, err: [INVALID_ARGUMENT]PStatus: (192.168.35.80)[INVALID_ARGUMENT]illegal partial update block columns: 61, num key columns: 2, total schema columns: 61, host: 192.168.35.80
```
yiguolei pushed a commit that referenced this pull request Feb 19, 2025
…enable_memtable_on_sink_node #47968 (#48026)

Cherry-picked from #47968

Co-authored-by: meiyi <meiyi@selectdb.com>
lzyy2024 pushed a commit to lzyy2024/doris that referenced this pull request Feb 21, 2025
…ble_on_sink_node (apache#47968)
dataroaring pushed a commit that referenced this pull request Feb 24, 2025
…enable_memtable_on_sink_node #47968 (#48027)

Cherry-picked from #47968

Co-authored-by: meiyi <meiyi@selectdb.com>
@yiguolei yiguolei mentioned this pull request Mar 25, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…ble_on_sink_node (apache#47968)
Labels

approved Indicates a PR has been approved by one committer. dev/2.1.9-merged dev/3.0.5-merged reviewed

6 participants