@kaijchen
Member

backport #47617

…pache#47617)

Issue Number: DORIS-17408

Problem Summary:

When a load is canceled because `CsvReader` filtered out too many rows,
the end-of-stream flag is set but no error is returned. As a result, the
user sees the following confusing message:

```json
"Message": "[CANCELLED]cancelled: closed",
```

This PR addresses the issue by returning a `DataQualityError` in such
cases. The updated message will be:

```json
"Message": "[CANCELLED]cancelled: [DATA_QUALITY_ERROR]Encountered unqualified data, stop processing. cur path: .",
```
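
The change described above can be illustrated with a minimal sketch. This is not the actual Doris code: `Status` here is a hypothetical stand-in for the backend's status type, and `read_before_fix`, `read_after_fix`, and the `too_many_filtered_rows` flag are names invented for illustration. The sketch only shows the pattern, assuming the reader signals cancellation through an end-of-stream flag plus a returned status.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a status type; this is NOT the Doris Status class.
struct Status {
    bool ok;
    std::string msg;

    static Status OK() { return Status{true, ""}; }
    static Status DataQualityError(std::string detail) {
        return Status{false, "[DATA_QUALITY_ERROR]" + std::move(detail)};
    }
};

// Behavior before the fix (as described in the summary): the stream is
// marked as finished, but the caller still receives a success status, so
// the only message left to report is the generic "closed".
Status read_before_fix(bool too_many_filtered_rows, bool* eof) {
    if (too_many_filtered_rows) {
        *eof = true;
        return Status::OK();
    }
    *eof = false;
    return Status::OK();
}

// Behavior after the fix: the stream is still marked as finished, but a
// data quality error is returned so the real cause reaches the user.
// `path` is a hypothetical parameter standing in for the current file path.
Status read_after_fix(bool too_many_filtered_rows, const std::string& path, bool* eof) {
    if (too_many_filtered_rows) {
        *eof = true;
        return Status::DataQualityError(
                "Encountered unqualified data, stop processing. cur path: " + path);
    }
    *eof = false;
    return Status::OK();
}
```

In both versions the end-of-stream flag is set; the only difference is that the second version returns a non-OK status, which lets the caller surface the reason for the cancellation.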
@kaijchen kaijchen requested a review from dataroaring as a code owner March 20, 2025 03:17
@Thearas
Contributor

Thearas commented Mar 20, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@kaijchen
Member Author

run buildall

dataroaring
dataroaring previously approved these changes Mar 24, 2025
Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Mar 24, 2025
@github-actions
Contributor

PR approved by anyone and no changes requested.

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Mar 24, 2025
@kaijchen
Member Author

run buildall

@kaijchen
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 40339 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 1fc964e22207f3c46dc28d8f5408a9ba80bc4dc4, data reload: false

------ Round 1 ----------------------------------
q1	17570	6674	6592	6592
q2	2055	167	189	167
q3	10597	1070	1135	1070
q4	10515	739	753	739
q5	7719	2868	2851	2851
q6	224	137	131	131
q7	988	626	592	592
q8	9356	1966	2026	1966
q9	6623	6428	6437	6428
q10	7006	2302	2313	2302
q11	473	260	259	259
q12	402	215	217	215
q13	17806	3007	2984	2984
q14	230	217	220	217
q15	520	478	472	472
q16	659	582	599	582
q17	970	627	569	569
q18	7170	6633	6726	6633
q19	1400	1089	1066	1066
q20	461	205	199	199
q21	4009	3288	3287	3287
q22	1109	1030	1018	1018
Total cold run time: 107862 ms
Total hot run time: 40339 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6518	6548	6528	6528
q2	326	243	232	232
q3	2869	2713	2842	2713
q4	2011	1806	1796	1796
q5	5763	5737	5718	5718
q6	218	135	129	129
q7	2222	1839	1807	1807
q8	3398	3557	3496	3496
q9	8688	8851	8900	8851
q10	3576	3532	3540	3532
q11	591	490	500	490
q12	812	577	587	577
q13	8920	3233	3155	3155
q14	317	290	287	287
q15	516	463	467	463
q16	683	660	651	651
q17	1848	1617	1592	1592
q18	8151	7641	7729	7641
q19	1657	1517	1508	1508
q20	2102	1873	1850	1850
q21	5550	5447	5360	5360
q22	1139	1034	1049	1034
Total cold run time: 67875 ms
Total hot run time: 59410 ms

@doris-robot

TPC-DS: Total hot run time: 198279 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1fc964e22207f3c46dc28d8f5408a9ba80bc4dc4, data reload: false

query1	1298	946	934	934
query2	6240	2068	2003	2003
query3	10909	4324	4564	4324
query4	65969	28283	23744	23744
query5	4954	450	422	422
query6	407	188	172	172
query7	5499	310	317	310
query8	314	233	225	225
query9	8792	2601	2595	2595
query10	456	271	251	251
query11	17109	15258	15702	15258
query12	158	104	104	104
query13	1487	455	454	454
query14	10914	7603	7706	7603
query15	204	187	180	180
query16	7102	515	540	515
query17	1133	581	579	579
query18	1847	332	333	332
query19	232	162	165	162
query20	118	116	112	112
query21	212	101	103	101
query22	4834	4417	4404	4404
query23	34347	34275	34513	34275
query24	6186	2872	2929	2872
query25	556	426	428	426
query26	669	175	181	175
query27	1774	354	362	354
query28	4154	2479	2477	2477
query29	710	471	442	442
query30	239	160	165	160
query31	1023	839	822	822
query32	66	56	56	56
query33	468	301	303	301
query34	937	513	527	513
query35	834	739	712	712
query36	1078	1000	967	967
query37	112	77	67	67
query38	4073	4036	4073	4036
query39	1541	1480	1473	1473
query40	194	99	98	98
query41	49	46	48	46
query42	115	100	96	96
query43	533	498	491	491
query44	1179	839	839	839
query45	187	167	165	165
query46	1155	716	747	716
query47	2116	1984	2008	1984
query48	482	382	383	382
query49	716	402	390	390
query50	839	446	431	431
query51	7386	7208	7094	7094
query52	104	91	86	86
query53	256	190	178	178
query54	562	452	457	452
query55	80	77	76	76
query56	256	233	238	233
query57	1250	1135	1145	1135
query58	210	199	201	199
query59	3176	2931	3078	2931
query60	278	268	248	248
query61	112	109	110	109
query62	806	663	668	663
query63	213	189	192	189
query64	1377	699	648	648
query65	3268	3241	3164	3164
query66	652	307	300	300
query67	15982	15797	15631	15631
query68	3989	603	574	574
query69	425	277	267	267
query70	1194	1143	1125	1125
query71	335	256	291	256
query72	6417	3984	4039	3984
query73	751	361	351	351
query74	9828	9125	8978	8978
query75	3297	2635	2635	2635
query76	2161	1038	1007	1007
query77	512	279	275	275
query78	10695	9623	9462	9462
query79	1288	606	592	592
query80	866	459	423	423
query81	515	244	234	234
query82	1262	88	86	86
query83	241	144	153	144
query84	279	76	76	76
query85	889	290	318	290
query86	343	298	289	289
query87	4491	4364	4279	4279
query88	3428	2424	2363	2363
query89	422	297	291	291
query90	1961	186	189	186
query91	181	157	148	148
query92	67	49	52	49
query93	1492	556	556	556
query94	752	298	288	288
query95	354	255	264	255
query96	612	286	281	281
query97	3348	3159	3147	3147
query98	217	200	201	200
query99	1610	1321	1302	1302
Total cold run time: 316710 ms
Total hot run time: 198279 ms

@doris-robot

ClickBench: Total hot run time: 32.12 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 1fc964e22207f3c46dc28d8f5408a9ba80bc4dc4, data reload: false

query1	0.03	0.02	0.03
query2	0.06	0.03	0.03
query3	0.23	0.06	0.06
query4	1.62	0.11	0.11
query5	0.50	0.52	0.52
query6	1.14	0.72	0.73
query7	0.02	0.01	0.02
query8	0.05	0.03	0.03
query9	0.56	0.50	0.50
query10	0.56	0.54	0.55
query11	0.14	0.10	0.10
query12	0.13	0.11	0.11
query13	0.61	0.59	0.60
query14	2.76	2.73	2.86
query15	0.90	0.83	0.83
query16	0.38	0.38	0.41
query17	1.05	0.99	1.05
query18	0.24	0.22	0.23
query19	1.91	1.86	1.98
query20	0.01	0.01	0.02
query21	15.35	0.63	0.59
query22	2.50	2.63	2.27
query23	17.05	1.04	0.82
query24	3.36	0.30	1.43
query25	0.18	0.12	0.19
query26	0.38	0.13	0.13
query27	0.04	0.04	0.05
query28	10.54	0.55	0.45
query29	12.55	3.22	3.28
query30	0.24	0.06	0.05
query31	2.89	0.39	0.38
query32	3.23	0.47	0.45
query33	2.93	3.06	3.05
query34	16.82	4.57	4.55
query35	4.55	4.62	4.56
query36	0.66	0.47	0.49
query37	0.08	0.06	0.06
query38	0.04	0.03	0.04
query39	0.04	0.02	0.02
query40	0.16	0.13	0.13
query41	0.07	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 106.63 s
Total hot run time: 32.12 s

@doris-robot

BE UT Coverage Report

Increment line coverage 1.23% (1/81) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 38.88% (10177/26175)
Line Coverage 30.30% (86810/286468)
Region Coverage 29.35% (44598/151939)
Branch Coverage 25.88% (22690/87664)

Contributor

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit 98cd0ff into apache:branch-3.0 Mar 25, 2025
20 of 23 checks passed
@gavinchou gavinchou mentioned this pull request Apr 23, 2025

4 participants