@suxiaogang223
Contributor

What problem does this PR solve?

Metadata scans (e.g., Iceberg all_files) can produce many splits, but the previous logic grouped them into at most one scan range per backend, which capped NumScanners and MaxScanConcurrency at 1 in common cases. This change aligns metadata scans with the per-split assignment used by file scans, enabling higher parallelism.
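
To illustrate the difference described above, here is a minimal sketch in Python. It is not the Doris FE code: the function names, the round-robin placement, and the list-of-lists representation of a scan range are all assumptions made for illustration. It only shows why one scan range per backend limits the scanner count while one scan range per split does not.

```python
from collections import defaultdict
from itertools import cycle


def group_per_backend(splits, backends):
    """Illustrative old behavior: all splits sent to a backend share ONE scan range."""
    assigned = defaultdict(list)
    for split, backend in zip(splits, cycle(backends)):
        assigned[backend].append(split)
    # Each backend ends up with a single scan range that carries many splits.
    return {backend: [batch] for backend, batch in assigned.items()}


def assign_per_split(splits, backends):
    """Illustrative new behavior: every split becomes its own scan range."""
    ranges = defaultdict(list)
    for split, backend in zip(splits, cycle(backends)):
        ranges[backend].append([split])
    return dict(ranges)


# Hypothetical input: six metadata splits spread over two backends.
splits = [f"split-{i}" for i in range(6)]
backends = ["be-1", "be-2"]

old_counts = {be: len(r) for be, r in group_per_backend(splits, backends).items()}
new_counts = {be: len(r) for be, r in assign_per_split(splits, backends).items()}

print(old_counts)  # {'be-1': 1, 'be-2': 1}
print(new_counts)  # {'be-1': 3, 'be-2': 3}
```

If the number of scanners a backend can start is bounded by the number of scan ranges it receives, the first layout allows one scanner per backend and the second allows three.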

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32912 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 6dc8681c8d047b734c50b7c6007c11415251f72f, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17646	5353	5053	5053
q2	2040	305	192	192
q3	10205	1316	741	741
q4	10216	851	320	320
q5	7559	2198	1894	1894
q6	200	181	149	149
q7	914	751	598	598
q8	9263	1368	1100	1100
q9	5073	4927	4848	4848
q10	6813	1986	1580	1580
q11	524	293	283	283
q12	335	384	235	235
q13	17777	4078	3285	3285
q14	254	244	233	233
q15	885	835	809	809
q16	696	709	621	621
q17	647	757	509	509
q18	6784	6480	7449	6480
q19	1372	1027	659	659
q20	420	360	245	245
q21	2876	2220	2044	2044
q22	1214	1099	1034	1034
Total cold run time: 103713 ms
Total hot run time: 32912 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5666	5545	5618	5545
q2	280	344	255	255
q3	2390	2935	2618	2618
q4	1451	1841	1424	1424
q5	4563	4549	4682	4549
q6	230	180	142	142
q7	2010	1901	2029	1901
q8	2567	2388	2383	2383
q9	7654	7378	7653	7378
q10	2957	2989	2562	2562
q11	550	485	461	461
q12	701	774	695	695
q13	3878	4069	3283	3283
q14	271	289	261	261
q15	845	794	783	783
q16	631	680	648	648
q17	1080	1237	1270	1237
q18	7608	7322	7483	7322
q19	808	794	812	794
q20	1980	2103	1900	1900
q21	4497	4340	4139	4139
q22	1101	1029	1002	1002
Total cold run time: 53718 ms
Total hot run time: 51282 ms
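
The totals above are consistent with one reading of the columns, inferred from the numbers rather than from the benchmark scripts. The first numeric column is the cold run, the next two are hot runs, and the last is the faster of the two hot runs. The cold total is then the sum of the first column and the hot total is the sum of the last. A small sketch of that arithmetic, using three rows copied from Round 1 (the helper name `summarize` is hypothetical):

```python
# Three rows copied from the Round 1 table: (query, cold, hot 1, hot 2), in ms.
rows = [
    ("q1", 17646, 5353, 5053),
    ("q2", 2040, 305, 192),
    ("q18", 6784, 6480, 7449),
]


def summarize(rows):
    """Return (total cold, total hot), taking the faster hot run per query."""
    total_cold = sum(cold for _, cold, _, _ in rows)
    total_hot = sum(min(hot1, hot2) for _, _, hot1, hot2 in rows)
    return total_cold, total_hot


total_cold, total_hot = summarize(rows)
print(total_cold, total_hot)  # 26470 11725
```

Applied to all 22 rows of Round 1, the same rule gives the reported 103713 ms cold and 32912 ms hot.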

@doris-robot

ClickBench: Total hot run time: 28.44 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 6dc8681c8d047b734c50b7c6007c11415251f72f, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.05	0.05	0.05
query2	0.10	0.05	0.05
query3	0.26	0.09	0.08
query4	1.61	0.11	0.11
query5	0.27	0.25	0.25
query6	1.16	0.69	0.68
query7	0.04	0.03	0.02
query8	0.05	0.04	0.04
query9	0.56	0.51	0.50
query10	0.55	0.54	0.54
query11	0.15	0.09	0.11
query12	0.13	0.11	0.11
query13	0.63	0.62	0.61
query14	1.08	1.06	1.06
query15	0.88	0.86	0.89
query16	0.40	0.42	0.39
query17	1.14	1.13	1.13
query18	0.24	0.22	0.22
query19	2.14	1.96	2.05
query20	0.02	0.01	0.02
query21	15.41	0.27	0.14
query22	5.25	0.05	0.05
query23	16.00	0.29	0.10
query24	1.91	0.44	0.20
query25	0.08	0.08	0.06
query26	0.13	0.14	0.13
query27	0.06	0.06	0.06
query28	3.66	1.20	0.98
query29	12.59	4.00	3.17
query30	0.28	0.15	0.14
query31	2.82	0.68	0.43
query32	3.24	0.61	0.52
query33	3.27	3.24	3.27
query34	16.37	5.54	4.78
query35	4.90	4.80	4.83
query36	0.64	0.50	0.49
query37	0.12	0.08	0.07
query38	0.08	0.04	0.04
query39	0.05	0.03	0.04
query40	0.19	0.16	0.16
query41	0.09	0.04	0.03
query42	0.04	0.03	0.02
query43	0.05	0.04	0.04
Total cold run time: 98.69 s
Total hot run time: 28.44 s

@suxiaogang223
Contributor Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 0.00% (0/6) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (6/6) 🎉
Increment coverage report
Complete coverage report
