Skip to content

Conversation

@hubgeter
Copy link
Contributor

@hubgeter hubgeter commented Oct 30, 2025

What problem does this PR solve?

Problem Summary:
This PR adds a TotalGetRequestTime profile metric to the S3 reader.
The new metric records the total time spent on all GetObject requests, which will be useful for future performance analysis and optimization.

Example profile (Some less relevant information was manually removed.):

FILE_SCAN_OPERATOR  (id=0.  nereids_id=104.  table  name  =  lineitem):
      CommonCounters:
            -  ExecTime:  11sec669ms
            -  OpenTime:  7.395ms
            -  WaitForDependency[FILE_SCAN_OPERATOR_DEPENDENCY]Time:  11sec634ms
              ....
      CustomCounters:
              ....
          Scanner:
                ....
                -  FileReadTime:  4min8sec
                -  FileScannerGetBlockTime:  4min9sec
                -  MergedSmallIO:  0ns
                    -  ReadTime:  4min8sec
                    -  RequestBytes:  2.05  MB
                    -  RequestIO:  20.0K  (20000)
                -  ParquetReader:  0ns
                    -  ColumnReadTime:  4min9sec
                    -  FileOpenTime:  365.293ms
                    -  ParseFooterTime:  5min6sec
                    -  ParseMetaTime:  126.383ms
                         .....
                -  S3Profile:  0ns
                    -  TooManyRequestErr:  0
                    -  TooManyRequestSleepTime:  0ms
                    -  TotalBytesRead:  199.14  MB
                    -  TotalGetRequestTime:  9min12sec
                    -  TotalGetRequest:  40.0K  (40000)
                -  ScannerCpuTime:  20sec946ms
                -  ScannerGetBlockTime:  9min19sec

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Copy link
Contributor

Thearas commented Oct 30, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hubgeter
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

ClickBench: Total hot run time: 27.78 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2563721118c7e10aa31878e592e9743ed27ddc53, data reload: false

query1	0.05	0.05	0.05
query2	0.10	0.05	0.06
query3	0.26	0.09	0.09
query4	1.62	0.12	0.12
query5	0.28	0.26	0.25
query6	1.19	0.65	0.64
query7	0.03	0.03	0.03
query8	0.05	0.05	0.05
query9	0.64	0.54	0.51
query10	0.59	0.59	0.59
query11	0.17	0.11	0.11
query12	0.16	0.15	0.11
query13	0.61	0.60	0.61
query14	1.01	1.01	1.00
query15	0.86	0.83	0.87
query16	0.39	0.40	0.39
query17	1.01	1.07	1.00
query18	0.22	0.20	0.20
query19	1.92	1.79	1.79
query20	0.02	0.01	0.02
query21	15.44	0.18	0.13
query22	5.16	0.07	0.05
query23	15.69	0.28	0.11
query24	2.39	0.81	0.45
query25	0.07	0.07	0.07
query26	0.15	0.13	0.13
query27	0.06	0.06	0.06
query28	4.90	1.13	0.94
query29	12.59	3.88	3.31
query30	0.29	0.14	0.11
query31	2.82	0.59	0.38
query32	3.23	0.55	0.47
query33	3.02	3.10	3.05
query34	15.89	5.18	4.62
query35	4.61	4.59	4.61
query36	0.67	0.50	0.49
query37	0.10	0.07	0.07
query38	0.06	0.04	0.04
query39	0.04	0.03	0.03
query40	0.17	0.15	0.15
query41	0.09	0.04	0.04
query42	0.04	0.03	0.03
query43	0.04	0.04	0.04
Total cold run time: 98.7 s
Total hot run time: 27.78 s

@hello-stephen
Copy link
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.71% (18041/34224)
Line Coverage 37.96% (163713/431249)
Region Coverage 32.34% (124854/386105)
Branch Coverage 33.71% (54589/161953)

@hello-stephen
Copy link
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (4/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.46% (24034/33631)
Line Coverage 57.89% (249987/431827)
Region Coverage 53.02% (207581/391539)
Branch Coverage 54.68% (89136/163024)

@hubgeter
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

ClickBench: Total hot run time: 28.12 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c7089225e8729362f1b805afecfb6de4237103c0, data reload: false

query1	0.06	0.05	0.04
query2	0.09	0.04	0.05
query3	0.25	0.08	0.08
query4	1.67	0.11	0.12
query5	0.28	0.26	0.24
query6	1.17	0.65	0.66
query7	0.03	0.03	0.03
query8	0.05	0.04	0.04
query9	0.60	0.52	0.53
query10	0.57	0.59	0.58
query11	0.17	0.11	0.11
query12	0.15	0.12	0.12
query13	0.62	0.60	0.61
query14	1.01	1.01	1.02
query15	0.85	0.84	0.83
query16	0.41	0.40	0.39
query17	1.05	1.02	1.06
query18	0.23	0.21	0.20
query19	1.89	1.89	1.88
query20	0.01	0.01	0.04
query21	15.44	0.18	0.13
query22	5.13	0.09	0.05
query23	15.65	0.27	0.10
query24	2.85	1.06	0.80
query25	0.08	0.06	0.07
query26	0.15	0.15	0.14
query27	0.07	0.06	0.05
query28	5.17	1.14	0.93
query29	12.55	3.94	3.26
query30	0.28	0.14	0.12
query31	2.82	0.61	0.39
query32	3.26	0.56	0.49
query33	3.20	3.06	3.14
query34	15.73	5.14	4.61
query35	4.56	4.57	4.57
query36	0.68	0.50	0.50
query37	0.10	0.07	0.07
query38	0.07	0.05	0.04
query39	0.04	0.03	0.04
query40	0.19	0.14	0.13
query41	0.09	0.03	0.04
query42	0.03	0.03	0.03
query43	0.05	0.04	0.03
Total cold run time: 99.35 s
Total hot run time: 28.12 s

morningman
morningman previously approved these changes Oct 31, 2025
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Oct 31, 2025
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@doris-robot
Copy link

BE UT Coverage Report

Increment line coverage 0.00% (0/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.69% (18044/34246)
Line Coverage 37.94% (163816/431731)
Region Coverage 32.28% (124901/386875)
Branch Coverage 33.69% (54616/162133)

@hello-stephen
Copy link
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (4/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.36% (24016/33655)
Line Coverage 57.77% (249813/432420)
Region Coverage 52.86% (207434/392429)
Branch Coverage 54.60% (89121/163234)

@hubgeter
Copy link
Contributor Author

run buildall

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Oct 31, 2025
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Oct 31, 2025
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@doris-robot
Copy link

TPC-DS: Total hot run time: 189692 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 751d5fd1e39e2a01a8597c8b8530c7e23de0cfb0, data reload: false

query1	1021	429	401	401
query2	6552	1712	1709	1709
query3	6756	231	217	217
query4	26788	23313	23314	23313
query5	5249	655	472	472
query6	330	226	237	226
query7	4658	502	297	297
query8	348	267	256	256
query9	8724	2572	2575	2572
query10	553	338	299	299
query11	15659	15120	14814	14814
query12	190	118	113	113
query13	1699	592	458	458
query14	11628	9324	9320	9320
query15	207	196	182	182
query16	7673	682	512	512
query17	1552	778	648	648
query18	2121	432	340	340
query19	248	214	194	194
query20	151	130	137	130
query21	216	132	116	116
query22	4627	4586	4625	4586
query23	34830	33889	33531	33531
query24	8402	2553	2591	2553
query25	651	522	475	475
query26	1273	315	167	167
query27	3093	541	381	381
query28	4705	2281	2220	2220
query29	820	684	498	498
query30	307	249	205	205
query31	936	855	767	767
query32	85	76	67	67
query33	655	381	352	352
query34	835	900	559	559
query35	860	863	804	804
query36	973	996	896	896
query37	123	107	86	86
query38	3522	3546	3451	3451
query39	1466	1416	1393	1393
query40	215	124	120	120
query41	63	56	58	56
query42	125	106	113	106
query43	482	504	459	459
query44	1190	733	731	731
query45	182	177	173	173
query46	890	975	633	633
query47	1758	1796	1725	1725
query48	393	416	309	309
query49	769	485	411	411
query50	644	695	406	406
query51	3868	3877	3842	3842
query52	112	106	102	102
query53	236	268	197	197
query54	312	289	265	265
query55	91	83	83	83
query56	317	325	306	306
query57	1174	1189	1118	1118
query58	287	275	272	272
query59	2509	2733	2637	2637
query60	354	355	321	321
query61	161	151	159	151
query62	774	725	670	670
query63	233	196	190	190
query64	4592	1287	971	971
query65	4084	3938	3977	3938
query66	1061	461	349	349
query67	15507	15018	15113	15018
query68	8445	922	601	601
query69	506	330	294	294
query70	1359	1290	1288	1288
query71	463	352	324	324
query72	6094	4818	4798	4798
query73	626	557	352	352
query74	8867	9106	8982	8982
query75	3316	3416	2821	2821
query76	3364	1172	743	743
query77	564	421	309	309
query78	9614	9759	8881	8881
query79	2751	810	595	595
query80	689	564	519	519
query81	510	261	230	230
query82	342	161	133	133
query83	266	271	257	257
query84	256	113	98	98
query85	922	481	440	440
query86	387	310	286	286
query87	3803	3732	3618	3618
query88	3969	2279	2250	2250
query89	395	332	314	314
query90	2069	236	226	226
query91	173	169	140	140
query92	89	73	65	65
query93	2451	990	640	640
query94	723	443	345	345
query95	407	324	318	318
query96	493	594	286	286
query97	2920	3007	2898	2898
query98	231	215	211	211
query99	1363	1425	1291	1291
Total cold run time: 281296 ms
Total hot run time: 189692 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 27.62 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 751d5fd1e39e2a01a8597c8b8530c7e23de0cfb0, data reload: false

query1	0.06	0.05	0.05
query2	0.08	0.05	0.05
query3	0.25	0.09	0.08
query4	1.61	0.12	0.12
query5	0.28	0.27	0.25
query6	1.17	0.65	0.63
query7	0.03	0.03	0.02
query8	0.05	0.04	0.05
query9	0.60	0.53	0.51
query10	0.58	0.57	0.58
query11	0.17	0.11	0.12
query12	0.15	0.12	0.12
query13	0.61	0.61	0.61
query14	1.01	1.01	1.00
query15	0.85	0.82	0.83
query16	0.40	0.39	0.41
query17	1.04	1.06	1.05
query18	0.22	0.20	0.20
query19	1.97	1.83	1.83
query20	0.02	0.01	0.01
query21	15.43	0.20	0.13
query22	5.02	0.06	0.05
query23	15.66	0.26	0.10
query24	2.53	1.10	0.48
query25	0.07	0.06	0.06
query26	0.16	0.13	0.13
query27	0.07	0.06	0.15
query28	5.05	1.14	0.94
query29	12.59	3.98	3.25
query30	0.28	0.14	0.11
query31	2.83	0.62	0.39
query32	3.23	0.56	0.49
query33	3.09	3.05	3.01
query34	15.74	5.12	4.52
query35	4.57	4.62	4.59
query36	0.67	0.50	0.50
query37	0.10	0.06	0.07
query38	0.07	0.04	0.04
query39	0.04	0.03	0.03
query40	0.18	0.15	0.14
query41	0.08	0.03	0.04
query42	0.04	0.03	0.02
query43	0.05	0.03	0.03
Total cold run time: 98.7 s
Total hot run time: 27.62 s

@hello-stephen
Copy link
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.69% (18045/34246)
Line Coverage 37.95% (163842/431749)
Region Coverage 32.28% (124869/386877)
Branch Coverage 33.68% (54614/162133)

@hello-stephen
Copy link
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (4/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.47% (24054/33655)
Line Coverage 57.90% (250372/432438)
Region Coverage 52.76% (207051/392431)
Branch Coverage 54.70% (89292/163234)

1 similar comment
@hello-stephen
Copy link
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (4/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.47% (24054/33655)
Line Coverage 57.90% (250372/432438)
Region Coverage 52.76% (207051/392431)
Branch Coverage 54.70% (89292/163234)

@morningman morningman merged commit eacaa4b into apache:master Oct 31, 2025
30 of 31 checks passed
hubgeter added a commit to hubgeter/doris that referenced this pull request Nov 3, 2025
…der. (apache#57519)

Problem Summary:
This PR adds a `TotalGetRequestTime` profile metric to the S3 reader.
The new metric records the total time spent on all `GetObject` requests,
which will be useful for future performance analysis and optimization.

Example profile (Some less relevant information was manually removed.):
```
FILE_SCAN_OPERATOR  (id=0.  nereids_id=104.  table  name  =  lineitem):
      CommonCounters:
            -  ExecTime:  11sec669ms
            -  OpenTime:  7.395ms
            -  WaitForDependency[FILE_SCAN_OPERATOR_DEPENDENCY]Time:  11sec634ms
              ....
      CustomCounters:
              ....
          Scanner:
                ....
                -  FileReadTime:  4min8sec
                -  FileScannerGetBlockTime:  4min9sec
                -  MergedSmallIO:  0ns
                    -  ReadTime:  4min8sec
                    -  RequestBytes:  2.05  MB
                    -  RequestIO:  20.0K  (20000)
                -  ParquetReader:  0ns
                    -  ColumnReadTime:  4min9sec
                    -  FileOpenTime:  365.293ms
                    -  ParseFooterTime:  5min6sec
                    -  ParseMetaTime:  126.383ms
                         .....
                -  S3Profile:  0ns
                    -  TooManyRequestErr:  0
                    -  TooManyRequestSleepTime:  0ms
                    -  TotalBytesRead:  199.14  MB
                    -  TotalGetRequestTime:  9min12sec
                    -  TotalGetRequest:  40.0K  (40000)
                -  ScannerCpuTime:  20sec946ms
                -  ScannerGetBlockTime:  9min19sec
```
hubgeter added a commit to hubgeter/doris that referenced this pull request Nov 3, 2025
…der. (apache#57519)

Problem Summary:
This PR adds a `TotalGetRequestTime` profile metric to the S3 reader.
The new metric records the total time spent on all `GetObject` requests,
which will be useful for future performance analysis and optimization.

Example profile (Some less relevant information was manually removed.):
```
FILE_SCAN_OPERATOR  (id=0.  nereids_id=104.  table  name  =  lineitem):
      CommonCounters:
            -  ExecTime:  11sec669ms
            -  OpenTime:  7.395ms
            -  WaitForDependency[FILE_SCAN_OPERATOR_DEPENDENCY]Time:  11sec634ms
              ....
      CustomCounters:
              ....
          Scanner:
                ....
                -  FileReadTime:  4min8sec
                -  FileScannerGetBlockTime:  4min9sec
                -  MergedSmallIO:  0ns
                    -  ReadTime:  4min8sec
                    -  RequestBytes:  2.05  MB
                    -  RequestIO:  20.0K  (20000)
                -  ParquetReader:  0ns
                    -  ColumnReadTime:  4min9sec
                    -  FileOpenTime:  365.293ms
                    -  ParseFooterTime:  5min6sec
                    -  ParseMetaTime:  126.383ms
                         .....
                -  S3Profile:  0ns
                    -  TooManyRequestErr:  0
                    -  TooManyRequestSleepTime:  0ms
                    -  TotalBytesRead:  199.14  MB
                    -  TotalGetRequestTime:  9min12sec
                    -  TotalGetRequest:  40.0K  (40000)
                -  ScannerCpuTime:  20sec946ms
                -  ScannerGetBlockTime:  9min19sec
```
morningman pushed a commit that referenced this pull request Nov 4, 2025
yiguolei pushed a commit that referenced this pull request Nov 5, 2025
… the S3 reader. (#57519) (#57635)

bp #57519
Problem Summary:
This PR adds a `TotalGetRequestTime` profile metric to the S3 reader.
The new metric records the total time spent on all `GetObject` requests,
which will be useful for future performance analysis and optimization.

Example profile (Some less relevant information was manually removed.):
```
FILE_SCAN_OPERATOR  (id=0.  nereids_id=104.  table  name  =  lineitem):
      CommonCounters:
            -  ExecTime:  11sec669ms
            -  OpenTime:  7.395ms
            -  WaitForDependency[FILE_SCAN_OPERATOR_DEPENDENCY]Time:  11sec634ms
              ....
      CustomCounters:
              ....
          Scanner:
                ....
                -  FileReadTime:  4min8sec
                -  FileScannerGetBlockTime:  4min9sec
                -  MergedSmallIO:  0ns
                    -  ReadTime:  4min8sec
                    -  RequestBytes:  2.05  MB
                    -  RequestIO:  20.0K  (20000)
                -  ParquetReader:  0ns
                    -  ColumnReadTime:  4min9sec
                    -  FileOpenTime:  365.293ms
                    -  ParseFooterTime:  5min6sec
                    -  ParseMetaTime:  126.383ms
                         .....
                -  S3Profile:  0ns
                    -  TooManyRequestErr:  0
                    -  TooManyRequestSleepTime:  0ms
                    -  TotalBytesRead:  199.14  MB
                    -  TotalGetRequestTime:  9min12sec
                    -  TotalGetRequest:  40.0K  (40000)
                -  ScannerCpuTime:  20sec946ms
                -  ScannerGetBlockTime:  9min19sec
```

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
- [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
- [ ] Yes. <!-- Add document PR link here. eg:
apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
yiguolei pushed a commit to yiguolei/incubator-doris that referenced this pull request Dec 30, 2025
…der. (apache#57519) (apache#5628)

bp apache#57519

Problem Summary:
This PR adds a `TotalGetRequestTime` profile metric to the S3 reader.
The new metric records the total time spent on all `GetObject` requests,
which will be useful for future performance analysis and optimization.

Example profile (Some less relevant information was manually removed.):
```
FILE_SCAN_OPERATOR  (id=0.  nereids_id=104.  table  name  =  lineitem):
      CommonCounters:
            -  ExecTime:  11sec669ms
            -  OpenTime:  7.395ms
            -  WaitForDependency[FILE_SCAN_OPERATOR_DEPENDENCY]Time:  11sec634ms
              ....
      CustomCounters:
              ....
          Scanner:
                ....
                -  FileReadTime:  4min8sec
                -  FileScannerGetBlockTime:  4min9sec
                -  MergedSmallIO:  0ns
                    -  ReadTime:  4min8sec
                    -  RequestBytes:  2.05  MB
                    -  RequestIO:  20.0K  (20000)
                -  ParquetReader:  0ns
                    -  ColumnReadTime:  4min9sec
                    -  FileOpenTime:  365.293ms
                    -  ParseFooterTime:  5min6sec
                    -  ParseMetaTime:  126.383ms
                         .....
                -  S3Profile:  0ns
                    -  TooManyRequestErr:  0
                    -  TooManyRequestSleepTime:  0ms
                    -  TotalBytesRead:  199.14  MB
                    -  TotalGetRequestTime:  9min12sec
                    -  TotalGetRequest:  40.0K  (40000)
                -  ScannerCpuTime:  20sec946ms
                -  ScannerGetBlockTime:  9min19sec
```

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/3.0.x dev/3.1.3-merged dev/4.0.2-merged reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants