@CalvinKirs
Member

cp #57116

…S S3A protocol (apache#57116)

…

Previously, for FS-based Paimon catalogs, internal configuration
translation was performed to ensure the storage layer used S3FileIO
(which internally relied on Hadoop S3).

However, Paimon also allows users to specify S3-related options with
various prefixes such as s3., s3a., or fs.s3a. in their configuration.
The S3FileIO implementation in Paimon would automatically normalize
these keys to the standard Hadoop prefix fs.s3a..

With the recent refactor, all object storage access is unified on the
HDFS S3A protocol directly, so Paimon's S3FileIO no longer normalizes
these keys on our behalf. The system must therefore handle these legacy
user-defined prefixes internally to remain compatible with existing
catalog configurations.

Before this change, users might define custom parameters like:

```
paimon.s3.list.version=1
paimon.s3.paging.maximum=100
paimon.fs.s3.read.ahead.buffer.size=1
paimon.s3a.replication.factor=3
```

After normalization, they are automatically converted to Hadoop-compatible S3A keys:

```
fs.s3a.list.version=1
fs.s3a.paging.maximum=100
fs.s3a.read.ahead.buffer.size=1
fs.s3a.replication.factor=3
```
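
For reviewers who want to see the mapping spelled out, the normalization above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this patch: the class name `S3aKeyNormalizer` and its `normalize` method are hypothetical, the `paimon.` catalog prefix and the legacy prefix list (`fs.s3a.`, `fs.s3.`, `s3a.`, `s3.`) are inferred from the description and the example keys, and keys with any other prefix are simply skipped here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not the class used in the patch. */
final class S3aKeyNormalizer {
    // Assumed catalog-level prefix, taken from the example keys above.
    private static final String CATALOG_PREFIX = "paimon.";
    // Target prefix understood by Hadoop S3A.
    private static final String HADOOP_PREFIX = "fs.s3a.";
    // Assumed set of user-facing prefixes; already-normalized keys pass through unchanged.
    private static final String[] LEGACY_PREFIXES = {"fs.s3a.", "fs.s3.", "s3a.", "s3."};

    private S3aKeyNormalizer() {}

    /** Returns only the S3-related properties, rewritten to use the fs.s3a. prefix. */
    static Map<String, String> normalize(Map<String, String> userProps) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : userProps.entrySet()) {
            String key = entry.getKey();
            if (!key.startsWith(CATALOG_PREFIX)) {
                continue; // not a catalog property in this sketch
            }
            String stripped = key.substring(CATALOG_PREFIX.length());
            for (String legacy : LEGACY_PREFIXES) {
                if (stripped.startsWith(legacy)) {
                    String suffix = stripped.substring(legacy.length());
                    result.put(HADOOP_PREFIX + suffix, entry.getValue());
                    break; // first matching prefix wins
                }
            }
        }
        return result;
    }
}
```

Calling `S3aKeyNormalizer.normalize(props)` on a map holding the four `paimon.*` keys above would yield the four `fs.s3a.*` keys listed after them. How the real implementation treats conflicts, such as the same option given under two different prefixes, is not covered by this sketch.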

---------

Co-authored-by: Mingyu Chen (Rayner) <yunyou@selectdb.com>
(cherry picked from commit d739136)
@CalvinKirs CalvinKirs requested a review from morrySnow as a code owner October 30, 2025 10:20
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@CalvinKirs
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34812 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4ba1875f697024873e3f2ef86044d0aeca08ebe4, data reload: false

------ Round 1 ----------------------------------
q1	17645	5498	5598	5498
q2	2044	413	335	335
q3	11853	1315	802	802
q4	10561	905	519	519
q5	9607	2450	2279	2279
q6	207	169	139	139
q7	948	763	658	658
q8	9367	1613	1465	1465
q9	5377	5025	4999	4999
q10	7342	2349	1943	1943
q11	567	335	328	328
q12	397	403	252	252
q13	17785	3698	3126	3126
q14	236	238	238	238
q15	548	489	475	475
q16	493	481	410	410
q17	743	910	442	442
q18	7075	6810	6668	6668
q19	1231	990	610	610
q20	390	397	237	237
q21	3522	2708	2323	2323
q22	1130	1102	1066	1066
Total cold run time: 109068 ms
Total hot run time: 34812 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5543	5565	5459	5459
q2	254	345	250	250
q3	2361	2729	2397	2397
q4	1459	1873	1512	1512
q5	4720	5162	5084	5084
q6	191	176	135	135
q7	2168	2013	1977	1977
q8	2723	2893	2760	2760
q9	7418	7306	7281	7281
q10	3038	3326	2844	2844
q11	623	558	518	518
q12	708	786	652	652
q13	3547	3893	3323	3323
q14	314	313	289	289
q15	535	470	485	470
q16	463	529	464	464
q17	1319	1768	1285	1285
q18	7896	7649	7439	7439
q19	958	1243	1193	1193
q20	2121	2139	1979	1979
q21	5689	5172	5000	5000
q22	1183	1167	1093	1093
Total cold run time: 55231 ms
Total hot run time: 53404 ms

@doris-robot

ClickBench: Total hot run time: 29.53 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 4ba1875f697024873e3f2ef86044d0aeca08ebe4, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.04	0.03
query3	0.24	0.07	0.07
query4	1.61	0.11	0.10
query5	0.52	0.52	0.51
query6	1.13	0.74	0.73
query7	0.03	0.01	0.02
query8	0.05	0.04	0.04
query9	0.62	0.54	0.53
query10	0.58	0.60	0.59
query11	0.15	0.12	0.12
query12	0.16	0.13	0.12
query13	0.64	0.61	0.60
query14	0.81	0.84	0.85
query15	0.87	0.87	0.87
query16	0.43	0.42	0.38
query17	1.16	1.08	1.12
query18	0.27	0.24	0.24
query19	1.95	1.94	2.20
query20	0.02	0.01	0.01
query21	15.36	1.14	0.70
query22	0.76	0.85	0.75
query23	14.89	1.56	0.66
query24	3.15	0.42	0.46
query25	0.14	0.08	0.06
query26	0.31	0.18	0.16
query27	0.06	0.06	0.05
query28	12.70	1.26	0.50
query29	12.60	4.05	3.32
query30	0.26	0.10	0.07
query31	2.83	0.67	0.42
query32	3.23	0.58	0.50
query33	3.11	3.11	3.14
query34	16.54	5.29	4.52
query35	4.65	4.61	4.60
query36	0.68	0.55	0.50
query37	0.09	0.07	0.07
query38	0.06	0.05	0.04
query39	0.04	0.03	0.02
query40	0.18	0.14	0.13
query41	0.08	0.03	0.03
query42	0.05	0.03	0.03
query43	0.05	0.04	0.03
Total cold run time: 103.18 s
Total hot run time: 29.53 s

@morningman morningman merged commit 0a687a6 into apache:branch-3.1 Oct 31, 2025
23 checks passed
@morrySnow morrySnow mentioned this pull request Nov 13, 2025