@morningman
Contributor

cherry pick from #42585

…ss to OSS-HDFS (apache#42585)

Fix the problem that the Paimon catalog cannot access OSS-HDFS.

There are 2 problems in the Paimon catalog:
1. Doris FE cannot list Paimon tables.
This is because we pass three properties (`fs.oss.endpoint`,
`fs.oss.accessKeyId`, and `fs.oss.accessKeySecret`) to the PaimonCatalog.
When PaimonCatalog gets these three properties, it uses `OSSLoader`
rather than `HadoopFileIOLoader`.

2. Doris BE does not use libhdfs to access OSS-HDFS.
This is because the `tmpLocation` in `LocationPath` does not contain
`oss-dls.aliyuncs`. We should use the `endpoint` to determine whether the
user wants to access OSS-HDFS.
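
The two fixes described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: the class `OssHdfsSketch` and its methods are hypothetical names, the detection assumes an OSS-HDFS endpoint can be recognized by the substring `oss-dls.aliyuncs`, and it assumes that withholding the three `fs.oss.*` keys is enough to keep PaimonCatalog from choosing `OSSLoader`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Doris or Paimon.
final class OssHdfsSketch {
    // Substring named in the PR description as the OSS-HDFS marker.
    static final String OSS_HDFS_MARKER = "oss-dls.aliyuncs";

    // The three properties that make PaimonCatalog pick OSSLoader.
    static final Set<String> OSS_KEYS = Set.of(
            "fs.oss.endpoint", "fs.oss.accessKeyId", "fs.oss.accessKeySecret");

    // Decide from the endpoint (not from the table location) whether
    // the user is targeting OSS-HDFS.
    static boolean isOssHdfs(String endpoint) {
        return endpoint != null && endpoint.contains(OSS_HDFS_MARKER);
    }

    // Return a copy of the catalog properties. For an OSS-HDFS endpoint
    // the fs.oss.* keys are dropped (assumption: this lets the
    // Hadoop-based file IO be used instead of OSSLoader).
    static Map<String, String> paimonOptions(Map<String, String> props, String endpoint) {
        Map<String, String> out = new HashMap<>(props);
        if (isOssHdfs(endpoint)) {
            out.keySet().removeAll(OSS_KEYS);
        }
        return out;
    }
}
```

The endpoint is passed in explicitly because a location such as `oss://bucket/path` carries no hint about OSS-HDFS, which is the cause of the second problem above.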

In addition, to access OSS-HDFS with PaimonCatalog, you should:
1. Download Jindo SDK:
https://github.com/aliyun/alibabacloud-jindodata/blob/latest/docs/user/zh/jindosdk/jindosdk_download.md
2. Copy `jindo-core.jar` and `jindo-sdk.jar` to the `${DORIS_HOME}/fe/lib` and
`${DORIS_HOME}/be/lib/java_extensions/preload-extensions` directories.
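
As a quick sanity check after copying, something like the following could confirm that both jars are present in a directory. It is a minimal sketch: `JindoJarCheck` is a hypothetical helper, and matching on the prefixes `jindo-core` and `jindo-sdk` is an assumption made so that file names with a version suffix would also match.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of Doris.
final class JindoJarCheck {
    // Assumed jar name prefixes, taken from the two jars named in step 2.
    static final List<String> REQUIRED_PREFIXES = List.of("jindo-core", "jindo-sdk");

    // Returns the prefixes for which no matching .jar file exists in dir.
    static List<String> missingJars(Path dir) throws IOException {
        List<String> missing = new ArrayList<>(REQUIRED_PREFIXES);
        if (!Files.isDirectory(dir)) {
            return missing; // nothing can be present in a non-existent directory
        }
        try (Stream<Path> files = Files.list(dir)) {
            files.map(p -> p.getFileName().toString())
                 .filter(name -> name.endsWith(".jar"))
                 // drop every required prefix that this jar name starts with
                 .forEach(name -> missing.removeIf(name::startsWith));
        }
        return missing;
    }
}
```

Calling `missingJars` once for `${DORIS_HOME}/fe/lib` and once for `${DORIS_HOME}/be/lib/java_extensions/preload-extensions` should return an empty list for each if the copy step succeeded.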
@morningman
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 40128 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a65e0fbc131bff3976f3d6837b114aff7619d5e1, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17582	7796	7249	7249
q2	2049	169	162	162
q3	10690	1087	1154	1087
q4	10560	752	789	752
q5	7764	2857	2725	2725
q6	229	147	148	147
q7	978	597	594	594
q8	9581	1930	1938	1930
q9	8067	6340	6383	6340
q10	6958	2281	2270	2270
q11	456	265	264	264
q12	399	210	204	204
q13	17807	2958	2938	2938
q14	255	217	207	207
q15	549	517	508	508
q16	692	588	600	588
q17	964	508	555	508
q18	7056	6409	6515	6409
q19	1445	1042	1026	1026
q20	464	194	194	194
q21	3824	3040	3047	3040
q22	1092	1003	986	986
Total cold run time: 109461 ms
Total hot run time: 40128 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	7215	7196	7179	7179
q2	318	236	228	228
q3	2830	2848	2839	2839
q4	1980	1727	1739	1727
q5	5619	5647	5738	5647
q6	218	140	142	140
q7	2228	1724	1767	1724
q8	3277	3478	3466	3466
q9	8727	8850	8799	8799
q10	3496	3487	3475	3475
q11	590	495	506	495
q12	783	609	592	592
q13	16443	3155	3104	3104
q14	298	285	270	270
q15	567	513	514	513
q16	702	662	667	662
q17	1833	1631	1598	1598
q18	8138	7780	7443	7443
q19	7029	1530	1646	1530
q20	2031	1839	1869	1839
q21	5326	5253	5178	5178
q22	1086	1007	1001	1001
Total cold run time: 80734 ms
Total hot run time: 59449 ms

@doris-robot

TPC-DS: Total hot run time: 193020 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a65e0fbc131bff3976f3d6837b114aff7619d5e1, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1259	909	906	906
query2	6248	2042	1994	1994
query3	10775	3859	3846	3846
query4	67498	29971	23343	23343
query5	5164	447	428	428
query6	440	161	177	161
query7	5644	303	303	303
query8	317	233	229	229
query9	9434	2646	2619	2619
query10	468	269	252	252
query11	17460	15075	15831	15075
query12	160	99	101	99
query13	1616	420	418	418
query14	10746	7378	7020	7020
query15	224	189	177	177
query16	7145	510	474	474
query17	1036	553	564	553
query18	1629	317	318	317
query19	193	151	151	151
query20	115	110	111	110
query21	202	99	100	99
query22	4304	4340	4254	4254
query23	34290	33656	33913	33656
query24	5964	2752	2792	2752
query25	507	412	407	407
query26	659	159	166	159
query27	1694	298	293	293
query28	4164	2517	2489	2489
query29	684	433	436	433
query30	234	151	156	151
query31	1003	776	823	776
query32	63	56	57	56
query33	454	268	267	267
query34	885	498	484	484
query35	847	703	713	703
query36	1053	960	927	927
query37	121	71	68	68
query38	3986	3759	3870	3759
query39	1500	1429	1423	1423
query40	201	97	99	97
query41	52	52	48	48
query42	108	98	98	98
query43	510	480	474	474
query44	1125	782	792	782
query45	184	168	163	163
query46	1122	694	694	694
query47	1885	1767	1832	1767
query48	450	365	371	365
query49	747	398	406	398
query50	822	397	394	394
query51	7211	7051	7181	7051
query52	104	109	88	88
query53	256	184	180	180
query54	565	449	438	438
query55	78	77	79	77
query56	255	242	235	235
query57	1218	1080	1095	1080
query58	204	198	207	198
query59	3035	2838	2870	2838
query60	259	252	240	240
query61	95	99	93	93
query62	771	657	649	649
query63	214	183	179	179
query64	1383	589	570	570
query65	3222	3150	3166	3150
query66	694	312	291	291
query67	15658	15134	15283	15134
query68	4481	537	537	537
query69	431	248	253	248
query70	1161	1083	1045	1045
query71	405	250	247	247
query72	6419	3873	3842	3842
query73	758	335	327	327
query74	10312	8815	8954	8815
query75	3353	2599	2612	2599
query76	1811	1048	987	987
query77	493	259	261	259
query78	10875	9581	9432	9432
query79	8577	597	593	593
query80	2263	398	415	398
query81	551	243	241	241
query82	1366	112	118	112
query83	337	139	135	135
query84	573	76	74	74
query85	1933	287	272	272
query86	506	306	276	276
query87	4511	4240	4276	4240
query88	5516	2369	2362	2362
query89	561	277	284	277
query90	2100	176	180	176
query91	175	135	141	135
query92	71	47	48	47
query93	6764	529	537	529
query94	891	294	297	294
query95	346	248	247	247
query96	636	272	285	272
query97	3343	3109	3126	3109
query98	214	197	193	193
query99	1603	1298	1307	1298
Total cold run time: 335949 ms
Total hot run time: 193020 ms

@morningman merged commit 4be107a into apache:branch-3.0 on Nov 6, 2024