@morningman
Contributor

@morningman morningman commented Dec 1, 2023

Proposed changes

Issue Number: #19897

Remove the default_cluster prefix from database names.
When upgrading, all existing prefixes will be removed.

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@morningman morningman changed the title from "[Draft] remove Db cluster" to "[refactor](cluster)(step-4) remove cluster related to Database" Dec 1, 2023
@morningman
Contributor Author

run buildall

@morningman
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.18 seconds
stream load tsv: 560 seconds loaded 74807831229 Bytes, about 127 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.8 seconds inserted 10000000 Rows, about 347K ops/s
storage size: 17167054074 Bytes
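
The "MB/s" and "K ops/s" figures in the report above appear to be truncated quotients, with "MB" meaning 2^20 bytes. This is inferred from the reported numbers, not taken from the bot's source, and the class and method names below are hypothetical. A minimal Java sketch of that arithmetic:

```java
final class LoadThroughput {
    // Assumption: the report's "MB" is 1024 * 1024 bytes.
    private static final long MIB = 1024L * 1024L;

    // Whole MiB per second, truncated toward zero.
    static long mibPerSecond(long bytes, long seconds) {
        return bytes / seconds / MIB;
    }

    // Thousands of rows per second, truncated toward zero.
    static long kiloOpsPerSecond(long rows, double seconds) {
        return (long) (rows / seconds / 1000.0);
    }

    public static void main(String[] args) {
        // Inputs copied from the first clickbench report in this thread.
        System.out.println("tsv:     " + mibPerSecond(74807831229L, 560) + " MB/s");
        System.out.println("json:    " + mibPerSecond(2358488459L, 18) + " MB/s");
        System.out.println("orc:     " + mibPerSecond(1101869774L, 65) + " MB/s");
        System.out.println("parquet: " + mibPerSecond(861443392L, 32) + " MB/s");
        System.out.println("insert:  " + kiloOpsPerSecond(10000000L, 28.8) + "K ops/s");
    }
}
```

With decimal megabytes the tsv figure would come out near 133 rather than 127, which is why the binary unit is assumed here.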

@morningman
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.1 seconds
stream load tsv: 566 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 34 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17167406011 Bytes

@morningman
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 43.65 seconds
stream load tsv: 576 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17164183758 Bytes

@morningman
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 43.94 seconds
stream load tsv: 576 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.7 seconds inserted 10000000 Rows, about 348K ops/s
storage size: 17164066620 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 484d4cdb02f681555bee5ee9bf9b02150e484010, data reload: true

run tpch-sf100 query with default conf and session variables
q1	4851	4590	4575	4575
q2	396	124	117	117
q3	1506	1265	1264	1264
q4	1154	940	979	940
q5	3230	3260	3221	3221
q6	251	128	124	124
q7	993	494	487	487
q8	2203	2206	2198	2198
q9	7034	6972	6935	6935
q10	3305	3348	3328	3328
q11	347	206	195	195
q12	371	208	208	208
q13	4677	3929	3879	3879
q14	242	214	217	214
q15	565	520	513	513
q16	449	369	368	368
q17	1013	632	624	624
q18	8083	7629	7518	7518
q19	1610	1541	1527	1527
q20	539	328	319	319
q21	3456	2959	2955	2955
q22	371	291	292	291
Total cold run time: 46646 ms
Total hot run time: 41800 ms
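
The TPC-H tables carry no column headers. From the numbers, the columns after the query name look like one cold run, two hot runs, and the faster of the two hot runs, all in milliseconds; the two totals are then the sums of the first and last columns. That reading is an inference from the data, not something the bot states, and the class name below is hypothetical. A short Java check against the table above:

```java
final class TpchTotals {
    // {cold, hot1, hot2} in ms for q1..q22, copied from the table above.
    static final int[][] RUNS = {
        {4851, 4590, 4575}, {396, 124, 117},    {1506, 1265, 1264},
        {1154, 940, 979},   {3230, 3260, 3221}, {251, 128, 124},
        {993, 494, 487},    {2203, 2206, 2198}, {7034, 6972, 6935},
        {3305, 3348, 3328}, {347, 206, 195},    {371, 208, 208},
        {4677, 3929, 3879}, {242, 214, 217},    {565, 520, 513},
        {449, 369, 368},    {1013, 632, 624},   {8083, 7629, 7518},
        {1610, 1541, 1527}, {539, 328, 319},    {3456, 2959, 2955},
        {371, 291, 292},
    };

    // Sum of the first (cold) run of every query.
    static int totalCold(int[][] runs) {
        int sum = 0;
        for (int[] r : runs) {
            sum += r[0];
        }
        return sum;
    }

    // Sum of the faster of the two hot runs of every query.
    static int totalBestHot(int[][] runs) {
        int sum = 0;
        for (int[] r : runs) {
            sum += Math.min(r[1], r[2]);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("Total cold run time: " + totalCold(RUNS) + " ms");
        System.out.println("Total hot run time: " + totalBestHot(RUNS) + " ms");
    }
}
```

The cold run is not considered when picking the best hot time: for q10 the cold run (3305) is faster than both hot runs, yet the last column reports 3328.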

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4563	4594	4583	4583
q2	317	225	226	225
q3	3768	3737	3753	3737
q4	2528	2499	2515	2499
q5	6160	6155	6162	6155
q6	246	120	126	120
q7	2685	1979	1984	1979
q8	3725	3697	3653	3653
q9	9465	9392	9333	9333
q10	4028	4126	4122	4122
q11	654	535	527	527
q12	826	623	636	623
q13	4368	3626	3636	3626
q14	268	248	241	241
q15	573	527	533	527
q16	508	457	479	457
q17	2090	2094	2033	2033
q18	9729	9142	9108	9108
q19	1802	1769	1738	1738
q20	2332	1993	1992	1992
q21	7437	6999	6923	6923
q22	698	557	595	557
Total cold run time: 68770 ms
Total hot run time: 64758 ms

@morningman morningman force-pushed the db_cluster branch 2 times, most recently from 43f7c27 to cec8613 December 4, 2023 13:02
@morningman
Contributor Author

run buildall

@morningman
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.04 seconds
stream load tsv: 568 seconds loaded 74807831229 Bytes, about 125 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17167320282 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit ac2f361cf3d80cab61b8dcc4e5ae2d3ec0b233c4, data reload: true

run tpch-sf100 query with default conf and session variables
q1	4878	4562	4606	4562
q2	373	123	120	120
q3	1515	1274	1276	1274
q4	1903	945	962	945
q5	3270	3254	3233	3233
q6	249	128	135	128
q7	1012	500	496	496
q8	2211	2201	2197	2197
q9	6967	6956	6940	6940
q10	3294	3348	3358	3348
q11	341	206	198	198
q12	364	208	205	205
q13	4643	3909	3873	3873
q14	246	219	215	215
q15	576	529	512	512
q16	422	385	384	384
q17	1002	607	571	571
q18	8067	7603	7447	7447
q19	1569	1550	1542	1542
q20	838	310	330	310
q21	3409	2953	2945	2945
q22	367	305	300	300
Total cold run time: 47516 ms
Total hot run time: 41745 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4598	4568	4550	4550
q2	319	201	216	201
q3	3747	3735	3743	3735
q4	2502	2509	2508	2508
q5	6151	6160	6184	6160
q6	247	121	124	121
q7	2606	1973	1950	1950
q8	3711	3624	3717	3624
q9	9457	9404	9401	9401
q10	4088	4151	4158	4151
q11	626	539	509	509
q12	805	615	636	615
q13	4382	3701	3632	3632
q14	290	247	265	247
q15	582	523	535	523
q16	532	460	446	446
q17	2130	2054	2071	2054
q18	9544	9011	9048	9011
q19	1789	1747	1751	1747
q20	2311	1988	1972	1972
q21	7303	6970	6939	6939
q22	685	597	586	586
Total cold run time: 68405 ms
Total hot run time: 64682 ms

@morningman
Contributor Author

run buildall

@morningman
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.48 seconds
stream load tsv: 576 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 27.7 seconds inserted 10000000 Rows, about 361K ops/s
storage size: 17165614096 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 3b570c5d2571a16f5ed1b32188f198dddbc8dcce, data reload: true

run tpch-sf100 query with default conf and session variables
q1	4614	4433	4440	4433
q2	375	121	120	120
q3	1462	1233	1189	1189
q4	1118	921	881	881
q5	3265	3218	3220	3218
q6	242	121	122	121
q7	938	474	505	474
q8	2113	2157	2151	2151
q9	6736	6705	6704	6704
q10	3214	3273	3267	3267
q11	312	204	210	204
q12	362	213	209	209
q13	4582	3816	3792	3792
q14	238	205	208	205
q15	570	510	515	510
q16	453	387	393	387
q17	1001	568	577	568
q18	7507	7432	7058	7058
q19	1553	1342	1393	1342
q20	499	340	299	299
q21	3145	2721	2695	2695
q22	357	291	284	284
Total cold run time: 44656 ms
Total hot run time: 40111 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4414	4423	4415	4415
q2	266	165	174	165
q3	3542	3525	3521	3521
q4	2416	2396	2398	2396
q5	5762	5777	5767	5767
q6	234	119	119	119
q7	2389	1882	1870	1870
q8	3500	3510	3525	3510
q9	9112	9050	9017	9017
q10	3912	3997	3995	3995
q11	516	404	410	404
q12	767	583	598	583
q13	4312	3551	3547	3547
q14	289	267	249	249
q15	569	528	518	518
q16	528	444	473	444
q17	1871	1867	1858	1858
q18	8777	8311	8234	8234
q19	1723	1770	1744	1744
q20	2260	1955	1926	1926
q21	6546	6223	6180	6180
q22	511	423	406	406
Total cold run time: 64216 ms
Total hot run time: 60868 ms

@morningman morningman force-pushed the db_cluster branch 2 times, most recently from 68f8ad4 to 92c7515 December 9, 2023 16:46
@morningman
Contributor Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@morningman
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.52 seconds
stream load tsv: 584 seconds loaded 74807831229 Bytes, about 122 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 67 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 33 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 28.6 seconds inserted 10000000 Rows, about 349K ops/s
storage size: 17220967829 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 2e86648bab635ef0ddd5054949fd533493201a5d, data reload: true

run tpch-sf100 query with default conf and session variables
q1	5171	4455	4391	4391
q2	368	119	118	118
q3	1501	1220	1203	1203
q4	3341	919	905	905
q5	3272	3176	3170	3170
q6	242	121	122	121
q7	966	471	470	470
q8	3393	2198	2185	2185
q9	6800	6738	6724	6724
q10	3194	3281	3378	3281
q11	326	185	187	185
q12	353	212	197	197
q13	4573	3828	3816	3816
q14	235	199	211	199
q15	559	510	520	510
q16	429	393	383	383
q17	1007	602	560	560
q18	7380	7253	6886	6886
q19	1559	1328	1444	1328
q20	520	277	300	277
q21	3079	2648	2652	2648
q22	341	274	271	271
Total cold run time: 48609 ms
Total hot run time: 39828 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4430	4422	4375	4375
q2	265	160	165	160
q3	3534	3512	3515	3512
q4	2379	2362	2371	2362
q5	5778	5759	5776	5759
q6	236	118	120	118
q7	2400	1867	1911	1867
q8	3512	3531	3520	3520
q9	9109	9054	9018	9018
q10	3918	4022	4014	4014
q11	498	389	373	373
q12	761	634	611	611
q13	4278	3581	3551	3551
q14	288	262	251	251
q15	557	522	517	517
q16	498	441	452	441
q17	1871	1844	1836	1836
q18	8667	8264	8351	8264
q19	1739	1750	1767	1750
q20	2263	1951	1931	1931
q21	6542	6192	6108	6108
q22	495	408	409	408
Total cold run time: 64018 ms
Total hot run time: 60746 ms

@morningman
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.2 seconds
stream load tsv: 588 seconds loaded 74807831229 Bytes, about 121 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 67 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.7 seconds inserted 10000000 Rows, about 348K ops/s
storage size: 17219794113 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 200eb10cb174ddac88c265ca68164b423707c156, data reload: true

run tpch-sf100 query with default conf and session variables
q1	4669	4417	4405	4405
q2	402	134	116	116
q3	1501	1229	1200	1200
q4	3331	868	893	868
q5	3370	3141	3168	3141
q6	238	122	122	122
q7	944	475	474	474
q8	3115	2188	2165	2165
q9	6817	6746	6663	6663
q10	3212	3260	3257	3257
q11	329	210	194	194
q12	361	208	203	203
q13	9917	3821	3879	3821
q14	240	209	214	209
q15	567	516	518	516
q16	444	390	383	383
q17	1012	577	564	564
q18	7517	7204	7441	7204
q19	1547	1360	1444	1360
q20	771	309	291	291
q21	3092	2645	2722	2645
q22	347	281	276	276
Total cold run time: 53743 ms
Total hot run time: 40077 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4396	4374	4398	4374
q2	265	161	170	161
q3	3548	3530	3538	3530
q4	2395	2377	2380	2377
q5	5757	5739	5763	5739
q6	237	116	118	116
q7	2390	1885	1870	1870
q8	3527	3532	3512	3512
q9	9090	9015	9005	9005
q10	3884	3992	4008	3992
q11	495	393	395	393
q12	762	598	590	590
q13	4314	3592	3569	3569
q14	287	250	256	250
q15	568	522	520	520
q16	503	451	456	451
q17	1887	1859	1846	1846
q18	8755	8338	8401	8338
q19	1732	1755	1750	1750
q20	2273	1960	1939	1939
q21	6575	6212	6168	6168
q22	509	433	446	433
Total cold run time: 64149 ms
Total hot run time: 60923 ms

@github-actions github-actions bot added the approved label (Indicates a PR has been approved by one committer.) Dec 16, 2023
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

Contributor

@dataroaring dataroaring left a comment

LGTM

@morningman morningman merged commit 8c05f7a into apache:master Dec 16, 2023
morningman added a commit that referenced this pull request Dec 17, 2023
…28532)

Introduced from #27861

The `dbName` saved in `CreateTableInfo` still has the `default_cluster` prefix; it should be removed.

Also modify the `getDb` entry point in the internal catalog. This is a fallback in case
db names with the `default_cluster` prefix still exist.
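
As an illustration of that fallback, the sketch below strips a legacy prefix from a db name before lookup. It assumes the prefix is written as `default_cluster:` in front of the db name; that form, and the `DbNamePrefix` class and `stripClusterPrefix` method, are assumptions for this example and are not quoted from the commit.

```java
final class DbNamePrefix {
    // Assumed legacy form: "default_cluster:<db>". Not taken from the commit.
    private static final String LEGACY_PREFIX = "default_cluster:";

    // Returns the name without the legacy prefix; other names pass through unchanged.
    static String stripClusterPrefix(String dbName) {
        if (dbName != null && dbName.startsWith(LEGACY_PREFIX)) {
            return dbName.substring(LEGACY_PREFIX.length());
        }
        return dbName;
    }

    public static void main(String[] args) {
        System.out.println(stripClusterPrefix("default_cluster:db1"));
        System.out.println(stripClusterPrefix("db1"));
    }
}
```

Normalizing at the lookup entry point means callers that still hold an old-style name resolve to the same database as callers holding the new one.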
hello-stephen pushed a commit to hello-stephen/doris that referenced this pull request Dec 28, 2023
…e#27861)

Issue Number: apache#19897

Remove the `default_cluster` prefix from database names.
When upgrading, all existing prefixes will be removed.
hello-stephen pushed a commit to hello-stephen/doris that referenced this pull request Dec 28, 2023
…pache#28532)

Introduced from apache#27861

The `dbName` saved in `CreateTableInfo` still has the `default_cluster` prefix; it should be removed.

Also modify the `getDb` entry point in the internal catalog. This is a fallback in case
db names with the `default_cluster` prefix still exist.
HappenLee pushed a commit to HappenLee/incubator-doris that referenced this pull request Jan 12, 2024
…e#27861)

Issue Number: apache#19897

Remove the `default_cluster` prefix from database names.
When upgrading, all existing prefixes will be removed.
HappenLee pushed a commit to HappenLee/incubator-doris that referenced this pull request Jan 12, 2024
…pache#28532)

Introduced from apache#27861

The `dbName` saved in `CreateTableInfo` still has the `default_cluster` prefix; it should be removed.

Also modify the `getDb` entry point in the internal catalog. This is a fallback in case
db names with the `default_cluster` prefix still exist.
dataroaring pushed a commit that referenced this pull request Jul 23, 2024
…8199)

Using this SQL to query the records of a single routine load job:
```
show routine load for db.job1\G
```
returns every routine load job in the db:
```
10 rows in set (0.02 sec)
```

The bug occurs because dbFullName is not assigned correctly when the SQL is
analyzed:
```
if (Strings.isNullOrEmpty(dbName)) {
    dbFullName = analyzer.getContext().getDatabase();
    if (Strings.isNullOrEmpty(dbFullName)) {
        ErrorReport.reportAnalysisException(ErrorCode.ERR_NO_DB_ERROR);
    }
}
```
dbFullName is always null when dbName is not null or empty.

The bug was introduced by: #27861
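
One plausible shape for the missing assignment is sketched below. It is not the actual patch: `RoutineLoadDbResolver`, `resolveDbFullName`, and the `sessionDb` parameter are hypothetical, plain null/empty checks stand in for `Strings.isNullOrEmpty`, and an exception stands in for `ErrorReport.reportAnalysisException`.

```java
final class RoutineLoadDbResolver {
    // Hypothetical helper: picks the db to use for "show routine load for [db.]job".
    static String resolveDbFullName(String dbName, String sessionDb) {
        if (dbName == null || dbName.isEmpty()) {
            // No db in the statement: fall back to the session's current db.
            if (sessionDb == null || sessionDb.isEmpty()) {
                throw new IllegalStateException("No database selected");
            }
            return sessionDb;
        }
        // The branch missing from the quoted snippet: keep the explicit db name.
        return dbName;
    }

    public static void main(String[] args) {
        System.out.println(resolveDbFullName("db", "other_db"));
        System.out.println(resolveDbFullName(null, "other_db"));
    }
}
```

Returning a value from every branch, rather than assigning a field in only one of them, makes it harder to leave the name unset.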
morrySnow pushed a commit that referenced this pull request Jul 30, 2024
fix

- when the label contains dbName, it is lost
introduced by #27861

- show routine load for xxx.yyy can report an authentication error
introduced by #33347

Note: test cases will be added together in other PRs
feiniaofeiafei pushed a commit to feiniaofeiafei/doris that referenced this pull request Aug 9, 2024
fix

- when the label contains dbName, it is lost
introduced by apache#27861

- show routine load for xxx.yyy can report an authentication error
introduced by apache#33347

Note: test cases will be added together in other PRs
dataroaring pushed a commit that referenced this pull request Aug 11, 2024
fix

- when the label contains dbName, it is lost
introduced by #27861

- show routine load for xxx.yyy can report an authentication error
introduced by #33347

Note: test cases will be added together in other PRs
dataroaring pushed a commit that referenced this pull request Aug 16, 2024
fix

- when the label contains dbName, it is lost
introduced by #27861

- show routine load for xxx.yyy can report an authentication error
introduced by #33347

Note: test cases will be added together in other PRs

Labels

approved (Indicates a PR has been approved by one committer.), meta-change, reviewed

4 participants