Conversation

@wsjz
Contributor

@wsjz wsjz commented Apr 2, 2024

Proposed changes

Issue Number: #31442

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@wsjz wsjz marked this pull request as ready for review April 3, 2024 09:32
@wsjz
Contributor Author

wsjz commented Apr 8, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 38351 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 996c5bfe98a4407838e8aa3d33206ad2aeee4a3f, data reload: false

------ Round 1 ----------------------------------
q1	17622	4053	4098	4053
q2	2024	184	178	178
q3	10486	1220	1272	1220
q4	10202	804	947	804
q5	7470	3001	2921	2921
q6	217	135	133	133
q7	1082	626	601	601
q8	9410	1959	2019	1959
q9	6991	6190	6111	6111
q10	8493	3497	3493	3493
q11	417	238	236	236
q12	385	209	210	209
q13	17782	2897	2918	2897
q14	261	231	241	231
q15	522	472	488	472
q16	512	375	378	375
q17	945	907	906	906
q18	7188	6391	6342	6342
q19	1600	1524	1529	1524
q20	541	321	300	300
q21	3666	3093	3108	3093
q22	356	303	293	293
Total cold run time: 108172 ms
Total hot run time: 38351 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4091	4068	4083	4068
q2	328	219	227	219
q3	2972	2968	2943	2943
q4	1849	1884	1843	1843
q5	5237	5210	5228	5210
q6	207	126	126	126
q7	2237	1806	1814	1806
q8	3210	3266	3253	3253
q9	8468	8479	8469	8469
q10	3779	3970	3998	3970
q11	547	491	471	471
q12	784	566	577	566
q13	16776	3086	3054	3054
q14	314	290	296	290
q15	534	484	490	484
q16	489	441	437	437
q17	1745	1730	1764	1730
q18	8115	7663	7502	7502
q19	1680	1667	1667	1667
q20	2004	1839	1841	1839
q21	5154	4955	5039	4955
q22	498	404	440	404
Total cold run time: 71018 ms
Total hot run time: 55306 ms

@doris-robot

TPC-DS: Total hot run time: 181716 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 996c5bfe98a4407838e8aa3d33206ad2aeee4a3f, data reload: false

query1	1209	1105	1116	1105
query2	6341	1958	2041	1958
query3	6665	214	210	210
query4	24082	21448	21464	21448
query5	4178	393	402	393
query6	282	184	179	179
query7	4592	307	296	296
query8	221	179	174	174
query9	8465	2205	2237	2205
query10	449	246	251	246
query11	14986	14563	14495	14495
query12	132	94	87	87
query13	1645	368	385	368
query14	8423	6791	6765	6765
query15	221	180	176	176
query16	6860	261	263	261
query17	1016	582	568	568
query18	1862	282	284	282
query19	206	154	156	154
query20	91	86	86	86
query21	201	135	130	130
query22	4958	4820	4806	4806
query23	33507	32715	32595	32595
query24	11569	3181	3174	3174
query25	708	419	435	419
query26	1903	164	160	160
query27	3483	383	378	378
query28	7286	1883	1915	1883
query29	1289	632	590	590
query30	301	173	169	169
query31	1019	751	755	751
query32	96	52	54	52
query33	641	234	243	234
query34	1211	505	511	505
query35	827	739	722	722
query36	992	869	896	869
query37	276	72	75	72
query38	3668	3608	3575	3575
query39	1627	1572	1576	1572
query40	228	135	125	125
query41	46	42	43	42
query42	113	99	104	99
query43	472	429	445	429
query44	1150	743	782	743
query45	279	289	263	263
query46	1095	825	835	825
query47	1975	1877	1885	1877
query48	386	310	304	304
query49	933	381	359	359
query50	823	416	418	416
query51	6863	6717	6758	6717
query52	107	90	92	90
query53	361	284	292	284
query54	269	221	222	221
query55	83	75	78	75
query56	241	224	228	224
query57	1266	1176	1158	1158
query58	235	209	220	209
query59	2712	2398	2340	2340
query60	259	244	231	231
query61	108	105	104	104
query62	671	442	449	442
query63	306	277	282	277
query64	6008	3228	3218	3218
query65	3049	2999	2997	2997
query66	1321	333	313	313
query67	15353	14995	14928	14928
query68	6493	558	563	558
query69	537	295	301	295
query70	1182	1135	1088	1088
query71	522	267	282	267
query72	6417	2542	2402	2402
query73	795	319	323	319
query74	6711	6291	6282	6282
query75	3524	2269	2319	2269
query76	4753	1201	1208	1201
query77	574	241	243	241
query78	10814	10149	10041	10041
query79	8274	525	525	525
query80	1460	420	414	414
query81	495	233	242	233
query82	673	92	89	89
query83	193	167	161	161
query84	271	86	83	83
query85	1375	286	276	276
query86	450	303	286	286
query87	3660	3496	3476	3476
query88	4077	2261	2266	2261
query89	573	370	371	370
query90	1962	178	173	173
query91	135	105	104	104
query92	61	49	50	49
query93	6812	525	534	525
query94	976	182	179	179
query95	417	306	304	304
query96	618	271	271	271
query97	2637	2507	2524	2507
query98	231	214	224	214
query99	1272	831	841	831
Total cold run time: 293144 ms
Total hot run time: 181716 ms

@doris-robot

ClickBench: Total hot run time: 29.4 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 996c5bfe98a4407838e8aa3d33206ad2aeee4a3f, data reload: false

query1	0.04	0.03	0.03
query2	0.07	0.04	0.04
query3	0.23	0.05	0.05
query4	1.68	0.06	0.07
query5	0.47	0.48	0.47
query6	1.12	0.65	0.64
query7	0.02	0.02	0.02
query8	0.06	0.05	0.04
query9	0.56	0.50	0.51
query10	0.56	0.57	0.56
query11	0.16	0.12	0.11
query12	0.14	0.12	0.13
query13	0.61	0.60	0.59
query14	0.77	0.80	0.80
query15	0.85	0.84	0.83
query16	0.35	0.36	0.35
query17	0.97	0.96	0.96
query18	0.25	0.24	0.24
query19	1.79	1.68	1.74
query20	0.01	0.01	0.01
query21	15.43	0.65	0.65
query22	4.35	8.37	1.30
query23	17.96	1.20	1.16
query24	1.74	0.19	0.19
query25	0.16	0.08	0.08
query26	0.27	0.16	0.15
query27	0.08	0.07	0.08
query28	13.65	0.97	0.94
query29	12.54	3.31	3.26
query30	0.27	0.07	0.05
query31	2.87	0.39	0.39
query32	3.31	0.48	0.47
query33	2.93	2.84	2.87
query34	15.50	4.36	4.31
query35	4.39	4.41	4.39
query36	0.67	0.47	0.47
query37	0.18	0.16	0.16
query38	0.15	0.14	0.14
query39	0.04	0.04	0.03
query40	0.17	0.16	0.14
query41	0.10	0.04	0.05
query42	0.05	0.05	0.04
query43	0.04	0.03	0.03
Total cold run time: 107.56 s
Total hot run time: 29.4 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 996c5bfe98a4407838e8aa3d33206ad2aeee4a3f with default session variables
Stream load json:         18 seconds loaded 2358488459 Bytes, about 124 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       15.6 seconds inserted 10000000 Rows, about 641K ops/s

@wsjz
Contributor Author

wsjz commented Apr 8, 2024

run buildall

@wsjz
Contributor Author

wsjz commented Apr 9, 2024

run buildall

@wsjz
Contributor Author

wsjz commented Apr 9, 2024

run buildall

@wsjz
Contributor Author

wsjz commented Apr 9, 2024

run buildall

@wsjz
Contributor Author

wsjz commented Apr 9, 2024

run buildall

@wsjz
Contributor Author

wsjz commented Apr 10, 2024

run buildall

@wsjz
Contributor Author

wsjz commented Apr 10, 2024

run buildall

@github-actions
Contributor

PR approved by anyone and no changes requested.

@wsjz
Contributor Author

wsjz commented Apr 10, 2024

run buildall

Contributor

@morningman morningman left a comment

LGTM

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 10, 2024
@wsjz
Contributor Author

wsjz commented Apr 10, 2024

run feut

@morningman morningman merged commit 37ae87b into apache:master Apr 10, 2024
dataroaring pushed a commit that referenced this pull request May 1, 2024
…4.0 (#34371)

* [feature](insert)use optional location and add hive regression test (#33153)

* [feature](iceberg)The new DDL syntax is added to create iceberg partitioned tables (#33338)

Supports `partition by`:

```
create table tb1 (c1 string, ts datetime) engine = iceberg partition by (c1, day(ts)) () properties ("a"="b")
```

* [Enhancement](hive-writer) Adjust table sink exchange rebalancer params. (#33397)

Issue Number:  #31442

Change the table sink exchange rebalancer params to node level, and adjust them to improve write performance through better balancing.

rebalancer params:
```
DEFINE_mInt64(table_sink_partition_write_min_data_processed_rebalance_threshold,
              "26214400"); // 25MB
// Minimum partition data processed to rebalance writers in exchange when partition writing
DEFINE_mInt64(table_sink_partition_write_min_partition_data_processed_rebalance_threshold,
              "15728640"); // 15MB
```

* [feature](profile) add transaction statistics for profile (#33488)

1. Commit total time
2. FS operator total time
    - rename file count
    - rename dir count
    - delete dir count
3. Add partition total time
    - add partition count
4. Update partition total time
    - update partition count

For example:
```
      -  Transaction  Commit  Time:  906ms
          -  FileSystem  Operator  Time:  833ms
              -  Rename  File  Count:  4
              -  Rename  Dir  Count:  0
              -  Delete  Dir  Count:  0
          -  HMS  Add  Partition  Time:  0ms
              -  HMS  Add  Partition  Count:  0
          -  HMS  Update  Partition  Time:  68ms
              -  HMS  Update  Partition  Count:  4
```

* [feature](iceberg) add iceberg transaction implement (#33629)

Issue #31442

add iceberg transaction

* [feature](insert)support default value when create hive table (#33666)

Issue Number: #31442

Hive 3 supports creating tables with column default values.
When Hive 3 is used, the default values can be written to the table.

* [refactor](filesystem)refactor `filesystem` interface (#33361)

1. Rename `list` to `globList`. The path passed to this `list` must contain a wildcard character, and the corresponding HDFS interface is `globStatus`, so the new name is `globList`.
2. If you only need to view files based on paths, you can use the `listFiles` operation.
3. Merge `listLocatedFiles` function into `listFiles` function.
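
The split between a wildcard listing and a plain directory listing can be illustrated with a minimal Python sketch. This is not the Doris `FileSystem` interface: `glob_list`, `list_files`, and the in-memory `FILES` list are hypothetical stand-ins, and `fnmatch` lets `*` cross `/`, which a real `globStatus` would not.

```python
import fnmatch
import posixpath

# Hypothetical in-memory "file system": a flat list of file paths.
FILES = [
    "/warehouse/t1/part-0.orc",
    "/warehouse/t1/part-1.orc",
    "/warehouse/t1/_SUCCESS",
    "/warehouse/t2/part-0.orc",
]


def glob_list(pattern):
    """Return the paths matching a wildcard pattern (glob-style listing)."""
    return sorted(p for p in FILES if fnmatch.fnmatchcase(p, pattern))


def list_files(directory):
    """Return the files directly under a plain directory path (no wildcards)."""
    directory = directory.rstrip("/")
    return sorted(p for p in FILES if posixpath.dirname(p) == directory)


if __name__ == "__main__":
    print(glob_list("/warehouse/t1/*.orc"))
    print(list_files("/warehouse/t1"))
```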

* [opt](meta-cache) refine the meta cache (#33449)

1. Use `caffeine` instead of `guava cache` to get better performance
2. Add a new class `CacheFactory`

    All (Async)LoadingCache should be built from `CacheFactory`

3. Use separate executors for different caches

    1. rowCountRefreshExecutor
      For the row count cache.
      The row count cache is an async loading cache, and we can ignore the result
      if the cache entry is missing or the thread pool is full.
      So use a separate executor for this cache.

    2. commonRefreshExecutor
      For other caches. Other caches are sync loading caches,
      but commonRefreshExecutor will be used for async refresh.
      That is, if a cache entry is missing, the cache value will be loaded in the caller thread, synchronously.
      If a cache entry needs a refresh, it will be reloaded in commonRefreshExecutor.

    3. fileListingExecutor
      File listing is a heavy operation, so use a separate executor for it.
      For fileCache, the refresh operation will still use commonRefreshExecutor to trigger the refresh,
      and fileListingExecutor will be used to list files.

4. Change the refresh and expire logic of caches

    For most caches, set the `refreshAfterWrite` strategy, so that
    even if the cache entry is expired, the old entry can still be
    used while new entry is being loaded.

5. Add new global variable `enable_get_row_count_from_file_list`

    Default is true. If false, getting the row count from the file list is disabled.
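
The behaviour described in items 3 and 4 (load synchronously on a miss, serve the old value while a stale entry is reloaded on a separate executor) can be sketched in Python. This is a simplified illustration, not Caffeine's `refreshAfterWrite` and not the Doris `CacheFactory`; the class name, the injectable `clock`, and the loader are all assumptions made for the example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class RefreshAfterWriteCache:
    """Toy cache: sync load on miss, async reload of stale entries."""

    def __init__(self, loader, refresh_after_s, executor, clock=time.monotonic):
        self._loader = loader
        self._refresh_after_s = refresh_after_s
        self._executor = executor  # plays the role of a refresh executor
        self._clock = clock
        self._lock = threading.Lock()
        self._entries = {}      # key -> (value, time written)
        self._refreshing = {}   # key -> Future of the in-flight reload

    def _store(self, key, value):
        with self._lock:
            self._entries[key] = (value, self._clock())
            self._refreshing.pop(key, None)

    def _reload(self, key):
        try:
            self._store(key, self._loader(key))
        except Exception:
            # Keep the old entry; allow a later get() to retry the refresh.
            with self._lock:
                self._refreshing.pop(key, None)
            raise

    def get(self, key):
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                value, written = entry
                stale = self._clock() - written >= self._refresh_after_s
                if stale and key not in self._refreshing:
                    # Reload in the background; the caller is not blocked.
                    self._refreshing[key] = self._executor.submit(self._reload, key)
                return value  # old value stays usable while reloading
        # Cache miss: load in the caller thread, synchronously.
        value = self._loader(key)
        self._store(key, value)
        return value
```

A caller would typically share one executor across several such caches, which mirrors the idea of a common refresh executor without claiming to reproduce it.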

* [bugfix](hive)delete write path after hive insert (#33798)

Issue #31442

1. delete files according to the query id
2. delete the write path after insert

* [Enhancement](multi-catalog) Rewrite `S3URI` to remove tricky virtual bucket mechanism and support different uri styles by flags. (#33858)

Many domestic cloud vendors are compatible with the s3 protocol. However, early versions of the s3 client only generate path style http requests (aws/aws-sdk-java-v2#763) when they encounter endpoints that do not start with s3, while some cloud vendors only support virtual host style http requests.

Therefore, Doris used `forceVirtualHosted` in `S3URI` to convert the location into a virtual hosted path and implemented it through path style.
For example, the s3 uri `s3://my-bucket/data/file.txt` will eventually be parsed into:
- virtualBucket: my-bucket
- Bucket: data (the bucket must be set, otherwise the s3 client will report an error). This step is particularly tricky because of the limitations of the s3 client.
- Key: file.txt

Path style mode is then used to generate an http request similar to a virtual host style request, by setting the endpoint to virtualBucket + original endpoint and setting the bucket and key.
**However, the bucket and key here are inconsistent with the original concepts of s3; the aws client just happens to be able to generate an http request similar to the virtual host style through path style mode.**
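
The legacy trick can be sketched in a few lines of Python. This is an illustration only, not the removed Doris code: the function names and the example endpoint `oss.example.com` are hypothetical, and joining virtualBucket and endpoint with a `.` is an assumption about what "virtualBucket + original endpoint" means.

```python
from urllib.parse import urlsplit


def legacy_virtual_bucket_parse(uri, endpoint):
    """Split an s3:// uri the way the old virtual bucket mechanism is described."""
    parts = urlsplit(uri)
    virtual_bucket = parts.netloc                            # real bucket name
    bucket, _, key = parts.path.lstrip("/").partition("/")   # first path segment poses as "bucket"
    return {
        "virtual_bucket": virtual_bucket,
        "bucket": bucket,
        "key": key,
        # Assumption: the virtual bucket is prepended to the endpoint with a dot.
        "endpoint": f"{virtual_bucket}.{endpoint}",
    }


def path_style_url(endpoint, bucket, key):
    """What a path style client would request: https://<endpoint>/<bucket>/<key>."""
    return f"https://{endpoint}/{bucket}/{key}"


if __name__ == "__main__":
    p = legacy_virtual_bucket_parse("s3://my-bucket/data/file.txt", "oss.example.com")
    print(path_style_url(p["endpoint"], p["bucket"], p["key"]))
```

The resulting URL has the same shape as a virtual host style request for bucket `my-bucket` and key `data/file.txt`, even though the client was given `data` as the bucket.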

However, #30799 upgraded the aws sdk version from 2.17.257 to 2.20.131. The current aws s3 client can already generate virtual host style http requests by default for third-party endpoints. So #31111 had to set the path style option so that the s3 client could keep working with Doris' virtual bucket mechanism.

**Finally, the virtual bucket mechanism is too confusing and tricky, and we no longer need it with the new version of s3 client.**

### Resolution:

Rewrite `S3URI` to remove tricky virtual bucket mechanism and support different uri styles by flags.

This class represents a fully qualified location in S3 for input/output operations, expressed as a URI.
 #### For AWS S3, URI common styles:
  - AWS Client Style(Hadoop S3 Style): `s3://my-bucket/path/to/file?versionId=abc123&partNumber=77&partNumber=88`
  - Virtual Host Style: `https://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
  - Path Style: `https://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
 
  Regarding the above-mentioned common styles, we can use <code>isPathStyle</code> to control whether to use path style
  or virtual host style.
  "Virtual host style" is the currently mainstream and recommended approach to use, so the default value of
  <code>isPathStyle</code> is false.
 
  #### Other Styles:
  - Virtual Host AWS Client (Hadoop S3) Mixed Style:
    `s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
  - Path AWS Client (Hadoop S3) Mixed Style:
     `s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
 
  For these two styles, we can use <code>isPathStyle</code> and <code>forceParsingByStandardUri</code>
  to control which one is used:
  - Virtual Host AWS Client (Hadoop S3) Mixed Style: <code>isPathStyle = false && forceParsingByStandardUri = true</code>
  - Path AWS Client (Hadoop S3) Mixed Style: <code>isPathStyle = true && forceParsingByStandardUri = true</code>
 
  When the incoming location is URL-encoded, the encoded string is returned:
  <code>getKey()</code> and <code>getQueryParams()</code> return the encoded strings.
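
How the two flags select a parsing mode can be sketched in Python. This is a rough model, not the actual `S3URI` class: `parse_s3_uri` and `ParsedS3Uri` are hypothetical names, and treating `http`/`https` locations as standard URIs without the force flag is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional
from urllib.parse import urlsplit


@dataclass
class ParsedS3Uri:
    bucket: str
    key: str
    endpoint: Optional[str]
    query_params: Dict[str, List[str]]


def parse_s3_uri(location, is_path_style=False, force_parsing_by_standard_uri=False):
    parts = urlsplit(location)
    path = parts.path.lstrip("/")  # urlsplit does not decode, so encoding is preserved
    standard = force_parsing_by_standard_uri or parts.scheme in ("http", "https")
    if not standard:
        # AWS client (Hadoop S3) style: s3://<bucket>/<key>
        bucket, key, endpoint = parts.netloc, path, None
    elif is_path_style:
        # Path style: <endpoint>/<bucket>/<key>
        bucket, _, key = path.partition("/")
        endpoint = parts.netloc
    else:
        # Virtual host style: <bucket>.<endpoint>/<key>
        bucket, _, endpoint = parts.netloc.partition(".")
        key = path
    # Split the query by hand so values stay exactly as they were passed in.
    params: Dict[str, List[str]] = {}
    for pair in filter(None, parts.query.split("&")):
        name, _, value = pair.partition("=")
        params.setdefault(name, []).append(value)
    return ParsedS3Uri(bucket, key, endpoint or None, params)


if __name__ == "__main__":
    query = "?versionId=abc123&partNumber=77&partNumber=88"
    print(parse_s3_uri("s3://my-bucket/path/to/file" + query))
    print(parse_s3_uri("https://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt",
                       is_path_style=True))
```

Repeated query keys such as `partNumber` are collected into a list, which is one reasonable way to represent them; the real class may expose them differently.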

* [improvement](hive)add the `queryid` to the temporary file path (#34278)

Change `_temp_<table_name>` to `_temp_<queryid>_<table_name>`.
This avoids a conflict when a user already has a table named `_temp_<table_name>`.

So as to partition temp dir

* [feature](Cloud) Load index data into index cache when writing data (#34046)

* [Feature](hive-writer) Implements s3 file committer. (#33937)

Issue Number: #31442

[Feature] (hive-writer) Implements s3 file committer. 

The S3 committer starts multipart uploads for all files on the BE side, and then completes these multipart uploads on the FE side. Until the multipart upload of a file is completed, the file is not visible, so the atomicity of a single file can be guaranteed. It still cannot guarantee atomicity across multiple files. Because hive committers have best-effort semantics, this shortens the inconsistent time window.

## ChangeList:
- Add `used_by_s3_committer` in `FileWriterOptions` on the BE side to start multipart uploads of files, which are then completed on the FE side.
- `cosn://` uses the s3 client on the FE side, because it needs to complete the multipart uploads on the FE side.
- Add `Status directoryExists(String dir)` and `Status deleteDirectory` in `FileSystem`.
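
The two-phase flow described above can be sketched with a small in-memory model in Python. Nothing here is Doris or AWS SDK code: `FakeObjectStore`, `backend_write`, and `frontend_commit` are hypothetical names that only mimic the shape of S3 multipart uploads (create, upload parts, complete).

```python
import itertools


class FakeObjectStore:
    """In-memory stand-in for an S3-like store with multipart uploads."""

    def __init__(self):
        self._objects = {}   # key -> bytes (visible objects)
        self._uploads = {}   # upload id -> (key, {part number: bytes})
        self._ids = itertools.count(1)

    def create_multipart_upload(self, key):
        upload_id = f"upload-{next(self._ids)}"
        self._uploads[upload_id] = (key, {})
        return upload_id

    def upload_part(self, upload_id, part_number, data):
        self._uploads[upload_id][1][part_number] = data

    def complete_multipart_upload(self, upload_id):
        key, parts = self._uploads.pop(upload_id)
        # The object becomes visible only at this point.
        self._objects[key] = b"".join(parts[n] for n in sorted(parts))

    def exists(self, key):
        return key in self._objects

    def read(self, key):
        return self._objects[key]


def backend_write(store, key, chunks):
    """BE side: start the upload and send all parts, but do not complete it."""
    upload_id = store.create_multipart_upload(key)
    for number, chunk in enumerate(chunks, start=1):
        store.upload_part(upload_id, number, chunk)
    return upload_id


def frontend_commit(store, pending_upload_ids):
    """FE side: complete pending uploads one by one (not atomic across files)."""
    for upload_id in pending_upload_ids:
        store.complete_multipart_upload(upload_id)


if __name__ == "__main__":
    store = FakeObjectStore()
    pending = [backend_write(store, "t/part-0", [b"ab", b"cd"]),
               backend_write(store, "t/part-1", [b"ef"])]
    print(store.exists("t/part-0"))   # nothing visible before the commit
    frontend_commit(store, pending)
    print(store.exists("t/part-0"), store.exists("t/part-1"))
```

A failure between two `complete_multipart_upload` calls would leave one file visible and the other not, which is the remaining inconsistency window the description mentions.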

---------

Co-authored-by: slothever <18522955+wsjz@users.noreply.github.com>
Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com>
Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
Co-authored-by: AlexYue <yj976240184@gmail.com>
@wsjz wsjz deleted the support_tbl_loc branch July 4, 2024 08:34
Labels

approved Indicates a PR has been approved by one committer. dev/2.1.3-merged reviewed

5 participants