@wuwenchi wuwenchi commented Apr 7, 2024

Proposed changes

Issue #31442

Support `partition by` when creating Iceberg tables:

create table tb1 (c1 string, ts datetime) engine = iceberg partition by (c1, day(ts)) () properties ("a"="b")
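
For reviewers less familiar with Iceberg transforms, here is a minimal sketch of what `partition by (c1, day(ts))` means for a row. It assumes the Iceberg spec definition of the `day` transform (days since 1970-01-01) and an identity transform for `c1`; the function names are hypothetical and are not part of this PR.

```python
from datetime import date, datetime

EPOCH = date(1970, 1, 1)


def day_transform(ts: datetime) -> int:
    """Days since 1970-01-01, following the Iceberg `day` transform definition."""
    return (ts.date() - EPOCH).days


def partition_key(c1: str, ts: datetime) -> tuple:
    """Hypothetical helper: identity transform on c1, day transform on ts."""
    return (c1, day_transform(ts))
```

Rows with the same `c1` and timestamps on the same calendar day land in the same partition; a timestamp on the next day gets a `day` value one higher.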


@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

wuwenchi commented Apr 7, 2024

run buildall

wuwenchi commented Apr 7, 2024

run compile

@doris-robot

TPC-H: Total hot run time: 39097 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9d76c6b19d211c9c2d698a28af82e5f59fe3672b, data reload: false

------ Round 1 ----------------------------------
q1	17914	4277	4181	4181
q2	2287	187	182	182
q3	11086	1406	1400	1400
q4	10524	851	983	851
q5	7927	3011	2953	2953
q6	221	134	132	132
q7	1117	632	627	627
q8	9417	2005	2065	2005
q9	6743	6186	6195	6186
q10	8486	3519	3553	3519
q11	426	244	221	221
q12	386	213	206	206
q13	17763	2905	2963	2905
q14	266	233	241	233
q15	526	486	476	476
q16	515	373	373	373
q17	973	920	921	920
q18	7194	6472	6482	6472
q19	1602	1549	1553	1549
q20	557	322	318	318
q21	3513	3085	3128	3085
q22	352	303	306	303
Total cold run time: 109795 ms
Total hot run time: 39097 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4079	4034	4090	4034
q2	335	217	221	217
q3	2989	2972	2935	2935
q4	1889	1857	1875	1857
q5	5246	5200	5226	5200
q6	207	125	125	125
q7	2235	1802	1828	1802
q8	3230	3276	3300	3276
q9	8470	8482	8510	8482
q10	3756	3841	3813	3813
q11	544	459	457	457
q12	709	551	546	546
q13	16815	2902	2890	2890
q14	297	266	276	266
q15	517	468	467	467
q16	458	405	407	405
q17	1728	1692	1688	1688
q18	7635	7314	7188	7188
q19	1642	1632	1650	1632
q20	1938	1711	1729	1711
q21	5059	4793	4726	4726
q22	519	410	435	410
Total cold run time: 70297 ms
Total hot run time: 54127 ms

@doris-robot

TPC-DS: Total hot run time: 180705 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 9d76c6b19d211c9c2d698a28af82e5f59fe3672b, data reload: false

query1	1240	1132	1117	1117
query2	6483	1855	1832	1832
query3	6668	219	224	219
query4	24156	21361	21526	21361
query5	4190	395	411	395
query6	269	174	180	174
query7	4598	304	299	299
query8	232	175	184	175
query9	8475	2202	2197	2197
query10	585	254	250	250
query11	14990	14568	14361	14361
query12	134	91	86	86
query13	1640	380	385	380
query14	8708	6687	6740	6687
query15	203	170	183	170
query16	7146	266	256	256
query17	1035	579	563	563
query18	1906	279	276	276
query19	197	151	149	149
query20	90	85	88	85
query21	199	123	120	120
query22	4959	4792	4728	4728
query23	33559	32743	32740	32740
query24	11161	3128	3106	3106
query25	693	389	391	389
query26	1895	158	152	152
query27	3052	324	332	324
query28	6706	1833	1803	1803
query29	1380	603	590	590
query30	299	174	169	169
query31	974	716	722	716
query32	101	55	57	55
query33	669	251	250	250
query34	979	490	495	490
query35	808	681	683	681
query36	997	894	893	893
query37	290	71	73	71
query38	3524	3392	3383	3383
query39	1574	1525	1551	1525
query40	290	130	126	126
query41	49	47	49	47
query42	109	104	99	99
query43	456	420	414	414
query44	1096	717	715	715
query45	288	251	267	251
query46	1085	816	778	778
query47	1877	1778	1790	1778
query48	379	299	312	299
query49	1156	364	365	364
query50	798	396	406	396
query51	6695	6560	6620	6560
query52	107	97	98	97
query53	361	290	292	290
query54	304	231	231	231
query55	88	79	76	76
query56	243	223	221	221
query57	1240	1121	1114	1114
query58	242	219	211	211
query59	2459	2466	2478	2466
query60	272	245	242	242
query61	111	108	112	108
query62	709	458	444	444
query63	313	285	289	285
query64	6402	3172	3265	3172
query65	3086	3018	3017	3017
query66	1454	337	325	325
query67	15416	15012	14830	14830
query68	8708	589	594	589
query69	564	311	308	308
query70	1330	1136	1124	1124
query71	490	276	280	276
query72	6490	2572	2382	2382
query73	890	321	329	321
query74	6640	6240	6312	6240
query75	3484	2255	2295	2255
query76	5737	1183	1232	1183
query77	676	248	242	242
query78	10746	10186	10155	10155
query79	10744	536	539	536
query80	1856	411	408	408
query81	500	244	229	229
query82	461	94	93	93
query83	223	175	174	174
query84	266	87	86	86
query85	1029	277	279	277
query86	389	317	288	288
query87	3673	3500	3482	3482
query88	4004	2367	2371	2367
query89	559	386	377	377
query90	2022	182	187	182
query91	134	103	104	103
query92	70	52	49	49
query93	6905	532	527	527
query94	1337	180	179	179
query95	433	314	321	314
query96	618	274	271	271
query97	2623	2474	2489	2474
query98	231	219	211	211
query99	1281	840	830	830
Total cold run time: 298962 ms
Total hot run time: 180705 ms

@doris-robot

ClickBench: Total hot run time: 29.51 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 9d76c6b19d211c9c2d698a28af82e5f59fe3672b, data reload: false

query1	0.04	0.03	0.03
query2	0.07	0.04	0.04
query3	0.24	0.05	0.05
query4	1.66	0.07	0.07
query5	0.49	0.49	0.49
query6	1.14	0.66	0.65
query7	0.02	0.01	0.01
query8	0.05	0.04	0.04
query9	0.55	0.52	0.50
query10	0.57	0.56	0.55
query11	0.15	0.12	0.11
query12	0.15	0.12	0.12
query13	0.61	0.59	0.59
query14	0.78	0.81	0.79
query15	0.87	0.84	0.84
query16	0.35	0.35	0.35
query17	0.96	1.00	1.01
query18	0.26	0.25	0.26
query19	1.86	1.75	1.75
query20	0.02	0.01	0.01
query21	15.41	0.64	0.64
query22	3.91	8.15	1.16
query23	17.88	1.30	1.23
query24	1.37	0.20	0.19
query25	0.16	0.08	0.08
query26	0.27	0.18	0.17
query27	0.08	0.07	0.07
query28	13.93	0.96	0.95
query29	12.54	3.32	3.27
query30	0.25	0.06	0.05
query31	2.85	0.40	0.39
query32	3.26	0.47	0.48
query33	2.87	2.84	2.89
query34	15.49	4.34	4.33
query35	4.37	4.38	4.37
query36	0.68	0.47	0.48
query37	0.18	0.16	0.15
query38	0.15	0.15	0.14
query39	0.04	0.04	0.03
query40	0.17	0.15	0.15
query41	0.10	0.05	0.05
query42	0.06	0.04	0.05
query43	0.04	0.04	0.03
Total cold run time: 106.9 s
Total hot run time: 29.51 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 9d76c6b19d211c9c2d698a28af82e5f59fe3672b with default session variables
Stream load json:         18 seconds loaded 2358488459 Bytes, about 124 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       15.7 seconds inserted 10000000 Rows, about 636K ops/s

morningman
morningman previously approved these changes Apr 9, 2024
@morningman morningman left a comment

LGTM

github-actions bot commented Apr 9, 2024

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added approved Indicates a PR has been approved by one committer. reviewed labels Apr 9, 2024
github-actions bot commented Apr 9, 2024

PR approved by anyone and no changes requested.

wuwenchi commented Apr 9, 2024

run buildall

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Apr 9, 2024
wuwenchi commented Apr 9, 2024

run p0

@kaka11chen kaka11chen left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 10, 2024
@github-actions

PR approved by at least one committer and no changes requested.

@morningman morningman merged commit 888e004 into apache:master Apr 10, 2024
morningman pushed a commit that referenced this pull request Apr 12, 2024
…tioned tables (#33338)

support partition by :

```
create table tb1 (c1 string, ts datetime) engine = iceberg partition by (c1, day(ts)) () properties ("a"="b")
```
morningman pushed a commit to morningman/doris that referenced this pull request Apr 30, 2024
…tioned tables (apache#33338)

support partition by :

```
create table tb1 (c1 string, ts datetime) engine = iceberg partition by (c1, day(ts)) () properties ("a"="b")
```
dataroaring pushed a commit that referenced this pull request May 1, 2024
…4.0 (#34371)

* [feature](insert)use optional location and add hive regression test (#33153)

* [feature](iceberg)The new DDL syntax is added to create iceberg partitioned tables (#33338)

support partition by :

```
create table tb1 (c1 string, ts datetime) engine = iceberg partition by (c1, day(ts)) () properties ("a"="b")
```

* [Enhancement](hive-writer) Adjust table sink exchange rebalancer params. (#33397)

Issue Number:  #31442

Move the table sink exchange rebalancer params to the node level and adjust them so that writes are balanced better, which improves write performance.

rebalancer params:
```
DEFINE_mInt64(table_sink_partition_write_min_data_processed_rebalance_threshold,
              "26214400"); // 25MB
// Minimum partition data processed to rebalance writers in exchange when partition writing
DEFINE_mInt64(table_sink_partition_write_min_partition_data_processed_rebalance_threshold,
              "15728640"); // 15MB
```
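
As a quick sanity check on the defaults above, the sketch below converts the byte values to the sizes given in the comments. It only verifies the arithmetic (the "MB" in the comments is 1024 * 1024 bytes); the dictionary and helper names are hypothetical and do not reflect how the BE reads these configs.

```python
MIB = 1024 * 1024  # bytes per MiB

# Default values copied from the DEFINE_mInt64 lines above.
DEFAULTS = {
    "table_sink_partition_write_min_data_processed_rebalance_threshold": 26214400,
    "table_sink_partition_write_min_partition_data_processed_rebalance_threshold": 15728640,
}


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to MiB."""
    return num_bytes / MIB


for name, value in DEFAULTS.items():
    print(f"{name} = {value} bytes = {to_mib(value):g} MiB")
```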

* [feature](profile) add transaction statistics for profile (#33488)

1. commit total time
2. fs operator total time
     rename file count
     rename dir count
     delete dir count
3. add partition total time
    add partition count
4. update partition total time
    update partition count
For example:
```
      -  Transaction  Commit  Time:  906ms
          -  FileSystem  Operator  Time:  833ms
              -  Rename  File  Count:  4
              -  Rename  Dir  Count:  0
              -  Delete  Dir  Count:  0
          -  HMS  Add  Partition  Time:  0ms
              -  HMS  Add  Partition  Count:  0
          -  HMS  Update  Partition  Time:  68ms
              -  HMS  Update  Partition  Count:  4
```

* [feature](iceberg) add iceberg transaction implement (#33629)

Issue #31442

add iceberg transaction

* [feature](insert)support default value when create hive table (#33666)

Issue Number: #31442

Hive 3 supports creating tables with column default values.
When using Hive 3, the default values can be written to the table.

* [refactor](filesystem)refactor `filesystem` interface (#33361)

1. Rename `list` to `globList`. The path passed to this `list` must contain a wildcard character, and the corresponding HDFS interface is `globStatus`, so the new name is `globList`.
2. If you only need to list files under a path, use the `listFiles` operation.
3. Merge the `listLocatedFiles` function into the `listFiles` function.
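
To illustrate the distinction between the two operations, here is a minimal in-memory sketch. It is not the Doris `FileSystem` interface: the function names mirror `globList` and `listFiles` only loosely, the non-recursive behavior of `list_files` is an assumption, and wildcards are matched per path component so that `*` does not cross `/`.

```python
import fnmatch


def glob_list(paths, pattern):
    """Return paths matching a wildcard pattern, component by component."""
    pat_parts = pattern.split("/")
    matched = []
    for path in paths:
        parts = path.split("/")
        if len(parts) == len(pat_parts) and all(
            fnmatch.fnmatchcase(part, pat) for part, pat in zip(parts, pat_parts)
        ):
            matched.append(path)
    return sorted(matched)


def list_files(paths, directory):
    """Return files directly under a directory; no wildcard handling."""
    prefix = directory.rstrip("/") + "/"
    return sorted(
        p for p in paths if p.startswith(prefix) and "/" not in p[len(prefix):]
    )


PATHS = [
    "/warehouse/t/dt=1/a.parquet",
    "/warehouse/t/dt=1/b.orc",
    "/warehouse/t/dt=2/c.parquet",
]
print(glob_list(PATHS, "/warehouse/t/dt=*/*.parquet"))
print(list_files(PATHS, "/warehouse/t/dt=1"))
```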

* [opt](meta-cache) refine the meta cache (#33449)

1. Use `caffeine` instead of `guava cache` to get better performance
2. Add a new class `CacheFactory`

    All (Async)LoadingCache should be built from `CacheFactory`

3. Use separate executors for different caches

    1. rowCountRefreshExecutor
      For the row count cache.
      The row count cache is an async loading cache, and we can ignore the result
      if the cache misses or the thread pool is full.
      So use a separate executor for this cache.

    2. commonRefreshExecutor
      For other caches. The other caches are sync loading caches,
      but commonRefreshExecutor is used for async refresh.
      That is, if a cache entry is missing, the value is loaded synchronously in the caller thread;
      if a cache entry needs a refresh, it is reloaded in commonRefreshExecutor.

    3. fileListingExecutor
      File listing is a heavy operation, so use a separate executor for it.
      For fileCache, the refresh operation still uses commonRefreshExecutor to trigger the refresh,
      and fileListingExecutor is used to list files.

4. Change the refresh and expire logic of caches

    For most caches, set the `refreshAfterWrite` strategy, so that
    even if the cache entry has expired, the old entry can still be
    used while the new entry is being loaded.

5. Add new global variable `enable_get_row_count_from_file_list`

    Default is true. If set to false, getting the row count from the file list is disabled.
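
The refresh behavior described in item 4 can be sketched as follows. This is a simplified, single-threaded model and not the Caffeine API or the Doris `CacheFactory`: the class name, the `pending` list that stands in for a refresh executor, and the injected clock are all assumptions made for illustration.

```python
import time


class RefreshAfterWriteCache:
    """Toy model: serve the old value while a stale entry waits to be reloaded."""

    def __init__(self, loader, refresh_after_s, clock=time.monotonic):
        self._loader = loader
        self._refresh_after_s = refresh_after_s
        self._clock = clock
        self._entries = {}      # key -> (value, written_at)
        self._refreshing = set()
        self.pending = []       # stands in for the refresh executor's queue

    def get(self, key):
        now = self._clock()
        if key not in self._entries:
            # Cache miss: load synchronously in the caller.
            value = self._loader(key)
            self._entries[key] = (value, now)
            return value
        value, written_at = self._entries[key]
        if now - written_at >= self._refresh_after_s and key not in self._refreshing:
            # Stale entry: schedule a reload but do not block the caller.
            self._refreshing.add(key)
            self.pending.append(key)
        return value

    def run_pending(self):
        """Run the scheduled reloads, as the refresh executor would."""
        while self.pending:
            key = self.pending.pop(0)
            self._entries[key] = (self._loader(key), self._clock())
            self._refreshing.discard(key)
```

A caller that hits a stale entry gets the old value immediately; the new value only becomes visible after the scheduled reload has run.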

* [bugfix](hive)delete write path after hive insert (#33798)

Issue #31442

1. Delete files according to the query ID
2. Delete the write path after insert

* [Enhancement](multi-catalog) Rewrite `S3URI` to remove tricky virtual bucket mechanism and support different uri styles by flags. (#33858)

Many domestic cloud vendors are compatible with the S3 protocol. However, early versions of the S3 client only generate path-style HTTP requests (aws/aws-sdk-java-v2#763) when they encounter endpoints that do not start with `s3`, while some cloud vendors only support virtual-host-style HTTP requests.

Therefore, Doris used `forceVirtualHosted` in `S3URI` to convert the location into a virtual-hosted path and implemented it through path style.
For example, the S3 URI `s3://my-bucket/data/file.txt` is eventually parsed into:
- virtualBucket: my-bucket
- Bucket: data (the bucket must be set, otherwise the S3 client reports an error). This step is particularly tricky because of the limitations of the S3 client.
- Key: file.txt

The path-style mode is then used to generate an HTTP request that looks like a virtual-host request, by setting the endpoint to virtualBucket + original endpoint and setting the bucket and key.
**However, the bucket and key here are inconsistent with the original S3 concepts; the AWS client just happens to generate an HTTP request similar to the virtual-host style through the path-style mode.**

However, #30799 upgraded the AWS SDK version from 2.17.257 to 2.20.131. The current AWS S3 client can already generate virtual-host-style HTTP requests for third-party endpoints by default, so #31111 had to set the path-style option to let the S3 client keep working with Doris' virtual bucket mechanism.

**Finally, the virtual bucket mechanism is too confusing and tricky, and we no longer need it with the new version of the S3 client.**

### Resolution:

Rewrite `S3URI` to remove tricky virtual bucket mechanism and support different uri styles by flags.

This class represents a fully qualified location in S3 for input/output operations, expressed as a URI.
 #### For AWS S3, common URI styles:
  - AWS Client Style (Hadoop S3 Style): `s3://my-bucket/path/to/file?versionId=abc123&partNumber=77&partNumber=88`
  - Virtual Host Style: `https://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
  - Path Style: `https://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
 
  For the common styles above, `isPathStyle` controls whether path style or virtual host style is used.
  "Virtual host style" is currently the mainstream and recommended approach, so the default value of
  `isPathStyle` is false.
 
  #### Other Styles:
  - Virtual Host AWS Client (Hadoop S3) Mixed Style:
    `s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
  - Path AWS Client (Hadoop S3) Mixed Style:
     `s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88`
 
  For these two styles, `isPathStyle` and `forceParsingByStandardUri` together select how the location is parsed:
  - Virtual Host AWS Client (Hadoop S3) Mixed Style: `isPathStyle = false && forceParsingByStandardUri = true`
  - Path AWS Client (Hadoop S3) Mixed Style: `isPathStyle = true && forceParsingByStandardUri = true`
 
  When the incoming location is URL-encoded, the encoded string is returned:
  `getKey()` and `getQueryParams()` return the encoded strings.
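
The flag combinations above can be sketched as a small parser. This is an illustrative Python model, not the Java `S3URI` class from this change: the function name, the returned dictionary, and the rule of taking the first host label as the bucket (which would break for bucket names containing dots) are assumptions. Only the two flags and the URI styles come from the description.

```python
from urllib.parse import urlsplit


def _raw_query(query):
    """Split a query string without URL-decoding, keeping repeated names."""
    params = {}
    for pair in filter(None, query.split("&")):
        name, _, value = pair.partition("=")
        params.setdefault(name, []).append(value)
    return params


def parse_s3_uri(location, is_path_style=False, force_parsing_by_standard_uri=False):
    parts = urlsplit(location)
    path = parts.path.lstrip("/")
    standard = force_parsing_by_standard_uri or parts.scheme.lower() in ("http", "https")
    if not standard:
        # AWS client (Hadoop S3) style: the authority is the bucket.
        bucket, key = parts.netloc, path
    elif is_path_style:
        # Path style: the first path segment is the bucket.
        bucket, _, key = path.partition("/")
    else:
        # Virtual host style: the first host label is the bucket (simplification).
        bucket, key = parts.netloc.split(".", 1)[0], path
    return {"bucket": bucket, "key": key, "query": _raw_query(parts.query)}
```

All five example URIs from the description resolve to bucket `my-bucket` under the matching flag combination, and a URL-encoded key such as `a%20b.txt` is returned still encoded.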

* [improvement](hive)add the `queryid` to the temporary file path (#34278)

Change `_temp_<table_name>` to `_temp_<queryid>_<table_name>`.
This prevents a conflict with a user table that is itself named `_temp_<table_name>`,
and separates the temp dir per query.
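
A minimal sketch of the naming change, assuming the temp directory name is built purely from the query id and table name as described; the helper names and the sample query ids are hypothetical.

```python
def old_temp_dir(table_name: str) -> str:
    """Previous naming scheme: one temp dir per table."""
    return f"_temp_{table_name}"


def new_temp_dir(query_id: str, table_name: str) -> str:
    """New naming scheme: one temp dir per query and table."""
    return f"_temp_{query_id}_{table_name}"


# Two concurrent inserts into the same table no longer share a temp dir.
print(new_temp_dir("q1", "orders"))
print(new_temp_dir("q2", "orders"))
```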

* [feature](Cloud) Load index data into index cache when writing data (#34046)

* [Feature](hive-writer) Implements s3 file committer. (#33937)

Issue Number: #31442

[Feature] (hive-writer) Implements s3 file committer. 

The S3 committer starts multipart uploads for all files on the BE side and then completes those multipart uploads on the FE side. Until the multipart upload of a file is completed, the file is not visible, so the atomicity of a single file is guaranteed. The atomicity of multiple files still cannot be guaranteed, but because Hive committers have best-effort semantics anyway, this shortens the inconsistent time window.

## ChangeList:
- Add `used_by_s3_committer` to `FileWriterOptions` on the BE side to start multipart uploads of files, which are then completed on the FE side.
- `cosn://` uses the S3 client on the FE side, because the multipart uploads need to be completed on the FE side.
- Add `Status directoryExists(String dir)` and `Status deleteDirectory` to `FileSystem`.
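
The two-phase flow described above can be modeled with a small in-memory store. This is a sketch under stated assumptions and not the Doris implementation or a real S3 client: `FakeObjectStore`, `be_write`, and `fe_commit` are hypothetical names, and the store only imitates the property that an object becomes visible when its multipart upload is completed.

```python
class FakeObjectStore:
    """In-memory stand-in for an S3-like store with multipart uploads."""

    def __init__(self):
        self._objects = {}   # key -> bytes, visible objects
        self._uploads = {}   # upload_id -> (key, {part_number: bytes})
        self._next_id = 0

    def create_multipart_upload(self, key):
        self._next_id += 1
        upload_id = f"upload-{self._next_id}"
        self._uploads[upload_id] = (key, {})
        return upload_id

    def upload_part(self, upload_id, part_number, data):
        self._uploads[upload_id][1][part_number] = data

    def complete_multipart_upload(self, upload_id):
        key, parts = self._uploads.pop(upload_id)
        self._objects[key] = b"".join(parts[n] for n in sorted(parts))

    def abort_multipart_upload(self, upload_id):
        self._uploads.pop(upload_id, None)

    def exists(self, key):
        return key in self._objects

    def read(self, key):
        return self._objects[key]


def be_write(store, key, chunks):
    """'BE side': start the upload and send all parts, but do not complete it."""
    upload_id = store.create_multipart_upload(key)
    for number, chunk in enumerate(chunks, start=1):
        store.upload_part(upload_id, number, chunk)
    return upload_id


def fe_commit(store, upload_ids):
    """'FE side': complete the pending uploads one by one (not atomic across files)."""
    for upload_id in upload_ids:
        store.complete_multipart_upload(upload_id)
```

Each file appears atomically at completion time, while the loop in `fe_commit` shows why several files can still become visible at slightly different moments.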

---------

Co-authored-by: slothever <18522955+wsjz@users.noreply.github.com>
Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com>
Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
Co-authored-by: AlexYue <yj976240184@gmail.com>

Labels

approved Indicates a PR has been approved by one committer. dev/2.1.3-merged reviewed

6 participants