@DongLiang-0
Contributor

Proposed changes

Issue Number: close #xxx

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@DongLiang-0 DongLiang-0 marked this pull request as draft October 12, 2023 09:45
@DongLiang-0 DongLiang-0 marked this pull request as ready for review October 18, 2023 07:45
@DongLiang-0
Contributor Author

run buildall

@DongLiang-0 DongLiang-0 force-pushed the paimon_complex branch 2 times, most recently from fe3e440 to 34ff765 Compare October 18, 2023 08:16
@DongLiang-0
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.44 seconds
stream load tsv: 577 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 21 seconds loaded 2358488459 Bytes, about 107 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17162160467 Bytes

@DongLiang-0
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.46 seconds
stream load tsv: 579 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.1 seconds inserted 10000000 Rows, about 343K ops/s
storage size: 17162547631 Bytes

def c21 = """select * from auto_bucket where dt="b" and hh="c";"""
def c22 = """select * from auto_bucket where dt="d";"""
def c23 = """select * from complex_tab order by c1;"""
def c24 = """select * from complex_tab where c1=1;"""
Contributor

Cases 24, 25, and 26 are similar; keeping one is enough. You could add a case like 'SELECT m['a'] FROM simple_map;'.

Contributor Author

Thanks for the suggestion; I have changed it.

private int idx;
private InternalRow record;
ColumnType dorisType;
private DataGetters record;
Contributor

Why change InternalRow to DataGetters?

Contributor Author

unpackArray and unpackMap operate on org.apache.paimon.data.InternalArray, which extends org.apache.paimon.data.DataGetters but not InternalRow. If the field stayed typed as InternalRow, the cast would fail, so it is now declared as DataGetters.
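
The reasoning above can be sketched in plain Java. This is only an illustration: the nested interfaces are simplified, hypothetical stand-ins that borrow the Paimon names (the real DataGetters declares many more getters), and SimpleRow, SimpleArray, ColumnValue, setOffset, and castToRowFails are names invented for this example, not code from the PR.

```java
public class DataGettersSketch {
    // Hypothetical, simplified stand-ins for the Paimon interfaces named in the thread.
    interface DataGetters { int getInt(int pos); }
    interface InternalRow extends DataGetters { int getFieldCount(); }
    interface InternalArray extends DataGetters { int size(); }

    // Minimal row implementation backed by an int array (assumption for the demo).
    static final class SimpleRow implements InternalRow {
        private final int[] fields;
        SimpleRow(int... fields) { this.fields = fields; }
        @Override public int getInt(int pos) { return fields[pos]; }
        @Override public int getFieldCount() { return fields.length; }
    }

    // Minimal array implementation backed by an int array (assumption for the demo).
    static final class SimpleArray implements InternalArray {
        private final int[] elements;
        SimpleArray(int... elements) { this.elements = elements; }
        @Override public int getInt(int pos) { return elements[pos]; }
        @Override public int size() { return elements.length; }
    }

    // Typing the field as the shared super-interface lets one reader
    // serve both top-level rows and array elements.
    static final class ColumnValue {
        private int idx;
        private DataGetters record;

        void setOffset(DataGetters record, int idx) {
            this.record = record;
            this.idx = idx;
        }

        int getInt() { return record.getInt(idx); }
    }

    // Shows what would happen if the field were still typed as InternalRow:
    // an InternalArray cannot be cast to it at runtime.
    static boolean castToRowFails(DataGetters value) {
        try {
            Object ignored = (InternalRow) value;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ColumnValue value = new ColumnValue();
        value.setOffset(new SimpleArray(7, 8, 9), 1);
        System.out.println(value.getInt());                            // 8
        System.out.println(castToRowFails(new SimpleArray(7, 8, 9)));  // true
        System.out.println(castToRowFails(new SimpleRow(1, 2)));       // false
    }
}
```

Because both row and array types share the getter super-interface, widening the field type is enough; no separate reader class is needed for nested values.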

@DongLiang-0 DongLiang-0 force-pushed the paimon_complex branch 2 times, most recently from bf12d43 to 5e78f3a Compare October 20, 2023 06:21
@DongLiang-0
Contributor Author

run buildall

Contributor

@zddr zddr left a comment

LGTM

@github-actions
Contributor

PR approved by anyone and no changes requested.

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.65 seconds
stream load tsv: 577 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.1 seconds inserted 10000000 Rows, about 343K ops/s
storage size: 17162353741 Bytes

@DongLiang-0
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.81 seconds
stream load tsv: 555 seconds loaded 74807831229 Bytes, about 128 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 33 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 28.8 seconds inserted 10000000 Rows, about 347K ops/s
storage size: 17162015400 Bytes

Contributor

@morningman morningman left a comment

LGTM

@morningman morningman merged commit 267c112 into apache:master Oct 23, 2023
@morningman morningman added the dev/2.0.3 and not-merge/2.0 (do not merge into 2.0 branch) labels Oct 23, 2023
@morningman morningman added the merge_conflict label and removed the not-merge/2.0 (do not merge into 2.0 branch) label Oct 23, 2023
@morningman
Contributor

Needs more test cases; not merging to 2.0 yet.

dutyu pushed a commit to dutyu/doris that referenced this pull request Oct 28, 2023
morningman pushed a commit that referenced this pull request Nov 1, 2023
Add Paimon complex nested type regression case.
Related pr:#25364
DongLiang-0 added a commit to DongLiang-0/doris that referenced this pull request Nov 1, 2023
DongLiang-0 added a commit to DongLiang-0/doris that referenced this pull request Nov 1, 2023
Add Paimon complex nested type regression case.
Related pr:apache#25364
dutyu pushed a commit to dutyu/doris that referenced this pull request Nov 4, 2023
Add Paimon complex nested type regression case.
Related pr:apache#25364
DongLiang-0 added a commit to DongLiang-0/doris that referenced this pull request Nov 10, 2023
DongLiang-0 added a commit to DongLiang-0/doris that referenced this pull request Nov 10, 2023
Add Paimon complex nested type regression case.
Related pr:apache#25364
seawinde pushed a commit to seawinde/doris that referenced this pull request Nov 13, 2023
Add Paimon complex nested type regression case.
Related pr:apache#25364
@xiaokang xiaokang mentioned this pull request Dec 4, 2023
XuJianxu pushed a commit to XuJianxu/doris that referenced this pull request Dec 14, 2023
XuJianxu pushed a commit to XuJianxu/doris that referenced this pull request Dec 14, 2023
Add Paimon complex nested type regression case.
Related pr:apache#25364