@BiteTheDDDDt
Contributor

Proposed changes

select count(1) from (
select ss_customer_sk customer_sk
      ,ss_item_sk item_sk
from store_sales,date_dim
where ss_sold_date_sk = d_date_sk
  and d_month_seq between 1199 and 1199 + 11 and ss_sold_date_sk IS NOT NULL
group by ss_customer_sk
        ,ss_item_sk
) a;

uint256 original
12.27 sec

                              -  HashTableInputCount:  55.286154M  (55286154)
                              -  HashTableIterateTime:  347.592ms
                              -  HashTableSize:  54.116764M  (54116764)
                              -  InsertKeysToColumnTime:  1s79ms
                              -  MaxRowSizeInBytes:  0
                              -  MemoryUsage:  
                                  -  HashTable:  2.50  GB
                                  -  SerializeKeyArena:  1.63  GB
                              -  MergeTime:  0ns
                              -  PeakMemoryUsage:  4.13  GB

stringref
17.81 sec

                              -  HashTableInputCount:  55.285529M  (55285529)
                              -  HashTableIterateTime:  222.769ms
                              -  HashTableSize:  54.116764M  (54116764)
                              -  InsertKeysToColumnTime:  340.796ms
                              -  MaxRowSizeInBytes:  17
                              -  MemoryUsage:  
                                  -  HashTable:  1.50  GB
                                  -  SerializeKeyArena:  1.76  GB
                              -  MergeTime:  0ns
                              -  PeakMemoryUsage:  3.26  GB

uint256 opt
11.20 sec

                              -  HashTableInputCount:  55.285554M  (55285554)
                              -  HashTableIterateTime:  278.187ms
                              -  HashTableSize:  54.116764M  (54116764)
                              -  InsertKeysToColumnTime:  467.894ms
                              -  MaxRowSizeInBytes:  0
                              -  MemoryUsage:  
                                  -  HashTable:  2.50  GB
                                  -  SerializeKeyArena:  1.63  GB
                              -  MergeTime:  0ns
                              -  PeakMemoryUsage:  4.13  GB
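
For a quick read of the three profiles above, the ratios can be worked out directly from the reported figures. This is only a sketch: the dictionary layout and the `speedup` helper are hypothetical names introduced for illustration, and `1s79ms` is assumed to mean 1079 ms.

```python
# Figures copied from the profiles above; nothing here is newly measured.
runs = {
    "uint256 original": {"total_s": 12.27, "insert_keys_ms": 1079.0},  # 1s79ms, assumed 1079 ms
    "stringref": {"total_s": 17.81, "insert_keys_ms": 340.796},
    "uint256 opt": {"total_s": 11.20, "insert_keys_ms": 467.894},
}


def speedup(before: float, after: float) -> float:
    """Return how many times faster `after` is than `before` (hypothetical helper)."""
    return before / after


base = runs["uint256 original"]
opt = runs["uint256 opt"]

# Ratio for the InsertKeysToColumnTime counter, original vs. optimized uint256 path.
insert_keys_speedup = speedup(base["insert_keys_ms"], opt["insert_keys_ms"])
# Ratio for the end-to-end query time of the same two runs.
total_speedup = speedup(base["total_s"], opt["total_s"])

print(f"InsertKeysToColumnTime: {insert_keys_speedup:.2f}x")
print(f"Query time: {total_speedup:.2f}x")
```

On these numbers the optimized uint256 path spends less than half the original time in InsertKeysToColumnTime, while the end-to-end query time improves by roughly 9%. Each configuration appears to have been profiled once, so small differences may be within run-to-run noise.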

@BiteTheDDDDt
Contributor Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@hello-stephen
Contributor

(From new machine) TeamCity pipeline, ClickBench performance test result:
the sum of best hot time: 45.4 seconds
stream load tsv: 507 seconds loaded 74807831229 Bytes, about 140 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 29.1 seconds inserted 10000000 Rows, about 343K ops/s
storage size: 17162140226 Bytes

@BiteTheDDDDt
Contributor Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@hello-stephen
Contributor

(From new machine) TeamCity pipeline, ClickBench performance test result:
the sum of best hot time: 45.55 seconds
stream load tsv: 508 seconds loaded 74807831229 Bytes, about 140 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.9 seconds inserted 10000000 Rows, about 334K ops/s
storage size: 17156773957 Bytes

Contributor

@HappenLee HappenLee left a comment

LGTM

@github-actions github-actions bot added the approved label Jul 26, 2023
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@BiteTheDDDDt BiteTheDDDDt merged commit 9451382 into apache:master Jul 26, 2023
@xiaokang xiaokang added dev/2.0.0 and dev/2.0.1 labels and removed the dev/2.0.0 label Jul 26, 2023
xiaokang pushed a commit to xiaokang/doris that referenced this pull request Aug 9, 2023
…:insert_keys_into_columns (apache#22216)

optimization for AggregationMethodKeysFixed::insert_keys_into_columns
xiaokang pushed a commit that referenced this pull request Aug 11, 2023
…:insert_keys_into_columns (#22216)

optimization for AggregationMethodKeysFixed::insert_keys_into_columns
@xiaokang xiaokang mentioned this pull request Aug 30, 2023
@BiteTheDDDDt BiteTheDDDDt deleted the opt_0725 branch January 20, 2025 06:51
Labels

approved, dev/2.0.1-merged, reviewed
