@xionglei0
Contributor

doris_exchange_instances speeds up queries. It is especially useful when:

  1. the data size drops sharply after the exchange, for example when the query uses an aggregate function or a UDAF
  2. the cluster is large but only a few instances are needed for the aggregation, which saves computational resources

@xionglei0
Contributor Author

image

@kangkaisen
Contributor

@xionglei0 Hi, thanks for your work!

1. Do you have a benchmark for this PR?

2. Could you use parallelExecInstanceNum instead of doris_exchange_instances?

@xionglei0
Contributor Author

"parallel_fragment_exec_instance_num" sets the parallelism of the scan node.
"doris_exchange_instances" sets the parallelism after the exchange node.
When the scanned data is much larger than the data left after the exchange, as with aggregate functions such as sum and count or a UDAF, doris_exchange_instances is useful.
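
To make the distinction concrete, here is a toy Python model of how a cap on post-exchange instances might behave. It is only a sketch, not Doris's planner code: the function name pick_receiver_instances, the instance labels, and the rule that a non-positive or too-large setting leaves the instance count unchanged are all assumptions made for illustration.

```python
import random
from typing import List, Optional


def pick_receiver_instances(
    upstream_instances: List[str],
    exchange_instances: int,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Toy model: choose which instances run the fragment after an exchange.

    Assumption (not taken from the Doris source): with a non-positive setting,
    or one that is not smaller than the upstream instance count, the
    post-exchange fragment keeps one instance per upstream instance.
    Otherwise only `exchange_instances` of them are picked at random.
    """
    if exchange_instances <= 0 or exchange_instances >= len(upstream_instances):
        return list(upstream_instances)
    rng = rng or random.Random()
    return rng.sample(upstream_instances, exchange_instances)


# Hypothetical cluster: 10 backends with 4 scan instances on each.
scan_instances = [f"be{b}-scan{i}" for b in range(10) for i in range(4)]

# Setting left at 0: the post-exchange fragment mirrors the scan parallelism.
default_receivers = pick_receiver_instances(scan_instances, 0)

# Setting of 3: the aggregation after the exchange runs on 3 instances only.
capped_receivers = pick_receiver_instances(scan_instances, 3, random.Random(0))

print(len(scan_instances), len(default_receivers), len(capped_receivers))
```

In this model the scan still runs on all 40 instances, while the small aggregated result is handled by 3, which is the resource saving described above.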

@xionglei0
Contributor Author

The value of doris_exchange_instances needs to be set according to your data size and the UDAF in use. We usually configure it when the table uses the duplicate data model.

Contributor

@imay imay left a comment

LGTM

@imay imay merged commit a232a56 into apache:master Sep 16, 2019
@imay imay mentioned this pull request Sep 26, 2019
swjtu-zhanglei pushed a commit to swjtu-zhanglei/incubator-doris that referenced this pull request Jul 25, 2023
… conf (apache#1788)

The leak may lead to SEGV due to unreclaimable Txn, TxnKv, and Network.