Closed
Description
Issue Reason
Currently, Doris always creates two fragments for an aggregation, connected by an Exchange.
In some cases this Exchange is unnecessary.
Consider the following cases:
1. aggregate an unpartitioned table.
create table SQL:
CREATE TABLE `llj_test_1` (
`dt` int(11) NOT NULL COMMENT "",
`dis_key` varchar(20) NOT NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`dt`, `dis_key`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`dt`, `dis_key`) BUCKETS 3
PROPERTIES (
"replication_num" = "3",
"in_memory" = "false",
"storage_format" = "DEFAULT"
);
query SQL: select dt, dis_key, count(1) from llj_test_1 group by dt, dis_key;
2. aggregate a partitioned table.
create table SQL:
CREATE TABLE `llj_test` (
`dt` int(11) NOT NULL COMMENT "",
`dis_key` varchar(20) NOT NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`dt`, `dis_key`)
COMMENT "OLAP"
PARTITION BY RANGE(`dt`)
(PARTITION p20180822 VALUES [("19000101"), ("20181021")),
PARTITION p20181207 VALUES [("20181021"), ("20181022")))
DISTRIBUTED BY HASH(`dt`, `dis_key`) BUCKETS 3
PROPERTIES (
"replication_num" = "1",
"in_memory" = "false",
"storage_format" = "DEFAULT"
);
query SQL: select dt, dis_key, count(1) from llj_test group by dt, dis_key;
Suggestion
In DistributedPlanner, do not add the Exchange when it is unnecessary.
For case 1, we only need to check that the table's distribution hash keys are a subset of the aggregate (group by) keys. All rows of one group then hash to the same bucket, so the aggregation within each bucket is already final.
For case 2, we should check two conditions:
- the partition keys are also hash keys.
- the table's distribution hash keys are a subset of the aggregate keys.
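The two checks above can be sketched as a single predicate. This is only an illustration of the proposed rule, not Doris code: the real planner is written in Java, and the function name `can_skip_exchange` and its parameters are hypothetical. It assumes column names are compared case-insensitively and that an unpartitioned table is represented by an empty partition key list.

```python
from typing import Iterable, Sequence


def can_skip_exchange(
    group_by_keys: Iterable[str],
    hash_keys: Sequence[str],
    partition_keys: Sequence[str] = (),
) -> bool:
    """Return True if the proposed rule says the Exchange can be skipped.

    Hypothetical helper: models tables only by their key column names.
    """
    group_by = {k.lower() for k in group_by_keys}
    hashed = {k.lower() for k in hash_keys}
    partitioned = {k.lower() for k in partition_keys}

    # Without hash distribution keys there is no co-location guarantee.
    if not hashed:
        return False
    # Case 2 only: every partition key must also be a hash key.
    # For an unpartitioned table the set is empty, so this check passes.
    if not partitioned <= hashed:
        return False
    # Both cases: the hash keys must be a subset of the group by keys.
    return hashed <= group_by


# Case 1: llj_test_1, hashed by (dt, dis_key), grouped by (dt, dis_key).
print(can_skip_exchange(["dt", "dis_key"], ["dt", "dis_key"]))
# Case 2: llj_test, additionally range-partitioned by dt.
print(can_skip_exchange(["dt", "dis_key"], ["dt", "dis_key"], ["dt"]))
```

Both example queries from this issue satisfy the predicate. A query that groups only by dt would not, because dis_key is a hash key that is missing from the group by keys.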