Skip to content

[Bug] Error plan in bucket shuffle join #6171

@HappenLee

Description

@HappenLee

Describe the bug
create table and load data

create table T_DORIS_A
(
ID bigint not null,
APPLY_CRCL bigint(19)
)ENGINE=OLAP
UNIQUE KEY(`ID`)
COMMENT "T_DORIS_A"
DISTRIBUTED BY HASH(`ID`) BUCKETS 32
PROPERTIES(
"replication_num"="1",
"in_memory"="false",
"storage_format"="V2"
);

create table T_DORIS_B
(
ID bigint not null,
FACTOR_FIN_VALUE decimal(19,2),
PRJT_ID bigint(19)
)ENGINE=OLAP
UNIQUE KEY(`ID`)
COMMENT "T_DORIS_B"
DISTRIBUTED BY HASH(`ID`) BUCKETS 32
PROPERTIES(
"replication_num"="1",
"in_memory"="false",
"storage_format"="V2"
);

create table T_DORIS_C
(
ID bigint not null
)ENGINE=OLAP
UNIQUE KEY(`ID`)
COMMENT "T_DORIS_C"
DISTRIBUTED BY HASH(`ID`) BUCKETS 32
PROPERTIES(
"replication_num"="1",
"in_memory"="false",
"storage_format"="V2"
);

create table T_DORIS_D
(
ID bigint not null,
LIMIT_ID bigint(19),
CORE_ID bigint(19)
)ENGINE=OLAP
UNIQUE KEY(`ID`)
COMMENT "T_DORIS_D"
DISTRIBUTED BY HASH(`ID`) BUCKETS 32
PROPERTIES(
"replication_num"="1",
"in_memory"="false",
"storage_format"="V2"
);

create table T_DORIS_E
(
ID bigint not null,
SHARE_ID bigint,
SPONSOR_ID bigint
)ENGINE=OLAP
UNIQUE KEY(`ID`)
COMMENT "T_DORIS_E"
DISTRIBUTED BY HASH(`ID`) BUCKETS 32
PROPERTIES(
"replication_num"="1",
"in_memory"="false",
"storage_format"="V2"
);

To Reproduce
check query paln

desc SELECT B.FACTOR_FIN_VALUE, D.limit_id FROM T_DORIS_A A     LEFT JOIN T_DORIS_B B ON B.PRJT_ID = A.ID     LEFT JOIN T_DORIS_C C ON A.apply_crcl = C.id     JOIN T_DORIS_D D ON C.ID = D.CORE_ID order by B.FACTOR_FIN_VALUE, D.limit_id desc;

T_DORIS_C and T_DORIS_D should not be bucket shuffle join, but we got it.

|   6:HASH JOIN                                                                    |
|   |  join op: INNER JOIN (BUCKET_SHUFFLE)                                             |
|   |  hash predicates:                                                            |
|   |  colocate: false, reason: The src data has been redistributed                |
|   |  equal join conjunct: `C`.`ID` = `D`.`CORE_ID`                               |
|   |                                                                              |
|   |----10:EXCHANGE                                                               |
|   |                                                                              |
|   4:HASH JOIN                                                                    |
|   |  join op: LEFT OUTER JOIN (BROADCAST)                                        |
|   |  hash predicates:                                                            |
|   |  colocate: false, reason: Tables are not in the same group                   |
|   |  equal join conjunct: `A`.`apply_crcl` = `C`.`id`                            |
|   |                                                                              |
|   |----9:EXCHANGE                                                                |
|   |                                                                              |
|   2:HASH JOIN                                                                    |
|   |  join op: LEFT OUTER JOIN (BUCKET_SHUFFLE)                                   |
|   |  hash predicates:                                                            |
|   |  colocate: false, reason: Tables are not in the same group                   |
|   |  equal join conjunct: `A`.`ID` = `B`.`PRJT_ID`                               |
|   |                                                                              |
|   |----8:EXCHANGE                                                                |
|   |                                                                              |
|   0:OlapScanNode                                                                 |
|      TABLE: T_DORIS_A                                                            |
|      PREAGGREGATION: OFF. Reason: No AggregateInfo                               |
|      partitions=1/1                                                              |
|      rollup: T_DORIS_A                                                           |
|      tabletRatio=32/32                                                           |
|      tabletList=71030,71032,71034,71036,71038,71040,71042,71044,71046,71048 ...  |
|      cardinality=9884                                                            |
|      avgRowSize=5.156718                                                         |
|      numNodes=1       

Expected behavior

The right query paln should be:

|   6:HASH JOIN                                                                    |
|   |  join op: INNER JOIN (BROADCAST)                                             |
|   |  hash predicates:                                                            |
|   |  colocate: false, reason: The src data has been redistributed                |
|   |  equal join conjunct: `C`.`ID` = `D`.`CORE_ID`                               |
|   |                                                                              |
|   |----10:EXCHANGE                                                               |
|   |                                                                              |
|   4:HASH JOIN                                                                    |
|   |  join op: LEFT OUTER JOIN (BROADCAST)                                        |
|   |  hash predicates:                                                            |
|   |  colocate: false, reason: Tables are not in the same group                   |
|   |  equal join conjunct: `A`.`apply_crcl` = `C`.`id`                            |
|   |                                                                              |
|   |----9:EXCHANGE                                                                |
|   |                                                                              |
|   2:HASH JOIN                                                                    |
|   |  join op: LEFT OUTER JOIN (BUCKET_SHUFFLE)                                   |
|   |  hash predicates:                                                            |
|   |  colocate: false, reason: Tables are not in the same group                   |
|   |  equal join conjunct: `A`.`ID` = `B`.`PRJT_ID`                               |
|   |                                                                              |
|   |----8:EXCHANGE                                                                |
|   |                                                                              |
|   0:OlapScanNode                                                                 |
|      TABLE: T_DORIS_A                                                            |
|      PREAGGREGATION: OFF. Reason: No AggregateInfo                               |
|      partitions=1/1                                                              |
|      rollup: T_DORIS_A                                                           |
|      tabletRatio=32/32                                                           |
|      tabletList=71030,71032,71034,71036,71038,71040,71042,71044,71046,71048 ...  |
|      cardinality=9884                                                            |
|      avgRowSize=5.156718                                                         |
|      numNodes=1  

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions