Skip to content

[Substrait] Planning errors: Missing column with multiple consecutive joins #18079

@hareshkh

Description

@hareshkh

Describe the bug

We get the following error:

The left or right side of the join does not have all columns on \"on\": \nMissing on the left: {Column { name: \"target_unit:1\", index: 2 }}\nMissing on the right: {}

DataFusion plan:

Projection: left.target_unit AS origin_unit, left.target_unit:1 AS target_unit, left.origin_unit_of_measure AS material_conversions_origin_unit_of_measure, left.target_unit_of_measure AS material_conversions_target_unit_of_measure, right.origin_unit_of_measure, right.target_unit_of_measure
  Left Join: left.target_unit = right.origin_unit_of_measure, left.target_unit:1 = right.target_unit_of_measure
    SubqueryAlias: left
      Projection: target_unit_of_measure, left.target_unit, origin_unit_of_measure, right.target_unit
        Left Join: left.target_unit = origin_unit_of_measure, right.target_unit = target_unit_of_measure
          Projection: right.target_unit, left.target_unit
            Left Join: left.location_material_id = right.location_material_id
              SubqueryAlias: left
                Projection: location_material_id, target_unit
                  TableScan: e55dfa69-3ce2-43cc-9eab-39503aa1577f projection=[target_unit, location_material_id]
              SubqueryAlias: right
                Projection: target_unit AS target_unit, location_material_id AS location_material_id
                  Filter: location_material_id IS NOT NULL
                    TableScan: e55dfa69-3ce2-43cc-9eab-39503aa1577f projection=[target_unit, location_material_id], partial_filters=[location_material_id IS NOT NULL]
          Projection: origin_unit_of_measure, target_unit_of_measure
            Filter: origin_unit_of_measure IS NOT NULL AND target_unit_of_measure IS NOT NULL
              TableScan: c1da230d-a768-4a1f-bda4-9961da6ae26f projection=[target_unit_of_measure, origin_unit_of_measure], partial_filters=[origin_unit_of_measure IS NOT NULL, target_unit_of_measure IS NOT NULL]
    SubqueryAlias: right
      Projection: origin_unit_of_measure, target_unit_of_measure
        Filter: origin_unit_of_measure IS NOT NULL AND target_unit_of_measure IS NOT NULL
          TableScan: c1da230d-a768-4a1f-bda4-9961da6ae26f projection=[target_unit_of_measure, origin_unit_of_measure], partial_filters=[origin_unit_of_measure IS NOT NULL, target_unit_of_measure IS NOT NULL]

Here is the substrait plan: https://pastebin.com/AVkyzgAX

To Reproduce

No response

Expected behavior

No response

Additional context

The equivalent spark plan that works

WriteFiles
  +- Project [origin_unit#1582, target_unit#1596, material_conversions_origin_unit_of_measure#1614, material_conversions_target_unit_of_measure#1613, origin_unit_of_measure#1652, target_unit_of_measure#1651]
      +- Join LeftOuter, ((origin_unit#1582 = origin_unit_of_measure#1652) AND (target_unit#1596 = target_unit_of_measure#1651))
        :- Project [target_unit#1596, material_conversions_origin_unit_of_measure#1614, origin_unit#1582, material_conversions_target_unit_of_measure#1613]
        :  +- Join LeftOuter, ((origin_unit#1582 = material_conversions_origin_unit_of_measure#1614) AND (target_unit#1596 = material_conversions_target_unit_of_measure#1613))
        :     :- Project [origin_unit#1582, target_unit#1596]
        :     :  +- Join LeftOuter, (9a5e3792-832d-4686-b05f-5328affa7612#1590 = b809d14d-0704-4e6f-b304-4ede6fd1d6b8#1593)
        :     :     :- Project [target_unit#1572 AS origin_unit#1582, location_material_id#1573 AS 9a5e3792-832d-4686-b05f-5328affa7612#1590]
        :     :     :  +- Relation A [target_unit#1572,location_material_id#1573] parquet
        :     :     +- Project [location_material_id#1597 AS b809d14d-0704-4e6f-b304-4ede6fd1d6b8#1593, target_unit#1596]
        :     :        +- Filter isnotnull(location_material_id#1597)
        :     :           +- Relation A [target_unit#1596,location_material_id#1597] parquet
        :     +- Project [target_unit_of_measure#1565 AS material_conversions_target_unit_of_measure#1613, origin_unit_of_measure#1566 AS material_conversions_origin_unit_of_measure#1614]
        :        +- Filter (isnotnull(origin_unit_of_measure#1566) AND isnotnull(target_unit_of_measure#1565))
        :           +- Relation B[target_unit_of_measure#1565,origin_unit_of_measure#1566] parquet
        +- Project [origin_unit_of_measure#1652, target_unit_of_measure#1651]
            +- Filter (isnotnull(origin_unit_of_measure#1652) AND isnotnull(target_unit_of_measure#1651))
              +- Relation B[target_unit_of_measure#1651,origin_unit_of_measure#1652] parquet

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions