ARROW-9555: [Rust] [DataFusion] Added inner join #7830
Conversation
This reduces:
* the runtime complexity of this operation from O(N*(1 + M)) to O(N*M) (N = number of rows, M = number of aggregations),
* the memory footprint from O(N*M) accumulators to O(M) accumulators,
* the code complexity, via DRY.
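For context, the accumulator change above can be sketched as follows. This is an illustrative stand-alone example, not the actual DataFusion API: one accumulator per aggregate expression (O(M) memory) is updated in a single pass over the N input rows.

```rust
// One accumulator per aggregate expression, updated once per row.
trait Accumulator {
    fn update(&mut self, value: f64);
    fn result(&self) -> f64;
}

struct Sum(f64);
impl Accumulator for Sum {
    fn update(&mut self, value: f64) { self.0 += value; }
    fn result(&self) -> f64 { self.0 }
}

struct Max(f64);
impl Accumulator for Max {
    fn update(&mut self, value: f64) { if value > self.0 { self.0 = value; } }
    fn result(&self) -> f64 { self.0 }
}

// O(N*M) time, but only O(M) accumulators are ever alive.
fn aggregate(rows: &[f64], accs: &mut [Box<dyn Accumulator>]) -> Vec<f64> {
    for &row in rows {
        for acc in accs.iter_mut() {
            acc.update(row);
        }
    }
    accs.iter().map(|a| a.result()).collect()
}

fn main() {
    let rows = [1.0, 4.0, 2.0];
    let mut accs: Vec<Box<dyn Accumulator>> =
        vec![Box::new(Sum(0.0)), Box::new(Max(f64::MIN))];
    assert_eq!(aggregate(&rows, &mut accs), vec![7.0, 4.0]);
}
```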
The gist of the implementation for a given partition is:
```
for left_record in left_records:
    hash_left = build_hash_of_keys(left_record)
    for right_record in right_records:
        hash_right = build_hash_of_keys(right_record)
        indexes = inner_join(hash_left, hash_right)
        yield concat(left_record, right_record)[indexes]
```
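A minimal stand-alone sketch of the block nested-loop scheme above, with plain `Vec`s standing in for `RecordBatch`es (all names here are illustrative, not the actual DataFusion API):

```rust
use std::collections::HashMap;

// A "batch" is a vector of (key, value) rows; a stand-in for a RecordBatch.
type Batch = Vec<(i64, i64)>;

// Block nested-loop inner join: for each left batch, hash its keys once,
// then probe that hash with every row of every right batch.
fn inner_join(left_batches: &[Batch], right_batches: &[Batch]) -> Vec<(i64, i64, i64)> {
    let mut out = Vec::new();
    for left in left_batches {
        // Build a key -> row-indexes map for this left batch.
        let mut hash_left: HashMap<i64, Vec<usize>> = HashMap::new();
        for (i, (key, _)) in left.iter().enumerate() {
            hash_left.entry(*key).or_default().push(i);
        }
        for right in right_batches {
            for (key, rval) in right {
                if let Some(indexes) = hash_left.get(key) {
                    for &i in indexes {
                        // yield concat(left_row, right_row)
                        out.push((*key, left[i].1, *rval));
                    }
                }
            }
        }
    }
    out
}

fn main() {
    let left = vec![vec![(1, 10), (2, 20)]];
    let right = vec![vec![(2, 200), (3, 300)]];
    assert_eq!(inner_join(&left, &right), vec![(2, 20, 200)]);
}
```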
I.e., inefficient.
The implementation is currently sequential, even though it can be trivially distributed, as each RecordBatch is evaluated independently (we still lock the mutex on partition reading, as in other physical plans). Since we have not committed to a distributed computational model, IMO the sequential version is enough for now.
Hi @jorgecarleitao, is this a nested inner loop join? I think we should be implementing a hash join instead.
@andygrove, I am still learning these concepts in detail, so you will need to help me out here. :) AFAIK this is a block nested join, which reduces to the nested inner join if the RecordBatch size is one. I was unsure about the consequences of moving all of the left side to a single partition, and thus took a more parallel approach of not assuming that a single partition fits in memory. Does this make sense? For me, we can implement both; I was just trying to have a parallel version in place, and then allow optimizers to pick between them. I would also be fine with a hash join only.
I think we are going to have to address (re-)partitioning first before we can tackle joins. If the two tables are both partitioned on the join keys and have the same number of partitions, then the joins can happen in parallel across those partitions (at least, for inner joins this is true). However, if we want to implement a hash join without doing that, I would suggest that we load one side (the build side) into memory (single partition) and then stream the other side (the probe side), performing a lookup in the hash table for each row. The probe side can happen in parallel. This is described in more detail here:
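The build/probe scheme suggested above can be sketched as follows. This is a stand-alone illustration under simplifying assumptions (rows as `(key, value)` tuples, the build side fully materialized), not the actual DataFusion implementation:

```rust
use std::collections::HashMap;

// Hash join: load the whole build side into a hash table, then stream the
// probe side, looking each row's key up in the table.
fn hash_join(
    build: &[(i64, i64)],
    probe: impl Iterator<Item = (i64, i64)>,
) -> Vec<(i64, i64, i64)> {
    // Build phase: key -> build-side values (handles duplicate keys).
    let mut table: HashMap<i64, Vec<i64>> = HashMap::new();
    for (key, val) in build {
        table.entry(*key).or_default().push(*val);
    }
    // Probe phase: only reads the table, so it could run in parallel
    // across probe-side partitions.
    let mut out = Vec::new();
    for (key, pval) in probe {
        if let Some(bvals) = table.get(&key) {
            for &bval in bvals {
                out.push((key, bval, pval));
            }
        }
    }
    out
}

fn main() {
    let build = vec![(1, 10), (2, 20), (2, 21)];
    let probe = vec![(2, 200), (3, 300)].into_iter();
    assert_eq!(hash_join(&build, probe), vec![(2, 20, 200), (2, 21, 200)]);
}
```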
I agree with you, @andygrove, that we need to revisit the partitioning before tackling this. Closing.
This PR implements an inner join, e.g. `SELECT a FROM simple1 JOIN simple2 ON a`. I have not run any benchmarks; this is a pure implementation plus some tests. The gist of the implementation of the physical plan for a given partition is shown in the pseudocode above.
This PR is built on top of #7687 and #7796.