SortMerge join support for IS NOT DISTINCT FROM. #16003
The patch adds a "requiredNonNullKeyParts" field to the sortMerge processor. It lists the key parts that must be non-null for an equijoin condition to match. Key parts compared with SQL "=" are in the list; key parts compared with SQL "IS NOT DISTINCT FROM" are not.
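A minimal sketch of the idea described above, not the code from this patch: it assumes a row key is held as an `Object[]` and that `requiredNonNullKeyParts` is an `int[]` of positions into that key. The class name and the key representation are hypothetical; only the names `hasNonNullKeyParts` and `requiredNonNullKeyParts` come from the conversation.

```java
final class RequiredNonNullKeyPartsSketch {
    private RequiredNonNullKeyPartsSketch() {}

    /**
     * Returns true if every listed part of the key is non-null.
     * The Object[] key layout is an assumption made for illustration.
     */
    static boolean hasNonNullKeyParts(Object[] key, int[] keyParts) {
        if (keyParts.length == 0) {
            // Nothing is required to be non-null (all conditions are IS NOT DISTINCT FROM).
            return true;
        }
        for (int part : keyParts) {
            if (key[part] == null) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical join: l.a = r.a AND l.b IS NOT DISTINCT FROM r.b
        // Only key part 0 (the "=" condition) must be non-null.
        int[] requiredNonNullKeyParts = {0};

        Object[] nullInB = {"x", null}; // may still match: b uses IS NOT DISTINCT FROM
        Object[] nullInA = {null, "y"}; // can never match: a uses "="

        System.out.println(hasNonNullKeyParts(nullInB, requiredNonNullKeyParts));
        System.out.println(hasNonNullKeyParts(nullInA, requiredNonNullKeyParts));
    }
}
```

With this shape, a join whose conditions all use "=" would list every key part, and a join whose conditions all use "IS NOT DISTINCT FROM" would pass an empty array.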
if (keyParts.length == 0) {
  return true;
}
It does allow us to skip calling getRowPositionInDataRegion, which I thought might be useful.
@Override
public boolean isCompletelyNonNullKey(int row)
public boolean hasNonNullKeyParts(int row, int[] keyParts)
Side note: we should mention somewhere in the Javadoc that the compare implementation treats null as equal to null, which is why this method can be called to filter out those rows.
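To illustrate the point in this comment, here is a hedged sketch of how a null-equals-null comparison and a non-null filter can combine. It is not the processor's actual code: the class, the `String[]` keys, and the `compareKeys` and `matches` helpers are all hypothetical, and `Comparator.nullsFirst` stands in for whatever ordering the real compare implementation uses.

```java
import java.util.Comparator;

final class NullSafeKeyMatchSketch {
    private NullSafeKeyMatchSketch() {}

    // Ordering in which null sorts first and null compares equal to null.
    static final Comparator<String> PART_ORDER =
        Comparator.nullsFirst(Comparator.naturalOrder());

    /** Compares two keys part by part; null == null under PART_ORDER. */
    static int compareKeys(String[] left, String[] right) {
        for (int i = 0; i < left.length; i++) {
            int cmp = PART_ORDER.compare(left[i], right[i]);
            if (cmp != 0) {
                return cmp;
            }
        }
        return 0;
    }

    /**
     * A key pair matches if the compare says "equal" and no part that requires
     * "=" semantics is null. Checking only the left key is enough here: if the
     * keys compare equal and the left part is non-null, the right part is too.
     */
    static boolean matches(String[] left, String[] right, int[] requiredNonNullKeyParts) {
        for (int part : requiredNonNullKeyParts) {
            if (left[part] == null) {
                return false;
            }
        }
        return compareKeys(left, right) == 0;
    }
}
```

Under these assumptions, the comparison alone gives "IS NOT DISTINCT FROM" behavior, and the extra null check restores "=" behavior for the listed key parts.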
      0
    )
  );
}
I was hoping to find a unit test case with an IS NOT DISTINCT FROM SQL query with sort-merge enabled.
Good news, testJoinWithExplicitIsNotDistinctFromCondition is that test case 😄
This test was passing on master because the join was getting switched quietly to broadcast. Now it actually runs as sort-merge, which is why I had to add sortIfSortBased (the results are now in a different order).