feat: Improve InListExpr types, flatten dict haystacks and validate in try_new_from_array#21402
feat: Improve InListExpr types, flatten dict haystacks and validate in try_new_from_array#21402buraksenn wants to merge 1 commit intoapache:mainfrom
Conversation
kosiew
left a comment
There was a problem hiding this comment.
Looks good overall. No blocking issues from my side. I left a few small suggestions that could help with consistency and test coverage.
| /// cheaper to build. | ||
| /// Returns an error if the expression's data type and the array's data type | ||
| /// are not logically equal. Null arrays are always accepted. | ||
| pub fn try_new_from_array( |
There was a problem hiding this comment.
I noticed that try_new_from_array now does its own logical type validation, but try_new still has a very similar check a few lines below.
Would it make sense to extract this into a small helper so both constructors share the same validation path and error message? That might help avoid subtle drift in the future, especially around dictionary or logical equality handling.
| } | ||
|
|
||
| #[test] | ||
| fn test_try_new_from_array_dict_haystack_int32() -> Result<()> { |
There was a problem hiding this comment.
Nice coverage here for primitive, string, and float dictionary haystacks. One gap I still see is the multi-column or StructArray path, which is what shared_bounds.rs builds for composite hash-join filters.
It could be worth adding a focused try_new_from_array test with a struct haystack to cover the exact dynamic filter path this PR is enabling.
| )) as Arc<dyn PhysicalExpr> | ||
| }; | ||
|
|
||
| // Use in_list_from_array() helper to create InList with static_filter optimization (hash-based lookup) |
There was a problem hiding this comment.
Small cohesion nit: this comment still mentions an in_list_from_array() helper, but the code now calls InListExpr::try_new_from_array(...) directly.
Updating the wording so it matches the current API would make it easier to follow when tracing the dynamic filter path.
Which issue does this PR close?
InListExprtypes and expectations #20969.Rationale for this change
To make needle/haystack more clear, add comments and with flattened we can use optimizations.
What changes are included in this PR?
try_new_from_arrayAre these changes tested?
Yes. I've added new tests and existing pass test.
Are there any user-facing changes?
Yes.
try_new_from_arraynow expects data types to be logically equal