fix: array_concat widens container variant for mixed List/LargeList inputs#21704
Merged
Jefffrey merged 1 commit intoapache:mainfrom Apr 21, 2026
Merged
Conversation
…ist/LargeList inputs `ArrayConcat::coerce_types` only coerced the base element type, leaving the outer container variant unchanged. With a mix of `List` and `LargeList` (or `FixedSizeList` and `LargeList`) inputs, `array_concat_inner` would later try to downcast the `List` array to `GenericListArray<i64>` and fail with an internal cast error. When the resolved return type is `LargeList`, also promote each input's outermost `List` to `LargeList` so the cast targets match what `array_concat_inner` expects.
Jefffrey
approved these changes
Apr 18, 2026
| } | ||
|
|
||
| fn coerce_types(&self, arg_types: &[DataType]) -> Result<Vec<DataType>> { | ||
| let base_type = base_type(&self.return_type(arg_types)?); |
Contributor
There was a problem hiding this comment.
I just realized something funny the existing implementation does; it calls return_type within coerce_types even though the docstring for return_type states that the input types are already coerced (i.e. coerce_types would have already been called) 😅
4 tasks
crm26
added a commit
to crm26/datafusion
that referenced
this pull request
Apr 20, 2026
Addresses round-2 review comments on apache#21542: - Widen container variant in coerce_types when inputs mix List and LargeList (or FixedSizeList), so mixed-type calls like `cosine_distance([1.0, 0.0], arrow_cast([0.0, 1.0], 'LargeList(Float64)'))` succeed. Follows the pattern from PR apache#21704 (ArrayConcat). - Coerce bare NULL inputs to a matching list variant so `cosine_distance(NULL, [1.0, 2.0])` returns NULL instead of erroring. - Drop the `list_cosine_distance` alias — the base name is not `array_cosine_distance`, so the `array_X` -> `list_X` convention does not apply. - Expand SLT coverage: mixed-type variants, FixedSizeList inputs, Float32 and Int64 inner types, bare NULL in each position, NULL row in a multi-row VALUES, and an unsupported-type plan error case. Dispatch fallthrough in cosine_distance_inner is now unreachable after the coerce_types widening, changed from exec_err! to internal_err!. Part of apache#21536. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
|
Thanks @hcrosse |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Rationale for this change
array_concathit an internal cast error when given a mix ofListandLargeList(orFixedSizeListandLargeList) arguments:ArrayConcat::coerce_typeswas coercing only the base element type, leaving the outer container alone. When the resolved return type isLargeList,array_concat_innerlater tries to downcast each arg toGenericListArray<i64>, which fails for anyListargument that slipped through.What changes are included in this PR?
In
ArrayConcat::coerce_types, after coercing the base type, also promote each input's outermostListtoLargeListwhen the return type is aLargeList.FixedSizeListinputs already go throughFixedSizedListToListfirst and then get promoted too. Per-arg dimensionality is preserved, so nested cases keep working withalign_array_dimensions.Are these changes tested?
Yes, added sqllogictests in
array_concat.sltcovering:List+LargeListLargeList+ListFixedSizeList+LargeListList,LargeList,ListEach one also asserts
arrow_typeof(...) = LargeList(Int64).Are there any user-facing changes?
Queries that previously returned an internal cast error now return the concatenated
LargeListas expected. No API changes.