-
Notifications
You must be signed in to change notification settings - Fork 1.8k
feat: support array_resize
#8744
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, but I have a different idea about array_size.
We can apply the combination of existing array_length, array_slice, array_repeat and array_concat to resize the array. But I'm not sure is it faster or slower, is it easier to maintain or not. How do you think about this approach @Weijun-H ?
array_concat(array_slice(array, 0, min(array_length, count)), array_repeat(default_array, max(count-array_length), 0))
|
|
||
| for (index, arr) in array.iter().enumerate() { | ||
| let count = count_array.value(index).to_usize().ok_or_else(|| { | ||
| DataFusionError::Execution( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: exec_err
| array: &GenericListArray<O>, | ||
| count_array: &Int64Array, | ||
| field: &FieldRef, | ||
| new_element: Option<ArrayRef>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default_element might be more concise?
I think this approach is not a better idea, as supporting the column for |
|
thanks @Weijun-H I'll review it tomorrow more closely if no one else beats me. |
| let new_rows = vec![&default_value; count - remain_count]; | ||
| rows.extend(new_rows); | ||
|
|
||
| let last_offset = offsets.last().copied().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please handle the potential error on .unwrap
comphead
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Weijun-H tests are awesome.
Im just wondering could we just concat 2 arrays(old+new one with default values) to do a resize?
I have slight concern that performance on current implementation, maybe I'm wrong
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment was marked as outdated.
This comment was marked as outdated.
e66acd6 to
cba3269
Compare
|
|
||
| for (row_index, offset_window) in array.offsets().windows(2).enumerate() { | ||
| let count = count_array.value(row_index).to_usize().ok_or_else(|| { | ||
| exec_datafusion_err!("array_resize: failed to convert size to usize") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does type conversion more like internal_err?
wasn't expected/anticipated by the implementation and that is most likely a bug (the error message even encourages users to open a bug report)
unlike array.len() != number, which is error from user.
jayzhan211
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better with MutableArray! LGTM
comphead
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which issue does this PR close?
Closes #7194
Rationale for this change
array_resizeLargeListWhat changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?