Add try_unary_mut by viirya · Pull Request #3134 · apache/arrow-rs

viirya · 2022-11-18T08:18:15Z

Which issue does this PR close?

Closes #3133.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

tustvold

Thank you for working on this, I wonder if you have some benchmarks of how this compares to the allocating version, ideally using a modern allocator like jemalloc.

tustvold · 2022-11-21T14:54:42Z

+    pub fn as_slice(&mut self) -> (&mut [T::Native], Option<&[u8]>) {
+        (
+            self.values_builder.as_slice_mut(),
+            self.null_buffer_builder.as_slice(),
+        )


Suggested change

-    pub fn as_slice(&mut self) -> (&mut [T::Native], Option<&[u8]>) {
-        (
-            self.values_builder.as_slice_mut(),
-            self.null_buffer_builder.as_slice(),
-        )
+    pub fn slices_mut(&mut self) -> (&mut [T::Native], Option<&mut [u8]>) {
+        (
+            self.values_builder.as_slice_mut(),
+            self.null_buffer_builder.as_slice_mut(),
+        )

Or something, it seems a bit unusual for both to not be mutable.

We could then add validity_slice and validity_slice_mut for completeness

Okay. It is only because I don't need the null buffer slice to be mutable here. But yes, it looks a bit strange to have one mutable and one non-mutable. 😄
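A minimal sketch of why a combined accessor like the suggested slices_mut is convenient. `ToyBuilder` and `demo_slices` are hypothetical stand-ins written for illustration, not the arrow-rs builder types. The point is that one method can split the borrow of two fields, whereas two separate `&mut self` accessors could not be held at the same time.

```rust
/// Toy stand-in for a primitive builder (hypothetical, not the arrow-rs type).
struct ToyBuilder {
    values: Vec<i32>,
    /// Validity bitmap, one bit per value; `None` means no nulls were recorded.
    validity: Option<Vec<u8>>,
}

impl ToyBuilder {
    /// Borrow both buffers mutably at once. Splitting the borrow inside a
    /// single method lets the caller hold both slices simultaneously.
    fn slices_mut(&mut self) -> (&mut [i32], Option<&mut [u8]>) {
        (self.values.as_mut_slice(), self.validity.as_deref_mut())
    }
}

/// Overwrite one value and clear one validity bit through the two slices.
fn demo_slices() -> (Vec<i32>, Vec<u8>) {
    let mut b = ToyBuilder {
        values: vec![1, 2, 3],
        validity: Some(vec![0b0000_0111]),
    };
    let (values, validity) = b.slices_mut();
    values[1] = 20;
    if let Some(bits) = validity {
        bits[0] &= !(1 << 2); // mark index 2 as null
    }
    (b.values, b.validity.unwrap())
}
```

Returning `Option<&mut [u8]>` rather than `Option<&[u8]>` costs nothing for callers that only read the bitmap, which is one way to read the "for completeness" argument above.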

tustvold · 2022-11-21T15:08:52Z

+    pub fn try_unary_mut<F, E>(
+        self,
+        op: F,
+    ) -> Result<PrimitiveArray<T>, Result<PrimitiveArray<T>, E>>


Suggested change

-    ) -> Result<PrimitiveArray<T>, Result<PrimitiveArray<T>, E>>
+    ) -> Result<Result<PrimitiveArray<T>, E>, PrimitiveArray<T>>

I think the reverse result order makes more sense, as it represents the order of fallibility. Failing to convert to a builder is the first error case; within the success case we then have the error case of ArrowError.

I think it will also make it easier to implement a fallback, e.g.

arr.try_unary_mut(&mut op).unwrap_or_else(|arr| arr.try_unary(&mut op))?

Makes sense.
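The nested result order discussed above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the arrow-rs implementation: `Arc<Vec<i32>>` stands in for an array whose buffer may be shared, and the free functions `try_unary_mut`, `try_unary` and `add_one` are hypothetical names that only mirror the shape of the API. `Arc::try_unwrap` is a real standard-library call that hands the `Arc` back when it is shared, which plays the role of "cannot convert to a builder".

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `try_unary_mut`: mutate in place when the buffer
/// is uniquely owned, otherwise hand the input back untouched.
/// Outer `Err` = buffer is shared; inner `Err` = `op` failed.
fn try_unary_mut<E>(
    data: Arc<Vec<i32>>,
    op: impl Fn(i32) -> Result<i32, E>,
) -> Result<Result<Arc<Vec<i32>>, E>, Arc<Vec<i32>>> {
    // If another `Arc` points at the buffer, `?` returns the original `Arc`.
    let mut values = Arc::try_unwrap(data)?;
    for v in values.iter_mut() {
        match op(*v) {
            Ok(new) => *v = new,
            // The partially updated buffer is dropped on failure.
            Err(e) => return Ok(Err(e)),
        }
    }
    Ok(Ok(Arc::new(values)))
}

/// Allocating counterpart: always builds a fresh buffer.
fn try_unary<E>(
    data: &Arc<Vec<i32>>,
    op: impl Fn(i32) -> Result<i32, E>,
) -> Result<Arc<Vec<i32>>, E> {
    let out: Result<Vec<i32>, E> = data.iter().map(|v| op(*v)).collect();
    out.map(Arc::new)
}

/// Fallback pattern: try in place first, allocate only if the buffer is shared.
fn add_one(data: Arc<Vec<i32>>) -> Result<Arc<Vec<i32>>, String> {
    let op = |v: i32| v.checked_add(1).ok_or_else(|| format!("overflow on {}", v));
    try_unary_mut(data, &op).unwrap_or_else(|shared| try_unary(&shared, &op))
}
```

With the outer error carrying the untouched input, a single `unwrap_or_else` is enough to fall back to the allocating path, and the caller is left with an ordinary `Result` to propagate.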

tustvold · 2022-11-21T15:09:30Z

    }
+
+    #[inline]
+    pub fn as_slice(&self) -> Option<&[u8]> {


👍

Could also add an as_slice_mut for completeness

viirya · 2022-11-22T08:48:24Z

> Thank you for working on this, I wonder if you have some benchmarks of how this compares to the allocating version, ideally using a modern allocator like jemalloc.

No benchmarks yet. I think it should be better and hard to think that it could be slower. I will revise this based on the comments above and run a benchmark later. Thanks.

tustvold · 2022-11-22T08:49:41Z

> I think it should be better and hard to think that it could be slower

Yeah, it is non-trivial additional complexity, so I'm curious what difference it makes 😅

viirya · 2022-11-23T06:23:15Z

Updated the slice-related functions.

I will update the result type later today or tomorrow. (I've been occupied by other matters in recent days, so my responses will be a bit slow.)

viirya · 2022-11-27T23:18:09Z

I ran a simple benchmark which calls add or add_mut on two primitive arrays, e.g.,

let mut arr_a = create_primitive_array::<Float32Type>(5120000, 0.0);

for _ in 0..10 {
    let arr_b = create_primitive_array::<Float32Type>(5120000, 0.0);
    // arr_a = add(&arr_a, &arr_b).unwrap();
    arr_a = add_mut(arr_a, &arr_b).unwrap().unwrap();
}

array add               time:   [675.93 ms 676.09 ms 676.28 ms]
                        change: [-3.2262% -3.0312% -2.8816%] (p = 0.00 < 0.05)
                        Performance has improved.
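For readers unfamiliar with the two variants being compared, here is a toy sketch of the difference, assuming plain `Vec<f32>` buffers in place of Arrow arrays. The functions `add` and `add_mut` below are simplified stand-ins that share only their names with the kernels in the benchmark; they ignore nulls and error handling.

```rust
/// Allocating version: builds a fresh output buffer on every call.
fn add(a: &[f32], b: &[f32]) -> Vec<f32> {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

/// In-place version: takes ownership of `a` and overwrites its buffer,
/// so no new allocation is needed for the result.
fn add_mut(mut a: Vec<f32>, b: &[f32]) -> Vec<f32> {
    assert_eq!(a.len(), b.len());
    for (x, y) in a.iter_mut().zip(b) {
        *x += *y;
    }
    a
}
```

Both versions still read every element of both inputs and write every element of the output; the in-place version only avoids allocating the output buffer.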

tustvold · 2022-11-28T12:23:47Z

This seems like quite a lot of additional complexity for only a 3% performance improvement. What do you think about providing just try_unary_mut and leaving it at that? I'm thinking about the number of additional _mut arithmetic kernels this is going to produce, especially given that most of these kernels would be trivial for users to implement themselves, should they wish to.

viirya · 2022-11-28T16:46:21Z

Yeah, I thought about this option too, and I think it makes sense. I can remove the _mut arithmetic kernels and keep only try_unary_mut, on top of which these kernels can be implemented.

tustvold · 2022-11-28T17:34:28Z

    }
+
+    /// Returns the current values buffer as a slice
+    #[allow(dead_code)]


I think this shouldn't be necessary, as they are public

Yeah, removed.

ursabot · 2022-11-28T21:25:20Z

Benchmark runs are scheduled for baseline = 6f41b95 and contender = 5d84746. 5d84746 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

github-actions Bot added the arrow Changes to the arrow crate label Nov 18, 2022

viirya force-pushed the add_scalar_mut branch from 0d17644 to 4261762 Compare November 18, 2022 08:39

Add add_scalar_mut and add_scalar_checked_mut

c98d21a

viirya force-pushed the add_scalar_mut branch from 4261762 to c98d21a Compare November 18, 2022 08:55

tustvold reviewed Nov 21, 2022

tustvold mentioned this pull request Nov 21, 2022

Add binary_mut and try_binary_mut #3144

Merged

Update slice related functions for completeness.

1b68fdf

viirya added 3 commits November 26, 2022 17:42

Merge remote-tracking branch 'upstream/master' into add_scalar_mut

0fdb8f8

Change result type

338e605

Update API doc

1b3be81

viirya requested a review from tustvold November 27, 2022 03:16

viirya changed the title from "Add add_scalar_mut and add_scalar_checked_mut" to "Add try_unary_mut" Nov 28, 2022

Remove _mut arithmetic kernels

9f07fa6

viirya force-pushed the add_scalar_mut branch from f32042c to 9f07fa6 Compare November 28, 2022 17:11

tustvold approved these changes Nov 28, 2022

For review

9220b76

tustvold merged commit 5d84746 into apache:master Nov 28, 2022

tustvold mentioned this pull request Feb 13, 2023

Support unary_dyn_mut in arth #3708

Closed
