-
Notifications
You must be signed in to change notification settings - Fork 4
Description
Looking at the source code:
https://docs.rs/slice-group-by/0.2.6/src/slice_group_by/linear_group/linear_group_by.rs.html#131-136
It seems this crate makes extensive use of unsafe.
However, it also looks like this unsafe is unnecessary as it creates no performance improvement:
Representation with the two pointers is exactly the memory representation of a slice, so it looks like this:
pub struct LinearGroupBy<'a, T: 'a, P> {
ptr: *const T,
end: *const T,
predicate: P,
_phantom: marker::PhantomData<&'a T>,
}could be replaced by this:
pub struct LinearGroupBy<'a, T: 'a, P> {
elements_left: &'a [T],
predicate: P,
}while this:
let mut i = 0;
let mut ptr = self.ptr;
// we use an unsafe block to avoid bounds checking here.
// this is safe because the only thing we do here is to get
// two elements at `ptr` and `ptr + 1`, bounds checking is done by hand.
// we need to get *two* contiguous elements so we check that:
// - the first element is at the `end - 1` position because
// - the second one will be read from `ptr + 1` that must
// be lower or equal to `end`
unsafe {
while ptr != self.end.sub(1) {
let a = &*ptr;
ptr = ptr.add(1);
let b = &*ptr;
i += 1;
if !(self.predicate)(a, b) {
let slice = $mkslice(self.ptr, i);
self.ptr = ptr;
return Some(slice)
}
}
}could be rewritten using for (i,s) in self.elements_left.windows(2).enumerate() and slice::split_at/slice::split_at_mut.
(and obviously this:
// `i` is either `0` or the `slice length - 1` because either:
// - we have not entered the loop and so `i` is equal to `0`
// the slice length is necessarily `1` because we ensure it is not empty
// - we have entered the loop and we have not early returned
// so `i` is equal to the slice `length - 1`
let slice = unsafe { $mkslice(self.ptr, i + 1) };
self.ptr = self.end;
Some(slice)by this:
Some(std::mem::replace(&mut self.elements_left, &[])))
I don't think this change would create additional runtime cost, as most double-bounds-checking (or constant-bounds-checking in particular with the constants 2 and 1 here) usually gets optimized-out by the compiler, and I would feel much more comfortable using this crate, as I wouldn't like to introduce so much unsafe to our codebase for such a simple algorithm.