std.equalRange: Compute lower and upper bounds simultaneously. #21290

Closed
LucasSantos91 wants to merge 221 commits into ziglang:master from LucasSantos91:equalRange

Conversation

@LucasSantos91
Contributor

The current implementation of equalRange just calls lowerBound and upperBound, but a lot of the work done by these two functions can be shared. Specifically, each iteration tells us whether the lower bound or the upper bound can be tightened. This leads to fewer iterations and, since there is one comparison per iteration, fewer comparisons.
Implementation adapted from GCC (https://github.com/gcc-mirror/gcc/blob/519ec1cfe9d2c6a1d06709c52cb103508d2c42a7/libstdc%2B%2B-v3/include/bits/stl_algo.h#L2063).
This sample demonstrates the difference between the current implementation and mine:

fn S(comptime T: type) type {
    return struct {
        needle: T,
        count: *usize,

        pub fn order(context: @This(), item: T) std.math.Order {
            context.count.* += 1;
            return std.math.order(item, context.needle);
        }
        pub fn orderLength(context: @This(), item: []const u8) std.math.Order {
            context.count.* += 1;
            return std.math.order(item.len, context.needle);
        }
    };
}
pub fn main() !void {
    var count: usize = 0;

    try std.testing.expectEqual(.{ 0, 0 }, equalRange(i32, &[_]i32{}, S(i32){ .needle = 0, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 0, 0 }, equalRange(i32, &[_]i32{ 2, 4, 8, 16, 32, 64 }, S(i32){ .needle = 0, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 0, 1 }, equalRange(i32, &[_]i32{ 2, 4, 8, 16, 32, 64 }, S(i32){ .needle = 2, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 2, 2 }, equalRange(i32, &[_]i32{ 2, 4, 8, 16, 32, 64 }, S(i32){ .needle = 5, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 2, 3 }, equalRange(i32, &[_]i32{ 2, 4, 8, 16, 32, 64 }, S(i32){ .needle = 8, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 5, 6 }, equalRange(i32, &[_]i32{ 2, 4, 8, 16, 32, 64 }, S(i32){ .needle = 64, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 6, 6 }, equalRange(i32, &[_]i32{ 2, 4, 8, 16, 32, 64 }, S(i32){ .needle = 100, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 2, 6 }, equalRange(i32, &[_]i32{ 2, 4, 8, 8, 8, 8, 15, 22 }, S(i32){ .needle = 8, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 2, 2 }, equalRange(u32, &[_]u32{ 2, 4, 8, 16, 32, 64 }, S(u32){ .needle = 5, .count = &count }, S(u32).order));
    try std.testing.expectEqual(.{ 1, 1 }, equalRange(f32, &[_]f32{ -54.2, -26.7, 0.0, 56.55, 100.1, 322.0 }, S(f32){ .needle = -33.4, .count = &count }, S(f32).order));
    try std.testing.expectEqual(.{ 3, 5 }, equalRange(
        []const u8,
        &[_][]const u8{ "Mars", "Venus", "Earth", "Saturn", "Uranus", "Mercury", "Jupiter", "Neptune" },
        S(usize){ .needle = 6, .count = &count },
        S(usize).orderLength,
    ));

    std.debug.print("Count: {}\n", .{count});
}

For each comparison, we bump the count. With the current implementation, we get 57 comparisons. With mine, we get 43.

This optimization is orthogonal to the left-bias proposed in #21278.

alexrp and others added 22 commits August 31, 2024 03:31
The kernel does define the struct, it just doesn't use it. Yet both glibc and
musl expose it directly as their public stat struct, and std.c takes it from
std.os.linux. So just define it after all.
Both glibc and musl use time64 as the base ABI for riscv32. This fixes the
`sleep` test in `std.time` hanging forever due to the libc functions reading
bogus values.
This should eventually be converted to the void/{} pattern along with the other
syscalls that are compile errors for riscv32.
There are targets (e.g. MIPS) where PIC actually affects assembler behavior.
This commit modifies the representation of the AIR `switch_br`
instruction to represent ranges in cases. Previously, Sema emitted
different AIR in the case of a range, where the `else` branch of the
`switch_br` contained a simple `cond_br` for each such case which did a
simple range check (`x > a and x < b`). Not only does this add
complexity to Sema, which we would like to minimize, but it also gets in
the way of the implementation of #8220. That proposal turns certain
`switch` statements into a looping construct, and for optimization
purposes, we want to lower this to AIR fairly directly (i.e. without
involving a `loop` instruction). That means we would ideally like a
single instruction to represent the entire `switch` statement, so that
we can dispatch back to it with a different operand as in #8220. This is
not really possible to do correctly under the status quo system.

This commit implements lowering of this new `switch_br` usage in the
LLVM and C backends. The C backend just turns any case containing ranges
entirely into conditionals, as before. The LLVM backend is a little
smarter, and puts scalar items into the `switch` instruction, only using
conditionals for the range cases (which direct to the same bb). All
remaining self-hosted backends are temporarily regressed in the presence
of switch range cases. This functionality will be restored for at least
the x86_64 backend before merge.
This commit introduces a new AIR instruction, `repeat`, which causes
control flow to move back to the start of a given AIR loop. `loop`
instructions will no longer automatically perform this operation after
control flow reaches the end of the body.

The motivation for making this change now was really just consistency
with the upcoming implementation of #8220: it wouldn't make sense to
have this feature work significantly differently. However, there were
already some TODOs kicking around which wanted this feature. It's useful
for two key reasons:

* It allows loops over AIR instruction bodies to loop precisely until
  they reach a `noreturn` instruction. This allows for tail calling a
  few things, and avoiding a range check on each iteration of a hot
  path, plus gives a nice assertion that validates AIR structure a
  little. This is a very minor benefit, which this commit does apply to
  the LLVM and C backends.

* It should allow for more compact ZIR and AIR to be emitted by having
  AstGen emit `repeat` instructions more often rather than having
  `continue` statements `break` to a `block` which is *followed* by a
  `repeat`. This is done in status quo because `repeat` instructions
  only ever cause the direct parent block to repeat. Now that AIR is
  more flexible, this flexibility can be pretty trivially extended to
  ZIR, and we can then emit better ZIR. This commit does not implement
  this.

Support for this feature is currently regressed on all self-hosted
native backends, including x86_64. This support will be added where
necessary before this branch is merged.
The parse of `fn foo(a: switch (...) { ... })` was previously handled
incorrectly; `a` was treated as both the parameter name and a label.

The same issue exists for `for` and `while` expressions -- they should
be fixed too, and the grammar amended appropriately. This commit does
not do this: it only aims to avoid introducing regressions from labeled
switch syntax.
`.loop` is also a block, so the block_depth must be stored *after* block
creation, ensuring a correct block_depth to jump back to when receiving
`.repeat`.

This also un-regresses `switch_br` which now correctly handles ranges
within cases. It supports it for both jump tables as well as regular
conditional branches.
This does *not* yet implement the new `loop_switch_br` instruction.
Also, don't use the special switch lowering for errors if the switch
is labeled; this isn't currently supported. Related: #20627.
@Olvilock

Olvilock commented Sep 3, 2024

After an equal element is found, you discard the bounds already computed in low and high, which amounts to running binarySearch first and then calling lowerBound and upperBound on both halves. This can still be improved with little effort:

pub fn equalRange(
    comptime T: type,
    items: []const T,
    context: anytype,
    comptime compareFn: fn (@TypeOf(context), T) std.math.Order,
) struct { usize, usize } {
    var low: usize = 0;
    var high: usize = items.len;

    while (low < high) {
        const mid = low + (high - low) / 2;
        switch (compareFn(context, items[mid])) {
            .lt => low = mid + 1,
            .gt => high = mid,
            .eq => return .{
                low + lowerBound(T, items[low..mid], context, compareFn),
                mid + upperBound(T, items[mid..high], context, compareFn),
            },
        }
    }
    return .{ low, low };
}
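
For anyone who wants to try the idea in isolation, here is a minimal, self-contained sketch. It does not use `std.sort`; `Ctx`, `lowerBoundSketch`, `upperBoundSketch` and `equalRangeShared` are hypothetical names made up for this example, and the comparator follows the same convention as `S.order` in the PR description (order of the item relative to the needle). It counts comparisons for the shared search and for two independent searches on one of the arrays from the sample.

```zig
const std = @import("std");

/// Hypothetical counting context: `needle` is the value searched for,
/// `count` is bumped once per comparison.
const Ctx = struct {
    needle: i32,
    count: *usize,

    /// Order of `item` relative to `needle` (same convention as `S.order`).
    fn order(ctx: Ctx, item: i32) std.math.Order {
        ctx.count.* += 1;
        return std.math.order(item, ctx.needle);
    }
};

/// First index whose element is not less than the needle.
fn lowerBoundSketch(items: []const i32, ctx: Ctx) usize {
    var low: usize = 0;
    var high: usize = items.len;
    while (low < high) {
        const mid = low + (high - low) / 2;
        if (ctx.order(items[mid]) == .lt) {
            low = mid + 1;
        } else {
            high = mid;
        }
    }
    return low;
}

/// First index whose element is greater than the needle.
fn upperBoundSketch(items: []const i32, ctx: Ctx) usize {
    var low: usize = 0;
    var high: usize = items.len;
    while (low < high) {
        const mid = low + (high - low) / 2;
        if (ctx.order(items[mid]) == .gt) {
            high = mid;
        } else {
            low = mid + 1;
        }
    }
    return low;
}

/// Narrow [low, high) with one shared search; once an equal element is
/// found, finish both bounds inside the already narrowed window.
fn equalRangeShared(items: []const i32, ctx: Ctx) struct { usize, usize } {
    var low: usize = 0;
    var high: usize = items.len;
    while (low < high) {
        const mid = low + (high - low) / 2;
        switch (ctx.order(items[mid])) {
            .lt => low = mid + 1,
            .gt => high = mid,
            .eq => return .{
                low + lowerBoundSketch(items[low..mid], ctx),
                mid + 1 + upperBoundSketch(items[mid + 1 .. high], ctx),
            },
        }
    }
    return .{ low, low };
}

pub fn main() void {
    const items = [_]i32{ 2, 4, 8, 8, 8, 8, 15, 22 };

    var shared_count: usize = 0;
    const shared = equalRangeShared(&items, Ctx{ .needle = 8, .count = &shared_count });

    var naive_count: usize = 0;
    const naive_ctx = Ctx{ .needle = 8, .count = &naive_count };
    const naive_low = lowerBoundSketch(&items, naive_ctx);
    const naive_high = upperBoundSketch(&items, naive_ctx);

    std.debug.print("shared: {d}..{d} in {d} comparisons\n", .{ shared[0], shared[1], shared_count });
    std.debug.print("naive:  {d}..{d} in {d} comparisons\n", .{ naive_low, naive_high, naive_count });
}
```

The only difference from the snippet above is that the upper-bound search starts at `mid + 1` instead of `mid`, since `items[mid]` is already known to be equal. Both variants return the same range.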

When calling `lowerBound` and `upperBound`, the previous implementation discarded the low and high bounds that had already been computed.
Thanks, @Olvilock.
@LucasSantos91
Contributor Author

Thanks, @Olvilock.

mikdusan and others added 4 commits September 3, 2024 22:56
Consequently, `AstGen.ret()` now passes the error code to
`.defer_error_code`. Previously, the error union value was passed.

closes #20371
Based on:

* `include/elf/common.h` in binutils
* `include/uapi/linux/elf-em.h` in Linux
* https://www.sco.com/developers/gabi/latest/ch4.eheader.html

I opted to use the tag naming of binutils because it seems to be by far the most
complete and authoritative source at this point in time.
andrewrk and others added 15 commits September 19, 2024 18:20
oops, I forgot to enable LLVM assertions though
Windows does not really have weak symbols. So when we bootstrap with `zig cc`
and link both Zig's compiler-rt and the CBE's `compiler_rt.c` we end up with
duplicate symbol errors at link time.
This time the LLVM builds have assertions enabled.

Also the zig builds support `-rtlib=none` for disabling compiler-rt.
This reverts commit 7e66b6d.

I don't think this is needed, I don't get any errors locally when I
bootstrap windows without this change.
This works around a problem that started happening with LLD around
version 18.1.8:

```
lld-link: error: duplicate symbol: .weak.__nexf2.default
>>> defined at CMakeFiles/zig2.dir/compiler_rt.c.obj
>>> defined at compiler_rt.lib(compiler_rt.lib.obj)
```
Upgrades the LLVM, Clang, and LLD dependencies to LLVM 19.x

Related to #16270

Big thanks to Alex Rønne Petersen for doing the bulk of the upgrade work
in this branch.
@Olvilock

Olvilock commented Sep 20, 2024

Example test in this gist (run on zig master).
It seeks the equal range of element 5 in the array [_]u32{ 2, 3, 4, 5, 5 }.

The problem is that the current implementations of lowerBound and upperBound do not match the contract in the docs (the assumed order is the opposite); the tests in std.sort for those functions use precisely the order defined in struct S.
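
To illustrate why the comparator orientation matters, here is a small standalone sketch. It is not the gist and does not call `std.sort`; `itemVsNeedle`, `needleVsItem`, `lowerBoundSketch` and `upperBoundSketch` are hypothetical names for this example, and the searches are written under the assumption that the comparator returns the order of the item relative to the needle. Passing the opposite orientation to the same searches makes the element look absent.

```zig
const std = @import("std");

const Order = std.math.Order;

/// Orientation assumed by the searches below: order of `item` relative to `needle`.
fn itemVsNeedle(needle: u32, item: u32) Order {
    return std.math.order(item, needle);
}

/// The opposite orientation: order of `needle` relative to `item`.
fn needleVsItem(needle: u32, item: u32) Order {
    return std.math.order(needle, item);
}

/// First index whose element is not less than the needle,
/// assuming `cmp` follows the `itemVsNeedle` orientation.
fn lowerBoundSketch(items: []const u32, needle: u32, comptime cmp: fn (u32, u32) Order) usize {
    var low: usize = 0;
    var high: usize = items.len;
    while (low < high) {
        const mid = low + (high - low) / 2;
        if (cmp(needle, items[mid]) == .lt) {
            low = mid + 1;
        } else {
            high = mid;
        }
    }
    return low;
}

/// First index whose element is greater than the needle,
/// assuming `cmp` follows the `itemVsNeedle` orientation.
fn upperBoundSketch(items: []const u32, needle: u32, comptime cmp: fn (u32, u32) Order) usize {
    var low: usize = 0;
    var high: usize = items.len;
    while (low < high) {
        const mid = low + (high - low) / 2;
        if (cmp(needle, items[mid]) == .gt) {
            high = mid;
        } else {
            low = mid + 1;
        }
    }
    return low;
}

pub fn main() void {
    const items = [_]u32{ 2, 3, 4, 5, 5 };

    // Matching orientation: the range covers both occurrences of 5.
    std.debug.print("item vs needle: {d}..{d}\n", .{
        lowerBoundSketch(&items, 5, itemVsNeedle),
        upperBoundSketch(&items, 5, itemVsNeedle),
    });

    // Opposite orientation: the same searches collapse to an empty range.
    std.debug.print("needle vs item: {d}..{d}\n", .{
        lowerBoundSketch(&items, 5, needleVsItem),
        upperBoundSketch(&items, 5, needleVsItem),
    });
}
```

With the matching orientation the range is 3..5; with the flipped comparator both searches return 0, so any test that uses one orientation against an implementation that assumes the other will fail.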

@LucasSantos91
Contributor Author

Oops, I messed up the rebase. I will fix the bug mentioned by @Olvilock and reopen.

andrewrk pushed a commit that referenced this pull request Sep 23, 2024
The current implementation of `equalRange` just calls `lowerBound` and `upperBound`, but a lot of the work done by these two functions can be shared. Specifically, each iteration tells us whether the lower bound or the upper bound can be tightened. This leads to fewer iterations and, since there is one comparison per iteration, fewer comparisons.
Implementation adapted from [GCC](https://github.com/gcc-mirror/gcc/blob/519ec1cfe9d2c6a1d06709c52cb103508d2c42a7/libstdc%2B%2B-v3/include/bits/stl_algo.h#L2063).
This sample demonstrates the difference between the current implementation and mine:

```zig
fn S(comptime T: type) type {
    return struct {
        needle: T,
        count: *usize,

        pub fn order(context: @This(), item: T) std.math.Order {
            context.count.* += 1;
            return std.math.order(item, context.needle);
        }
        pub fn orderLength(context: @This(), item: []const u8) std.math.Order {
            context.count.* += 1;
            return std.math.order(item.len, context.needle);
        }
    };
}
pub fn main() !void {
    var count: usize = 0;

    try std.testing.expectEqual(.{ 0, 0 }, equalRange(i32, &[_]i32{}, S(i32){ .needle = 0, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 0, 0 }, equalRange(i32, &[_]i32{ 2, 4, 8, 16, 32, 64 }, S(i32){ .needle = 0, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 0, 1 }, equalRange(i32, &[_]i32{ 2, 4, 8, 16, 32, 64 }, S(i32){ .needle = 2, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 2, 2 }, equalRange(i32, &[_]i32{ 2, 4, 8, 16, 32, 64 }, S(i32){ .needle = 5, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 2, 3 }, equalRange(i32, &[_]i32{ 2, 4, 8, 16, 32, 64 }, S(i32){ .needle = 8, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 5, 6 }, equalRange(i32, &[_]i32{ 2, 4, 8, 16, 32, 64 }, S(i32){ .needle = 64, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 6, 6 }, equalRange(i32, &[_]i32{ 2, 4, 8, 16, 32, 64 }, S(i32){ .needle = 100, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 2, 6 }, equalRange(i32, &[_]i32{ 2, 4, 8, 8, 8, 8, 15, 22 }, S(i32){ .needle = 8, .count = &count }, S(i32).order));
    try std.testing.expectEqual(.{ 2, 2 }, equalRange(u32, &[_]u32{ 2, 4, 8, 16, 32, 64 }, S(u32){ .needle = 5, .count = &count }, S(u32).order));
    try std.testing.expectEqual(.{ 1, 1 }, equalRange(f32, &[_]f32{ -54.2, -26.7, 0.0, 56.55, 100.1, 322.0 }, S(f32){ .needle = -33.4, .count = &count }, S(f32).order));
    try std.testing.expectEqual(.{ 3, 5 }, equalRange(
        []const u8,
        &[_][]const u8{ "Mars", "Venus", "Earth", "Saturn", "Uranus", "Mercury", "Jupiter", "Neptune" },
        S(usize){ .needle = 6, .count = &count },
        S(usize).orderLength,
    ));

    std.debug.print("Count: {}\n", .{count});
}
```
For each comparison, we bump the count. With the current implementation, we get 57 comparisons. With mine, we get 43.

With contributions from @Olvilock.
This is my second attempt at this, since I messed up the [first one](#21290).
DivergentClouds pushed a commit to DivergentClouds/zig that referenced this pull request Sep 24, 2024
richerfu pushed a commit to richerfu/zig that referenced this pull request Oct 28, 2024