Skip to content

Proposal: randfloat() for sampling from _all_ floats in [0,1)? #72

@milankl

Description

@milankl

As it's written in the documentation, the standard approach of for rand(Float32)/rand(Float64)

An UInt64 will firstly be converted to a Float64 that is perfectly uniformly distributed in [1.0, 2.0), and then minus one. This is a very fast approach, but not completely ideal, since the one bit is wasted. The current default RNG in Base.Random library does the same thing, so it also causes some tricky problems

is probably good enough in most cases, but not perfect. For Float32 it samples only from 2^23 = 8,388,608 floats in [+0,1) (as that’s the number of floats in [1,2)) whereas there are 1,065,353,216 (=2^30-2^23) in that range in total. So rand(Float32) samples from about 1% possible floats, and rand(Float64) from ~0.1% (2^52/(2^62-2^52)). I propose a function that solves that problem by drawing the exponent bits at correct chances first and then combine them with significant bits, similar to the following

using RandomNumbers.Xorshifts
const Xoroshiro128P = Xoroshiro128Plus()

"""Random number generator for floats in [0,1) that samples from *all* floats.""" 
function randfloat(::Type{Float32})
    # create exponent bits in 0000_0000 to 0111_1111
    # at following chances
    # e=01111110 at 50.0% for [0.5,1.0)
    # e=01111101 at 25.0% for [0.25,0.5)
    # e=01111100 at 12.5% for [0.125,0.25)
    # ... etc
    ui = rand(Xoroshiro128P,UInt128) >> 1 | 0x1 # first bit always 0, last always 1
    lz = leading_zeros(ui) % UInt8              # 2 leading zeros is 50%, 3 is 25%, etc.  
    e = ((0x7f - lz) % UInt32) << 23            # convert lz to exponent bits of float32

    # significant bits
    f = rand(Xoroshiro128P,UInt32) >> 9 
    
    # combine exponent and significand (sign always 0)
    return reinterpret(Float32,e | f)
end

I don't know whether something like this exists already in Julia somewhere, I don't even know whether this is an established algorithm. I thought it would be nice to have this function and different to implemented here, it could take the actual RNG as an argument. I've just used Xoroshiro128+ as it's fast. I'm happy to implement a Float64 version too, this requires technically a 1024bit UInt, but in practice one could first draw a UInt64 and only if 0x0 is drawn (does that actually happen?) one draws another UInt64 and count the leading zeros again. The above implementation seems to be reasonably fast, about 3x slower than Xoroshiro128+ with the "[1,2)-1"-technique but still faster than Julia's Base rand(Float32):

julia> @btime randfloat($Float32)   # this approach
  4.603 ns (0 allocations: 0 bytes)
julia> @btime rand($Float32)          # Julia's Base MT19337
  6.675 ns (0 allocations: 0 bytes)
julia> @btime rand($Xoroshiro128P,$Float32)    # Xoroshiro128+ with [1,2)-1 technique
  1.544 ns (0 allocations: 0 bytes)

Again, there might be ways to speed this up further by first drawing a UInt64 and only if that's 0x0 draw another one.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions