Skip to content

Comments

[WIP] std.range: implement bitwise adapter over Integral type Ranges#4927

Merged
andralex merged 6 commits intodlang:masterfrom
edi33416:bitwise_adapter
Dec 19, 2016
Merged

[WIP] std.range: implement bitwise adapter over Integral type Ranges#4927
andralex merged 6 commits intodlang:masterfrom
edi33416:bitwise_adapter

Conversation

@edi33416
Copy link
Contributor

Add a generic bitwise adapter to std.range

@jmdavis
Copy link
Member

jmdavis commented Nov 30, 2016

I don't have time to look at this in depth right now, but I would point out that it would be more appropriate for something like this to be in std.bitmanip.

@andralex
Copy link
Member

andralex commented Dec 5, 2016

@jmdavis I was joking to @edi33416 that an older (Google?) paper argued that tag-based systems are better and more intuitive to humans than hierarchical systems. This seems to be a good anecdote supporting that. Bitwise defines a range interface on top of a range interface, so it's about std.range. It also does a bunch of bit-level access, so it's about std.bitmanip. We have a hierarchical package system so whatever we choose won't be the perfect fit.

@jmdavis
Copy link
Member

jmdavis commented Dec 6, 2016

@andralex Well, since a large portion of Phobos is just code that creates ranges of one sort or another (std.algorithm most notably), if the fact that something is a range puts it in std.range, a large portion of Phobos belongs in there. Obviously, it's sometimes non-obvious or debatable where something should go, but it seems that std.range has mostly been for range primitives and stuff that are building blocks for ranges, which this isn't. std.bitmanip, on the other hand, is where all of the bit manipulation / bitwise stuff has gone thus far. And since that's the sort of thing that this is doing, I think that it makes a lot more sense for it to be in std.bitmanip. So, while it's certainly debatable, I would have thought that it was fairly clear in this case, and I'm inclined to resist putting stuff in std.range just because it involves a range, particularly since ranges are so pervasive in Phobos and only becoming more so.

Copy link
Member

@andralex andralex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, ready for one more pass with an eye for minimalism.

}

/**
Bitwise adapter over Integral type Ranges. Consumes the range elements bit by bit.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integral and Ranges in lowercase

if (isInputRange!R && isIntegral!(ElementType!R))
{
alias ER = ElementType!R;
alias UER = Unsigned!ER;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We try to avoid acronyms that lead to multi-capitalized names. I suggest

private alias ElementType = ElementType!R;
private alias MaskType = Unsigned!ElementType;


private R parent;
private UER mask;
private int maskPos;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefer uint for this kind of stuff


private enum UER bitsNum = ER.sizeof * 8;

this()(auto ref R range)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pass by value should suffice here, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used a pass by ref here in order to copy the range once instead of twice (first when it gets passed to the ctor and the second time when it gets copied into parent)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler should be eliding the copies anyway. You shouldn't need to worry about it.

}
}

bool empty() const
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you put const on this method you force parent.empty to also be const. If you leave no const, deduction will take care of it appropriately (making it const wherever possible).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless something has changed recently, const is not inferred. So, if you want it to be const or not based on the template argument, then you have to use introspection and static if. That being said, it still shouldn't just be marked as const, because then it won't work with many ranges.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const inferrence does have a workaround:

bool empty(this This)() { ... }

This will only compile on a const wrapper range if all the calls can be made on the lower range. It's more crude than defining or not defining the method as const, but it would work.

However, my opinion is that const on ranges is useless anyway -- you can't iterate a const range. So I would recommend just not putting const here.

private void initPrivates()
{
// Helper function to avoid code duplication
mask = cast(UER)(1) << (bitsNum - 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UER(1) should suffice

{
typeof(this) save()
{
return this;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you need to save parent as well:

typeof(this) save()
{
    auto result = this;
    result.parent = parent.save;
    return result;
}


bool back()
{
return cast(bool)(backAccumulator & backMask);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to avoid using cast and triggering all greps for unsafe code, use return (backAccumulator & backMask) != 0;

Also, here you can use const on the method because you don't depend on any templated stuff.

}
else
{
mask <<= 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that mask or backMask?

import core.exception : RangeError;
import std.exception : enforce;

enforce(n < len, new RangeError);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inside contracts always use assert, never use enforce

@andralex
Copy link
Member

andralex commented Dec 6, 2016

@jmdavis how about this: a close review of the next round of this code will amplify your voting rights by 10x :)

@andralex
Copy link
Member

andralex commented Dec 6, 2016

(BTW I approve the feature.)

@jmdavis
Copy link
Member

jmdavis commented Dec 6, 2016

@jmdavis how about this: a close review of the next round of this code will amplify your voting rights by 10x :)

LOL. Okay.

@9il
Copy link
Member

9il commented Dec 7, 2016

+1 for @jmdavis comment about std.bitmanip

@9il
Copy link
Member

9il commented Dec 7, 2016

Also, if we add this, what is real world use case?

@9il
Copy link
Member

9il commented Dec 7, 2016

I asked because range API is not universal. And maybe use cases you will find may require more specialised API. Note, that we do not need this for RNGs because it can be replaced with combination of Bernoulli2Variable and universal range adaptor.

This is good example where more generic != better.

If one need a lazy by element, then he probably needs to be able to get a multiple bits at once.

In other hand, if one need a random access binary range, an ndslice based implementation requires only 30 LOC. ndslice is preferable if you need a random access binary range. Furthermore ndslice based implementation would be faster (does not requires any if conditions)

It looks better for me understand use cases and, if we have them, add adaptors to existing one API instead of introducing a new large piece of code.

@9il
Copy link
Member

9il commented Dec 7, 2016

Oops, we already have BitRange :-( http://dlang.org/phobos/core_bitop.html#.BitRange
It should be deprecated
it returns indexes

@9il
Copy link
Member

9il commented Dec 7, 2016

Proof of concept.
It can be generalized easily for common random access range if required

struct BitMap
{
    size_t* ptr;

    import core.bitop;

    bool opIndex(size_t index) const
    {
        return bt(ptr, index) != 0;
    }

    void opIndexAssign(bool val, size_t index)
    {
        if(val)
            bts(ptr, index);
        else
            btr(ptr, index);
    }
}

import std.experimental.ndslice;

void main()
{
    auto arr = new size_t[3];
    auto sl = BitMap(arr.ptr).sliced(size_t.sizeof * 8 * arr.length);

    sl[4] = true;
    sl[100] = true;
    sl.popFrontN(3);
    assert(sl[1]);
    assert(sl[97]);
}

@9il
Copy link
Member

9il commented Dec 7, 2016

{
elemIndex = 0;
}
size_t elemIndex = n / bitsNum + 1;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think that the following construct would be faster?
elemIndex = n >> (bitsNum.bsf + 1) + 1;

@edi33416
Copy link
Contributor Author

edi33416 commented Dec 9, 2016

This is still a WIP

Copy link
Member

@andralex andralex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's drop the length cache, plus a few nits


static if (hasLength!R)
{
ulong len;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's delete this field and compute it on the fly from the source's length and the mask(s). It looks like more work but it's simpler and may actually be faster.

}
}

bool empty()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a ddoc comment before this

bool front()
{
assert(!empty());

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for all this whitespace - just say what you mean and be done

parent.popFront;
maskPos = bitsNum;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

too much whitespace - in Phobos we don't use whitespace around each compound statement


static if (hasLength!R)
{
ulong length() const
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to return size_t

static if (hasLength!R)
{
assert(n < len,
__PRETTY_FUNCTION__ ~ ": Index out of bounds");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep on same line

By truncating n with maskPos bits we have skipped the remaining
maskPos bits in parent[0], so we need to add 1 to elemIndex.
*/
size_t elemIndex = n / bitsNum + 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be const/immutable

{
size_t sliceLen = end - start;
size_t startElemIndex;
size_t endElemIndex;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move down closest to initialization point

}

// Get the slice to be returned from the parent
R resultParent = (parent.save)[startElemIndex .. endElemIndex + 1];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for save when indexing or slicing


typeof(return) result;

result.parent = resultParent;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.save

@edi33416
Copy link
Contributor Author

With this change, I have:

  • removed duplicate unittests
  • removed branching from opIndex and opSlice by using bitwise operations
  • reduced vertical whitespace

Copy link
Member

@andralex andralex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool cool cool

R = an input range to iterate over

Returns:
At minimum, an input range. If `R` was a forward, bidirectional or random
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/was/is/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

A Bitwise input range with propagated forward, bidirectional and random access capabilities

{
static if (hasLength!R)
{
return length() == 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prevalent style is to omit () on no-argument calls

{
bool back()
{
assert(!empty());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is not covered (please install the Codecov Chrome Extension and you'll see uncovered lines in red on this page)


void popBack()
{
++backMaskPos;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert(!empty) I guess

}

/**
Assignes `flag` to the `n`th bit within the range
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Assignes/Assigns/


Bitwise!R opSlice()
{
return this;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add .save

Copy link
Contributor

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(a couple of nits)

*/
static if (hasLength!R)
{
assert(n < length(), __PRETTY_FUNCTION__ ~ ": Index out of bounds");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While __PRETTY_FUNCTION__ is nice, I think there was some consensus to just use plain/old asserts for range asserts.

// Test opIndex and opSlice
///
@system unittest
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// will render this as example on the public docs and we don't want to shock the users by exposing all intrinsic tests in the example.
Could you please add a nice example with one type for the public docs?

assert(bw.empty());
static if (isForwardRange!T)
{
assert(bw2.empty());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto about not using parenthesis with no args

Bitwise adapter over integral type ranges. Consumes the range elements bit by bit.

Params:
R = an input range to iterate over
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably better to be a bit more precise here -> `integral input range"

R = an input range to iterate over

Returns:
At minimum, an input range. If `R` was a forward, bidirectional or random
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

A Bitwise input range with propagated forward, bidirectional and random access capabilities

assert(bw[1] == true);

auto bw2 = bw[0 .. $ - 5];
auto bw3 = bw2[];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unittest blocks in D are cheap and help the reader to show that there's a new context.
Also it has happened quite often that one accidentally has introduced errors by mistyping the variable end number e, g. 4 vs 5.

}

// Test all range types over all integral types
///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an internal test - so it shouldn't be exposed to the user. Rather provide small, simple "snippets" that can be quickly grasped (e.g. five lines).

static if (isBidirectionalRange!T)
{
auto bw3 = bw.save;
auto bw4 = bw.save;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cryptic naming like bw3, bw4 make it very hard to read your code :/
How about bwTestBack?


// Test all range types over all integral types
///
@safe unittest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with the enforce removed this should be possible as pure nothrow @safe

bool front()
{
assert(!empty());
return (parent.front & mask(maskPos)) != 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calculating the mask every time seems to be an unnecessary expensive operation. What is your rationale for storing the position and not the mask directly?
(front is expected to be called a lot more often thanopSlice and opIndex`)

@wilzbach
Copy link
Contributor

btw @edi33416 you might be interested in reviving the bits printer of #4703

@andralex
Copy link
Member

Auto-merge toggled on

@wilzbach
Copy link
Contributor

Auto-merge toggled on

@andralex did you see @9il's discussion about adding more code bloat?

http://forum.dlang.org/post/vnhmrfjjennoytgomkfs@forum.dlang.org

@9il
Copy link
Member

9il commented Dec 19, 2016

@andralex did you see @9il's discussion about adding more code bloat?

http://forum.dlang.org/post/vnhmrfjjennoytgomkfs@forum.dlang.org

Yes, but we deprecated ndslice

@andralex andralex merged commit b6bcf75 into dlang:master Dec 19, 2016
@wilzbach
Copy link
Contributor

@edi33416 would you be so kind to add a changelog entry, s.t. everyone sees your great work on the next release?

@wilzbach
Copy link
Contributor

Btw did anyone knew about BitRange being part of Druntime since this summer?

@andralex
Copy link
Member

@wilzbach yah, it's fairly specialized (for the GC) though.

@MartinNowak
Copy link
Member

It's unfortunate that Ilya's point wasn't picked up.
http://forum.dlang.org/post/vnhmrfjjennoytgomkfs@forum.dlang.org
This is fairly redundant with BitArray.opApply and core.bitop.BitRange, please reconsider adding it to std.range in it's current form.

@MartinNowak
Copy link
Member

@wilzbach yah, it's fairly specialized (for the GC) though.

BitRange is a range over the 1-bit indices of an integral array.

@edi33416 edi33416 deleted the bitwise_adapter branch October 9, 2017 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants