Skip to content

Comments

Address comments from #234#235

Merged
9il merged 2 commits intolibmir:masterfrom
wilzbach:fix_mixture2
Jun 26, 2016
Merged

Address comments from #234#235
9il merged 2 commits intolibmir:masterfrom
wilzbach:fix_mixture2

Conversation

@wilzbach
Copy link
Member

started to adressed comments from #234.

  • renamed unittest variable rndGen -> gen
  • renamed r -> _cdSortedRange

@9il @WebDrake @joseph-wakeling-sociomantic

As back isn't const, just using the sorted range doesn't work.
As an option we might save the last element.

@wilzbach wilzbach changed the title Adress comments from #234 Address comments from #234 Jun 26, 2016
@codecov-io
Copy link

codecov-io commented Jun 26, 2016

Current coverage is 96.52%

Merging #235 into master will decrease coverage by <.01%

@@             master       #235   diff @@
==========================================
  Files            19         19          
  Lines          3256       3255     -1   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits           3143       3142     -1   
  Misses          113        113          
  Partials          0          0          

Sunburst

Powered by Codecov. Last updated by b6c255f...d7d2255

import std.range : assumeSorted;
_cdPoints = cdPoints;
r = typeof(r)(_cdPoints);
_cdSortedRange = _cdPoints.assumeSorted!"a <= b";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verify sorted occurs after the SortedRange constructor has been called:
https://github.com/dlang/phobos/blob/master/std/range/package.d#L7919

assumeSorted is:

auto assumeSorted(alias pred = "a < b", R)(R r)
{
    return SortedRange!(Unqual!R, pred)(r);
}

so imho I would keep the direct constructor call..

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still feel it's an unnecessary complication of the implementation to have both _cdPoints and _cdSortedRange. The latter is all that's actually needed, is there anything else that isn't just workaround for the lack of const support?

- renamed unittest variable rndGen -> gen
- removed r -> in favor of using assumeSorted directly in opCall
{
return _cdPoints;
_cdSum = cdPoints[$ - 1];
_cdSortedRange = cdPoints.assumeSorted!"a <= b";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, this also works. It's still a workaround, though ;-)

OTOH it might have some application if e.g. you wanted to generalize the constructor to take an arbitrary SortedRange as input (which would probably have to be guaranteed to be a forward range, but not necessarily a bidirectional range).

this(const(T)[] cdPoints)
{
_cdPoints = cdPoints;
r = typeof(r)(_cdPoints);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something to do in a separate PR, but I think that cdPoints can be generalized to a forward range, and it should probably be saved to make sure it belongs to the Discrete distribution (cf. my point in the earlier issue that const(T)[] gives insufficient guarantees that your distribution will not change over time).

The use of assumeSorted in opCall should at least provide better debug checks that the fundamental property of sortedness isn't being violated, though.

@9il
Copy link
Member

9il commented Jun 26, 2016

LGTM

return (cast(SortedRange!(const(T)[], "a <= b")) r).lowerBound(v).length;
import std.range : assumeSorted;

T v = uniform!("[)", T, T)(0, cdPoints[$-1], gen);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WebDrake there are (at least) two ways to do this:

inclusive uniform interval & less sortedness

T v = uniform!"[]"(0, cdPoints[$-1], gen);
return cdPoints.assumeSorted!("a < b").lowerBound(v).length;

semi-inclusive uniform interval & less-or-equal sortedness

T v = uniform!"[)"(0, cdPoints[$-1], gen);
return cdPoints.assumeSorted!("a <= b").lowerBound(v).length;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I right in thinking that a < b doesn't allow for any discrete element to have zero probability, whereas a <= b does ... ?

Copy link
Member Author

@wilzbach wilzbach Jun 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think elements with zero probs are problematic in both cases - see these two examples:

General example

import std.range;
auto cdPoints = [1, 3, 4];
foreach (el; 0..5)
    writeln(el, "->", cdPoints.assumeSorted!"a < b".lowerBound(el));
foreach (el; 0..5)
    writeln(el, "->", cdPoints.assumeSorted!"a <= b".lowerBound(el));
0->[]
1->[]
2->[1]
3->[1]
4->[2] // only included in "[]"

0->[]
1->[1]
2->[1]
3->[1, 3]
4->[1, 3] // only included in "[]"

With zero probs

import std.range;
auto cdPoints = [0, 1, 3, 4];
foreach (el; 0..5)
    writeln(el, "->", cdPoints.assumeSorted!"a < b".lowerBound(el));
foreach (el; 0..5)
    writeln(el, "->", cdPoints.assumeSorted!"a <= b".lowerBound(el));
0->[]
1->[0]
2->[0, 1]
3->[0, 1]
4->[0, 1, 3] // only included in "[]"

0->[0] 
1->[0, 1]
2->[0, 1]
3->[0, 1, 3]
4->[0, 1, 3, 4]  // only included in "[]"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see #236 for a fix ;-)

@9il 9il merged commit 220b2c8 into libmir:master Jun 26, 2016
@wilzbach
Copy link
Member Author

@WebDrake

Maybe something to do in a separate PR, but I think that cdPoints can be generalized to a forward range, and it should probably be saved to make sure it belongs to the Discrete distribution (cf. my point in the earlier issue that const(T)[] gives insufficient guarantees that your distribution will not change over time).

lowerBound (aka binary search) requires at least slicing, the uniform generator needs access to the last element.
Doing a linear search in the case would result in a different complexity with the same syntax which isn't intuitive, so I wouldn't be a huge fan of this.

The use of assumeSorted in opCall should at least provide better debug checks that the fundamental property of sortedness isn't being violated, though.

check is in SortedRange, now the verification of the sortedness is called on every opCall instead of being called in the Discrete constructor.

@wilzbach wilzbach deleted the fix_mixture2 branch June 26, 2016 12:54
@WebDrake
Copy link

lowerBound (aka binary search) requires at least slicing, the uniform generator needs access to the last element.
Doing a linear search in the case would result in a different complexity with the same syntax which isn't intuitive, so I wouldn't be a huge fan of this.

Yea, fair points. What I'm getting at, though, is that you should probably not constrain your user to giving you an array.

check is in SortedRange, now the verification of the sortedness is called on every opCall instead of being called in the Discrete constructor.

Yes, indeed. But can it guarantee for you that someone else is not modifying the elements of that array without affecting sortedness?

@9il
Copy link
Member

9il commented Jun 26, 2016

lowerBound (aka binary search) requires at least slicing, the uniform generator needs access to the last element.
Doing a linear search in the case would result in a different complexity with the same syntax which isn't intuitive, so I wouldn't be a huge fan of this.
Yea, fair points. What I'm getting at, though, is that you should probably not constrain your user to giving you an array.

Current version is OK. I don't want Mir be another Phobos with template hell

check is in SortedRange, now the verification of the sortedness is called on every opCall instead of being called in the Discrete constructor.
Yes, indeed. But can it guarantee for you that someone else is not modifying the elements of that array without affecting sortedness?

No. We have not such goals. Goals are 1. quality 2. speed 3. Small API 4. Clean code

@WebDrake
Copy link

Current version is OK. I don't want Mir be another Phobos with template hell [... ] Goals are 1. quality 2. speed 3. Small API 4. Clean code

OK, fair enough. It's certainly better to press on with the next challenge rather than chase these questions further now.

@9il
Copy link
Member

9il commented Jun 26, 2016

@WebDrake We have a lot of really important issues in Tinflex implementation. They include optimisation and complexity issues. Also a lot of bugs are hidden and may be found during code review. I spent all Saturday to review and wrote comments on Tinflex. As soon as @wilzbach will push the next revision please start implementation review. Implementation is much more important for now and we have really small amount of time to create good Tinflex.

@WebDrake
Copy link

OK, I'll try to get onto it soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants