Fix Issue 18595 & Issue 18596: add unpredictableSeed!UIntType, support non-arc4random entropy sources, & replace MindstdRand0 with SplitMix#6388
Conversation
|
Thanks for your pull request and interest in making D better, @n8sh! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment. Bugzilla references
Testing this PR locallyIf you don't have a local development environment setup, you can use Digger to test this PR: dub fetch digger
dub run digger -- build "master + phobos#6388" |
32c6b08 to
289bff8
Compare
std/random.d
Outdated
| // getrandom was introduced in Linux 3.17. | ||
| // checks whether the Linux kernel supports getRandom by looking at the | ||
| // reported version. | ||
| private auto checkHasGetRandom()() @nogc nothrow @trusted |
There was a problem hiding this comment.
Rather than an explicit check it might make more sense to just try the syscall and check for an error code.
There was a problem hiding this comment.
When I implemented this, I went with this because it's more "bullet-proof" than hoping that the kernel provides a proper error code, but yeah it should work fine too.
79fdb0d to
57be447
Compare
std/random.d
Outdated
| len = number of bytes to write | ||
|
|
||
| Returns: | ||
| always `len` (does not fail) |
There was a problem hiding this comment.
Nit: We typically indent for one-level after sections
std/random.d
Outdated
| BOOL, HMODULE, FARPROC, LPCSTR, LPCWSTR, PBYTE, | ||
| ULONG_PTR, FreeLibrary, GetProcAddress, LoadLibraryA; | ||
| import core.sys.windows.wincrypt : HCRYPTPROV; | ||
| private alias DWORD = size_t; // uint in druntime |
There was a problem hiding this comment.
Maybe we should finally fix druntime 😆
The reason I added this was that the build was segfaulting otherwise on AppVeyor, but my Window-foo is limited to seeing things crash on Windows.
There was a problem hiding this comment.
For some reason I thought it had already been fixed, but it seems not.
std/random.d
Outdated
| } | ||
| } | ||
| } | ||
| assert(hProvider); |
There was a problem hiding this comment.
For Phobos, we try to add messages to assert so that in case they see it (e.g. and don't have line number on their stack trace working like on OSX), they have a good start where too look. Also sometimes the assert already tells enough, s.t. Phobos's code doesn't need to be looked at.
std/random.d
Outdated
| // getrandom was introduced in Linux 3.17. | ||
| // checks whether the Linux kernel supports getRandom by looking at the | ||
| // reported version. | ||
| private auto checkHasGetRandom()() @nogc nothrow @trusted |
There was a problem hiding this comment.
When I implemented this, I went with this because it's more "bullet-proof" than hoping that the kernel provides a proper error code, but yeah it should work fine too.
std/random.d
Outdated
| atomicStore(_globalSeedLow, cast(uint) s[0]); | ||
| gammaLow = cast(uint) s[1]; | ||
| uint h = cast(uint) (s[1] >>> 32); | ||
| if (h == 0) // Should be a constraint violation, but check anyway. |
There was a problem hiding this comment.
Better make an assert out of this?
| return cast(uint) (MonoTime.currTime.ticks ^ rand.front); | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
How about adding a lit test that a unpredictableSeedOf (or an array of it), is non-zero?
There was a problem hiding this comment.
added
EDIT: skipped over the word "lit"
std/random.d
Outdated
| } | ||
| // Under contention one thread can update the low half of | ||
| // the global seed, then several other threads could update | ||
| // the low half and the high half of the global seed, then |
There was a problem hiding this comment.
Do you mean gamma or low here?
There was a problem hiding this comment.
I mean low which is the result of incrementing _globalSeedLow. The gamma doesn't change after being initialized.
There was a problem hiding this comment.
Yeah I guessed so, but it's pretty confusing because low is TLS and there's _globalSeedLow, so you maybe improve to improve the wording a bit?
There was a problem hiding this comment.
Yeah, I'm thinking about the whole comment. I think it's too wordy and doesn't effectively get the point across. How about:
Under contention updates can be interleaved. This is completely fine.
There was a problem hiding this comment.
I like it a lot more than the existing one :)
std/random.d
Outdated
| @property uint unpredictableSeed() @trusted nothrow @nogc | ||
| alias unpredictableSeed = unpredictableSeedOf!uint; | ||
| /// ditto | ||
| @property UIntType unpredictableSeedOf(UIntType)() @nogc nothrow @trusted |
There was a problem hiding this comment.
(as mentioned in you earlier PR)
Technically this is a new symbol and requires @andralex approval
405bd1c to
24e9426
Compare
On machines with hardware-accelerated AES I would use AES in counter mode. That happens to be precisely what macOS does with arc4random :). Otherwise I'd probably use ChaCha12. |
24e9426 to
a834665
Compare
|
Changed it from |
e4b31a6 to
0fbef60
Compare
Actually before using ChaCha12 I'd try to use RDRAND if the processor supported it. I previously investigated using it in mir-random, but unfortunately LDC's inline assembler doesn't support the RDRAND instruction. (Using data directives as a workaround isn't an option because those aren't supported either.) |
0fbef60 to
547b5af
Compare
|
I'm back from holiday. I don't really have time to look at this now but the immediate thought is that this is a big dump of code where the concerns could arguably be much better separated (e.g. without necessarily introducing a new public symbol, it ought to be possible for the Any chance of 24-48 hours to give a more in-depth review? |
I don't think there's any risk of this being rushed through. |
WebDrake
left a comment
There was a problem hiding this comment.
OK, so. I've taken a more detailed look through things, and a variety of more specific feedback follows.
Overall, it needs to be recognized that this PR introduces -- in a single patch -- a great variety of different and overlapping changes. This would include:
-
rewriting the "fallback" mechanism of
unpredictableSeedto use SplitMix -
rewriting the fallback mechanism to use global shared state instead of having thread-local state
-
extending
unpredictableSeedto support different seed types (probably dependent on the SplitMix change or something similar, as otherwise there's no reliable way to supportulong) -
introducing a variety of new platform-specific entropy sources
-
using those new entropy sources to seed the SplitMix fallback
The code as currently written is also very inline-heavy, i.e. not only does the patch mix a lot of different things together, but so does the new unpredictableSeed implementation.
I recognize that this is largely because the code is being copied straight from mir-random, but that's all the more reason why we should think carefully about what the changes are and whether they are the right changes for phobos.
So, my first suggestions would be:
-
could we please try to break the changes down into separate patches (and maybe PRs) that more effectively separate out these concerns?
-
can we put some effort into better separating out the components of the new code?
That's a bit vague and general, of course, so (besides the individual feedback points below) I'll try to break things down into a more specific (but still high-level) set of concerns.
1. sources of entropy
The functions introducing new entropy sources (currently called oneTimeGenRandom) can be introduced in a separate patch. It would also be good to consider their structure and purpose: right now they mix up (1) providing new sources of entropy and (2) selecting a particular source of entropy to use.
I think it would be helpful to separate out those concerns, so that we have a clear distinction between individual sources of entropy that can be used, and the choice of which of them to use in specific circumstances. This is particularly noticeable w.r.t. the relationship with arc4random. Some more specifics on this are provided in the detailed feedback notes.
It's also worth questioning if they are actually necessary at all in order to address https://issues.dlang.org/show_bug.cgi?id=18595. I would advise that we first do the absolute minimum needed in order to support the new API (1 PR), and then think about further improving the implementation.
2. introducing splitmix
There are 2 concerns I see here. First, the splitmix implementation is purely inline inside unpredictableSeed, which makes it harder to see the logic of what unpredictableSeed itself is doing. It would be preferable to introduce it as a new random engine, and use an instance of that engine inside unpredictableSeed.
Second, the system entropy functions (oneTimeGenRandom) are used under the hood as part of the seeding process, but not for any other purpose. It doesn't seem necessary to couple the introduction of this kind of seeding to the use of SplitMix, so probably it would be good to first introduce a private splitmix engine (1 PR), then rewrite unpredictableSeed to use it instead of Minstdrand0, and only after that start thinking about rewriting how the splitmix instance itself is seeded.
3. extending unpredictableSeed for arbitrary UIntType
This can be done as soon as the fallback entropy generation mechanism supports ulong as well as smaller types. It shouldn't need to be coupled with any of the more intrusive changes.
I think there are other things to discuss (some of them already noted in detailed notes below), but this should be enough to be going on with.
| len = number of bytes to write | ||
|
|
||
| Returns: | ||
| Number of bytes successfully written up to `len`, or -1 on failure |
There was a problem hiding this comment.
The documentation here leaves open the possibility that we can have result != -1 && result < len.
However, if we look internally, we find that where /dev/urandom is used, there is an explicit check to ensure that exactly len bytes were generated, while according to its manpage (quoted above), NR_getrandom is guaranteed to either error or generate exactly len bytes.
This would suggest that (this implementation of) oneTimeGenRandom will either return len or -1, and nothing else. Is that correct?
There was a problem hiding this comment.
No, it's up to len:
On success, getrandom() returns the number of bytes that were copied
to the buffer buf. This may be less than the number of bytes
requested via buflen if either GRND_RANDOM was specified in flags and
insufficient entropy was present in the random source or the system
call was interrupted by a signal.
http://man7.org/linux/man-pages/man2/getrandom.2.html
Though for small n it's guaranteed to succeed:
If the entropy pool has been initialized and the
request size is small (buflen <= 256), then getrandom() will not fail
with EINTR. Instead, it will return all of the bytes that have been
requested.
edit: that's why in general one would have to write sth. like I did in mir-random:https://github.com/libmir/mir-random/blob/3c17c14b217f930826d8f475620d995abae91b56/source/mir/random/engine/package.d#L668
while (len > 0)
{
auto res = syscall(GETRANDOM, cast(size_t) ptr, len, 0);
if (res >= 0)
{
len -= res;
ptr += res;
}
else
{
return res;
}
}There was a problem hiding this comment.
I think we might as well make sure that we either get all the bytes we want, or that we get a failure. That means that the function (which I'd still like to call getSystemEntropy) can have a common behaviour on all platforms where it exists at all.
It'd have to be else if (errno != ENOSYS), wouldn't it ... ?
There was a problem hiding this comment.
BTW, a policy question: given the level this function operates at, might it not be better placed in druntime in the long run? It feels like the sort of functionality that should live there rather than in phobos.
There was a problem hiding this comment.
I think we might as well make sure that we either get all the bytes we want, or that we get a failure.
Oh, and note: given our existing use-cases, since we are never using GRND_RANDOM and we're always requesting < 256 bytes, we should never actually see those kinds of failure case. So in the short term we might as well make it:
if (result < len) return -1;... with a FIXME or issue to improve things if we ever do want to start supporting larger quantites of entropy.
There was a problem hiding this comment.
I think we might as well make sure that we either get all the bytes we want, or that we get a failure.
Yep, my point!
So in the short term we might as well make it:
Yeah, but then someone comes along and uses it for other bits in std.random, so I would feel safer if the "proper" checking is done on the beginning. Wrapping it in a loop isn't so hard after all.
BTW, a policy question: given the level this function operates at, might it not be better placed in druntime in the long run? It feels like the sort of functionality that should live there rather than in phobos.
Well my PR to add just the numbers for syscalls is pending for four months:
So while I agree that this is low-level stuff, I don't think that we should block ourselves with that.
Also exposing something via druntime will mean that it's publicly visible and thus needs to have a well-though API, whereas this code can live in private land.
There was a problem hiding this comment.
Wrapping it in a loop isn't so hard after all.
Yea, sounds good, then. I wasn't sure if you were wanting that to actually be introduced now.
So while I agree that this is low-level stuff, I don't think that we should block ourselves with that.
Oh, sure. Although given that we can address the substantial issue (templatizing unpredictableSeed) without these entropy-generating functions, we could also just leave that to druntime and avoid the need for adding and then removing phobos code. It would be nicer to do things patiently, but right, than to generate code churn through impatience.
That incidentally lends itself to better separation of concerns among all the changes.
Also exposing something via druntime will mean that it's publicly visible and thus needs to have a well-though API
Er, are you suggesting that being private is an excuse for not having a well-thought-through API? ;-)
Of course, I know what you mean, but that pressure to think things through properly is valuable. I think we'd do better to implement things with that level of thought even if it's going to be private for now. Apart from anything else, that makes it easier to upstream to druntime later.
| import core.stdc.errno : ENOSYS, errno; | ||
| const result = syscall(NR_getrandom, cast(size_t) ptr, len, GRND_NONBLOCK); | ||
| if (result != -1 || errno != ENOSYS) | ||
| return result; |
There was a problem hiding this comment.
Do I understand right that this if block is to ensure that we will only return if the NR_getrandom syscall succeeds? And that if we fail, we drop through and use /dev/urandom instead?
If so, do I understand right that this is the expected behaviour on earlier linux kernels that don't support the syscall?
Finally: assuming the above, is it not possible to avoid the wasted syscall where the kernel does not support NR_getrandom? I assume not, because while one might be able to check at compile time whether the kernel supports this syscall, there's no guarantee that the compiled code will be running on top of the same kernel.
However, since NR_getrandom was introduced in 3.17 (in 2014) and AFAICT all 3.x.y kernel releases are now EOL, this should not matter in practice.
If all the above is correct, though, it might be worth providing some comments on this.
There was a problem hiding this comment.
Do I understand right that this if block is to ensure that we will only return if the NR_getrandom syscall succeeds? And that if we fail, we drop through and use /dev/urandom instead?
Yes.
If so, do I understand right that this is the expected behaviour on earlier linux kernels that don't support the syscall?
Yes. The earlier versions I reviewed/wrote used the C API to check before, but directly checking for ENOSYS is faster.
However, since NR_getrandom was introduced in 3.17 (in 2014) and AFAICT all 3.x.y kernel releases are now EOL, this should not matter in practice.
Unfortunately it does. Some of my cheap VMs still run with 2.6.X because that's the latest release supported by OpenVZ and many hosters haven't switched to LXC. They even backported the Meltdown fixes to 2.6.X
Finally: assuming the above, is it not possible to avoid the wasted syscall where the kernel does not support NR_getrandom? I assume not, because while one might be able to check at compile time whether the kernel supports this syscall, there's no guarantee that the compiled code will be running on top of the same kernel.
Yes, no compile-time guarantee.
There was a problem hiding this comment.
Unfortunately it does. Some of my cheap VMs still run with 2.6.X because that's the latest release supported by OpenVZ and many hosters haven't switched to LXC.
Sure. I meant that the performance cost of the guaranteed-to-fail syscall does not really matter, as these are really niche edge-cases for something that doesn't need to be that fast in any case.
std/random.d
Outdated
| /+ | ||
| Tries to use a system API to get entropy. Resources are not cached | ||
| due to the expectation that this function will not be called more | ||
| than once per program execution. |
There was a problem hiding this comment.
Note that the name oneTimeGenRandom is potentially misleading, as it could be interpreted as meaning that the function is only meant to be called once.
Do I understand right that in fact, while the function is not optimized for multiple calls, it doesn't have a problem with them (whichever implementation is used)? If so it seems a bit odd to put oneTime... into the function name.
It would be much better to name it according to what it does, e.g. something like getSystemEntropy.
There was a problem hiding this comment.
I would also note: documentation says "a system API", which is (maybe usefully) vague. But it might be a good idea to be more precise: is this any system API, or the one that is defined as providing the OS's standard source of entropy?
Right now, obviously not, because arc4random is preferred over other sources. But I would suggest that there may be value in drawing a clean distinction in code between particular sources of entropy, versus the choice of which is the best one for a given platform and use-case.
While all of this is private for now, it's arguably a good idea to be thinking about that from the beginning, rather than having to unpick it later.
There was a problem hiding this comment.
It would be much better to name it according to what it does, e.g. something like
getSystemEntropy.
The name is a warning to developers against using the function that way repeatedly. The comment hints at the reason: "Resources are not cached due to the expectation that this function will not be called more than once per program execution." So there may be resources that a caller would expect a reasonable implementation to reuse that are not cached for reuse here, making repeated calls more expensive than expected.
Maybe a name like getExpensiveSystemEntropy or getSystemEntropyWithoutSavingResources would be better than oneTimeGenRandom?
There was a problem hiding this comment.
The name is a warning to developers against using the function that way repeatedly.
If I understand right, what you want to ensure is that users understand that this should not be used as their default random number generator. Right?
But that's what the documentation is for. The inefficiency that you note is an implementation detail that might be fixed later, not something inherent to the API. It shouldn't pollute the API naming choices.
There was a problem hiding this comment.
BTW, I think it's explicitly not the case that we expect "that this function will not be called more than once per program execution". I would anticipate for example that it might be called once per thread (to seed different, thread-local pseudo-RNG instances).
There was a problem hiding this comment.
If I understand right, what you want to ensure is that users understand that this should not be used as their default random number generator. Right?
Since it's private I was only thinking of other developers when I wrote it. I assume they know whether they want system entropy or something else. The message is: "if you want to retrieve system entropy repeatedly over the life of the application, write a different function."
The inefficiency that you note is an implementation detail that might be fixed later
For repeated use it's an inefficiency, but for one-time use it may be optimal. We might not want to open /dev/urandom and keep it open for the life of the application.
There was a problem hiding this comment.
Since it's private I was only thinking of other developers when I wrote it
That's what documentation is for (to advise developers on how to use a function).
By the way, I like the way you've phrased it in this particular response ("If you want to retrieve system entropy repeatedly ...") much better than how you phrased it in the actual documentation.
For repeated use it's an inefficiency, but for one-time use it may be optimal. We might not want to open
/dev/urandomand keep it open for the life of the application.
Oh, absolutely. But note that this also touches on the importance of providing the different sources of entropy as entities in their own right, that can be used in multiple different ways, according to use case.
There was a problem hiding this comment.
By the way, I like the way you've phrased it in this particular response ("If you want to retrieve system entropy repeatedly ...") much better than how you phrased it in the actual documentation.
Okay, changed it.
| { | ||
| arc4random_buf(ptr, len); | ||
| return len; | ||
| } |
There was a problem hiding this comment.
If I've understood the unpredictableSeed changes right, is this version of the function ever actually used in phobos? It does not appear so, because unpredictableSeed checks first for arc4random and only uses oneTimeGenRandom if arc4random is not available. So why add this code, given that it's private and therefore has no other use-cases?
There is also a separate point of consideration about whether a function to get system entropy should default to arc4random. This is essentially a judgement call phobos-side that arc4random is preferable to any other implementation. Is that necessarily true? And if so, why?
This can be related to the point made elsewhere about separation of concerns between sources of entropy, versus the choice of the best. Instead of a phobos-side judgement call, why not define a function which provides entropy generated using the standard entropy-generating mechanism of the host system, whatever that may be?
That would provide a clean separation between the arc4random API (which may well be very advisable to use, but arguably the choice should be the user's) and the source of entropy the host system provides as standard (which might well be the same as arc4random, but the choice is the host system's).
This in turn would mean that any given use case has a clearly defined set of choices about what options they have when generating entropy.
Note that we could still provide a high-level getEntropy function that would make phobos' own choice of the best source for a given platform. But that would be much better defined in terms of a clean separation between the underlying sources.
There was a problem hiding this comment.
There is also a separate point of consideration about whether a function to get system entropy should default to
arc4random. This is essentially a judgement call phobos-side that arc4random is preferable to any other implementation. Is that necessarily true? And if so, why?
The main attractive feature of arc4random is its simplicity. It requires no Phobos-side management of global state and has no error conditions. It is hard to imagine a better interface.
why not define a function which provides entropy generated using the standard entropy-generating mechanism of the host system, whatever that may be?
For the most part we do that.
There was a problem hiding this comment.
The main attractive feature of
arc4randomis its simplicity. It requires no Phobos-side management of global state and has no error conditions. It is hard to imagine a better interface.
That's obviously nice to have, but I don't know that I'd necessarily privilege that over other factors, like the quality of the source of entropy.
For the most part we do that.
I accept that this is the intent, but it's not really what the code does, because too much stuff is hidden under the hood in a way that denies choices to the actual use-cases.
Consider what you have right now:
version (AnyARC4Random)
{
private ptrdiff_t oneTimeGenRandom()(scope void* ptr, ptrdiff_t len) @system
{
// arc4random-based implementation
}
}
else version (Windows)
{
private ptrdiff_t oneTimeGenRandom()(scope void* ptr, ptrdiff_t len) @system
{
// Windows syscall based implementation
}
}
else version (Posix)
{
private ptrdiff_t oneTimeGenRandom()(scope void* ptr, size_t len) @system
{
// Posix syscall based implementation, with some scope for Linux specialization
}
}
else
{
// implementation that always fails
}Contrast this with:
// you already have this ...
version (AnyARC4Random)
{
extern (C) @private @nogc nothrow
{
uint arc4random() @safe;
void arc4random_buf(scope void* ptr, size_t len) @system;
}
}
// ... but you don't have this
version (linux)
{
int getrandom(void* ptr, size_t len, out bool isImplemented)
{
// wrap up all the syscall stuff inside this
}
}
// ... or this
version (Posix)
{
int devURandom(void ptr*, size_t len)
{
// wrap up the implementation here
}
}
// ... or this
version (Windows)
{
int fnCryptGenRandom(void ptr*, size_t len)
{
// wrap up implementation here
}
}
// ... and now, you can do this
int getEntropy(void ptr*, size_t len)
{
version (AnyARC4Random)
{
arc4random_buf(ptr, len);
return len;
}
else version (Windows)
{
return fnCryptGenRandom(void ptr*, size_t len);
}
else version (Posix)
{
version (linux)
{
bool getrandomExists;
result = getrandom(ptr, len, getrandomExists);
if (getrandomExists) return result;
}
return devURandom(ptr, len);
}
else
{
return -1;
}
}What's the difference?
With the first option, you have one single "get some entropy from the system" function. It's a black box. It may interact in odd ways with other choices made elsewhere in the codebase (as indeed this one does, in the case of AnyARC4Random and unpredictableSeed).
With the second option, there's still a general-purpose "get system entropy" function, but it's defined as an aggregate of lots of other choices. It doesn't preclude a user making use of those choices in a different way, depending on their use-case.
For example, with this way of doing things, it would be trivial, if we wanted to, to switch out the unpredictableSeed implementation to include:
version (Windows)
{
UIntType result = void;
// notice, we're using a _specific_ source of entropy here, not just some
// abstracted-away black box
auto bytesOfEntropy = fnCryptGenRandom(&result, UIntType.sizeof);
enforce(bytesOfEntropy == UIntType.sizeof);
return result;
}
else version (AnyARC4Random)
{
// ... existing implementation continues from here ...
}Of course, we might not want to make that specific change. The key thing is to note how this enables unpredictableSeed to make its own choices with its own priorities, that need not necessarily be the same as getEntropy.
There was a problem hiding this comment.
That's a fine way to structure it. That's similar to how mir-random divides things.
std/random.d
Outdated
| else | ||
| { | ||
| // Value based on a combination of pid, tid, and time. | ||
| private ulong fallbackSeed()() |
There was a problem hiding this comment.
Also: should have proper ddoc, despite being private.
There was a problem hiding this comment.
Attribute inference?
Also: should have proper ddoc, despite being private.
That's not a Phobos requirement (though of course it doesn't hurt)
There was a problem hiding this comment.
Attribute inference, and so if it isn't used it won't appear in the final executable.
There was a problem hiding this comment.
Also: should have proper ddoc, despite being private.
Added.
std/random.d
Outdated
| gammaLow = atomicLoad!(_globalGammaLow); | ||
| if (!gammaLow) | ||
| { | ||
| ulong[2] s = oneTimeInitSeedAndGamma(); |
There was a problem hiding this comment.
Why a ulong[2], since the 2 values s[0] and s[1] are always used separately, and s is never used in its own right?
This seems to be a case of the oneTimeInitSeedAndGamma implementation driving the API: under the hood it's convenient to use a ulong[2] because then one can call oneTimeGenRandom(&s, typeof(s).sizeof). But that doesn't make for easy to understand usage from the outside.
Given that all of this is an under-the-hood detail anyway, it's not that big a deal, but it would be nicer from the calling side to be able to pass in separate seed and gamma parameters.
std/random.d
Outdated
| static if (has64BitAtomicOps) | ||
| { | ||
| shared static ulong _globalSeed; | ||
| shared static ulong _globalGamma; |
There was a problem hiding this comment.
This involves creating shared global state inside the unpredictableSeed implementation. Can you talk through the reasons for doing this, and the risks associated with it? Is this motivated by performance, or to improve the quality of randomness?
There was a problem hiding this comment.
You mean putting the variables inside the function rather than outside the function? No reason other than making it immediately obvious that they are only used in one place.
There was a problem hiding this comment.
No, I'm asking about the rationale for them being shared. The existing unpredictableSeed used thread-local storage. What's the motivation for making this different?
There was a problem hiding this comment.
Initially I was going to keep them thread-local merely because that's how the previous design worked, but then I asked myself "why make them thread-local?" and I couldn't think of a justification. Using a thread-local PRNG is a performance optimization, but the case it's optimizing is one thread calling unpredictableSeed many times while other threads are doing the same thing. In most cases it seems like that would be a misuse of the library.
There was a problem hiding this comment.
but then I asked myself "why make them thread-local?" and I couldn't think of a justification
Well, there's another way of looking at that. Thread-local is the default for static and global variables in D, and with good reason.
So, perhaps the question should be: what is the strong, positive reason for departing from the default option?
There was a problem hiding this comment.
With a global PRNG as a fallback seeding mechanism the options are to use the output of the global seeder directly, or to use the output of the global seeder to initialize a thread-local seeder then use the output of the thread-local seeder. Both designs have the same number of global variables; the second design just adds thread-locals on top.
There was a problem hiding this comment.
I'm not sure I follow your reasoning here about the same number of global variables. Can you explain in more detail?
There was a problem hiding this comment.
The first design was:
system entropy -> global prng -> thread-local prng -> unpredictableSeed
The revised design was:
system entropy -> global prng -> unpredictableSeed
std/random.d
Outdated
| shared static uint _globalSeedLow; | ||
| shared static uint _globalSeedHigh; | ||
| shared static uint _globalGammaLow; | ||
| shared static uint _globalGammaHigh; |
There was a problem hiding this comment.
Same questions about the rationale for introducing global shared state.
| { | ||
| return cast(UIntType) unpredictableSeed!ulong; | ||
| } | ||
| else |
There was a problem hiding this comment.
Here we fall back to using splitmix if arc4random is not available. But is that generally preferable to just using system entropy directly, where the system provides a means to get it? We are already using system entropy under the hood in many cases, via the oneTimeInitSeedAndGamma function, so why not cut out the middle man where we can?
There was a problem hiding this comment.
^^^ Just to ping, either this question has been overlooked, or I missed the answer (in which case, sorry!). Is there a reason to prefer using the system entropy only to seed splitmix, rather than directly as the source of the unpredictable seed?
There was a problem hiding this comment.
I think the full answer is that it varies by platform. Defaulting to SplitMix will always work reasonably well, so it's a safe generic implementation, but based on benchmarks we might want to add special cases on specific platforms to use the system entropy source directly instead of just using it to seed SplitMix. This is basically what we're already doing with arc4random.
std/random.d
Outdated
| atomicOp!"+="(example, ulong.max); | ||
| }); | ||
|
|
||
| static if (has64BitAtomicOps) |
There was a problem hiding this comment.
Can you describe the motivation for these static if blocks? Do I understand right that these are micro-optimizations for performance?
Is this really necessary given that we're talking about a fallback mechanism for something that is not optimized for speed in other respects, and which is expected to only be called once (or maybe once per thread) in an application's lifetime?
There was a problem hiding this comment.
It's so the code will compile on platforms that don't support atomic operations on 64 bit integers.
There was a problem hiding this comment.
OK. But why write the code in these 2 different ways? Why not just write the implementation that will work on systems without 64-bit atomic op support?
There was a problem hiding this comment.
Ah, I get it. Yes, you can call it an optimization to update with a single atomic operation rather than two atomic operations that can't be done in parallel, although in this case though I wrote the "optimized" version first then the 32-bit-only version second. I don't see a reason to delete the better branch when it is (to my eyes) clearer than the other branch.
a fallback mechanism for something that is not optimized for speed in other respects
Although it is not the most important consideration speed is considered throughout the design. The reason I chose to keep SplitMix as the global seeder even after discarding the idea of "splitting" is that its state can be updated using solely atomic operations rather than requiring a mutex. The reason I don't open and close /dev/urandom on every unpredictableSeed call and instead use /dev/urandom to initialize a global seeder is speed. I wrote this with the awareness that user code out in the wild might be using unpredictableSeed incorrectly and so (all else being equal) I would like to not make such code become catastrophically slow.
There was a problem hiding this comment.
Yes, you can call it an optimization to update with a single atomic operation rather than two atomic operations that can't be done in parallel, although in this case though I wrote the "optimized" version first then the 32-bit-only version second. I don't see a reason to delete the better branch when it is (to my eyes) clearer than the other branch.
Well, I'd suggest an alternative consideration here.
This is a function that is not expected to be called many times in an application's lifetime. So while speed is always nice, it's probably a secondary consideration to being clear, and simple, and easy to maintain.
Right now it looks like there is a series of design choices impacting on one another:
-
there's a choice to use
sharedstate for the fallback mechanism -
that means that you need to either use mutexes or atomic ops
-
that means that you need to care about whether the system has support for 64-bit atomic ops
-
that means that you have to supply multiple different implementations
What if, instead, you simply used a single, thread-local static splitmix instance? It sounds like this would get rid of any need to care about atomic operations, and I don't think there would be any meaningful cost in terms of speed.
It would make the code much simpler to implement, and much easier to understand and modify in future. And it would fit nicely with separating out the splitmix implementation into a struct SplitMixEngine.
There was a problem hiding this comment.
What if, instead, you simply used a single, thread-local static splitmix instance
That's an option. We still need to seed it.
I could also just delete the branch for architectures that don't have 64-bit atomic ops. Right now core.atomic.has64BitCAS is true for DMD on every platform it supports. Same for LDC.
There was a problem hiding this comment.
That's an option. We still need to seed it.
This is one of the reasons why I strongly recommend we separate out the concerns of this PR.
Right now we have a simple (and imperfect) seeding mechanism, which is the one using pid and tid and current time. It's obviously nice to replace this. But the introduction of splitmix doesn't need to depend on that.
So, why not start by just implementing splitmix -- in a cleanly separated way -- and updating unpredictableSeed to use it, again in the simplest possible way?
Then, as a separate PR, we can add support for the different system entropy sources, and update unpredictableSeed to make use of them -- either directly (as sources of the returned value) or indirectly (to generate the seed for splitmix).
There was a problem hiding this comment.
But the introduction of splitmix doesn't need to depend on that.
This SplitMix is initialized with two 64-bit values and our current seeding method produces a single 32-bit value, so minimally we would need to also add a stopgap seeding mechanism that we would immediately replace with the real seeding mechanism in a subsequent pull request. I think that is bad.
There was a problem hiding this comment.
The motivation for introducing the global PRNG was to drastically improve the fallback case. In the case where we can't get system entropy at all, we aren't reduced to initializing thread-local PRNGs with very similar tid+pid+time seeds. Using the global PRNG directly was the next logical step from there.
| len = number of bytes to write | ||
|
|
||
| Returns: | ||
| Number of bytes successfully written up to `len`, or -1 on failure |
There was a problem hiding this comment.
No, it's up to len:
On success, getrandom() returns the number of bytes that were copied
to the buffer buf. This may be less than the number of bytes
requested via buflen if either GRND_RANDOM was specified in flags and
insufficient entropy was present in the random source or the system
call was interrupted by a signal.
http://man7.org/linux/man-pages/man2/getrandom.2.html
Though for small n it's guaranteed to succeed:
If the entropy pool has been initialized and the
request size is small (buflen <= 256), then getrandom() will not fail
with EINTR. Instead, it will return all of the bytes that have been
requested.
edit: that's why in general one would have to write sth. like I did in mir-random:https://github.com/libmir/mir-random/blob/3c17c14b217f930826d8f475620d995abae91b56/source/mir/random/engine/package.d#L668
while (len > 0)
{
auto res = syscall(GETRANDOM, cast(size_t) ptr, len, 0);
if (res >= 0)
{
len -= res;
ptr += res;
}
else
{
return res;
}
}| import core.stdc.errno : ENOSYS, errno; | ||
| const result = syscall(NR_getrandom, cast(size_t) ptr, len, GRND_NONBLOCK); | ||
| if (result != -1 || errno != ENOSYS) | ||
| return result; |
There was a problem hiding this comment.
Do I understand right that this if block is to ensure that we will only return if the NR_getrandom syscall succeeds? And that if we fail, we drop through and use /dev/urandom instead?
Yes.
If so, do I understand right that this is the expected behaviour on earlier linux kernels that don't support the syscall?
Yes. The earlier versions I reviewed/wrote used the C API to check before, but directly checking for ENOSYS is faster.
However, since NR_getrandom was introduced in 3.17 (in 2014) and AFAICT all 3.x.y kernel releases are now EOL, this should not matter in practice.
Unfortunately it does. Some of my cheap VMs still run with 2.6.X because that's the latest release supported by OpenVZ and many hosters haven't switched to LXC. They even backported the Meltdown fixes to 2.6.X
Finally: assuming the above, is it not possible to avoid the wasted syscall where the kernel does not support NR_getrandom? I assume not, because while one might be able to check at compile time whether the kernel supports this syscall, there's no guarantee that the compiled code will be running on top of the same kernel.
Yes, no compile-time guarantee.
| {{ | ||
| import core.stdc.errno : ENOSYS, errno; | ||
| const result = syscall(NR_getrandom, cast(size_t) ptr, len, GRND_NONBLOCK); | ||
| if (result != -1 || errno != ENOSYS) |
There was a problem hiding this comment.
Ah, on looking more closely at this I think I see where some of my confusion w.r.t. this came from. It returns if either the result is not -1 (i.e. no error) or if the error is not ENOSYS.
In other words, it does not return only if the syscall fails because of a lack of implementation. If the syscall is implemented, it will always return from here (whether with success or failure).
I'd had the impression it would fall through with failure, and hence my confusion over the possibility of < len.
| // generator, since it can't cover the full range of | ||
| // possible seed values. Ideally we need a 64-bit | ||
| // unpredictable seed to complement the 32-bit one! | ||
| static assert(isSeedable!(Mt19937_64, typeof(map!((a) => (cast(ulong) unpredictableSeed))(repeat(0))))); |
There was a problem hiding this comment.
This is a great change, but it should be in a separate patch that follows the update to unpredictableSeed itself.
See #6388 (comment)
|
For what it's worth, this is another reason why I'm uncomfortable with the "big black box" approach to When it's just a simple function using a "crummy" seed generation mechanism or a won't-fail one like But with other system entropy sources in the mix, suddenly there's the "What if it fails?" scenario to deal with. And suddenly we have a cascading set of concerns to deal with ... where really the control of what to do ought to be in user hands. (Speaking personally -- if I can't get system entropy to launch up my Keeping a Anyway, more constructively:
... was this based on the struct |
|
Leaving aside the bigger picture questions, I'm not sure we really need to care that much about multiple calls per thread. OK, if you want to do the |
|
Just to go over this in a bit more detail:
... defines
... does the same
... isn't even defining a random device: it's a mersenne twister implementation which defines a
... horrible "superclass" style of doing things that is very much the opposite of phobos' easily recombinable components
... wrapper around Windows All of which leads to an important lesson: going by these examples, what everyone else is doing for their "random device" (or whatever they name it) is thinly wrapping system entropy sources, and throwing exceptions if they fail. No one is rolling their own pseudo-random generator under the hood on top of system entropy, and no one else is providing a fallback mechanism if the system entropy supply fails. So, seriously: why are WE trying to over-engineer things by taking that route? Given that every platform that D targets has a well defined entropy source (either arc4random, or Linux getrandom, or /dev/urandom, or Windows CryptGenRandom), why not just rewrite |
a) "Rolling their own pseudo-random generator"? I know you read the paper linked in the source code, so I have no idea why you're pretending this isn't a well-known algorithm.
Throwing on failure is a usability regression. That is an extremely bad suggestion. |
This sounds attractive. Disclaimer: haven't looked over the problem or the work yet. |
A complication: To avoid this, we could easily add a new function (or new callable struct) to provide this system-entropy access. A callable struct could be That would allow a useful distinction:
|
I don't want to make a big deal of this, but I do think it would be a good idea to re-read what I wrote, and consider that there could be other interpretations than the one placed on it here. I don't think it's constructive to suggest bad faith.
But Python does that with different purpose: to provide a default PRNG instance. It's the equivalent of
... to the same end, I suspect. An explicit default PRNG, exposed as a PRNG, is something quite different from using a PRNG as a fallback if system entropy fails.
I agree that it's problematic to break |
We can throw an Error. |
It's possible that could be considered reasonable given the types of entropy source we'd be using and the numbers of bits being requested (e.g. |
|
I still honestly feel more comfortable with the clean separation between a new |
|
Looking at the code after a few days, I agree that it was unnecessarily difficult to read. I have greatly simplified the implementation. The quirky no-64-bit-atomics case has been replaced by an 8 line function that uses thread-locals. |
8a4f2ed to
007e1ef
Compare
… simplify the fallback PRNG (no shared state).
234543e to
646d367
Compare
646d367 to
7632411
Compare
I've updated the PR to do this. Because of this I've removed the global SplitMix. |
You're talking about mechanisms for producing arbitrary sequences of random bytes, but here we're talking about a mechanism for producing distinct seeds. For the former it's a statistical defect if there isn't a chance of getting two identical sequential results, but for the latter it is an improvement. That is why it makes sense to use a permutation of the counting numbers.
In the case where you are expecting system entropy or random bytes suitable for cryptographic purposes it is an error to get anything else. When that it is not what the user was promised or expecting, it is extremely bad library design to abort because a thing the user didn't want or need is not available. |
Followup to #6267
New summary: Add unpredictableSeed!UIntType. Add system-specific entropy sources for platforms lacking arc4random. Unlike arc4random these sources can fail so the fallback branch is not eliminated. Replace current use of MinstdRand0 with SplitMix.
When arc4random is unavailable use system entropy to initialize ashared staticSplitMix PRNG and use its output forunpredictableSeed. The original plan was for the global SplitMix to split off a thread-local SplitMix for each thread but upon reflection I decided that was unnecessary so I just use SplitMix as an ordinary PRNG.EDIT: There was a way after all. NowunpredictableSeedis now an alias ofunpredictableSeedOf!uint. I would rather the name beunpredictableSeed!uint/unpredictableSeed!ulongbut there seems to be no way to make that work while still allowingunpredictableSeedwith no parentheses.unpredictableSeed!uintis an alias ofunpredictableSeed.The OS-specific entropy gathering code was largely written by @wilzbach for libmir/mir-random#13.