use array interface to hashOf() by WalterBright · Pull Request #1595 · dlang/druntime

WalterBright · 2016-06-17T08:04:38Z

Because the array interface is safer.

WalterBright · 2016-06-17T08:05:17Z

src/object.d

    }

-    override size_t getHash(in void* p) @safe pure nothrow const
+    override size_t getHash(in void* p) @trusted pure nothrow const


Because the slicing is an unsafe operation. The older code was unsafe as well.

How can this interface (dereferencing a void*) even be @safe in the first place?

getHash is supposed to take a pointer to the object matching the TypeInfo.

"is supposed" is something entirely different than "is guaranteed to". You could of course argue that the interface was broken before, but it still remains a safety violation that should at least be recorded in Bugzilla (and subsequently be deprecated/made un-@safe).

Please do add a bugzilla issue for it. Such is beyond the scope of this PR, however, and should not impede it.

Please do add a bugzilla issue for it. Such is beyond the scope of this PR, however, and should not impede it.

The PR adds a @trusted annotation that is clearly wrong. It's barely acceptable because the code was already broken before, but now it is your responsibility to make sure it is eventually cleaned up. Previously, the place would have been flagged automatically once the mistaken @trusted annotation on hashOf is removed.

The @safe was equally wrong. This PR does not address (or change) the interface to getHash(), the change was made to reflect the unsafe operation inside the function. The issue you brought up is with the interface to getHash(), not the interface to hashOf().

your responsibility

It is the responsibility of people who find bugs to post them to bugzilla. The bug you found has nothing to do with this PR, which is to fix the interface to hashOf(). You should take the credit for finding the issue by posting it to bugzilla.

Meanwhile, this PR is only concerned with the interface to hashOf.

The @safe was equally wrong.

No; @safe can never be wrong as in breaking the type system, it can just cause code not to compile. You are adding a new, incorrect instance of @trusted, which does add an extra type system violation.

As I'm sure you realise, this is a huge deal because instances where things are mistakenly marked as @safe can be detected automatically, which is impossible for mistaken uses of @trusted. The only way to find those is to painstakingly go through all occurrences of @trusted and manually audit each one of them.

Just trust me on that one. I've found a large share of the Phobos safety bugs you went through recently, this PR would add another such bug, I point it out, you fix/document it before it goes in. That's how pull request review works. You can't justify knowingly introducing an issue without any documentation, etc. by pointing out that the code base contained bugs before.

PetarKirov · 2016-06-17T08:59:27Z

The changes LGTM. Interfaces with void* don't look @safe to me either, but I suggest we leave this topic for discussion in other PR, as this one is a clear improvement.

WalterBright · 2016-06-17T09:19:46Z

src/rt/util/hash.d

 }

 @trusted pure nothrow
-size_t hashOf( const(void)* buf, size_t len, size_t seed = 0 )


get rid of the default argument, it just causes bugs

dnadlinger · 2016-06-17T12:22:26Z

(This is currently blocked due to an incorrect addition of @trusted.)

WalterBright · 2016-06-17T19:43:14Z

It is not incorrect. The revised code will not compile with @safe, even though the original, despite being marked @safe, was not detected as being unsafe. The unsafety was in the interface to hashOf(). By fixing the interface to hashOf(), the unsafety of the getHash() interface is exposed.

This is similar to a discussion I had with @yebblies some time back about adding const correctness incrementally. Adding const tends to be viral, i.e. an all-or-nothing proposition. But trying to const-correct the entire program in on PR is thoroughly impractical, so I was doing it incrementally by adding unsafe casts here and there to satisfy the type system.

As const spread through the program in successive PRs, the casts were removed.

yebblies · 2016-06-18T12:01:00Z

As const spread through the program in successive PRs, the casts were removed.

Are they all gone now?

Of course, you could just do what every other contributor does and FINISH the changes in a branch before pushing little bits to master.

dnadlinger · 2016-06-18T23:43:59Z

Sure – I did acknowledge that a temporary fix might be valid in this situation, since we need to deprecate the type system-breaking API first. All I'm asking is that you add a comment there mentioning the fact that @trusted is incorrect but has been added for backwards-compatibility, and an issue number with more details on the API problem. Incidentally, I just found out that a report of this has existed since more than two years ago.

WalterBright · 2016-06-19T01:38:57Z

I just found out that a report of this has existed since more than two years ago.

I just came here to report the same thing! So what's stopping you from pulling this?

schveiguy · 2016-06-21T00:34:28Z

The correct marking of getHash is @system, period. No public API which takes a void * and assumes it points to valid memory of a given length should ever be marked @trusted. Whatever breaks should be then marked @trusted as follows:

If you are 100% certain that the void * you are passing into getHash is the correct type, then mark the function (or at least the call to getHash) @trusted.
Otherwise, mark it @system and find the next level of breakage. Repeat algorithm.

Note that item 1 doesn't include any of the calls being modified here.

Please, let's not obfuscate where the problem lies. If this breaks code, it's code worth breaking. Anyone who cares about @safe-ty doesn't want the compiler to just gloss over such details.

WalterBright · 2016-06-21T02:53:07Z

Redoing the interface is getHash and its consequences are way beyond the scope of this PR. This PR is solely about fixing the interface to hashOf, which already affects 13 files.

dnadlinger · 2016-06-21T09:50:08Z

@WalterBright: Yes, the ticket does exist, but you neither mentioned it in code, nor in the commit message (where I would really want to see it at the very least).

The fact that the issue lay dormant in Bugzilla for more than two years with no duplicates being reported – even though many people have used and even worked on TypeInfo – should be a pretty clear indication that nobody is used to think about @safe-ty, to the point of not recognising even a blatantly unsafe API. If that's not reason enough for being very cautious about adding new @trusted mis-uses and making sure the reasoning is well-documented, I don't know what else would be.

WalterBright · 2016-06-21T10:57:01Z

This PR does not change the interface to getHash. Trying to lump this PR in with another problem is a mistake and completely off-topic. If you want to raise the visibility of https://issues.dlang.org/show_bug.cgi?id=12516 that's fine, but it has nothing to do with this PR and should not block it.

WalterBright · 2016-06-21T11:00:02Z

Yes, the ticket does exist, but you neither mentioned it in code,

That's not what we do. Bug reports go in Bugzilla.

nor in the commit message (where I would really want to see it at the very least).

It has nothing to do with this PR. It is an entirely separate problem.

dnadlinger · 2016-06-21T11:08:09Z

That's not what we do. Bug reports go in Bugzilla.

Up to this, I've assumed that there a genuine misunderstanding here, but if you want to have a discussion on that level, I can also just close this PR and ask you to open a separate one for the single bug fix hidden among all the other refactorings.

It has nothing to do with this PR.

You are adding an incorrect @trusted attribute where there was none before. This is usually completely unacceptable and would block code from going in. Thus, the issue has everything to do with this PR, because it is the only justification you can possibly bring up for why we should accept the latter in its current form. Conversely, it should be easy for future readers to figure out why the broken code was introduced in the first place.

schveiguy · 2016-06-21T11:35:21Z

You are adding an incorrect @trusted attribute where there was none before.

IMO, a @trusted function that can be abused is no worse or better than a @safe function that can be abused. Whether getHash commits the egregious behavior or hashOf does, that is not the issue. This PR does nothing to further safety. Except for the actual bug fix, this should be closed.

WalterBright · 2016-06-21T19:34:50Z

why the broken code was introduced in the first place.

Broken code was not introduced by this PR. If you believe it does, please re-examine it.

Furthermore, this is a misunderstanding of @trusted. @trusted is a statement about the contents of the function, not the interface. The interface did not change, and the function was never safe, and it was never safe because of the unsafe interface to hashOf. Fixing hashOf, the lowest level code, is the first step to improving the safety of the call chain.

a @trusted function that can be abused is no worse or better than a @safe function that can be abused.

That is correct.

This PR does nothing to further safety.

This is incorrect. It fixes the unsafe interface of hashOf. One thing at a time per PR.

this should be closed.

An all-or-nothing approach to fixing safety problems means that zero progress can be made.

schveiguy · 2016-06-21T20:01:28Z

It fixes the unsafe interface of hashOf

I think you mean something different, like that it avoids using the unsafe interface?

But in any case, this doesn't push the problem to the right level. Passing a memory-invalid array to a @safe function isn't any safer. The right level is where you know the void * actually points at an int (or whatever). Intermediate fixes are rubble bouncing.

An all-or-nothing approach to fixing safety problems means that zero progress can be made.

There is no reason to avoid calling the unsafe interface when the operation itself is actually unsafe. IMO, this PR is already achieving zero progress.

It doesn't have to be all or nothing. Fix calls to getHash to be properly marked @trusted at the right level, and then you can properly mark getHash as @system, because it is.

dnadlinger · 2016-06-21T22:06:16Z

Broken code was not introduced by this PR. If you believe it does, please re-examine it.

I suppose it is only fair then to ask that you have another look at the proposed code yourself. The commit introduces the fragment p[0 .. initializer().length] in a function marked as @trusted. I think you'll find it hard to argue that this is not incorrect code.

In any case, would you mind separating out the actual one-line bug fix (zero length vs. seed) into a second PR so I can merge it right away? All the usual arguments you make for DMD PRs apply.

WalterBright · 2016-06-22T00:13:55Z

But in any case, this doesn't push the problem to the right level.

Of course it does. There's no reason for hashOf to be trusted instead of safe, and this fixes the problem with hashOf.

I think you'll find it hard to argue that this is not incorrect code.

The problem was always there, it was just hidden in the call to hashOf due to the unsafe interface to hashOf. This PR does not introduce a problem.

In any case, would you mind separating out the actual one-line bug fix (zero length vs. seed) into a second PR so I can merge it right away?

No, because it's time to settle this point, not defer it. The hashOf interface needs to be fixed, and this PR fixes it and does NOT degrade anything else.

Adding a requirement for PRs to only be "full stack" fixes is unworkable in general. What needs to happen is progress, piece by piece. Otherwise druntime/phobos will simply never get fixed to be safe.

It's exactly the same issue one runs into when trying to retrofit const-correctness into an existing code base. Doing it incrementally, bottom up, is the only practical way, and requires the introduction of temporary cast(char*) bits of code. If a principled stand is taken against that, the reality is the code will never become const-correct.

andralex · 2016-06-22T12:17:48Z

I have two questions about this:

Are there improvements in the compiler that detect @safe is misapplied to hashOf and render it incompilable?
What is the plan after this step?

dnadlinger · 2016-06-22T15:03:42Z

src/rt/util/hash.d


-@trusted pure nothrow
-size_t hashOf( const(void)* buf, size_t len, size_t seed = 0 )
+@trusted pure nothrow @nogc


(Somewhat tangentially to the main discussion, this should be changed to @system right away. rt.util is a druntime-internal module, and there is nothing to be gained from keeping the broken attribute around.)

(And unless you make this @system, you can hardly claim that it makes any piece of code more safe. More @safe, anyway – there is an argument to be made that removing the default value already makes client code more resilient against bugs.) (edited punctuation for clarity)

I left the old function in there because removing it may break existing code, adding system to it may break existing code, and breaking existing code should be done separately if possible. Not lumped in with other things. Even if hashOf is an internal function, there's nothing stopping anyone from using it. There being another hashOf in object.d is unfortunate and further obfuscates things, but again, that is not the topic of this PR.

Even if hashOf is an internal function, there's nothing stopping anyone from using it.

rt.* is not distributed to users.

@WalterBright response?

Anyone can import it - it's in import/core/internal/hash.d

Changing that is properly a separate PR.

Edit: Ah, you're right, I had it confused.

Ah, you're right, I had it confused.

No worries – all the functions having (almost) identical names make it easy to mix them up.

dnadlinger · 2016-06-22T16:08:19Z

@WalterBright:

The problem was always there, it was just hidden in the call to hashOf due to the unsafe interface to hashOf. This PR does not introduce a problem.

This is correct as far as the client (druntime user) is concerned. The function was mistakenly callable from safe code before, and continues to be so. However, this is completely besides the point – I already agreed that deprecating the old getHash interface is something we can handle separately.

Where this PR does introduce a problem, however, is for maintaining druntime. While the interface is still broken for the user just the same, your change spreads out the mistakenly @trusted bits further, increasing the number of places that we eventually have to remember to go back to and fix. With that one getHash definition still being @safe you would have gotten an error if you had removed the mistaken @trusted annotation from hashOf. Instead, we now have another broken piece of code that goes undetected by default and is hard to make sense of when browsing the source code. I'm afraid you'll have to explain to me why you think that documenting something like this is a bad idea, because I can't see a reason not to.

In case that was somehow lost in translation, this is literally what I have been asking for previously, and still am – adding a short comment describing why introducing a piece of code not that different from auto foo() @trusted { return (cast(void*)0xdeadbeef)[0 .. 1]; } is a justifiable thing to do. Again, I didn't ever disagree with you on it being a valid temporary measure. I just think you need to make obvious to other contributors what is going on there.

And besides, isn't the fact that it managed to spawn a discussion of this length between experts enough of a hint enough that there is something at odds with just "use array interface to hashOf()" as the sole explanation of this contribution? I'm honestly not sure how asking for more than five words of description for a subtle matter like this can even be controversial in the first place.

In any case, would you mind separating out the actual one-line bug fix (zero length vs. seed) into a second PR so I can merge it right away?

No, because it's time to settle this point, not defer it.

It was only a suggestion to resolve this impasse, as I find it silly to hold up a bug fix and spend absurd amounts of time on a discussion just because of your apparent unwillingness to even minimally document your work. If you want to continue this argument instead, fine by me.

Unless you insist, I'm not going to comment on the (non-)applicability of your comparison to const-correctness or the technical merits of resorting to BOLD ALL CAPS for fallacious arguments, since I don't think those are related to what my point is.

@andralex:
I'm not sure how your first question is relevant to this discussion. hashOf is a druntime-internal function that is marked @trusted, and as such I don't think proving memory safety violations in it is something high up on our list of priorities (although I suppose we could think about making known-unsafe operations in @trusted code errors too).

WalterBright · 2016-06-22T21:10:00Z

your change spreads out the mistakenly @trusted bits further,

No, it does not. It simply identifies code that should have been marked @trusted to begin with, because it was not @safe despite being marked @safe.

It was only a suggestion to resolve this impasse

The consequences will be to avoid making a decision and never fix hashOf.

I'm not going to comment on the (non-)applicability of your comparison to const-correctness

It's an apt comparison. Insisting on a "full stack" approach to fixing viral issues means nothing will get done.

your apparent unwillingness to even minimally document your work.

I see no purpose to documenting things that are listed as bugs in bugzilla. You're putting forth a new policy from existing practice. You might as well require that all phobos bugs in bugzilla also get PRs to document them in the source code. Remember, this PR is not about fixing bugs in the interface to getHash, it is about making the interface to hashOf safe. Conflating this with other problems is not how PRs are supposed to work. Grepping a codebase for FIXME is the old way of doing things. We use bugzilla now.

And besides, simply marking it as @trusted is supposed to be a flag for reviewers. It is not hidden.

WalterBright · 2016-06-22T21:14:20Z

Are there improvements in the compiler that detect @safe is misapplied to hashOf and render it incompilable?

The @safe was misapplied to getHash, not hashOf. hashOf was marked @trusted although it did not provide a trusted interface to what it was doing. It's like marking free(void*) as trusted. Switching to an array interface makes it possible to be trusted.

What is the plan after this step?

My plan was grepping the druntime sources for .ptr and reviewing/improving them one by one. I'd be surprised if there wasn't more crap in there like hashOf.

WalterBright · 2016-06-22T21:30:31Z

Keep in mind that this PR is very simple and straightforward:

It adds an overload to hashOf that provides a safe array interface rather than an unsafe ptr/length interface. All callers to it in druntime were fixed to use the new array interface.

It's a self-contained, incremental step towards improving the overall safety of druntime/Phobos.

dnadlinger · 2016-06-22T21:34:58Z

No, it does not. It simply identifies code that should have been marked @trusted to begin with, because it was not @safe despite being marked @safe.

[…]

The @safe was misapplied to getHash,

These are nonsensical statements. @safe cannot be misapplied [1], it can only be mistakenly allowed due to compiler bugs. Neither of it is the case here.

Also, the code shouldn't have been marked @trusted to begin with, but @system. I suspect (and dearly hope) that this is obvious to you, but it's not exactly helping your argument.

[1] In the sense that compilation will fail instead of it causing a type-system breaking bug.

schveiguy · 2016-06-23T14:09:41Z

It adds an overload to hashOf that provides a safe array interface rather than an unsafe ptr/length interface.

This PR does not do that. I already pulled #1593

andralex · 2016-06-24T23:48:00Z

src/rt/typeinfo/ti_Ag.d

    {
-        byte[] s = *cast(byte[]*)p;
-        return rt.util.hash.hashOf(s.ptr, s.length * byte.sizeof);
+        const s = *cast(void[]*)p;


please cast to const void[] instead

WalterBright · 2016-06-24T23:52:11Z

This PR does not do that. I already pulled #1593

You're right. This one fixes the calls to it.

it can only be mistakenly allowed due to compiler bugs. Neither of it is the case here.

This is incorrect. The problem was that hashOf was marked @trusted, even though hashOf did not have a safe interface. Trusted can be used to hide unsafety from the compiler, and was in this case, and there is no compiler technology that can detect that in the general case.

the code shouldn't have been marked @trusted to begin with, but @System. I suspect (and dearly hope) that this is obvious to you

It is obvious to me. But adding @System to getHash is an API-breaking change, and needs to be in a separate PR and be evaluated for its breaking effects before doing so. Such is beyond the scope of this PR, which is solely about hashOf.

andralex · 2016-06-24T23:55:22Z

This is a viable PR that marks progress on an issue. The intensity of the discussion is somewhat proportional; clearly something hanky-panky was going on in the virtual getHash method. Currently we can't fix that without breaking compatibility, but it's self-evidently good to mark that as @trusted instead of @safe (e.g. when searching for fishy code you grep for the former, not the latter).

@WalterBright please add this just before the getHash declaration:

// TODO: fix https://issues.dlang.org/show_bug.cgi?id=12516 e.g. by changing this to a truly safe interface.

I'll pull after @WalterBright addresses the nits.

WalterBright · 2016-06-25T00:05:26Z

@andralex done

andralex · 2016-06-25T20:58:28Z

alrighty then

WalterBright · 2016-06-25T21:22:08Z

thanks!

WalterBright reviewed Jun 17, 2016
View reviewed changes

WalterBright force-pushed the useHashOf branch 3 times, most recently from e7e2b5b to 67bf11a Compare June 17, 2016 08:35

WalterBright force-pushed the useHashOf branch 3 times, most recently from 537cf8b to 077c4f6 Compare June 17, 2016 09:18

WalterBright reviewed Jun 17, 2016
View reviewed changes

WalterBright force-pushed the useHashOf branch 4 times, most recently from 35f7183 to 01537a2 Compare June 17, 2016 09:56

dnadlinger reviewed Jun 22, 2016
View reviewed changes

andralex reviewed Jun 24, 2016
View reviewed changes

WalterBright force-pushed the useHashOf branch from 01537a2 to 71f0512 Compare June 25, 2016 00:03

WalterBright force-pushed the useHashOf branch from 71f0512 to 00d3156 Compare June 25, 2016 00:18

use array interface to hashOf()

267fa2a

WalterBright force-pushed the useHashOf branch from 00d3156 to 267fa2a Compare June 25, 2016 00:24

andralex merged commit 6251718 into dlang:master Jun 25, 2016

WalterBright deleted the useHashOf branch June 25, 2016 22:09

WalterWaldron mentioned this pull request Dec 4, 2016

[REG 2.072] Revert object.hashOf changes from "use array interface to hashOf()" #1707

Merged

Uh oh!

Conversation

WalterBright commented Jun 17, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dnadlinger Jun 17, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

PetarKirov commented Jun 17, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dnadlinger commented Jun 17, 2016

Uh oh!

WalterBright commented Jun 17, 2016

Uh oh!

yebblies commented Jun 18, 2016

Uh oh!

dnadlinger commented Jun 18, 2016

Uh oh!

WalterBright commented Jun 19, 2016

Uh oh!

schveiguy commented Jun 21, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

WalterBright commented Jun 21, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dnadlinger commented Jun 21, 2016

Uh oh!

WalterBright commented Jun 21, 2016

Uh oh!

WalterBright commented Jun 21, 2016

Uh oh!

dnadlinger commented Jun 21, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

schveiguy commented Jun 21, 2016

Uh oh!

WalterBright commented Jun 21, 2016

Uh oh!

schveiguy commented Jun 21, 2016

Uh oh!

dnadlinger commented Jun 21, 2016

Uh oh!

WalterBright commented Jun 22, 2016

Uh oh!

andralex commented Jun 22, 2016

Uh oh!

dnadlinger Jun 22, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dnadlinger Jun 22, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

WalterBright Jun 25, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

dnadlinger Jun 17, 2016 •

edited

Loading

schveiguy commented Jun 21, 2016 •

edited

Loading

WalterBright commented Jun 21, 2016 •

edited

Loading

dnadlinger commented Jun 21, 2016 •

edited

Loading

dnadlinger Jun 22, 2016 •

edited

Loading

dnadlinger Jun 22, 2016 •

edited

Loading

WalterBright Jun 25, 2016 •

edited

Loading

dnadlinger commented Jun 22, 2016 •

edited

Loading

dnadlinger commented Jun 22, 2016 •

edited

Loading

schveiguy commented Jun 23, 2016 •

edited

Loading