Skip to content
This repository was archived by the owner on Oct 12, 2022. It is now read-only.
/ druntime Public archive

use array interface to hashOf()#1595

Merged
andralex merged 1 commit intodlang:masterfrom
WalterBright:useHashOf
Jun 25, 2016
Merged

use array interface to hashOf()#1595
andralex merged 1 commit intodlang:masterfrom
WalterBright:useHashOf

Conversation

@WalterBright
Copy link
Member

Because the array interface is safer.

}

override size_t getHash(in void* p) @safe pure nothrow const
override size_t getHash(in void* p) @trusted pure nothrow const
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the slicing is an unsafe operation. The older code was unsafe as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can this interface (dereferencing a void*) even be @safe in the first place?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getHash is supposed to take a pointer to the object matching the TypeInfo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"is supposed" is something entirely different than "is guaranteed to". You could of course argue that the interface was broken before, but it still remains a safety violation that should at least be recorded in Bugzilla (and subsequently be deprecated/made un-@safe).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do add a bugzilla issue for it. Such is beyond the scope of this PR, however, and should not impede it.

Copy link
Contributor

@dnadlinger dnadlinger Jun 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do add a bugzilla issue for it. Such is beyond the scope of this PR, however, and should not impede it.

The PR adds a @trusted annotation that is clearly wrong. It's barely acceptable because the code was already broken before, but now it is your responsibility to make sure it is eventually cleaned up. Previously, the place would have been flagged automatically once the mistaken @trusted annotation on hashOf is removed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @safe was equally wrong. This PR does not address (or change) the interface to getHash(), the change was made to reflect the unsafe operation inside the function. The issue you brought up is with the interface to getHash(), not the interface to hashOf().

your responsibility

It is the responsibility of people who find bugs to post them to bugzilla. The bug you found has nothing to do with this PR, which is to fix the interface to hashOf(). You should take the credit for finding the issue by posting it to bugzilla.

Meanwhile, this PR is only concerned with the interface to hashOf.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @safe was equally wrong.

No; @safe can never be wrong as in breaking the type system, it can just cause code not to compile. You are adding a new, incorrect instance of @trusted, which does add an extra type system violation.

As I'm sure you realise, this is a huge deal because instances where things are mistakenly marked as @safe can be detected automatically, which is impossible for mistaken uses of @trusted. The only way to find those is to painstakingly go through all occurrences of @trusted and manually audit each one of them.

Just trust me on that one. I've found a large share of the Phobos safety bugs you went through recently, this PR would add another such bug, I point it out, you fix/document it before it goes in. That's how pull request review works. You can't justify knowingly introducing an issue without any documentation, etc. by pointing out that the code base contained bugs before.

@WalterBright WalterBright force-pushed the useHashOf branch 3 times, most recently from e7e2b5b to 67bf11a Compare June 17, 2016 08:35
@PetarKirov
Copy link
Member

The changes LGTM. Interfaces with void* don't look @safe to me either, but I suggest we leave this topic for discussion in other PR, as this one is a clear improvement.

@WalterBright WalterBright force-pushed the useHashOf branch 3 times, most recently from 537cf8b to 077c4f6 Compare June 17, 2016 09:18
}

@trusted pure nothrow
size_t hashOf( const(void)* buf, size_t len, size_t seed = 0 )
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get rid of the default argument, it just causes bugs

@WalterBright WalterBright force-pushed the useHashOf branch 4 times, most recently from 35f7183 to 01537a2 Compare June 17, 2016 09:56
@dnadlinger
Copy link
Contributor

(This is currently blocked due to an incorrect addition of @trusted.)

@WalterBright
Copy link
Member Author

It is not incorrect. The revised code will not compile with @safe, even though the original, despite being marked @safe, was not detected as being unsafe. The unsafety was in the interface to hashOf(). By fixing the interface to hashOf(), the unsafety of the getHash() interface is exposed.

This is similar to a discussion I had with @yebblies some time back about adding const correctness incrementally. Adding const tends to be viral, i.e. an all-or-nothing proposition. But trying to const-correct the entire program in on PR is thoroughly impractical, so I was doing it incrementally by adding unsafe casts here and there to satisfy the type system.

As const spread through the program in successive PRs, the casts were removed.

@yebblies
Copy link
Contributor

As const spread through the program in successive PRs, the casts were removed.

Are they all gone now?

Of course, you could just do what every other contributor does and FINISH the changes in a branch before pushing little bits to master.

@dnadlinger
Copy link
Contributor

Sure – I did acknowledge that a temporary fix might be valid in this situation, since we need to deprecate the type system-breaking API first. All I'm asking is that you add a comment there mentioning the fact that @trusted is incorrect but has been added for backwards-compatibility, and an issue number with more details on the API problem. Incidentally, I just found out that a report of this has existed since more than two years ago.

@WalterBright
Copy link
Member Author

I just found out that a report of this has existed since more than two years ago.

I just came here to report the same thing! So what's stopping you from pulling this?

@schveiguy
Copy link
Member

schveiguy commented Jun 21, 2016

The correct marking of getHash is @system, period. No public API which takes a void * and assumes it points to valid memory of a given length should ever be marked @trusted. Whatever breaks should be then marked @trusted as follows:

  1. If you are 100% certain that the void * you are passing into getHash is the correct type, then mark the function (or at least the call to getHash) @trusted.
  2. Otherwise, mark it @system and find the next level of breakage. Repeat algorithm.

Note that item 1 doesn't include any of the calls being modified here.

Please, let's not obfuscate where the problem lies. If this breaks code, it's code worth breaking. Anyone who cares about @safe-ty doesn't want the compiler to just gloss over such details.

@WalterBright
Copy link
Member Author

WalterBright commented Jun 21, 2016

Redoing the interface is getHash and its consequences are way beyond the scope of this PR. This PR is solely about fixing the interface to hashOf, which already affects 13 files.

@dnadlinger
Copy link
Contributor

@WalterBright: Yes, the ticket does exist, but you neither mentioned it in code, nor in the commit message (where I would really want to see it at the very least).

The fact that the issue lay dormant in Bugzilla for more than two years with no duplicates being reported – even though many people have used and even worked on TypeInfo – should be a pretty clear indication that nobody is used to think about @safe-ty, to the point of not recognising even a blatantly unsafe API. If that's not reason enough for being very cautious about adding new @trusted mis-uses and making sure the reasoning is well-documented, I don't know what else would be.

@WalterBright
Copy link
Member Author

This PR does not change the interface to getHash. Trying to lump this PR in with another problem is a mistake and completely off-topic. If you want to raise the visibility of https://issues.dlang.org/show_bug.cgi?id=12516 that's fine, but it has nothing to do with this PR and should not block it.

@WalterBright
Copy link
Member Author

Yes, the ticket does exist, but you neither mentioned it in code,

That's not what we do. Bug reports go in Bugzilla.

nor in the commit message (where I would really want to see it at the very least).

It has nothing to do with this PR. It is an entirely separate problem.

@dnadlinger
Copy link
Contributor

dnadlinger commented Jun 21, 2016

That's not what we do. Bug reports go in Bugzilla.

Up to this, I've assumed that there a genuine misunderstanding here, but if you want to have a discussion on that level, I can also just close this PR and ask you to open a separate one for the single bug fix hidden among all the other refactorings.

It has nothing to do with this PR.

You are adding an incorrect @trusted attribute where there was none before. This is usually completely unacceptable and would block code from going in. Thus, the issue has everything to do with this PR, because it is the only justification you can possibly bring up for why we should accept the latter in its current form. Conversely, it should be easy for future readers to figure out why the broken code was introduced in the first place.

@schveiguy
Copy link
Member

You are adding an incorrect @trusted attribute where there was none before.

IMO, a @trusted function that can be abused is no worse or better than a @safe function that can be abused. Whether getHash commits the egregious behavior or hashOf does, that is not the issue. This PR does nothing to further safety. Except for the actual bug fix, this should be closed.

@WalterBright
Copy link
Member Author

why the broken code was introduced in the first place.

Broken code was not introduced by this PR. If you believe it does, please re-examine it.

Furthermore, this is a misunderstanding of @trusted. @trusted is a statement about the contents of the function, not the interface. The interface did not change, and the function was never safe, and it was never safe because of the unsafe interface to hashOf. Fixing hashOf, the lowest level code, is the first step to improving the safety of the call chain.

a @trusted function that can be abused is no worse or better than a @safe function that can be abused.

That is correct.

This PR does nothing to further safety.

This is incorrect. It fixes the unsafe interface of hashOf. One thing at a time per PR.

this should be closed.

An all-or-nothing approach to fixing safety problems means that zero progress can be made.

@schveiguy
Copy link
Member

It fixes the unsafe interface of hashOf

I think you mean something different, like that it avoids using the unsafe interface?

But in any case, this doesn't push the problem to the right level. Passing a memory-invalid array to a @safe function isn't any safer. The right level is where you know the void * actually points at an int (or whatever). Intermediate fixes are rubble bouncing.

An all-or-nothing approach to fixing safety problems means that zero progress can be made.

There is no reason to avoid calling the unsafe interface when the operation itself is actually unsafe. IMO, this PR is already achieving zero progress.

It doesn't have to be all or nothing. Fix calls to getHash to be properly marked @trusted at the right level, and then you can properly mark getHash as @system, because it is.

@dnadlinger
Copy link
Contributor

Broken code was not introduced by this PR. If you believe it does, please re-examine it.

I suppose it is only fair then to ask that you have another look at the proposed code yourself. The commit introduces the fragment p[0 .. initializer().length] in a function marked as @trusted. I think you'll find it hard to argue that this is not incorrect code.

In any case, would you mind separating out the actual one-line bug fix (zero length vs. seed) into a second PR so I can merge it right away? All the usual arguments you make for DMD PRs apply.

@WalterBright
Copy link
Member Author

But in any case, this doesn't push the problem to the right level.

Of course it does. There's no reason for hashOf to be trusted instead of safe, and this fixes the problem with hashOf.

I think you'll find it hard to argue that this is not incorrect code.

The problem was always there, it was just hidden in the call to hashOf due to the unsafe interface to hashOf. This PR does not introduce a problem.

In any case, would you mind separating out the actual one-line bug fix (zero length vs. seed) into a second PR so I can merge it right away?

No, because it's time to settle this point, not defer it. The hashOf interface needs to be fixed, and this PR fixes it and does NOT degrade anything else.

Adding a requirement for PRs to only be "full stack" fixes is unworkable in general. What needs to happen is progress, piece by piece. Otherwise druntime/phobos will simply never get fixed to be safe.

It's exactly the same issue one runs into when trying to retrofit const-correctness into an existing code base. Doing it incrementally, bottom up, is the only practical way, and requires the introduction of temporary cast(char*) bits of code. If a principled stand is taken against that, the reality is the code will never become const-correct.

@andralex
Copy link
Member

I have two questions about this:

  • Are there improvements in the compiler that detect @safe is misapplied to hashOf and render it incompilable?
  • What is the plan after this step?


@trusted pure nothrow
size_t hashOf( const(void)* buf, size_t len, size_t seed = 0 )
@trusted pure nothrow @nogc
Copy link
Contributor

@dnadlinger dnadlinger Jun 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Somewhat tangentially to the main discussion, this should be changed to @system right away. rt.util is a druntime-internal module, and there is nothing to be gained from keeping the broken attribute around.)

Copy link
Contributor

@dnadlinger dnadlinger Jun 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(And unless you make this @system, you can hardly claim that it makes any piece of code more safe. More @safe, anyway – there is an argument to be made that removing the default value already makes client code more resilient against bugs.) (edited punctuation for clarity)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left the old function in there because removing it may break existing code, adding system to it may break existing code, and breaking existing code should be done separately if possible. Not lumped in with other things. Even if hashOf is an internal function, there's nothing stopping anyone from using it. There being another hashOf in object.d is unfortunate and further obfuscates things, but again, that is not the topic of this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if hashOf is an internal function, there's nothing stopping anyone from using it.

rt.* is not distributed to users.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WalterBright response?

Copy link
Member Author

@WalterBright WalterBright Jun 25, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyone can import it - it's in import/core/internal/hash.d

Changing that is properly a separate PR.

Edit: Ah, you're right, I had it confused.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you're right, I had it confused.

No worries – all the functions having (almost) identical names make it easy to mix them up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you're right, I had it confused.

No worries – all the functions having (almost) identical names make it easy to mix them up.

@dnadlinger
Copy link
Contributor

dnadlinger commented Jun 22, 2016

@WalterBright:

The problem was always there, it was just hidden in the call to hashOf due to the unsafe interface to hashOf. This PR does not introduce a problem.

This is correct as far as the client (druntime user) is concerned. The function was mistakenly callable from safe code before, and continues to be so. However, this is completely besides the point – I already agreed that deprecating the old getHash interface is something we can handle separately.

Where this PR does introduce a problem, however, is for maintaining druntime. While the interface is still broken for the user just the same, your change spreads out the mistakenly @trusted bits further, increasing the number of places that we eventually have to remember to go back to and fix. With that one getHash definition still being @safe you would have gotten an error if you had removed the mistaken @trusted annotation from hashOf. Instead, we now have another broken piece of code that goes undetected by default and is hard to make sense of when browsing the source code. I'm afraid you'll have to explain to me why you think that documenting something like this is a bad idea, because I can't see a reason not to.

In case that was somehow lost in translation, this is literally what I have been asking for previously, and still am – adding a short comment describing why introducing a piece of code not that different from auto foo() @trusted { return (cast(void*)0xdeadbeef)[0 .. 1]; } is a justifiable thing to do. Again, I didn't ever disagree with you on it being a valid temporary measure. I just think you need to make obvious to other contributors what is going on there.

And besides, isn't the fact that it managed to spawn a discussion of this length between experts enough of a hint enough that there is something at odds with just "use array interface to hashOf()" as the sole explanation of this contribution? I'm honestly not sure how asking for more than five words of description for a subtle matter like this can even be controversial in the first place.

In any case, would you mind separating out the actual one-line bug fix (zero length vs. seed) into a second PR so I can merge it right away?

No, because it's time to settle this point, not defer it.

It was only a suggestion to resolve this impasse, as I find it silly to hold up a bug fix and spend absurd amounts of time on a discussion just because of your apparent unwillingness to even minimally document your work. If you want to continue this argument instead, fine by me.

Unless you insist, I'm not going to comment on the (non-)applicability of your comparison to const-correctness or the technical merits of resorting to BOLD ALL CAPS for fallacious arguments, since I don't think those are related to what my point is.


@andralex:
I'm not sure how your first question is relevant to this discussion. hashOf is a druntime-internal function that is marked @trusted, and as such I don't think proving memory safety violations in it is something high up on our list of priorities (although I suppose we could think about making known-unsafe operations in @trusted code errors too).

@WalterBright
Copy link
Member Author

your change spreads out the mistakenly @trusted bits further,

No, it does not. It simply identifies code that should have been marked @trusted to begin with, because it was not @safe despite being marked @safe.

It was only a suggestion to resolve this impasse

The consequences will be to avoid making a decision and never fix hashOf.

I'm not going to comment on the (non-)applicability of your comparison to const-correctness

It's an apt comparison. Insisting on a "full stack" approach to fixing viral issues means nothing will get done.

your apparent unwillingness to even minimally document your work.

I see no purpose to documenting things that are listed as bugs in bugzilla. You're putting forth a new policy from existing practice. You might as well require that all phobos bugs in bugzilla also get PRs to document them in the source code. Remember, this PR is not about fixing bugs in the interface to getHash, it is about making the interface to hashOf safe. Conflating this with other problems is not how PRs are supposed to work. Grepping a codebase for FIXME is the old way of doing things. We use bugzilla now.

And besides, simply marking it as @trusted is supposed to be a flag for reviewers. It is not hidden.

@WalterBright
Copy link
Member Author

Are there improvements in the compiler that detect @safe is misapplied to hashOf and render it incompilable?

The @safe was misapplied to getHash, not hashOf. hashOf was marked @trusted although it did not provide a trusted interface to what it was doing. It's like marking free(void*) as trusted. Switching to an array interface makes it possible to be trusted.

What is the plan after this step?

My plan was grepping the druntime sources for .ptr and reviewing/improving them one by one. I'd be surprised if there wasn't more crap in there like hashOf.

@WalterBright
Copy link
Member Author

Keep in mind that this PR is very simple and straightforward:

It adds an overload to hashOf that provides a safe array interface rather than an unsafe ptr/length interface. All callers to it in druntime were fixed to use the new array interface.

It's a self-contained, incremental step towards improving the overall safety of druntime/Phobos.

@dnadlinger
Copy link
Contributor

dnadlinger commented Jun 22, 2016

No, it does not. It simply identifies code that should have been marked @trusted to begin with, because it was not @safe despite being marked @safe.

[…]

The @safe was misapplied to getHash,

These are nonsensical statements. @safe cannot be misapplied [1], it can only be mistakenly allowed due to compiler bugs. Neither of it is the case here.

Also, the code shouldn't have been marked @trusted to begin with, but @system. I suspect (and dearly hope) that this is obvious to you, but it's not exactly helping your argument.

[1] In the sense that compilation will fail instead of it causing a type-system breaking bug.

@schveiguy
Copy link
Member

schveiguy commented Jun 23, 2016

It adds an overload to hashOf that provides a safe array interface rather than an unsafe ptr/length interface.

This PR does not do that. I already pulled #1593

{
byte[] s = *cast(byte[]*)p;
return rt.util.hash.hashOf(s.ptr, s.length * byte.sizeof);
const s = *cast(void[]*)p;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please cast to const void[] instead

@WalterBright
Copy link
Member Author

This PR does not do that. I already pulled #1593

You're right. This one fixes the calls to it.

it can only be mistakenly allowed due to compiler bugs. Neither of it is the case here.

This is incorrect. The problem was that hashOf was marked @trusted, even though hashOf did not have a safe interface. Trusted can be used to hide unsafety from the compiler, and was in this case, and there is no compiler technology that can detect that in the general case.

the code shouldn't have been marked @trusted to begin with, but @System. I suspect (and dearly hope) that this is obvious to you

It is obvious to me. But adding @System to getHash is an API-breaking change, and needs to be in a separate PR and be evaluated for its breaking effects before doing so. Such is beyond the scope of this PR, which is solely about hashOf.

@andralex
Copy link
Member

This is a viable PR that marks progress on an issue. The intensity of the discussion is somewhat proportional; clearly something hanky-panky was going on in the virtual getHash method. Currently we can't fix that without breaking compatibility, but it's self-evidently good to mark that as @trusted instead of @safe (e.g. when searching for fishy code you grep for the former, not the latter).

@WalterBright please add this just before the getHash declaration:

// TODO: fix https://issues.dlang.org/show_bug.cgi?id=12516 e.g. by changing this to a truly safe interface.

I'll pull after @WalterBright addresses the nits.

@WalterBright
Copy link
Member Author

@andralex done

@andralex andralex merged commit 6251718 into dlang:master Jun 25, 2016
@andralex
Copy link
Member

alrighty then

@WalterBright
Copy link
Member Author

thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants