Skip to content

Comments

Fix issue 19065: don't check struct invariant before destructor runs.#8462

Closed
FeepingCreature wants to merge 2 commits intodlang:masterfrom
FeepingCreature:fix/Issue-19065-skip-invariant-before-struct-dtor
Closed

Fix issue 19065: don't check struct invariant before destructor runs.#8462
FeepingCreature wants to merge 2 commits intodlang:masterfrom
FeepingCreature:fix/Issue-19065-skip-invariant-before-struct-dtor

Conversation

@FeepingCreature
Copy link
Contributor

@FeepingCreature FeepingCreature commented Jul 6, 2018

This follows up on dlang/dlang.org#2410

Implicit in the implementation of moveEmplace is that T.init is always a valid value of T.

This isn't necessarily the case, but it isn't actually necessary for it to be the case - all that's necessary is that T.init always be a valid value to call ~this() on. This is because ~this() is the only struct method whose invocation is utterly unavoidable.

Note that moveEmplace already makes this assumption.

Maybe a better fix would be checking for this == typeof(this).init, and running the invariant otherwise - but I don't know how to do that.

@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @FeepingCreature! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the annotated coverage diff directly on GitHub with CodeCov's browser extension
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Auto-close Bugzilla Severity Description
19065 normal Struct invariant violated in @safe with T.init

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + dmd#8462"

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Jul 6, 2018

  1. Tests are failing.
  2. Comment out tests.
  3. Tests are no longer failing!

😄

Joke aside, I think this is the most compact change that makes Nullable actually usable for all types. (See dlang/phobos#6619 (comment) , which prompted it.)

@WalterBright
Copy link
Member

This is an enhancement, not a bug report. See 19065. Needs a compelling rationale.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Jul 8, 2018

See also forum thread for attempt at running discussion.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Jul 9, 2018

Would it help if I made a DIP?

@wilzbach
Copy link
Contributor

wilzbach commented Jul 9, 2018

Yes, for a DIP there's a formal layout and procedure which will ensure a decision/reply.
However, there might be more immediate things that need fixing too :/

@FeepingCreature
Copy link
Contributor Author

Yeah I'll try to get generic Nullable merged and probably work around this weird design decision in userland, As One Does In D.

Getting ready to write a whole lot of

invariant
{
    if (this.initialized)
    {
        ...

@wilzbach
Copy link
Contributor

wilzbach commented Jul 9, 2018

BTW as you are looking into D's nullable Lukas (from the C++ Munich Meetup) shared a very interesting finding with me.
D's codegen for nullable is a lot worse than C++ and Rust (given the same llvm backend):

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Jul 9, 2018

Okay, I think part of it is that for some reason ldc decides it's a good idea to fill the upper 7 bytes of Nullable (not taken up by the bool) with zeroes manually using a memcpy, and LLVM can't handle this. After that I don't know. I think possibly because Optional is standardized (in Rust as well), clang++ is smart enough to decide that if an optional is null, it doesn't matter what's in the value field. I don't know if D can compete with tricks like that; we'd probably need an Optional type in the compiler.

edit: Note the asm looks a lot better if you inline Nullable.

@yshui
Copy link
Contributor

yshui commented Jul 13, 2018

I don't like this change.

I mean, if this is what you want, you can always do that in your invariant:

invariant {
    assert(this == typeof(this).init || your actual check here);
}

@FeepingCreature
Copy link
Contributor Author

On reflection, this is probably not the way to go.

The thing that I actually need is a way to decouple struct value lifetime from struct variable lifetime. This is necessary so you can implement container types containing arbitrary types in a way that doesn't end up with the container types crashing when they go out of scope and D tries to destruct the variables that they contain, which may not even be initialized to values but to, say, T.init. Nullable is the primary example of this, but it affects any type where the lifetime of a contained value may not match the lifetime of the container value.

@yshui
Copy link
Contributor

yshui commented Jul 13, 2018

@FeepingCreature This is not the problem here. Destructor of T should always be able to handle T.init.

One of @WalterBright's argument against struct default constructor is that not having it will force the programmer to consider T.init a valid value of T.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Jul 13, 2018

@FeepingCreature This is not the problem here. Destructor of T should always be able to handle T.init.

One of @WalterBright's argument against struct default constructor is that not having it will force the programmer to consider T.init a valid value of T.

You can @disable this though. And you can put invariants in structs. What the hell else does that mean rather than "you can define structs, not all of whose possible data values are legitimate struct values"? Then combine with the fact that ime, 90% of invariants forbid .init state, ie. null...

I mean, if we're deciding that structs aren't allowed to hold values that mustn't be null that indicates a pretty hefty step backwards for the language for me, since it'll force us to go back to classes for the majority of domain values, resulting in a massive tree of tiny values connected by a sprawling web of references, with all the GC overhead that implies.

My impression of .init was never that it was supposed to be a valid domain value. I thought the point was that it would fail in a deterministic fashion, ie. not randomly corrupt data but rather crash with a null pointer access. Hence the init of float being NaN, which is the definition of "not a domain value".

Copy link
Member

@UplinkCoder UplinkCoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this PR.
It's elegantly written and has a compelling rational behind it.

@yshui
Copy link
Contributor

yshui commented Jul 24, 2018

@UplinkCoder It doesn't compel me.

@FeepingCreature If you want a struct that never holds a null value. Than its .init can't be null either. Otherwise T.init will hold a null value and break your assumption that T is never null. T.init is always assumed to be a valid value (according to @WalterBright)

In fact doing what you are suggesting in this PR in fact semi solidifies that T.init is a valid value: you made it always valid for destructor calls. User has no way to prevent the variable from holding T.init if no other methods than the destructor is called.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Jul 24, 2018

@FeepingCreature If you want a struct that never holds a null value. Than its .init can't be null either.

That doesn't work. You can't give a struct a default value of an object that only makes sense at runtime. To say that its init can't be null is to say I can't have an invariant that a struct must hold a valid object; that is, I can't put an object reference in a struct and be assured it's not null. With that restriction, @disable this() makes zero sense -- why bother disabling the default constructor if the language forces me to leave another default constructor wide open?

In fact doing what you are suggesting in this PR in fact semi solidifies that T.init is a valid value: you made it always valid for destructor calls.

Yes, that's the point. I'm defining struct valid values to be {a value that passes the invariant, T.init}, then restricting the methods defined to be callable on an "extended valid value" to be only the destructor. Again, the only reason for this is that the destructor is the only struct method that is always, inevitably called and cannot be skipped. (Discounting the union hack.)

Semantically, I'm separating "struct values" from "struct variables". I'm saying, "any struct variable can, but doesn't have to, hold a struct value; however, it must be either holding a struct value or init; and if it holds init, only its destructor may be called." This is consistent with how it works right now - in fact, it's more consistent than it works right now - because the destructor is called lexically and has no relation to struct construction.

I understand that this is subtle, but you need to understand that barring hacks, it's impossible to write a (pointerless) Nullable that works for structs without it. It's a genuine hole in the language.

@yshui
Copy link
Contributor

yshui commented Jul 24, 2018

@FeepingCreature If so, instead of not check invariants at all before destructor, you should check if the variable is T.init, and only skip invariants in that case.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Jul 24, 2018

@yshui I agree that implementation would be superior, but I'm not good enough at DMD internals to implement it.

@yshui
Copy link
Contributor

yshui commented Jul 24, 2018

Also

Barring hacks

I don't know what do you mean by "hacks". Using union to circumvent destructor is not a hack. A language like D should provide a way to the user that allows them to decide whether the destructor should be called. The code can have information that the compiler doesn't to decide that skipping destructor can be safe.

And I want to backpaddle a bit on my argument. This change forces the destructor to handle T.init, on the other hand, if you allow the library to skip the destructor in case of T.init, then the user code doesn't need to worry about it at all.

All in all I just dislike changes that adds more "subtlety" to language. Especially when the added "subtlety" can be done equally well in the library. (In that case, Nullable needs to be fixed)

Edit: Just noticed in the forum thread you discussed how union doesn't work. I don't quite see why?

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Jul 24, 2018

I don't know what do you mean by "hacks". Using union to circumvent destructor is not a hack.

I don't understand how you can call that "not a hack". It's a massive hack. Unions cannot support destructors because they don't know what's in them, and that's inherent to unions, sure, but the expected answer to that isn't "so I guess unions can't call destructors", the expected and proper answer is "types with destructors cannot be put in unions." It is mere luck that D does not work that way.

My first preferred solution would be a language level way to mark a variable as "manually destroyed" (@__explicitLifetime?), with T.init being a valid initializer for those variables only. My second preferred solution is destructor invariants being skipped if the type's value is T.init, which I don't know enough about DMD to implement. My third preferred solution is either this PR or the union hack, and the union hack is already in the language. But it's still down the list of how I'd want to handle this, because regardless of the fact that unions work this way, I do believe that they shouldn't work this way.

Edit: Just noticed in the forum thread you discussed how union doesn't work. I don't quite see why?

I don't remember that, link please? In any case, my Nullable PR does rely on the union hack, so I am using it already. I just don't like it.

@yshui
Copy link
Contributor

yshui commented Jul 24, 2018

@FeepingCreature There're a lot of things in D that just "happen to work". And because of that, we already have a way to create "manually destroyed" variables, that is, by using unions. So a language change to add another way to do that is doing to face resistance. (BTW I don't see why only manually destroyed variable can be initialized by T.init: you totally can accept T.init in your invariant and handle it properly.)

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Jul 24, 2018

@FeepingCreature There're a lot of things in D that just "happen to work". And because of that, we already have a way to create "manually destroyed" variables, that is, by using unions.

Yes and if we rely on this too much, we'll never be able to get this unfortunate design decision fixed. It'll just be with us forever, because it'll be too much work to change. So I'm conflicted.

(BTW I don't see why only manually destroyed variable can be initialized by T.init: you totally can accept T.init in your invariant and handle it properly.)

Because language-managed variables should require that a constructor call be matched up with a destructor call because that's what constructor/destructor means - one makes a value, the other unmakes it. The proper way to create a variable is that its value is created by a constructor call and destroyed by a destructor call. The only case where T.init is relevant is for variables that may start out in an unconstructed state - which is what T.init is - in which case it's up to you to ensure that values that are constructed and put inside it are also destructed before the variable goes out of scope. To allow T.init to be destructors breaks the symmetry between constructor and destructor - it says "these are totally different kind of things." And that makes the language uglier for no gain. After all, if you want a simple way to construct a value of a type, you can always use the default constructor T(). T.init is only necessary if the default constructor has been @disabled - in which case you presumably care about the difference between a variable and a value and don't want all variables to be implicitly destructed either, especially if doing so requires a valid invariant.

@yshui
Copy link
Contributor

yshui commented Jul 24, 2018

@FeepingCreature First off, constructor is definitely not the only way to "make" a value, and people are doing that all the time. Like, what about POD structs?

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Jul 24, 2018

POD structs don't have destructors either.

Besides, you can just consider them to have "default constructors/destructors", which they actually will if you put a type with a constructor or destructor in them, and if you don't, well, how would you know?

Lots of types don't run into this issue. But when they do, it would be good if the language was internally consistent.

@safe void main()
{
S s = S.init;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the opposite fail? i.e.

struct S {
  bool value = false;
  invariant {
    assert(value == false);
  }
  @disable this();
}

void main() {
  S s = S.init;
  s.value = true;
}

I ask because I wonder if the check to see if the invariant needs to be called before the destructor depends on a by value comparison of this and T.init or an address?

If it's a by value comparison then will it work for classes where this == other compares addresses?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no such check. That was something that'd been proposed, but I don't know how to write it.

I'm going to close this PR. It doesn't do anything good or worthwhile, and I found a better way. (Unions skip destructors, apparently intentionally??)

@FeepingCreature
Copy link
Contributor Author

Closed because union hack lets us work around it.

@radcapricorn
Copy link
Contributor

I'd rather you revived that; the discussion on the forum shows clearly this is pretty much a requirement at this point. One thought though: shouldn't that extend to classes as well?

@radcapricorn
Copy link
Contributor

Unless a substantial language improvement can be made in terms of moving values around (thereby informing the compiler to elide destructors), precluding invariant in a destructor should be the way to go. T.init must be destructible state, but not necessarily useful state.

@radcapricorn
Copy link
Contributor

I wonder... how hard would it be to achieve an opt-in alternative? E.g.

struct S {
    invariant() { /* ... */ }
    ~this() @noinvariant { /* ... */ }
}

The wolves would be fed, the sheep would be safe.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Nov 25, 2018

I'm all for it in principle, but absent developer (meaning Walter/Andrei) commentary stating otherwise, it seems like the union hack is what we're supposed to use to manage values that may be initialized or not.

I don't like it, but I can live with it.

@aliak00
Copy link

aliak00 commented Nov 26, 2018

I wonder... how hard would it be to achieve an opt-in alternative? E.g.

But it still feels more correct to not destruct anything that's in a .init state no? Or is there some behavior that DMD relies on that would not allow for that?

I'm still not sure how 19065 is an enhancement and not a bug, as @WalterBright is arguing that a compile-time value should be a valid runtime value? (if i understood correctly).

Edit: also wouldn't it just be better code for the destructor not to be called if no type of constructor was called?

@radcapricorn
Copy link
Contributor

It's simple, really. The language is allowed to freely move values on an assumption they're not self-referencing. Ergo, moving values by user code should also be legal. This should work:

struct Container(T) {
    T data;
    void put(T value) {
        move(value, data);
    }
}

And it does so long as there are no invariants involved. But what that does is (potentially) reset value to T.init, of course without eliding destruction, as there's no way in the language to inform the compiler of such intent. This means that any T.init must be destructible. Requiring it to also be invariant is severely limiting.

struct NotNull(T) {
    private T* ptr;
    private bool owned;
    invariant() { assert(ptr); }
    this(T* src, bool own) @safe { ptr = src; owned = own; }
    @disable this(this);
    ~this() @trusted {
        if (ptr && own) free(ptr);
    }
    ref get() inout @safe { return *ptr; }
}

This struct is non-copyable, but it can be moved, thereby passing ownership to someone else. Moving it via standard utilities (move, moveEmplace) will break invariant. Which is great to assert logic errors. Not so great when the moved-from instance goes out of scope. Destructors shouldn't be constrained by the invariant, or at least there must be a way to enable them to not be.

also wouldn't it just be better code for the destructor not to be called if no type of constructor was called?

We can't enforce that. Especially when dealing with custom allocators.

@aliak00
Copy link

aliak00 commented Nov 27, 2018

Thanks for taking the time @radcapricorn :). Ah ok, so the problem is that the compiler cannot know if an object is in its init state so this becomes impossible to do? But, isn't the init state a compile time value?

And as for custom allocators, if the compiler knew which objects were in their init state, then for this code:

{
  A a = customAlloc();
}

When a goes out of scope, if it is not in its .init state, the destructor could be called (since customAlloc probably invoked a constructor. If OTOH customAlloc was just: { return A.init; } then the compiler would know that a is in an .init state because (again, IIUC) DMD would just perform a move in that scenario. So it would be the same as A a; move(a, A.init);. And in that case the target's destructor is not called because it is in an init state and the source's destructor is not called because it is in an init state.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Nov 27, 2018

Yes, one of the proposed solutions for the issue was to check for the init state specifically and if so, skip the invariant check before the destructor or just skip the entire destructor call. But I'm not versed enough in dmdfe details to code that, and I suspect W&A would consider it too magical. (ping @andralex )

edit: The destructor should probably be skipped in the minimally-initialized case too (memory all zero) so that minimallyInitializedArray becomes actually usable for types with a destructor.

@thewilsonator
Copy link
Contributor

I don't think checking the init state is feasible

{
    A a; 
    if (someCond())
          a = A(wi,th,some,params);
} // is a == A.init? who knows!

@radcapricorn
Copy link
Contributor

So the problem is that the compiler cannot know if an object is in its init state so this becomes impossible to do? But, isn't the init state a compile time value?

It is, but we're talking about what happens at runtime.

And as for custom allocators, if the compiler knew which objects were in their init state, then for this code:

{
    A a = customAlloc();
}

That code does not even touch the tip of the iceberg as far as custom allocators go.

When a goes out of scope, if it is not in its .init state, the destructor could be called (since customAlloc probably invoked a constructor.

You can't know that at compile time in D. It's not Rust, it doen't do that kind of flow analysis :)
customAlloc may be separately compiled, or like Nicholas is showing, the call may be conditioned.

@FeepingCreature
Copy link
Contributor Author

The nice thing about implicitly checking if the object is in its init state, is if an object in its init state goes out of scope there probably isn't anything useful the destructor can do anyways.

@radcapricorn
Copy link
Contributor

Yeah but that would be a runtime check for every object with a destructor at the end of every scope.

@aliak00
Copy link

aliak00 commented Nov 29, 2018

@radcapricorn @thewilsonator ok, I got it now I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants