add support for Unrestricted Unions #5830
|
Ready 2 rock. |
|
Nice! |
|
Please add a change log entry, as marketing this change is important :) |
|
This change concerns me. Not that the ability to do this is bad, but shouldn't it be distinguishable from a standard union? How does it interplay with other aspects of D? Just because it exists in C++ doesn't mean it's automatically good for D. |
|
What's the justification behind this? What's the use case? Does this make any un- |
|
@yebblies refer to https://issues.dlang.org/show_bug.cgi?id=16104. The problem with the current definition of |
|
My problem with this is that while unions had limitations before, this opens a gigantic can of usability problems without any mechanisms to warn the user. Before, unions were a fancy reinterpret-casting construct. Now you've got one that also allows the inclusion of types whose helper functions must be invoked in exactly the right places, with no warning that you've done that or that any of the types involved have potentially been changed from POD behavior to non-POD behavior. Yes, the limitations are too restrictive for some very interesting use cases, but they made unions a lot less likely to be used improperly. I'm not at all opposed to the new capabilities, as they are useful, I just want them separated or at least the previous protections kept available (and probably the default). |
|
Added changelog.dd entry |
|
I agree with @braddr and disapprove of this change (the feature is good, but it must be a different type of union). Union restrictiveness helped me find some nasty bugs while porting code from D1 to D2 and that was very much appreciated. I'd like to keep that safety tool in my arsenal. |
|
@mihails-strasuns-sociomantic I've been improving the safety checks to prevent more unsafe uses of unions. |
|
We've been light on process on this one because the C++ community has been through the same stages and has done all the groundwork of discussing the relaxation, going through all the arguments, etc. (Those might be worth looking at: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf.) I myself have lived through the pain of pre-C++11 unions. There is no doubt this relaxation improves the language. Disallowing types with destructors in a union provides at best a false sense of security. The moment two distinct types are allocated at the same address indiscriminately, a core tenet of the language (i.e. memory is typed, each object has a distinct address) is broken. So I'll pull this now. If, after perusing http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf, the conclusion is reached that that work does not apply to D, feel free to raise the issue. |
|
Auto-merge toggled on |
|
Grump, in a major way. This pull doesn't seem to have any of the extra checks and balances that the c++ world has. The c++ version, as described in that doc, I'd have no problems with. |
What in particular would you like to see? |
|
Thinking about this more, this feature would have been a great candidate for the new DIP process that is being discussed in the forums, especially since new language features of this nature are supposed to go through DIPs. This one would have benefited from that, considering that only Andrei and Walter think merging as is was a good idea and three other core team members disagree. |
|
Somewhat annoying to only find this b/c the changelog entry left so many questions. As stated before the implementation is fairly simplistic, and we should try to add a few safety mechanisms. C++ also has several so the comparison w/ C++'s unrestricted unions is incorrect. We don't seem to have the initialization problem of C++.
We do have a destructor/postblit/invariant problem
I can think of 3 kinds of unions, untagged, externally tagged, and internally tagged (e.g. using lower pointer bits) ones. Based on that, here is an idea:
Just rough ideas, but this feature clearly needs to make a small design roundtrip. |
What would be C++'s safety mechanisms with unions, and how do they render the comparison incorrect? Thanks.
Could you please put this another way? I didn't get it.
Construction itself (e.g. on the stack or as a member inside another object) requires the destructor. The point is to allow that. What other operations do you have in mind?
Trying to understand this. Does this mean letting a stack-allocated union go out of scope is unsafe?
Consider:
struct Optional(T) {
    union { T value; }
    bool isValid;
    ...
}
The goal here is to make sure Optional can implement the usual semantics (if isValid is true then the value is a T object, otherwise it's just trash). Would the non-copyability of the union make things difficult for the Optional?
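For reference, a minimal sketch of how such an Optional could manage the union field by hand under the relaxed rules, without any new compiler feature. Everything here is an assumption made for illustration: the named `Storage` union, the choice to disable copying outright, and the `Res` counting type and `countDestructorCalls` helper used to observe destructor calls are not part of the pull request.

```d
import std.algorithm.mutation : moveEmplace;

struct Optional(T)
{
    // A named union has no destructor of its own, so the compiler
    // never destroys `value` behind our back.
    private union Storage { T value; }
    private Storage storage;
    private bool isValid;

    // Assumption: copying is simply disabled to keep the sketch small.
    @disable this(this);

    this(T value)
    {
        // Move the argument into the raw storage; the argument is reset to T.init.
        moveEmplace(value, storage.value);
        isValid = true;
    }

    ~this()
    {
        // Only destroy the payload when it actually holds a T object.
        if (isValid)
            destroy(storage.value);
    }

    ref T get()
    {
        assert(isValid, "Optional is empty");
        return storage.value;
    }
}

// Hypothetical payload: counts destructor calls on non-.init objects.
struct Res
{
    int* counter;
    ~this() { if (counter) ++*counter; }
}

int countDestructorCalls(bool fill)
{
    int calls = 0;
    if (fill)
    {
        auto opt = Optional!Res(Res(&calls));
    }
    else
    {
        Optional!Res opt;
    }
    return calls;
}
```

The tag `isValid` is the only thing that tells the destructor whether the storage holds a live object, which is the "externally tagged" case discussed in this thread.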
That sounds like a lot of busywork for no good outcome. To put it simply, unless we want to "get real" with built-in tagged unions, I think we're wasting time embellishing unions (which currently are untagged by the language). The C, C++, and D If we were to support unions tagged implicitly by the language, the situation would be very different. I see at least one good use in better GC traceability, not to mention dynamic introspection etc. That kind of effort might be worth getting into. |
|
This is the actual design work somebody should have done before opening this PR, not something we should be doing in the last minute before a release. This alone should be reason enough to throw this out of the release.
From the C++ specs you quoted (Unrestricted Unions (Revision 2))
This is a good idea to avoid silent leaking or unsafe behavior. I tried to emulate that with the
The compiler cannot generate such functions because it doesn't know which field to destruct/postblit.
struct SomeExample
{
    static union U
    {
        int fd;
        File file; // RefCounted resource
    }
    bool file; // tag: true when u.file is the active field
    U u;
    ~this() // you need to write this destructor b/c u.~this is @disabled
    {
        // destroy the correct u field
        if (file)
            .destroy(u.file);
        else
            .close(u.fd);
        __nodtor(u); // tells the compiler to skip u.~this
        /** It doesn't seem required to make __nodtor unsafe, but not using it correctly can result in memory/resource leaks. A closer analysis of __nodtor @safety would be required.
        */
    }
    this(this) // you need to write this postblit b/c u.this(this) is @disabled
    {
        if (file)
            u.file.__xpostblit; // just postblit the correct field
        else
            // not the best example :)
            throw new Exception("Illegal copy, raw file descriptors aren't refcounted.");
    }
}
That was a hasty conclusion, seems like __nodtor must not be unsafe (though it's dangerous b/c you can easily leak).
I won't ask why you want to use a union for that (instead of simply
struct Optional(T)
{
    union { T value = void; } // disable default construction of first field (doesn't work, see Bugzilla 11331)
    bool isValid = false; // just to be explicit
    this(T value)
    {
        this.value = value;
        isValid = true;
    }
    ~this()
    {
        // mmh, seems my __nodtor idea doesn't integrate that nicely with anonymous unions,
        // b/c they have no name you should prolly have to call it on all union fields w/ dtor
        if (isValid)
            destroy(value);
        __nodtor(value);
    }
    this(this)
    {
        if (isValid)
            value.__xpostblit;
    }
}
Here is what the current compiler 2.071.2-b1 does for the File example. The solution I proposed is somewhat complex (comes from translating C++'s default deleted operators), and would require at least 2 new compiler features (__nodtor, defining ~this()/this(this) even though one of the fields has those @disabled).
Sure tagged unions could be fully managed by the compiler and have a lot of interesting properties, e.g. very efficient pattern matching (for D maybe allowing some final switch). Tagged unions are OT though. That was 45-50 min. instead of 5, means 40-45 min. less sleep for me today. |
Bummer about your losing sleep. I really don't know what to say or do. I think no restriction is the way to go and we're really good, and am out of new ways to explain so. I think we'll have to agree to disagree. All that cleverness and __nodtor and subtleties, it's just not a proportional response. It's not the kind of language design we want to do. Look over it again - is really The question now is: given that we have an irreducible disagreement, what is the next step? Please advise. |
As said above, I think it's complex myself, and it was just an attempt to emulate C++'s spec.
union U
{
    RC rc;
    int num;
}
void leak(ref U a, ref U b)
{
    a = b; // assignments only bitblit and might leak w/ non-POD fields, why allow it
    // either do
    a.rc = b.rc;
    // or
    a.num = b.num;
}
struct S
{
    // not implementing `this(this)` and `~this()` in S is wrong and might leak
    U u;
}
Those are diagnostic errors and could be added later on. |
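A small sketch of the leak described above, assuming whole-union assignment is a plain bit copy as the comment states. The `RC` definition is a hypothetical stand-in (the original snippet does not define it), and the two helper functions are made up for illustration.

```d
import std.conv : emplace;

// Hypothetical stand-in for the `RC` type used above:
// it bumps a caller-owned counter while an object is alive.
struct RC
{
    int* live;

    this(int* live)
    {
        this.live = live;
        ++*live;
    }

    ~this()
    {
        if (live)
            --*live;
    }
}

union U
{
    RC rc;
    int num;
}

// Overwrites a union holding a live RC by assigning another union to it.
int liveAfterUnionAssign()
{
    int live = 0;
    U a = void;
    emplace(&a.rc, &live); // construct the RC field in place, live == 1
    U b;
    b.num = 42;
    a = b;                 // bit copy of the whole union, a.rc is never destroyed
    return live;
}

// Same scenario, but the active field is released explicitly first.
int liveAfterExplicitRelease()
{
    int live = 0;
    U a = void;
    emplace(&a.rc, &live);
    U b;
    b.num = 42;
    destroy(a.rc);         // run RC's destructor by hand
    a.num = b.num;         // then switch the active field
    return live;
}
```

In the first function the counter stays at 1 after the assignment, which is the kind of silent leak a diagnostic could catch.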
|
See dlang/dlang.org#1503 for changelog entry. |
|
I believe this may have been a widely unnoticed language change, as it has led to a nasty memory leak bug in Weka's code that was only discovered by lucky coincidence... Below I've put my current addition to prevent bugs when people accidentally add a dtor to some type that ends up as part of a union (possibly nested). Seems very useful to add to Phobos IMO.
template reportDangerousMembers(T, string text, OuterUnion, bool outputMsg)
{
    import std.meta;
    import std.traits;
    alias FilteredDangerousMembers = Filter!(ApplyRight!(.reportDangerousInsideUnion, OuterUnion, outputMsg), FieldTypeTuple!T);
    enum bool reportDangerousMembers = !is(FilteredDangerousMembers == AliasSeq!());
    static if (reportDangerousMembers)
    {
        alias DangerousType = FilteredDangerousMembers[0];
        enum dangerousMemberIndex = staticIndexOf!(DangerousType, FieldTypeTuple!T);
        static if (outputMsg)
            pragma(msg, "Member " ~ FieldNameTuple!(T)[dangerousMemberIndex] ~ " of type " ~ DangerousType.stringof ~ " in " ~ text ~ " is dangerous inside union " ~ OuterUnion.stringof ~ ".");
    }
}
template reportDangerousInsideUnion(T, OuterUnion, bool outputMsg)
{
    import std.traits;
    static if (is(T == union))
    {
        // Recurse deeper
        enum bool reportDangerousInsideUnion = reportDangerousMembers!(T, "union " ~ T.stringof, T, outputMsg);
        static if (reportDangerousInsideUnion && outputMsg)
            pragma(msg, "Union " ~ T.stringof ~ " has dangerous members.");
    }
    else static if (isStaticArray!T && T.length)
    {
        // Recurse with element type
        enum bool reportDangerousInsideUnion = reportDangerousInsideUnion!(typeof(T.init[0]), OuterUnion, outputMsg);
        static if (reportDangerousInsideUnion && outputMsg)
            pragma(msg, "Static array " ~ T.stringof ~ " has dangerous element type");
    }
    else static if (is(T == struct))
    {
        enum isDangerous = (hasElaborateCopyConstructor!T && isCopyable!T) || hasElaborateDestructor!T;
        enum anyMemberIsDangerous = reportDangerousMembers!(T, "struct " ~ T.stringof, OuterUnion, outputMsg);
        static if (isDangerous && outputMsg)
            pragma(msg, "Struct " ~ T.stringof ~ " is dangerous inside union " ~ OuterUnion.stringof ~ ".");
        enum bool reportDangerousInsideUnion = isDangerous || anyMemberIsDangerous;
    }
    else
    {
        // Basic types are not dangerous.
        // Dynamic objects like classes and dynamic arrays with dtors, etc, are not dangerous inside a union.
        enum bool reportDangerousInsideUnion = false;
    }
}
// Returns true when union is dangerous.
// A dangerous union is a union where at least one member has a destructor or postblit.
template reportDangerousUnion(U, bool outputMsg = false)
{
    static if (is(U == union))
    {
        enum bool reportDangerousUnion = reportDangerousInsideUnion!(U, U, outputMsg);
    }
    else
    {
        enum bool reportDangerousUnion = false;
    }
}
Simple testcase:
struct WithDtor
{
    ~this() {}
}
struct WithPostblit
{
    this(this) {}
}
struct WithDtor2 {
    union A2Union {
        int b;
        WithDtor[4] a2a;
    }
    A2Union a2union;
}
union B
{
    int i;
    WithDtor2 a;
    // B is dangerous. Also output pragma messages.
    static assert(reportDangerousUnion!(B, true) == true);
}
union C
{
    int i;
    WithPostblit a;
    // C is dangerous
    static assert(reportDangerousUnion!C == true);
}
union D
{
    int i;
    ulong a;
    static assert(reportDangerousUnion!D == false);
}
struct NoPostblit {
    int i;
    @disable this(this);
}
union E
{
    int i;
    NoPostblit a;
    static assert(reportDangerousUnion!E == false);
} |
Unrestricted Unions are a feature added in C++11. Unrestricted unions can have fields that have destructors, postblits, or invariants. It's up to the user to call them correctly; the compiler does not automatically generate calls to them.
I added the "Needs Work" label because it is a bit unfinished; I want to see how it passes the autotester so far.
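A minimal sketch of what that means in practice, assuming the behaviour described above. The names `Tracked`, `Slot`, `liveCount` and the two helper functions are made up for illustration and are not part of the pull request.

```d
import std.conv : emplace;

// Number of Tracked objects currently alive (hypothetical bookkeeping).
int liveCount;

struct Tracked
{
    int id;
    this(int id) { this.id = id; ++liveCount; }
    ~this() { --liveCount; }
}

union Slot
{
    int raw;
    Tracked tracked; // a field with a destructor, accepted with this change
}

// Lets a union holding a live Tracked go out of scope.
int liveAfterScopeExit()
{
    liveCount = 0;
    {
        Slot s = void;
        emplace(&s.tracked, 7); // construct the field in place
    } // s leaves scope here; no destructor call is generated for s.tracked
    return liveCount;
}

// Same, but the user destroys the active field explicitly.
int liveAfterManualDestroy()
{
    liveCount = 0;
    Slot s = void;
    emplace(&s.tracked, 7);
    destroy(s.tracked); // the user's responsibility
    return liveCount;
}
```

The first function leaves the counter at 1 because nothing ever runs `Tracked`'s destructor; the second brings it back to 0.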