Fix union layout and initialization#1846
Conversation
|
Doesn't fix #1829. |
|
The _init type should now be always identical to the normal type (the used fields for the normal type are based on the field initializers). Only the initializer of globals may use a different type, based on which fields are set (this could be optimized to use the normal type in most cases - most aggregates don't feature any unions anyway). |
dnadlinger
left a comment
There was a problem hiding this comment.
Love the red (although I'm still not too big a fan of removing braces from block statements).
I haven't had too close a look at the priorities management code, but assuming the test coverage is good, I think this is good to go. As far as I can see, the whole _init type scheme was based on the idea that we could deterministically construct that type without looking at the values themselves. This would be very desirable for forward-references, etc. – but unfortunately, we've already figured out that that this can't be done in general (e.g. for unions). Thus, we should just remove the excess baggage/potential for bugs.
| } | ||
|
|
||
| namespace { | ||
| enum FieldPriority { |
There was a problem hiding this comment.
I'd have had to implement the < and -- operators then, so I changed my initial enum class to enum. ;)
ir/irtypeaggr.cpp
Outdated
| } | ||
|
|
||
| void AggrTypeBuilder::addAggregate(AggregateDeclaration *ad) { | ||
| addAggregate(ad, nullptr, /* addAliases = */ true); |
There was a problem hiding this comment.
Just make the parameter an enum then to make it self-documenting?
ir/irtypeaggr.cpp
Outdated
| continue; | ||
| if (v_begin < f_end && v_end > f_begin) { | ||
| if (addAliases && v_begin == f_begin && | ||
| DtoMemType(data[i]->type) == DtoMemType(field->type)) { |
There was a problem hiding this comment.
Couldn't this just be vd? In fact, if you loop over all of data anyway, can't you just turn the whole thing into a range-based for loop?
ir/irtypeaggr.cpp
Outdated
| if (field->_init != nullptr && !field->_init->isVoidInitializer()) { | ||
| IF_LOG Logger::println("adding explicit initializer for struct field %s", | ||
| field->toChars()); | ||
| unsigned errors = global.errors; |
There was a problem hiding this comment.
(nitpick in old code: const)
By refactoring IrAggr::addFieldInitializers() and making it use an extended and refactored AggrTypeBuilder::addAggregate(). AggrTypeBuilder::addAggregate() can now optionally detect alias fields in unions (same offset and LL type as a dominant union field) and add those to the variable-to-GEP-index mapping. These alias fields can then be indexed directly with a GEP instead of resorting to casting the pointer and applying the byte offset.
c39a298 to
f675a3c
Compare
tests/codegen/union.d
Outdated
| char[2][4] multidim; // multidimensional init based on a single 0xff char | ||
| } | ||
| // CHECK-DAG: %union.S = type { i8, [3 x i8], i32, [2 x i8], [4 x [2 x i8]], [2 x i8] } | ||
| // CHECK-DAG: @_D5union1S6__initZ = constant %union.S { i8 -1, [3 x i8] zeroinitializer, i32 0, [2 x i8] zeroinitializer, [4 x [2 x i8]] {{\[}}[2 x i8] c"\FF\FF", [2 x i8] c"\FF\FF", [2 x i8] c"\FF\FF", [2 x i8] c"\FF\FF"], [2 x i8] zeroinitializer } |
There was a problem hiding this comment.
It was really necessary to escape the [[ (FileCheck variable matcher) in [4 x [2 x i8]] [[2 x i8] c"\FF\FF", ...] via {{\[}}[. ;(
|
Alright, the 2nd commit fixes #1829 for DMD at least, as expected. It should restrict that bug to only a few exotic cases. |
ir/iraggr.cpp
Outdated
|
|
||
| auto llStructType = getLLStructType(); | ||
| bool notCompatible = (types.size() != llStructType->getNumElements()); | ||
| if (!notCompatible) { |
There was a problem hiding this comment.
Style nitpick: I'd remove the negation here and call the variable just compatible – my brain parsed !notCompatible parsed in interesting ways when just skimming the source. (Or at least, incompatible.)
|
Feel free to merge once tests are green. Codegen should effectively be the same, and there shouldn't be any compile time pessimisations either, right? (Of course, avoiding to switch out the global in general would be a slight improvement, even.) I'm still hoping somebody will sit down and nicely abstract away all the bool zext stuff soon, though. |
There's no <Type>_init type for aggregates (structs and classes) anymore, effectively eliminating a *lot* of named LLVM types, some bitcasts as well as replacements of globals etc. To get there, it was even required to use the regular type for compatible literals too, otherwise structs embedded as fields in other aggregates had an anonymous type (well, the LLVM constant for the field initializer had) and so the container initializer wasn't compatible with the regular type anymore. What was also necessary was a fix wrt. static arrays of bools (LLVM constant of type `[N x i1]` vs. `[N x i8]` for regular type). I also had to change the initializer for `char[2][3] x = 0xff` from `[6 x i8]` to `[3 x [2 x i8]]`, i.e., NOT flattening multi-dimensional inits from a scalar. So only literals with overlapping (union) fields and an explicit initializer initializing dominated non-alias union fields should still have a mismatching anonymous type - i.e., very, very few cases.
|
Thanks for the blazingly fast review. ;)
Based on the admittedly vague CI results, there seem to be no significant changes.
I was thinking the same, it's a pain. Using |
Boolean values (true/false/comparison results/…) are
Was there an actual bug – as in, wrong codegen – anywhere? |
|
We might also want to rebase this onto 0.17.x to keep DMD hostable with a C++-based compiler. |
|
I also had to zext scalar bools to prevent type mismatches. Furthermore, I've had a look at how the dynamic struct literals are generated, noticed an issue and fixed + refactored that part too. I hope this PR now fixes our union-related bugs for good and gets stuff like #1719 working again. |
I did a quick test with bools with an intermediate commit, which seemed to suggest that LLVM didn't pack scalar Apart from that, there was at least one major issue wrt. potentially diverging LL type layouts (even in size, and according initializer garbage) between _init and regular types for unions, as I posted here. |
| // initialize any padding so struct comparisons work | ||
| if (vd->offset != offset) | ||
| if (vd->offset != offset) { | ||
| assert(vd->offset > offset); |
There was a problem hiding this comment.
Damn, this assertion doesn't hold for the code in #1324. :(
|
I've amended the last commit to match DMD behavior for DMD issue 16471 referenced by #1324 (which is resolved by this, as we depend on a front-end fix to properly fix this). The PR should now be complete. ;) |
The front-end fills in missing init expressions (except for the nested
context pointer in most cases), so there are no implicitly initialized
fields for dynamically initialized struct literals.
The front-end is also supposed to make sure there are no overlapping
initializers by using null-expressions for overridden fields, but doesn't
in some cases (DMD issue 16471).
Instead of preferring the first one in lexical field order, now use the
lexically last one to mimic DMD.
The previous code iterated over the fields in lexical order and ignored
the initializer for a field if there were earlier declared fields with
greater offset (and an initializer expression), which is wrong for
anonymous structs inside a union:
union {
struct { int i1; long l = 123; }
struct { int i2; int x = 1; }
}
`x` was previously initialized with 0 (treated as padding).
No description provided.