Skip to content

sema: Support reinterpreting extern/packed unions at comptime via field access#17352

Merged
andrewrk merged 3 commits intoziglang:masterfrom
kcbanner:extern_union_comptime_memory
Oct 3, 2023
Merged

sema: Support reinterpreting extern/packed unions at comptime via field access#17352
andrewrk merged 3 commits intoziglang:masterfrom
kcbanner:extern_union_comptime_memory

Conversation

@kcbanner
Copy link
Contributor

@kcbanner kcbanner commented Oct 1, 2023

Closes #17311

My previous change for reading / writing to unions at comptime did not handle union field read / writes correctly in all cases. Previously, if a field was written to a union, it would overwrite the entire value. This is problematic when a field of a larger size is subsequently read, because the value would not be long enough, causing a panic.

Additionally, the writing behaviour itself was incorrect. Writing to a field of a packed or extern union should only overwrite the bits corresponding to that field, allowing for memory reintepretation via field writes / reads.

I addressed these problems as follows:

Add the concept of a "backing type" for extern / packed unions (Type.unionBackingType). For extern unions, this is a u8 array, for packed unions it's an integer matching the bitSize of the union. Whenever union memory is read at comptime, it's read as this type.

When union memory is written at comptime, the tag may still be known. If so, the memory is written using the tagged type. If the tag is unknown (because this union had previously been read from memory), it's simply written back out as the backing type.

I added write_packed to the reinterpret field of ComptimePtrMutationKit. This causes writes of the operand to be packed - which is necessary when writing to a field of a packed union. Without this, writing a value to a u1 field would overwrite the entire byte it occupied.

The final case to address was reading a different (potentially larger) field from a union when it was written with a known tag. To handle this, a new kind of bitcast was introduced (bitCastUnionFieldVal) which supports reading a larger field by using a backing buffer that has the unwritten bits set to undefined. The reason to support this (vs always just writing the union as it's backing type), is that no reads to larger fields ever occur at comptime, it would be strictly worse to have spent time writing the full backing type.

I added new tests cases to cover these cases. I originally wrote them to just run at comptime, but I thought it would be useful to check this behaviour at runtime as well, and discovered a couple issues there:

I've skipped some of the runtime portions of these new tests.

Other changes:

  • Payload.Union now has an optional tag value, to correctly support uninterning .none-tagged unions
  • TypedValue now prints the value of the backing storage of the union (either as a byte array, or an integer)

ie.

  %13!= dbg_var_val(<union.test.memset extern union at comptime.U, .{ (unknown tag) = "\x00" }>, "u")

…ld access

My previous change for reading / writing to unions at comptime did not handle
union field read/writes correctly in all cases. Previously, if a field was
written to a union, it would overwrite the entire value. This is problematic
when a field of a larger size is subsequently read, because the value would not
be long enough, causing a panic.

Additionally, the writing behaviour itself was incorrect. Writing to a field of
a packed or extern union should only overwrite the bits corresponding to that
field, allowing for memory reintepretation via field writes / reads.

I addressed these problems as follows:

Add the concept of a "backing type" for extern / packed unions
(`Type.unionBackingType`).  For extern unions, this is a `u8` array, for packed
unions it's an integer matching the `bitSize` of the union. Whenever union
memory is read at comptime, it's read as this type.

When union memory is written at comptime, the tag may still be known. If so, the
memory is written using the tagged type. If the tag is unknown (because this
union had previously been read from memory), it's simply written back out as the
backing type.

I added `write_packed` to the `reinterpret` field of
`ComptimePtrMutationKit`. This causes writes of the operand to be packed - which
is necessary when writing to a field of a packed union. Without this, writing a
value to a `u1` field would overwrite the entire byte it occupied.

The final case to address was reading a different (potentially larger) field
from a union when it was written with a known tag. To handle this, a new kind of
bitcast was introduced (`bitCastUnionFieldVal`) which supports reading a larger
field by using a backing buffer that has the unwritten bits set to
undefined. The reason to support this (vs always just writing the union as it's
backing type), is that no reads to larger fields ever occur at comptime, it
would be strictly worse to have spent time writing the full backing type.
@kcbanner kcbanner force-pushed the extern_union_comptime_memory branch from f5adcc2 to 3c317c6 Compare October 2, 2023 17:15
… fields

Updated the tests to also run at runtime, and moved them to union.zig
@kcbanner kcbanner force-pushed the extern_union_comptime_memory branch from 3c317c6 to fb33bc9 Compare October 2, 2023 17:29
@andrewrk
Copy link
Member

andrewrk commented Oct 3, 2023

Excellent work.

@andrewrk
Copy link
Member

The new behavior test caused #19389.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OOB panic when reading inactive field of a comptime var extern union

2 participants