Limit immediate address and alignment #354
I figure neither should be infinite!
|
An offset greater than memory_size would already just be OOB at runtime. Do you wish to propose different semantics here? The current text doesn't specify a type for the offset and I agree that it probably should; int32 for wasm32 and int64 for wasm64 seem like the obvious choices. There is currently no semantic consequence to specifying an exorbitant alignment value. Do you wish to propose different semantics here? Regarding specific possible alignment limit values, |
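As a rough illustration of the point above (not text from the PR), the sketch below shows how an offset immediate typed like the index would behave: an offset that does not fit the index type is rejected up front, while one that fits but exceeds the memory size simply traps at runtime. The name `effective_address`, its parameters, and the choice to add index and offset without wraparound are assumptions made for this example.

```python
def effective_address(index: int, offset: int, access_size: int,
                      memory_size: int, index_bits: int = 32) -> int:
    """Return index + offset if the access is in bounds, else raise.

    Hypothetical helper: index_bits=32 models wasm32, index_bits=64 wasm64.
    """
    limit = 1 << index_bits
    if not 0 <= index < limit:
        raise ValueError("index does not fit the index type")
    if not 0 <= offset < limit:
        # A bounded immediate: rejected before the code ever runs.
        raise ValueError("offset immediate does not fit the index type")
    # Assumption: the sum is computed without wrapping around.
    ea = index + offset
    if ea + access_size > memory_size:
        # An offset larger than memory is just an out-of-bounds access.
        raise MemoryError("out-of-bounds access")
    return ea


if __name__ == "__main__":
    one_page = 65536
    print(effective_address(8, 4, 4, one_page))
```

With a single 64 KiB page, an offset of 70000 fits the 32-bit type but fails the runtime bounds check, whereas an offset of `1 << 32` is rejected as not representable.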
|
I'm also in favor of bounding alignment by the natural alignment as this should give the most flexibility for a simple, compact binary encoding. |
|
The semantics I'm proposing is "not an infinite precision integer". I'm OK bounding it to unsigned I'm not OK with limiting alignment value to natural alignment: we're dropping information that a smart compiler can take advantage of, and which I want to optimize. In particular there are interesting optimizations to be had within a single cacheline, for auto-vectorization, and for auto-magic blocking of array computations in general. One can over-align a base address and then index using it in a loop, allowing us to track each iteration's alignment precisely. |
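To make the over-aligned-base idea concrete, here is a minimal sketch (an illustration, not part of the proposal) of how a compiler could derive each iteration's guaranteed alignment from one over-aligned base. The helper name `known_alignment` and the 64-byte base with a 4-byte stride are assumptions chosen for the example.

```python
def known_alignment(base_align: int, byte_offset: int) -> int:
    """Largest power of two guaranteed to divide base + byte_offset,
    given only that base is a multiple of base_align (a power of two)."""
    if base_align <= 0 or base_align & (base_align - 1):
        raise ValueError("base_align must be a positive power of two")
    if byte_offset == 0:
        return base_align
    # Lowest set bit of the offset is the alignment the offset itself has.
    offset_align = byte_offset & -byte_offset
    return min(base_align, offset_align)


if __name__ == "__main__":
    # Hypothetical loop: base aligned to a 64-byte cache line, 4-byte elements.
    for i in range(17):
        print(i, known_alignment(64, 4 * i))
```

Under these assumptions, every 16th access is known to be cache-line aligned and every 4th is 16-byte aligned, which is the kind of information that a bound of natural alignment would discard.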
|
I updated to "no bigger than the largest pointer size", which I believe conveys what @sunfishcode suggested. |
|
@jfbastien Do you anticipate code generators specifically giving these rich alignment guarantees (and over-aligning) but not simply emitting the SIMD code themselves? The use case of "dumb compiler, but gives super-natural alignment hints, but doesn't emit vector code" seems rather narrow. Moreover, and this depends on the exact details of the binary format so we can hold off the discussion until we can do more precise measurements, I expect there will be a negative effect in both decode speed and binary size from splitting a small number of super-common naturally-aligned load/store ops into a larger set of load/store+alignment pairings. |
|
For the offset, instead of "no bigger than the largest pointer size", I think we want "the same type as the address' index" or so, so that we unambiguously rule out mixed-type index+offset calculations. |
|
@lukewagner said:
Before we standardize SIMD, I definitely expect this! Even after standardizing SIMD, I think code may want to avoid feature detection, or may only vectorize well with instructions that wasm doesn't have.
It's pretty trivial to give big alignment guarantees, and not so trivial to vectorize well. I'm thinking about more than the GCC loop tests here.
The example I suggested (super-aligned access at the head, followed by accesses from the same base in the loop) only has two types of access: a single super-aligned one, and the in-loop accesses without specified alignment. Admittedly it's just one example, but I don't see how that can affect binary size.
@sunfishcode good point, will update to what you suggest. |
My worry was more with compilers going alignment-annotation-crazy (e.g., all return values of |
Compilers going overly crazy and bloating code size can happen in so many ways! This is but one, and a wonderful one at that! |
|
Yeah, lgtm then. |