Skip to content

Limit immediate address and alignment#354

Merged
jfbastien merged 3 commits intomasterfrom
jfbastien-patch-3
Sep 15, 2015
Merged

Limit immediate address and alignment#354
jfbastien merged 3 commits intomasterfrom
jfbastien-patch-3

Conversation

@jfbastien
Copy link
Member

I figure neither should be infinite!

I figure neither should be infinite!
@sunfishcode
Copy link
Member

An offset greater than memory_size would already just be OOB at runtime. Do you wish to propose different semantics here? The current text doesn't specify a type for the offset and I agree that it probably should; int32 for wasm32 and int64 for wasm64 seem like the obvious choices.

There is currently no semantic consequence to specifying an exorbitant alignment value. Do you wish to propose different semantics here?

Regarding specific possible alignment limit values, memory_size is dynamic, so limiting alignments to it would be a new condition that implementations would have to enforce dynamically. I agree with @JSStats that there's no foreseeable benefit to alignments greater than page_size, however wasm producers don't know the value of page_size. Natural alignment is an interesting option -- it's constant and known by wasm producers, so it could be checked at program startup. And on current and reasonably anticipated hardware platforms, there's no benefit to supernatural alignments. There are theoretical optimization advantages, however they're of debatable utility. Natural alignment would give us the greatest flexibility in the binary encoding.

@lukewagner
Copy link
Member

I'm also in favor of bounding alignment by the natural alignment as this should give the most flexibility for a simple, compact binary encoding.

@jfbastien
Copy link
Member Author

The semantics I'm proposing is "not an infinite precision integer". I'm OK bounding it to unsigned i32 for wasm32 and i64 for wasm64 (instead of memory_size). I'm explicitly not changing semantics, but rather tightening something that was obviously ill-specified.

I'm not OK with limiting alignment value to natural alignment: we're dropping information that a smart compiler can take advantage of, and which I want to optimize. In particular there are interesting optimizations to be had within a single cacheline, for auto-vectorization, and for auto-magic blocking of array computations in general. One can over-align a base address and then index using it in a loop, allowing us to track each iteration's alignment precisely.

@jfbastien
Copy link
Member Author

I updated to "no bigger than the largest pointer size", which I believe conveys what @sunfishcode suggested.

@lukewagner
Copy link
Member

@jfbastien Do you anticipate code generators specifically giving these rich alignment guarantees (and over-aligning) but not simply emitting the SIMD code themselves? The use case of "dumb compiler, but gives super-natural alignment hints, but doesn't emit vector code" seems rather narrow. Moreover, and this depends on the exact details of the binary format so we can hold off the discussion until we can do more precise measurements, I expect there will be a negative effect in both decode speed and binary size from splitting a small number of super-common naturally-aligned load/store ops into a larger set of load/store+alignment pairings.

@sunfishcode
Copy link
Member

For the offset, instead of "no bigger than the largest pointer size", I think we want "the same type as the address' index" or so, so that we unambiguously rule out mixed-type index+offset calculations.

@jfbastien
Copy link
Member Author

@lukewagner said:

@jfbastien Do you anticipate code generators specifically giving these rich alignment guarantees (and over-aligning) but not simply emitting the SIMD code themselves?

Before we standardize SIMD, I definitely expect this!

Even after standardizing SIMD, I think code may want to avoid feature detection, or may only vectorize well with instructions that wasm doesn't have.

The use case of "dumb compiler, but gives super-natural alignment hints, but doesn't emit vector code" seems rather narrow.

It's pretty trivial to give big alignment guarantees, and not so trivial to vectorize well. I'm thinking about more than the GCC loop tests here.

Moreover, and this depends on the exact details of the binary format so we can hold off the discussion until we can do more precise measurements, I expect there will be a negative effect in both decode speed and binary size from splitting a small number of super-common naturally-aligned load/store ops into a larger set of load/store+alignment pairings.

The example I suggested (super-aligned access at the head, followed by accesses from the same base in the loop) only has two types of access: a single super aligned one, and the loop contains accesses without specified alignment. Admittedly it's just one example, but I don't see how that can affect binary size.

@sunfishcode good point, will update to what you suggest.

@lukewagner
Copy link
Member

Admittedly it's just one example, but I don't see how that can affect binary size.

My worry was more with compilers going alignment-annotation-crazy (e.g., all return values of malloc could be super-naturally aligned). I guess we just say "don't do that"; as long as we expect most loads/stores will be naturally aligned in practice the binary size/decode-speed shouldn't be affected.

@jfbastien
Copy link
Member Author

Admittedly it's just one example, but I don't see how that can affect binary size.

My worry was more with compilers going alignment-annotation-crazy (e.g., all return values of malloc could be super-naturally aligned). I guess we just say "don't do that"; as long as we expect most loads/stores will be naturally aligned in practice the binary size/decode-speed shouldn't be affected.

Compilers going overly crazy and bloating code size can happen in so many ways! This is but one, and a wonderful one at that!

@lukewagner
Copy link
Member

Yeah, lgtm then.

jfbastien added a commit that referenced this pull request Sep 15, 2015
Limit immediate address and alignment
@jfbastien jfbastien merged commit 51cdb75 into master Sep 15, 2015
@jfbastien jfbastien deleted the jfbastien-patch-3 branch September 15, 2015 21:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants