Limit immediate address and alignment by jfbastien · Pull Request #354 · WebAssembly/design

jfbastien · 2015-09-15T05:54:33Z

I figure neither should be infinite!

sunfishcode · 2015-09-15T13:57:50Z

An offset greater than memory_size would already just be OOB at runtime. Do you wish to propose different semantics here? The current text doesn't specify a type for the offset and I agree that it probably should; int32 for wasm32 and int64 for wasm64 seem like the obvious choices.

There is currently no semantic consequence to specifying an exorbitant alignment value. Do you wish to propose different semantics here?

Regarding specific possible alignment limit values, memory_size is dynamic, so limiting alignments to it would be a new condition that implementations would have to enforce dynamically. I agree with @JSStats that there's no foreseeable benefit to alignments greater than page_size, however wasm producers don't know the value of page_size. Natural alignment is an interesting option -- it's constant and known by wasm producers, so it could be checked at program startup. And on current and reasonably anticipated hardware platforms, there's no benefit to supernatural alignments. There are theoretical optimization advantages, however they're of debatable utility. Natural alignment would give us the greatest flexibility in the binary encoding.

lukewagner · 2015-09-15T14:03:17Z

I'm also in favor of bounding alignment by the natural alignment as this should give the most flexibility for a simple, compact binary encoding.

jfbastien · 2015-09-15T15:29:58Z

The semantics I'm proposing is "not an infinite precision integer". I'm OK bounding it to unsigned i32 for wasm32 and i64 for wasm64 (instead of memory_size). I'm explicitly not changing semantics, but rather tightening something that was obviously ill-specified.

I'm not OK with limiting alignment value to natural alignment: we're dropping information that a smart compiler can take advantage of, and which I want to optimize. In particular there are interesting optimizations to be had within a single cacheline, for auto-vectorization, and for auto-magic blocking of array computations in general. One can over-align a base address and then index using it in a loop, allowing us to track each iteration's alignment precisely.

jfbastien · 2015-09-15T15:32:49Z

I updated to "no bigger than the largest pointer size", which I believe conveys what @sunfishcode suggested.

lukewagner · 2015-09-15T15:50:19Z

@jfbastien Do you anticipate code generators specifically giving these rich alignment guarantees (and over-aligning) but not simply emitting the SIMD code themselves? The use case of "dumb compiler, but gives super-natural alignment hints, but doesn't emit vector code" seems rather narrow. Moreover, and this depends on the exact details of the binary format so we can hold off the discussion until we can do more precise measurements, I expect there will be a negative effect in both decode speed and binary size from splitting a small number of super-common naturally-aligned load/store ops into a larger set of load/store+alignment pairings.

sunfishcode · 2015-09-15T15:51:28Z

For the offset, instead of "no bigger than the largest pointer size", I think we want "the same type as the address' index" or so, so that we unambiguously rule out mixed-type index+offset calculations.

jfbastien · 2015-09-15T15:55:59Z

@lukewagner said:

@jfbastien Do you anticipate code generators specifically giving these rich alignment guarantees (and over-aligning) but not simply emitting the SIMD code themselves?

Before we standardize SIMD, I definitely expect this!

Even after standardizing SIMD, I think code may want to avoid feature detection, or may only vectorize well with instructions that wasm doesn't have.

The use case of "dumb compiler, but gives super-natural alignment hints, but doesn't emit vector code" seems rather narrow.

It's pretty trivial to give big alignment guarantees, and not so trivial to vectorize well. I'm thinking about more than the GCC loop tests here.

Moreover, and this depends on the exact details of the binary format so we can hold off the discussion until we can do more precise measurements, I expect there will be a negative effect in both decode speed and binary size from splitting a small number of super-common naturally-aligned load/store ops into a larger set of load/store+alignment pairings.

The example I suggested (super-aligned access at the head, followed by accesses from the same base in the loop) only has two types of access: a single super aligned one, and the loop contains accesses without specified alignment. Admittedly it's just one example, but I don't see how that can affect binary size.

@sunfishcode good point, will update to what you suggest.

lukewagner · 2015-09-15T16:12:33Z

Admittedly it's just one example, but I don't see how that can affect binary size.

My worry was more with compilers going alignment-annotation-crazy (e.g., all return values of malloc could be super-naturally aligned). I guess we just say "don't do that"; as long as we expect most loads/stores will be naturally aligned in practice the binary size/decode-speed shouldn't be affected.

jfbastien · 2015-09-15T16:18:24Z

Admittedly it's just one example, but I don't see how that can affect binary size.

My worry was more with compilers going alignment-annotation-crazy (e.g., all return values of malloc could be super-naturally aligned). I guess we just say "don't do that"; as long as we expect most loads/stores will be naturally aligned in practice the binary size/decode-speed shouldn't be affected.

Compilers going overly crazy and bloating code size can happen in so many ways! This is but one, and a wonderful one at that!

lukewagner · 2015-09-15T21:22:19Z

Yeah, lgtm then.

Limit immediate address and alignment

Limit immediate address and alignment

0716a33

I figure neither should be infinite!

Largest pointer size is now the limit

69e11c4

"the same type as the address' index"

73ef22f

jfbastien added a commit that referenced this pull request Sep 15, 2015

Merge pull request #354 from WebAssembly/jfbastien-patch-3

51cdb75

Limit immediate address and alignment

jfbastien merged commit 51cdb75 into master Sep 15, 2015

jfbastien deleted the jfbastien-patch-3 branch September 15, 2015 21:27

jfbastien mentioned this pull request Sep 15, 2015

Elaborate on the semantics of page_size. #296

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Limit immediate address and alignment#354

Limit immediate address and alignment#354
jfbastien merged 3 commits intomasterfrom
jfbastien-patch-3

jfbastien commented Sep 15, 2015

Uh oh!

sunfishcode commented Sep 15, 2015

Uh oh!

lukewagner commented Sep 15, 2015

Uh oh!

jfbastien commented Sep 15, 2015

Uh oh!

jfbastien commented Sep 15, 2015

Uh oh!

lukewagner commented Sep 15, 2015

Uh oh!

sunfishcode commented Sep 15, 2015

Uh oh!

jfbastien commented Sep 15, 2015

Uh oh!

lukewagner commented Sep 15, 2015

Uh oh!

jfbastien commented Sep 15, 2015

Uh oh!

lukewagner commented Sep 15, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jfbastien commented Sep 15, 2015

Uh oh!

sunfishcode commented Sep 15, 2015

Uh oh!

lukewagner commented Sep 15, 2015

Uh oh!

jfbastien commented Sep 15, 2015

Uh oh!

jfbastien commented Sep 15, 2015

Uh oh!

lukewagner commented Sep 15, 2015

Uh oh!

sunfishcode commented Sep 15, 2015

Uh oh!

jfbastien commented Sep 15, 2015

Uh oh!

lukewagner commented Sep 15, 2015

Uh oh!

jfbastien commented Sep 15, 2015

Uh oh!

lukewagner commented Sep 15, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants