-
Notifications
You must be signed in to change notification settings - Fork 701
Add Rationale.md #390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Rationale.md #390
Changes from all commits
a85c0fd
71e29a9
2336195
661ac0a
3eb06ed
898721a
13dd5a0
c447523
753724d
42bf4a3
26b5815
e204616
9357074
05a7c61
e3f7bb5
537d6e8
0cf9b0e
a8a981e
02c0c2b
b92f590
f488d99
b06176c
55d8757
152cfc8
94638c4
2781f71
0e53992
881a6cd
18b1dbb
bf97af2
31a4250
694bcc5
7fdd2f2
659d9c9
0e698ef
14712cc
6e91da7
f1a34fb
de90fca
68006a2
935ca4a
38348f3
38dbd0f
df34622
1ab46f4
70daaa1
370e161
62a8bb1
5152df1
b62e311
4d9c8a7
bd82bde
7221b7b
1d96a85
caea462
b9bad6a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,26 +1,20 @@ | ||
| # Abstract Syntax Tree Semantics | ||
|
|
||
| WebAssembly code is represented as an abstract syntax tree | ||
| that has a basic division between statements and | ||
| expressions. Each function body consists of exactly one statement. | ||
| All expressions and operations are typed, with no implicit conversions or | ||
| overloading rules. | ||
| WebAssembly code is represented as an Abstract Syntax Tree (AST) that has a | ||
| basic division between statements and expressions. Each function body consists | ||
| of exactly one statement. All expressions and operations are typed, with no | ||
| implicit conversions or overloading rules. | ||
|
|
||
| Verification of WebAssembly code requires only a single pass with constant-time | ||
| type checking and well-formedness checking. | ||
|
|
||
| Why not a stack-, register- or SSA-based bytecode? | ||
| * Trees allow a smaller binary encoding: [JSZap][], [Slim Binaries][]. | ||
| * [Polyfill prototype][] shows simple and efficient translation to asm.js. | ||
|
|
||
| [JSZap]: https://research.microsoft.com/en-us/projects/jszap/ | ||
| [Slim Binaries]: https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.108.1711 | ||
| [Polyfill prototype]: https://github.com/WebAssembly/polyfill-prototype-1 | ||
|
|
||
| WebAssembly offers a set of operations that are language-independent but closely | ||
| match operations in many programming languages and are efficiently implementable | ||
| on all modern computers. | ||
|
|
||
| The [rationale](Rationale.md) document details why WebAssembly is designed as | ||
| detailed in this document. | ||
|
|
||
| ## Traps | ||
|
|
||
| Some operations may *trap* under some conditions, as noted below. In the MVP, | ||
|
|
@@ -148,8 +142,7 @@ The semantics of out-of-bounds accesses are discussed | |
| The use of infinite-precision in the effective address computation means that | ||
| the addition of the offset to the address never causes wrapping, so if the | ||
| address for an access is out-of-bounds, the effective address will always also | ||
| be out-of-bounds. This is intended to simplify folding of offsets into complex | ||
| address modes in hardware, and to simplify bounds checking optimizations. | ||
| be out-of-bounds. | ||
|
|
||
| In wasm32, address operands and offset attributes have type `i32`, and linear | ||
| memory sizes are limited to 4 GiB (of course, actual sizes are further limited | ||
|
|
@@ -158,6 +151,7 @@ offsets have type `i64`. The MVP only includes wasm32; subsequent versions | |
| will add support for wasm64 and thus | ||
| [>4 GiB linear memory](FutureFeatures.md#linear-memory-bigger-than-4-gib). | ||
|
|
||
|
|
||
| ### Alignment | ||
|
|
||
| Each linear memory access operation also has an immediate positive integer power | ||
|
|
@@ -170,53 +164,26 @@ when considering alignment. | |
| If the effective address of a memory access is a multiple of the alignment | ||
| attribute value of the memory access, the memory access is considered *aligned*, | ||
| otherwise it is considered *misaligned*. Aligned and misaligned accesses have | ||
| the same behavior. Alignment affects performance as follows: | ||
| the same behavior. | ||
|
|
||
| Alignment affects performance as follows: | ||
|
|
||
| * Aligned accesses with at least natural alignment are fast. | ||
| * Aligned accesses with less than natural alignment may be somewhat slower | ||
| (think: implementation makes multiple accesses, either in software or | ||
| in hardware). | ||
| * Misaligned access of any kind may be *massively* slower | ||
| (think: implementation takes a signal and fixes things up). | ||
|
|
||
| Thus, it is recommend that WebAssembly producers align frequently-used data | ||
| to permit the use of natural alignment access, and use loads and stores with | ||
| the greatest alignment values practical, while always avoiding misaligned | ||
| accesses. | ||
|
|
||
| Either tooling or an explicit opt-in "debug mode" in the spec should allow | ||
| execution of a module in a mode that threw exceptions on misaligned access. | ||
| (This mode would incur some runtime cost for branching on most platforms which | ||
| is why it isn't the specified default.) | ||
|
|
||
| ### Out of bounds | ||
|
|
||
| The ideal semantics is for out-of-bounds accesses to trap, but the implications | ||
| are not yet fully clear. | ||
|
|
||
| There are several possible variations on this design being discussed and | ||
| experimented with. More measurement is required to understand the associated | ||
| tradeoffs. | ||
|
|
||
| * After an out-of-bounds access, the instance can no longer execute code and any | ||
| outstanding JavaScript [ArrayBuffer][] aliasing the linear memory are detached. | ||
| * This would primarily allow hoisting bounds checks above effectful | ||
| operations. | ||
| * This can be viewed as a mild security measure under the assumption that | ||
| while the sandbox is still ensuring safety, the instance's internal state | ||
| is incoherent and further execution could lead to Bad Things (e.g., XSS | ||
| attacks). | ||
| * To allow for potentially more-efficient memory sandboxing, the semantics could | ||
| allow for a nondeterministic choice between one of the following when an | ||
| out-of-bounds access occurred. | ||
| * The ideal trap semantics. | ||
| * Loads return an unspecified value. | ||
| * Stores are either ignored or store to an unspecified location in the linear memory. | ||
| * Either tooling or an explicit opt-in "debug mode" in the spec should allow | ||
| execution of a module in a mode that threw exceptions on out-of-bounds | ||
| access. | ||
|
|
||
| [ArrayBuffer]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer | ||
| (think: implementation makes multiple accesses, either in software or in | ||
| hardware). | ||
| * Misaligned access of any kind may be *massively* slower (think: | ||
| implementation takes a signal and fixes things up). | ||
|
|
||
| Thus, it is recommend that WebAssembly producers align frequently-used data to | ||
| permit the use of natural alignment access, and use loads and stores with the | ||
| greatest alignment values practical, while always avoiding misaligned accesses. | ||
|
|
||
|
|
||
| ### Out of Bounds | ||
|
|
||
| Out of bounds accesses trap. | ||
|
|
||
|
|
||
| ### Resizing | ||
|
|
||
|
|
@@ -234,17 +201,18 @@ MVP, there are [future features](FutureFeatures.md#finer-grained-control-over-me | |
| proposed to allow setting protection and creating mappings within the | ||
| contiguous linear memory. | ||
|
|
||
| In the MVP, memory can only be grown. After the MVP, a memory shrinking operation | ||
| may be added. However, due to normal fragmentation, applications are instead | ||
| expected release unused physical pages from the working set using the | ||
| In the MVP, memory can only be grown. After the MVP, a memory shrinking | ||
| operation may be added. However, due to normal fragmentation, applications are | ||
| instead expected release unused physical pages from the working set using the | ||
| [`discard`](FutureFeatures.md#finer-grained-control-over-memory) future feature. | ||
|
|
||
| The result type of `page_size` is `int32` for wasm32 and `int64` for wasm64. | ||
| The result value of `page_size` is an unsigned integer which is a power of 2. | ||
|
|
||
| (Note that the `page_size` value need not reflect the actual internal page size | ||
| of the implementation; it just needs to be a value suitable for use with | ||
| `grow_memory`) | ||
| The `page_size` value need not reflect the actual internal page size of the | ||
| implementation; it just needs to be a value suitable for use with | ||
| `resize_memory`. | ||
|
|
||
|
|
||
| ## Local variables | ||
|
|
||
|
|
@@ -262,6 +230,7 @@ The details of index space for local variables and their types will be further c | |
| e.g. whether locals with type `i32` and `i64` must be contiguous and separate from | ||
| others, etc. | ||
|
|
||
|
|
||
| ## Control flow structures | ||
|
|
||
| WebAssembly offers basic structured control flow. All control flow structures | ||
|
|
@@ -305,10 +274,9 @@ Each function has a *signature*, which consists of: | |
| * Return types, which are a sequence of local types | ||
| * Argument types, which are a sequence of local types | ||
|
|
||
| Note that WebAssembly itself does not support variable-length argument lists | ||
| (aka varargs). C and C++ compilers are expected to implement this functionality | ||
| by storing arguments in a buffer in linear memory and passing a pointer to the | ||
| buffer. | ||
| WebAssembly doesn't support variable-length argument lists (aka | ||
| varargs). Compilers targetting WebAssembly can instead support them through | ||
| explicit accesses to linear memory. | ||
|
|
||
| In the MVP, the length of the return types sequence may only be 0 or 1. This | ||
| restriction may be lifted in the future. | ||
|
|
@@ -329,25 +297,27 @@ mismatched signature is a module verification error. | |
| Indirect calls allow calling target functions that are unknown at compile time. | ||
| The target function is an expression of local type `i32` and is always the first | ||
| input into the indirect call. | ||
|
|
||
| A `call_indirect` specifies the *expected* signature of the target function with | ||
| an index into a *signature table* defined by the module. | ||
| An indirect call to a function with a mismatched signature causes a trap. | ||
| an index into a *signature table* defined by the module. An indirect call to a | ||
| function with a mismatched signature causes a trap. | ||
|
|
||
| * `call_indirect`: call function indirectly | ||
|
|
||
| Functions from the main function table are made addressable by defining an | ||
| *indirect function table* that consists of a sequence of indices | ||
| into the module's main function table. A function from the main table may appear more than | ||
| once in the indirect function table. Functions not appearing in the indirect function | ||
| table cannot be called indirectly. | ||
|
|
||
| In the MVP, indices into the indirect function table are local to a single module, so wasm | ||
| modules may use `i32` constants to refer to entries in their own indirect function table. The | ||
| [dynamic linking](DynamicLinking.md) feature is necessary for two modules to pass function | ||
| pointers back and forth. This will mean concatenating indirect function tables | ||
| and adding an operation `address_of` that computes the absolute index into the concatenated | ||
| table from an index in a module's local indirect table. JITing may also mean appending more | ||
| functions to the end of the indirect function table. | ||
| Functions from the main function table are made addressable by defining an | ||
| *indirect function table* that consists of a sequence of indices into the | ||
| module's main function table. A function from the main table may appear more | ||
| than once in the indirect function table. Functions not appearing in the | ||
| indirect function table cannot be called indirectly. | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is also footnote-like material which feels like it belongs here, though perhaps it could be fully parenthesized to indicate the informative nature.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
|
|
||
| In the MVP, indices into the indirect function table are local to a single | ||
| module, so wasm modules may use `i32` constants to refer to entries in their own | ||
| indirect function table. The [dynamic linking](DynamicLinking.md) feature is | ||
| necessary for two modules to pass function pointers back and forth. This will | ||
| mean concatenating indirect function tables and adding an operation `address_of` | ||
| that computes the absolute index into the concatenated table from an index in a | ||
| module's local indirect table. JITing may also mean appending more functions to | ||
| the end of the indirect function table. | ||
|
|
||
| Multiple return value calls will be possible, though possibly not in the | ||
| MVP. The details of multiple-return-value calls needs clarification. Calling a | ||
|
|
@@ -366,17 +336,7 @@ supported (including NaN values of all possible bit patterns). | |
| * `f32.const`: produce the value of an f32 immediate | ||
| * `f64.const`: produce the value of an f64 immediate | ||
|
|
||
| ## Expressions with control flow | ||
|
|
||
| Expression trees offer significant size reduction by avoiding the need for | ||
| `set_local`/`get_local` pairs in the common case of an expression with only one, | ||
| immediate use. The following primitives provide AST nodes that express control | ||
| flow and thus allow more opportunities to build bigger expression trees and | ||
| further reduce `set_local`/`get_local` usage (which constitute 30-40% of total | ||
| bytes in the | ||
| [polyfill prototype](https://github.com/WebAssembly/polyfill-prototype-1)). | ||
| Additionally, these primitives are useful building blocks for | ||
| WebAssembly-generators (including the JavaScript polyfill prototype). | ||
| ## Expressions with Control Flow | ||
|
|
||
| * `comma`: evaluate and ignore the result of the first operand, evaluate and | ||
| return the second operand | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The discussion of the affects of alignment wasn't intended as rationale; it's guidance to developers to know what to expect from alignment fields, and to implementors to know what they're expected to do with alignment fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.