diff --git a/AstSemantics.md b/AstSemantics.md index 3c265fa2..5045ef4a 100644 --- a/AstSemantics.md +++ b/AstSemantics.md @@ -1,26 +1,20 @@ # Abstract Syntax Tree Semantics -WebAssembly code is represented as an abstract syntax tree -that has a basic division between statements and -expressions. Each function body consists of exactly one statement. -All expressions and operations are typed, with no implicit conversions or -overloading rules. +WebAssembly code is represented as an Abstract Syntax Tree (AST) that has a +basic division between statements and expressions. Each function body consists +of exactly one statement. All expressions and operations are typed, with no +implicit conversions or overloading rules. Verification of WebAssembly code requires only a single pass with constant-time type checking and well-formedness checking. -Why not a stack-, register- or SSA-based bytecode? -* Trees allow a smaller binary encoding: [JSZap][], [Slim Binaries][]. -* [Polyfill prototype][] shows simple and efficient translation to asm.js. - - [JSZap]: https://research.microsoft.com/en-us/projects/jszap/ - [Slim Binaries]: https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.108.1711 - [Polyfill prototype]: https://github.com/WebAssembly/polyfill-prototype-1 - WebAssembly offers a set of operations that are language-independent but closely match operations in many programming languages and are efficiently implementable on all modern computers. +The [rationale](Rationale.md) document details why WebAssembly is designed as +detailed in this document. + ## Traps Some operations may *trap* under some conditions, as noted below. In the MVP, @@ -148,8 +142,7 @@ The semantics of out-of-bounds accesses are discussed The use of infinite-precision in the effective address computation means that the addition of the offset to the address never causes wrapping, so if the address for an access is out-of-bounds, the effective address will always also -be out-of-bounds. 
This is intended to simplify folding of offsets into complex -address modes in hardware, and to simplify bounds checking optimizations. +be out-of-bounds. In wasm32, address operands and offset attributes have type `i32`, and linear memory sizes are limited to 4 GiB (of course, actual sizes are further limited @@ -158,6 +151,7 @@ offsets have type `i64`. The MVP only includes wasm32; subsequent versions will add support for wasm64 and thus [>4 GiB linear memory](FutureFeatures.md#linear-memory-bigger-than-4-gib). + ### Alignment Each linear memory access operation also has an immediate positive integer power @@ -170,53 +164,26 @@ when considering alignment. If the effective address of a memory access is a multiple of the alignment attribute value of the memory access, the memory access is considered *aligned*, otherwise it is considered *misaligned*. Aligned and misaligned accesses have -the same behavior. Alignment affects performance as follows: +the same behavior. + +Alignment affects performance as follows: * Aligned accesses with at least natural alignment are fast. * Aligned accesses with less than natural alignment may be somewhat slower - (think: implementation makes multiple accesses, either in software or - in hardware). - * Misaligned access of any kind may be *massively* slower - (think: implementation takes a signal and fixes things up). - -Thus, it is recommend that WebAssembly producers align frequently-used data -to permit the use of natural alignment access, and use loads and stores with -the greatest alignment values practical, while always avoiding misaligned -accesses. - -Either tooling or an explicit opt-in "debug mode" in the spec should allow -execution of a module in a mode that threw exceptions on misaligned access. -(This mode would incur some runtime cost for branching on most platforms which -is why it isn't the specified default.) 
- -### Out of bounds - -The ideal semantics is for out-of-bounds accesses to trap, but the implications -are not yet fully clear. - -There are several possible variations on this design being discussed and -experimented with. More measurement is required to understand the associated -tradeoffs. - - * After an out-of-bounds access, the instance can no longer execute code and any - outstanding JavaScript [ArrayBuffer][] aliasing the linear memory are detached. - * This would primarily allow hoisting bounds checks above effectful - operations. - * This can be viewed as a mild security measure under the assumption that - while the sandbox is still ensuring safety, the instance's internal state - is incoherent and further execution could lead to Bad Things (e.g., XSS - attacks). - * To allow for potentially more-efficient memory sandboxing, the semantics could - allow for a nondeterministic choice between one of the following when an - out-of-bounds access occurred. - * The ideal trap semantics. - * Loads return an unspecified value. - * Stores are either ignored or store to an unspecified location in the linear memory. - * Either tooling or an explicit opt-in "debug mode" in the spec should allow - execution of a module in a mode that threw exceptions on out-of-bounds - access. - - [ArrayBuffer]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer + (think: implementation makes multiple accesses, either in software or in + hardware). + * Misaligned access of any kind may be *massively* slower (think: + implementation takes a signal and fixes things up). + +Thus, it is recommended that WebAssembly producers align frequently-used data to +permit the use of natural alignment access, and use loads and stores with the +greatest alignment values practical, while always avoiding misaligned accesses. + + +### Out of Bounds + +Out of bounds accesses trap. 
+ ### Resizing @@ -234,17 +201,18 @@ MVP, there are [future features](FutureFeatures.md#finer-grained-control-over-me proposed to allow setting protection and creating mappings within the contiguous linear memory. -In the MVP, memory can only be grown. After the MVP, a memory shrinking operation -may be added. However, due to normal fragmentation, applications are instead -expected release unused physical pages from the working set using the +In the MVP, memory can only be grown. After the MVP, a memory shrinking +operation may be added. However, due to normal fragmentation, applications are +instead expected to release unused physical pages from the working set using the [`discard`](FutureFeatures.md#finer-grained-control-over-memory) future feature. The result type of `page_size` is `int32` for wasm32 and `int64` for wasm64. The result value of `page_size` is an unsigned integer which is a power of 2. -(Note that the `page_size` value need not reflect the actual internal page size -of the implementation; it just needs to be a value suitable for use with -`grow_memory`) +The `page_size` value need not reflect the actual internal page size of the +implementation; it just needs to be a value suitable for use with +`resize_memory`. + ## Local variables @@ -262,6 +230,7 @@ The details of index space for local variables and their types will be further c e.g. whether locals with type `i32` and `i64` must be contiguous and separate from others, etc. + ## Control flow structures WebAssembly offers basic structured control flow. All control flow structures @@ -305,10 +274,9 @@ Each function has a *signature*, which consists of: * Return types, which are a sequence of local types * Argument types, which are a sequence of local types -Note that WebAssembly itself does not support variable-length argument lists -(aka varargs). C and C++ compilers are expected to implement this functionality -by storing arguments in a buffer in linear memory and passing a pointer to the -buffer. 
+WebAssembly doesn't support variable-length argument lists (aka +varargs). Compilers targeting WebAssembly can instead support them through +explicit accesses to linear memory. In the MVP, the length of the return types sequence may only be 0 or 1. This restriction may be lifted in the future. @@ -329,25 +297,27 @@ mismatched signature is a module verification error. Indirect calls allow calling target functions that are unknown at compile time. The target function is an expression of local type `i32` and is always the first input into the indirect call. + A `call_indirect` specifies the *expected* signature of the target function with -an index into a *signature table* defined by the module. -An indirect call to a function with a mismatched signature causes a trap. +an index into a *signature table* defined by the module. An indirect call to a +function with a mismatched signature causes a trap. * `call_indirect`: call function indirectly -Functions from the main function table are made addressable by defining an -*indirect function table* that consists of a sequence of indices -into the module's main function table. A function from the main table may appear more than -once in the indirect function table. Functions not appearing in the indirect function -table cannot be called indirectly. - -In the MVP, indices into the indirect function table are local to a single module, so wasm -modules may use `i32` constants to refer to entries in their own indirect function table. The -[dynamic linking](DynamicLinking.md) feature is necessary for two modules to pass function -pointers back and forth. This will mean concatenating indirect function tables -and adding an operation `address_of` that computes the absolute index into the concatenated -table from an index in a module's local indirect table. JITing may also mean appending more -functions to the end of the indirect function table. 
+Functions from the main function table are made addressable by defining an +*indirect function table* that consists of a sequence of indices into the +module's main function table. A function from the main table may appear more +than once in the indirect function table. Functions not appearing in the +indirect function table cannot be called indirectly. + +In the MVP, indices into the indirect function table are local to a single +module, so wasm modules may use `i32` constants to refer to entries in their own +indirect function table. The [dynamic linking](DynamicLinking.md) feature is +necessary for two modules to pass function pointers back and forth. This will +mean concatenating indirect function tables and adding an operation `address_of` +that computes the absolute index into the concatenated table from an index in a +module's local indirect table. JITing may also mean appending more functions to +the end of the indirect function table. Multiple return value calls will be possible, though possibly not in the MVP. The details of multiple-return-value calls needs clarification. Calling a @@ -366,17 +336,7 @@ supported (including NaN values of all possible bit patterns). * `f32.const`: produce the value of an f32 immediate * `f64.const`: produce the value of an f64 immediate -## Expressions with control flow - -Expression trees offer significant size reduction by avoiding the need for -`set_local`/`get_local` pairs in the common case of an expression with only one, -immediate use. The following primitives provide AST nodes that express control -flow and thus allow more opportunities to build bigger expression trees and -further reduce `set_local`/`get_local` usage (which constitute 30-40% of total -bytes in the -[polyfill prototype](https://github.com/WebAssembly/polyfill-prototype-1)). -Additionally, these primitives are useful building blocks for -WebAssembly-generators (including the JavaScript polyfill prototype). 
+## Expressions with Control Flow * `comma`: evaluate and ignore the result of the first operand, evaluate and return the second operand diff --git a/Nondeterminism.md b/Nondeterminism.md index 65567dd2..36dfda12 100644 --- a/Nondeterminism.md +++ b/Nondeterminism.md @@ -8,6 +8,9 @@ local, nondeterminism. * *Local*: when nondeterministic execution occurs, the effect is local, there is no "spooky action at a distance". +The [rationale](Rationale.md) document details why WebAssembly is designed as +detailed in this document. + The limited, local, nondeterministic model implies: * Applications can't access data outside the sandbox without going through appropriate APIs, or otherwise escape the sandbox. @@ -17,11 +20,6 @@ The limited, local, nondeterministic model implies: [Control Flow Integrity](https://research.microsoft.com/apps/pubs/default.aspx?id=64250). * WebAssembly has no [nasal demons](https://en.wikipedia.org/w/index.php?title=Nasal_demons). -Ideally, WebAssembly would be fully deterministic (except where nondeterminism -was essential to the API, like random number generators, date/time functions or -input events). Nondeterminism is only specified as a compromise when there is no -other practical way to achieve [portable](Portability.md) native performance. - The following is a list of the places where the WebAssembly specification currently admits nondeterminism: diff --git a/Rationale.md b/Rationale.md new file mode 100644 index 00000000..2065ff6a --- /dev/null +++ b/Rationale.md @@ -0,0 +1,230 @@ +# Design Rationale + + +## Why AST? + +Why not a stack-, register- or SSA-based bytecode? +* Trees allow a smaller binary encoding: [JSZap][], [Slim Binaries][]. +* [Polyfill prototype][] shows simple and efficient translation to asm.js. 
+ + [JSZap]: https://research.microsoft.com/en-us/projects/jszap/ + [Slim Binaries]: https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.108.1711 + [Polyfill prototype]: https://github.com/WebAssembly/polyfill-prototype-1 + + +## Basic Types Only + +WebAssembly only represents [a few types](AstSemantics.md#Types). + +* More complex types can be formed from these basic types. It's up to the source + language compiler to express its own types in terms of the basic machine + types. This allows WebAssembly to present itself as a virtual ISA, and lets + compilers target it as they would any other ISA. +* These types are efficiently executed by all modern architectures. +* Smaller types (such as `i8` and `i16`) are usually no more efficient and in + languages like C/C++ are only semantically meaningful for memory accesses + since arithmetic gets widened to `i32` or `i64`. Avoiding them at least for MVP + makes it easier to implement a WebAssembly VM. +* Other types (such as `f16`, `i128`) aren't widely supported by existing + hardware and can be supported by runtime libraries if developers wish to use + them. Hardware support is sometimes uneven, e.g. some support load/store of + `f16` only whereas other hardware also supports scalar arithmetic on `f16`, + and yet other hardware only supports SIMD arithmetic on `f16`. They can be + added to WebAssembly later without compromising MVP. +* More complex object types aren't semantically useful for MVP: WebAssembly + seeks to provide the primitive building blocks upon which higher-level + constructs can be built. They may become useful to support other languages, + especially when considering [garbage collection](GC.md). + + +## Load/Store Addressing + +Load/store instructions include an immediate offset used for +[addressing](AstSemantics.md#Addressing). This is intended to simplify folding +of offsets into complex address modes in hardware, and to simplify bounds +checking optimizations. 
It offloads some of the optimization work to the +compiler that targets WebAssembly, executing on the developer's machine, instead +of performing that work in the WebAssembly compiler on the user's machine. + + +## Alignment Hints + +Load/store instructions contain +[alignment hints](AstSemantics.md#Alignment). This makes it easier to generate +efficient code on certain hardware architectures. + +Either tooling or an explicit opt-in "debug mode" in the spec could allow +execution of a module in a mode that threw exceptions on misaligned access. +This mode would incur some runtime cost for branching on most platforms which is +why it isn't the specified default. + + +## Out of Bounds + +The ideal semantics is for +[out-of-bounds accesses](AstSemantics.md#Out-of-Bounds) to trap, but the +implications are not yet fully clear. + +There are several possible variations on this design being discussed and +experimented with. More measurement is required to understand the associated +tradeoffs. + + * After an out-of-bounds access, the instance can no longer execute code and + any outstanding JavaScript [ArrayBuffer][] aliasing the linear memory are + detached. + * This would primarily allow hoisting bounds checks above effectful + operations. + * This can be viewed as a mild security measure under the assumption that + while the sandbox is still ensuring safety, the instance's internal state + is incoherent and further execution could lead to Bad Things (e.g., XSS + attacks). + * To allow for potentially more-efficient memory sandboxing, the semantics + could allow for a nondeterministic choice between one of the following when + an out-of-bounds access occurred. + * The ideal trap semantics. + * Loads return an unspecified value. + * Stores are either ignored or store to an unspecified location in the + linear memory. 
+ * Either tooling or an explicit opt-in "debug mode" in the spec should allow + execution of a module in a mode that threw exceptions on out-of-bounds + access. + + [ArrayBuffer]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer + + +## Resizing + +Implementations provide a `page_size` operation which allows them to efficiently +map the underlying OS's capabilities to the WebAssembly application, as well as +to communicate their own implementation details in a useful manner to the +developer. + + +## Control Flow + +See [#299](https://github.com/WebAssembly/design/pull/299). + + +## Locals + +C/C++ makes it possible to take the address of a function's local values and +pass this pointer to callees or to other threads. Since WebAssembly's local +variables are outside the address space, C/C++ compilers implement address-taken +variables by creating a separate stack data structure within linear memory. This +stack is sometimes called the "aliased" stack, since it is used for variables +which may be pointed to by pointers. + +This prevents WebAssembly from performing clever optimizations on the stack and +liveness of such variables, but this loss isn't expected to be +consequential. Common C compiler optimizations such as LLVM's global value +numbering effectively split address-taken variables into parts, shrinking the +range where they actually need to have their address taken, and creating new +ranges where they can be allocated as local variables. + +Conversely, non-address taken values which are usually on the stack are instead +represented as locals inside functions. This effectively means that WebAssembly +has an infinite set of registers, and can choose to spill values as it sees fit +in a manner unobservable to the hosted code. This implies that there's a +separate stack, unaddressable from hosted code, which is also used to spill +return values. 
This allows strong security properties to be enforced, but does +mean that two stacks are maintained (one by the VM, the other by the compiler +which targets WebAssembly) which can lead to some inefficiencies. + +Local variables are not in Static Single Assignment (SSA) form, meaning that +multiple incoming SSA values which have separate liveness can "share" the +storage represented by a local through the `set_local` operation. From an SSA +perspective, this means that multiple independent values can share a local +variable in WebAssembly, which is effectively a kind of pre-coloring that clever +producers can use to pre-color variables and give hints to a WebAssembly VM's +register allocation algorithms, offloading some of the optimization work from +the WebAssembly VM. + + +## Variable-Length Argument Lists + +C and C++ compilers are expected to implement variable-length argument lists by +storing arguments in a buffer in linear memory and passing a pointer to the +buffer. This greatly simplifies WebAssembly VM implementations by punting this +ABI consideration to the front-end compiler. It does negatively impact +performance, but variable-length calls are already somewhat slow. + + +## Multiple Return Values + +TODO + + +## Indirect Calls + +The exact semantics of indirect function calls, function pointers, and what +happens when calling the wrong function, are still being discussed. + +Fundamentally linear memory is a simple collection of bytes, which means that +some integral representation of function pointers must exist. It's desirable to +hide the actual address of generated code from untrusted code because that would +be an unfortunate information leak which could have negative security +implications. Indirection is therefore desired. + +One extra concern is that existing C++ code sometimes stores data inside of what +is usually a function pointer. This is expected to keep working. 
+ +Dynamic linking further complicates this: WebAssembly cannot simply standardize +on fixed-size function tables since dynamically linked code can add new +functions, as well as remove them. + + +## Expressions with Control Flow + +Expression trees offer significant size reduction by avoiding the need for +`set_local`/`get_local` pairs in the common case of an expression with only one, +immediate use. The `comma` and `conditional` primitives provide AST nodes that +express control flow and thus allow more opportunities to build bigger +expression trees and further reduce `set_local`/`get_local` usage (which +constitute 30-40% of total bytes in the +[polyfill prototype](https://github.com/WebAssembly/polyfill-prototype-1)). +Additionally, these primitives are useful building blocks for +WebAssembly-generators (including the JavaScript polyfill prototype). + + +## Limited Local Nondeterminism + +There are a few obvious cases where nondeterminism is essential to the API, such +as random number generators, date/time functions or input events. The +WebAssembly specification is strict when it comes to other sources of +[limited local nondeterminism](Nondeterminism.md) of operations: it specifies +all possible corner cases, and specifies a single outcome when this can be done +reasonably. + +Ideally, WebAssembly would be fully deterministic because a fully deterministic +platform is easier to: + +* Reason about. +* Implement. +* Test portably. + +Nondeterminism is only specified as a compromise when there is no other +practical way to: + +* Achieve [portable](Portability.md) native performance. +* Lower resource usage. +* Reduce implementation complexity (both of WebAssembly VMs as well as compilers + generating WebAssembly binaries). +* Allow usage of new hardware features. +* Allow implementations to security-harden certain use cases. + +When nondeterminism is allowed into WebAssembly it is always done in a limited +and local manner. 
This prevents the entire program from being invalid, as would +be the case with C++ undefined behavior. + +As WebAssembly gets implemented and tested with multiple languages on multiple +architectures there may be a need to revisit some of the decisions: + +* When all relevant hardware implements features the same way then there's no + need to add nondeterminism to WebAssembly when realistically there's only one + mapping from WebAssembly expression to ISA-specific operations. One such + example is floating-point: at a high level most basic instructions follow + IEEE-754 semantics, so it is not necessary to specify WebAssembly's + floating-point operations differently from IEEE-754. +* When different languages have different expectations then it's unfortunate if + WebAssembly measurably penalizes one's performance by enforcing determinism + which that language doesn't care about, but which another language may want.