From 0ef4e93fe3bea4356fdee2c647b2dbc54cc2d3d0 Mon Sep 17 00:00:00 2001
From: Luke Wagner
Date: Tue, 4 Aug 2015 09:02:31 -1000
Subject: [PATCH 1/5] Clarify, rename, and FAQ memory allocation

---
 AstSemantics.md   | 28 ++++++++++++++++++++--------
 FAQ.md            | 33 +++++++++++++++++++++++++++++++++
 FutureFeatures.md | 25 +++++++++++++++++++++----
 Modules.md        |  5 +++--
 Nondeterminism.md |  1 +
 5 files changed, 78 insertions(+), 14 deletions(-)

diff --git a/AstSemantics.md b/AstSemantics.md
index 35f71b2c..9bba34e9 100644
--- a/AstSemantics.md
+++ b/AstSemantics.md
@@ -71,17 +71,18 @@ Global variables and linear memory accesses use memory types.
 The main storage of a WebAssembly module, called the *linear memory*, is a
 contiguous, byte-addressable range of memory spanning from offset `0` and
-extending for `memory_size` bytes. The linear memory can be considered to be a
-untyped array of bytes. The linear memory is sandboxed; it does not alias the
-execution engine's internal data structures, the execution stack, local
+extending for `memory_size` bytes which can be dynamically adjusted by
+[`resize_memory`](Modules.md#resizing). The linear memory can be considered to
+be an untyped array of bytes. The linear memory is sandboxed; it does not alias
+the execution engine's internal data structures, the execution stack, local
 variables, global variables, or other process memory. The initial state of
 linear memory is specified by the
 [module](Modules.md#initial-state-of-linear-memory).
-In the MVP, linear memory is not shared between threads of execution or
-modules: every module has its own separate linear memory. It will,
-however, be possible to share linear memory between separate modules and
-threads once [threads](PostMVP.md#threads) and
-[dynamic linking](FutureFeatures.md#dynamic-inking) are added as features.
+In the MVP, linear memory is not shared between threads of execution. Separate
+modules can execute in separate threads but have their own linear memory and can
+only communicate through messaging, e.g. in browsers using `postMessage`. It
+will be possible to share linear memory between threads of execution when
+[threads](PostMVP.md#threads) are added.

 ### Linear Memory Operations

@@ -210,6 +211,17 @@ tradeoffs.
 execution of a module in a mode that threw exceptions on out-of-bounds
 access.

+### Resizing
+
+As stated [above](AstSemantics.md#linear-memory), linear memory can be resized
+by a `resize_memory` builtin operation. The resize delta is required to be a
+multiple of a global `page_size` constant. Also as stated
+[above](AstSemantics.md#linear-memory), linear memory is contiguous, meaning
+there are no "holes" in the linear address space. After the MVP, there are
+[future features](FutureFeatures.md#finer-grained-control-over-memory) proposed
+to allow setting protection and creating mappings within the contiguous
+linear memory.
+
 ## Local variables

 Each function has a fixed, pre-declared number of local variables which occupy a single
diff --git a/FAQ.md b/FAQ.md
index 42fefcfd..1e0fb5ec 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -229,3 +229,36 @@ WebAssembly implementations run on the user side, so there is no opportunity for
 * Most of the individual floating point operations that WebAssembly does have already map to individual fast instructions in hardware. Telling `add`, `sub`, or `mul` they don't have to worry about NaN for example doesn't make them any faster, because NaN is handled quickly and transparently in hardware on all modern platforms.
 * WebAssembly has no floating point traps, status register, dynamic rounding modes, or signalling NaNs, so optimizations that depend on the absence of these features are all safe.
+
+## What about `mmap`?
+
+The [`mmap`](http://pubs.opengroup.org/onlinepubs/009695399/functions/mmap.html)
+syscall has many useful features. While these are all packed into one
+overloaded syscall in POSIX, WebAssembly unpacks this functionality into
+multiple builtins:
+* the MVP starts with the ability to resize linear memory via a
+  [`resize_memory`](AstSemantics.md#resizing) builtin operation;
+* proposed [future features](FutureFeatures.md#finer-grained-control-over-memory)
+  would allow the application to change the protection and mappings for pages
+  in the contiguous range set by `resize_memory`.
+
+A significant feature of `mmap` that is missing from the above list is the
+ability to allocate disjoint virtual address ranges. The reasoning for this
+omission is:
+* The above functionality is sufficient to allow a user-level libc to
+  implement full, compatible `mmap` with what appears to be noncontiguous
+  memory allocation (but, under the hood is just coordinated use of
+  `memory_resize` and `mprotect`/`map_file`/`map_shmem`/`madvise`).
+* The benefit of allowing noncontiguous virtual address allocation would be if
+  it allowed the engine to interleave a WebAssembly module's linear memory with
+  other memory allocations in the same process (in order to mitigate virtual
+  address space fragmentation). There are two problems with this:
+  * This interleaving with unrelated allocations does not currently admit
+    efficient security checks to prevent one module from corrupting data outside
+    its heap (see discussion in #285).
+  * This interleaving would require making allocation nondeterministic.
+    Nondeterminism is something that WebAssemgly generally
+    [tries to avoid](Nondeterminism.md) and in this particular case, history
+    has clear examples of memory allocator nondeterminism leading to real-world
+    bustage ([[1](https://technet.microsoft.com/en-us/magazine/ff625273.aspx)],
+    [[2](http://lxr.free-electrons.com/source/include/linux/personality.h?v=3.2#L31)]).
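The FAQ hunk above argues that a user-level libc could present what looks like noncontiguous `mmap` allocation while only ever resizing one contiguous linear memory. A minimal sketch of that idea follows, in Python for illustration only. `LinearMemory`, `UserMmap`, the first-fit free list, and the 4096-byte `PAGE_SIZE` are all assumptions made for this example and are not defined by the patch.

```python
PAGE_SIZE = 4096  # assumed value; the patch leaves the real value to the system


def round_up(nbytes, multiple):
    """Round nbytes up to the next multiple of `multiple`."""
    return (nbytes + multiple - 1) // multiple * multiple


class LinearMemory:
    """Toy model of a contiguous linear memory (hypothetical, not the spec)."""

    def __init__(self, initial_pages=1):
        self.data = bytearray(initial_pages * PAGE_SIZE)

    def resize_memory(self, delta):
        """Grow (delta > 0) or shrink (delta < 0) by a page-multiple delta."""
        if delta % PAGE_SIZE != 0:
            raise ValueError("delta must be a multiple of the page size")
        new_size = len(self.data) + delta
        if new_size < 0:
            raise ValueError("cannot shrink below zero bytes")
        if delta >= 0:
            self.data.extend(bytes(delta))  # new pages start zeroed
        else:
            del self.data[new_size:]


class UserMmap:
    """Hypothetical libc-level allocator layered on resize_memory."""

    def __init__(self, memory):
        self.memory = memory
        self.free_ranges = []  # (addr, length) pairs released by munmap

    def mmap(self, length):
        """Return the address of a page-aligned range of at least `length` bytes."""
        if length <= 0:
            raise ValueError("length must be positive")
        length = round_up(length, PAGE_SIZE)
        # First fit: reuse a previously released range when one is big enough.
        for i, (addr, free_len) in enumerate(self.free_ranges):
            if free_len >= length:
                if free_len == length:
                    del self.free_ranges[i]
                else:
                    self.free_ranges[i] = (addr + length, free_len - length)
                return addr
        # Otherwise extend the single contiguous memory at its end.
        addr = len(self.memory.data)
        self.memory.resize_memory(length)
        return addr

    def munmap(self, addr, length):
        """Zero the range (modelling `dont_need`) and make it reusable."""
        length = round_up(length, PAGE_SIZE)
        self.memory.data[addr:addr + length] = bytes(length)
        self.free_ranges.append((addr, length))
```

From the caller's point of view `mmap` and `munmap` hand out and reclaim separate regions, yet the underlying memory never has holes, which is the property the FAQ relies on.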
diff --git a/FutureFeatures.md b/FutureFeatures.md
index f2320b1c..f8a8397d 100644
--- a/FutureFeatures.md
+++ b/FutureFeatures.md
@@ -40,10 +40,27 @@ possible to use a non-standard ABI for specialized purposes.

 ## Finer-grained control over memory

-* `mmap` of files.
-* `madvise(MADV_DONTNEED)`.
-* Shared memory, where a physical address range is mapped to multiple physical
-  pages in a single WebAssembly module as well as across modules.
+Provide access to safe OS-provided functionality including:
+* `map_file(addr, length, Blob, file-offset)`: semantically, this operation
+  copies the specified range from `Blob` into the range `[addr, addr+length)`
+  (where `addr+length <= memory_size`) but implementations are encouraged
+  to `mmap(addr, length, MAP_FIXED | MAP_PRIVATE, fd)`
+* `dont_need(addr, length)`: semantically, this operation zeroes the given range
+  but the implementation is encouraged to `madvise(addr, length, MADV_DONTNEED)`
+* `shmem_create(length)`: create a memory object that can be simultaneously
+  shared between multiple linear memories
+* `map_shmem(addr, length, shmem, shmem-offset)`: like `map_file` except
+  `MAP_SHARED`, which isn't otherwise valid on read-only Blobs
+* `mprotect(addr, length, prot-flags)`: change protection on the range
+  `[addr, addr+length)` (where `addr+length <= memory_size`)
+
+The `addr` and `length` parameters above would be required to be multiples of
+the [`page_size`](AstSemantics.md#resizing) global constant.
+
+The above list of functionality mostly covers the set of functionality
+provided by the `mmap` OS primitive. One significant exception is that `mmap`
+can allocate noncontiguous virtual address ranges. See the
+[FAQ](FAQ.md#what-about-mmap) for rationale.

 ## More expressive control flow

diff --git a/Modules.md b/Modules.md
index 435788ff..b91aba23 100644
--- a/Modules.md
+++ b/Modules.md
@@ -108,8 +108,9 @@ to allow *explicitly* sharing linear memory between multiple modules.
 ## Initial state of linear memory

 A module will contain a section declaring the linear memory size (initial and
-maximum size allowed by `sbrk`) and the initial contents of memory (analogous
-to `.data`, `.rodata`, `.bss` sections in native executables).
+maximum size allowed by [`resize_memory`](AstSemantics.md#resizing) and the
+initial contents of memory (analogous to `.data`, `.rodata`, `.bss` sections in
+native executables).

 ## Code section

diff --git a/Nondeterminism.md b/Nondeterminism.md
index e3296cb8..7685216f 100644
--- a/Nondeterminism.md
+++ b/Nondeterminism.md
@@ -31,6 +31,7 @@ currently admits nondeterminism:
    nondeterministic.
  * Out of bounds heap accesses *may* want
    [some flexibility](AstSemantics.md#out-of-bounds)
+ * The [`page_size` global constant](AstSemantics.md#resizing)
  * NaN bit patterns in floating point
    [operations](AstSemantics.md#floating-point-operations) and
    [conversions](AstSemantics.md#datatype-conversions-truncations-reinterpretations-promotions-and-demotions)

From ea8763fe83b98e47af39d0cb2c16e5c8db9746d3 Mon Sep 17 00:00:00 2001
From: Luke Wagner
Date: Wed, 5 Aug 2015 13:09:25 -1000
Subject: [PATCH 2/5] Address first round of comments

---
 FAQ.md            | 2 +-
 FutureFeatures.md | 3 +++
 Nondeterminism.md | 5 ++++-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/FAQ.md b/FAQ.md
index 1e0fb5ec..6c930fdf 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -255,7 +255,7 @@ omission is:
   address space fragmentation). There are two problems with this:
   * This interleaving with unrelated allocations does not currently admit
     efficient security checks to prevent one module from corrupting data outside
-    its heap (see discussion in #285).
+    its heap (see discussion in [#285](https://github.com/WebAssembly/design/pull/285)).
   * This interleaving would require making allocation nondeterministic.
     Nondeterminism is something that WebAssemgly generally
     [tries to avoid](Nondeterminism.md) and in this particular case, history
diff --git a/FutureFeatures.md b/FutureFeatures.md
index f8a8397d..07b4b055 100644
--- a/FutureFeatures.md
+++ b/FutureFeatures.md
@@ -47,6 +47,9 @@ Provide access to safe OS-provided functionality including:
   to `mmap(addr, length, MAP_FIXED | MAP_PRIVATE, fd)`
 * `dont_need(addr, length)`: semantically, this operation zeroes the given range
   but the implementation is encouraged to `madvise(addr, length, MADV_DONTNEED)`
+  (this allows applications to be good citizens and release unused physical
+  pages back to the OS, thereby reducing their RSS and avoiding OOM-killing on
+  mobile)
 * `shmem_create(length)`: create a memory object that can be simultaneously
   shared between multiple linear memories
 * `map_shmem(addr, length, shmem, shmem-offset)`: like `map_file` except
diff --git a/Nondeterminism.md b/Nondeterminism.md
index 7685216f..8a5e956e 100644
--- a/Nondeterminism.md
+++ b/Nondeterminism.md
@@ -31,7 +31,10 @@ currently admits nondeterminism:
    nondeterministic.
  * Out of bounds heap accesses *may* want
    [some flexibility](AstSemantics.md#out-of-bounds)
- * The [`page_size` global constant](AstSemantics.md#resizing)
+ * The `page_size` global constant is device-dependent. The arguments to the
+   [`resize_memory`](AstSemantics.md#resizing) and other
+   [future memory management builtins](FutureFeatures.md#finer-grained-control-over-memory)
+   are required to be multiples of `page_size`.
  * NaN bit patterns in floating point
    [operations](AstSemantics.md#floating-point-operations) and
    [conversions](AstSemantics.md#datatype-conversions-truncations-reinterpretations-promotions-and-demotions)

From fee4183555ccef8adda43d343d11de42fdbe77f4 Mon Sep 17 00:00:00 2001
From: Luke Wagner
Date: Thu, 6 Aug 2015 08:44:20 -1000
Subject: [PATCH 3/5] Tweak FAQ wording

---
 FAQ.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/FAQ.md b/FAQ.md
index 6c930fdf..9ee4ed79 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -258,7 +258,7 @@ omission is:
     its heap (see discussion in [#285](https://github.com/WebAssembly/design/pull/285)).
   * This interleaving would require making allocation nondeterministic.
     Nondeterminism is something that WebAssemgly generally
-    [tries to avoid](Nondeterminism.md) and in this particular case, history
-    has clear examples of memory allocator nondeterminism leading to real-world
-    bustage ([[1](https://technet.microsoft.com/en-us/magazine/ff625273.aspx)],
+    [tries to avoid](Nondeterminism.md) and history has clear examples of
+    memory allocator almost-determinism leading to real-world bustage
+    ([[1](https://technet.microsoft.com/en-us/magazine/ff625273.aspx)],
     [[2](http://lxr.free-electrons.com/source/include/linux/personality.h?v=3.2#L31)]).

From 6f691e94f1053f0e2d5378d9b6cd756db3de4838 Mon Sep 17 00:00:00 2001
From: Luke Wagner
Date: Wed, 12 Aug 2015 11:36:58 -1000
Subject: [PATCH 4/5] Rephrase page_size as nullary operator, not global

---
 AstSemantics.md   | 21 +++++++++++++--------
 FutureFeatures.md |  2 +-
 Nondeterminism.md |  2 +-
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/AstSemantics.md b/AstSemantics.md
index 9bba34e9..bc21c3de 100644
--- a/AstSemantics.md
+++ b/AstSemantics.md
@@ -213,14 +213,19 @@ tradeoffs.

 ### Resizing

-As stated [above](AstSemantics.md#linear-memory), linear memory can be resized
-by a `resize_memory` builtin operation. The resize delta is required to be a
-multiple of a global `page_size` constant. Also as stated
-[above](AstSemantics.md#linear-memory), linear memory is contiguous, meaning
-there are no "holes" in the linear address space. After the MVP, there are
-[future features](FutureFeatures.md#finer-grained-control-over-memory) proposed
-to allow setting protection and creating mappings within the contiguous
-linear memory.
+Linear memory can be resized by a `resize_memory` builtin operation. The
+`resize_memory` operation requires its operand to be a multiple of the system
+page size. To determine page size, a nullary `page_size` operation is provided.
+
+  * `resize_memory` : grow or shrink linear memory by a given delta which
+    must be a multiple of `page_size`
+  * `page_size` : nullary constant function returning page size
+
+Also as stated [above](AstSemantics.md#linear-memory), linear memory is
+contiguous, meaning there are no "holes" in the linear address space. After the
+MVP, there are [future features](FutureFeatures.md#finer-grained-control-over-memory)
+proposed to allow setting protection and creating mappings within the
+contiguous linear memory.

 ## Local variables

diff --git a/FutureFeatures.md b/FutureFeatures.md
index 07b4b055..2caa7208 100644
--- a/FutureFeatures.md
+++ b/FutureFeatures.md
@@ -58,7 +58,7 @@ Provide access to safe OS-provided functionality including:
   `[addr, addr+length)` (where `addr+length <= memory_size`)

 The `addr` and `length` parameters above would be required to be multiples of
-the [`page_size`](AstSemantics.md#resizing) global constant.
+[`page_size`](AstSemantics.md#resizing).

 The above list of functionality mostly covers the set of functionality
 provided by the `mmap` OS primitive. One significant exception is that `mmap`
diff --git a/Nondeterminism.md b/Nondeterminism.md
index 8a5e956e..e32fce99 100644
--- a/Nondeterminism.md
+++ b/Nondeterminism.md
@@ -31,7 +31,7 @@ currently admits nondeterminism:
    nondeterministic.
  * Out of bounds heap accesses *may* want
    [some flexibility](AstSemantics.md#out-of-bounds)
- * The `page_size` global constant is device-dependent. The arguments to the
+ * The value returned by `page_size` is system-dependent. The arguments to the
    [`resize_memory`](AstSemantics.md#resizing) and other
    [future memory management builtins](FutureFeatures.md#finer-grained-control-over-memory)
    are required to be multiples of `page_size`.

From 83d1f09542cf643db9bc47acddc990bd09490ea6 Mon Sep 17 00:00:00 2001
From: Luke Wagner
Date: Wed, 12 Aug 2015 15:17:22 -1000
Subject: [PATCH 5/5] Remove the useful anecdotes

---
 FAQ.md | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/FAQ.md b/FAQ.md
index 9ee4ed79..d5603ad6 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -256,9 +256,6 @@ omission is:
   * This interleaving with unrelated allocations does not currently admit
     efficient security checks to prevent one module from corrupting data outside
     its heap (see discussion in [#285](https://github.com/WebAssembly/design/pull/285)).
-  * This interleaving would require making allocation nondeterministic.
-    Nondeterminism is something that WebAssemgly generally
-    [tries to avoid](Nondeterminism.md) and history has clear examples of
-    memory allocator almost-determinism leading to real-world bustage
-    ([[1](https://technet.microsoft.com/en-us/magazine/ff625273.aspx)],
-    [[2](http://lxr.free-electrons.com/source/include/linux/personality.h?v=3.2#L31)]).
+  * This interleaving would require making allocation nondeterministic and
+    nondeterminism is something that WebAssemgly generally
+    [tries to avoid](Nondeterminism.md).
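
The series ends with `page_size` as a nullary operation whose result is system-dependent, and with `resize_memory` requiring an operand that is a multiple of that value. The sketch below, in Python for illustration only, shows why portable code would query the page size rather than hardcode it. The helper names `page_aligned_delta` and `checked_resize` and the sample page sizes are assumptions made for this example, not part of the patches.

```python
def page_aligned_delta(requested_bytes, page_size):
    """Smallest multiple of page_size that covers requested_bytes (>= 0)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    if requested_bytes < 0:
        raise ValueError("requested_bytes must be non-negative")
    return (requested_bytes + page_size - 1) // page_size * page_size


def checked_resize(memory_size, delta, page_size):
    """Model of the operand check: return the new size or raise.

    delta may be negative (shrink) but must be a multiple of page_size,
    mirroring the requirement stated in the Resizing section of the patch.
    """
    if delta % page_size != 0:
        raise ValueError("delta must be a multiple of page_size")
    new_size = memory_size + delta
    if new_size < 0:
        raise ValueError("cannot shrink below zero bytes")
    return new_size
```

The same 10000-byte request yields a different delta depending on what `page_size` returns (12288 for an assumed 4096-byte page, 65536 for an assumed 65536-byte page), which is the nondeterminism the final Nondeterminism.md wording points at.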