Merged
33 changes: 25 additions & 8 deletions AstSemantics.md
@@ -71,17 +71,18 @@ Global variables and linear memory accesses use memory types.

The main storage of a WebAssembly module, called the *linear memory*, is a
contiguous, byte-addressable range of memory spanning from offset `0` and
extending for `memory_size` bytes. The linear memory can be considered to be a
untyped array of bytes. The linear memory is sandboxed; it does not alias the
execution engine's internal data structures, the execution stack, local
extending for `memory_size` bytes which can be dynamically adjusted by
[`resize_memory`](Modules.md#resizing). The linear memory can be considered to
be an untyped array of bytes. The linear memory is sandboxed; it does not alias
the execution engine's internal data structures, the execution stack, local
variables, global variables, or other process memory. The initial state of
linear memory is specified by the [module](Modules.md#initial-state-of-linear-memory).

In the MVP, linear memory is not shared between threads of execution. Separate
modules can execute in separate threads but have their own linear memory and can
only communicate through messaging, e.g. in browsers using `postMessage`. It
will be possible to share linear memory between threads of execution when
[threads](PostMVP.md#threads) are added.

### Linear Memory Operations

@@ -210,6 +211,22 @@ tradeoffs.
execution of a module in a mode that threw exceptions on out-of-bounds
access.

### Resizing

Linear memory can be resized by a `resize_memory` builtin operation. The
`resize_memory` operation requires its operand to be a multiple of the system
page size. To determine page size, a nullary `page_size` operation is provided.

* `resize_memory` : grow or shrink linear memory by a given delta which
must be a multiple of `page_size`
* `page_size` : nullary constant function returning page size
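A minimal sketch in C of the intended semantics, modeling linear memory as a heap-allocated byte array. The `PAGE_SIZE` value and the error-return convention are assumptions for illustration only; the design does not specify a C binding for these builtins.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative model only, not an engine implementation. */
#define PAGE_SIZE 4096  /* assumed; the real page_size() value is system-dependent */

static unsigned char *memory = NULL;  /* linear memory contents */
static size_t memory_size = 0;        /* current size in bytes */

static size_t page_size(void) { return PAGE_SIZE; }

/* Grow (positive delta) or shrink (negative delta) linear memory.
   Returns 0 on success, -1 if delta is not a multiple of page_size()
   or would shrink memory below zero. */
static int resize_memory(long delta) {
    if (delta % (long)page_size() != 0) return -1;
    if (delta < 0 && (size_t)-delta > memory_size) return -1;
    size_t new_size = memory_size + (size_t)delta;
    unsigned char *p = realloc(memory, new_size ? new_size : 1);
    if (p == NULL) return -1;
    if (delta > 0)  /* newly added pages read as zero */
        memset(p + memory_size, 0, (size_t)delta);
    memory = p;
    memory_size = new_size;
    return 0;
}
```

The model captures the two constraints stated above: the delta must be a page multiple, and memory stays a single contiguous, zero-initialized-on-growth byte range.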

As stated [above](AstSemantics.md#linear-memory), linear memory is
contiguous, meaning there are no "holes" in the linear address space. After the
MVP, there are [future features](FutureFeatures.md#finer-grained-control-over-memory)
proposed to allow setting protection and creating mappings within the
contiguous linear memory.

## Local variables

Each function has a fixed, pre-declared number of local variables which occupy a single
30 changes: 30 additions & 0 deletions FAQ.md
@@ -229,3 +229,33 @@ WebAssembly implementations run on the user side, so there is no opportunity for
* Most of the individual floating point operations that WebAssembly does have already map to individual fast instructions in hardware. Telling `add`, `sub`, or `mul` they don't have to worry about NaN for example doesn't make them any faster, because NaN is handled quickly and transparently in hardware on all modern platforms.

* WebAssembly has no floating point traps, status register, dynamic rounding modes, or signalling NaNs, so optimizations that depend on the absence of these features are all safe.

## What about `mmap`?

The [`mmap`](http://pubs.opengroup.org/onlinepubs/009695399/functions/mmap.html)
syscall has many useful features. While these are all packed into one
overloaded syscall in POSIX, WebAssembly unpacks this functionality into
multiple builtins:
* the MVP starts with the ability to resize linear memory via a
[`resize_memory`](AstSemantics.md#resizing) builtin operation;
* proposed [future features](FutureFeatures.md#finer-grained-control-over-memory)
would allow the application to change the protection and mappings for pages
in the contiguous range set by `resize_memory`.

A significant feature of `mmap` that is missing from the above list is the
ability to allocate disjoint virtual address ranges. The reasoning for this
omission is:
* The above functionality is sufficient to allow a user-level libc to
  implement a full, compatible `mmap` with what appears to be noncontiguous
  memory allocation (but, under the hood, is just coordinated use of
  `resize_memory` and `mprotect`/`map_file`/`map_shmem`/`madvise`).
* The benefit of allowing noncontiguous virtual address allocation would be if
it allowed the engine to interleave a WebAssembly module's linear memory with
other memory allocations in the same process (in order to mitigate virtual
address space fragmentation). There are two problems with this:
* This interleaving with unrelated allocations does not currently admit
efficient security checks to prevent one module from corrupting data outside
its heap (see discussion in [#285](https://github.com/WebAssembly/design/pull/285)).
  * This interleaving would require making allocation nondeterministic, and
    nondeterminism is something that WebAssembly generally
    [tries to avoid](Nondeterminism.md).
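The user-level approach described above can be sketched in C: a hypothetical `wasm_mmap` hands out page-aligned ranges from a bump pointer inside the single contiguous linear memory, growing it with `resize_memory` on demand. All names here are illustrative; no such libc layer exists yet, and `resize_memory` is modeled as a plain counter update.

```c
#include <stddef.h>

/* Illustrative model: a bump allocator handing out "mappings" from one
   contiguous linear memory, as a user-level libc could do. */
#define PAGE_SIZE 4096          /* assumed page_size() value */

static size_t memory_size = 0;  /* models the module's linear memory size */
static size_t map_break = 0;    /* next unmapped page-aligned offset */

/* Models the resize_memory builtin: grow linear memory by delta bytes. */
static void resize_memory(size_t delta) { memory_size += delta; }

/* Return a page-aligned offset for a fresh "mapping" of len bytes,
   growing linear memory when the reservation runs past the end. */
static size_t wasm_mmap(size_t len) {
    size_t pages = (len + PAGE_SIZE - 1) / PAGE_SIZE;  /* round up to pages */
    size_t addr = map_break;
    map_break += pages * PAGE_SIZE;
    if (map_break > memory_size)
        resize_memory(map_break - memory_size);        /* extend the heap */
    return addr;
}
```

To the application every returned range looks like an independent mapping, yet all of them live inside the one contiguous address range, which is why the engine never needs to allocate disjoint virtual ranges on its behalf.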
28 changes: 24 additions & 4 deletions FutureFeatures.md
@@ -40,10 +40,30 @@ possible to use a non-standard ABI for specialized purposes.

## Finer-grained control over memory

Provide access to safe OS-provided functionality including:
* `map_file(addr, length, Blob, file-offset)`: semantically, this operation
copies the specified range from `Blob` into the range `[addr, addr+length)`
(where `addr+length <= memory_size`) but implementations are encouraged
to `mmap(addr, length, MAP_FIXED | MAP_PRIVATE, fd)`
* `dont_need(addr, length)`: semantically, this operation zeroes the given range
but the implementation is encouraged to `madvise(addr, length, MADV_DONTNEED)`
Member: I think we need to emphasize this way more: this allows applications to be good citizens and release physical memory. That's a huge win on memory-constrained devices, and is a pretty important feature for many platforms (otherwise the OS is stuck playing favorites with an OOM killer).

Member Author: good point

(this allows applications to be good citizens and release unused physical
pages back to the OS, thereby reducing their RSS and avoiding OOM-killing on
mobile)
* `shmem_create(length)`: create a memory object that can be simultaneously
shared between multiple linear memories
* `map_shmem(addr, length, shmem, shmem-offset)`: like `map_file` except
`MAP_SHARED`, which isn't otherwise valid on read-only Blobs
* `mprotect(addr, length, prot-flags)`: change protection on the range
`[addr, addr+length)` (where `addr+length <= memory_size`)

The `addr` and `length` parameters above would be required to be multiples of
[`page_size`](AstSemantics.md#resizing).
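The `dont_need` semantics above admit a trivial fallback: an engine that cannot (or chooses not to) release physical pages may simply zero the range. A sketch in C over a modeled linear memory; the function name, page size, and error convention are assumptions for illustration.

```c
#include <string.h>

#define PAGE_SIZE 4096                       /* assumed page_size() value */
static unsigned char memory[4 * PAGE_SIZE];  /* modeled linear memory */

/* Semantic fallback for dont_need: the range simply reads as zero
   afterward. A real engine would prefer madvise(MADV_DONTNEED) so the
   OS can reclaim the physical pages. */
static int dont_need(size_t addr, size_t length) {
    if (addr % PAGE_SIZE || length % PAGE_SIZE) return -1;  /* page multiples */
    if (addr + length > sizeof(memory)) return -1;          /* in bounds */
    memset(memory + addr, 0, length);
    return 0;
}
```

Because the observable effect is just "reads as zero", the `madvise` path and the `memset` path are indistinguishable to the module, which is what lets implementations pick the cheaper one.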

The above list mostly covers the functionality
provided by the `mmap` OS primitive. One significant exception is that `mmap`
can allocate noncontiguous virtual address ranges. See the
[FAQ](FAQ.md#what-about-mmap) for rationale.

## More expressive control flow

5 changes: 3 additions & 2 deletions Modules.md
@@ -108,8 +108,9 @@ to allow *explicitly* sharing linear memory between multiple modules.
## Initial state of linear memory

A module will contain a section declaring the linear memory size (initial and
maximum size allowed by [`resize_memory`](AstSemantics.md#resizing)) and the
initial contents of memory (analogous to `.data`, `.rodata`, `.bss` sections in
native executables).
Member: Do we want to dictate these, or let wasm handle them but restrict where they can go? As discussed in #285 we can get a no-signal non-page-table but highly-efficient zero-page and .rodata implementation.

Member Author: In a sense, .rodata is letting wasm handle them but restricting where they can go. Having the data declared in a global section also allows the wasm compiler to bake in more constant address info compared to a fully dynamic API that returned dynamic pointers to the data (both with and without dynamic loading). That all being said, there's a lot more design space to consider on this subject, but it seems like a topic for a separate PR/issue.


## Code section

4 changes: 4 additions & 0 deletions Nondeterminism.md
@@ -31,6 +31,10 @@ currently admits nondeterminism:
nondeterministic.
* Out of bounds heap accesses *may* want
[some flexibility](AstSemantics.md#out-of-bounds)
* The value returned by `page_size` is system-dependent. The arguments to the
[`resize_memory`](AstSemantics.md#resizing) and other
[future memory management builtins](FutureFeatures.md#finer-grained-control-over-memory)
are required to be multiples of `page_size`.
* NaN bit patterns in floating point
[operations](AstSemantics.md#floating-point-operations) and
[conversions](AstSemantics.md#datatype-conversions-truncations-reinterpretations-promotions-and-demotions)