From 85873a605d893da16cef481c4f6b3295a219482f Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Mon, 18 Apr 2022 15:39:46 -0500 Subject: [PATCH 01/27] Split core definitions out into separate index spaces --- design/mvp/Binary.md | 344 ++++--- design/mvp/CanonicalABI.md | 161 ++-- design/mvp/Explainer.md | 868 ++++++++++-------- design/mvp/FutureFeatures.md | 25 +- design/mvp/Subtyping.md | 6 +- design/mvp/canonical-abi/definitions.py | 76 +- design/mvp/canonical-abi/run_tests.py | 4 +- .../SharedEverythingDynamicLinking.md | 20 +- 8 files changed, 819 insertions(+), 685 deletions(-) diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md index 4db0707f..a37f4c59 100644 --- a/design/mvp/Binary.md +++ b/design/mvp/Binary.md @@ -3,7 +3,7 @@ This document defines the binary format for the AST defined in the [explainer](Explainer.md). The top-level production is `component` and the convention is that a file suffixed in `.wasm` may contain either a -[`core:module`] *or* a `component`, using the `kind` field to discriminate +[`core:module`] *or* a `component`, using the `layer` field to discriminate between the two in the first 8 bytes (see [below](#component-definitions) for more details). @@ -17,197 +17,234 @@ and validation will be present in the [formal specification](../../spec/). (See [Component Definitions](Explainer.md#component-definitions) in the explainer.) ``` -component ::= s*:
* => (component flatten(s*)) -preamble ::= +component ::= s*:
* => (component flatten(s*)) +preamble ::= magic ::= 0x00 0x61 0x73 0x6D version ::= 0x0a 0x00 -kind ::= 0x01 0x00 -section ::= section_0() => ϵ - | t*:section_1(vec()) => t* - | i*:section_2(vec()) => i* - | f*:section_3(vec()) => f* - | m: section_4() => m - | c: section_5() => c - | i*:section_6(vec()) => i* - | e*:section_7(vec()) => e* - | s: section_8() => s - | a*:section_9(vec()) => a* +layer ::= 0x01 0x00 +section ::= section_0() => ϵ + | m*:section_1() => [core-prefix(m)] + | i*:section_2(vec()) => core-prefix(i)* + | a*:section_3(vec()) => core-prefix(a)* + | t*:section_4(vec()) => core-prefix(t)* + | c: section_5() => [c] + | i*:section_6(vec()) => i* + | a*:section_7(vec()) => a* + | t*:section_8(vec()) => t* + | c*:section_9(vec()) => c* + | s: section_10() => [s] + | i*:section_11(vec()) => i* + | e*:section_12(vec()) => e* ``` Notes: * Reused Core binary rules: [`core:section`], [`core:custom`], [`core:module`] +* The `core-prefix(t)` meta-function inserts a `core` token after the leftmost + paren of `t` (e.g., `core-prefix( (module (func)) )` is `(core module (func))`). * The `version` given above is pre-standard. As the proposal changes before final standardization, `version` will be bumped from `0xa` upwards to coordinate prototypes. When the standard is finalized, `version` will be changed one last time to `0x1`. (This mirrors the path taken for the Core WebAssembly 1.0 spec.) -* The `kind` field is meant to distinguish modules from components early in the - binary format. (Core WebAssembly modules already implicitly have a `kind` - field of `0x0` in their 4 byte [`core:version`] field.) +* The `layer` field is meant to distinguish modules from components early in + the binary format. (Core WebAssembly modules already implicitly have a + `layer` field of `0x0` in their 4 byte [`core:version`] field.) ## Instance Definitions (See [Instance Definitions](Explainer.md#instance-definitions) in the explainer.) 
``` -instance ::= ie: => (instance ie) -instanceexpr ::= 0x00 0x00 m: a*:vec() => (instantiate (module m) (with a)*) - | 0x00 0x01 c: a*:vec() => (instantiate (component c) (with a)*) - | 0x01 e*:vec() => e* - | 0x02 e*:vec() => e* -modulearg ::= n: 0x02 i: => n (instance i) -componentarg ::= n: 0x00 m: => n (module m) - | n: 0x01 c: => n (component c) - | n: 0x02 i: => n (instance i) - | n: 0x03 f: => n (func f) - | n: 0x04 v: => n (value v) - | n: 0x05 t: => n (type t) -export ::= a: => (export a) -name ::= n: => n +core:instance ::= ie: => (instance ie) +core:instanceexpr ::= 0x00 m: arg*:vec() => (instantiate m arg*) + | 0x01 e*:vec() => e* +core:instantiatearg ::= n: si: => (with n si) +core:sortidx ::= sort: idx: => (sort idx) +core:sort ::= 0x00 => func + | 0x01 => table + | 0x02 => memory + | 0x03 => global + | 0x04 => type + | 0x10 => module + | 0x11 => instance +core:export ::= n: si: => (export n si) + +instance ::= ie: => (instance ie) +instanceexpr ::= 0x00 c: arg*:vec() => (instantiate c arg*) + | 0x01 e*:vec() => e* +instantiatearg ::= n: si: => (with n si) +sortidx ::= sort: idx: => (sort idx) +sort ::= 0x00 si: => si + | 0x01 => func + | 0x02 => value + | 0x03 => type + | 0x04 => component + | 0x05 => instance +export ::= n: si: => (export n si) ``` Notes: -* Reused Core binary rules: [`core:export`], [`core:name`] -* The indices in `modulearg`/`componentarg` are validated according to their - respective index space, which are built incrementally as each definition is - validated. In general, unlike core modules, which supports cyclic references - between (function) definitions, component definitions are strictly acyclic - and validated in a linear incremental manner, like core wasm instructions. -* The arguments supplied by `instantiate` are validated against the consuming - module/component according to the [subtyping](Subtyping.md) rules. 
- +* Reused Core binary rules: [`core:name`] +* The `core:sort` values are chosen to match the discriminant opcodes of + [`core:importdesc`] so that `core:exportdesc` (below) is identical. +* `type` is added to `core:sort` in anticipation of the [type-imports] proposal. Until that + proposal, core modules won't be able to actually import or export types, however, the + `type` sort is allowed as part of outer aliases (below). +* `module` and `instance` are added to `core:sort` in anticipation of the [module-linking] + proposal, which would add these types to Core WebAssembly. Again, core modules won't be + able to actually import or export modules/instances, but they are used for aliases. +* The indices in `sortidx` are validated according to their `sort`'s index + spaces, which are built incrementally as each definition is validated. +* The types of arguments supplied by `instantiate` are validated against the + types of the matching import according to the [subtyping](Subtyping.md) rules. ## Alias Definitions (See [Alias Definitions](Explainer.md#alias-definitions) in the explainer.) ``` -alias ::= 0x00 0x00 i: n: => (alias export i n (module)) - | 0x00 0x01 i: n: => (alias export i n (component)) - | 0x00 0x02 i: n: => (alias export i n (instance)) - | 0x00 0x03 i: n: => (alias export i n (func)) - | 0x00 0x04 i: n: => (alias export i n (value)) - | 0x01 0x00 i: n: => (alias export i n (func)) - | 0x01 0x01 i: n: => (alias export i n (table)) - | 0x01 0x02 i: n: => (alias export i n (memory)) - | 0x01 0x03 i: n: => (alias export i n (global)) - | ... 
other Post-MVP Core definition kinds - | 0x02 0x00 ct: i: => (alias outer ct i (module)) - | 0x02 0x01 ct: i: => (alias outer ct i (component)) - | 0x02 0x05 ct: i: => (alias outer ct i (type)) +core:alias ::= sort: target: => (core alias target (sort)) +core:aliastarget ::= 0x00 i: n: => export i n + | 0x01 ct: idx: => outer ct idx + +alias ::= sort: target: => (alias target (sort)) +aliastarget ::= 0x00 i: n: => export i n + | 0x01 ct: idx: => outer ct idx ``` Notes: -* For instance-export aliases (opcodes `0x00` and `0x01`), `i` is validated to - refer to an instance in the instance index space that exports `n` with the - specified definition kind. -* For outer aliases (opcode `0x02`), `ct` is validated to be *less or equal - than* the number of enclosing components and `i` is validated to be a valid - index in the specified definition's index space of the enclosing component - indicated by `ct` (counting outward, starting with `0` referring to the - current component). +* For `export` aliases, `i` is validated to refer to an instance in the + instance index space that exports `n` with the specified `sort`. +* For `outer` aliases, `ct` is validated to be *less than or equal to* the number + of enclosing components and `idx` is validated to be a valid + index in the `sort` index space of the `ct`th enclosing component (counting + outward, starting with `0` referring to the current component). +* For `outer` aliases, validation restricts the `sort` of the `aliastarget` + to one of `type`, `module` or `component`. ## Type Definitions (See [Type Definitions](Explainer.md#type-definitions) in the explainer.) ``` -type ::= dt: => dt - | it: => it -deftype ::= mt: => mt - | ct: => ct - | it: => it - | ft: => ft - | vt: => vt -moduletype ::= 0x4f mtd*:vec() => (module mtd*) -moduletype-def ::= 0x01 dt: => dt - | 0x02 i: => i - | 0x07 n: d: => (export n d) -core:deftype ::= ft: => ft - | ... Post-MVP additions => ... 
-componenttype ::= 0x4e ctd*:vec() => (component ctd*) -instancetype ::= 0x4d itd*:vec() => (instance itd*) -componenttype-def ::= itd: => itd - | 0x02 i: => i -instancetype-def ::= 0x01 t: => t - | 0x07 n: dt: => (export n dt) - | 0x09 a: => a -import ::= n: dt: => (import n dt) -deftypeuse ::= i: => type-index-space[i] (must be ) -functype ::= 0x4c param*:vec() t: => (func param* (result t)) -param ::= 0x00 t: => (param t) - | 0x01 n: t: => (param n t) -valuetype ::= 0x4b t: => (value t) -intertypeuse ::= i: => type-index-space[i] (must be ) - | pit: => pit -primintertype ::= 0x7f => unit - | 0x7e => bool - | 0x7d => s8 - | 0x7c => u8 - | 0x7b => s16 - | 0x7a => u16 - | 0x79 => s32 - | 0x78 => u32 - | 0x77 => s64 - | 0x76 => u64 - | 0x75 => float32 - | 0x74 => float64 - | 0x73 => char - | 0x72 => string -intertype ::= pit: => pit - | 0x71 field*:vec() => (record field*) - | 0x70 case*:vec() => (variant case*) - | 0x6f t: => (list t) - | 0x6e t*:vec() => (tuple t*) - | 0x6d n*:vec() => (flags n*) - | 0x6c n*:vec() => (enum n*) - | 0x6b t*:vec() => (union t*) - | 0x6a t: => (option t) - | 0x69 t: u: => (expected t u) -field ::= n: t: => (field n t) -case ::= n: t: 0x0 => (case n t) - | n: t: 0x1 i: => (case n t (refines case-label[i])) +core:type ::= dt: => (type dt) (GC proposal) +core:deftype ::= ft: => ft (WebAssembly 1.0) + | st: => st (GC proposal) + | at: => at (GC proposal) + | mt: => mt +core:moduletype ::= 0x50 md*:vec() => (module md*) +core:moduledecl ::= 0x00 i: => i + | 0x01 t: => t + | 0x02 a: => a + | 0x03 e: => e +core:import ::= m: f: ed: => (import m f ed) (WebAssembly 1.0) +core:externdesc ::= id: => id (WebAssembly 1.0) +core:exportdecl ::= n: ed: => (export n ed) +``` +Notes: +* Reused Core binary rules: [`core:importdesc`], [`core:functype`] +* `core:import` as written above is binary-compatible with [`core:import`]. 
+* Validation of `core:moduledecl` (currently) rejects `core:moduletype` definitions + inside `type` declarators (i.e., nested core module types). +* Validation of `core:moduledecl` (currently) only allows `outer` `type` + `alias` declarators. +* As described in the explainer, each module type is validated with an + initially-empty type index space. Outer aliases can be used to pull + in type definitions from containing components. + +``` +type ::= dt: => (type dt) +deftype ::= vt: => vt + | ft: => ft + | ct: => ct + | it: => it +functype ::= 0x40 param*:vec() t: => (func param* (result t)) +param ::= 0x00 t: => (param t) + | 0x01 n: t: => (param n t) +componenttype ::= 0x41 cd*:vec() => (component cd*) +instancetype ::= 0x42 id*:vec() => (instance id*) +componentdecl ::= 0x00 id: => id + | id: => id +instancedecl ::= 0x01 t: => t + | 0x02 a: => a + | 0x03 ed: => ed +importdecl ::= n: ed: => (import n ed) +exportdecl ::= n: ed: => (export n ed) +externdesc ::= 0x00 i: => core-type-index-space[i] (must be moduletype) + | 0x01 i: => type-index-space[i] (must be func|instance|componenttype) + | 0x02 t: => (value t) + | 0x03 tb: => (type tb) +typebound ::= 0x00 i: => (eq type-index-space[i]) (any deftype) + | 0x00 t: => (eq t) +valtype ::= i: => type-index-space[i] (must be valtype) + | 0x7f => unit + | 0x7e => bool + | 0x7d => s8 + | 0x7c => u8 + | 0x7b => s16 + | 0x7a => u16 + | 0x79 => s32 + | 0x78 => u32 + | 0x77 => s64 + | 0x76 => u64 + | 0x75 => float32 + | 0x74 => float64 + | 0x73 => char + | 0x72 => string + | 0x71 field*:vec() => (record field*) + | 0x70 case*:vec() => (variant case*) + | 0x6f t: => (list t) + | 0x6e t*:vec() => (tuple t*) + | 0x6d n*:vec() => (flags n*) + | 0x6c n*:vec() => (enum n*) + | 0x6b t*:vec() => (union t*) + | 0x6a t: => (option t) + | 0x69 t: u: => (expected t u) +field ::= n: t: => (field n t) +case ::= n: t: 0x0 => (case n t) + | n: t: 0x1 i: => (case n t (refines case-label[i])) ``` Notes: -* Reused Core binary rules: 
[`core:import`], [`core:importdesc`], [`core:functype`] * The type opcodes follow the same negative-SLEB128 scheme as Core WebAssembly, with type opcodes starting at SLEB128(-1) (`0x7f`) and going down, reserving the nonnegative SLEB128s for type indices. -* The (`module`|`component`|`instance`)`type-def` opcodes match the corresponding - section numbers. -* Module, component and instance types create fresh type index spaces that are - populated and referenced by their contained definitions. E.g., for a module - type that imports a function, the `import` `moduletype-def` must be preceded - by either a `type` or `alias` `moduletype-def` that adds the function type to - the type index space. -* Currently, the only allowed form of `alias` in instance and module types - is `(alias outer ct li (type))`. In the future, other kinds of aliases - will be needed and this restriction will be relaxed. +* Validation of `moduledecl` (currently) only allows `outer` `type` `alias` + declarators. +* As described in the explainer, each component and instance type is validated + with an initially-empty type index space. Outer aliases can be used to pull + in type definitions from containing components. +* The rule for `typebound` contains both an unrestricted `` case and, + within `valtype`, a `valtype`-restricted `` case. Since the former + is a strict generalization of the latter, there is no ambiguity. The net + effect is that `eq` accepts all types. -## Function Definitions +## Canonical Definitions -(See [Function Definitions](Explainer.md#function-definitions) in the explainer.) +(See [Canonical Definitions](Explainer.md#canonical-definitions) in the explainer.) 
``` -func ::= body: => (func body) -funcbody ::= 0x00 ft: opt*:vec() f: => (canon.lift ft opt* f) - | 0x01 opt*:* f: => (canon.lower opt* f) -canonopt ::= 0x00 => string-encoding=utf8 - | 0x01 => string-encoding=utf16 - | 0x02 => string-encoding=latin1+utf16 - | 0x03 m: => (memory m) - | 0x04 f: => (realloc f) - | 0x05 f: => (post-return f) +canon ::= 0x00 0x00 f: ft: opts: => (canon lift f type-index-space[ft] opts (func)) + | 0x01 0x00 f: opts: => (canon lower f opts (core func)) +opts ::= opt*:vec() => opt* +canonopt ::= 0x00 => string-encoding=utf8 + | 0x01 => string-encoding=utf16 + | 0x02 => string-encoding=latin1+utf16 + | 0x03 m: => (memory m) + | 0x04 f: => (realloc f) + | 0x05 f: => (post-return f) ``` Notes: -* Validation prevents duplicate or conflicting options. -* Validation of `canon.lift` requires `f` to have type `flatten(ft)` (defined +* The second `0x00` byte in `canon` stands for the `func` sort and thus the + `0x00 ` pair stands for a `func` `sortidx` or `core:sortidx`. +* Validation prevents duplicate or conflicting `canonopt`. +* Validation of `canon lift` requires `f` to have type `flatten(ft)` (defined by the [Canonical ABI](CanonicalABI.md#flattening)). The function being defined is given type `ft`. -* Validation of `canon.lower` requires `f` to be a component function. The +* Validation of `canon lower` requires `f` to be a component function. The function being defined is given core function type `flatten(ft)` where `ft` is the `functype` of `f`. -* If the lifting/lowering operations implied by `canon.lift` or `canon.lower` - require access to `memory` or `realloc`, then validation requires these - options to be present. If present, `realloc` must have type +* If the lifting/lowering operations implied by `lift` or `lower` require + access to `memory` or `realloc`, then validation requires these options to be + present. If present, `realloc` must have core type `(func (param i32 i32 i32 i32) (result i32))`. 
-* `post-return` is always optional, but, if present, must have type `(func)`. +* `post-return` is always optional, but, if present, must have core type + `(func)`. ## Start Definitions @@ -233,24 +270,25 @@ flags are set. ## Import and Export Definitions -(See [Import and Export Definitions](Explainer.md#import-and-export-definitions) in the explainer.) - -As described in the explainer, the binary decode rules of `import` and `export` -have already been defined above. - +(See [Import and Export Definitions](Explainer.md#import-and-export-definitions) +in the explainer.) +``` +import ::= n: ed: => (import n ed) +export ::= n: si: => (export n si) +``` Notes: * Validation requires all import and export `name`s are unique. -[`core:version`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-version [`core:section`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-section [`core:custom`]: https://webassembly.github.io/spec/core/binary/modules.html#custom-section [`core:module`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-module -[`core:export`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-export +[`core:version`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-version [`core:name`]: https://webassembly.github.io/spec/core/binary/values.html#binary-name [`core:import`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-import [`core:importdesc`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-importdesc [`core:functype`]: https://webassembly.github.io/spec/core/binary/types.html#binary-functype -[Future Core Type]: https://github.com/WebAssembly/gc/blob/master/proposals/gc/MVP.md#type-definitions +[type-imports]: https://github.com/WebAssembly/proposal-type-imports/blob/master/proposals/type-imports/Overview.md +[module-linking]: https://github.com/WebAssembly/module-linking/blob/main/proposals/module-linking/Explainer.md diff 
--git a/design/mvp/CanonicalABI.md b/design/mvp/CanonicalABI.md index 96c45923..02173fc6 100644 --- a/design/mvp/CanonicalABI.md +++ b/design/mvp/CanonicalABI.md @@ -1,7 +1,7 @@ # Canonical ABI Explainer -This explainer walks through the Canonical ABI used by [function definitions] -to convert between high-level interface-typed values and low-level Core +This explainer walks through the Canonical ABI used by [canonical definitions] +to convert between high-level Component Model values and low-level Core WebAssembly values. * [Supporting definitions](#supporting-definitions) @@ -14,16 +14,16 @@ WebAssembly values. * [Flat Lifting](#flat-lifting) * [Flat Lowering](#flat-lowering) * [Lifting and Lowering](#lifting-and-lowering) -* [Canonical ABI built-ins](#canonical-abi-built-ins) - * [`canon.lift`](#canonlift) - * [`canon.lower`](#canonlower) +* [Canonical definitions](#canonical-definitions) + * [`lift`](#lift) + * [`lower`](#lower) ## Supporting definitions -The Canonical ABI specifies, for each interface-typed function signature, a +The Canonical ABI specifies, for each component function signature, a corresponding core function signature and the process for reading -interface-typed values into and out of linear memory. While a full formal +component-level values into and out of linear memory. While a full formal specification would specify the Canonical ABI in terms of macro-expansion into Core WebAssembly instructions augmented with a new set of (spec-internal) [administrative instructions], the informal presentation here instead specifies @@ -52,19 +52,19 @@ necessary to support recovery in the middle of nested allocations. In the MVP, for large allocations that can OOM, [streams](Explainer.md#TODO) would usually be the appropriate type to use and streams will be able to explicitly express failure in their type. 
Post-MVP, [adapter functions] would allow fully custom -OOM handling for all interface types, allowing a toolchain to intentionally -propagate OOM into the appropriate explicit return value of the function's -declared return type. +OOM handling for all component-level types, allowing a toolchain to +intentionally propagate OOM into the appropriate explicit return value of the +function's declared return type. ### Despecialization -[In the explainer][Type Definitions], interface types are classified as either *fundamental* or -*specialized*, where the specialized interface types are defined by expansion -into fundamental interface types. In most cases, the canonical ABI of a -specialized interface type is the same as its expansion so, to avoid +[In the explainer][Type Definitions], component value types are classified as +either *fundamental* or *specialized*, where the specialized value types are +defined by expansion into fundamental value types. In most cases, the canonical +ABI of a specialized value type is the same as its expansion so, to avoid repetition, the other definitions below use the following `despecialize` -function to replace specialized interface types with their expansion: +function to replace specialized value types with their expansion: ```python def despecialize(t): match t: @@ -76,14 +76,14 @@ def despecialize(t): case Expected(ok, error) : return Variant([ Case("ok", ok), Case("error", error) ]) case _ : return t ``` -The specialized interface types `string` and `flags` are missing from this list +The specialized value types `string` and `flags` are missing from this list because they are given specialized canonical ABI representations distinct from their respective expansions. ### Alignment -Each interface type is assigned an [alignment] which is used by subsequent +Each value type is assigned an [alignment] which is used by subsequent Canonical ABI definitions. 
Presenting the definition of `alignment` piecewise, we start with the top-level case analysis: ```python @@ -141,8 +141,8 @@ def alignment_flags(labels): ### Size -Each interface type is also assigned a `size`, measured in bytes, which -corresponds the `sizeof` operator in C: +Each value type is also assigned a `size`, measured in bytes, which corresponds +to the `sizeof` operator in C: ```python def size(t): match despecialize(t): @@ -191,10 +191,10 @@ def num_i32_flags(labels): ### Loading -The `load` function defines how to read a value of a given interface type `t` -out of linear memory starting at offset `ptr`, returning a interface-typed -value (here, as a Python value). The `Opts`/`opts` class/parameter contains the -[`canonopt`] immediates supplied as part of `canon.lift`/`canon.lower`. +The `load` function defines how to read a value of a given value type `t` +out of linear memory starting at offset `ptr`, returning the value represented +as a Python value. The `Opts`/`opts` class/parameter contains the +[`canonopt`] immediates supplied as part of `canon lift`/`canon lower`. Presenting the definition of `load` piecewise, we start with the top-level case analysis: ```python @@ -280,10 +280,10 @@ def i32_to_char(opts, i): Strings are loaded from two `i32` values: a pointer (offset in linear memory) and a number of bytes. There are three supported string encodings in [`canonopt`]: [UTF-8], [UTF-16] and `latin1+utf16`. This last options allows a *dynamic* -choice between [Latin-1] and UTF-16, indicated by the high bit of the second `i32`. -String interface values include their original encoding and byte length as a +choice between [Latin-1] and UTF-16, indicated by the high bit of the second +`i32`. String values include their original encoding and byte length as a "hint" that enables `store_string` (defined below) to make better up-front -allocation size choices in many cases. 
Thus, the value produced by `load_string` isn't simply a Python `str`, but a *tuple* containing a `str`, the original encoding and the original byte length. ```python @@ -398,7 +398,7 @@ def unpack_flags_from_int(i, labels): ### Storing -The `store` function defines how to write a value `v` of a given interface type +The `store` function defines how to write a value `v` of a given value type `t` into linear memory starting at offset `ptr`. Presenting the definition of `store` piecewise, we start with the top-level case analysis: ```python @@ -465,9 +465,9 @@ not to do. To avoid multiple passes, the canonical ABI instead uses a `realloc` approach to update the allocation size during the single copy. A blind `realloc` approach would normally suffer from multiple reallocations per string (e.g., using the standard doubling-growth strategy). However, as already shown -in `load_string` above, interface-typed strings come with two useful hints: -their original encoding and byte length. From this hint data, `store_string` can -do a much better job minimizing the number of reallocations. +in `load_string` above, string values come with two useful hints: their +original encoding and byte length. From this hint data, `store_string` can do a +much better job minimizing the number of reallocations. We start with a case analysis to enumerate all the meaningful encoding combinations, subdividing the `latin1+utf16` encoding into either `latin1` or @@ -716,9 +716,9 @@ With only the definitions above, the Canonical ABI would be forced to place all parameters and results in linear memory. While this is necessary in the general case, in many cases performance can be improved by passing small-enough values in registers by using core function parameters and results. 
To support this -optimization, the Canonical ABI defines `flatten` to map interface function +optimization, the Canonical ABI defines `flatten` to map component function types to core function types by attempting to decompose all the -non-dynamically-sized interface types into core parameters and results. +non-dynamically-sized component value types into core value types. For a variety of [practical][Implementation Limits] reasons, we need to limit the total number of flattened parameters and results, falling back to storing @@ -731,8 +731,8 @@ When there are too many flat values, in general, a single `i32` pointer can be passed instead (pointing to a tuple in linear memory). When lowering *into* linear memory, this requires the Canonical ABI to call `realloc` (in `lower` below) to allocate space to put the tuple. As an optimization, when lowering -the return value of an imported function (lowered by `canon.lower`), the caller -can have already allocated space for the return value (e.g., efficiently on the +the return value of an imported function (via `canon lower`), the caller can +have already allocated space for the return value (e.g., efficiently on the stack), passing in an `i32` pointer as an parameter instead of returning an `i32` as a return value. @@ -749,9 +749,9 @@ def flatten(functype, context): flat_results = flatten_type(functype.result) if len(flat_results) > MAX_FLAT_RESULTS: match context: - case 'canon.lift': + case 'lift': flat_results = ['i32'] - case 'canon.lower': + case 'lower': flat_params += ['i32'] flat_results = [] @@ -807,10 +807,10 @@ def join(a, b): ### Flat Lifting The `lift_flat` function defines how to convert zero or more core values into a -single high-level value of interface type `t`. The values are given by a value -iterator that iterates over a complete parameter or result list and asserts -that the expected and actual types line up. 
Presenting the definition of -`lift_flat` piecewise, we start with the top-level case analysis: +single high-level value of type `t`. The values are given by a value iterator +that iterates over a complete parameter or result list and asserts that the +expected and actual types line up. Presenting the definition of `lift_flat` +piecewise, we start with the top-level case analysis: ```python @dataclass class Value: @@ -849,10 +849,10 @@ def lift_flat(opts, vi, t): ``` Integers are lifted from core `i32` or `i64` values using the signedness of the -interface type to interpret the high-order bit. When the interface type is -narrower than an `i32`, the Canonical ABI specifies a dynamic range check in -order to catch bugs. The conversion logic here assumes that `i32` values are -always represented as unsigned Python `int`s and thus lifting to a signed type +target type to interpret the high-order bit. When the target type is narrower +than an `i32`, the Canonical ABI specifies a dynamic range check in order to +catch bugs. The conversion logic here assumes that `i32` values are always +represented as unsigned Python `int`s and thus lifting to a signed type performs a manual 2s complement conversion in the Python (which would be a no-op in hardware). ```python @@ -948,9 +948,9 @@ def lift_flat_flags(vi, labels): ### Flat Lowering -The `lower_flat` function defines how to convert a value `v` of a given -interface type `t` into zero or more core values. Presenting the definition of -`lower_flat` piecewise, we start with the top-level case analysis: +The `lower_flat` function defines how to convert a value `v` of a given type +`t` into zero or more core values. 
Presenting the definition of `lower_flat` +piecewise, we start with the top-level case analysis: ```python def lower_flat(opts, v, t): match despecialize(t): @@ -973,9 +973,9 @@ def lower_flat(opts, v, t): case Flags(labels) : return lower_flat_flags(v, labels) ``` -Since interface-typed values are assumed to in-range and, as previously stated, +Since component-level values are assumed in-range and, as previously stated, core `i32` values are always internally represented as unsigned `int`s, -unsigned interface values need no extra conversion. Signed interface values are +unsigned integer values need no extra conversion. Signed integer values are converted to unsigned core `i32`s by 2s complement arithmetic (which again would be a no-op in hardware): ```python @@ -1044,8 +1044,8 @@ def lower_flat_flags(v, labels): ### Lifting and Lowering The `lift` function defines how to lift a list of at most `max_flat` core -parameters or results given by the `ValueIter` `vi` into a tuple of interface -values with types `ts`: +parameters or results given by the `ValueIter` `vi` into a tuple of values with +types `ts`: ```python def lift(opts, max_flat, vi, ts): flat_types = flatten_types(ts) @@ -1058,9 +1058,9 @@ def lift(opts, max_flat, vi, ts): return [ lift_flat(opts, vi, t) for t in ts ] ``` -The `lower` function defines how to lower a list of interface values `vs` of -types `ts` into a list of at most `max_flat` core values. As already described -for [`flatten`](#flattening) above, lowering handles the +The `lower` function defines how to lower a list of component-level values `vs` +of types `ts` into a list of at most `max_flat` core values. 
As already +described for [`flatten`](#flattening) above, lowering handles the greater-than-`max_flat` case by either allocating storage with `realloc` or accepting a caller-allocated buffer as an out-param: ```python @@ -1086,24 +1086,23 @@ def lower(opts, max_flat, vs, ts, out_param = None): ## Canonical ABI built-ins Using the above supporting definitions, we can describe the static and dynamic -semantics of [`func`], whose AST is defined in the main explainer as: +semantics of [`canon`], whose AST is defined in the main explainer as: ``` -func ::= (func ? ) -funcbody ::= (canon.lift * ) - | (canon.lower * ) +canon ::= (canon lift * (func ?)) + | (canon lower * (core func ?)) ``` The following subsections define the static and dynamic semantics of each case of `funcbody`. -### `canon.lift` +### `lift` For a function: ``` -(func $f (canon.lift $ft: $opts:* $callee:)) +(canon lift $ft: $opts:* $callee: (func $f)) ``` validation specifies: - * `$callee` must have type `flatten($ft, 'canon.lift')` + * `$callee` must have type `flatten($ft, 'lift')` * `$f` is given type `$ft` * a `memory` is present if required by lifting and is a subtype of `(memory 1)` * a `realloc` is present if required by lifting and has type `(func (param i32 i32 i32 i32) (result i32))` @@ -1112,19 +1111,19 @@ validation specifies: When instantiating component instance `$inst`: * Define `$f` to be the closure `lambda args: canon_lift($opts, $inst, $callee, $ft, args)` -Thus, `$f` captures `$opts`, `$inst`, `$callee` and `$ft` in a closure which can be -subsequently exported or passed into a child instance (via `with`). If `$f` -ends up being called by the host, the host is responsible for, in a -host-defined manner, conjuring up interface values suitable for passing into -`lower` and, conversely, consuming the interface values produced by `lift`. 
For +Thus, `$f` captures `$opts`, `$inst`, `$callee` and `$ft` in a closure which +can be subsequently exported or passed into a child instance (via `with`). If +`$f` ends up being called by the host, the host is responsible for, in a +host-defined manner, conjuring up component values suitable for passing into +`lower` and, conversely, consuming the component values produced by `lift`. For example, if the host is a native JS runtime, the [JavaScript embedding] would -specify how native JavaScript values are converted to and from interface +specify how native JavaScript values are converted to and from component values. Alternatively, if the host is a Unix CLI that invokes component exports directly from the command line, the CLI could choose to automatically parse -`argv` into interface values according to the declared interface types of the -export. In any case, `canon.lift` specifies how these variously-produced -interface values are consumed as parameters (and produced as results) by a -*single host-agnostic component*. +`argv` into component-level values according to the declared types of the +export. In any case, `canon lift` specifies how these variously-produced values +are consumed as parameters (and produced as results) by a *single host-agnostic +component*. The `$inst` captured above is assumed to have at least the following two fields, which are used to implement the [component invariants]: @@ -1165,9 +1164,9 @@ def canon_lift(callee_opts, callee_instance, callee, functype, args): There are a number of things to note about this definition: Uncaught Core WebAssembly [exceptions] result in a trap at component -boundaries. Thus, if a component wishes to signal an error, it must -use some sort of explicit interface type such as `expected` (whose `error` case -particular language bindings may choose to map to and from exceptions). +boundaries. 
Thus, if a component wishes to signal an error, it must use some +sort of explicit type such as `expected` (whose `error` case particular +language bindings may choose to map to and from exceptions). The contract assumed by `canon_lift` (and ensured by `canon_lower` below) is that the caller of `canon_lift` *must* call `post_return` right after lowering @@ -1196,14 +1195,14 @@ component linking configurations, hence the eager error helps ensure compositionality. -### `canon.lower` +### `lower` For a function: ``` -(func $f (canon.lower $opts:* $callee:)) +(canon lower $opts:* $callee: (core func $f)) ``` where `$callee` has type `$ft`, validation specifies: -* `$f` is given type `flatten($ft, 'canon.lower')` +* `$f` is given type `flatten($ft, 'lower')` * a `memory` is present if required by lifting and is a subtype of `(memory 1)` * a `realloc` is present if required by lifting and has type `(func (param i32 i32 i32 i32) (result i32))` * there is no `post-return` in `$opts` @@ -1249,7 +1248,7 @@ lifting and lowering), with a few exceptions: `i32` parameter. A useful consequence of the above rules for `may_enter` and `may_leave` is that -attempting to `canon.lower` to a `callee` in the same instance is a guaranteed, +attempting to `canon lower` to a `callee` in the same instance is a guaranteed, immediate trap which a link-time compiler can eagerly compile to an `unreachable`. This avoids what would otherwise be a surprising form of memory aliasing that could introduce obscure bugs. @@ -1263,9 +1262,9 @@ the elimination of string operations on the labels of records and variants) as well as post-MVP [adapter functions]. 
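The same-instance trap described above can be illustrated with a toy model. This is a minimal sketch, not the `canon_lift`/`canon_lower` definitions of this document: `Trap`, `Instance`, `toy_canon_lift` and `toy_canon_lower` are hypothetical names, values are passed through unchanged (no lifting or lowering is performed), and the flag handling is an assumed reading of the `may_enter`/`may_leave` rules, namely that lowering clears the caller's `may_enter` for the duration of the call and lifting traps if the callee's `may_enter` is clear.

```python
class Trap(Exception):
    """Stands in for a WebAssembly trap in this toy model."""


class Instance:
    """Hypothetical component instance holding only the two guard flags."""
    def __init__(self):
        self.may_enter = True
        self.may_leave = True


def toy_canon_lift(callee_instance, callee):
    """Wrap `callee` so that entering `callee_instance` is guarded."""
    def lifted(args):
        if not callee_instance.may_enter:
            raise Trap("instance may not be entered")
        return callee(args)
    return lifted


def toy_canon_lower(caller_instance, callee):
    """Wrap `callee` so that leaving `caller_instance` is guarded."""
    def lowered(args):
        if not caller_instance.may_leave:
            raise Trap("instance may not be left")
        # Assumption: the caller cannot be re-entered while the call is active.
        caller_instance.may_enter = False
        try:
            return callee(args)
        finally:
            caller_instance.may_enter = True
    return lowered


inst_a, inst_b = Instance(), Instance()

# Cross-instance call: inst_a lowers a function lifted by inst_b.
cross_call = toy_canon_lower(inst_a, toy_canon_lift(inst_b, lambda args: [x + 1 for x in args]))
cross_result = cross_call([1, 2])

# Same-instance call: inst_a lowers a function that it lifted itself.
self_call = toy_canon_lower(inst_a, toy_canon_lift(inst_a, lambda args: args))
try:
    self_call([1])
    self_call_trapped = False
except Trap:
    self_call_trapped = True
```

In this model the self-call always traps before the callee body runs, which is why a link-time compiler could replace it with an `unreachable`.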
-[Function Definitions]: Explainer.md#function-definitions -[`canonopt`]: Explainer.md#function-definitions -[`func`]: Explainer.md#function-definitions +[Canonical Definitions]: Explainer.md#canonical-definitions +[`canonopt`]: Explainer.md#canonical-definitions +[`canon`]: Explainer.md#canonical-definitions [Type Definitions]: Explainer.md#type-definitions [Component Invariants]: Explainer.md#component-invariants [JavaScript Embedding]: Explainer.md#JavaScript-embedding diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 85d418a1..6eb1df92 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -1,15 +1,15 @@ # Component Model Explainer This explainer walks through the assembly-level definition of a -[component](../high-level) and the proposed embedding of components into a -native JavaScript runtime. +[component](../high-level) and the proposed embedding of components into +native JavaScript runtimes. * [Grammar](#grammar) * [Component definitions](#component-definitions) * [Instance definitions](#instance-definitions) * [Alias definitions](#alias-definitions) * [Type definitions](#type-definitions) - * [Function definitions](#function-definitions) + * [Canonical definitions](#canonical-definitions) * [Start definitions](#start-definitions) * [Import and export definitions](#import-and-export-definitions) * [Component invariants](#component-invariants) @@ -20,7 +20,7 @@ native JavaScript runtime. * [TODO](#TODO) (Based on the previous [scoping and layering] proposal to the WebAssembly CG, -this repo merges and supersedes the [Module Linking] and [Interface Types] +this repo merges and supersedes the [module-linking] and [interface-types] proposals, pushing some of their original features into the post-MVP [future feature](FutureFeatures.md) backlog.) @@ -51,44 +51,62 @@ below. At the top-level, a `component` is a sequence of definitions of various kinds: ``` component ::= (component ? 
*) -definition ::= +definition ::= core-prefix() + | core-prefix() + | core-prefix() + | core-prefix() | | | | - | + | | | | ``` -Core WebAssembly modules (henceforth just "modules") are also sequences of -(different kinds of) definitions. However, unlike modules, components allow -arbitrarily interleaving the different kinds of definitions. As we'll see -below, this arbitrary interleaving reflects the need for different kinds of -definitions to be able to refer back to each other. Importantly, though, -component definitions are acyclic: definitions can only refer back to preceding -definitions (in the AST, text format or binary format). - -The first kind of component definition is a module, as defined by the existing -Core WebAssembly specification's [`core:module`] top-level production. Thus, -components physically embed one or more modules and can be thought of as a -kind of container format for modules. - -The second kind of definition is, recursively, a component itself. Thus, -components form trees with modules (and all other kinds of definitions) only -appearing at the leaves. - -With what's defined so far, we can define the following component: +Components are like Core WebAssembly modules in that their contained +definitions are acyclic: definitions can only refer to preceding definitions +(in the AST, text format and binary format). However, unlike modules, +components can arbitrarily interleave different kinds of definitions. + +The `core-prefix` meta-function transforms a grammatical rule for parsing a +Core WebAssembly definition into a grammatical rule for parsing the same +definition, but with a `core` token added right after the leftmost paren: +``` +core-prefix(X) ::= '(' 'core' Y ')' where X = '(' Y ')' +``` +For example, `core:module` accepts `(module (func))` so +`core-prefix()` accepts `(core module (func))`. 
Note that the +inner `func` doesn't need a `core` prefix; the `core` token is used to mark the +*transition* from parsing component definitions into core definitions. + +The [`core:module`] production is unmodified by the Component Model and thus +components embed Core WebAssembly (text and binary format) modules as currently +standardized, allowing reuse of an unmodified Core WebAssembly implementation. +The next two productions, `core:instance` and `core:alias`, are not currently +included in Core WebAssembly, but would be if Core WebAssembly adopted the +[module-linking] proposal. These two new core definitions are introduced below, +alongside their component-level counterparts. Finally, the existing +[`core:type`] production is extended below to add core module types as proposed +for module-linking. Thus, the overall idea is to represent core definitions (in +the AST, binary and text format) as if they had already been added to Core +WebAssembly so that, if they eventually are, the implementation of decoding and +validation can be shared in a layered fashion. + +The next kind of definition is, recursively, a component itself. Thus, +components form trees with all other kinds of definitions only appearing at the +leaves.
For example, with what's defined so far, we can write the following +component: ```wasm (component (component - (module (func (export "one") (result i32) (i32.const 1))) - (module (func (export "two") (result f32) (f32.const 2))) + (core module (func (export "one") (result i32) (i32.const 1))) + (core module (func (export "two") (result f32) (f32.const 2))) ) - (module (func (export "three") (result i64) (i64.const 3))) + (core module (func (export "three") (result i64) (i64.const 3))) (component (component - (module (func (export "four") (result f64) (f64.const 4))) + (core module (func (export "four") (result f64) (f64.const 4))) ) ) (component) @@ -96,7 +114,7 @@ With what's defined so far, we can define the following component: ``` This top-level component roots a tree with 4 modules and 1 component as leaves. However, in the absence of any `instance` definitions (introduced -next), nothing will be instantiated or executed at runtime: everything here is +next), nothing will be instantiated or executed at runtime; everything here is dead code. @@ -105,125 +123,150 @@ dead code. Whereas modules and components represent immutable *code*, instances associate code with potentially-mutable *state* (e.g., linear memory) and thus are necessary to create before being able to *run* the code. Instance definitions -create module or component instances by selecting a module/component and -supplying a set of named *arguments* which satisfy all the named *imports* of -the selected module/component: -``` -instance ::= (instance ? ) -instanceexpr ::= (instantiate (module ) (with )*) - | (instantiate (component ) (with )*) - | * - | core * -modulearg ::= (instance ) - | (instance *) -componentarg ::= (module ) - | (component ) - | (instance ) - | (func ) - | (value ) - | (type ) - | (instance *) -export ::= (export ) -``` -When instantiating a module via -`(instantiate (module $M) (with )*)`, the two-level imports of -the module `$M` are resolved as follows: -1. 
The first `name` of an import is looked up in the named list of `modulearg` - to select a module instance. -2. The second `name` of an import is looked up in the named list of exports of - the module instance found by the first step to select the imported - core definition (a `func`, `memory`, `table`, `global`, etc). - -Based on this, we can link two modules `$A` and `$B` together with the +create module or component instances by selecting a module or component and +then supplying a set of named *arguments* which satisfy all the named *imports* +of the selected module or component. + +The syntax for defining a core module instance is: +``` +core:instance ::= (instance ? ) +core:instanceexpr ::= (instantiate *) + | * +core:instantiatearg ::= (with ) + | (with (instance *)) +core:sortidx ::= ( ) +core:sort ::= func + | table + | memory + | global + | type + | module + | instance +core:export ::= (export ) +``` +When instantiating a module via `instantiate`, the two-level imports of the +core modules are resolved as follows: +1. The first `name` of the import is looked up in the named list of + `core:instantiatearg` to select a core module instance. +2. The second `name` of the import is looked up in the named list of exports of + the core module instance found by the first step to select the imported + core definition. + +Each `core:sort` corresponds 1:1 with a distinct [index space] that contains +only core definitions of that *sort*. The `varu32` field of `core:sortidx` +indexes into the sort's associated index space to select a definition. 
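The per-sort index spaces and the two-step import lookup can be sketched in a few lines of Python. This is an illustrative model only: `IndexSpaces`, `resolve_import` and the dictionary encoding of a core module instance (export name mapped to a `(sort, definition)` pair) are assumptions made for the example and are not part of the proposal.

```python
CORE_SORTS = ("func", "table", "memory", "global", "type", "module", "instance")


class IndexSpaces:
    """One independent, append-only index space per core sort."""
    def __init__(self):
        self.spaces = {sort: [] for sort in CORE_SORTS}

    def define(self, sort, definition):
        """Append a definition to the sort's index space and return its index."""
        space = self.spaces[sort]
        space.append(definition)
        return len(space) - 1

    def lookup(self, sort, idx):
        """Model of a sortidx: select a definition by (sort, index)."""
        return self.spaces[sort][idx]


def resolve_import(instantiate_args, first_name, second_name):
    """Resolve a two-level core import against the named instantiate arguments."""
    instance = instantiate_args[first_name]  # step 1: select a core module instance
    return instance[second_name]             # step 2: select one of its exports


spaces = IndexSpaces()
func_idx = spaces.define("func", "f_one")
memory_idx = spaces.define("memory", "mem")

# A core module instance modeled as: export name -> (sort, definition).
instance_a = {"one": ("func", spaces.lookup("func", func_idx))}
imported = resolve_import({"a": instance_a}, "a", "one")
```

Because each sort has its own space, the first function and the first memory both receive index `0` without colliding.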
+ +Based on this, we can link two core modules `$A` and `$B` together with the following component: ```wasm (component - (module $A + (core module $A (func (export "one") (result i32) (i32.const 1)) ) - (module $B + (core module $B (func (import "a" "one") (result i32)) ) - (instance $a (instantiate (module $A))) - (instance $b (instantiate (module $B) (with "a" (instance $a)))) + (core instance $a (instantiate $A)) + (core instance $b (instantiate $B (with "a" (instance $a)))) ) ``` -Components, as we'll see below, have single-level imports, i.e., each import -has only a single `name`, and thus every different kind of definition can be -passed as a `componentarg` when instantiating a component, not just instances. -Component instantiation will be revisited below after introducing the -prerequisite type and import definitions. +To see examples of other sorts, we'll need `alias` definitions, which are +introduced in the next section. + +The `*` form of `core:instanceexpr` allows module instances to be +created by directly tupling together preceding definitions, without the need to +`instantiate` a helper module. The "inline" form of `*` inside +`(with ...)` is syntactic sugar that is expanded during text format parsing +into an out-of-line instance definition referenced by `with`. To show an +example of these, we'll also need the `alias` definitions introduced in the +next section. + +The syntax for defining component instances is symmetric to core module +instances, but with a distinct component-level definition of `sort`: +``` +instance ::= (instance ? 
) +instanceexpr ::= (instantiate *) + | * +instantiatearg ::= (with ) + | (with (instance *)) +sortidx ::= ( ) +sort ::= core-prefix() + | func + | value + | type + | component + | instance +export ::= (export ) +``` +Because component-level function, type and instance definitions are different +than core-level function, type and instance definitions, they are put into +disjoint index spaces which are indexed separately by `sortidx` and +`core:sortidx`, respectively. Components may import or export core modules +(since core modules are immutable values and thus do not break the +[shared-nothing] model) and so `sortidx` includes `core:sortidx` (which +validation then restricts to core modules; in the future, other immutable core +definitions could be allowed, such as `data` segments). -Lastly, the `(instance *)` and `(instance *)` -expressions allow component and module instances to be created by directly -tupling together preceding definitions, without the need to `instantiate` -anything. The "inline" forms of these expressions in `modulearg` -and `componentarg` are text format sugar for the "out of line" form in -`instanceexpr`. To show an example of how these instance-creation forms are -useful, we'll first need to introduce the `alias` definitions in the next -section. +To see a non-trivial example of component instantiation, we'll first need to +introduce a few other definitions below that allow components to import, define +and export component functions. ### Alias Definitions -Alias definitions project definitions out of other components' index spaces +Alias definitions project definitions out of other components' index spaces and into the current component's index spaces. As represented in the AST below, -there are two kinds of "targets" for an alias: the `export` of a component -instance, or a local definition of an `outer` component that contains the -current component: -``` -alias ::= (alias ) -aliastarget ::= export - | outer -aliaskind ::= (module ?) 
- | (component ?) - | (instance ?) - | (func ?) - | (value ?) - | (type ?) - | (table ?) - | (memory ?) - | (global ?) - | ... other Post-MVP Core definition kinds -``` -Aliases add a new element to the index space indicated by `aliaskind`. -(Validation ensures that the `aliastarget` does indeed refer to a matching -definition kind.) The `id` in `aliaskind` is bound to this new index and -thus can be used anywhere a normal `id` can be used. - -In the case of `export` aliases, validation requires that `instanceidx` refers -to an instance which exports `name`. - -In the case of `outer` aliases, the (`outeridx`, `idx`) pair serves as a -[de Bruijn index], with `outeridx` being the number of enclosing components to -skip and `idx` being an index into the target component's `aliaskind` index -space. In particular, `outeridx` can be `0`, in which case the outer alias -refers to the current component. To maintain the acyclicity of module +there are two kinds of "targets" for an alias: the `export` of an instance and +a definition in an index space of an `outer` component (containing the current +component): +``` +core:alias ::= (alias ( ?)) +core:aliastarget ::= export + | outer + +alias ::= (alias ( ?)) +aliastarget ::= export + | outer +``` +The `core:sort`/`sort` immediate of the alias specifies which index space in +the target component is being read from and which index space of the containing +component is being added to. If present, the `id` of the alias is bound to the +new index added by the alias and can be used anywhere a normal `id` can be +used. + +In the case of `export` aliases, validation ensures `name` is an export in the +target instance and has a matching sort. + +In the case of `outer` aliases, the `varu32` pair serves as a [de Bruijn +index], with first `varu32` being the number of enclosing components to skip +and the second `varu32` being an index into the target component's sort's index +space. 
In particular, the first `varu32` can be `0`, in which case the outer +alias refers to the current component. To maintain the acyclicity of module instantiation, outer aliases are only allowed to refer to *preceding* outer definitions. Components containing outer aliases effectively produce a [closure] at instantiation time, including a copy of the outer-aliased definitions. Because -of the prevalent assumption that components are (stateless) *values*, outer -aliases are restricted to only refer to stateless definitions: components, -modules and types. (In the future, outer aliases to all kinds of definitions -could be allowed by recording the statefulness of the resulting component in -its type via some kind of "`stateful`" type attribute.) +of the prevalent assumption that components are immutable values, outer aliases +are restricted to only refer to immutable definitions: types, modules and +components. (In the future, outer aliases to all sorts of definitions could be +allowed by recording the statefulness of the resulting component in its type +via some kind of "`stateful`" type attribute.) Both kinds of aliases come with syntactic sugar for implicitly declaring them inline: -For `export` aliases, the inline sugar has the form `(kind +)` -and can be used anywhere a `kind` index appears in the AST. For example, the +For `export` aliases, the inline sugar has the form `(sort +)` +and can be used anywhere a `sort` index appears in the AST. 
For example, the following snippet uses an inline function alias: ```wasm -(instance $j (instantiate (component $J) (with "f" (func $i "f")))) -(export "x" (func $j "g" "h")) +(instance $j (instantiate $J (with "f" (func $i "f")))) +(export "x" (func (func $j "g" "h"))) ``` which is desugared into: ```wasm (alias export $i "f" (func $f_alias)) -(instance $j (instantiate (component $J) (with "f" (func $f_alias)))) +(instance $j (instantiate $J (with "f" (func $f_alias)))) (alias export $j "g" (instance $g_alias)) (alias export $g_alias "h" (func $h_alias)) (export "x" (func $h_alias)) @@ -234,129 +277,186 @@ definition, resolved using normal lexical scoping rules. For example, the following component: ```wasm (component - (module $M ...) + (core module $M ...) (component - (instance (instantiate (module $M))) + (core instance (instantiate $M)) ) ) ``` is desugared into: ```wasm (component $C - (module $M ...) + (core module $M ...) (component - (alias outer $C $M (module $C_M)) - (instance (instantiate (module $C_M))) + (core alias outer $C $M (module $C_M)) + (core instance (instantiate $C_M)) ) ) ``` Lastly, for symmetry with [imports][func-import-abbrev], aliases can be written -in an inverted form that puts the definition kind first: +in an inverted form that puts the sort first: ```wasm -(func $f (import "i" "f")) ≡ (import "i" "f" (func $f)) ;; (existing) -(func $g (alias $i "g1")) ≡ (alias $i "g1" (func $g)) ;; (new) +(func $f (import "i" "f")) ≡ (import "i" "f" (func $f)) (WebAssembly 1.0) +(func $g (alias export $i "g1")) ≡ (alias export $i "g1" (func $g)) +(core func $g (alias export $i "g1")) ≡ (core alias export $i "g1" (func $g)) ``` With what's defined so far, we're able to link modules with arbitrary renamings: ```wasm (component - (module $A + (core module $A (func (export "one") (result i32) (i32.const 1)) (func (export "two") (result i32) (i32.const 2)) (func (export "three") (result i32) (i32.const 3)) ) - (module $B + (core module $B (func 
(import "a" "one") (result i32)) ) - (instance $a (instantiate (module $A))) - (instance $b1 (instantiate (module $B) - (with "a" (instance $a)) ;; no renaming + (core instance $a (instantiate $A)) + (core instance $b1 (instantiate $B + (with "a" (instance $a)) ;; no renaming )) - (func $a_two (alias export $a "two")) ;; ≡ (alias export $a "two" (func $a_two)) - (instance $b2 (instantiate (module $B) + (core func $a_two (alias export $a "two")) ;; ≡ (core alias export $a "two" (func $a_two)) + (core instance $b2 (instantiate $B (with "a" (instance - (export "one" (func $a_two)) ;; renaming, using explicit alias + (export "one" (func $a_two)) ;; renaming, using out-of-line alias )) )) - (instance $b3 (instantiate (module $B) + (core instance $b3 (instantiate $B (with "a" (instance - (export "one" (func $a "three")) ;; renaming, using inline alias sugar + (export "one" (func $a "three")) ;; renaming, using inline alias sugar )) )) ) ``` -To show analogous examples of linking components, we'll first need to define -a new set of types and functions for components to use. +To show analogous examples of linking components, we'll need component-level +type and function definitions which are introduced in the next two sections. ### Type Definitions -The type grammar below defines two levels of types, with the second level -building on the first: -1. `intertype` (also referred to as "interface types" below): the set of - types of first-class, high-level values communicated across shared-nothing - component interface boundaries -2. `deftype`: the set of types of second-class component definitions which are - imported/exported at instantiation-time. - -The top-level `type` definition is used to define types out-of-line so that -they can be reused via `typeidx` by future definitions. -``` -type ::= (type ? ) -typeexpr ::= - | -deftype ::= - | - | - | - | -moduletype ::= (module ? *) -moduletype-def ::= - | - | (export ) -core:deftype ::= - | ... 
Post-MVP additions -componenttype ::= (component ? *) -componenttype-def ::= - | -import ::= (import ) -instancetype ::= (instance ? *) -instancetype-def ::= - | - | (export ) -functype ::= (func ? (param ? )* (result )) -valuetype ::= (value ? ) -intertype ::= unit | bool - | s8 | u8 | s16 | u16 | s32 | u32 | s64 | u64 - | float32 | float64 - | char | string - | (record (field )*) - | (variant (case (refines )?)+) - | (list ) - | (tuple *) - | (flags *) - | (enum +) - | (union +) - | (option ) - | (expected ) -``` -On a technical note: this type grammar uses `` and `` -recursively to allow it to more-precisely indicate the kinds of types allowed. -The formal spec AST would instead use a `` with validation rules to -restrict the target type while the formal text format would use something like -[`core:typeuse`], allowing any of: (1) a `typeidx`, (2) an identifier `$T` -resolving to a type definition (using `(type $T)` in cases where there is a -grammatical ambiguity), or (3) an inline type definition that is desugared into -a deduplicated out-of-line type definition. - -On another technical note: the optional `id` in all the `deftype` type -constructors (e.g., `(module ? ...)`) is only allowed to be present in the -context of `import` since this is the only context in which binding an -identifier makes sense. - -Starting with interface types, the set of values allowed for the *fundamental* -interface types is given by the following table: +The syntax for defining core types extends the existing core type definition +syntax, adding a `module` type constructor: +``` +core:type ::= (type ? ) (GC proposal) +core:deftype ::= (WebAssembly 1.0) + | (GC proposal) + | (GC proposal) + | +core:moduletype ::= (module ? 
*) +core:moduledecl ::= + | + | + | +core:importdecl ::= (import ) +core:exportdecl ::= (export ) +core:externdesc ::= (WebAssembly 1.0) +``` +Here, `core:deftype` (short for "defined type") is inherited from the [gc] +proposal and extended with a `module` type constructor. If module-linking is +added to Core WebAssembly, an `instance` type constructor would be added as +well but, for now, it's left out since it's unnecessary. Also, in the MVP, +validation will reject nested `core:moduletype`, since, before module-linking, +core modules cannot themselves import or export other core modules. + +The body of a module type contains an ordered list of "module declarators" +which describe, at a type level, the imports and exports of the module. In a +module-type context, import and export declarators can both reuse the existing +[`core:importdesc`] production defined in WebAssembly 1.0. To avoid confusion, +`core:importdesc` is renamed to `core:externdesc` (for symmetry with +[`core:externtype`]). + +In preparation for the forthcoming addition of [type-imports] to Core +WebAssembly, module types start with an empty type index space so that the type +index space can be populated with fresh type definitions constructed from type +imports. Thus, `core:moduledecl` also includes a `type` declarator for defining +the types used by the `import` and `export` declarators. An `alias` declarator +is also necessary in the future for defining type-sharing constraints between +type imports. In the short-term, `alias` declarators are restricted to only +allowing `outer` `type` aliases, thereby enabling a module type to reuse a +parent's type definition instead of re-defining it locally. + +As an example, the following component defines two equivalent module types, +where the former defines the function via `type` declarator and the latter via +`alias` declarator. In both cases, the type is given index `0` since the module +type starts with an empty type index space. 
+```wasm +(component $C + (core type $M1 (module + (type (func (param i32) (result i32))) + (import "a" "b" (func (type 0))) + (export "c" (func (type 0))) + )) + (core type $F (func (param i32) (result i32))) + (core type $M2 (module + (alias outer $C $F (type)) + (import "a" "b" (func (type 0))) + (export "c" (func (type 0))) + )) +) +``` + +Component-level type definitions are symmetric to core-level type definitions, +but use a completely different set of value types. Unlike [`core:valtype`] +which is low-level and assumes a shared linear memory for communicating +compound values, component-level value types assume no shared memory and must +therefore be high-level, describing entire compound values. +``` +type ::= (type ? ) +deftype ::= + | + | + | +functype ::= (func ? (param ? )* (result )) +componenttype ::= (component ? *) +instancetype ::= (instance ? *) +componentdecl ::= + | +instancedecl ::= + | + | +importdecl ::= (import ) +exportdecl ::= (export ) +externdesc ::= core-prefix() + | + | + | + | (value ? ) + | (type ? ) +typebound ::= (eq ) +valtype ::= unit + | bool + | s8 | u8 | s16 | u16 | s32 | u32 | s64 | u64 + | float32 | float64 + | char | string + | (record (field )*) + | (variant (case (refines )?)+) + | (list ) + | (tuple *) + | (flags *) + | (enum +) + | (union +) + | (option ) + | (expected ) +``` +This type grammar uses productions like `` and `` recursively +to allow it to more-precisely indicate what's allowed. The formal AST and +[binary format](Binary.md#type-definitions) instead use a `` with +validation rules to restrict the target type while the formal text format would +use something like [`core:typeuse`], allowing any of: (1) a `typeidx`, (2) an +identifier `$T` resolving to a type definition (using `(type $T)` in cases +where there is a grammatical ambiguity), or (3) an inline type definition that +is desugared into a deduplicated out-of-line type definition. + +The optional `id` after all the type constructors (e.g., `(module ? 
...)`) +is only allowed to be present in the context of `import` since this is the only +context in which binding an identifier makes sense. + +The value types in `valtype` can be broken into two categories: *fundamental* +value types and *specialized* value types, where the latter are defined by +expansion into the former. The *fundamental value types* have the following +sets of abstract values: | Type | Values | | ------------------------- | ------ | | `bool` | `true` and `false` | @@ -364,11 +464,12 @@ interface types is given by the following table: | `u8`, `u16`, `u32`, `u64` | integers in the range [0, 2N-1] | | `float32`, `float64` | [IEEE754] floating-pointer numbers with a single, canonical "Not a Number" ([NaN]) value | | `char` | [Unicode Scalar Values] | -| `record` | heterogeneous [tuples] of named `intertype` values | -| `variant` | heterogeneous [tagged unions] of named `intertype` values | -| `list` | homogeneous, variable-length [sequences] of `intertype` values | +| `record` | heterogeneous [tuples] of named values | +| `variant` | heterogeneous [tagged unions] of named values | +| `list` | homogeneous, variable-length [sequences] of values | -NaN values are canonicalized to a single value so that: +The `float32` and `float64` values have their NaNs canonicalized to a single +value so that: 1. consumers of NaN values are free to use the rest of the NaN payload for optimization purposes (like [NaN boxing]) without needing to worry about whether the NaN payload bits were significant; and @@ -383,73 +484,64 @@ subtyping. In particular, a `variant` subtype can contain a `case` not present in the supertype if the subtype's `case` `refines` (directly or transitively) some `case` in the supertype. -The sets of values allowed for the remaining *specialized* interface types are +The sets of values allowed for the remaining *specialized value types* are defined by the following mapping: ``` - (tuple *) ↦ (record (field "𝒊" )*) for 𝒊=0,1,... 
- (flags *) ↦ (record (field bool)*) - unit ↦ (record) - (enum +) ↦ (variant (case unit)+) - (option ) ↦ (variant (case "none") (case "some" )) - (union +) ↦ (variant (case "𝒊" )+) for 𝒊=0,1,... -(expected ) ↦ (variant (case "ok" ) (case "error" )) - string ↦ (list char) + (tuple *) ↦ (record (field "𝒊" )*) for 𝒊=0,1,... + (flags *) ↦ (record (field bool)*) + unit ↦ (record) + (enum +) ↦ (variant (case unit)+) + (option ) ↦ (variant (case "none") (case "some" )) + (union +) ↦ (variant (case "𝒊" )+) for 𝒊=0,1,... +(expected ) ↦ (variant (case "ok" ) (case "error" )) + string ↦ (list char) ``` Note that, at least initially, variants are required to have a non-empty list of cases. This could be relaxed in the future to allow an empty list of cases, with the empty `(variant)` effectively serving as a [bottom type] and indicating unreachability. -Building on these interface types, there are four kinds of types describing the -four kinds of importable/exportable component definitions. (In the future, a -fifth type will be added for [resource types][Resource and Handle Types].) - -A `functype` describes a component function whose parameters and results are -`intertype` values. Thus `functype` is completely disjoint from -[`core:functype`] in the WebAssembly Core spec, whose parameters and results -are [`core:valtype`] values. As a low-level compiler target, `core:functype` -returns zero or more results. In contrast, as a high-level interface type -designed to be maximally bound to a variety of source languages, `functype` -always returns a single type, with `unit` being used for functions that don't -return an interesting value (analogous to "void" in some languages). As -syntactic sugar, the text format of `functype` additionally allows `result` to -be absent, interpreting this as `(result unit)`. 
Since `core:functype` can only -appear syntactically within a `(module ...)` S-expression, there is never a -need to syntactically distinguish `functype` from `core:functype` in the text -format: the context dictates which one a `(func ...)` S-expression parses into. - -A `valuetype` describes a single `intertype` value that is to be consumed -exactly once during component instantiation. How this happens is described +The remaining 5 type constructors use `valtype` to complete the description +of a shared-nothing component interface: + +The `func` type constructor describes a component-level function definition +that takes and returns component-level value types. In contrast to +[`core:functype`] which, as a low-level compiler target for a stack machine, +returns zero or more results, `functype` always returns a single type, with +`unit` being used for functions that don't return an interesting value +(analogous to "void" in some languages). Having a single return type simplifies +the binding of `functype` into a wide variety of source languages. As syntactic +sugar, the text format of `functype` additionally allows `result` to be absent, +interpreting this as `(result unit)`. + +The `component` type constructor is symmetric to the core `module` type +constructor, although its grammar is factored to share declarators with the +`instance` type constructor. The `import` and `export` declarator names +must be distinct within a single type. + +The `externdesc` production (used to declare the types of imported/exported +values) includes two additional type constructors that are not currently +present in `deftype` (since there is currently no reason for allowing them to +be shared or named as type definitions): + +The `value` case describes an imported or exported `valtype` value that is to +be consumed exactly once during instantiation. How this happens is described below along with [`start` definitions](#start-definitions). 
-As described above, components and modules are immutable values representing -code that cannot be run until instantiated via `instance` definition. Thus, -`moduletype` and `componenttype` describe *uninstantiated code*. `moduletype` -and `componenttype` contain not just import and export definitions, but also -type and alias definitions, allowing them to capture type sharing relationships -between imports and exports. This type sharing becomes necessary (not just a -size optimization) with the upcoming addition of [type imports and exports] to -Core WebAssembly and, symmetrically, [resource and handle types] to the -Component Model. - -The `instancetype` type constructor describes component instances, which are -named tuples of other definitions. Although `instance` definitions can produce -both module *and* component instances, only *component* instances can be -imported or exported (due to the overall [shared-nothing design](../high-level/Choices.md) -of the Component Model) and thus only *component* instances need explicit type -definitions. Consequently, the text format of `instancetype` does not include -a syntax for defining *module* instance types. As with `componenttype` and -`moduletype`, `instancetype` allows nested type and alias definitions to allow -type sharing. - -Lastly, to ensure cross-language interoperability, `moduletype`, -`componenttype` and `instancetype` all require import and export names to be -unique (within a particular module, component, instance or type thereof). In -the case of `moduletype` and two-level imports, this translates to requiring -that import name *pairs* must be *pair*-wise unique. Since the current Core -WebAssembly validation rules allow duplicate imports, this means that some -valid modules will not be typeable and will fail validation if used with the -Component Model. 
+The `type` case describes an imported or exported type along with its bounds, +which currently only has an `eq` option that says that the imported/exported +type must be exactly equal to the given immediate type. There are two main use +cases for this in the short-term: +* Type exports allow a component or interface to associate a name with a + structural type (e.g., `(export "nanos" (type (eq u64)))`) which bindings + generators can use to generate type aliases (e.g., `typedef uint64_t nanos;`). +* Type imports and exports allow a component to explicitly specify the + type parameters used to monomorphize a generic interface being imported + or exported. + +When [resource and handle types] are added to the explainer, `typebound` will +be extended with a `sub` option (symmetric to the [type-imports] proposal) that +allows importing and exporting *abstract* types. With what's defined so far, we can define component types using a mix of inline and out-of-line type definitions: @@ -462,52 +554,50 @@ and out-of-line type definitions: (alias outer $C $T (type $C_T)) (type $L (list $C_T)) (import "f" (func (param $L) (result (list u8)))) - (import "g" $G) - (export "g" $G) + (import "g" (func (type $G))) + (export "g" (func (type $G))) (export "h" (func (result $U))) )) ) ``` -Note that the inline use of `$G` and `$U` are inline `outer` aliases. +Note that the inline use of `$G` and `$U` are syntactic sugar for `outer` +aliases. -### Function Definitions +### Canonical Definitions -To implement or call interface-typed functions, we need to be able to cross a +To implement or call a component-level function, we need to cross a shared-nothing boundary. Traditionally, this problem is solved by defining a -serialization format for copying data across the boundary. 
The Component Model -MVP takes roughly this same approach, defining a linear-memory-based [ABI] -called the "Canonical ABI" which specifies, for any interface function type, a -[corresponding](CanonicalABI.md#flattening) core function type and -[rules](CanonicalABI.md#lifting-and-lowering) for copying values into or out of -linear memory. The Component Model differs from traditional approaches, though, -in that the ABI is configurable, allowing different memory representations for -the same abstract value. In the MVP, this configurability is limited to the -small set of `canonopt` shown below. However, Post-MVP, [adapter functions] -could be added to allow far more programmatic control. +serialization format. The Component Model MVP uses roughly this same approach, +defining a linear-memory-based [ABI] called the "Canonical ABI" which +specifies, for any `functype`, a [corresponding](CanonicalABI.md#flattening) +`core:functype` and [rules](CanonicalABI.md#lifting-and-lowering) for copying +values into and out of linear memory. The Component Model differs from +traditional approaches, though, in that the ABI is configurable, allowing +multiple different memory representations of the same abstract value. In the +MVP, this configurability is limited to the small set of `canonopt` shown +below. However, Post-MVP, [adapter functions] could be added to allow far more +programmatic control. The Canonical ABI is explicitly applied to "wrap" existing functions in one of two directions: -* `canon.lift` wraps a core function (of type `core:functype`) inside the - current component to produce a component function (of type `functype`) - that can be exported to other components. -* `canon.lower` wraps a component function (of type `functype`) that can - have been imported from another component to produce a core function (of type - `core:functype`) that can be imported and called from Core WebAssembly code - within the current component. 
- -Function definitions specify one of these two wrapping directions along with a -set of Canonical ABI configuration options. -``` -func ::= (func ? ) -funcbody ::= (canon.lift * ) - | (canon.lower * ) -canonopt ::= string-encoding=utf8 - | string-encoding=utf16 - | string-encoding=latin1+utf16 - | (memory ) - | (realloc ) - | (post-return ) +* `lift` wraps a core function (of type `core:functype`) to produce a component + function (of type `functype`) that can be passed to other components. +* `lower` wraps a component function (of type `functype`) to produce a core + function (of type `core:functype`) that can be imported and called from Core + WebAssembly code inside the current component. + +Canonical definitions specify one of these two wrapping directions, the function +to wrap and a list of configuration options: +``` +canon ::= (canon lift core-prefix() * (func ?)) + | (canon lower * (core func ?)) +canonopt ::= string-encoding=utf8 + | string-encoding=utf16 + | string-encoding=latin1+utf16 + | (memory core-prefix()) + | (realloc core-prefix()) + | (post-return core-prefix()) ``` The `string-encoding` option specifies the encoding the Canonical ABI will use for the `string` type. The `latin1+utf16` encoding captures a common string @@ -518,12 +608,12 @@ Point range) or UTF-16 (which can express all Code Points, but uses either default is UTF-8. It is a validation error to include more than one `string-encoding` option. -The `(memory )` option specifies the memory that the Canonical ABI will +The `(memory ...)` option specifies the memory that the Canonical ABI will use to load and store values. If the Canonical ABI needs to load or store, validation requires this option to be present (there is no default). 
-The `(realloc )` option specifies a core function that is validated to -have the following signature: +The `(realloc ...)` option specifies a core function that is validated to +have the following core function type: ```wasm (func (param $originalPtr i32) (param $originalSize i32) @@ -535,22 +625,22 @@ The Canonical ABI will use `realloc` both to allocate (passing `0` for the first two parameters) and reallocate. If the Canonical ABI needs `realloc`, validation requires this option to be present (there is no default). -The `(post-return )` option may only be present in `canon.lift` and -specifies a core function to be called with the original return values after -they have finished being read, allowing memory to be deallocated and +The `(post-return ...)` option may only be present in `canon lift` +and specifies a core function to be called with the original return values +after they have finished being read, allowing memory to be deallocated and destructors called. This immediate is always optional but, if present, is validated to have parameters matching the callee's return type and empty results. -Based on this description of the AST, the [Canonical ABI explainer][Canonical ABI] -gives a detailed walkthrough of the static and dynamic semantics of -`canon.lift` and `canon.lower`. +Based on this description of the AST, the [Canonical ABI explainer][Canonical +ABI] gives a detailed walkthrough of the static and dynamic semantics of `lift` +and `lower`. -One high-level consequence of the dynamic semantics of `canon.lift` given in +One high-level consequence of the dynamic semantics of `canon lift` given in the Canonical ABI explainer is that component functions are different from core functions in that all control flow transfer is explicitly reflected in their -type. For example, with Core WebAssembly [exception handling] and -[stack switching], a core function with type `(func (result i32))` can return +type. 
For example, with Core WebAssembly [exception-handling] and +[stack-switching], a core function with type `(func (result i32))` can return an `i32`, throw, suspend or trap. In contrast, a component function with type `(func (result string))` may only return a `string` or trap. To express failure, component functions can return `expected` and languages with exception @@ -558,23 +648,33 @@ handling can bind exceptions to the `error` case. Similarly, the forthcoming addition of [future and stream types] would explicitly declare patterns of stack-switching in component function signatures. -Using function definitions, we can finally write a non-trivial component that +Similar to the `import` and `alias` abbreviations shown above, `canon` +definitions can also be written in an inverted form that puts the sort first: +```wasm + (func $f (import "i" "f")) ≡ (import "i" "f" (func $f)) (WebAssembly 1.0) + (func $h (canon lift ...)) ≡ (canon lift ... (func $h)) +(core func $h (canon lower ...)) ≡ (canon lower ... (core func $h)) +``` +Note: in the future, `canon` may be generalized to define other sorts than +functions (such as types), hence the explicit `sort`. + +Using canonical definitions, we can finally write a non-trivial component that takes a string, does some logging, then returns a string. 
```wasm (component (import "wasi:logging" (instance $logging (export "log" (func (param string))) )) - (import "libc" (module $Libc + (import "libc" (core module $Libc (export "mem" (memory 1)) (export "realloc" (func (param i32 i32) (result i32))) )) - (instance $libc (instantiate (module $Libc))) - (func $log (canon.lower - (memory (memory $libc "mem")) (realloc (func $libc "realloc")) + (core instance $libc (instantiate $Libc)) + (core func $log (canon lower (func $logging "log") + (memory (core memory $libc "mem")) (realloc (core func $libc "realloc")) )) - (module $Main + (core module $Main (import "libc" "memory" (memory 1)) (import "libc" "realloc" (func (param i32 i32) (result i32))) (import "wasi:logging" "log" (func $log (param i32 i32))) @@ -582,14 +682,14 @@ takes a string, does some logging, then returns a string. ... (call $log) ... ) ) - (instance $main (instantiate (module $Main) + (core instance $main (instantiate $Main (with "libc" (instance $libc)) (with "wasi:logging" (instance (export "log" (func $log)))) )) - (func (export "run") (canon.lift + (func (export "run") (canon lift + (core func $main "run") (func (param string) (result string)) - (memory (memory $libc "mem")) (realloc (func $libc "realloc")) - (func $main "run") + (memory (core memory $libc "mem")) (realloc (core func $libc "realloc")) )) ) ``` @@ -597,81 +697,76 @@ This example shows the pattern of splitting out a reusable language runtime module (`$Libc`) from a component-specific, non-reusable module (`$Main`). In addition to reducing code size and increasing code-sharing in multi-component scenarios, this separation allows `$libc` to be created first, so that its -exports are available for reference by `canon.lower`. Without this separation +exports are available for reference by `canon lower`. 
Without this separation (if `$Main` contained the `memory` and allocation functions), there would be a -cyclic dependency between `canon.lower` and `$Main` that would have to be -broken by the toolchain emitting an auxiliary module that broke the cycle using -a shared `funcref` table and `call_indirect`. +cyclic dependency between `canon lower` and `$Main` that would have to be +broken using an auxiliary module performing `call_indirect`. ### Start Definitions Like modules, components can have start functions that are called during instantiation. Unlike modules, components can call start functions at multiple -points during instantiation with each such call having interface-typed -parameters and results. Thus, `start` definitions in components look like -function calls: +points during instantiation with each such call having parameters and results. +Thus, `start` definitions in components look like function calls: ``` start ::= (start (value )* (result (value ))?) ``` The `(value )*` list specifies the arguments passed to `funcidx` by indexing into the *value index space*. Value definitions (in the value index -space) are like immutable `global` definitions in Core WebAssembly except they -must be consumed exactly once at instantiation-time. +space) are like immutable `global` definitions in Core WebAssembly except that +validation requires them to be consumed exactly once at instantiation-time +(i.e., they are [linear]). -As with any other definition kind, value definitions may be supplied to -components through `import` definitions. Using the grammar of `import` already -defined [above](#type-definitions), an example *value import* can be written: +As with all definition sorts, values may be imported and exported by +components. 
As an example value import: ``` (import "env" (value $env (record (field "locale" (option string))))) ``` As this example suggests, value imports can serve as generalized [environment -variables], allowing not just `string`, but the full range of interface types -to describe the imported configuration schema. +variables], allowing not just `string`, but the full range of `valtype`. With this, we can define a component that imports a string and computes a new -exported string, all at instantiation time: +exported string at instantiation time: ```wasm (component (import "name" (value $name string)) - (import "libc" (module $Libc + (import "libc" (core module $Libc (export "memory" (memory 1)) (export "realloc" (func (param i32 i32 i32 i32) (result i32))) )) - (instance $libc (instantiate (module $Libc))) - (module $Main + (core instance $libc (instantiate $Libc)) + (core module $Main (import "libc" ...) (func (export "start") (param i32 i32) (result i32 i32) ... general-purpose compute ) ) - (instance $main (instantiate (module $Main) (with "libc" (instance $libc)))) - (func $start (canon.lift + (core instance $main (instantiate $Main (with "libc" (instance $libc)))) + (func $start (canon lift + (core func $main "start") (func (param string) (result string)) - (memory (memory $libc "mem")) (realloc (func $libc "realloc")) - (func $main "start") + (memory (core memory $libc "mem")) (realloc (core func $libc "realloc")) )) (start $start (value $name) (result (value $greeting))) (export "greeting" (value $greeting)) ) ``` As this example shows, start functions reuse the same Canonical ABI machinery -as normal imports and exports for getting interface typed values into and out -of linear memory. +as normal imports and exports for getting component-level values into and out +of core linear memory. 
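The "consumed exactly once" rule for value definitions described above can be sketched as a small bookkeeping structure. This is only an illustration under stated assumptions: `ValueIndexSpace` and its method names are hypothetical and do not come from the explainer or from any real validator.

```python
class ValueIndexSpace:
    """Hypothetical tracker for the linear-use rule on component values."""

    def __init__(self):
        # One flag per defined value: has it been consumed yet?
        self._consumed = []

    def define(self) -> int:
        """Add a new value definition (e.g. a value import or a start result)."""
        self._consumed.append(False)
        return len(self._consumed) - 1

    def consume(self, idx: int) -> None:
        """Use a value (e.g. as a start argument or a value export)."""
        if self._consumed[idx]:
            raise ValueError(f"value {idx} consumed more than once")
        self._consumed[idx] = True

    def check_all_consumed(self) -> None:
        """At the end of the component, every value must have been used."""
        unused = [i for i, used in enumerate(self._consumed) if not used]
        if unused:
            raise ValueError(f"values never consumed: {unused}")


# Mirrors the shape of the example component above:
values = ValueIndexSpace()
name = values.define()        # (import "name" (value $name string))
values.consume(name)          # (start $start (value $name) ...)
greeting = values.define()    # ... (result (value $greeting))
values.consume(greeting)      # (export "greeting" (value $greeting))
values.check_all_consumed()   # passes: each value was used exactly once
```

Under this model, dropping the `export` would leave `$greeting` unused and exporting it twice would consume it twice; both cases are rejected.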
### Import and Export Definitions -The rules for [`import`](#type-definitions) and [`export`](#instance-definitions) -definitions have actually already been defined above (with the caveat that the -real text format for `import` definitions would additionally allow binding an -identifier (e.g., adding the `$foo` in `(import "foo" (func $foo))`): +Lastly, imports and exports are defined in terms of the above as: ``` -import ::= already defined above as part of -export ::= already defined above as part of +import ::= (import ) +export ::= (export ) ``` +All import and export names within a component must be unique, respectively. -With what's defined so far, we can define a component that imports, links and +With what's defined so far, we can write a component that imports, links and exports other components: ```wasm (component @@ -684,10 +779,10 @@ exports other components: )) (export "g" (func (result string))) )) - (instance $d1 (instantiate (component $D) + (instance $d1 (instantiate $D (with "c" (instance $c)) )) - (instance $d2 (instantiate (component $D) + (instance $d2 (instantiate $D (with "c" (instance (export "f" (func $d1 "g")) )) @@ -706,11 +801,11 @@ note that all definitions are acyclic as is the resulting instance graph. As a consequence of the shared-nothing design described above, all calls into or out of a component instance necessarily transit through a component function definition. Thus, component functions form a "membrane" around the collection -of module instances contained by a component instance, allowing the Component -Model to establish invariants that increase optimizability and composability in -ways not otherwise possible in the shared-everything setting of Core -WebAssembly. 
The Component Model proposes establishing the following three -runtime invariants: +of core module instances contained by a component instance, allowing the +Component Model to establish invariants that increase optimizability and +composability in ways not otherwise possible in the shared-everything setting +of Core WebAssembly. The Component Model proposes establishing the following +three runtime invariants: 1. Components define a "lockdown" state that prevents continued execution after a trap. This both prevents continued execution with corrupt state and also allows more-aggressive compiler optimizations (e.g., store reordering). @@ -754,8 +849,8 @@ these same JS API functions to accept component binaries and produce new `WebAssembly.Component` objects that represent decoded and validated components. The [binary format of components](Binary.md) is designed to allow modules and components to be distinguished by the first 8 bytes of the binary -(splitting the 32-bit [`version`] field into a 16-bit `version` field and a -16-bit `kind` field with `0` for modules and `1` for components). +(splitting the 32-bit [`core:version`] field into a 16-bit `version` field and +a 16-bit `layer` field with `0` for modules and `1` for components). Once compiled, a `WebAssemby.Component` could be instantiated using the existing JS API `WebAssembly.instantiate(Streaming)`. Since components have the @@ -768,7 +863,7 @@ instantiated module, `WebAssembly.instantiate` would always produce a Lastly, when given a component binary, the compile-then-instantiate overloads of `WebAssembly.instantiate(Streaming)` would inherit the compound behavior of -the abovementioned functions (again, using the `version` field to eagerly +the abovementioned functions (again, using the `layer` field to eagerly distinguish between modules and components). 
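As a rough sketch of the 8-byte discrimination described above, the following reads the magic bytes and then the two 16-bit little-endian fields. The function name `classify_preamble` is hypothetical; the byte values are the ones listed in this patch's `Binary.md` hunk (`layer` of `0` for modules and `1` for components).

```python
import struct

MAGIC = b"\x00asm"  # 0x00 0x61 0x73 0x6D


def classify_preamble(data: bytes) -> str:
    """Return 'module' or 'component' based on the first 8 bytes."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes")
    if data[:4] != MAGIC:
        raise ValueError("bad magic")
    # The old 32-bit version field is read as two 16-bit fields:
    # `version` first, then `layer`.
    version, layer = struct.unpack_from("<HH", data, 4)
    if layer == 0:
        return "module"
    if layer == 1:
        return "component"
    raise ValueError(f"unknown layer {layer} (version {version})")


# Core module: the 4-byte version 0x01 0x00 0x00 0x00 implies layer 0.
core = b"\x00asm\x01\x00\x00\x00"
# Pre-standard component: version 0x0a 0x00, layer 0x01 0x00.
comp = b"\x00asm\x0a\x00\x01\x00"
kinds = (classify_preamble(core), classify_preamble(comp))
```

The sketch deliberately ignores the `version` value beyond reporting it in the error message; a real decoder would presumably also check that the version is one it supports.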
For example, the following component: @@ -779,7 +874,7 @@ For example, the following component: (import "two" (value string)) (import "three" (instance (export "four" (instance - (export "five" (module + (export "five" (core module (import "six" "a" (func)) (import "six" "b" (func)) )) @@ -812,11 +907,11 @@ WebAssembly.instantiateStreaming(fetch('./a.wasm'), { The other significant addition to the JS API would be the expansion of the set of WebAssembly types coerced to and from JavaScript values (by [`ToJSValue`] -and [`ToWebAssemblyValue`]) to include all of [`intertype`](#type-definitions). +and [`ToWebAssemblyValue`]) to include all of [`valtype`](#type-definitions). At a high level, the additional coercions would be: -| Interface Type | `ToJSValue` | `ToWebAssemblyValue` | -| -------------- | ----------- | -------------------- | +| Type | `ToJSValue` | `ToWebAssemblyValue` | +| ---- | ----------- | -------------------- | | `unit` | `null` | accept everything | | `bool` | `true` or `false` | `ToBoolean` | | `s8`, `s16`, `s32` | as a Number value | `ToInt32` | @@ -852,8 +947,8 @@ Notes: ### ESM-integration -Like the JS API, [ESM-integration] can be extended to load components in all -the same places where modules can be loaded today, branching on the `kind` +Like the JS API, [esm-integration] can be extended to load components in all +the same places where modules can be loaded today, branching on the `layer` field in the binary format to determine whether to decode as a module or a component. 
The main question is how to deal with component imports having a single string as well as the new importable component, module and instance @@ -927,20 +1022,21 @@ and will be added over the coming months to complete the MVP proposal: [Structure Section]: https://webassembly.github.io/spec/core/syntax/index.html -[`core:module`]: https://webassembly.github.io/spec/core/syntax/modules.html#syntax-module -[`core:export`]: https://webassembly.github.io/spec/core/syntax/modules.html#syntax-export -[`core:import`]: https://webassembly.github.io/spec/core/syntax/modules.html#syntax-import -[`core:importdesc`]: https://webassembly.github.io/spec/core/syntax/modules.html#syntax-importdesc -[`core:functype`]: https://webassembly.github.io/spec/core/syntax/types.html#syntax-functype -[`core:valtype`]: https://webassembly.github.io/spec/core/syntax/types.html#value-types - [Text Format Section]: https://webassembly.github.io/spec/core/text/index.html +[Binary Format Section]: https://webassembly.github.io/spec/core/binary/index.html + +[Index Space]: https://webassembly.github.io/spec/core/syntax/modules.html#indices [Abbreviations]: https://webassembly.github.io/spec/core/text/conventions.html#abbreviations + +[`core:module`]: https://webassembly.github.io/spec/core/text/modules.html#text-module +[`core:type`]: https://webassembly.github.io/spec/core/text/modules.html#types +[`core:importdesc`]: https://webassembly.github.io/spec/core/text/modules.html#text-importdesc +[`core:externtype`]: https://webassembly.github.io/spec/core/syntax/types.html#external-types +[`core:valtype`]: https://webassembly.github.io/spec/core/text/types.html#value-types [`core:typeuse`]: https://webassembly.github.io/spec/core/text/modules.html#type-uses +[`core:functype`]: https://webassembly.github.io/spec/core/text/types.html#function-types [func-import-abbrev]: https://webassembly.github.io/spec/core/text/modules.html#text-func-abbrev - -[Binary Format Section]: 
https://webassembly.github.io/spec/core/binary/index.html -[`version`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-version +[`core:version`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-version [JS API]: https://webassembly.github.io/spec/js-api/index.html [*read the imports*]: https://webassembly.github.io/spec/js-api/index.html#read-the-imports @@ -958,7 +1054,6 @@ and will be added over the coming months to complete the MVP proposal: [Module Specifier]: https://tc39.es/ecma262/multipage/ecmascript-language-scripts-and-modules.html#prod-ModuleSpecifier [Named Imports]: https://tc39.es/ecma262/multipage/ecmascript-language-scripts-and-modules.html#prod-NamedImports [Imported Default Binding]: https://tc39.es/ecma262/multipage/ecmascript-language-scripts-and-modules.html#prod-ImportedDefaultBinding - [JS Tuple]: https://github.com/tc39/proposal-record-tuple [JS Record]: https://github.com/tc39/proposal-record-tuple @@ -974,16 +1069,19 @@ and will be added over the coming months to complete the MVP proposal: [Sequences]: https://en.wikipedia.org/wiki/Sequence [ABI]: https://en.wikipedia.org/wiki/Application_binary_interface [Environment Variables]: https://en.wikipedia.org/wiki/Environment_variable +[Linear]: https://en.wikipedia.org/wiki/Substructural_type_system#Linear_type_systems -[Module Linking]: https://github.com/WebAssembly/module-linking/blob/main/design/proposals/module-linking/Explainer.md -[Interface Types]: https://github.com/WebAssembly/interface-types/blob/main/proposals/interface-types/Explainer.md -[Type Imports and Exports]: https://github.com/WebAssembly/proposal-type-imports/blob/master/proposals/type-imports/Overview.md -[Exception Handling]: https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md -[Stack Switching]: https://github.com/WebAssembly/stack-switching/blob/main/proposals/stack-switching/Overview.md -[ESM-integration]: 
https://github.com/WebAssembly/esm-integration/tree/main/proposals/esm-integration +[module-linking]: https://github.com/WebAssembly/module-linking/blob/main/design/proposals/module-linking/Explainer.md +[interface-types]: https://github.com/WebAssembly/interface-types/blob/main/proposals/interface-types/Explainer.md +[type-imports]: https://github.com/WebAssembly/proposal-type-imports/blob/master/proposals/type-imports/Overview.md +[exception-handling]: https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md +[stack-switching]: https://github.com/WebAssembly/stack-switching/blob/main/proposals/stack-switching/Overview.md +[esm-integration]: https://github.com/WebAssembly/esm-integration/tree/main/proposals/esm-integration +[gc]: https://github.com/WebAssembly/gc/blob/main/proposals/gc/MVP.md [Adapter Functions]: FutureFeatures.md#custom-abis-via-adapter-functions [Canonical ABI]: CanonicalABI.md +[Shared-Nothing]: ../high-level/Choices.md [`wizer`]: https://github.com/bytecodealliance/wizer diff --git a/design/mvp/FutureFeatures.md b/design/mvp/FutureFeatures.md index cf986b66..360a77e6 100644 --- a/design/mvp/FutureFeatures.md +++ b/design/mvp/FutureFeatures.md @@ -15,23 +15,22 @@ serialization format, as this often incurs extra copying when the source or destination language-runtime data structures don't precisely match the fixed serialization format. A significant amount of work was spent designing a language of [adapter functions] that provided fairly general programmatic -control over the process of serializing and deserializing interface-typed values. +control over the process of serializing and deserializing high-level values. (The Interface Types Explainer currently contains a snapshot of this design.) However, a significant amount of additional design work remained, including (likely) changing the underlying semantic foundations from lazy evaluation to algebraic effects. 
-In pursuit of a timely MVP and as part of the overall [scoping and layering proposal], -the goal of avoiding a fixed serialization format was dropped from the MVP, by -instead defining a [Canonical ABI](CanonicalABI.md) in the MVP. However, the -current design of [function definitions](Explainer.md#function-definitions) -anticipates a future extension whereby function bodies can contain not just the -fixed Canonical ABI-following `canon.lift` and `canon.lower` but, -alternatively, general adapter function code. +In pursuit of a timely MVP and as part of the overall [scoping and layering +proposal], the goal of avoiding a fixed serialization format was dropped from +the MVP by instead defining a [Canonical ABI](CanonicalABI.md) in the MVP. +However, the current design anticipates a future extension whereby lifting and +lowering functions can be generated not just from `canon lift` and `canon +lower`, but, alternatively, general-purpose serialization/deserialization code. -In this future state, `canon.lift` and `canon.lower` could be specified by -simple expansion into the adapter code, making these instructions effectively -macros. However, even in this future state, there is still concrete value in +In this future state, `canon lift` and `canon lower` could be specified by +simple expansion into the general-purpose code, making these instructions +effectively macros. However, even in this future state, there is still value in having a fixedly-defined Canonical ABI as it allows more-aggressive optimization of calls between components (which both use the Canonical ABI) and between a component and the host (which often must use a fixed ABI for calling @@ -53,8 +52,8 @@ Additionally, having two similar-but-different, partially-overlapping concepts makes the whole proposal harder to explain. Thus, the MVP drops the concept of "adapter modules", including only shared-nothing "components". 
However, if concrete future use cases emerged for creating modules that partially used -interface types and partially shared linear memory, "adapter modules" could be -added as a future feature. +shared-nothing component values and partially shared linear memory, "adapter +modules" could be added as a future feature. ## Shared-everything Module Linking in Core WebAssembly diff --git a/design/mvp/Subtyping.md b/design/mvp/Subtyping.md index 608dc088..7114f050 100644 --- a/design/mvp/Subtyping.md +++ b/design/mvp/Subtyping.md @@ -6,7 +6,7 @@ But roughly speaking: | Type | Subtyping | | ------------------------- | --------- | -| `unit` | every interface type is a subtype of `unit` | +| `unit` | every value type is a subtype of `unit` | | `bool` | | | `s8`, `s16`, `s32`, `s64`, `u8`, `u16`, `u32`, `u64` | lossless coercions are allowed | | `float32`, `float64` | `float32 <: float64` | @@ -20,5 +20,5 @@ But roughly speaking: | `union` | `T <: (union ... T ...)` | | `func` | parameter names must match in order; contravariant parameter subtyping; superfluous parameters can be ignored in the subtype; `option` parameters can be ignored in the supertype; covariant result subtyping | -The remaining specialized interface types inherit their subtyping from their -fundamental interface types. +The remaining specialized value types inherit their subtyping from their +fundamental value types. 
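Since the specialized value types are defined by expansion into the fundamental ones, a one-step expansion following the mapping table in the Explainer hunk can be sketched as below. This is an illustration, not the `despecialize` function from `definitions.py`: the class set is trimmed, a payload-less `case` is modeled with a `unit` payload, and `flags` and `string` are expanded here purely to mirror the table.

```python
from dataclasses import dataclass


class ValType: pass

# Leaf types are dataclasses so that instances compare equal by type.
@dataclass
class Unit(ValType): pass
@dataclass
class Bool(ValType): pass
@dataclass
class Char(ValType): pass
@dataclass
class String(ValType): pass

@dataclass
class List(ValType):
    t: ValType
@dataclass
class Field:
    label: str
    t: ValType
@dataclass
class Record(ValType):
    fields: list
@dataclass
class Case:
    label: str
    t: ValType
@dataclass
class Variant(ValType):
    cases: list
@dataclass
class Tuple(ValType):
    ts: list
@dataclass
class Flags(ValType):
    labels: list
@dataclass
class Enum(ValType):
    labels: list
@dataclass
class Union(ValType):
    ts: list
@dataclass
class Option(ValType):
    t: ValType
@dataclass
class Expected(ValType):
    ok: ValType
    error: ValType


def despecialize(t):
    """Expand a specialized value type one step; return others unchanged."""
    if isinstance(t, Tuple):
        return Record([Field(str(i), ty) for i, ty in enumerate(t.ts)])
    if isinstance(t, Flags):
        return Record([Field(name, Bool()) for name in t.labels])
    if isinstance(t, Unit):
        return Record([])
    if isinstance(t, Enum):
        return Variant([Case(name, Unit()) for name in t.labels])
    if isinstance(t, Option):
        return Variant([Case("none", Unit()), Case("some", t.t)])
    if isinstance(t, Union):
        return Variant([Case(str(i), ty) for i, ty in enumerate(t.ts)])
    if isinstance(t, Expected):
        return Variant([Case("ok", t.ok), Case("error", t.error)])
    if isinstance(t, String):
        return List(Char())
    return t  # already a fundamental value type
```

Under this reading, a subtyping check on `(option string)` would first expand it to a two-case `variant` and then apply the `variant` rule.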
diff --git a/design/mvp/canonical-abi/definitions.py b/design/mvp/canonical-abi/definitions.py index 949ae02e..183ed04b 100644 --- a/design/mvp/canonical-abi/definitions.py +++ b/design/mvp/canonical-abi/definitions.py @@ -19,74 +19,74 @@ def trap_if(cond): if cond: raise Trap() -class InterfaceType: pass -class Unit(InterfaceType): pass -class Bool(InterfaceType): pass -class S8(InterfaceType): pass -class U8(InterfaceType): pass -class S16(InterfaceType): pass -class U16(InterfaceType): pass -class S32(InterfaceType): pass -class U32(InterfaceType): pass -class S64(InterfaceType): pass -class U64(InterfaceType): pass -class Float32(InterfaceType): pass -class Float64(InterfaceType): pass -class Char(InterfaceType): pass -class String(InterfaceType): pass +class ValType: pass +class Unit(ValType): pass +class Bool(ValType): pass +class S8(ValType): pass +class U8(ValType): pass +class S16(ValType): pass +class U16(ValType): pass +class S32(ValType): pass +class U32(ValType): pass +class S64(ValType): pass +class U64(ValType): pass +class Float32(ValType): pass +class Float64(ValType): pass +class Char(ValType): pass +class String(ValType): pass @dataclass -class List(InterfaceType): - t: InterfaceType +class List(ValType): + t: ValType @dataclass class Field: label: str - t: InterfaceType + t: ValType @dataclass -class Record(InterfaceType): +class Record(ValType): fields: [Field] @dataclass -class Tuple(InterfaceType): - ts: [InterfaceType] +class Tuple(ValType): + ts: [ValType] @dataclass -class Flags(InterfaceType): +class Flags(ValType): labels: [str] @dataclass class Case: label: str - t: InterfaceType + t: ValType refines: str = None @dataclass -class Variant(InterfaceType): +class Variant(ValType): cases: [Case] @dataclass -class Enum(InterfaceType): +class Enum(ValType): labels: [str] @dataclass -class Union(InterfaceType): - ts: [InterfaceType] +class Union(ValType): + ts: [ValType] @dataclass -class Option(InterfaceType): - t: InterfaceType +class 
Option(ValType): + t: ValType @dataclass -class Expected(InterfaceType): - ok: InterfaceType - error: InterfaceType +class Expected(ValType): + ok: ValType + error: ValType @dataclass class Func: - params: [InterfaceType] - result: InterfaceType + params: [ValType] + result: ValType ### Despecialization @@ -603,9 +603,9 @@ def flatten(functype, context): flat_results = flatten_type(functype.result) if len(flat_results) > MAX_FLAT_RESULTS: match context: - case 'canon.lift': + case 'lift': flat_results = ['i32'] - case 'canon.lower': + case 'lower': flat_params += ['i32'] flat_results = [] @@ -869,7 +869,7 @@ def lower(opts, max_flat, vs, ts, out_param = None): flat_vals += lower_flat(opts, vs[i], ts[i]) return flat_vals -### `canon.lift` +### `lift` class Instance: may_leave = True @@ -898,7 +898,7 @@ def post_return(): return (result, post_return) -### `canon.lower` +### `lower` def canon_lower(caller_opts, caller_instance, callee, functype, flat_args): trap_if(not caller_instance.may_leave) diff --git a/design/mvp/canonical-abi/run_tests.py b/design/mvp/canonical-abi/run_tests.py index 9e6bb0cb..8f270bde 100644 --- a/design/mvp/canonical-abi/run_tests.py +++ b/design/mvp/canonical-abi/run_tests.py @@ -312,13 +312,13 @@ def test_flatten(t, params, results): if len(results) > definitions.MAX_FLAT_RESULTS: expect['results'] = ['i32'] - got = flatten(t, 'canon.lift') + got = flatten(t, 'lift') assert(got == expect) if len(results) > definitions.MAX_FLAT_RESULTS: expect['params'] += ['i32'] expect['results'] = [] - got = flatten(t, 'canon.lower') + got = flatten(t, 'lower') assert(got == expect) test_flatten(Func([U8(),Float32(),Float64()],Unit()), ['i32','f32','f64'], []) diff --git a/design/mvp/examples/SharedEverythingDynamicLinking.md b/design/mvp/examples/SharedEverythingDynamicLinking.md index 0957faa1..30f75901 100644 --- a/design/mvp/examples/SharedEverythingDynamicLinking.md +++ b/design/mvp/examples/SharedEverythingDynamicLinking.md @@ -157,10 +157,10 @@ 
would look like: (with "libc" (instance $libc)) (with "libzip" (instance $libzip)) )) - (func (export "zip") (canon.lift + (func (export "zip") (canon lift + (func $main "zip") (func (param (list u8)) (result (list u8))) (memory (memory $libc "memory")) (realloc (func $libc "realloc")) - (func $main "zip") )) ) ``` @@ -236,10 +236,10 @@ component-aware `clang`, the resulting component would look like: (with "libc" (instance $libc)) (with "libimg" (instance $libimg)) )) - (func (export "transform") (canon.lift + (func (export "transform") (canon lift + (func $main "transform") (func (param (list u8)) (result (list u8))) (memory (memory $libc "memory")) (realloc (func $libc "realloc")) - (func $main "transform") )) ) ``` @@ -283,23 +283,23 @@ components. The resulting component could look like: )) (instance $libc (instantiate (module $Libc))) - (func $zip (canon.lower - (memory (memory $libc "memory")) (realloc (func $libc "realloc")) + (func $zip (canon lower (func $zipper "zip") - )) - (func $transform (canon.lower (memory (memory $libc "memory")) (realloc (func $libc "realloc")) + )) + (func $transform (canon lower (func $imgmgk "transform") + (memory (memory $libc "memory")) (realloc (func $libc "realloc")) )) (instance $main (instantiate (module $Main) (with "libc" (instance $libc)) (with "zipper" (instance (export "zip" (func $zipper "zip")))) (with "imgmgk" (instance (export "transform" (func $imgmgk "transform")))) )) - (func (export "run") (canon.lift + (func (export "run") (canon lift + (func $main "run") (func (param string) (result string)) (memory (memory $libc "memory")) (realloc (func $libc "realloc")) - (func $main "run") )) ) ``` From 6e78729e14b0f7cbf1bf83670515de87fde1b518 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Tue, 3 May 2022 13:11:05 -0500 Subject: [PATCH 02/27] Restore value type binary encoding, refactor type grammar slightly --- design/mvp/Binary.md | 51 +++++++++---------- design/mvp/Explainer.md | 106 
++++++++++++++++++++-------------------- 2 files changed, 78 insertions(+), 79 deletions(-) diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md index a37f4c59..7da2ee98 100644 --- a/design/mvp/Binary.md +++ b/design/mvp/Binary.md @@ -73,7 +73,7 @@ instanceexpr ::= 0x00 c: arg*:vec() => (i | 0x01 e*:vec() => e* instantiatearg ::= n: si: => (with n si) sortidx ::= sort: idx: => (sort idx) -sort ::= 0x00 si: => si +sort ::= 0x00 => core module | 0x01 => func | 0x02 => value | 0x03 => type @@ -150,30 +150,12 @@ Notes: ``` type ::= dt: => (type dt) -deftype ::= vt: => vt +deftype ::= dvt: => dvt | ft: => ft + | tt: => tt | ct: => ct | it: => it -functype ::= 0x40 param*:vec() t: => (func param* (result t)) -param ::= 0x00 t: => (param t) - | 0x01 n: t: => (param n t) -componenttype ::= 0x41 cd*:vec() => (component cd*) -instancetype ::= 0x42 id*:vec() => (instance id*) -componentdecl ::= 0x00 id: => id - | id: => id -instancedecl ::= 0x01 t: => t - | 0x02 a: => a - | 0x03 ed: => ed -importdecl ::= n: ed: => (import n ed) -exportdecl ::= n: ed: => (export n ed) -externdesc ::= 0x00 i: => core-type-index-space[i] (must be moduletype) - | 0x01 i: => type-index-space[i] (must be func|instance|componenttype) - | 0x02 t: => (value t) - | 0x03 tb: => (type tb) -typebound ::= 0x00 i: => (eq type-index-space[i]) (any deftype) - | 0x00 t: => (eq t) -valtype ::= i: => type-index-space[i] (must be valtype) - | 0x7f => unit +primvaltype ::= 0x7f => unit | 0x7e => bool | 0x7d => s8 | 0x7c => u8 @@ -187,6 +169,7 @@ valtype ::= i: => type-index-space[i] ( | 0x74 => float64 | 0x73 => char | 0x72 => string +defvaltype ::= pvt: => pvt | 0x71 field*:vec() => (record field*) | 0x70 case*:vec() => (variant case*) | 0x6f t: => (list t) @@ -196,9 +179,27 @@ valtype ::= i: => type-index-space[i] ( | 0x6b t*:vec() => (union t*) | 0x6a t: => (option t) | 0x69 t: u: => (expected t u) +valtype ::= i: => type-index-space[i] (must be defvaltype) + | pit: => pit field ::= n: t: => (field n t) 
case ::= n: t: 0x0 => (case n t) | n: t: 0x1 i: => (case n t (refines case-label[i])) +typetype ::= tb: (type tb) +typebound ::= 0x00 i: => (eq type-index-space[i]) +functype ::= 0x40 param*:vec() t: => (func param* (result t)) +param ::= 0x00 t: => (param t) + | 0x01 n: t: => (param n t) +componenttype ::= 0x41 cd*:vec() => (component cd*) +instancetype ::= 0x42 id*:vec() => (instance id*) +componentdecl ::= 0x00 id: => id + | id: => id +instancedecl ::= 0x01 t: => t + | 0x02 a: => a + | 0x03 ed: => ed +importdecl ::= n: et: => (import n et) +exportdecl ::= n: et: => (export n et) +externtype ::= 0x00 i: => (core module core-type-index-space[i]) + | sort: i: => (sort type-index-space[i]) (sort must match type) ``` Notes: * The type opcodes follow the same negative-SLEB128 scheme as Core WebAssembly, @@ -209,10 +210,6 @@ Notes: * As described in the explainer, each component and instance type is validated with an initially-empty type index space. Outer aliases can be used to pull in type definitions from containing components. -* The rule for `typebound` contains both an unrestricted `` case and, - within `valtype`, a `valtype`-restricted `` case. Since the former - is a strict generalization of the latter, there is no ambiguity. The net - effect is that `eq` accepts all types. ## Canonical Definitions @@ -273,7 +270,7 @@ flags are set. (See [Import and Export Definitions](Explainer.md#import-and-export-definitions) in the explainer.) 
``` -import ::= n: ed: => (import n ed) +import ::= n: et: => (import n et) export ::= n: si: => (export n si) ``` Notes: diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 6eb1df92..5994abab 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -190,7 +190,7 @@ instanceexpr ::= (instantiate *) instantiatearg ::= (with ) | (with (instance *)) sortidx ::= ( ) -sort ::= core-prefix() +sort ::= core module | func | value | type @@ -201,11 +201,14 @@ export ::= (export ) Because component-level function, type and instance definitions are different than core-level function, type and instance definitions, they are put into disjoint index spaces which are indexed separately by `sortidx` and -`core:sortidx`, respectively. Components may import or export core modules -(since core modules are immutable values and thus do not break the -[shared-nothing] model) and so `sortidx` includes `core:sortidx` (which -validation then restricts to core modules; in the future, other immutable core -definitions could be allowed, such as `data` segments). +`core:sortidx`, respectively. Components may also import or export core modules +since core modules are immutable values and thus do not break the +[shared-nothing] model. In the future, other immutable core sorts could be +added to this list such as, if it was made importable/exportable, `data`. + +The `value` sort refers to a value that is provided and consumed during +instantiation. How this works is described in the +[start definitions](#start-definitions) section. To see a non-trivial example of component instantiation, we'll first need to introduce a few other definitions below that allow components to import, define @@ -405,26 +408,11 @@ therefore be high-level, describing entire compound values. ``` type ::= (type ? ) deftype ::= - | - | - | -functype ::= (func ? (param ? )* (result )) -componenttype ::= (component ? *) -instancetype ::= (instance ? 
*) -componentdecl ::= - | -instancedecl ::= - | - | -importdecl ::= (import ) -exportdecl ::= (export ) -externdesc ::= core-prefix() - | + | +nonvaltype ::= + | | | - | (value ? ) - | (type ? ) -typebound ::= (eq ) valtype ::= unit | bool | s8 | u8 | s16 | u16 | s32 | u32 | s64 | u64 @@ -439,10 +427,25 @@ valtype ::= unit | (union +) | (option ) | (expected ) +functype ::= (func ? (param ? )* (result )) +typetype ::= (type ? ) +typebound ::= (eq ) +componenttype ::= (component ? *) +instancetype ::= (instance ? *) +componentdecl ::= + | +instancedecl ::= + | + | +importdecl ::= (import ) +exportdecl ::= (export ) +externtype ::= core-prefix() + | (value ? ) + | ``` -This type grammar uses productions like `` and `` recursively -to allow it to more-precisely indicate what's allowed. The formal AST and -[binary format](Binary.md#type-definitions) instead use a `` with +This grammar defines `type` recursively to allow it to more-precisely indicate +what's allowed at each point in the recursion. The formal AST and +[binary format](Binary.md#type-definitions) would instead use a `typeidx` with validation rules to restrict the target type while the formal text format would use something like [`core:typeuse`], allowing any of: (1) a `typeidx`, (2) an identifier `$T` resolving to a type definition (using `(type $T)` in cases @@ -504,6 +507,21 @@ unreachability. The remaining 5 type constructors use `valtype` to complete the description of a shared-nothing component interface: +The `type` type-constructor describes an imported or exported type along with +its bounds, which currently only has an `eq` option that says that the +imported/exported type must be exactly equal to the given immediate type. 
There +are two main use cases for this in the short-term: +* Type exports allow a component or interface to associate a name with a + structural type (e.g., `(export "nanos" (type (eq u64)))`) which bindings + generators can use to generate type aliases (e.g., `typedef uint64_t nanos;`). +* Type imports and exports allow a component to explicitly specify the + type parameters used to monomorphize a generic interface being imported + or exported. + +When [resource and handle types] are added to the explainer, `typebound` will +be extended with a `sub` option (symmetric to the [type-imports] proposal) that +allows importing and exporting *abstract* types. + The `func` type constructor describes a component-level function definition that takes and returns component-level value types. In contrast to [`core:functype`] which, as a low-level compiler target for a stack machine, @@ -517,31 +535,15 @@ interpreting this as `(result unit)`. The `component` type constructor is symmetric to the core `module` type constructor, although its grammar is factored to share declarators with the `instance` type constructor. The `import` and `export` declarator names -must be distinct within a single type. - -The `externdesc` production (used to declare the types of imported/exported -values) includes two additional type constructors that are not currently -present in `deftype` (since there is currently no reason for allowing them to -be shared or named as type definitions): +must be distinct within a single type. The `externtype` production shared by +the `import` and `export` declarators is symmetric to [`core:externtype`] and +includes all importable/exportable types. -The `value` case describes an imported or exported `valtype` value that is to -be consumed exactly once during instantiation. How this happens is described -below along with [`start` definitions](#start-definitions). 
- -The `type` case describes an imported or exported type along with its bounds, -which currently only has an `eq` option that says that the imported/exported -type must be exactly equal to the given immediate type. There are two main use -cases for this in the short-term: -* Type exports allow a component or interface to associate a name with a - structural type (e.g., `(export "nanos" (type (eq u64)))`) which bindings - generators can use to generate type aliases (e.g., `typedef uint64_t nanos;`). -* Type imports and exports allow a component to explicitly specify the - type parameters used to monomorphize a generic interface being imported - or exported. - -When [resource and handle types] are added to the explainer, `typebound` will -be extended with a `sub` option (symmetric to the [type-imports] proposal) that -allows importing and exporting *abstract* types. +The family of value types, `valtype`, is unified by a *single* type +constructor, `value`, that corresponds 1:1 with the `value` sort (described in +the [start definitions](#start-definitions) section below). As a type +constructor, `value` is symmetric to `global` in Core WebAssembly, but without +a mutability option. With what's defined so far, we can define component types using a mix of inline and out-of-line type definitions: @@ -761,7 +763,7 @@ of core linear memory. Lastly, imports and exports are defined in terms of the above as: ``` -import ::= (import ) +import ::= (import ) export ::= (export ) ``` All import and export names within a component must be unique, respectively. 
From 24975fc0d758beb0179b8489c6fb80aee44982e6 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Tue, 3 May 2022 18:42:34 -0500 Subject: [PATCH 03/27] Remove 'outer' option from core:alias --- design/mvp/Binary.md | 4 --- design/mvp/Explainer.md | 55 +++++++++++++++++++++-------------------- 2 files changed, 28 insertions(+), 31 deletions(-) diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md index 7da2ee98..66a21892 100644 --- a/design/mvp/Binary.md +++ b/design/mvp/Binary.md @@ -102,7 +102,6 @@ Notes: ``` core:alias ::= sort: target: => (core alias target (sort)) core:aliastarget ::= 0x00 i: n: => export i n - | 0x01 ct: idx: => outer ct idx alias ::= sort: target: => (alias target (sort)) aliastarget ::= 0x00 i: n: => export i n @@ -131,7 +130,6 @@ core:deftype ::= ft: => ft ( core:moduletype ::= 0x50 md*:vec() => (module md*) core:moduledecl ::= 0x00 i: => i | 0x01 t: => t - | 0x02 a: => a | 0x03 e: => e core:import ::= m: f: ed: => (import m f ed) (WebAssembly 1.0) core:externdesc ::= id: => id (WebAssembly 1.0) @@ -142,8 +140,6 @@ Notes: * `core:import` as written above is binary-compatible with [`core:import`]. * Validation of `core:moduledecl` (currently) rejects `core:moduletype` definitions inside `type` declarators (i.e., nested core module types). -* Validation of `core:moduledecl` (currently) only allows `outer` `type` - `alias` declarators. * As described in the explainer, each module type is validated with an initially-empty type index space. Outer aliases can be used to pull in type definitions from containing components. diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 5994abab..3b017313 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -225,7 +225,6 @@ component): ``` core:alias ::= (alias ( ?)) core:aliastarget ::= export - | outer alias ::= (alias ( ?)) aliastarget ::= export @@ -248,6 +247,11 @@ alias refers to the current component. 
To maintain the acyclicity of module instantiation, outer aliases are only allowed to refer to *preceding* outer definitions. +There is no `outer` option in `core:aliastarget` because it would only be able +to refer to enclosing *core* modules and module types and, until +module-linking, modules and module types can't nest. In a module-linking +future, outer aliases would be added, making `core:alias` symmetric to `alias`. + Components containing outer aliases effectively produce a [closure] at instantiation time, including a copy of the outer-aliased definitions. Because of the prevalent assumption that components are immutable values, outer aliases @@ -350,7 +354,6 @@ core:deftype ::= (WebAssembly 1.0) core:moduletype ::= (module ? *) core:moduledecl ::= | - | | core:importdecl ::= (import ) core:exportdecl ::= (export ) @@ -374,31 +377,7 @@ In preparation for the forthcoming addition of [type-imports] to Core WebAssembly, module types start with an empty type index space so that the type index space can be populated with fresh type definitions constructed from type imports. Thus, `core:moduledecl` also includes a `type` declarator for defining -the types used by the `import` and `export` declarators. An `alias` declarator -is also necessary in the future for defining type-sharing constraints between -type imports. In the short-term, `alias` declarators are restricted to only -allowing `outer` `type` aliases, thereby enabling a module type to reuse a -parent's type definition instead of re-defining it locally. - -As an example, the following component defines two equivalent module types, -where the former defines the function via `type` declarator and the latter via -`alias` declarator. In both cases, the type is given index `0` since the module -type starts with an empty type index space. 
-```wasm -(component $C - (core type $M1 (module - (type (func (param i32) (result i32))) - (import "a" "b" (func (type 0))) - (export "c" (func (type 0))) - )) - (core type $F (func (param i32) (result i32))) - (core type $M2 (module - (alias outer $C $F (type)) - (import "a" "b" (func (type 0))) - (export "c" (func (type 0))) - )) -) -``` +the types used by the `import` and `export` declarators. Component-level type definitions are symmetric to core-level type definitions, but use a completely different set of value types. Unlike [`core:valtype`] @@ -539,6 +518,28 @@ must be distinct within a single type. The `externtype` production shared by the `import` and `export` declarators is symmetric to [`core:externtype`] and includes all importable/exportable types. +Component and instance types also include an `alias` declarator for projecting +the exports out of imported instances and sharing types with outer components. +As an example, the following component defines two equivalent component types, +where the former defines the function type via `type` declarator and the latter +via `alias` declarator. In both cases, the type is given index `0` since +component types start with an empty type index space. +```wasm +(component $C + (type $C1 (component + (type (func (param string) (result string))) + (import "a" "b" (func (type 0))) + (export "c" (func (type 0))) + )) + (type $F (func (param string) (result string))) + (type $C2 (component + (alias outer $C $F (type)) + (import "a" "b" (func (type 0))) + (export "c" (func (type 0))) + )) +) +``` + The family of value types, `valtype`, is unified by a *single* type constructor, `value`, that corresponds 1:1 with the `value` sort (described in the [start definitions](#start-definitions) section below). 
As a type From 67608edac5aeeac6c9b198cb28365d778bce359d Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Wed, 4 May 2022 09:40:48 -0500 Subject: [PATCH 04/27] Tweak grammar to be more regular --- design/mvp/Binary.md | 4 ++-- design/mvp/Explainer.md | 15 ++++++++------- 2 files changed, 10 insertions(+), 9 deletions(-) diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md index 66a21892..b75c187e 100644 --- a/design/mvp/Binary.md +++ b/design/mvp/Binary.md @@ -73,7 +73,7 @@ instanceexpr ::= 0x00 c: arg*:vec() => (i | 0x01 e*:vec() => e* instantiatearg ::= n: si: => (with n si) sortidx ::= sort: idx: => (sort idx) -sort ::= 0x00 => core module +sort ::= 0x00 csi: => core csi | 0x01 => func | 0x02 => value | 0x03 => type @@ -194,7 +194,7 @@ instancedecl ::= 0x01 t: => t | 0x03 ed: => ed importdecl ::= n: et: => (import n et) exportdecl ::= n: et: => (export n et) -externtype ::= 0x00 i: => (core module core-type-index-space[i]) +externtype ::= 0x00 0x10 i: => (core module core-type-index-space[i]) (must be moduletype) | sort: i: => (sort type-index-space[i]) (sort must match type) ``` Notes: diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 3b017313..5b3c8f91 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -182,7 +182,7 @@ example of these, we'll also need the `alias` definitions introduced in the next section. The syntax for defining component instances is symmetric to core module -instances, but with a distinct component-level definition of `sort`: +instances, but with an expanded component-level definition of `sort`: ``` instance ::= (instance ? 
) instanceexpr ::= (instantiate *) @@ -190,7 +190,7 @@ instanceexpr ::= (instantiate *) instantiatearg ::= (with ) | (with (instance *)) sortidx ::= ( ) -sort ::= core module +sort ::= core-prefix() | func | value | type @@ -200,11 +200,12 @@ export ::= (export ) ``` Because component-level function, type and instance definitions are different than core-level function, type and instance definitions, they are put into -disjoint index spaces which are indexed separately by `sortidx` and -`core:sortidx`, respectively. Components may also import or export core modules -since core modules are immutable values and thus do not break the -[shared-nothing] model. In the future, other immutable core sorts could be -added to this list such as, if it was made importable/exportable, `data`. +disjoint index spaces which are indexed separately. Components may import +and export various core definitions (when they are compatible with the +[shared-nothing] model, which currently means only `module`, but may in the +future include `data`). Thus, component-level `sort` injects the full set +of `core:sort`, so that they may be referenced (leaving it up to validation +rules to throw out the core sorts that aren't allowed in various contexts). The `value` sort refers to a value that is provided and consumed during instantiation. 
How this works is described in the From 14ae2d05d42e27cd068d4392e2cc6ccab4ddb301 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Wed, 4 May 2022 10:31:14 -0500 Subject: [PATCH 05/27] Fix whitespace --- design/mvp/Explainer.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 5b3c8f91..a2e52e31 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -594,14 +594,14 @@ two directions: Canonical definitions specify one of these two wrapping directions, the function to wrap and a list of configuration options: ``` -canon ::= (canon lift core-prefix() * (func ?)) - | (canon lower * (core func ?)) -canonopt ::= string-encoding=utf8 - | string-encoding=utf16 - | string-encoding=latin1+utf16 - | (memory core-prefix()) - | (realloc core-prefix()) - | (post-return core-prefix()) +canon ::= (canon lift core-prefix() * (func ?)) + | (canon lower * (core func ?)) +canonopt ::= string-encoding=utf8 + | string-encoding=utf16 + | string-encoding=latin1+utf16 + | (memory core-prefix()) + | (realloc core-prefix()) + | (post-return core-prefix()) ``` The `string-encoding` option specifies the encoding the Canonical ABI will use for the `string` type. The `latin1+utf16` encoding captures a common string From 0d99c78b0da9596877f114ee59cf246ccf7b49ea Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Thu, 5 May 2022 13:25:26 -0500 Subject: [PATCH 06/27] Fix bug in outer alias example --- design/mvp/Explainer.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index a2e52e31..4caa7b83 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -296,7 +296,7 @@ is desugared into: (component $C (core module $M ...) 
(component - (core alias outer $C $M (module $C_M)) + (alias outer $C $M (core module $C_M)) (core instance (instantiate $C_M)) ) ) From 18917081b558b3352f9037cc1c255d99841aa77e Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Thu, 5 May 2022 13:39:08 -0500 Subject: [PATCH 07/27] Fix bug in outer alias example (better) --- design/mvp/Explainer.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 4caa7b83..88764e25 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -285,19 +285,19 @@ definition, resolved using normal lexical scoping rules. For example, the following component: ```wasm (component - (core module $M ...) + (component $C ...) (component - (core instance (instantiate $M)) + (instance (instantiate $C)) ) ) ``` is desugared into: ```wasm -(component $C - (core module $M ...) +(component $Parent + (component $C ...) (component - (alias outer $C $M (core module $C_M)) - (core instance (instantiate $C_M)) + (alias outer $Parent $C (component $Parent_C)) + (instance (instantiate $Parent_C)) ) ) ``` From 9e510969a59d0b60c0bcc7e4dafad51496b1569a Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Thu, 5 May 2022 14:54:57 -0500 Subject: [PATCH 08/27] Fix thinko in definition of 'sort' --- design/mvp/Explainer.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 88764e25..21bc9ae0 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -190,7 +190,7 @@ instanceexpr ::= (instantiate *) instantiatearg ::= (with ) | (with (instance *)) sortidx ::= ( ) -sort ::= core-prefix() +sort ::= core-prefix() | func | value | type From a6e40d16cfc63c695227d6e3387e7e48597b4270 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Thu, 5 May 2022 14:57:02 -0500 Subject: [PATCH 09/27] ... 
and in Binary.md too --- design/mvp/Binary.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md index b75c187e..29ed7b29 100644 --- a/design/mvp/Binary.md +++ b/design/mvp/Binary.md @@ -73,7 +73,7 @@ instanceexpr ::= 0x00 c: arg*:vec() => (i | 0x01 e*:vec() => e* instantiatearg ::= n: si: => (with n si) sortidx ::= sort: idx: => (sort idx) -sort ::= 0x00 csi: => core csi +sort ::= 0x00 cs: => core cs | 0x01 => func | 0x02 => value | 0x03 => type From fce98d20916c265116ee1e8786cef82d271d1af4 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Tue, 10 May 2022 15:22:20 -0500 Subject: [PATCH 10/27] Remove ambiguous hand-waving from type grammar --- design/mvp/Binary.md | 31 +++++---- design/mvp/Explainer.md | 150 ++++++++++++++++++++-------------------- 2 files changed, 93 insertions(+), 88 deletions(-) diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md index 29ed7b29..af9014c2 100644 --- a/design/mvp/Binary.md +++ b/design/mvp/Binary.md @@ -131,13 +131,11 @@ core:moduletype ::= 0x50 md*:vec() => (module md*) core:moduledecl ::= 0x00 i: => i | 0x01 t: => t | 0x03 e: => e -core:import ::= m: f: ed: => (import m f ed) (WebAssembly 1.0) -core:externdesc ::= id: => id (WebAssembly 1.0) -core:exportdecl ::= n: ed: => (export n ed) +core:importdecl ::= i: => i +core:exportdecl ::= n: d: => (export n d) ``` Notes: -* Reused Core binary rules: [`core:importdesc`], [`core:functype`] -* `core:import` as written above is binary-compatible with [`core:import`]. +* Reused Core binary rules: [`core:import`], [`core:importdesc`], [`core:functype`] * Validation of `core:moduledecl` (currently) rejects `core:moduletype` definitions inside `type` declarators (i.e., nested core module types). 
* As described in the explainer, each module type is validated with an @@ -148,7 +146,6 @@ Notes: type ::= dt: => (type dt) deftype ::= dvt: => dvt | ft: => ft - | tt: => tt | ct: => ct | it: => it primvaltype ::= 0x7f => unit @@ -175,13 +172,11 @@ defvaltype ::= pvt: => pvt | 0x6b t*:vec() => (union t*) | 0x6a t: => (option t) | 0x69 t: u: => (expected t u) -valtype ::= i: => type-index-space[i] (must be defvaltype) - | pit: => pit field ::= n: t: => (field n t) case ::= n: t: 0x0 => (case n t) | n: t: 0x1 i: => (case n t (refines case-label[i])) -typetype ::= tb: (type tb) -typebound ::= 0x00 i: => (eq type-index-space[i]) +valtype ::= i: => i + | pvt: => pvt functype ::= 0x40 param*:vec() t: => (func param* (result t)) param ::= 0x00 t: => (param t) | 0x01 n: t: => (param n t) @@ -192,20 +187,28 @@ componentdecl ::= 0x00 id: => id instancedecl ::= 0x01 t: => t | 0x02 a: => a | 0x03 ed: => ed -importdecl ::= n: et: => (import n et) -exportdecl ::= n: et: => (export n et) -externtype ::= 0x00 0x10 i: => (core module core-type-index-space[i]) (must be moduletype) - | sort: i: => (sort type-index-space[i]) (sort must match type) +importdecl ::= n: ed: => (import n ed) +exportdecl ::= n: ed: => (export n ed) +externdesc ::= 0x00 0x10 i: => (core module (type i)) + | 0x01 i: => (func (type i)) + | 0x02 t: => (value t) + | 0x03 b: => (type b) + | 0x04 i: => (instance (type i)) + | 0x05 i: => (component (type i)) +typebound ::= 0x00 i: => (eq i) ``` Notes: * The type opcodes follow the same negative-SLEB128 scheme as Core WebAssembly, with type opcodes starting at SLEB128(-1) (`0x7f`) and going down, reserving the nonnegative SLEB128s for type indices. +* Validation of `valtype` requires the `typeidx` to refer to a `defvaltype`. * Validation of `moduledecl` (currently) only allows `outer` `type` `alias` declarators. * As described in the explainer, each component and instance type is validated with an initially-empty type index space. 
Outer aliases can be used to pull in type definitions from containing components. +* Validation of `externdesc` requires the various `typeidx` type constructors + to match the preceding `sort`. ## Canonical Definitions diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 21bc9ae0..978b2e88 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -63,6 +63,8 @@ definition ::= core-prefix() | | | + +where core-prefix(X) parses '(' 'core' Y ')' when X parses '(' Y ')' ``` Components are like Core WebAssembly modules in that their contained definitions are acyclic: definitions can only refer to preceding definitions @@ -71,10 +73,7 @@ components can arbitrarily interleave different kinds of definitions. The `core-prefix` meta-function transforms a grammatical rule for parsing a Core WebAssembly definition into a grammatical rule for parsing the same -definition, but with a `core` token added right after the leftmost paren: -``` -core-prefix(X) ::= '(' 'core' Y ')' where X = '(' Y ')' -``` +definition, but with a `core` token added right after the leftmost paren. For example, `core:module` accepts `(module (func))` so `core-prefix()` accepts `(core module (func))`. Note that the inner `func` doesn't need a `core` prefix; the `core` token is used to mark the @@ -356,10 +355,13 @@ core:moduletype ::= (module ? *) core:moduledecl ::= | | -core:importdecl ::= (import ) -core:exportdecl ::= (export ) -core:externdesc ::= (WebAssembly 1.0) +core:importdecl ::= (import ) (WebAssembly 1.0) +core:exportdecl ::= (export ) +core:exportdesc ::= strip-id() + +where strip-id(X) parses '(' sort Y ')' when X parses '(' sort ? Y ')' ``` + Here, `core:deftype` (short for "defined type") is inherited from the [gc] proposal and extended with a `module` type constructor. If module-linking is added to Core WebAssembly, an `instance` type constructor would be added as @@ -370,9 +372,9 @@ core modules cannot themselves import or export other core modules. 
The body of a module type contains an ordered list of "module declarators" which describe, at a type level, the imports and exports of the module. In a module-type context, import and export declarators can both reuse the existing -[`core:importdesc`] production defined in WebAssembly 1.0. To avoid confusion, -`core:importdesc` is renamed to `core:externdesc` (for symmetry with -[`core:externtype`]). +[`core:importdesc`] production defined in WebAssembly 1.0, with the only +difference being that, in the text format, `core:importdesc` can bind an +identifier for later reuse while `core:exportdesc` cannot. In preparation for the forthcoming addition of [type-imports] to Core WebAssembly, module types start with an empty type index space so that the type @@ -387,13 +389,11 @@ compound values, component-level value types assume no shared memory and must therefore be high-level, describing entire compound values. ``` type ::= (type ? ) -deftype ::= - | -nonvaltype ::= - | +deftype ::= + | | | -valtype ::= unit +defvaltype ::= unit | bool | s8 | u8 | s16 | u16 | s32 | u32 | s64 | u64 | float32 | float64 @@ -407,35 +407,30 @@ valtype ::= unit | (union +) | (option ) | (expected ) -functype ::= (func ? (param ? )* (result )) -typetype ::= (type ? ) -typebound ::= (eq ) -componenttype ::= (component ? *) -instancetype ::= (instance ? *) +valtype ::= + | +functype ::= (func (param ? )* (result )) +componenttype ::= (component *) +instancetype ::= (instance *) componentdecl ::= | instancedecl ::= | | -importdecl ::= (import ) -exportdecl ::= (export ) -externtype ::= core-prefix() - | (value ? ) - | -``` -This grammar defines `type` recursively to allow it to more-precisely indicate -what's allowed at each point in the recursion. 
The formal AST and -[binary format](Binary.md#type-definitions) would instead use a `typeidx` with -validation rules to restrict the target type while the formal text format would -use something like [`core:typeuse`], allowing any of: (1) a `typeidx`, (2) an -identifier `$T` resolving to a type definition (using `(type $T)` in cases -where there is a grammatical ambiguity), or (3) an inline type definition that -is desugared into a deduplicated out-of-line type definition. - -The optional `id` after all the type constructors (e.g., `(module ? ...)`) -is only allowed to be present in the context of `import` since this is the only -context in which binding an identifier makes sense. +importdecl ::= (import ) +exportdecl ::= (export ) +importdesc ::= bind-id() +exportdesc ::= ( (type ) ) + | core-prefix() + | + | + | + | (value ) + | (type ) +typebound ::= (eq ) +where bind-id(X) parses '(' sort ? Y ')' when X parses '(' sort Y ')' +``` The value types in `valtype` can be broken into two categories: *fundamental* value types and *specialized* value types, where the latter are defined by expansion into the former. The *fundamental value types* have the following @@ -484,13 +479,43 @@ cases. This could be relaxed in the future to allow an empty list of cases, with the empty `(variant)` effectively serving as a [bottom type] and indicating unreachability. -The remaining 5 type constructors use `valtype` to complete the description -of a shared-nothing component interface: +The remaining 3 type constructors in `deftype` use `valtype` to describe +shared-nothing functions, components and component instances: + +The `func` type constructor describes a component-level function definition +that takes and returns `valtype`. 
In contrast to [`core:functype`] which, as a +low-level compiler target for a stack machine, returns zero or more results, +`functype` always returns a single type, with `unit` being used for functions +that don't return an interesting value (analogous to "void" in some languages). +Having a single return type simplifies the binding of `functype` into a wide +variety of source languages. As syntactic sugar, the text format of `functype` +additionally allows `result` to be absent, interpreting this as `(result +unit)`. + +The `instance` type constructor represents the result of instantiating a +component and thus is the same as a `component` type minus the description +of imports. -The `type` type-constructor describes an imported or exported type along with -its bounds, which currently only has an `eq` option that says that the -imported/exported type must be exactly equal to the given immediate type. There -are two main use cases for this in the short-term: +The `component` type constructor is symmetric to the core `module` type +constructor and is built from a sequence of "declarators" which are used to +describe the imports and exports of the component. There are four kinds of +declarators: + +As with core modules, `importdecl` and `exportdecl` classify component `import` +and `export` definitions, with `importdecl` allowing an identifier to be +bound for use within the type. Following the precedent of [`core:typeuse`], the +text format allows both references to out-of-line type definitions (via +`(type )`) and inline type expressions that the text format desugars +into out-of-line type definitions. + +The `value` case of `importdesc`/`exportdesc` describes a runtime value +that is imported or exported at instantiation time as described in the [start +definitions](#start-definitions) section below. + +The `type` case of `importdesc`/`exportdesc` describes an imported or exported +type along with its bounds. 
The bounds currently only have an `eq` option that +says that the imported/exported type must be exactly equal to the referenced +type. There are two main use cases for this in the short-term: * Type exports allow a component or interface to associate a name with a structural type (e.g., `(export "nanos" (type (eq u64)))`) which bindings generators can use to generate type aliases (e.g., `typedef uint64_t nanos;`). @@ -502,29 +527,12 @@ When [resource and handle types] are added to the explainer, `typebound` will be extended with a `sub` option (symmetric to the [type-imports] proposal) that allows importing and exporting *abstract* types. -The `func` type constructor describes a component-level function definition -that takes and returns component-level value types. In contrast to -[`core:functype`] which, as a low-level compiler target for a stack machine, -returns zero or more results, `functype` always returns a single type, with -`unit` being used for functions that don't return an interesting value -(analogous to "void" in some languages). Having a single return type simplifies -the binding of `functype` into a wide variety of source languages. As syntactic -sugar, the text format of `functype` additionally allows `result` to be absent, -interpreting this as `(result unit)`. - -The `component` type constructor is symmetric to the core `module` type -constructor, although its grammar is factored to share declarators with the -`instance` type constructor. The `import` and `export` declarator names -must be distinct within a single type. The `externtype` production shared by -the `import` and `export` declarators is symmetric to [`core:externtype`] and -includes all importable/exportable types. - -Component and instance types also include an `alias` declarator for projecting -the exports out of imported instances and sharing types with outer components. 
-As an example, the following component defines two equivalent component types, -where the former defines the function type via `type` declarator and the latter -via `alias` declarator. In both cases, the type is given index `0` since -component types start with an empty type index space. +Lastly, component and instance types also include an `alias` declarator for +projecting the exports out of imported instances and sharing types with outer +components. As an example, the following component defines two equivalent +component types, where the former defines the function type via `type` +declarator and the latter via `alias` declarator. In both cases, the type is +given index `0` since component types start with an empty type index space. ```wasm (component $C (type $C1 (component @@ -541,12 +549,6 @@ component types start with an empty type index space. ) ``` -The family of value types, `valtype`, is unified by a *single* type -constructor, `value`, that corresponds 1:1 with the `value` sort (described in -the [start definitions](#start-definitions) section below). As a type -constructor, `value` is symmetric to `global` in Core WebAssembly, but without -a mutability option. - With what's defined so far, we can define component types using a mix of inline and out-of-line type definitions: ```wasm @@ -765,7 +767,7 @@ of core linear memory. Lastly, imports and exports are defined in terms of the above as: ``` -import ::= (import ) +import ::= (import ) export ::= (export ) ``` All import and export names within a component must be unique, respectively. 
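As a reading aid for the `externdesc` binary grammar introduced by PATCH 10/27, the following is a minimal sketch of an encoder, not part of the patch. It assumes every index is a variable-length [`core:u32`] (unsigned LEB128); the function names `encode_u32` and `encode_externdesc` are hypothetical. The `value` case is left out because it embeds a `valtype`, whose encoding this sketch does not model. The `core module` sort byte is `0x10` as of this patch; PATCH 25/27 later in this series changes it to `0x11`.

```python
# Illustrative sketch only: encodes `externdesc` as written in PATCH 10/27.
# Assumption: all indices are core:u32 values, i.e. unsigned LEB128.

CORE_MODULE_SORT = 0x10  # as of PATCH 10/27; PATCH 25/27 changes this to 0x11


def encode_u32(value: int) -> bytes:
    """Encode a 32-bit unsigned integer as unsigned LEB128."""
    if not 0 <= value < 2**32:
        raise ValueError("value out of u32 range")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_externdesc(kind: str, index: int) -> bytes:
    """Encode one `externdesc` alternative that carries a single type index."""
    if kind == "core module":
        # 0x00 <core module sort> i  => (core module (type i))
        return bytes([0x00, CORE_MODULE_SORT]) + encode_u32(index)
    if kind == "type":
        # 0x03 b, with typebound b ::= 0x00 i  => (type (eq i))
        return bytes([0x03, 0x00]) + encode_u32(index)
    opcodes = {"func": 0x01, "instance": 0x04, "component": 0x05}
    if kind in opcodes:
        return bytes([opcodes[kind]]) + encode_u32(index)
    raise ValueError(f"unsupported externdesc kind: {kind!r}")
```

For example, `encode_externdesc("func", 3)` yields the two bytes `0x01 0x03`, and an index of 128 needs two LEB128 bytes (`0x80 0x01`).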
From 9d50001b21eadda59faea9e7e99a715edef93e4b Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Tue, 10 May 2022 18:28:29 -0500 Subject: [PATCH 11/27] Add better validation notes in Binary.md, normalize on 'externdesc' --- design/mvp/Binary.md | 15 ++++++++------- design/mvp/Explainer.md | 8 ++++---- 2 files changed, 12 insertions(+), 11 deletions(-) diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md index af9014c2..aaa72a62 100644 --- a/design/mvp/Binary.md +++ b/design/mvp/Binary.md @@ -84,17 +84,17 @@ export ::= n: si: => (e Notes: * Reused Core binary rules: [`core:name`] * The `core:sort` values are chosen to match the discriminant opcodes of - [`core:importdesc`] so that `core:exportdesc` (below) is identical. + [`core:importdesc`]. * `type` is added to `core:sort` in anticipation of the [type-imports] proposal. Until that proposal, core modules won't be able to actually import or export types, however, the `type` sort is allowed as part of outer aliases (below). * `module` and `instance` are added to `core:sort` in anticipation of the [module-linking] - proposal, which would add these types to Core WebAssembly. Again, core modules won't be - able to actually import or export modules/instances, but they are used for aliases. + proposal, which would add these types to Core WebAssembly. Until then, they are useful + for aliases (below). +* Validation of `core:instantiatearg` would initially only allow the `instance` + sort, but would be extended to accept other sorts as core wasm is extended. * The indices in `sortidx` are validated according to their `sort`'s index spaces, which are built incrementally as each definition is validated. -* The types of arguments supplied by `instantiate` are validated against the - types of the matching import according to the [subtyping](Subtyping.md) rules. ## Alias Definitions @@ -269,12 +269,13 @@ flags are set. (See [Import and Export Definitions](Explainer.md#import-and-export-definitions) in the explainer.) 
``` -import ::= n: et: => (import n et) +import ::= n: ed: => (import n ed) export ::= n: si: => (export n si) ``` Notes: * Validation requires all import and export `name`s are unique. - +* Validation requires any exported `sortidx` to have a valid `externdesc` + (which disallows core sorts other than `core module`). [`core:section`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-section diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 978b2e88..4cf5fafe 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -355,7 +355,7 @@ core:moduletype ::= (module ? *) core:moduledecl ::= | | -core:importdecl ::= (import ) (WebAssembly 1.0) +core:importdecl ::= (import ) core:exportdecl ::= (export ) core:exportdesc ::= strip-id() @@ -418,9 +418,9 @@ instancedecl ::= | | importdecl ::= (import ) -exportdecl ::= (export ) -importdesc ::= bind-id() -exportdesc ::= ( (type ) ) +exportdecl ::= (export ) +importdesc ::= bind-id() +externdesc ::= ( (type ) ) | core-prefix() | | From 45f433b59f951f03199041b2d7f784e201319a14 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Wed, 11 May 2022 10:21:44 -0500 Subject: [PATCH 12/27] s/varu32/u32/ because to match actual core wasm spec --- design/mvp/Binary.md | 14 ++++++++------ design/mvp/Explainer.md | 18 +++++++++--------- 2 files changed, 17 insertions(+), 15 deletions(-) diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md index aaa72a62..73e53c3c 100644 --- a/design/mvp/Binary.md +++ b/design/mvp/Binary.md @@ -58,7 +58,7 @@ core:instance ::= ie: => (i core:instanceexpr ::= 0x00 m: arg*:vec() => (instantiate m arg*) | 0x01 e*:vec() => e* core:instantiatearg ::= n: si: => (with n si) -core:sortidx ::= sort: idx: => (sort idx) +core:sortidx ::= sort: idx: => (sort idx) core:sort ::= 0x00 => func | 0x01 => table | 0x02 => memory @@ -72,7 +72,7 @@ instance ::= ie: => (i instanceexpr ::= 0x00 c: arg*:vec() => (instantiate c arg*) | 0x01 e*:vec() => e* instantiatearg ::= n: si: => 
(with n si) -sortidx ::= sort: idx: => (sort idx) +sortidx ::= sort: idx: => (sort idx) sort ::= 0x00 cs: => core cs | 0x01 => func | 0x02 => value @@ -82,7 +82,7 @@ sort ::= 0x00 cs: => co export ::= n: si: => (export n si) ``` Notes: -* Reused Core binary rules: [`core:name`] +* Reused Core binary rules: [`core:name`], (variable-length encoded) [`core:u32`] * The `core:sort` values are chosen to match the discriminant opcodes of [`core:importdesc`]. * `type` is added to `core:sort` in anticipation of the [type-imports] proposal. Until that @@ -105,9 +105,10 @@ core:aliastarget ::= 0x00 i: n: => export i n alias ::= sort: target: => (alias target (sort)) aliastarget ::= 0x00 i: n: => export i n - | 0x01 ct: idx: => outer ct idx + | 0x01 ct: idx: => outer ct idx ``` Notes: +* Reused Core binary rules: (variable-length encoded) [`core:u32`] * For `export` aliases, `i` is validated to refer to an instance in the instance index space that exports `n` with the specified `sort`. * For `outer` aliases, `ct` is validated to be *less or equal than* the number @@ -174,7 +175,7 @@ defvaltype ::= pvt: => pvt | 0x69 t: u: => (expected t u) field ::= n: t: => (field n t) case ::= n: t: 0x0 => (case n t) - | n: t: 0x1 i: => (case n t (refines case-label[i])) + | n: t: 0x1 i: => (case n t (refines case-label[i])) valtype ::= i: => i | pvt: => pvt functype ::= 0x40 param*:vec() t: => (func param* (result t)) @@ -227,7 +228,7 @@ canonopt ::= 0x00 => string-encod ``` Notes: * The second `0x00` byte in `canon` stands for the `func` sort and thus the - `0x00 ` pair standards for a `func` `sortidx` or `core:sortidx`. + `0x00 ` pair stands for a `func` `sortidx` or `core:sortidx`. * Validation prevents duplicate or conflicting `canonopt`. * Validation of `canon lift` requires `f` to have type `flatten(ft)` (defined by the [Canonical ABI](CanonicalABI.md#flattening)). The function being @@ -278,6 +279,7 @@ Notes: (which disallows core sorts other than `core module`). 
+[`core:u32`]: https://webassembly.github.io/spec/core/binary/values.html#integers [`core:section`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-section [`core:custom`]: https://webassembly.github.io/spec/core/binary/modules.html#custom-section [`core:module`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-module diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 4cf5fafe..2c9f5207 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -133,7 +133,7 @@ core:instanceexpr ::= (instantiate *) | * core:instantiatearg ::= (with ) | (with (instance *)) -core:sortidx ::= ( ) +core:sortidx ::= ( ) core:sort ::= func | table | memory @@ -152,7 +152,7 @@ core modules are resolved as follows: core definition. Each `core:sort` corresponds 1:1 with a distinct [index space] that contains -only core definitions of that *sort*. The `varu32` field of `core:sortidx` +only core definitions of that *sort*. The `u32` field of `core:sortidx` indexes into the sort's associated index space to select a definition. Based on this, we can link two core modules `$A` and `$B` together with the @@ -188,7 +188,7 @@ instanceexpr ::= (instantiate *) | * instantiatearg ::= (with ) | (with (instance *)) -sortidx ::= ( ) +sortidx ::= ( ) sort ::= core-prefix() | func | value @@ -228,7 +228,7 @@ core:aliastarget ::= export alias ::= (alias ( ?)) aliastarget ::= export - | outer + | outer ``` The `core:sort`/`sort` immediate of the alias specifies which index space in the target component is being read from and which index space of the containing @@ -239,10 +239,10 @@ used. In the case of `export` aliases, validation ensures `name` is an export in the target instance and has a matching sort. -In the case of `outer` aliases, the `varu32` pair serves as a [de Bruijn -index], with first `varu32` being the number of enclosing components to skip -and the second `varu32` being an index into the target component's sort's index -space. 
In particular, the first `varu32` can be `0`, in which case the outer +In the case of `outer` aliases, the `u32` pair serves as a [de Bruijn +index], with first `u32` being the number of enclosing components to skip +and the second `u32` being an index into the target component's sort's index +space. In particular, the first `u32` can be `0`, in which case the outer alias refers to the current component. To maintain the acyclicity of module instantiation, outer aliases are only allowed to refer to *preceding* outer definitions. @@ -420,7 +420,7 @@ instancedecl ::= importdecl ::= (import ) exportdecl ::= (export ) importdesc ::= bind-id() -externdesc ::= ( (type ) ) +externdesc ::= ( (type ) ) | core-prefix() | | From 2e7167610db9dd3427074baa2fc31ab8f0d60035 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Wed, 11 May 2022 10:47:32 -0500 Subject: [PATCH 13/27] Clamp down core (with ...) expressions to just the 'instance' sort --- design/mvp/Binary.md | 4 ++-- design/mvp/Explainer.md | 6 ++++-- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md index 73e53c3c..5faf204d 100644 --- a/design/mvp/Binary.md +++ b/design/mvp/Binary.md @@ -57,7 +57,7 @@ Notes: core:instance ::= ie: => (instance ie) core:instanceexpr ::= 0x00 m: arg*:vec() => (instantiate m arg*) | 0x01 e*:vec() => e* -core:instantiatearg ::= n: si: => (with n si) +core:instantiatearg ::= n: 0x11 i: => (with n (instance i)) core:sortidx ::= sort: idx: => (sort idx) core:sort ::= 0x00 => func | 0x01 => table @@ -91,7 +91,7 @@ Notes: * `module` and `instance` are added to `core:sort` in anticipation of the [module-linking] proposal, which would add these types to Core WebAssembly. Until then, they are useful for aliases (below). 
-* Validation of `core:instantiatearg` would initially only allow the `instance` +* Validation of `core:instantiatearg` initially only allows the `instance` sort, but would be extended to accept other sorts as core wasm is extended. * The indices in `sortidx` are validated according to their `sort`'s index spaces, which are built incrementally as each definition is validated. diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 2c9f5207..2c501c8b 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -131,7 +131,7 @@ The syntax for defining a core module instance is: core:instance ::= (instance ? ) core:instanceexpr ::= (instantiate *) | * -core:instantiatearg ::= (with ) +core:instantiatearg ::= (with (instance )) | (with (instance *)) core:sortidx ::= ( ) core:sort ::= func @@ -146,7 +146,9 @@ core:export ::= (export ) When instantiating a module via `instantiate`, the two-level imports of the core modules are resolved as follows: 1. The first `name` of the import is looked up in the named list of - `core:instantiatearg` to select a core module instance. + `core:instantiatearg` to select a core module instance. (In the future, + other `core:sort`s could be allowed if core wasm adds single-level + imports.) 2. The second `name` of the import is looked up in the named list of exports of the core module instance found by the first step to select the imported core definition. From e84e499df60fc095f1fb5f1f05bbe73b0f3706d4 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Wed, 11 May 2022 10:51:04 -0500 Subject: [PATCH 14/27] Remove dangling ? --- design/mvp/Explainer.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 2c501c8b..ac59b46d 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -353,7 +353,7 @@ core:deftype ::= (WebAssembly 1.0) | (GC proposal) | (GC proposal) | -core:moduletype ::= (module ? 
*) +core:moduletype ::= (module *) core:moduledecl ::= | | From e3e1a9852dbe9d935c103304df09bab0456c837f Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Wed, 11 May 2022 10:52:50 -0500 Subject: [PATCH 15/27] Fix typo in core:exportdesc --- design/mvp/Explainer.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index ac59b46d..824b0430 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -359,7 +359,7 @@ core:moduledecl ::= | core:importdecl ::= (import ) core:exportdecl ::= (export ) -core:exportdesc ::= strip-id() +core:exportdesc ::= strip-id() where strip-id(X) parses '(' sort Y ')' when X parses '(' sort ? Y ')' ``` From 5934e70230be9d7e92aec35ad4a8feda451e8db5 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Wed, 11 May 2022 12:03:17 -0500 Subject: [PATCH 16/27] Improve explanation of type imports and fresh type index spaces --- design/mvp/Explainer.md | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 824b0430..93ef4b51 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -378,11 +378,24 @@ module-type context, import and export declarators can both reuse the existing difference being that, in the text format, `core:importdesc` can bind an identifier for later reuse while `core:exportdesc` cannot. -In preparation for the forthcoming addition of [type-imports] to Core -WebAssembly, module types start with an empty type index space so that the type -index space can be populated with fresh type definitions constructed from type -imports. Thus, `core:moduledecl` also includes a `type` declarator for defining -the types used by the `import` and `export` declarators. +With the Core WebAssembly [type-imports], module types will need the ability to +define the types of exports based on the types of imports. 
In preparation for +this, module types start with an empty type index space that is populated by +`type` declarators, so that, in the future, these `type` declarators can refer to +type imports local to the module type itself. For example, in the future, the +following module type would be expressible: +``` +(component $C + (type $M (module + (import "" "T" (type $T)) + (type $PairT (struct (field (ref $T)) (field (ref $T)))) + (export "make_pair" (func (param (ref $T)) (result (ref $PairT)))) + )) +) +``` +In this example, `$M` has a distinct type index space from `$C`, where element +0 is the imported type, element 1 is the `struct` type, and element 2 is an +implicitly-created `func` type referring to both. Component-level type definitions are symmetric to core-level type definitions, but use a completely different set of value types. Unlike [`core:valtype`] From 43a615682a736a24479f014bd8ee888691942a67 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Wed, 11 May 2022 12:39:06 -0500 Subject: [PATCH 17/27] s/Bottom type/Empty type/ --- design/mvp/Explainer.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 93ef4b51..156f5950 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -491,7 +491,7 @@ defined by the following mapping: ``` Note that, at least initially, variants are required to have a non-empty list of cases. This could be relaxed in the future to allow an empty list of cases, with -the empty `(variant)` effectively serving as a [bottom type] and indicating +the empty `(variant)` effectively serving as an [empty type] and indicating unreachability. 
The remaining 3 type constructors in `deftype` use `valtype` to describe @@ -1080,7 +1080,7 @@ and will be added over the coming months to complete the MVP proposal: [De Bruijn Index]: https://en.wikipedia.org/wiki/De_Bruijn_index [Closure]: https://en.wikipedia.org/wiki/Closure_(computer_programming) -[Bottom Type]: https://en.wikipedia.org/wiki/Bottom_type +[Empty Type]: https://en.wikipedia.org/w/index.php?title=Empty_type [IEEE754]: https://en.wikipedia.org/wiki/IEEE_754 [NaN]: https://en.wikipedia.org/wiki/NaN [NaN Boxing]: https://wingolog.org/archives/2011/05/18/value-representation-in-javascript-implementations From 88816fc721529c0613f80f73512ed618b2b868d7 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Wed, 11 May 2022 12:57:45 -0500 Subject: [PATCH 18/27] Tweak wording around type imports/exports rationale --- design/mvp/Explainer.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 156f5950..6957cedd 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -534,9 +534,8 @@ type. There are two main use cases for this in the short-term: * Type exports allow a component or interface to associate a name with a structural type (e.g., `(export "nanos" (type (eq u64)))`) which bindings generators can use to generate type aliases (e.g., `typedef uint64_t nanos;`). -* Type imports and exports allow a component to explicitly specify the - type parameters used to monomorphize a generic interface being imported - or exported. +* Type imports and exports can provide additional information to toolchains and + runtimes for defining the behavior of host APIs. 
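The first use case above (a type export such as `(export "nanos" (type (eq u64)))` turning into `typedef uint64_t nanos;`) can be sketched as follows. This is an illustration under stated assumptions, not how any particular bindings generator works: the `C_TYPES` table and the `c_typedef` helper are hypothetical, and only a few primitive value types with an obvious C counterpart are covered.

```python
# Illustrative sketch only: maps an `eq`-bounded type export of a primitive
# value type to a C typedef. The table and helper name are hypothetical.

C_TYPES = {
    "bool": "bool",
    "s8": "int8_t", "u8": "uint8_t",
    "s16": "int16_t", "u16": "uint16_t",
    "s32": "int32_t", "u32": "uint32_t",
    "s64": "int64_t", "u64": "uint64_t",
    "float32": "float", "float64": "double",
}


def c_typedef(export_name: str, bound_type: str) -> str:
    """Return a C typedef for `(export <name> (type (eq <bound_type>)))`."""
    if bound_type not in C_TYPES:
        raise ValueError(f"no C mapping sketched for {bound_type!r}")
    return f"typedef {C_TYPES[bound_type]} {export_name};"


print(c_typedef("nanos", "u64"))  # prints: typedef uint64_t nanos;
```

Compound types (`record`, `variant`, `list`, and so on) would need generated struct or union definitions and are deliberately out of scope here.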
When [resource and handle types] are added to the explainer, `typebound` will be extended with a `sub` option (symmetric to the [type-imports] proposal) that From 1d8607691fa43871de6d4da333a91ff431d162dd Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Wed, 11 May 2022 14:29:39 -0500 Subject: [PATCH 19/27] Don't use core-prefix in --- design/mvp/Explainer.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 6957cedd..4930f895 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -191,7 +191,7 @@ instanceexpr ::= (instantiate *) instantiatearg ::= (with ) | (with (instance *)) sortidx ::= ( ) -sort ::= core-prefix() +sort ::= core | func | value | type From 61314cf0b10f4b20a9c931a27af5ad7e2e20fb2f Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Wed, 11 May 2022 14:38:21 -0500 Subject: [PATCH 20/27] Fix bug in example --- design/mvp/Explainer.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 4930f895..74d9fb9e 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -551,14 +551,14 @@ given index `0` since component types start with an empty type index space. 
(component $C (type $C1 (component (type (func (param string) (result string))) - (import "a" "b" (func (type 0))) - (export "c" (func (type 0))) + (import "a" (func (type 0))) + (export "b" (func (type 0))) )) (type $F (func (param string) (result string))) (type $C2 (component (alias outer $C $F (type)) - (import "a" "b" (func (type 0))) - (export "c" (func (type 0))) + (import "a" (func (type 0))) + (export "b" (func (type 0))) )) ) ``` From a0eb04369903047edc493d8472ec58cbf587d8c5 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Wed, 11 May 2022 14:41:17 -0500 Subject: [PATCH 21/27] Update example to match explicit sort in exportdesc --- design/mvp/Explainer.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 74d9fb9e..4899de38 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -804,7 +804,7 @@ exports other components: )) (instance $d2 (instantiate $D (with "c" (instance - (export "f" (func $d1 "g")) + (export "f" (func (func $d1 "g"))) )) )) (export "d2" (instance $d2)) From 89ed4355005c06040d00b811d1009121b9076616 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Wed, 11 May 2022 17:44:24 -0500 Subject: [PATCH 22/27] Revert previous; update inline alias syntax description to match all the examples --- design/mvp/Explainer.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md index 4899de38..ba849016 100644 --- a/design/mvp/Explainer.md +++ b/design/mvp/Explainer.md @@ -266,13 +266,14 @@ Both kinds of aliases come with syntactic sugar for implicitly declaring them inline: For `export` aliases, the inline sugar has the form `(sort +)` -and can be used anywhere a `sort` index appears in the AST. For example, the -following snippet uses an inline function alias: +and can be used in place of a `sortidx` or any sort-specific index (such as a +`typeidx` or `funcidx`). 
For example, the following snippet uses two inline +function aliases: ```wasm (instance $j (instantiate $J (with "f" (func $i "f")))) -(export "x" (func (func $j "g" "h"))) +(export "x" (func $j "g" "h")) ``` -which is desugared into: +which are desugared into: ```wasm (alias export $i "f" (func $f_alias)) (instance $j (instantiate $J (with "f" (func $f_alias)))) @@ -804,7 +805,7 @@ exports other components: )) (instance $d2 (instantiate $D (with "c" (instance - (export "f" (func (func $d1 "g"))) + (export "f" (func $d1 "g")) )) )) (export "d2" (instance $d2)) From f3d60a233c79efb7a5ab6b4f4905af62da287f19 Mon Sep 17 00:00:00 2001 From: Luke Wagner Date: Sat, 21 May 2022 01:23:49 +0200 Subject: [PATCH 23/27] Tweak validation wording Co-authored-by: Peter Huene --- design/mvp/Binary.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md index 5faf204d..ba20cbb2 100644 --- a/design/mvp/Binary.md +++ b/design/mvp/Binary.md @@ -203,7 +203,7 @@ Notes: with type opcodes starting at SLEB128(-1) (`0x7f`) and going down, reserving the nonnegative SLEB128s for type indices. * Validation of `valtype` requires the `typeidx` to refer to a `defvaltype`. -* Validation of `moduledecl` (currently) only allows `outer` `type` `alias` +* Validation of `instancedecl` (currently) only allows `outer` `type` `alias` declarators. * As described in the explainer, each component and instance type is validated with an initially-empty type index space. 
Outer aliases can be used to pull
From 33f8e37e5d3c3abe193e67b89f48cd6ff27df0f3 Mon Sep 17 00:00:00 2001
From: Luke Wagner
Date: Sat, 21 May 2022 01:24:36 +0200
Subject: [PATCH 24/27] Avoid EH conflicts in binary encoding of core:sort

Co-authored-by: Peter Huene

---
 design/mvp/Binary.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md
index ba20cbb2..9128b10d 100644
--- a/design/mvp/Binary.md
+++ b/design/mvp/Binary.md
@@ -63,9 +63,9 @@ core:sort ::= 0x00 => fu
     | 0x01 => table
     | 0x02 => memory
     | 0x03 => global
-    | 0x04 => type
-    | 0x10 => module
-    | 0x11 => instance
+    | 0x10 => type
+    | 0x11 => module
+    | 0x12 => instance
 core:export ::= n: si: => (export n si)

 instance ::= ie: => (instance ie)
From 080f4c3f8fd53c7e512cbf665238f4baeada5d3b Mon Sep 17 00:00:00 2001
From: Luke Wagner
Date: Mon, 23 May 2022 14:21:01 -0500
Subject: [PATCH 25/27] Sync externdesc with preceding binary format opcode change

Co-authored-by: Peter Huene

---
 design/mvp/Binary.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md
index 9128b10d..06a0fe9b 100644
--- a/design/mvp/Binary.md
+++ b/design/mvp/Binary.md
@@ -190,7 +190,7 @@ instancedecl ::= 0x01 t: => t
     | 0x03 ed: => ed
 importdecl ::= n: ed: => (import n ed)
 exportdecl ::= n: ed: => (export n ed)
-externdesc ::= 0x00 0x10 i: => (core module (type i))
+externdesc ::= 0x00 0x11 i: => (core module (type i))
     | 0x01 i: => (func (type i))
     | 0x02 t: => (value t)
     | 0x03 b: => (type b)
From 7401e4c80bf9ea8af7a8a67f64d36f838890a7b6 Mon Sep 17 00:00:00 2001
From: Luke Wagner
Date: Tue, 24 May 2022 16:58:23 -0500
Subject: [PATCH 26/27] Sync core:instantiatearg with preceding binary format opcode change

Co-authored-by: Peter Huene

---
 design/mvp/Binary.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md
index 06a0fe9b..8354d913 100644
--- a/design/mvp/Binary.md
+++ b/design/mvp/Binary.md
@@ -57,7 +57,7 @@ Notes:
 core:instance ::= ie: => (instance ie)
 core:instanceexpr ::= 0x00 m: arg*:vec() => (instantiate m arg*)
     | 0x01 e*:vec() => e*
-core:instantiatearg ::= n: 0x11 i: => (with n (instance i))
+core:instantiatearg ::= n: 0x12 i: => (with n (instance i))
 core:sortidx ::= sort: idx: => (sort idx)
 core:sort ::= 0x00 => func
     | 0x01 => table
From 49fb1171a49259e1aa6dd74e8f05feb8df856306 Mon Sep 17 00:00:00 2001
From: Luke Wagner
Date: Tue, 24 May 2022 18:04:47 -0500
Subject: [PATCH 27/27] Make in 'canon lift' symmetric to imports

---
 design/mvp/Binary.md                          |  2 +-
 design/mvp/Explainer.md                       | 38 ++++++++++---------
 .../SharedEverythingDynamicLinking.md         | 12 +++---
 3 files changed, 27 insertions(+), 25 deletions(-)

diff --git a/design/mvp/Binary.md b/design/mvp/Binary.md
index 8354d913..d3c7918b 100644
--- a/design/mvp/Binary.md
+++ b/design/mvp/Binary.md
@@ -216,7 +216,7 @@ Notes:
 (See [Canonical Definitions](Explainer.md#canonical-definitions) in the
 explainer.)
 ```
-canon ::= 0x00 0x00 f: ft: opts: => (canon lift f type-index-space[ft] opts (func))
+canon ::= 0x00 0x00 f: opts: ft: => (canon lift f opts type-index-space[ft])
     | 0x01 0x00 f: opts: => (canon lower f opts (core func))
 opts ::= opt*:vec() => opt*
 canonopt ::= 0x00 => string-encoding=utf8
diff --git a/design/mvp/Explainer.md b/design/mvp/Explainer.md
index ba849016..29957749 100644
--- a/design/mvp/Explainer.md
+++ b/design/mvp/Explainer.md
@@ -433,9 +433,8 @@ componentdecl ::=
 instancedecl ::=
     |
     |
-importdecl ::= (import )
+importdecl ::= (import bind-id())
 exportdecl ::= (export )
-importdesc ::= bind-id()
 externdesc ::= ( (type ) )
     | core-prefix()
     |
@@ -524,14 +523,14 @@ text format allows both references to out-of-line type definitions (via
 `(type )`) and inline type expressions that the text format desugars
 into out-of-line type definitions.

-The `value` case of `importdesc`/`exportdesc` describes a runtime value
-that is imported or exported at instantiation time as described in the [start
-definitions](#start-definitions) section below.
+The `value` case of `externdesc` describes a runtime value that is imported or
+exported at instantiation time as described in the
+[start definitions](#start-definitions) section below.

-The `type` case of `importdesc`/`exportdesc` describes an imported or exported
-type along with its bounds. The bounds currently only have an `eq` option that
-says that the imported/exported type must be exactly equal to the referenced
-type. There are two main use cases for this in the short-term:
+The `type` case of `externdesc` describes an imported or exported type along
+with its bounds. The bounds currently only have an `eq` option that says that
+the imported/exported type must be exactly equal to the referenced type. There
+are two main use cases for this in the short-term:
 * Type exports allow a component or interface to associate a name with a
   structural type (e.g., `(export "nanos" (type (eq u64)))`) which bindings
   generators can use to generate type aliases (e.g., `typedef uint64_t nanos;`).
@@ -611,7 +610,7 @@ two directions:
 Canonical definitions specify one of these two wrapping directions, the function
 to wrap and a list of configuration options:
 ```
-canon ::= (canon lift core-prefix() * (func ?))
+canon ::= (canon lift core-prefix() * bind-id())
     | (canon lower * (core func ?))
 canonopt ::= string-encoding=utf8
     | string-encoding=utf16
@@ -620,6 +619,10 @@ canonopt ::= string-encoding=utf8
     | (realloc core-prefix())
     | (post-return core-prefix())
 ```
+While the production `externdesc` accepts any `sort`, the validation rules
+for `canon lift` would only allow the `func` sort. In the future, other sorts
+may be added (viz., types), hence the explicit sort.
+
 The `string-encoding` option specifies the encoding the Canonical ABI will use
 for the `string` type.
The `latin1+utf16` encoding captures a common string
 encoding across Java, JavaScript and .NET VMs and allows a dynamic choice
@@ -672,9 +675,9 @@ stack-switching in component function signatures.
 Similar to the `import` and `alias` abbreviations shown above, `canon`
 definitions can also be written in an inverted form that puts the sort first:
 ```wasm
-      (func $f (import "i" "f")) ≡ (import "i" "f" (func $f)) (WebAssembly 1.0)
-      (func $h (canon lift ...)) ≡ (canon lift ... (func $h))
-(core func $h (canon lower ...)) ≡ (canon lower ... (core func $h))
+      (func $f ...type... (import "i" "f")) ≡ (import "i" "f" (func $f ...type...)) (WebAssembly 1.0)
+      (func $h ...type... (canon lift ...)) ≡ (canon lift ... (func $h ...type...))
+(core func $h ...type... (canon lower ...)) ≡ (canon lower ... (core func $h ...type...))
 ```
 Note: in the future, `canon` may be generalized to define other sorts than
 functions (such as types), hence the explicit `sort`.
@@ -707,11 +710,11 @@ takes a string, does some logging, then returns a string.
     (with "libc" (instance $libc))
     (with "wasi:logging" (instance (export "log" (func $log))))
   ))
-  (func (export "run") (canon lift
+  (func $run (param string) (result string) (canon lift
     (core func $main "run")
-    (func (param string) (result string))
     (memory (core memory $libc "mem")) (realloc (core func $libc "realloc"))
   ))
+  (export "run" (func $run))
 )
 ```
 This example shows the pattern of splitting out a reusable language runtime
@@ -764,9 +767,8 @@ exported string at instantiation time:
     )
   )
   (core instance $main (instantiate $Main (with "libc" (instance $libc))))
-  (func $start (canon lift
+  (func $start (param string) (result string) (canon lift
     (core func $main "start")
-    (func (param string) (result string))
     (memory (core memory $libc "mem")) (realloc (core func $libc "realloc"))
   ))
   (start $start (value $name) (result (value $greeting)))
@@ -782,7 +784,7 @@ of core linear memory.

 Lastly, imports and exports are defined in terms of the above as:
 ```
-import ::= (import )
+import ::=
 export ::= (export )
 ```
 All import and export names within a component must be unique, respectively.
diff --git a/design/mvp/examples/SharedEverythingDynamicLinking.md b/design/mvp/examples/SharedEverythingDynamicLinking.md
index 30f75901..2ccfd4b5 100644
--- a/design/mvp/examples/SharedEverythingDynamicLinking.md
+++ b/design/mvp/examples/SharedEverythingDynamicLinking.md
@@ -157,11 +157,11 @@ would look like:
     (with "libc" (instance $libc))
     (with "libzip" (instance $libzip))
   ))
-  (func (export "zip") (canon lift
+  (func $zip (param (list u8)) (result (list u8)) (canon lift
     (func $main "zip")
-    (func (param (list u8)) (result (list u8)))
     (memory (memory $libc "memory")) (realloc (func $libc "realloc"))
   ))
+  (export "zip" (func $zip))
 )
 ```
 Here, `zipper` links its own private module code (`$Main`) with the shareable
@@ -236,11 +236,11 @@ component-aware `clang`, the resulting component would look like:
     (with "libc" (instance $libc))
     (with "libimg" (instance $libimg))
   ))
-  (func (export "transform") (canon lift
+  (func $transform (param (list u8)) (result (list u8)) (canon lift
     (func $main "transform")
-    (func (param (list u8)) (result (list u8)))
     (memory (memory $libc "memory")) (realloc (func $libc "realloc"))
   ))
+  (export "transform" (func $transform))
 )
 ```
 Here, we see the general pattern emerging of the dependency DAG between
@@ -296,11 +296,11 @@ components.
The resulting component could look like:
     (with "zipper" (instance (export "zip" (func $zipper "zip"))))
     (with "imgmgk" (instance (export "transform" (func $imgmgk "transform"))))
   ))
-  (func (export "run") (canon lift
+  (func $run (param string) (result string) (canon lift
     (func $main "run")
-    (func (param string) (result string))
     (memory (memory $libc "memory")) (realloc (func $libc "realloc"))
   ))
+  (export "run" (func $run))
 )
 ```
 Note here that `$Libc` is passed to the nested `zipper` and `imgmgk` instances