32 changes: 17 additions & 15 deletions BinaryEncoding.md
@@ -317,18 +317,20 @@ It is legal to have several entries with the same type.
| Name | Opcode | Immediate | Description |
| ---- | ---- | ---- | ---- |
| `nop` | `0x00` | | no operation |
-| `block` | `0x01` | count = `varuint32` | a sequence of expressions, the last of which yields a value |
-| `loop` | `0x02` | count = `varuint32` | a block which can also form control flow loops |
-| `if` | `0x03` | | high-level one-armed if |
-| `if_else` | `0x04` | | high-level two-armed if |
+| `block` | `0x01` | | begin a sequence of expressions, the last of which yields a value |
+| `loop` | `0x02` | | begin a block which can also form control flow loops |
+| `if` | `0x03` | | begin if expression |
+| `else` | `0x04` | | begin else expression of if |
| `select` | `0x05` | | select one of two values based on condition |
| `br` | `0x06` | relative_depth = `varuint32` | break that targets a outer nested block |
| `br_if` | `0x07` | relative_depth = `varuint32` | conditional break that targets a outer nested block |
| `br_table` | `0x08` | see below | branch table control flow construct |
-| `return` | `0x14` | | return zero or one value from this function |
-| `unreachable` | `0x15` | | trap immediately |
+| `return` | `0x09` | | return zero or one value from this function |
+| `unreachable` | `0x0a` | | trap immediately |
+| `end` | `0x0f` | | end a block, loop, or if |
Member

In my postorder design, `if` has its own `endif` opcode. `block` and `loop` ends have an arity immediate, while in `if`+`else`+`endif` the arity immediate goes on the `if` to avoid being redundant between the `else` and the `endif`. Does your design omit arity immediates on end markers, and if so, how do you determine what to leave on the stack after a block exit?

A new local AST node stack is created at the start of each effective block (including the `if` branches). At the end of the block, all the remaining stack elements are the top-level expressions of that block, so an immediate expression count is not needed.
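
The per-block node stack described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the PR's implementation: the operator tuples, the `ARITY` table, and the `decode` function are hypothetical names, and only `block`, `end`, and three simple operators are modelled.

```python
# Hypothetical operand counts for a few operators (illustration only).
ARITY = {"nop": 0, "i32.const": 0, "i32.add": 2}


def decode(ops):
    """Build AST nodes from a postorder operator stream.

    Each op is a tuple (name, *immediates). A fresh node stack is opened at
    `block` and closed at `end`; whatever remains on it becomes the block's
    children, so `end` needs no expression-count immediate.
    """
    stacks = [[]]  # stacks[-1] is the node stack of the innermost open block
    for name, *immediates in ops:
        if name == "block":
            stacks.append([])
        elif name == "end":
            if len(stacks) == 1:
                raise ValueError("end without a matching block")
            children = stacks.pop()
            stacks[-1].append(("block", *children))
        else:
            n = ARITY[name]
            stack = stacks[-1]
            if n > len(stack):
                raise ValueError(f"{name} needs {n} operands")
            # Take the top n nodes as operands, keeping their push order.
            operands = stack[len(stack) - n:]
            del stack[len(stack) - n:]
            stack.append((name, *immediates, *operands))
    if len(stacks) != 1:
        raise ValueError("unterminated block")
    return stacks[0]
```

Decoding `block`, `nop`, `i32.const 55`, `end` with this sketch yields one `block` node with two children, which is how the child count is recovered without an immediate.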

Member

Are you saying that the end of the block unconditionally pushes the last AST node of the block onto the AST node stack? This seems inconsistent with the arity immediate used in `br` and `br_if`. If the arity is 0, they have no result-value operands, rather than just unconditionally having one operand.

Author

On Tue, Mar 29, 2016 at 4:33 PM, Dan Gohman notifications@github.com wrote:

> In BinaryEncoding.md #628 (comment):
>
> | select | 0x05 | | select one of two values based on condition |
> | br | 0x06 | relative_depth = varuint32 | break that targets a outer nested block |
> | br_if | 0x07 | relative_depth = varuint32 | conditional break that targets a outer nested block |
> | br_table | 0x08 | see below | branch table control flow construct |
> -| return | 0x14 | | return zero or one value from this function |
> -| unreachable | 0x15 | | trap immediately |
> +| return | 0x09 | | return zero or one value from this function |
> +| unreachable | 0x0a | | trap immediately |
> +| end | 0x0f | | end a block, loop, or if |
>
> In my postorder design, if has its own endif opcode. block and loop ends
> have an arity immediate, while in if+else+endif the arity immediate goes
> on the if to avoid being redundant between the else and the endif. Does
> your design omit arity immediates on end markers, and if so, how do you
> determine what to leave on the stack after a block exit?

Yes, I omit the arities, so that when you fall off the end of the block,
you can have at most one return value (i.e. the last value on the stack). I
think that's sufficient; if you want to use a multi-value block in the
future, you have to use a br with arity. (And I'm still not convinced that
multi-arity blocks and ifs will be a thing in the future).



At the end of a block, the last AST node on that block's AST node stack is the result node of the block, and the other AST nodes are nodes whose expression results are discarded. If decoding into an AST, they are all child nodes of the block, and the break operators do not affect the number of child expressions that a block has.

The semantics of the arity immediate used in the break operators is a matter I question! If 0 is interpreted as representing a single `nop` argument and 1 as representing the value of a single expression, then it works just fine. In the case of 0 there is an AST node representing zero values, and in the case of 1 there is an AST node for the expression. But these do not affect the number of child nodes of a block, which I presumed was what your 'arity immediate' encoded. Or was it a type declaration for the block?

Member

@JSStats I have an implementation of a postorder encoder and decoder using one-value-per-stack-entry, and it: works well, has an elegant path to supporting multi-value expressions (by pushing each value on the stack separately), and is indeed focused on single-pass validation and SSA construction (stack entries represent SSA definitions -- they literally hold an MDefinition* in SM).

Instead of having separate `break0` and `break1` operators, my design uses the arity immediate proposal, but with a value-oriented interpretation: instead of specifying how many AST children are present, the arity field specifies how many values to expect. From an SSA construction perspective, actual values are what matter, and void "values" don't contribute to the task at hand.
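
A rough sketch of the value-oriented reading of the arity immediate, assuming a stack that holds one entry per value. The helper name `pop_branch_values` and the string placeholders for SSA definitions are hypothetical and only illustrate the idea; they are not taken from any real implementation.

```python
def pop_branch_values(value_stack, arity):
    """Remove and return the `arity` values a branch carries, in push order.

    The arity counts values, not AST children: a void expression such as
    `nop` pushes nothing, so it never appears here.
    """
    if arity > len(value_stack):
        raise ValueError("branch arity exceeds the values available")
    split = len(value_stack) - arity
    values = value_stack[split:]
    del value_stack[split:]
    return values


stack = ["def0", "def1"]              # two SSA definitions already pushed
no_values = pop_branch_values(stack, 0)   # br with arity 0: takes nothing
one_value = pop_branch_values(stack, 1)   # br with arity 1: takes the top value
```

With arity 0 the stack is left untouched, which is the difference from a node-oriented reading where a void operand would still occupy a slot.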

@sunfishcode You can have your stack of defs and avoid all the zero-value restrictions by simply pushing an integer count of the number of defs per AST node on the stack after the values - this would just be an implementation detail of how SM represents nodes during decoding. For example:

(nop) -> stack: 0
(i32.const 55) -> stack: (i32.const 55) 1
(block (nop) (i32.const 55)) -> stack: 0 (i32.const 55) 1

This scales to more than one value:

(tuple (i32.const 55) (i32.const 56)) -> stack: (i32.const 55) (i32.const 56) 2

And allows:

(i32.eqz (call $fn_returning_two_values_i32_f64))

That does not seem like a big burden if it avoids all the AST expressiveness restrictions and avoids bloating the encoding with more annotations. Would you have any concerns with this solution?
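
The count-after-values scheme proposed above might look roughly like this in Python. The helpers `push_node` and `pop_node` are hypothetical names, and strings stand in for the definitions a real decoder would hold.

```python
def push_node(stack, values):
    """Push one AST node's values, followed by how many values it produced."""
    stack.extend(values)
    stack.append(len(values))


def pop_node(stack):
    """Pop one AST node: read its count, then remove that many values."""
    count = stack.pop()
    split = len(stack) - count
    values = stack[split:]
    del stack[split:]
    return values


stack = []
push_node(stack, [])                  # (nop) produces zero values
push_node(stack, ["i32.const 55"])    # one value
# stack is now [0, "i32.const 55", 1], matching the block example above
```

Because every node ends with its count, a consumer can pop whole nodes, including zero-value and multi-value ones, without any arity annotation in the encoding.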

Member

My concern would be that these extra counts would require extra logic to encode on the stack and to validate, and it's not clear to me what they contribute.

If your concern is dropping individual values of a multi-value expression, this can be implemented with or without these count entries.

Here is an example of invalid code generation without the counts:

(block (i32.add (i32.const 55) (call $fn_returning_two_values_55_56)))

block_start
  (i32.const 55)     block_stack: 55
  call $fn_returning_two_values_55_56   block_stack: 55 56 55
  i32.add            block_stack: 55 111
block_end            parent_stack: 111
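
The failure mode in this example can be reproduced with a small simulation. This is a sketch under assumptions: both functions are hypothetical, the call is modelled as simply pushing 55 and 56, and the counted variant uses the values-then-count layout suggested earlier in the thread.

```python
def add_flat():
    """One stack entry per value, with no per-node counts."""
    stack = []
    stack.append(55)           # i32.const 55
    stack.extend([55, 56])     # call pushes both of its results
    rhs = stack.pop()
    lhs = stack.pop()
    stack.append(lhs + rhs)    # i32.add silently consumes the call's two results
    return stack


def add_counted():
    """Same sequence, but each node pushes its values followed by a count."""
    stack = []
    stack += [55, 1]           # i32.const 55 yields one value
    stack += [55, 56, 2]       # call yields two values
    operands = []
    for _ in range(2):         # i32.add takes two operand nodes
        count = stack.pop()
        if count != 1:
            raise ValueError(f"i32.add operand yields {count} values, expected 1")
        operands.append(stack.pop())
    stack += [operands[1] + operands[0], 1]
    return stack
```

`add_flat()` leaves the constant on the stack and adds the two call results instead, while `add_counted()` raises as soon as it sees the two-value operand.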


The `br_table` operator has an immediate operand which is encoded as follows:
+Note that there is no explicit `if_else` opcode, as the else clause is encoded with the `else` bytecode.

| Field | Type | Description |
| ---- | ---- | ---- |
@@ -343,15 +345,15 @@ out of range, `br_table` branches to the default target.
## Basic operators ([described here](AstSemantics.md#constants))
| Name | Opcode | Immediate | Description |
| ---- | ---- | ---- | ---- |
-| `i32.const` | `0x0a` | value = `varint32` | a constant value interpreted as `i32` |
-| `i64.const` | `0x0b` | value = `varint64` | a constant value interpreted as `i64` |
-| `f64.const` | `0x0c` | value = `uint64` | a constant value interpreted as `f64` |
-| `f32.const` | `0x0d` | value = `uint32` | a constant value interpreted as `f32` |
-| `get_local` | `0x0e` | local_index = `varuint32` | read a local variable or parameter |
-| `set_local` | `0x0f` | local_index = `varuint32` | write a local variable or parameter |
-| `call` | `0x12` | function_index = `varuint32` | call a function by its index |
-| `call_indirect` | `0x13` | signature_index = `varuint32` | call a function indirect with an expected signature |
-| `call_import` | `0x1f` | import_index = `varuint32` | call an imported function by its index |
+| `i32.const` | `0x10` | value = `varint32` | a constant value interpreted as `i32` |
+| `i64.const` | `0x11` | value = `varint64` | a constant value interpreted as `i64` |
+| `f64.const` | `0x12` | value = `uint64` | a constant value interpreted as `f64` |
+| `f32.const` | `0x13` | value = `uint32` | a constant value interpreted as `f32` |
+| `get_local` | `0x14` | local_index = `varuint32` | read a local variable or parameter |
+| `set_local` | `0x15` | local_index = `varuint32` | write a local variable or parameter |
+| `call` | `0x16` | function_index = `varuint32` | call a function by its index |
+| `call_indirect` | `0x17` | signature_index = `varuint32` | call a function indirect with an expected signature |
+| `call_import` | `0x18` | import_index = `varuint32` | call an imported function by its index |
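
For readers unfamiliar with the immediate types in this table, here is a hedged sketch of reading an opcode followed by its immediate. It assumes `varuint32` and `varint32` are unsigned and signed LEB128 integers limited to 32 bits, and it uses the opcode numbers proposed in this diff (which may not match any shipped format). The function names and the `IMMEDIATES` table are hypothetical.

```python
def read_varuint32(buf, pos):
    """Read an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    for shift in range(0, 35, 7):           # at most 5 bytes for 32 bits
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:                 # high bit clear: last byte
            if result >= 1 << 32:
                raise ValueError("varuint32 out of range")
            return result, pos
    raise ValueError("varuint32 longer than 5 bytes")


def read_varint32(buf, pos):
    """Read a signed LEB128 integer; return (value, next position)."""
    result = 0
    for shift in range(0, 35, 7):
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            if byte & 0x40:                 # sign bit of the final byte
                result -= 1 << (shift + 7)  # sign-extend
            if not -(1 << 31) <= result < (1 << 31):
                raise ValueError("varint32 out of range")
            return result, pos
    raise ValueError("varint32 longer than 5 bytes")


# Opcode -> (name, immediate reader), using the renumbering proposed above.
IMMEDIATES = {
    0x10: ("i32.const", read_varint32),
    0x14: ("get_local", read_varuint32),
    0x15: ("set_local", read_varuint32),
    0x16: ("call", read_varuint32),
}


def decode_ops(buf):
    """Decode a byte string made only of the operators listed in IMMEDIATES."""
    pos, out = 0, []
    while pos < len(buf):
        name, reader = IMMEDIATES[buf[pos]]
        value, pos = reader(buf, pos + 1)
        out.append((name, value))
    return out
```

For example, the bytes `0x10 0x37 0x14 0x02` would decode under these assumptions as `i32.const 55` followed by `get_local 2`.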

With the plan to add an operator table, there seems little need to reorganise the opcodes now or to plan for a catch operator, though I guess it is a small matter for the few tools to update these. Or are you anticipating that implementations will map the operators back to indexes internally, and planning a suggested numbering for these?

## Memory-related operators ([described here](AstSemantics.md#linear-memory-accesses))
