Make the branch type checking consistent with the type system.#215

Closed
ghost wants to merge 1 commit into master from unknown repository

Conversation

@ghost commented Jan 24, 2016

The branch's optional argument should default to the empty-values type None, as produced by a nop operation. To be consistent with the type system, a branch should validate when the target's expected type is None and the expression is either omitted or returns the empty-values type None.

When the optional expression is included and returns a single-value type such as i32, the branch should also validate when the expected type is None. This is consistent with the type system, which accepts more values than expected, for example accepting a single-value type such as i32 when expecting type None.

The misunderstanding appears to be an interpretation of the optional argument(s) as a list of the values to return. Multi-value support would not work in that manner: there would still be only one optional argument when returning multiple values, and it would simply have a multi-value type, just as the argument can currently be the zero-values type from a nop or a single-value type such as i32.

The v8 implementation already appears to be consistent with this change, although a small change will be required to sexp-wasm.

@rossberg (Member)

You might want to check out issue #179 and the corresponding PR #180. ;)

As for multiple values, you can go back to an old version of this repo (pre #53) to see how they could work. That implementation allowed multiple arguments to break, symmetric to multiple arguments to call or return (the latter in fact being desugared to multi-arg break). AFAICT, that was the most consistent design.

@ghost commented Jan 25, 2016

@rossberg-chromium This reverts the change in #179

It creates an inconsistency in the type system to accept
(module (func (nop)))
(module (func (i32.const 1)))
but to fail on
(module (func (block $1 (br $1 (nop)))))
(module (func (block $1 (br $1 (i32.const 1)))))

Please read above 'The misunderstanding appears to be an interpretation of the optional argument(s) as a list of the values ...'

I did see the prior multi-value support for br and return. The tests are here dc582c3

What I think was wrong with that was that br and return accepted a list of values. What should have been done is to still accept only a single expression, but it would have a multi-value type. So rather than (break $l (get_local $x1) (get_local $x2)) it would be (break $l (values (get_local $x1) (get_local $x2))). This would then be consistent with (break $l (call $swap (get_local $x1) (get_local $x2))) where $swap is a single expression returning two values.

'The misunderstanding appears to be an interpretation of the optional argument(s) as a list of the values ...'

Can we fix this?

@ghost commented Jan 25, 2016

Here's another example to consider. If br were accepting a list of values, what would the following do? Would it return two values, the first from each call, or concatenate all values returning four values?

(br $l (call $swap (get_local $x1) (get_local $x2))
       (call $swap (get_local $x1) (get_local $x2)))

If you want an operator to concatenate all the values then add it separately so it can be used in other contexts too. The values operator would return the first value from each argument, not concatenate all the values.

It is also inconsistent to be able to return multiple values built from a list only from a br and not from the fall-through expression of a block. Whatever the value expression to br is should also be possible as the last expression of the block, and give the same result from the block. For example:

(block $l1
  (br_if (...) $l1 (values (get_local $x1) (get_local $x2)))
  (values  (get_local $x2) (get_local $x1)))

rather than

(block $l1
  (br_if (...) $l1 (get_local $x1) (get_local $x2))
  (block $2 (br $2 (get_local $x2) (get_local $x1))))

@rossberg (Member)

@JSStats, it is instructive to consider the analogy with calls. When you have a function without arguments, you cannot call it with a nop argument either -- the arity of the call has to syntactically match the arity of the function called. Similarly, the arity of a break has to syntactically match the "arity" of the block it breaks from. The only difference is that calls allow arbitrary arities, while breaks currently only allow 0 or 1.

In the multi-value extension, break allowed arbitrary arities, just like calls. And just like calls, it allowed a syntactic list of values. In addition, it also allowed a single argument that has a multi-value type like you suggest -- and so did call, in a completely consistent manner. That is, the requirement for matching arity syntactically was removed everywhere. However, that is an extension that does not make much sense without consistent support for multi-values, IMO. I don't think it is worth introducing it as a special case just for breaks. As a corner case, it can be more confusing than useful, see #179.

@rossberg (Member)

Here's another example to consider. If br were accepting a list of
values, what would the following do? Would it return two values, the first
from each call, or concatenate all values returning four values?

(br $l (call $swap (get_local $x1) (get_local $x2))
(call $swap (get_local $x1) (get_local $x2)))

If you want an operator to concatenate all the values then add it
separately so it can be used in other contexts too. The values operator
would return the first value from each argument, not concatenate all the
values.

Again, the question would apply to calls as well. The answer lies in the
careful distinction between value types (single values) and expression
types (lists of value types) that the type system makes: it is simply a
type error to use multi-argument syntax with arguments that do not have
value type.

Effectively, the multi-argument forms of call, break, and return are sugar
for unary applications to a tuple argument, where it is enforced by the
type system that tuples are second-class and don't nest.

It is also inconsistent to be able to return multiple values built from a
list only from a br and not from the fall-through expression of a block.
Whatever the value expression to br is should also be possible as the last
expression of the block, and give the same result from the block. For example:

(block $l1
(br_if (...) $l1 (values (get_local $x1) (get_local $x2)))
(values (get_local $x2) (get_local $x1)))

rather than

(block $l1
(br_if (...) $l1 (get_local $x1) (get_local $x2))
(block $2 (br $2 (get_local $x2) (get_local $x1))))

Yes. Returning multi-values (e.g. produced from calls) from an expression
(be it a block, conditional, or switch) was fully possible in the original
proto. It didn't have the values form itself, but even that you could
easily have introduced as sugar:

(values <expr>*)   =   (block (break <expr>*))

@ghost commented Jan 26, 2016

@rossberg-chromium I think there is a clear distinction between arguments and results. Consider how the type system handles results now: if no values are expected, it still allows the expression to return values. It does not demand that the 'arity' of the expected number of values match the number of values the expression produces; since there is no defaulting it demands at least as many values as expected, but excess values are discarded.

My concern is that multi-value support has not been thought through. Could I ask if you have ever used a programming language with multi-value expressions? If so, could I have a look at how it worked, to better understand your expectations.

You still have not answered: what are the result values from break if it is given multiple expressions that each return multiple values, under your model? With just one expression it returns all the values from it, but what about two or three, each with a different number of values? Think it through; perhaps you'll start to see the problems, how complex it has become, and that if you want these operators it is best to separate them from break and keep break simple, accepting a single expression with multiple values.

Even under your model it is not consistent: if break with multiple arguments builds a multi-value result expression, then it is still only necessary that the block's expected type have the same number of values or fewer. E.g. (block (block $l1 (br $l1 (i32.const 1))) ...) is returning a single value to an expected type of no values, which would otherwise validate just as (block (i32.const 1) ...) validates.

@ghost commented Jan 26, 2016

@rossberg-chromium (values <expr>*) = (block (break <expr>*)) does not hold, because values builds multiple values by taking a single value from each expression, whereas your break expression accepted multiple values from a call returning a multi-value result. Also, (values (i32.const 1)) = (i32.const 1) but your (block (break (i32.const 1))) /= (i32.const 1), because one fails to validate when no values are expected and the other does not!

@rossberg (Member)

@JSStats, no, in the multi-value extension, (block (break (i32.const 1))) would type-check just fine, even if the block was in a context expecting no value -- any expression type implicitly converts to the empty value everywhere (including calls). But we don't have multi-values. Hence there is no use to making this type-check for some corner case currently.

I thought I answered your other question already. You cannot "merge" multi-values, it simply doesn't type-check. When forming a multi-value, each member has to be a plain value.

The multi-value support has been thought through very thoroughly, I can assure you. All your questions about what multi-break means are hopefully answered by pointing out that you could simply desugar

(break <var> <expr>*)   =   (break <var> (values <expr>*))

if you made values primitive. There isn't more to it. Likewise,

(return <expr>*)   =   (return (values <expr>*))
(call <expr> <expr>*)   =   (call <expr> (values <expr>*))

Once you have multi-return, there is absolutely no conceptual difference between multiple arguments and multiple results. The only reason that exists in many languages is that multi-return often was an afterthought.

@ghost commented Jan 27, 2016

@JSStats, no, in the multi-value extension, (block (break (i32.const 1)))
would type-check just fine, even if the block was in a context expecting
no value -- any expression type implicitly converts to the empty value
everywhere (including calls). But we don't have multi-values. Hence there
is no use to making this type-check for some corner case currently.

But this currently fails to validate!

> (module (func (block $1 (br $1 (i32.const 1)))))
stdin:1.25-1.46: arity mismatch

We do have multi-values: we have nop returning zero values, and single-value types.

I thought I answered your other question already. You cannot "merge"
multi-values, it simply doesn't type-check. When forming a multi-value,
each member has to be a plain value.

This is not consistent with the multi-value support you suggest: in the case in which the break expression is a single function call returning multiple values you suggest it returns them all, yet above you define that 'each member has to be a plain value' and that it is invalid to pass multiple values.

The multi-value support has been thought through very thoroughly, I can
assure you.

Good one, can I quote you on that?

All your questions about what multi-break means are
hopefully answered by pointing out that you could simply desugar

(break <var> <expr>*)   =   (break <var> (values <expr>*))

Wrong: values builds a multi-value result from a single value from each of its arguments. You have the br expression returning multiple values from a single expression, and at the same time claim it is not valid to do so! Further, as already noted, (values (i32.const 1)) = (i32.const 1) and is valid in any context in which (i32.const 1) is valid. These are big differences.

if you made values primitive. There isn't more to it. Likewise,

(return <expr>*)   =   (return (values <expr>*))
(call <expr> <expr>*)   =   (call <expr> (values <expr>*))

Once you have multi-return, there is absolutely no conceptual difference
between multiple arguments and multiple results. The only reason that
exists in many languages is that multi-return often was an afterthought.

Could you please direct me to a programming language consistent with what you are proposing? Or is it that all programming languages are legacy and you just happen to be 'doing it the right way' because they all got it wrong?

You mention 'unification' a lot; perhaps you see multi-value as a structural matter, but the AST is (basically) flat, with each operator standing on its own and each argument and result having a type, including multi-value types. Sorry if I am reaching here, but I really am trying to understand your perspective.

The function-arguments case is a destructuring operation as opposed to a construction operation.

For example, what does (call <expr1> (values <expr2> <expr3>)) pass to the function? All three values concatenated, or two values? Do you see the function being declared with the same structure? What about (call <expr1> (values (values <expr2> <expr3>) <expr4>) (values <expr5>))? Does it recursively concatenate all five values, or pass three values?

Recall you defined it above as 'each member has to be a plain value.' and I quote 'thought through very thoroughly, I can assure you.' ;)

Programmers are pragmatic: sometimes they want to concatenate values to build up an argument list, and sometimes they just want the first value. I don't think needing both operations is a sign that it was an 'afterthought', nor is separating destructuring from construction. If nothing else, their use shows that the patterns were useful.

@rossberg (Member)

@JSStats, no, in the multi-value extension, (block (break (i32.const 1)))
would type-check just fine, even if the block was in a context expecting
no value -- any expression type implicitly converts to the empty value
everywhere (including calls). But we don't have multi-values. Hence there
is no use to making this type-check for some corner case currently.

But this currently fails to validate!

(module (func (block $1 (br $1 (i32.const 1)))))
stdin:1.25-1.46: arity mismatch

Yes, which is why I said: "in the multi-value extension [this holds] [...]
But we don't have multi-values [currently]."

We do have multi-values: we have nop returning zero values, and single-value types.

That's not general multi-values, though. Look, I'm not saying what you
propose does not make sense. I'm merely saying it is not useful in the
absence of multi-values (and in fact turns out to be confusing, see #179).

I thought I answered your other question already. You cannot "merge"
multi-values, it simply doesn't type-check. When forming a multi-value,
each member has to be a plain value.

This is not consistent with the multi-value support you suggest: in the
case in which the break expression is a single function call returning
multiple values you suggest it returns them all, yet above you define that
'each member has to be a plain value' and that it is invalid to pass
multiple values.

It is perfectly consistent, as long as you keep in mind that desugaring
only applies to breaks which are not already in base form, i.e. those with
n=/=1 arguments. In the example you describe, nothing needs desugaring, and
no values operator gets involved.

The multi-value support has been thought through very thoroughly, I can
assure you.

Good one, can I quote you on that?

All your questions about what multi-break means are
hopefully answered by pointing out that you could simply desugar

(break <var> <expr>*)   =   (break <var> (values <expr>*))

Wrong: values builds a multi-value result from a single value from each
of its arguments. You have the br expression returning multiple values
from a single expression, and at the same time claim it is not valid to
do so! Further, as already noted, (values (i32.const 1)) = (i32.const 1)
and is valid in any context in which (i32.const 1) is valid. These are big
differences.

No, I think you are confused. Which is probably my fault. Let me try once
more. The multi-value semantics is simple and regular. If you assume there
was a primitive values operator then there would be only a couple of core
rules relevant to multiple values:

  1. (values e_1 ... e_n) has type [t_1 ... t_n] iff e_i has type [t_i]
    for all i <= n.
  2. (break x e_1 ... e_n) for n=/=1 is short for (break x (values e_1 ... e_n)). Similarly for call and return.
  3. Any expression that has type [t_1 ... t_n] also has type [].

Everything else follows from these.
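
For concreteness, the three rules can be mimicked in a toy checker. This is a hypothetical Python sketch, not code from any wasm implementation; an expression type is modeled simply as a list of value-type names such as ["i32"]:

```python
# Toy sketch of the three multi-value typing rules above.
# Hypothetical illustration only; not code from any wasm implementation.

def type_of_values(member_types):
    # Rule 1: (values e_1 ... e_n) has type [t_1 ... t_n]
    # iff every member e_i has a single-value type [t_i].
    for t in member_types:
        if len(t) != 1:
            raise TypeError("values members must be plain values")
    return [t[0] for t in member_types]

def desugar_break(label, exprs):
    # Rule 2: (break x e_1 ... e_n) with n =/= 1 is sugar for
    # (break x (values e_1 ... e_n)); n = 1 is already in base form.
    if len(exprs) == 1:
        return ("break", label, exprs[0])
    return ("break", label, ("values",) + tuple(exprs))

def checks_against(expr_type, expected_type):
    # Rule 3: any expression type [t_1 ... t_n] also has type [],
    # i.e. every expression implicitly converts to the empty type.
    return expected_type == [] or expr_type == expected_type
```

Under these rules, passing a multi-value member such as a two-result call to values raises a type error in rule 1, which matches the earlier answer that multi-values cannot be "merged".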

Could you please direct me to a programming language consistent with what
you are proposing? Or is it that all programming languages are legacy and
you just happen to be 'doing it the right way' because they all got it
wrong?

For example, it is a subset of what virtually all functional languages do
with tuples. (A subset because in those languages, tuples are actually
first-class values, whereas here we intentionally do not make them so, by
distinguishing value from expression types.) For example, in Standard ML:

fun f(x, y) = (y, x)
fun g(b, x, y) = if b then (x, y) else (y, x)
f(1, 2)
f(if b then (2, 3) else (3,4))
f(f(g(true, 2, 4)))

You mention 'unification' a lot; perhaps you see multi-value as a
structural matter, but the AST is (basically) flat, with each operator
standing on its own and each argument and result having a type, including
multi-value types. Sorry if I am reaching here, but I really am trying to
understand your perspective.

I'm afraid I don't follow. Unification is an algorithmic concern, and
orthogonal to what we're discussing right now.

The function-arguments case is a destructuring operation as opposed to a
construction operation.

Generally speaking, function calls are neither. The only reason function
parameters have to destructure in our case is because multi-values are not
first class, i.e., we don't want to allow variables of multi-value type.

For example, what does (call <expr1> (values <expr2> <expr3>)) pass to the
function? All three values concatenated, or two values? Do you see the
function being declared with the same structure? What about
(call <expr1> (values (values <expr2> <expr3>) <expr4>) (values <expr5>))?
Does it recursively concatenate all five values, or pass three values?

All these examples are type errors, for the same reason as before. See
rules (1) and (2) above.

@ghost commented Jan 27, 2016

@rossberg-chromium We seem to both agree that (module (func (block $1 (br $1 (i32.const 1))))) would validate with multi-value support. I believe you also support the arity of an operator not depending on the context. If we could just change the MVP type system so that this validates, and so that (module (func (block $1 (br $1 (nop))))) validates, then we can defer the rest of the discussion, as this would be enough of a consensus for now. Could you support that?

@rossberg (Member)

@rossberg-chromium We seem to both agree that
(module (func (block $1 (br $1 (i32.const 1))))) would validate with
multi-value support.

Yes.

I believe you also support the arity of an operator not depending on the
context.

I'm not sure what you mean by that. The arity of calls and returns
naturally depends on the context.

If we could just change the MVP type system so that this validates, and so
that (module (func (block $1 (br $1 (nop))))) validates, then we can defer
the rest of the discussion, as this would be enough of a consensus for now.
Could you support that?

There was an explicit request to remove this in the past. It's an odd
corner case currently. It is an extra complication for decoders to support
it. At the same time, it would currently have no practical benefit.
Finally, not allowing it errs on the side of being conservative (we can
always add it later). I find these enough reasons to not support it for the
time being. But I'd be open if there was a clear signal that more people
want it (but see first point about that).

@rossberg commented Jan 27, 2016 via email

@ghost commented Jan 28, 2016

@rossberg-chromium I'm getting a little confused about your position here; you seem to be flipping. You had been maintaining that (module (func (block $1 (br $1 (i32.const 1))))) and (module (func (block $1 (br $1 (nop))))) would validate with your view of multi-value support, but just above you claim it would not be consistent. Which one is it now?

Flagging a pattern as invalid now, when it is expected to be valid in future, might sound like a precautionary path. However, it is not, because the encoding will be optimized for valid code: we could very well end up with an encoding that cannot even represent the expected valid code, and with implementations that do not support it. The best path forward is to reserve this pattern by making it validate as expected now, and to add tests for it. Thus it is important to work through this.

Let me work through your language above:

  1. (values e_1 ... e_n) has type [t_1 ... t_n] iff e_i has type [t_i] for all i <= n.

If [..] is SML notation for a list, then I agree that the expression result values could be represented by a list in an abstract implementation, and that their types could be too.

However, I believe the consumers of these should be matching them with arg::_ for a single-value consumer, matching any list for a zero-value consumer, and by extension a two-value consumer would match arg1::arg2::_ (forgive me if my SML is amateurish). These would only fail to validate when the consumer expects more values than are actually supplied, not when the number of values supplied is more than expected. This is a useful property in programming languages, and it is already taken advantage of in wasm, where most operators are expressions that return values and excess values are discarded. For example, (module (func (i32.const 1))) validates, as would (module (func (call $some_fn_returning_two_values))), and thus so should (module (func (values 1 2))).

Only after this pattern matching would the bound arguments be type-checked; the rest are discarded and not checked.
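
The prefix-matching rule being proposed here can be stated as a short check. This is a hypothetical Python restatement of the arg::_ idea above, not an actual validator:

```python
def consumer_accepts(expected, supplied):
    # Proposed rule: a consumer expecting k values validates whenever at
    # least k values are supplied; only the consumed prefix is type-checked,
    # and any excess values are discarded unchecked.
    if len(supplied) < len(expected):
        return False  # too few values: does not validate
    return supplied[: len(expected)] == expected
```

So a zero-value consumer accepts any producer, a one-value consumer rejects a zero-value producer, and extra values beyond the expected prefix never cause a failure.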

btw: The empty type list [] is still a type set of one element. It is not the same as the empty set of types resulting from unreachable; unreachable does not return [], and would need to be represented by another object (a symbol might do).

  2. (break x e_1 ... e_n) for n=/=1 is short for (break x (values e_1 ... e_n)). Similarly for call and return.

Here values is just another operator that returns values built from matching each of its arguments using the rule arg::_. It should be possible to substitute values for a function call that returns the same number of values, for example (break x (call $fn_returning_two_values)) in which case it is clear these get passed through block to its consumer, which if it is a single value consumer matches its argument with arg::_ or it might consume no values discarding both of them. Thus there is no 'arity' issue here, unless the consumer does not receive enough values.

Could you please give an example for call so I understand what you are suggesting here. (call fn e_1 e_2 ... e_n) /= (call fn (values e_1 e_2 ... e_n)); rather, (call fn (values e_1 e_2 ... e_n)) = (call fn e_1).

The values used here is not an SML tuple. For the benefit of other readers: SML functions accept one argument and return one result, and SML has no multi-value expression result support. Tuples can be used to encapsulate multiple arguments and return multiple values. Notably there is no one-element tuple, and to my limited knowledge SML tuples only pattern-match against patterns with exactly the same number of elements -- they could not represent the current wasm semantics, although lists in SML could.

  3. Any expression that has type [t_1 ... t_n] also has type [].

You would need to define 'has type'; SML might have different definitions of types here than are useful in this context. I have tried to define the rules above using SML pattern matching. For example, a consumer of one value matches with arg::_; it would not validate receiving zero values, but would validate receiving more than one value.

@ghost commented Jan 29, 2016

I would like to add the encoding of the return operation to this issue. I believe the issue is almost identical if the return is viewed as a break to a top-most block, with functions receiving result values from either a return or a fall-through expression. For example:

;;; These are all currently invalid.
(module (func $f0 (return (nop))))
(module (func $f0 (block $top (br $top (nop)))))
(module (func $nop) (func $f0 (return (call $nop))))
(module (func $nop) (func $f0 (block $top (br $top (call $nop)))))
(module (func $f0 (return (i32.const 1))))
(module (func $f0 (block $top (br $top (i32.const 1)))))
(module (func $v1 (result i32) (i32.const 1)) (func $f0 (return (call $v1))))
(module (func $v1 (result i32) (i32.const 1)) (func $f0 (block $top (br $top (call $v1)))))
;;; These are all currently valid, and arguably should be equivalent.
(module (func $f0 (nop)))
(module (func $f0 (block $top (nop))))
(module (func $nop) (func $f0 (call $nop)))
(module (func $nop) (func $f0 (block $top (call $nop))))
(module (func $f0 (i32.const 1)))
(module (func $f0 (block $top (i32.const 1))))
(module (func $v1 (result i32) (i32.const 1)) (func $f0 (call $v1)))
(module (func $v1 (result i32) (i32.const 1)) (func $f0 (block $top (call $v1))))

;;; The following currently fail to validate, and I concur: not enough
;;; values are supplied, and wasm does not currently default unsupplied
;;; values (a separate matter).
(module (func $f1 (result i32)))
(module (func $f1 (result i32) (nop)))
(module (func $f1 (result i32) (return (nop))))
(module (func $f1 (result i32) (block $top (nop))))
(module (func $f1 (result i32) (block $top (br $top (nop)))))
(module (func $nop) (func $f1 (result i32) (call $nop)))
(module (func $nop) (func $f1 (result i32) (return (call $nop))))
(module (func $nop) (func $f1 (result i32) (block $top (call $nop))))
(module (func $nop) (func $f1 (result i32) (block $top (br $top (call $nop)))))

;;; With the suggested multi-value support the following would validate.
(module
  ;; These definitions would return the same number of result values.
  (func $v2 (result i32 i32)
    (values (i32.const 1) (i32.const 1)))
  (func $v2r (result i32 i32)
    (return (values (i32.const 1) (i32.const 1))))
  (func $v2b (result i32 i32)
    (block $top (br $top (values (i32.const 1) (i32.const 1)))))
  (func $f0 (call $v2))
  (func $f0r (return (call $v2)))
  (func $f0b (block $top (br $top (call $v2))))
  (func $f1  (result i32) (call $v2))
  (func $f1r (result i32) (return (call $v2)))
  (func $f1b (result i32) (block $top (call $v2)))
  (func $f1bb (result i32) (block $top (br $top (call $v2))))
  (func $f2  (result i32 i32) (call $v2))
  (func $f2r (result i32 i32) (return (call $v2)))
  (func $f2b (result i32 i32) (block $top (call $v2)))
  (func $f2bb (result i32 i32) (block $top (br $top (call $v2)))))

;;; The following would fail to validate - not enough values and wasm
;;; does not do defaulting of unsupplied values.
(module
  (func $v2 (result i32 i32)
    (values (i32.const 1) (i32.const 1)))
  (func $f3  (result i32 i32 i32) (call $v2))
  (func $f3r (result i32 i32 i32) (return (call $v2)))
  (func $f3b (result i32 i32 i32) (block $top (call $v2)))
  (func $f3bb (result i32 i32 i32) (block $top (br $top (call $v2)))))

For an example of the consequences of not addressing this pre-MVP, I would point to the v8 encoding of return, which does not encode the optional result expression if the function is not expecting a result, and to the v8 encoding's use of a nop in the br expression to represent (possibly) different semantics. Both of these encoding decisions would block multi-value support in future, possibly requiring new operators to replace them, or hack workarounds that I do not believe are becoming of wasm.

FAQs:

  • Why should a consumer of result values not be required to accept exactly the number of values supplied, while a function consuming its arguments is required to match the number supplied? Ans: discarding unused values from expression results is a very common practical need in programming languages, particularly when most expressions return values; this is already accommodated in many places in wasm, and it is just return and br that differ.
  • Why not just allow discarding all the values, and otherwise require the number of values consumed to match the number supplied? Ans: just as it is practically very common to discard the one-value result in a single-value expression system, with multi-value expressions it is also common to want to discard some of the values. This is a practical, pragmatic matter that someone experienced in writing code in a language with multiple-value support would recognize. JS destructuring support does just this, as do many other languages.
  • Why not have return and br build values from a list of arguments? Ans: they would no longer be returning the multiple-value results of an expression just as the fall-through does. It would require the result values to be destructured and then rebuilt in a frustrating and verbose programming style, and would make transformations such as block-break-diamonds to if-else very frustrating. Or it would require special handling of the single-argument case, which would pass along all the values, and further special cases for the zero-values case: a very irregular and inconsistent language that again is not equivalent to the fall-through and would frustrate transforms.

@rossberg (Member)

On 29 January 2016 at 13:20, JSStats notifications@github.com wrote:

Why should a consumer of result values not be required to accept
exactly the number of values supplied, while a function consuming its
arguments is required to match the number supplied? Ans: discarding unused
values from expression results is a very common practical need in
programming languages, particularly when most expressions return values,
and this is already accommodated in many places in wasm and it's just
return and br that differ.

Please keep in mind that Wasm is not a user-facing language, it is a
compilation target. To justify the extra complexity of this feature for its
own sake, you would need to come up with convincing evidence that compilers
would significantly benefit from it. I doubt that. What would be the point
of generating (return (call $f)) instead of the obvious (call $f) (return)? At best it would save 2 bytes for an auxiliary block in a few
places, if a compiler is willing to bother implementing a respective
peephole compaction.

Why not just allow discarding all the values, and otherwise require
the number of values consumed to match the number supplied? Ans: just as it
is practically very common to be discarding the one value result in a
single-value expression system, with multi-value expressions it is also
common to want to discard some of the values - this is just a practical
pragmatic matter and something that someone experienced in writing code in
a language with multiple-value support would know. The JS destructuring
support does just this, as do many other languages.

Same here. General width subtyping seems even less useful for compilers.

Why not have return and br build values from a list of arguments? Ans:
They would no longer be returning the multiple value results of an
expression just as the fall-through does. It would require the result
values to be destructured and then rebuilt in a frustrating and verbose
programming style and make transformations such as block-break-diamonds to
if-else very frustrating. Or it would require special handling of the
single-argument case which would pass along all the values, and further
special cases for the zero-values case - a very irregular and inconsistent
language that again is not equivalent to the fall-through and would
frustrate transforms.

The latter solution is completely analogous to what most languages with
tuple syntax do, so you'll have a hard time convincing me that it is weird.
It is needed for code compactness, especially in the common case of calls.
And unless you are arguing against consistency, it should then be
symmetric for break and return. (Also, it is incorrect that it requires a
special case for 0.)

@ghost
Author

ghost commented Jan 31, 2016

SML/Ocaml is a high level language with a pattern matching focus. WASM is a low level language with no pattern matching, and close to the hardware. The 'arity' constraints that have been applied are not even constraints in Ocaml - they just need explicit workarounds. Below are some examples in Ocaml of discarding excess values, and consuming only part of the values, to prove this point.

For the compilation-target use case it would be a burden not to be able to lower such code into efficient wasm (ease for the producer, encoding wise, and performance wise).

Frustrating wasm code that wants to emit these patterns is completely unnecessary. The wasm runtime compiler always knows the number of values expected and the number of values actually available, and it is an almost trivial matter to discard the excess.

The 'arity' check between the first values of the br and return list and the consumer is also a construction. This can be proven easily by considering an operator that concatenates all its arguments' values together and passes them to the consumer. It's a product of attempting to map SML conventions onto wasm, along with the constructed limitations. These are not practical limitations of a wasm runtime compiler, nor should they be limitations of the AST.

I am a developer and maintainer of a language with support for multiple value expressions - and with a type system for reasoning about these types. I would like to be able to lower to efficient wasm code when the number of values is known.

It is not clear how addressing this could negatively impact code compactness, so if you could clarify your concern that the restrictions are 'needed for code compactness, especially in the common case of calls' then I might be able to address this too. As can be seen in the workarounds necessary in Ocaml below, the restrictions would bloat the AST: it would be necessary to do the same in wasm, to destructure all the actual values into local registers and rebuild expression results.

Unless there is an efficient way to transform such code patterns into the wasm AST, I can't give up on this issue.

;; Example of discarding unused values.
let f1 x = (1, x);;
;; Complains, but allows the results of f1 to be discarded.
let f2 x = f1 10; f1 x;;
;; Can silence the warning - the point is that it is not a programming error, but the syntax becomes verbose to work around it.
let f2 x = let _ = f1 10 in f1 x;;

;; Example of consuming only part of result values which can be done, but verbose.
let f3 x = let (y,_) = f1 x in f1 y;;

;; Example of consuming only part of three return values.
let f4 x = (1, 2, x);;
;; Can be done but verbose, and requires destructuring *all* the values.
let f5 x = let (y, _, _) = f4 x in f4 y;;

;; Example of consuming only part of three return values, but this time using lists.
let f6 x = [1; 2; x];;
;; Can be done but verbose, but at least only the consumed values need destructuring.
let f7 x = let (y::_) = f6 x in f6 y;;

@rossberg
Member

rossberg commented Feb 2, 2016

On 31 January 2016 at 01:12, JSStats notifications@github.com wrote:

SML/Ocaml is a high level language with a pattern matching focus. WASM is
a low level language with no pattern matching, and close to the hardware.
The 'arity' constraints that have been applied are not even constraints in
Ocaml - they just need explicit workarounds. Below are some examples in
Ocaml of discarding excessive values, and consuming only part of the values
to prove this point.

For the compilation-target use case it would be a burden not to be able to
lower such code into efficient wasm (ease for the producer, encoding wise,
and performance wise).

Performance is not affected by this either way.

Frustrating wasm code that wants to emit these patterns is completely
unnecessary. The wasm runtime compiler always knows the number of values
expected and the number of values actually available and it is an almost
trivial matter to discard them.

The 'arity' check between the first values of the br and return list and
the consumer is also a construction. This can be proven easily by
considering an operator that concatenates all its arguments' values together
and passes them to the consumer. It's a product of attempting to map SML
conventions onto wasm, along with the constructed limitations. These are
not practical limitations of a wasm runtime compiler, nor should they be
limitations of the AST.

There is no other motivation than keeping the language simple. If in doubt,
be conservative. Once there is evidence that the relaxations you want are
relevant in practice, we can discuss further, but until then, let's err on
the side of simplicity.

I am a developer and maintainer of a language with support for multiple

value expressions - and with a type system for reasoning about these types.
I would like to be able to lower to efficient wasm code when the number of
values is known.

It is not clear how addressing this could negatively impact code compactness,
so if you could clarify your concern that the restrictions are 'needed for
code compactness, especially in the common case of calls'

That sentence wasn't referring to restrictions but to allowing short forms
with multiple arguments.

Unless there is an efficient way to transform such code patterns into the

wasm AST then I can't give up on this issue.

;; Example of discarding unused values.
let f1 x = (1, x);;
;; Complains, but allows the results of f1 to be discarded.
let f2 x = f1 10; f1 x;;
;; Can silence the warning - the point is that it is not a programming error, but the syntax becomes verbose to work around it.
let f2 x = let _ = f1 10 in f1 x;;

All the above are directly expressible in (the originally proposed
multi-value extension of) Wasm.

;; Example of consuming only part of result values which can be done,
but verbose.

let f3 x = let (y,_) = f1 x in f1 y;;

;; Example of consuming only part of three return values.
let f4 x = (1, 2, x);;
;; Can be done but verbose, and requires destructuring all the values.
let f5 x = let (y, _, _) = f4 x in f4 y;;

These are also easily possible, even though there is no short-hand for the
(x, _, _) case. That case is no different from cases like (_, _, x) or
(_, x, _), which are just as frequent in practice. There is nothing
special about the former case that would justify a non-trivial subtyping
extension to Wasm just to make it slightly more convenient than the others.

;; Example of consuming only part of three return values, but this
time using lists.

let f6 x = [1; 2; x];;
;; Can be done but verbose, but at least only the consumed values need destructuring.
let f7 x = let (y::_) = f6 x in f6 y;;

How is that case related to anything? A list does not compile to a
multi-value at all. Compilation of list destructuring has to perform tag
comparisons and projections.

@ghost
Author

ghost commented Feb 2, 2016

SML/Ocaml is a high level language with a pattern matching focus. WASM is
a low level language with no pattern matching, and close to the hardware.
The 'arity' constraints that have been applied are not even constraints in
Ocaml - they just need explicit workarounds. Below are some examples in
Ocaml of discarding excessive values, and consuming only part of the
values
to prove this point.

For the compilation-target use case it would be a burden not to be able to
lower such code into efficient wasm (ease for the producer, encoding wise,
and performance wise).

Performance is not affected by this either way.

Even if it only bloated the AST this would be a very significant issue.

It will affect performance if there is a lot of unnecessary destructuring into local variables to receive all the values when only one or the first few are used, and if the runtime compiler is not smart enough to optimize these away.

There is no other motivation than keeping the language simple. If in doubt,
be conservative. Once there is evidence that the relaxations you want are
relevant in practice, we can discuss further, but until then, let's err on
the side of simplicity.

Discarding unused values is simple for the runtime compiler.

Common Lisp and Lua have conventions that consume only the used values and discard the rest; this is the convention for every consumed expression result except for those consumed by special operators that consume multiple values. For example, an i32.add could accept multiple-value expressions for each argument and consume only the first of each. The destructuring also has this convention - excess values are ignored.

They also default unsupplied values but I don't think wasm should be burdened with that.

It's a trivial matter for wasm to discard these excess values and it avoids a lot of redundant destructuring. For example it allows:

(func $f2v (param ...) (result i32) (result i32) (.... (values $l1 $l2)))

(func $fa ... (i32.add (call $f2v ...) (call $f2v ...)))

Rather than

(func $fa ...
  (block
    (mv_set_local ($l1 $l2) (call $f2v ...))
    (mv_set_local ($l3 $l4) (call $f2v ...))
    (i32.add $l1 $l2)
  ) ... )

That's a big savings.
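The head-consuming convention argued for here can be sketched in OCaml, with a tuple standing in for a two-value result; the names f2v, first, and fa are illustrative, loosely mirroring the hypothetical wasm functions above:

```ocaml
(* Sketch of the CL/Lua convention: a single-value consumer takes only
   the first value of a multi-value result. The tuple models a two-value
   return; names are illustrative, not from any wasm implementation. *)
let f2v x = (x, x + 1)           (* "returns" two values *)
let first (a, _) = a             (* consumer keeps the head, drops the rest *)
let fa x = first (f2v x) + first (f2v (x * 2))
```

Here `first` plays the role the runtime compiler would play implicitly: discarding the unconsumed trailing values at the use site.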

It not clear how addressing this could negatively impact code compactness
so if you could clarify you concern that the restrictions are 'needed for
code compactness, especially in the common case of calls'

That sentence wasn't referring to restrictions but to allowing short forms
with multiple arguments.

If you could give an example of 'short forms with multiple arguments' then I might be able to address this.

The only benefit that comes to mind is an encoding optimization using the context to reduce the number of arguments to br and return.

Could we accommodate these encoding optimizations in other ways, at a lower layer than the AST? Alternatively could we simply have separate opcodes, some that depend on context and some that do not. To be clear: to add br and return operators that accept a single expression and pass through the values of this expression irrespective of the context.

Add a separate operator for calling a function with the concatenation of all the values from the arguments, rather than taking one value from each argument as usual. E.g. (mv_call $f5v (call $f2v ...) (i32.const 1) (call $f2v ...)) would pass five values to $f5v and would subsume the prior functionality suggested for mv support. Yes, CL can also do this, not just accept one mv expression as an argument for a mv function call.

;; Example of discarding unused values.
let f1 x = (1, x);;
;; Complains, but allows the results of f1 to be discarded.
let f2 x = f1 10; f1 x;;
;; Can silence the warning - the point is that it is not a programming
error, but the syntax becomes verbose to work around it.
let f2 x = let _ = f1 10 in f1 x;;

All the above are directly expressible in (the originally proposed
multi-value extension of) Wasm.

I thought it would be an arity mismatch to discard values, even all of them - or was it just an arity mismatch to discard some of them? I am not arguing that this could not be done, just that it would have required destructuring to discard them. Perhaps you could give an example of how efficiently this could be expressed?

;; Example of consuming only part of result values which can be done,
but verbose.

let f3 x = let (y,_) = f1 x in f1 y;;

;; Example of consuming only part of three return values.
let f4 x = (1, 2, x);;
;; Can be done but verbose, and requires destructuring all the values.
let f5 x = let (y, _, _) = f4 x in f4 y;;

These are also easily possible, even though there is no short-hand for the
(x, _, _) case. That case is no different from cases like (_, _, x) or
(_, x, _), which are just as frequent in practice. There is nothing
special about the former case that would justify a non-trivial subtyping
extension to Wasm just to make it slightly more convenient than the others.

It is not the case in CL and Lua that (_, _, x) and (_, x, _) are just as common as (x, _, _). The language makes it easy to accept the head and discard the tail so this is the predominant pattern.

Again, it is a trivial matter to implement this in wasm. I strongly dispute that it is 'a non-trivial subtyping extension to Wasm just to make it slightly more convenient than the others.' I am prepared to prove this, but we don't have any mv implementations yet to modify. I could certainly demonstrate it with AST validation code for the proposed mv support.

;; Example of consuming only part of three return values, but this
time using lists.

let f6 x = [1; 2; x];;
;; Can be done but verbose, but at least only the consumed values need
destructuring.
let f7 x = let (y::_) = f6 x in f6 y;;

How is that case related to anything? A list does not compile to a
multi-value at all. Compilation of list destructuring has to perform tag
comparisons and projections.

The list expresses the pattern of matching the head of the list and discarding the rest. It's a weakness of the SML compiler if it does not optimize this case, just as it would be a weakness of an SML compiler to box tuples above. With wasm discarding unused trailing values it is possible to lower the above into very efficient wasm, efficient to represent in the AST.

@rossberg
Member

rossberg commented Feb 2, 2016

SML/Ocaml is a high level language with a pattern matching focus. WASM is
a low level language with no pattern matching, and close to the hardware.
The 'arity' constraints that have been applied are not even constraints in
Ocaml - they just need explicit workarounds. Below are some examples in
Ocaml of discarding excessive values, and consuming only part of the
values
to prove this point.

For the compilation-target use case it would be a burden not to be able to
lower such code into efficient wasm (ease for the producer, encoding wise,
and performance wise).

Performance is not affected by this either way.

Even if it only bloated the AST this would be a very significant issue.

It will affect performance if there is a lot of unnecessary destructuring
into local variables to receive all the values when only one or the first
few are used, and if the runtime compiler is not smart enough to optimize
these away.

The difference is tiny, the case is very rare, and I doubt that it would
ever have any measurable impact on the code size of any real Wasm program.
Can we not blow arguments like that out of proportion?

There is no other motivation than keeping the language simple. If in doubt,
be conservative. Once there is evidence that the relaxations you want are
relevant in practice, we can discuss further, but until then, let's err on
the side of simplicity.

Discarding unused values is simple for the runtime compiler.

Common Lisp and Lua have conventions that consume only the used values and
discard the rest, this is the convention for every consumed expression
result except for those consumed by special operators that consume multiple
values. For example an i32.add could accept multiple value expressions for
each argument and it consumes only the first. The destructuring also has
this convention - excess values are ignored.

They also default unsupplied values but I don't think wasm should be
burdened with that.

It's a trivial matter for wasm to discard these excess values and it
avoids a lot of redundant destructuring. For example it allows:

(func $f2v (param ...) (result i32) (result i32) (.... (values $l1 $l2)))

(func $fa ... (i32.add (call $f2v ...) (call $f2v ...)))

Rather than

(func $fa ...
(block
(mv_set_local ($l1 $l2) (call $f2v ...))
(mv_set_local ($l3 $l4) (call $f2v ...))
(i32.add $l1 $l2)
) ... )

That's a big savings.

Whoa, that's even more pervasive subtyping than I thought you'd be asking
for. I doubt any of the implementers would be willing to implement this.
The benefit would be far too tiny on the grand scale of things.

It not clear how addressing this could negatively impact code compactness
so if you could clarify you concern that the restrictions are 'needed for
code compactness, especially in the common case of calls'

That sentence wasn't referring to restrictions but to allowing short forms
with multiple arguments.

If you could give an example of 'short forms with multiple arguments' then
I might be able to address this.

The ones we were talking about: (br <expr>*), (call <expr>*), etc.

The only benefit that comes to mind is an encoding optimization using the

context to reduce the number of arguments to br and return.

Could we accommodate these encoding optimizations in other ways, at a
lower layer than the AST? Alternatively could we simply have separate
opcodes, some that depend on context and some that do not. To be clear: to
add br and return operators that accept a single expression and pass
through the values of this expression irrespective of the context.

Add a separate operator for calling a function with the concatenation of
all the values from the arguments, rather than taking one value from each
argument as usual. E.g. (mv_call $f5v (call $f2v ...) (i32.const 1) (call
$f2v ...)) would pass five values to $f5v and would subsume the prior
functionality suggest for mv support. Yes, CL can also do this, not just
accept one mv expression as an argument for a mv function call.

Again, this seems like far too much featurism for a low-level language.

;; Example of discarding unused values.

let f1 x = (1, x);;
;; Complains, but allows the results of f1 to be discarded.
let f2 x = f1 10; f1 x;;
;; Can silence the warning - the point is that it is not a programming
error, but the syntax becomes verbose to work around it.
let f2 x = let _ = f1 10 in f1 x;;

All the above are directly expressible in (the originally proposed
multi-value extension of) Wasm.

I thought it would be an arity mismatch to discard values, even all of
them - or was it just an arity mismatch to discard some of them? I am not
arguing that this could not be done, just that it would have required
destructuring to discard them. Perhaps you could give an example of how
efficiently this could be expressed?

All the examples use multiple values. In the proposed multi-value
extension, there were no syntactic arity requirements, so no problem. It
feels like I have clarified this more than once already.

;; Example of consuming only part of result values which can be done,

but verbose.

let f3 x = let (y,_) = f1 x in f1 y;;

;; Example of consuming only part of three return values.
let f4 x = (1, 2, x);;
;; Can be done but verbose, and requires destructuring all the values.
let f5 x = let (y, _, _) = f4 x in f4 y;;

These are also easily possible, even though there is no short-hand for the
(x, _, _) case. That case is no different from cases like (_, _, x) or
(_, x, _), which are just as frequent in practice. There is nothing
special about the former case that would justify a non-trivial subtyping
extension to Wasm just to make it slightly more convenient than the others.

It is not the case in CL and Lua that (_, _, x) and (_, x, _) are just as
common as (x, _, _). The language makes it easy to accept the head and
discard the tail so this is the predominant pattern.

Again, it is a trivial matter to implement this in wasm. I strongly
dispute that it is 'a non-trivial subtyping extension to Wasm just to make
it slightly more convenient than the others.' I am prepared to prove this,
but we don't have any mv implementations yet to modify. I could certainly
demonstrate it with AST validation code for the proposed mv support.

I'd refute "trivial". I'll accept "not difficult". Yet, in a low-level
language, such a feature doesn't carry its weight.

Note that even if Wasm had this, it would help exactly zero with
translating CL or Lua, because lacking static typing, they couldn't
translate their multi-values to Wasm multi-values.

;; Example of consuming only part of three return values, but this

time using lists.

let f6 x = [1; 2; x];;
;; Can be done but verbose, but at least only the consumed values need
destructuring.
let f7 x = let (y::_) = f6 x in f6 y;;

How is that case related to anything? A list does not compile to a
multi-value at all. Compilation of list destructuring has to perform tag
comparisons and projections.

The list expresses the pattern of matching the head of the list and
discarding the rest. It's a weakness of the SML compiler if it does not
optimize this case, just as it would be a weakness of an SML compiler to
box tuples above. With wasm discarding unused trailing values it is
possible to lower the above into very efficient wasm, efficient to
represent in the AST.

Lists are not tuples, they cannot be optimised like that. They are
variable-length and thus have to be boxed. You cannot even know at the call
site that f6 doesn't return nil.

@ghost
Author

ghost commented Feb 2, 2016

Note that even if Wasm had this, it would help exactly zero with
translating CL or Lua, because lacking static typing, they couldn't
translate their multi-values to Wasm multi-values.

Perhaps this is a core misunderstanding of the importance of not being unnecessarily restrictive here. For your edification, below, function f1 returns three values in registers and the consumer uses just one, accepts it in a register, and ignores the rest. I could give larger examples, but the value returning would get lost in the noise.

(defun tst (x)
  (flet ((f1 (y)
           (declare (type (mod 16) y))
           (values y y y)))
    (+ (f1 x) (f1 1))))

(disassemble 'tst)
...
 ;; Function tst entry.
      38:       pop   dword ptr [rbp-16]
      3B:       lea   rsp, [rbp-64]
      3F:       mov   rcx, rdx
      42:       mov   rax, rbp
      45:       mov   rdx, rsp
      48:       sub   rsp, 64
      4C:       mov   [rdx-8], rax
      50:       mov   rbp, rdx
      53:       lea   rax, [rip+6]
      5A:       mov   [rbp-16], rax
      5E:       jmp   L0       ; First call to f1
      60:       mov   [rbp-24], rdx ; First result is in rdx, and is saved to the stack, other results ignored.
      64:       mov   rax, rbp
      67:       mov   rdx, rsp
      6A:       sub   rsp, 64
      6E:       mov   ecx, 8
      73:       mov   [rdx-8], rax
      77:       mov   rbp, rdx
      7A:       lea   rax, [rip+6]
      81:       mov   [rbp-16], rax
      85:       jmp   L0 ; Second call to f1, result is in rdx.
      87:       mov   rax, [rbp-24] ; Load the first argument to + from the stack.
      8B:       add   rax, rdx ; Add it to the second result.
      8E:       mov   rdx, rax
      91:       mov   rcx, [rbp-16]
      95:       mov   rax, [rbp-8]
      99:       add   rcx, 3
      9D:       mov   rsp, rbp
      A0:       mov   rbp, rax
      A3:       jmp   ecx  ; Function return.
 ;; The function f1. Accepts its argument in rcx.
      A5: L0:   mov   rdx, rcx ; Copies to result 1 in rdx
      A8:       mov   rbx, rcx ; Copies to result 2 in rbx, result 3 is in rcx.
      AB:       lea   rsp, [rbp-16]
      AF:       mov   rbp, [rbp-8]
      B3:       ret   8

This could be lowered basically into wasm:

(func $f1 (param $y i32) (result i32) (result i32) (result i32) 
  (values $y $y $y))
(func $tst (param $x i32) (result i32)
  (i32.add (call $f1 (get_local $x)) (call $f1 (i32.const 1))))
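For comparison, a minimal OCaml model of the same tst/f1 pair (tuples standing in for the three register-returned values); the explicit destructuring of all three values is exactly the verbosity the proposal would avoid:

```ocaml
(* OCaml model of the CL example above: f1 returns three values and the
   consumer uses only the first of each call. The (a, _, _) patterns are
   the destructuring the proposed subtyping would make unnecessary. *)
let f1 y = (y, y, y)
let tst x =
  let (a, _, _) = f1 x in
  let (b, _, _) = f1 1 in
  a + b
```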

@rossberg
Member

rossberg commented Feb 2, 2016

Fair enough, but does not change my overall assessment. Wasm is a low-level language. There can't be any expectation that it can represent every convenience feature from a chosen high-level language 1-to-1.

@ghost
Author

ghost commented Feb 5, 2016

Here is another example of a real test that was needed. It is significantly convoluted due to the demanded arity[sic] checks. I realized it could be simplified a little while writing this up; see below.

(module
 (func $f1 (result i32)
   (local $i1 i32)
   (i32.add (block $l0
              (br_if (set_local $i1 (i32.const 1))
                     (set_local $i1 (i32.const 2))
                     $l0)
              (i32.const 0))
            (i32.const 0))
   (get_local $i1))
 (export "f1" $f1))

Without the arity checks this could have been as simple as:

(module
 (func $f1 (result i32)
   (local $i1 i32)
   (block $l0
     (br_if (set_local $i1 (i32.const 1))
            (set_local $i1 (i32.const 2))
            $l0))
   (get_local $i1))
 (export "f1" $f1))

It could also be reworked to the following to work around the arity checks, but I don't believe producers should have this burden of searching for a workaround, which might have more constraints than in this example.

(module
 (func $f1 (result i32)
   (local $i1 i32)
   (block $l0
     (br_if (set_local $i1 (i32.const 1))
            (set_local $i1 (i32.const 2))
            $l0)
     (get_local $i1)))
 (export "f1" $f1))

These checks seem to serve no purpose: they frustrate code production, and the workarounds lead to convoluted code. If the producer wants to discard values it should be easy, and it should not differ between a block fall-through expression and a break expression.

There also seems to be agreement that these checks would not be applicable in future with mv support (although perhaps some flipping on this). Flagging them as invalid now seems to be leading to the encoding optimizing away the capability to express what are expected to be valid expressions in future - a very unfortunate outcome. If there is a desire to optimize the encoding based on the expected number of values in some contexts, then it should be a simple encoding decision (for example, separate opcodes) and not demand re-organization of the AST; can we take another look at that matter later when optimizing the encoding, and fix the AST now, please?

This patch will need a little reworking after the br_if changes, but could someone else please take a look at this issue?

@lukewagner
Member

Consider a (pre- or post-order) decoder that encounters the br opcode. The decoder must now decide whether to decode a child subexpression (if pre-order) or pop a child expression (if post-order). With the current spec, this question is resolved by indexing the stack of enclosing block/loops with br's immediate and looking at that block/loop's type (None you're done, Some you decode/pop the child). If the change in the PR was made, we would instead need to introduce a second opcode or immediate (for each of br and br_if) since otherwise there would be no way to tell. This is the practical consequence of the "syntactic arity matching" @rossberg-chromium is describing above. We could perhaps justify adding opcodes if there were significant demonstrated size savings (of (br_if (foo)) vs (block (foo) (br_if)) in a void context) but then the arguments should be from measurements on large codes. Otherwise, I'd agree with @rossberg-chromium that we don't have sufficient justification to account for the practical cost.
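The decode decision described above can be sketched as a lookup (br_has_operand is a hypothetical helper; block_types models the decoder's stack of enclosing block/loop result types, indexed by br's immediate):

```ocaml
(* Sketch of the pre-/post-order decode decision: index the stack of
   enclosing block/loop types with br's immediate; a None (void) target
   means the br is nullary, a Some target means one child expression is
   decoded (pre-order) or popped (post-order). Hypothetical helper. *)
let br_has_operand (block_types : 'ty option list) (depth : int) : bool =
  match List.nth block_types depth with
  | None -> false     (* void target: you're done *)
  | Some _ -> true    (* typed target: decode/pop the child *)
```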

@ghost
Author

ghost commented Feb 5, 2016

@lukewagner The pre-order decoder has nothing to decide: it knows there is an expression and decodes it, just as for any other fixed-argument opcode. For a post-order decoder an expression is popped. This matter is resolved. There is no arity[sic] issue here. It's just an operator with a fixed number of arguments. Yes, it would be a fixed-argument operator, just as in the v8 encoding for br. This is not something foreign to wasm; there are lots of fixed-argument operators.

The problem is that the semantics of the br expression differ from the fall-through. This demands non-trivial re-organization of the AST to work around. This is the matter to be resolved, and it cannot be settled by looking at the encoding efficiency alone - it complicates producers.

If you want to optimize the number of arguments of operators based on their context then this could be done much more cleanly by introducing new operator variations. These could be substituted in a later encoding pass, and only in places where they fit, without demanding a restructure of the AST. This would only be justified if it had an impact - it might not even be justified, and we might just keep the clean AST operators.

You have also missed the point that with mv support, under either proposal, the very code that is invalid now and supports the optimization you argue for will become valid, so this optimization will not be generally applicable to these same operators in future - it's a dead end.

I would like the AST to be clean now. I would accept as a resolution keeping the context-dependent operators, which could be an encoding option, provided clean fixed-argument-count operators are also available and do not have these artificial arity[sic] restrictions. These would remain fixed-argument operators even with mv support and continue to have the same semantics as the fall-through expression result.

@ghost
Author

ghost commented Feb 5, 2016

Perhaps we could take a break from this discussion, although I'm happy to answer any questions. I'll try to implement something concrete: a patch for v8 adding support for both context-specific and fixed-argument-count break operators. The v8 encoding already uses a nop for fill, so context-specific operators might help address the encoding-efficiency use case, while the current nop-fill variant could handle the general cases. V8 already has a context-specific return operator, so it already uses some top-down context. Rather than a change request, this would then become more of an enhancement.

@lukewagner
Member

I see; I didn't know the v8 encoding always wasted a byte by synthesizing (br (nop)) for (br). Seems like something we should avoid, given the prevalence of nullary (br).

Anyhow, after writing my last comment, I realized that a postorder encoding wouldn't have context available (duh; late night) and so a postorder encoding would also need two separate opcodes to distinguish nullary vs. unary. With the opcode table design, each opcode needs to be defined by some operator name string which does then imply separate br vs. br0 (or whatever you'd call the nullary br). With this design, the AST type of (unary) br would be Br of var * expr (unlike today's Br of var * expr option) and br0 would have type Br0 of var (recall that var is, curiously, just a source-location-annotated int index) which sidesteps the whole question over this PR (no None expr option case to consider).

I know post-order isn't decided (although @MikeHolman and the other Chakra guys were reaffirming their preference recently and I'm also now in favor), but so far this is the only type system issue that I know of that would get in the way; having arity depend on parent context seems to be a peculiar property of br. Scanning ast.ml for option, after #235, there's also Return, but if we view that as simple "macro-expanded" syntactic sugar, then we could say that the expansion of return is either into br or br0 (depending on already-declared return type).

@rossberg-chromium How about it?

@ghost
Copy link
Author

ghost commented Mar 27, 2016

Let me try another appeal that might now be compelling. There seems to be serious consideration of moving to a largely post-order encoding, while still keeping the goal of single-pass SSA conversion and validation. However, a one-pass validator could not know the consumed type of a block when break operations targeting the block label are decoded within the block, and this seems to fundamentally block single-pass validation with the current break arity checks.

Here is an example that currently fails validation with an arity error in the spec ml-proto, but would appear to be impossible to catch in a single pass post-order decoder validator.

(module
 (func
  (block
    (block $l (br $l (i32.const 1)))
    (nop))))

This can be seen in the proposed v8 post-order implementation https://codereview.chromium.org/1830663002/ which, at the end of a block, discards the results of all but the last expression by simply adjusting the AST node stack. It does not make another pass over the decoded expressions to detect these 'arity' conflicts, and doing so would be an added burden.

So I make an extra appeal on this issue from a decoder efficiency perspective.

@MikeHolman
Copy link
Member

One alternative to the methods mentioned here is that we give types to blocks. It would then be required that all brs targeting the block produce the correct type. This allows decoders to do any desired up-front reservation (e.g. for us, on entering a block we would want to reserve a tmp register which all brs would store to), and it allows for a simple lookup to know what to consume, which we already do to find the label we are branching to.
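A minimal sketch of how that could look in a single-pass decoder, assuming a flat op list and made-up shapes (this is not any engine's real API); the block's declared type travels with its label, so each br is checked by the same lookup that finds its target:

```python
def validate(ops):
    """ops: tuples like ('block', 'i32'), ('br', depth, produced), ('end',)."""
    blocks = []  # declared result type of each open block, innermost last
    for op in ops:
        if op[0] == 'block':
            blocks.append(op[1])                # declared type recorded at entry
        elif op[0] == 'br':
            depth, produced = op[1], op[2]
            if produced != blocks[-1 - depth]:  # same lookup as finding the label
                return False
        elif op[0] == 'end':
            blocks.pop()
    return True
```

With declared types, a mismatched br is rejected at the br itself, with no second pass over the decoded expressions.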

This would be similar in nature to how function calls behave -- we know what arguments to read by doing a lookup on the type of the function we are calling.

@lukewagner
Copy link
Member

@JSStats I think it's generally agreed at this point that the validation rules are going to need to change to support single-pass, bottom-up postorder validation. I think the concrete proposed changes are forthcoming, but, fwiw, in our current bottom-up algo in SM, when you enter a block, we push a block on a block stack that contains an out-of-band Any type; then we unify types of all branch values via:

static ExprType Unify(ExprType one, ExprType two) {
    if (one == AnyType) return two;
    if (two == AnyType) return one;
    if (one == two) return one;
    return ExprType::Void;
}

(where Void is our None) and the type of the block is the result of the unification. With this, the above example (and the first examples in this issue) validate.
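To make the algorithm concrete, here is a direct Python paraphrase of the Unify above, applied to the problem block from earlier in the thread (the string type names are stand-ins for SM's ExprType values):

```python
ANY, VOID = 'any', 'void'   # AnyType (out-of-band top) and Void (our None)

def unify(one, two):
    # Same case order as the C++ Unify shown above.
    if one == ANY:
        return two
    if two == ANY:
        return one
    if one == two:
        return one
    return VOID

# (block $l (br $l (i32.const 1))): the block is pushed with ANY, the
# branch contributes i32, so the block's type unifies to i32.
inner = unify(ANY, 'i32')
# The enclosing (block ... (nop)) simply drops that i32 result, so the
# example validates instead of tripping an arity error.
```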

@ghost
Copy link
Author

ghost commented Mar 29, 2016

@lukewagner Thank you.

The SM rule seems to validate this: (i32.eqz (block $l (br $l (unreachable)))) if I understand correctly that the AnyType is the result of unreachable, and would also give an expected validation failure on this (i32.eqz (block $l (br $l (nop)))) for the same reason as for (i32.eqz (nop)), where nop returns the void type. So this looks good.

The following also appears to validate under this rule: (block (block $l (br $l (i32.const 1)) (f32.const 1.0)) (nop)), which currently fails only because of the 'arity' check disputed here. The same rule could then also be used for this currently valid case: (block (if (i32.const 1) (i32.const 1) (f32.const 1.0)) (nop)). So this looks good too.

This rule seems to basically assert that where result branches have different types they must not be consumed, and it delays this check until they are consumed to support post-order single-pass validation? Such a rule would appear to scale very nicely to multiple values where only a subset is consumed, but it would be good to extrapolate it to multiple values to understand this too.

Fwiw (probably bike-shedding), in CL a distinction is made between expected types and actual types because their validity is defined by a set relationship: the actual type must be a subset of the expected type. So the SM expected Void type would be (values &rest t), which means the set of any number of values each with any type, whereas nop would have the type (values), which is more specific and a subset of (values &rest t). The 'Unify' operation seems to be basically a type union operation where only trivial unions validate, so complex-union could be an alternative name for the actual type named Void. There is a similar distinction under this set theory for the SM AnyType, which I guess means accepted anywhere: as an expression result type it would be the empty set, so it is a subset of any expected type and accepted anywhere.

@lukewagner
Copy link
Member

Correct on those examples validating in SM (specifically after bug 1254142 lands). I should be clear that this is in anticipation of upcoming bottom-up/post-order validation changes (we were forced to flip from top-down to bottom-up for the current binary encoding design which has no end marker for function bodies) and subject to change as the group hammers this out over the next few weeks.

FWIW, my interpretation of our ExprType is that it is a bounded meet-semilattice with AnyType as Top, Void as Bottom and Unify as the meet operator (I guess I was in a Hindley-Milner mood when I wrote it Unify ;). With multi-return, it seems like we could keep the same lattice structure but just expand the carrier set to be an infinite set of tuples of value types and AnyType instead of the finite set of types with AnyType and Void (which still leaves the question of whether <i32,f32> /\ <i32> is <i32> or <>).
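One way to extend that meet to tuples in a few lines of Python, choosing the "common prefix" answer to the <i32,f32> /\ <i32> question (so the result is ('i32',), not ()); this is just one of the two options mentioned above, not a decided design:

```python
ANY = 'any'  # top element, stands in for AnyType

def meet(a, b):
    """Meet of two result types, each a tuple of value types or ANY."""
    if a == ANY:
        return b
    if b == ANY:
        return a
    prefix = []
    for x, y in zip(a, b):   # zip stops at the shorter tuple
        if x != y:
            break            # first conflict ends the usable prefix
        prefix.append(x)
    return tuple(prefix)     # only this prefix may be consumed
```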

@titzer
Copy link
Contributor

titzer commented Mar 29, 2016


I think we'll want arities for multi-returns to match.



@ghost
Copy link
Author

ghost commented Mar 29, 2016

@titzer We don't need arities, because the numbers of values from multiple-value expressions do not need to match; it is only necessary that the consumed values match. This can already be seen for the case of zero-value and one-value expressions: in (block (if (i32.const 1) (nop) (i32.const 1)) (nop)) the if node returns differing numbers of values, but these are not consumed, so it is not an issue to support. Similarly the following would be fine too: (if (i32.const 1) (tuple (i32.const 1) (i32.const 2)) (i32.const 1)). A wasm type validator would just need to note the equivalent value types and those that conflict, and check that only the equivalent values are consumed.

@titzer
Copy link
Contributor

titzer commented Mar 29, 2016

If I understand Luke's lattice proposal, Unify of mismatched arities will
result in top.


@lukewagner
Copy link
Member

@titzer Well, bottom (i.e., Void), which means the result can only be dropped, never used. Top (i.e. AnyType) means a value that can be used as anything, since it's unreachable (due to the branch).

@ghost
Copy link
Author

ghost commented Mar 30, 2016

@titzer But the same model could be extrapolated to multiple values in a different manner, so that values of the same type could be consumed while still failing on any attempt to consume a union of values of different types. So this seems an issue with the model, not a constraint on wasm. Here is a model I propose, with some examples. Although it is defined by bottom-up procedures below, I believe it is equivalent to set-theory relationships, so it applies bottom-up or top-down.

(defun validate-single-value-consumer (expected actual)
  (assert (member expected '(:i32 :i64 :f32 :f64)))
  (assert (member actual '(:i32 :i64 :f32 :f64 :complex-union :empty)))
  (and (not (eq actual :complex-union))
       (or (eq actual :empty)
           (eq actual expected))))

(defun validate-values-consumer (expected actual)
  (do ((expected expected (rest expected))
       (actual actual (rest actual)))
      ((endp expected) t)
    (unless (and actual (validate-single-value-consumer (first expected)
                                                        (first actual)))
      (return nil))))

(defun values-type-union (values1 values2)
  (assert (listp values1))
  (assert (listp values2))
  (let ((union nil))
    (do ((values1 values1 (rest values1))
         (values2 values2 (rest values2)))
        ((and (endp values1) (endp values2)))
      (if (or (endp values1) (endp values2))
          (push :complex-union union)
          (let ((values1* (first values1))
                (values2* (first values2)))
            (assert (member values1* '(:i32 :i64 :f32 :f64 :complex-union :empty)))
            (assert (member values2* '(:i32 :i64 :f32 :f64 :complex-union :empty)))
            (push (cond ((eq values1* :empty) values2*)
                        ((eq values2* :empty) values1*)
                        ((eq values1* values2*) values1*)
                        (t :complex-union))
                  union))))
    (nreverse union)))

;;; Let the actual type of wasm (nop) be ()
;;; Let the actual type of wasm (unreachable) be (:empty)
;;; Let the actual type of wasm (i32.const 1) be (:i32) etc

;;; Firstly some zero and one values cases to check that it is
;;; consistent with the current wasm type validation.

;;; Actual type of wasm: (if ... (i32.const 1) (i32.const 2))
;;; Actual type of wasm: (block $l (br $l (i32.const 1)) (i32.const 2))
(assert (equal (values-type-union '(:i32) '(:i32)) '(:i32)))
;;; Validate these being consumed by wasm i32.eqz
(assert (validate-values-consumer '(:i32) '(:i32)))

;;; Actual type of wasm: (block $l (br $l (i64.const 1)) (i64.const 2))
(assert (equal (values-type-union '(:i64) '(:i64)) '(:i64)))
;;; Validate failure being consumed by wasm i32.eqz
(assert (not (validate-values-consumer '(:i32) '(:i64))))

;;; Actual type of wasm: (block $l (br $l (nop)) (i32.const 2))
(assert (equal (values-type-union '() '(:i32)) '(:complex-union)))
;;; Validate failure being consumed by wasm i32.eqz
(assert (not (validate-values-consumer '(:i32) '(:complex-union))))
;;; Validate success being consumed by wasm (block (...) (nop))
(assert (validate-values-consumer '() '(:complex-union)))

;;; Actual type of wasm: (block $l (br $l (i32.const 1)) (i64.const 2))
(assert (equal (values-type-union '(:i32) '(:i64)) '(:complex-union)))
;;; Validate failure being consumed by wasm i32.eqz
(assert (not (validate-values-consumer '(:i32) '(:complex-union))))
;;; Validate success being consumed by wasm (block (...) (nop))
(assert (validate-values-consumer '() '(:complex-union)))

;;; Actual type of wasm: (block $l (br $l (unreachable)) (i32.const 2))
(assert (equal (values-type-union '(:empty) '(:i32)) '(:i32)))
;;; Validate success being consumed by wasm i32.eqz
(assert (validate-values-consumer '(:i32) '(:i32)))


;;; Extension to more than one values.
;;; Let the actual type of (tuple (i32.const 1) (i64.const 2)) be (:i32 :i64)

;;; Actual type of wasm: (if ... (tuple (i32.const 1) (i64.const 2)) (i32.const 3))
;;; Actual type of wasm: (block $l (br $l (tuple (i32.const 1) (i64.const 2))) (i32.const 3))
(assert (equal (values-type-union '(:i32 :i64) '(:i32)) '(:i32 :complex-union)))
;;; Validate success being consumed by wasm i32.eqz
(assert (validate-values-consumer '(:i32) '(:i32 :complex-union)))

;;; Actual type of wasm: (block $l (br $l (tuple (i32.const 1) (i64.const 2))) (tuple (i64.const 1) (i32.const 2)))
(assert (equal (values-type-union '(:i32 :i64) '(:i64 :i32)) '(:complex-union :complex-union)))
;;; Validate failure being consumed by wasm i32.eqz
(assert (not (validate-values-consumer '(:i32) '(:complex-union :complex-union))))
;;; Validate success being consumed by wasm (block (...) (nop))
(assert (validate-values-consumer '() '(:complex-union :complex-union)))

;;; Actual type of wasm: (block $l (br $l (tuple (i32.const 1) (unreachable))) (tuple (i32.const 1) (i32.const 2)))
(assert (equal (values-type-union '(:i32 :empty) '(:i32 :i32)) '(:i32 :i32)))
;;; Validate success being consumed by wasm i32.eqz
(assert (validate-values-consumer '(:i32) '(:i32 :i32)))
;;; Validate success being consumed by wasm (mv_set_local ($a $b) (...))
(assert (validate-values-consumer '(:i32 :i32) '(:i32 :i32)))

;;; Let the actual type of wasm (tuple (i32.const 1) (i64.const 2) (i32.const 3)) be (:i32 :i64 :i32)
;;; Validate success being consumed by wasm (mv_set_local ($a $b $c) (...))
(assert (validate-values-consumer '(:i32 :i64 :i32) '(:i32 :i64 :i32)))
;;; Validate success being consumed by wasm (mv_set_local ($a $b) (...))
(assert (validate-values-consumer '(:i32 :i64) '(:i32 :i64 :i32)))
;;; Validate success being consumed by wasm i32.eqz
(assert (validate-values-consumer '(:i32) '(:i32 :i64 :i32)))

I believe I could demonstrate single-pass SSA translation and validation with this model. Multiple-value expression results would be transformed into multiple SSA definitions, one for each expression value; only the consumed values would be validated, and attempts to merge definitions with conflicting types would be dropped, as they are either never used or result in a validation failure caught by the propagated node type. At the end of single-pass SSA translation the multiple-value expressions would all be gone, except perhaps for some annotations on nodes that are sources of them, such as calls, and at function exits where they are constructed.

It would be good to understand potential problems people see with this model to be able to contrast it with other proposals?

I understand supporting multiple value expressions is not necessary, that wasm need not even support blocks returning single values or even expressions at all. My arguments will be: that it's not too large a burden and I may well be able to demonstrate this in an implementation; and that it can lead to more compact wasm code compared with shuffling everything through local variables.

Even if wasm had no expression support at all and all operators wrote results to local variables, I would still like to see support for returning multiple values from a function and receiving them into multiple local variables in the callee because it could support returning these in hardware registers offering improved performance.

…cking consistent with the fall-through expression.

The prior code made the number of arguments to the break and return operators dependent on the number of expected values, but this property does not hold in general. For example, consider a single break with a single expression that returns the type (), the same type as `nop`: the number of expected values could be zero, yet there is one argument.

Further the prior code appeared to make an interpretation of the arguments as a tuple constructor that would increase with multiple-value support, yet this is not consistent with the fall-through which is a single expression.

The function return operator has the same issue. Even though the function expected results might be considered side-information, the number of arguments to return does not in general match the number of expected function results.

Wasm already has the start of multiple-value expressions: there are single-value results and the empty-values result from nop, so even now this needs some consideration to be consistent.

There is some interest in exploring serializations of the AST that are not dependent on context, and thus a need for each operator to have a defined number of arguments: not necessarily fixed, but if not fixed then encoded in the operator.

There is also a strong interest in having an efficient encoding for the zero argument break and return operators, even if this means exposing this at the AST level.

The solution here adds separate zero-expression break and return operators, which have a fixed number of arguments. These are only valid when the expected number of values is zero, but this does not add a new constraint and remains consistent with the fall-through expressions. The current break and return operators become fixed-argument single-expression operators whose expression has the same semantics as the fall-through.
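Under this scheme a decoder never needs parent context to know how many operands a break pops. A sketch, with assumed opcode names (br0/return0 are the zero-expression variants proposed here):

```python
ARITY = {'br0': 0, 'br': 1, 'return0': 0, 'return': 1}  # fixed per opcode

def pop_break_operands(opcode, value_stack):
    """Pop exactly the operands the opcode declares, context-free."""
    n = ARITY[opcode]
    if len(value_stack) < n:
        raise ValueError('stack underflow decoding ' + opcode)
    operands = value_stack[-n:] if n else []
    del value_stack[len(value_stack) - n:]   # no-op when n == 0
    return operands
```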

Future multi-value expression support would be expected to add an explicit tuple or values constructor (details yet to be decided), and these would likely be usable both for the fall-through and for the break or return expression, so no new break or return operators would be needed.
@rossberg
Copy link
Member

rossberg commented Apr 4, 2016

@lukewagner, I don't think the spec needs to (or should) change for post-order. With type systems, you usually have two distinct concerns: a declarative specification of typing rules, and an algorithmic implementation (or many possible ones). The former is completely unaffected by the move to a post-order encoding. Implementations merely want a different algorithm for checking now. For a spec, however, you ultimately want to use the usual declarative formulation, and the current top-down formulation happens to be in almost 1:1 correspondence to that (in particular, because it does not require type variables or unification).

It might be useful, however, to add a bottom-up algorithm to the spec as a kind of "implementation note". In fact, that's already on my list of things to look into. But I don't think this algorithm should become the actual spec.

@lukewagner
Copy link
Member

@rossberg-chromium If the block/loop/if opcodes don't declare their type as an immediate then there is no expected type to check against the children of these AST nodes as the children are encountered during single-pass decoding and something more complicated will be necessary to raise validation errors in cases like the above. (Have you implemented the bottom-up/post-order decoding for this (without some sort of expected-type immediate) and there's something easy that I'm missing?)

@titzer
Copy link
Contributor

titzer commented Apr 4, 2016

I think Andreas is arguing that bottom-up and top-down should still be equivalent, and that single-pass bottom-up is therefore not observable and is an implementation detail, regardless of postorder.


@lukewagner
Copy link
Member

I understand that, but I'm asking if implementing the precise validation rules in ml-proto will cause undue hardship for a post-order/bottom-up impl. E.g., I think V8 does not precisely implement these rules and accepts the above example, right? Are you excited to make that fail to validate :) ?

@rossberg
Copy link
Member

rossberg commented Apr 4, 2016

@lukewagner, yes, I'm aware of the problem, but that's an algorithmic problem entirely, it does not affect the definition of the type system. Declaratively, typing rules have no direction -- they don't check one type against another, they just require certain types to be in suitable relations. A type system's specification does not need to say how to actually come up with the types in this relation (as long as we can show that there is an algorithm that can, and we can give several in this case).

Since you mentioned Hindley/Milner, note the similar difference between the H/M type system and the algorithm W for type checking it. The former is much simpler than the latter, and needs no notion of unification or anything like that.

@rossberg
Copy link
Member

rossberg commented Apr 4, 2016

(Oops, sent before I saw your latest comment.)

@lukewagner
Copy link
Member

Yes, sorry if I was unclear: I'm asking if the definition needs to change to admit a simpler algorithm.

@rossberg
Copy link
Member

rossberg commented Apr 4, 2016

@lukewagner, the example is one we've already been discussing independent of post-order, so I would argue it's mostly orthogonal. But you are right that the ease of a post-order algorithm may be another argument for allowing it.

@lukewagner
Copy link
Member

mostly :)

@ghost
Copy link
Author

ghost commented Apr 5, 2016

I just noticed that a cool pre-order AST single-pass compiler to a stack machine byte code, and an interpreter for this byte code, landed in sexpr-wasm. I am interested in adapting this to try to demonstrate some different designs for multiple-value support, and in the limit for zero- and one-value expressions for the MVP.

The compiler does not appear to exploit the known expected type of blocks, and could I ask if this is a deliberate design decision and perhaps in anticipation of a post-order AST compiler?

One approach I could explore is to determine the expected type for blocks when they are created (at the shift operation). The expected type appears to be available with the pre-order decoder by looking at the top of the expression stack. Then the breaks could just compare to this without the 'unify' logic. I could scale this to multiple values by having the break operators compile in discard/keep operators that discard excess values and keep the expected number of values. Unfortunately I don't think this would work with a single pass compiler of a post-order AST as it would not know the expected type of a block before the break operators.

It's not immediately clear if this compiler would work single-pass with a post-order AST encoding?

The other approach that comes to mind is to use the type (number of values) of the first block break as the actual result type of the block. If other break operators passed more values, then the excess would be discarded; if fewer, then the compiler might need to emit fill values and validate that they are never used. If break operators had conflicting value types for some of the values, then these conflicts could be ignored too, as they would either never be used or would cause a validation error.
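A sketch of that second approach, with assumed shapes (not sexpr-wasm's real API): the first br to a label fixes the block's arity, later brs with more values would compile to discards, and fewer would force fill values that must stay unused:

```python
def block_arity(break_counts):
    """break_counts: values passed by each br to one label, in source order.
    Returns (arity fixed by the first break, whether fill values are needed)."""
    if not break_counts:
        return 0, False
    first = break_counts[0]
    # Counts above 'first' mean excess values to discard (fine);
    # counts below it mean fill values that validation must prove unused.
    needs_fill = any(n < first for n in break_counts[1:])
    return first, needs_fill
```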

@drom
Copy link

drom commented Apr 5, 2016

@JSStats stack machine? interesting. Can you give a link?

@ghost
Copy link
Author

ghost commented Apr 5, 2016

@binji
Copy link
Member

binji commented Apr 5, 2016

The compiler does not appear to exploit the known expected type of blocks, and could I ask if this is a deliberate design decision and perhaps in anticipation of a post-order AST compiler?

You know, I did experiment with this, but I can't remember why I moved away from it. I just looked at the diff in my reflog, but I still can't remember why. :-}

I definitely ran into issues type-checking top-down, especially making this work with the expected value stack -- i.e. what values if any should be discarded. But then again there were a number of issues with the code at that point, so it doesn't mean that it couldn't have worked.

@ghost
Copy link
Author

ghost commented Apr 16, 2016

My current thinking is that blocks should not return values, and that local variables should be used for this use case instead.

@ghost ghost closed this Apr 16, 2016
ngzhian added a commit to ngzhian/spec that referenced this pull request Nov 4, 2021
* Fix string comparison

Classic newbie mistake of using != on strings. Plus I got the
conditional wrong - it should error if s is none of the valid simd
shapes.
dhil pushed a commit to dhil/webassembly-spec that referenced this pull request Mar 2, 2023
Fixes WebAssembly#215.

This should indicate that an engine is still conforming if it does not attach a stack trace, even if `traceStack` is requested by JavaScript code (by setting it to `true`).
This pull request was closed.
6 participants