Conversation
document/macros.rst
Outdated
| @@ -0,0 +1,146 @@ | |||
| .. |X| mathmacro:: \mathit | |||
| .. |K| mathmacro:: \mathsf | |||
| .. |F| mathmacro:: \mathrm | |||
There was a problem hiding this comment.
Does "X" have a meaning by which one could remember that it means "LaTeX \mathit"? Ditto for the others here.
There was a problem hiding this comment.
Yes: X - variable, K - keyword, F - function. I added comments to this and other parts of the macro file.
(These should in fact be use for all the defs below, but mathmacro.py doesn't support iterative expansion.)
document/macros.rst
Outdated
| .. |evalto| mathmacro:: \hookrightarrow | ||
| .. |lsem| mathmacro:: {[\![} | ||
| .. |rsem| mathmacro:: {]\!]} | ||
| .. |sem| mathmacro:: \def\sem#1{[\![#1]\!]}\sem |
There was a problem hiding this comment.
Can you clarify the purpose of these macros?
There was a problem hiding this comment.
Defines semantic brackets. Unused in this PR, removed.
document/syntax/conventions.rst
Outdated
|
|
||
| * :math:`r.\K{field}` denotes the :math:`\K{field}` component of :math:`r`. | ||
|
|
||
| * :math:`r,\K{field}\,s` denotes the same record as :math:`r` but with the sequence :math:`s` appended to its :math:`\K{field}` component. |
There was a problem hiding this comment.
While this syntax for appending an element to a field may be natural to some, it's very terse and doesn't leave any clues for people who don't recognize it to look it up with. Do you anticipate a particularly frequent purpose for this? As this is a prose spec, I think it would be better to use prose to explain things, rather than specialized syntax.
| ~~~~~~~ | ||
|
|
||
| *Vectors* are bracketed sequences of the form :math:`[A^n]` (or :math:`[A^\ast]`), | ||
| where the :math:`A`-s can either be values or complex constructions. |
There was a problem hiding this comment.
What's a bracketed sequence? This seems to be conflating the notation used in the spec with the abstract concept of a sequence.
There was a problem hiding this comment.
Agreed, not a good choice of word. I was looking for a term distinguishing these from other forms of sequences, like instruction sequences that need a terminator.
Replaced with "self-contained", but not great either. Open to suggestions.
document/syntax/values.rst
Outdated
| Conventions | ||
| ........... | ||
|
|
||
| * The meta variables :math:`m, n, i, j, k` range over unsigned integers. |
There was a problem hiding this comment.
Is this a useful convention that will make the document more clear? (And same for b meaning bytes, and so on.)
One of the main use cases for spec documents is jumping into the middle of it to answer a specific question without having the full context of the document's conventions in mind. Someone may know basically how wasm works, but need to know a specific detail, and may not be holding the whole spec document in their head to remember that all these letters mean "unsigned". My suggestion is to focus on writing the prose spec in a way that doesn't require subtle and terse conventions like this.
There was a problem hiding this comment.
The intention is to only use it where clear from immediate syntactic context. For example, you want to be able to write "i32.const n" or "get_local x" etc. based on some established convention.
That said, I probably over-allocated there, so removed some of the vars for natural numbers until actually needed.
document/macros.rst
Outdated
| .. |byte| mathmacro:: \mathrm{byte} | ||
| .. |bytes| mathmacro:: \mathrm{bytes} | ||
| .. |encode| mathmacro:: \enc | ||
| .. |encodex| mathmacro:: \encx |
There was a problem hiding this comment.
What's an "encodex"? I'm asking many of these questions because jumping back and forth between this document and the rendered form or the macro definitions is somewhat tedious, so I'd find it helpful if macros and other utilities were fairly obvious, so that one can mostly get by by reading the source without jumping around.
There was a problem hiding this comment.
Also not currently used, removed. (Left-overs from playing with notations for expressing encoding.)
document/macros.rst
Outdated
| .. |encodex| mathmacro:: \encx | ||
| .. |encoding| mathmacro:: \def\enc#1{{[\![#1]\!]}} \def\encx#1#2{{[\![#1]\!]}_{#2}} | ||
|
|
||
| .. |uX| mathmacro:: \def\uX#1{\mathit{uint}_{#1}}\uX |
There was a problem hiding this comment.
If the rendered document uses names like "uint8", my suggestion would be to use names like "uint8" in the macro names. Or, if it's desirable to be more concise in the source, my suggestion would be to spell these as "u8" in the rendered form, for when one does have to jump back and forth between rendered form and source.
There was a problem hiding this comment.
Fair enough. It is not clear to me yet which naming works best over the length of the whole spec, so I'd prefer to leave this alone for now to avoid churn.
document/macros.rst
Outdated
| .. |vec| mathmacro:: \href{../syntax/values.html#syntax-vec}{\mathit{vec}} | ||
| .. |string| mathmacro:: \href{../syntax/values.html#syntax-string}{\mathit{string}} | ||
| .. |name| mathmacro:: \href{../syntax/values.html#syntax-name}{\mathit{name}} | ||
| .. |sec| mathmacro:: \mathit{sec} |
There was a problem hiding this comment.
"sec" here is short for "section", right? Would it make sense to just say "section", or is this frequent enough to be worth abbreviating?
There was a problem hiding this comment.
Currently unused, removed.
document/mathmacro.py
Outdated
| 'numpydoc', | ||
| 'mathmacro'] | ||
|
|
||
| From https://bitbucket.org/fluiddyn/fluiddyn/src/doc/mathmacro.py |
There was a problem hiding this comment.
This appears to be a broken link.
From a little searching, this file appears to be copied from a codebase using the CeCILL-B license. I am not a lawyer. Do you know if the CeCILL-B license's requirements, possibly including but not limited to its attribution requirements, are acceptable for inclusion in this project?
There was a problem hiding this comment.
Fixed link.
Good question, I have no idea. In the worst case, we can reimplement it or just run sed over the sources instead (which may be faster anyway, and might fix the underscore restriction).
@Others, anybody knows about the CeCILL-B license?
document/syntax/values.rst
Outdated
|
|
||
| * The meta variables :math:`m, n, i, j, k` range over unsigned integers. | ||
|
|
||
| * Numbers may be denoted by simple arithmetics. |
There was a problem hiding this comment.
What are "simple arithmetics?" in this context?
There was a problem hiding this comment.
Clarified by pointing to example.
document/macros.rst
Outdated
| .. |IMPORTS| mathmacro:: \mathsf{imports} | ||
| .. |EXPORTS| mathmacro:: \mathsf{exports} | ||
|
|
||
| .. |UNREACHABLE| mathmacro:: \mathsf{unreachable} |
There was a problem hiding this comment.
Other mathmacros are lower-case; is there a reason the instructions are upper-case? It would seem to minimize source versus rendered-text differences to spell the instruction names as lower-case too.
There was a problem hiding this comment.
The convention is: keywords uppercase, others lowercase. Note that various names naturally have definitions as both (e.g. func, global, and other keywords heading respective forms), so you need some convention to distinguish. And uppercase for keywords also happens to improve readability of the Latex somewhat.
document/macros.rst
Outdated
| .. |ELSE| mathmacro:: \mathsf{else} | ||
| .. |END| mathmacro:: \mathsf{end} | ||
| .. |BR| mathmacro:: \mathsf{br} | ||
| .. |BRIF| mathmacro:: \mathsf{br\_if} |
There was a problem hiding this comment.
And in any case, preserving the underscores in the instruction names that have them would improve readability of the source, and searchability as well.
There was a problem hiding this comment.
Agreed, but unfortunately, mathmacro.py does not appreciate underscores in macro names. :(
rossberg
left a comment
There was a problem hiding this comment.
Thanks for the review. Comments addressed as noted. In particular, I cleaned up and commented the macro file, after realising that comments are actually possible. :)
document/macros.rst
Outdated
| @@ -0,0 +1,146 @@ | |||
| .. |X| mathmacro:: \mathit | |||
| .. |K| mathmacro:: \mathsf | |||
| .. |F| mathmacro:: \mathrm | |||
There was a problem hiding this comment.
Yes: X - variable, K - keyword, F - function. I added comments to this and other parts of the macro file.
(These should in fact be use for all the defs below, but mathmacro.py doesn't support iterative expansion.)
document/macros.rst
Outdated
| .. |evalto| mathmacro:: \hookrightarrow | ||
| .. |lsem| mathmacro:: {[\![} | ||
| .. |rsem| mathmacro:: {]\!]} | ||
| .. |sem| mathmacro:: \def\sem#1{[\![#1]\!]}\sem |
There was a problem hiding this comment.
Defines semantic brackets. Unused in this PR, removed.
document/macros.rst
Outdated
| .. |rsem| mathmacro:: {]\!]} | ||
| .. |sem| mathmacro:: \def\sem#1{[\![#1]\!]}\sem | ||
|
|
||
| .. |void| mathmacro:: \def\void#1{}\void |
There was a problem hiding this comment.
That's to comment out math stuff easily, highly useful. Added comment.
document/macros.rst
Outdated
| .. |byte| mathmacro:: \mathrm{byte} | ||
| .. |bytes| mathmacro:: \mathrm{bytes} | ||
| .. |encode| mathmacro:: \enc | ||
| .. |encodex| mathmacro:: \encx |
There was a problem hiding this comment.
Also not currently used, removed. (Left-overs from playing with notations for expressing encoding.)
document/macros.rst
Outdated
| .. |encodex| mathmacro:: \encx | ||
| .. |encoding| mathmacro:: \def\enc#1{{[\![#1]\!]}} \def\encx#1#2{{[\![#1]\!]}_{#2}} | ||
|
|
||
| .. |uX| mathmacro:: \def\uX#1{\mathit{uint}_{#1}}\uX |
There was a problem hiding this comment.
Fair enough. It is not clear to me yet which naming works best over the length of the whole spec, so I'd prefer to leave this alone for now to avoid churn.
document/syntax/values.rst
Outdated
| ~~~~~ | ||
|
|
||
| The simplest form of value are raw uninterpreted *bytes*. | ||
| In the abstract they are represented as hexadecimal literals. |
There was a problem hiding this comment.
Typo, was supposed to say "in the abstract syntax". :)
| The latter class defines *uninterpreted* integers, whose signedness interpretation can vary depending on context. | ||
| A 2's complement interpretation is assumed for out-of-range values. | ||
| That is, semantically, when interpreted as unsigned, negative values :math:`-n` convert to :math:`2^N-n`, | ||
| and when interpreted as signed, positive values :math:`n \geq 2^{N-1}` convert to :math:`n-2^N`. |
document/syntax/values.rst
Outdated
|
|
||
| * The meta variables :math:`m, n, i, j, k` range over unsigned integers. | ||
|
|
||
| * Numbers may be denoted by simple arithmetics. |
There was a problem hiding this comment.
Clarified by pointing to example.
document/syntax/values.rst
Outdated
| Conventions | ||
| ........... | ||
|
|
||
| * The meta variables :math:`m, n, i, j, k` range over unsigned integers. |
There was a problem hiding this comment.
The intention is to only use it where clear from immediate syntactic context. For example, you want to be able to write "i32.const n" or "get_local x" etc. based on some established convention.
That said, I probably over-allocated there, so removed some of the vars for natural numbers until actually needed.
| ~~~~~~~ | ||
|
|
||
| *Vectors* are bracketed sequences of the form :math:`[A^n]` (or :math:`[A^\ast]`), | ||
| where the :math:`A`-s can either be values or complex constructions. |
There was a problem hiding this comment.
Agreed, not a good choice of word. I was looking for a term distinguishing these from other forms of sequences, like instruction sequences that need a terminator.
Replaced with "self-contained", but not great either. Open to suggestions.
document/syntax/instructions.rst
Outdated
|
|
||
| Some integer instructions come in two flavours, | ||
| where a signedness annotation |sx| distinguishes whether the operands are to be interpreted as :ref:`unsigned <syntax-uint>` or :ref:`signed <syntax-sint>` integers. | ||
| For the other integer instructions, the sign interpretation is irrelevant under a 2's complement interpretation. |
There was a problem hiding this comment.
Pedantic nit: 2's complement is a signed interpretation. So if you want to mention 2's complement here, it'd be better to mention it in in the previous sentence with the signed integer instructions here. For the sign-agnostic instructions, perhaps say something like "For the other integer instructions, there is no inherent numerical interpretation", maybe adding something like "(the use of 2's complement for the signed interpretation means that common arithmetic operations behave the same regardless of whether the data is considered signed or unsigned)" if it's useful to clarify.
There was a problem hiding this comment.
Sounds good. The latter sentence actually is to the point, so I went with that.
| The latter class defines *uninterpreted* integers, whose signedness interpretation can vary depending on context. | ||
| A 2's complement conversion is assumed for values that are out-of-range for a chosen interpretation. | ||
| That is, semantically, when interpreted as unsigned, negative values :math:`-n` convert to :math:`2^N-n`, | ||
| and when interpreted as signed, positive values :math:`n \geq 2^{N-1}` convert to :math:`n-2^N`. |
There was a problem hiding this comment.
I'm still a little unclear about whether there's a subtle distinction being made here. In a purely abstract context, it makes sense to have integers, and various subsets of integers such as non-negative integers, non-negative integers less than 2³², integers in [-2³¹,2³¹), or similar. The concepts of unsigned and signed are more relevant to encoding.
It's not obvious to me that a rigorous separation between abstract syntax, validation, semantics, and representations is important for a document such as this in the first place. However, taking the text here for what it is right now, the distinction is being made, and it doesn't seem to be entirely clear.
There was a problem hiding this comment.
Well, you don't want the AST to contain programs that are not actually expressible in the concrete syntax. Otherwise you gonna run into annoying impedance mismatches. For example, you'd have to specify what happens for out-of-bounds values (e.g. an offset >= 2^32) in places, or impose range restrictions on a meta level, which is more difficult to handle.
Signedness is especially relevant for semantics, think natural vs integer domains. In particular, most numbers in the AST are natural, and defining that directly avoids the need to say anything about negative values all over the place.
You may be right, though, that expressing this in terms of signedness is mixing levels a little. It might make sense to distinguish static (all natural) numbers as they occur in the AST, from dynamic values like used for computation and only occurring in the AST with const instructions. Would naming the former "nat" make that distinction clearer?
The bit about conversion of uninterpreted integers is intended more as an explanation than a definition here. The actual definition will occur in the places of the spec where such values are actually produced/consumed. But I thought it's useful to mention upfront how they will be dealt with. But if the explanation is unclear it probably needs improvement. I reworded it somewhat, not sure it's better now. Also, it should link to the part of the spec that makes this more precise; I added a todo for that.
| ~~~~~ | ||
|
|
||
| The simplest form of value are raw uninterpreted *bytes*. | ||
| In the abstract syntax they are represented as hexadecimal literals. |
There was a problem hiding this comment.
In the abstract syntax, it seems most natural for bytes to just be integers (a subset thereof), not hexadecimal integers, because wasm doesn't have any semantics formulated in terms of hexadecimal digits.
There was a problem hiding this comment.
This is supposed to convey the difference between numbers and something that (even semantically) is just a bit pattern with no inherent interpretation. They are different types, and require explicit conversion in the semantics. For example, the memory consists of bytes. I'm not wed to using hex notation here, but it feels kind of natural, and can be reused nicely for describing the binary decoding (which operates on bytes, too).
| ~~~~~~~ | ||
|
|
||
| *Vectors* are self-contained sequences of the form :math:`A^n` (or :math:`A^\ast`), | ||
| where the :math:`A`-s can either be values or complex constructions. |
There was a problem hiding this comment.
Now I'm confused by the phrase "self-contained" ;-). It seems how the length of a sequence is represented, via a size field or a terminator or something else, is an encoding concern, which doesn't belong in the abstract syntax.
I guess "vectors" are being defined in anticipation of having a common way to describe things which are encoded as length+elements, since that's a common pattern in wasm encoding. My suggestion for wording would be to just define them as plain "sequences", and then add a note or so mentioning that vectors have a particular role in the binary encoding, making the minor reach into a different domain explicit.
There was a problem hiding this comment.
Yes, it wasn't great. But I realised what the actual property is that these have in contrast to arbitrary sequences: their length is bounded to 32 bit. So I reworded to "bounded" and refined the grammar to express that restriction.
document/syntax/values.rst
Outdated
| Floating-point Numbers | ||
| ~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| *Floating-point numbers* are represented as binary values according to the `IEEE-754 <http://ieeexplore.ieee.org/document/4610935/>`_ standard. |
There was a problem hiding this comment.
At the risk of being even more pedantic, "represented as" is a representation concern which doesn't seem to belong in an abstract syntax. And "Numbers" is sometimes taken to exclude NaN. And there's no dash between "IEEE" and "754". :-}
In short, I recommend naming this section "Floating-point Data", and saying: "WebAssembly includes support for binary floating-point data, as described by the IEEE 754 standard [url]".
document/syntax/values.rst
Outdated
| Conventions | ||
| ........... | ||
|
|
||
| * The meta variable :math:`z` ranges over floating point values. |
There was a problem hiding this comment.
My concern with these single-letter conventions is the use case of jumping into the middle of the document, with a general understanding of wasm, but without full context of the document's conventions. This is something I've found difficult in the interpreter.
It seems to me to make sense to define meta variables in context, rather than globally, so that any subtle issues, like what the specific set of valid values is, or what the consequences are if someone uses such a value outside that set, can be handled in context. For example, it looks like n is sometimes used to abstract over i32 and i64. It seems worth saying that n can only be those two values, in that context, and in that case, it's fully defined for that context and doesn't need a global convention.
Alternatively, would it be possible to make n be a hyperlink to the place where it's defined?
There was a problem hiding this comment.
Agreed for ones that are used rarely, such as this one. Removed this one.
Some others, e.g. n or t, occur very frequently. Their meaning should always be clear from immediate context (and I believe all use sites so far are cross-linked heavily), but pointing out a consistent convention upfront still seems beneficial.
|
It's quite difficult to review changes when the rendered form linked above has changes from multiple PRs as well as apparently changes which are not yet submitted as PRs. Is there an easier way to view the rendered form containing just the changes for a specific PR? |
|
It's quite difficult to review changes when the rendered form linked above
has changes from multiple PRs as well as apparently changes which are not
yet submitted as PRs. Is there an easier way to view the rendered form
containing just the changes for a specific PR?
@sunfishcode, acknowledged. Any suggestion where we could upload builds of
multiple PRs without too much hassle? GH pages is a PITA when it comes to
these things.
Right now the version on GH.io should be the accumulation of this and the
other PR. What other changes are you seeing?
|
|
The mathmacro plugin had serious issues. In particular, it created gigantic regexps and overall regexp compile times were quadratic in the number of macros -- builds became superslow. So I reimplemented a similar thing, but much faster and with more functionality. In particular, it supports macros with params and allows using previously defined macros in expansions. The only thing I couldn't get to work was allowing underscores in def names, because that is in inherent conflict with the special role of underscores in TeX (TeX's own \def mechanism substitutes token sequences instead of just individual names, so can handle it, but that cannot be simulated with simple search+replace). The new thing also supports a somewhat cleaner handling of crossrefs from math, which now work in the PDF as well. Still not perfect but hopefully good enough. |
4284e11 to
e92b0a9
Compare
|
Not reading the code, but just having spent some time reading over the rendered document, overall this is looking really good. I don't know if this is the right PR for it, but some basic UX-level questions/requests:
|
|
Ooh, I also just noticed and like that "Index of Instructions" at the end, that should be quite useful. |
It seems like what Luke's referring to is MathJax running asynchronously |
|
- when I'm on some page in the rhs content pane, could the associated
lhs TOC item be bolded?
I like that suggestion, but the theme does not seem to offer that option.
If I find time (later) I might try to investigate if it can be hacked
somehow.
- is there any way, at some point in time, to optimize loading so
there isn't the visible restyling delay?
As @distransient mentions, that is MathJax in operation, which is dog
slow. The only other option would be KaTeX. Unfortunately, I already tried
that, and while being much faster it had showstopper render bugs (like not
handling super/subscripts correctly, or putting the bar of `\frac` an
entire cm too high, striking out the premise instead). Maybe once that has
matured more we will be able to switch.
- at some point, would each concept/phase in the Overview page should
get an anchor and get linked to pervasively?
Yes, that's the idea. There are various dangling xrefs right now as well.
Once all material is there I intend to make a pass to tie up all that.
|
|
Righto, makes sense, lgtm! |
|
Okay, merging based on LGTM above, some offline discussion, and (I believe) no outstanding comments. Expecting further iterations anyways. |
[js-api] Fix type check for externrefs
Some progress on the textual spec:
You can preview the result at http://webassembly.github.io/spec/