[spec] Allow impls to limit code point range #488
jfbastien
left a comment
Fine with me, @sunfishcode approved in a prior discussion (and @lukewagner agreed later), but as I mentioned here I want to make sure this is discussed independently.
Specifically, I'd like to get feedback from @annevk, @domenic, and @tabatkins.
|
As long as it's a subset (such as ASCII) and not a different charset entirely (like Shift-JIS), yeah, no problem here. |
|
I don't really understand the spec text for this restriction, but it seems like other people do, so maybe it's fine. Reading the other threads, it seems like the actual interpretation is that implementations are free to reject incoming source bytes if certain bits in those bytes are set to certain values? E.g. an implementation is free to reject incoming source bytes if the most-significant-bit is set, effectively only allowing ASCII names? It would be a lot clearer to me if things were stated that way, but as I said, it seems like others aren't having this comprehension problem, so maybe it's fine. |
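A minimal sketch of the byte-level reading described in this comment, assuming names arrive as UTF-8 bytes. The function names are hypothetical and appear in neither the spec nor any implementation; they only illustrate that rejecting any byte with the most significant bit set amounts to accepting ASCII-only names.

```python
def has_high_bit(raw_name: bytes) -> bool:
    """Return True if any byte has its most significant bit set."""
    return any(b & 0x80 for b in raw_name)


def accept_ascii_only(raw_name: bytes) -> bool:
    """Hypothetical implementation-side check: accept only 7-bit bytes.

    In UTF-8, every code point above U+007F is encoded using bytes with
    the high bit set, so this is the same as limiting names to ASCII.
    """
    return not has_high_bit(raw_name)


if __name__ == "__main__":
    print(accept_ascii_only("memory".encode("utf-8")))   # True
    print(accept_ascii_only("mémoire".encode("utf-8")))  # False
```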
|
@tabatkins @domenic The terms "code point" and "common subsets" are clear enough to me that we're still talking about Unicode values, not binary bytes/bits. It might be helpful to be more explicit about this distinction, though. |
|
How are we talking about Unicode values? Isn't this spec discussing implementation-specific limitations on the inputs, which are definitely bytes? |
|
@domenic I had to look at the whole file; the section headings, which add more context to the changes, are not shown in the GitHub "Files changed" view. |
|
Right, I guess I don't understand what the first change applies to, i.e. the "Syntactic Limits" heading. |
|
@domenic, it's described in terms of the abstract syntax, which defines names as sequences of Unicode code points. That makes it independent of the concrete input format (e.g. binary or text format). |
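To illustrate the point about the abstract syntax, here is a rough sketch in which a limit is checked on code points after decoding, so the same check applies whichever concrete format the name came from. All names below, including `ASCII_LIMIT`, are hypothetical, and text-format string escapes are deliberately ignored; this is an assumption-laden illustration, not how any particular engine does it.

```python
from __future__ import annotations

ASCII_LIMIT = 0x7F  # hypothetical implementation-chosen upper bound


def code_points_from_binary(raw: bytes) -> list[int]:
    """Decode a UTF-8 encoded name into its Unicode code points."""
    return [ord(ch) for ch in raw.decode("utf-8")]


def code_points_from_text(literal: str) -> list[int]:
    """Treat an already-parsed text-format name as code points.

    Escape handling is omitted here; the input is assumed to be the
    final string value.
    """
    return [ord(ch) for ch in literal]


def within_limit(code_points: list[int], limit: int = ASCII_LIMIT) -> bool:
    """Check the abstract name, independent of how it was encoded."""
    return all(cp <= limit for cp in code_points)


if __name__ == "__main__":
    from_binary = code_points_from_binary("f\u00e9".encode("utf-8"))
    from_text = code_points_from_text("f\u00e9")
    print(from_binary == from_text)   # True
    print(within_limit(from_binary))  # False
```

Because the check operates on the decoded sequence, an implementation could choose a different bound without referring to bytes or bits at all.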
|
Seems like there is approval and no objections, so I'll merge. |
Includes [spec] Allow impls to limit code point range (#488).
[test] Unify the error message of `"null structure reference"`.
As discussed on WebAssembly/design#1016, make it legal for implementations in environments that do not understand (all of) Unicode to only support smaller character subsets.