Update discussion of indirect calls and function pointers #392
AstSemantics.md
I think "used" might be read as saying that call_indirect will be relative to the module, which isn't true. What I think you're getting at is that the as-yet-unspecified address-of expressions, which refer to the dylib's static local indirect function table, will be automatically adjusted so that they refer to the index of that function in the dynamic instance indirect function table at runtime. If that explanation grows unwieldy, perhaps you want to move it to a new section in DynamicLinking.md and link to that section from here?
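To make the adjustment described in this comment concrete, here is a minimal Python model. It is a sketch under stated assumptions, not anything specified in the PR: it assumes the instance table is formed by concatenating each module's local indirect table, and the names `Instance`, `link`, and `address_of` are hypothetical.

```python
# Hypothetical model (not from the PR): the dynamic instance indirect function
# table is assumed to be the concatenation of each module's local table.

class Instance:
    def __init__(self):
        self.table = []   # dynamic instance indirect function table
        self.bases = {}   # module name -> base offset inside self.table

    def link(self, name, local_table):
        """Append a module's local indirect table and record its base offset."""
        self.bases[name] = len(self.table)
        self.table.extend(local_table)

    def address_of(self, name, local_index):
        """Adjust a module-local table index to an instance-wide index."""
        return self.bases[name] + local_index

    def call_indirect(self, index, *args):
        """The index is instance-wide; it is not relative to any module."""
        if not 0 <= index < len(self.table):
            raise IndexError("indirect table index out of bounds")
        return self.table[index](*args)


inst = Instance()
inst.link("main", [lambda x: x + 1])
inst.link("dylib", [lambda x: x * 2, lambda x: x - 3])

ptr = inst.address_of("dylib", 1)      # local index 1 in the dylib's table
result = inst.call_indirect(ptr, 10)   # calls the dylib's second entry
```

In this model only `address_of` knows about module-relative indices; `call_indirect` never does, which is the distinction the comment is drawing.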
I've updated this section to be a bit more clear.
Other than small nit, lgtm!
Is the main function table local to each module?
Why has the indirect function table been introduced?
Are the elements of the indirect function table observable to the wasm code? Can it be accessed to load the index?
No provision has been made for homogeneous function tables. These seem important for code that wants to avoid a runtime signature check.
On Wed, Oct 7, 2015 at 8:55 AM, JSStats notifications@github.com wrote:
In AstSemantics.md
#392 (comment):
- * addressof: obtain a function pointer value for a given function
-Function-pointer values are comparable for equality and the addressof operator
-is monomorphic. Function-pointer values can be explicitly coerced to and from
-integers (which, in particular, is necessary when loading/storing to memory
-since memory only provides integer types). For security and safety reasons,
-the integer value of a coerced function-pointer value is an abstract index and
-does not reveal the actual machine code address of the target function.
-In the MVP, function pointer values are local to a single module. The
-dynamic linking feature is necessary for
-two modules to pass function pointers back and forth.
+
+Functions from the main function table are made addressable by defining an
+indirect function table that consists of a sequence of indices
+into the module's main function table. A function from the main table may appear more than
> Is the main function table local to each module?
Yes, each module will have its own (local) main function table that
declares the functions in that module.
> Why has the indirect function table been introduced?
The indirect function table allows the module to declare which functions
are addressable and arrange them into a table. Thus the assignment of
integers to function pointers is under the control of the module. Since
the indirect function table allows functions to appear more than once, it
allows, e.g. a compiler to map vtables into the one big indirect table. So
a C++ table dispatch can be as simple as:
call_indirect(i32_add(i32_load(obj, #0), #meth_num), ... args ...)
So C++ objects store the "vtable base" in the object header, add a
method-specific offset, and then call that.
This has the advantage that vtables can be placed outside the linear memory
for safety (and performance--one less memory access).
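The vtable mapping described above can be modeled in a few lines of Python. This is an illustrative sketch only: linear memory is modeled as a list of i32 words, signatures are plain strings, and every name here (`main_table`, `indirect_table`, `dispatch`, the example methods) is hypothetical rather than taken from the design documents.

```python
# Illustrative model; all names and the memory layout are assumptions.
memory = [0] * 16  # linear memory modeled as i32 words

def circle_area(obj): return 3 * memory[obj + 1] ** 2  # crude integer area
def square_area(obj): return memory[obj + 1] ** 2
def describe(obj): return 100 + memory[obj + 1]

SIG = "(i32)->i32"
# Main function table: every function the module declares.
main_table = [(SIG, circle_area), (SIG, square_area), (SIG, describe)]

# Indirect function table: indices into main_table. The two vtables are laid
# out back to back, and describe (main index 2) appears twice.
indirect_table = [0, 2,   # Circle vtable, base 0: area, describe
                  1, 2]   # Square vtable, base 2: area, describe
CIRCLE_VTABLE, SQUARE_VTABLE = 0, 2
METH_AREA, METH_DESCRIBE = 0, 1

def call_indirect(expected_sig, index, *args):
    """Bounds check, then runtime signature check, then the call."""
    if not 0 <= index < len(indirect_table):
        raise IndexError("indirect table index out of bounds")
    sig, fn = main_table[indirect_table[index]]
    if sig != expected_sig:
        raise TypeError("signature mismatch")
    return fn(*args)

def new_object(addr, vtable_base, field):
    memory[addr] = vtable_base   # object header holds the vtable base
    memory[addr + 1] = field
    return addr

def dispatch(obj, meth_num):
    # call_indirect(i32_add(i32_load(obj, #0), #meth_num), ... args ...)
    return call_indirect(SIG, memory[obj] + meth_num, obj)

circle = new_object(0, CIRCLE_VTABLE, 2)
square = new_object(2, SQUARE_VTABLE, 5)
```

Only the vtable base lives in linear memory here; the table contents themselves are held outside `memory`, which mirrors the safety point made in the reply.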
> Are the elements of the indirect function table observable to the wasm
> code? Can it be accessed to load the index?
No, wasm code cannot directly read the indirect function table.
> No provision has been made for homogeneous function tables. These seem
> important for code that wants to avoid a runtime signature check.
That's an optimization that we can consider adding in the future, e.g. by
denoting an expected range within the single table where all the functions
have the same signature. AFAICT, that trades one branch (signature check)
for a subtract before the bounds check. Seems like it could be worthwhile,
but let's add that when we have data.
Reply to this email directly or view it on GitHub
https://github.com/WebAssembly/design/pull/392/files#r41356981.
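The trade-off mentioned in the reply (a signature check versus a subtract before the bounds check) can be sketched as follows. This is a hypothetical Python model, not a proposed design: `RANGE_START` and `RANGE_LEN` stand in for whatever mechanism would declare a same-signature range, and both call helpers are invented for illustration.

```python
# Hypothetical sketch: one table, with a declared range of entries that all
# share a signature. Names and layout are assumptions for illustration.

def neg(a): return -a
def add(a, b): return a + b
def mul(a, b): return a * b

SIG_UNARY, SIG_BINARY = "(i32)->i32", "(i32,i32)->i32"
table = [(SIG_UNARY, neg), (SIG_BINARY, add), (SIG_BINARY, mul)]

# Entries 1..2 are declared to have the same signature (SIG_BINARY).
RANGE_START, RANGE_LEN = 1, 2

def call_indirect_checked(expected_sig, index, *args):
    """General case: bounds check plus a signature comparison branch."""
    if not 0 <= index < len(table):
        raise IndexError("index out of bounds")
    sig, fn = table[index]
    if sig != expected_sig:
        raise TypeError("signature mismatch")
    return fn(*args)

def call_indirect_ranged(index, *args):
    """Range case: subtract the range start, then bounds check the offset.
    No signature comparison is needed because the range is homogeneous."""
    offset = index - RANGE_START
    if not 0 <= offset < RANGE_LEN:
        raise IndexError("index outside the homogeneous range")
    return table[index][1](*args)
```

An index outside the declared range fails the bounds check in `call_indirect_ranged`, so a function with a different signature can never be reached through it.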
lgtm
Should this be i64 for wasm64? I don't have a specific use case that needs more than 4 billion indirect function table entries, but since functions can appear multiple times in the table, one could imagine possibilities.
On the other hand, having the a priori i32 limit would allow the C++ compiler to simply make func-ptrs 32-bit. (Is that easy to do in LLVM, or is that baked into the LLP model?)
On Wed, Oct 7, 2015 at 3:06 PM, Dan Gohman notifications@github.com wrote:
In AstSemantics.md
#392 (comment):
call_import: call imported function directly
-Indirect calls may be made to a value of function-pointer type. A
-function-pointer value may be obtained for a given function as specified by its index
-in the function table.
+Indirect calls allow calling target functions that are unknown at compile time.
+The target function is an expression of local type i32 and is always the first
> Should this be i64 for wasm64? I don't have a specific use case that
> needs more than 4 billion indirect function table entries, but since
> functions can appear multiple times in the table, one could imagine
> possibilities.
The tables would have to be sparse somehow; otherwise, declaring 4 billion
indirect table entries is going to be a pretty big wasm module :-)
Reply to this email directly or view it on GitHub
https://github.com/WebAssembly/design/pull/392/files#r41386881.
That's a nice way to put it. That addresses my only nit.
lgtm; we can discuss whether
@titzer Thank you for the explanation - some of those points would be useful to have in the patch text too, such as making it clear when tables are local to modules etc., and if tables are dense or sparse arrays.
Perhaps the module signature table could reserve index 0 to represent an undefined signature, then `addressOf(#, ) could return an index either from a heterogeneous table (for a signature index of zero) or from a homogeneous function table (for a signature index greater than zero). The runtime would maintain separate concatenated indirect tables for both the instance heterogeneous indirect function array and the homogeneous indirect function arrays. A module might only use one or a few homogeneous indirect function tables, so the burden would not be too high, and would be proportional to the need. A separate operation to make an indirect homogeneous function call would be fine, and the signature index would then refer to the homogeneous function table to use.
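One way to read this proposal is sketched below in Python. It is an interpretation, not the commenter's design: the class `FunctionTables`, its method names, and the choice to append a new entry on every `address_of` call are all assumptions made to keep the example short.

```python
# Interpretive sketch of the proposal; all names here are hypothetical.
UNDEFINED_SIG = 0              # signature index 0 is reserved
SIG_UNARY, SIG_BINARY = 1, 2   # example signature-table indices

def neg(a): return -a
def add(a, b): return a + b
def mul(a, b): return a * b

class FunctionTables:
    def __init__(self):
        self.hetero = []  # heterogeneous table: (actual signature, function)
        self.homo = {}    # signature index (> 0) -> homogeneous table

    def address_of(self, table_sig, actual_sig, fn):
        """Return an index into the table selected by table_sig."""
        if table_sig == UNDEFINED_SIG:
            self.hetero.append((actual_sig, fn))
            return len(self.hetero) - 1
        if table_sig != actual_sig:
            raise TypeError("function does not fit this homogeneous table")
        funcs = self.homo.setdefault(table_sig, [])
        funcs.append(fn)
        return len(funcs) - 1

    def call_indirect(self, expected_sig, index, *args):
        """Heterogeneous call: bounds check plus runtime signature check."""
        if not 0 <= index < len(self.hetero):
            raise IndexError("index out of bounds")
        actual_sig, fn = self.hetero[index]
        if actual_sig != expected_sig:
            raise TypeError("signature mismatch")
        return fn(*args)

    def call_indirect_homogeneous(self, sig_index, index, *args):
        """Homogeneous call: the signature index selects the table, so only
        a bounds check is needed."""
        funcs = self.homo[sig_index]
        if not 0 <= index < len(funcs):
            raise IndexError("index out of bounds")
        return funcs[index](*args)

tables = FunctionTables()
h = tables.address_of(UNDEFINED_SIG, SIG_UNARY, neg)
a = tables.address_of(SIG_BINARY, SIG_BINARY, add)
m = tables.address_of(SIG_BINARY, SIG_BINARY, mul)
```

Note that `h` and `a` can hold the same integer while naming different functions, because they index different tables; that is the function-pointer equality concern raised later in the thread.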
On Wed, Oct 7, 2015 at 10:28 PM, JSStats notifications@github.com wrote:
There are other implementation techniques that we've discussed, such as
@titzer Wasm has a fundamental design property that the signature must match between the caller and callee. This leads to more efficient indirect function calls, and it requires homogeneous function tables. The use case of C code comparing function pointers for equality should not compromise the efficiency of the wasm design just to meet that one use case. That use case alone is what results in the need for runtime signature checking, so it needs to be a special case and not the general case. Concatenation semantics are only a problem for homogeneous ranges in a heterogeneous table, not for dense homogeneous tables, and this is entirely a product of a C use case. Sorry, these tables need to be dense arrays. If these need to be caches etc. it is DOA. If C code really needs to distinguish functions based on their signature, then the onus is on C code to store both the signature and the homogeneous table index and compare both, to explicitly check the signature if it needs to, or to propose some other scheme in addition to the core wasm support.
Merging based on lgtm's above |
Update discussion of indirect calls and function pointers
We can revisit the homogeneous signature case when we are closer to a full end-to-end system from C++ -> LLVM -> wasm. A long-term goal is to have strongly-typed function pointers as part of wasm as well, which would not be integers.