Skip to content

Add wasmtime-hostmodule DSL crate #175

Closed
eira-fransham wants to merge 7 commits intobytecodealliance:mainfrom
eira-fransham:wasmtime-hostmodule
Closed

Add wasmtime-hostmodule DSL crate #175
eira-fransham wants to merge 7 commits intobytecodealliance:mainfrom
eira-fransham:wasmtime-hostmodule

Conversation

@eira-fransham
Copy link
Contributor

@eira-fransham eira-fransham commented Jun 19, 2019

This adds a crate defining a DSL for creating host modules. I debated whether to include this in wasmtime proper or whether to define an external crate, but I could definitely imagine that most people using wasmtime will want something like this in order to define host functions etc. I haven't added any code that reexports this from the wasmtime root crate but I could imagine that that would be helpful (plus, it would mean we can remove the workspace glob that I added in 1a59141).

The DSL looks like so:

use wasmtime_hostmodule::{
    exports,
    BindArgType,
    Func,
    Global,
    Instantiate,
    Memory,
    Table,
    TableElementType
};

fn hello_world() {
    println!("Hello, world!");
}

fn print_and_return(val: i32) -> i32 {
    println!("{}", val);
    val
}

let mut counter = 100;
let my_closure = move |inc: u32| -> u32 {
    counter += inc;
    counter
};

let my_module = exports! {
    do_thing: Func(hello_world.bind::<()>()),
    print_and_return: Func(print_and_return.bind::<(i32,)>()),
    counting_func: Func(my_closure.bind::<(u32,)>()),
    my_glob: Global(100u64, Default::default()),
    memory: Memory {
        minimum: 1,
        maximum: Some(2),
        shared: false,
    },
    table: Table {
        ty: TableElementType::Func,
        minimum: 10,
        maximum: Some(20),
    },
};

my_module.instantiate().unwrap();

Run cargo doc from the crate root for more info. This isn't a proc macro, the macro does almost nothing at all. All the magic is done in the type system, as God intended.

@sunfishcode
Copy link
Member

cc @alexcrichton

@alexcrichton
Copy link
Member

This looks pretty cool! I like how the macro primarily relies on traits and I also like how the macro itself is pretty small. I'm not super familiar with wasmtime so it seems a little odd there's no parts of the macro where you actually provide a module, but I'm gonna assume that's working as intended.

I also think it's a really cool idea to get all this hooked up with safe code and less implicit casting. That way it's easier to get things right and avoid things like off-by-one errors.

I found it pretty hard to wrap my head around the type system here unfortunately. I was thinking it might be good to try to simplify the various hierarchies of traits and how it all interacts. For example it's somewhat unfortunate to call .bind::<...> on functions and the usage of hlist in the internals wasn't exactly something I could wrap my head around. Is it possible to structure the traits to avoid these needs?

I was thinking types like Value aren't too too useful since we've already got from_bits and into_bits in Rust so we don't need and extra type to have the raw bits of a floating point value.

Stylistically BTW there's no need to put leading underscores in front of type variables, no need to have accidentally C++-isms leak in!

@eira-fransham
Copy link
Contributor Author

eira-fransham commented Jun 21, 2019

@alexcrichton Thanks for the feedback! So some of the unsafe is unavoidable - VMContext::host_state is an unsafe function (but if you call it from within a host function it's always safe AIUI, maybe @sunfishcode can clarify there). I'd love if it was made safe somehow but I don't understand the specifics. The use of unsafe to synthesise ZSTs is to avoid the problem of having to do the cost of downcasting every time for the common case that you're using fn functions and not closures, but it's definitely possible that it just costs the price of a call+cmp+jne+ret and so isn't worth the unsafety. I should note that even removing the unsafe keyword from this module entirely doesn't make it safe - there is unsafe code later down the line which relies on the correctness of this code. You could just provide std::ptr::null() for all the function pointers and it would cause a segfault. Yes, that's unsound and unsafe shouldn't rely on safe code, but that's the way it is right now and it's out of the scope of this PR.

You don't need to call .bind if you're using function pointers, it's only if you're using concrete types that just implement FnMut. Unfortunately, this means it also applies to regular fn functions, which are not function pointers but merely unique unnameable types that coerce to function pointers, just like closures. There's no way around this AFAIK, you have to specify the argument types somehow.

Value isn't required just to specify the bits of a float, it's also so you can choose different types for the global at runtime. I don't know why you'd want to do that, but there's not reason to disallow it. to_bits and from_bits isn't enough, as the bits of floats can change even if you don't access them. You need to allow people to provide a specific bit pattern as an integer or it would be impossible to be certain that you're setting the right value. I allow you to pass a float, though, since it's much more convenient to do so and if you're not trying to set the global to NaN then it's not a problem.

The leading underscores within the macro bodies is to avoid them clashing with parameter names passed in from the outside - it's deliberately a violation of style guidelines, exploiting the fact that the calling code will follow the styleguide in order to avoid name clashes. Unfortunately Rust has no hygiene on type parameters, you can try removing the underscores to see what I mean.

The HList isn't strictly necessary, but an implementation that doesn't use it would have to do all the work in the macro, it would probably work by creating a struct and implementing some CreateExports trait on it within the macro. This means that the moment you have something that the macro can't do you're basically back to square zero. With the HList method you can be more flexible. A possible extension is allowing users to define a trait that implements a single export type, instead of implementing a trait for Cons<ExportDef<Func<...>>, ...> we'd just implement a supplementary public-facing trait for Func and have a generic impl for Cons<ExportDef<impl SomeTrait>, ...>.

Finally, can I ask what you mean about implicit casting? I tried to make it as unmagical as possible as far as that goes, I'm not sure I know what you mean. I'd definitely be happy to fix that though.

@alexcrichton
Copy link
Member

Oh sorry I meant to say it's great to be removing unsafe code from callers by having some sort of abstraction to call through, I'd naturally assume some unsafe in the implementation here of course would be required.

You don't need to call .bind if you're using function pointers, it's only if you're using concrete types that just implement FnMut

This to me seems like a bit of an odd distinction? Couldn't it just be for for all types that implement FnMut or such? (that should include function pointers)

Value isn't required just to specify the bits of a float, it's also so you can choose different types for the global at runtime.

I was thinking about this recently as well, and I think that we could take this even further and remove both Value and the macro itself? The macro is already requiring a static structure of the exports but I'd sort of expect a builder where you'd do something like:

let mut builder = ...;
builder.add_global_i32("foo", "bar", 3);
builder.add_func("foo", "baz", |x: i32| x + 2);
let imports = builder.finish();

(or something like that)

That way you'd pick at runtime which method to call and you'd also have more flexibility over the structure of the imports/exports since it doesn't have to be statically determined in the code.

as the bits of floats can change even if you don't access them

Does that apply to Rust? Do you have an example in Rust where bits change in meaningful ways that matter?

The leading underscores within the macro bodies is to avoid them clashing with parameter names passed in from the outside

I understand what you mean here but you're being "hygienic" with one precise internal callsite. That doesn't seem wortwhile to sprinkle unnecessary __. It also doesn't really seem necessary to support closures of up to 16 arguments IMO. The raw APIs can always be used as well.

With the HList method you can be more flexible

I guess I wouldn't disagree with that, but I would at least personally have no way to make heads or tails of this implementation. It seems like it should be more straightforward than it probably currently is and shouldn't need hlist, but I haven't tried to implement it myself.

Finally, can I ask what you mean about implicit casting?

Sorry I may have mistated, but I was basically saying the macro is a good thing.

winapi = "0.3"

[workspace]
members = ["./wasmtime-*"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this line for? Is it intentional to exclude lightbeam?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wasmtime doesn't import wasmtime-hostmodule, so we have to do this. I would be in favour of making the wasmtime root crate reexport hostmodule but I wanted to get some feedback first. I actually mention this in the first comment, although I wasn't clear enough about it:

I haven't added any code that reexports this from the wasmtime root crate but I could imagine that that would be helpful (plus, it would mean we can remove the workspace glob that I added in 1a59141).

@eira-fransham
Copy link
Contributor Author

eira-fransham commented Jun 26, 2019

@alexcrichton

This to me seems like a bit of an odd distinction? Couldn't it just be for for all types that implement FnMut or such? (that should include function pointers)

Yes, you can use bind on function pointers since they use FnMut, but you don't have to. For closures and regular functions you have to either call bind or cast to a function pointer. This isn't avoidable, unfortunately, any generic Rust code that needs to operate on function arguments and wants to support functions with different argument lists must do something like this.

I was thinking about this recently as well, and I think that we could take this even further and remove both Value and the macro itself?

This is fair but would make closures a lot slower and reduces the amount done at compile-time, as well as being much less convenient to use. I basically want to make this as zero-overhead as possible. My wager is that the structure of the host module (for example, the names of the exports) is very unlikely to need to change at runtime, but the values are more likely to change. I support allowing different types for globals basically only because it's trivial to do so, but I would expect it to be unlikely to be necessary in practice.

Does that apply to Rust? Do you have an example in Rust where bits change in meaningful ways that matter?

Yes, I do: http://troubles.md/posts/i-triple-equals/. I linked it in the original comment but it was definitely too easy to miss, sorry about that. That post only applies to x86 32-bit but there's nothing stopping other architectures from doing the same so you have to at least allow people to specify floats bitwise without going through the float type.

I understand what you mean here but you're being "hygienic" with one precise internal callsite. That doesn't seem wortwhile to sprinkle unnecessary __. It also doesn't really seem necessary to support closures of up to 16 arguments IMO. The raw APIs can always be used as well.

You're right, I could just use Z or whatever. It's just a habit/pattern that I'm used to using when writing macros. I can change it if you like but since it's useful for external macros I don't see any reason not to be consistent with internal macros.

I guess I wouldn't disagree with that, but I would at least personally have no way to make heads or tails of this implementation. It seems like it should be more straightforward than it probably currently is and shouldn't need hlist, but I haven't tried to implement it myself.

Yeah you don't need hlist, but if you don't use it (or a hlist-like structure like nested tuples) then you can't support one or more of: combining export lists, arbitrary-size export lists, non-rust-identifier field names (less of a problem now we have raw identifiers). Plus IMO it would be equally difficult to understand no matter the implementation, although that's more subjective. Even the runtime-based builder-pattern style will still have some highly generic code in order to allow the function parameter type-to-signature magic (you wouldn't be able to get rid of the bind stuff either).

@eira-fransham
Copy link
Contributor Author

@alexcrichton @sunfishcode Could we get some movement on this?

@alexcrichton
Copy link
Member

Note that I can't actually merge here, I was just trying to help out with review. This is basically too far over my head that I can't really review it. It looks like this isn't too tied to wasmtime itself so it may be best as an external crate you maintain?

@sunfishcode
Copy link
Member

@Vurich What would you think about landing this without the wasmtime-wast parts for now? Then folks can go ahead and build on it, and we can see how it works out in practice.

@eira-fransham
Copy link
Contributor Author

Sure, but I feel like the likelihood of it being used is slim if it's not part of wasmtime proper. Not because it's not useful, but because people will say "oh, how do I add a host module" and they'll just look into wasmtime's source code for the answer and find the wasmtime-wast code that does it manually. Landing it separately is a decent idea, but I don't think that we'll see many people building on it if it's a separate crate. I don't even know if people would think to search for it.

@sunfishcode
Copy link
Member

I'm looking for a way forward here. I'm not comfortable having a system which, in its current form, and with my current experience with Rust, is beyond my understanding, as the main connection point between modules. At the same time, it's clear this provides valuable functionality. I'd like to do something. I myself will need more time to learn more about parts of Rust that I haven't previously encountered, and also to play with the code and break stuff to see how it breaks when I change it, and so on.

Another option is, in addition to removing the wasmtime-wasi changes, we could re-add the "spectest" module binding to a different place in the tree, and leave the current "spectest" module binding in the tree, to serve as a usage example, and also to allow us to easily compare various aspects of the manual approach with the wasmtime-hostmodule approach.

I'm open to suggestions for other ways to proceed too.

@eira-fransham
Copy link
Contributor Author

So I've been giving an implementation that doesn't use any type-level trickery a go, and here are the problems I'm finding:

  • Although a type-level implementation can generate trampolines to pass into wasmtime-jit (since it runs inside the compiler), a runtime implementation cannot without calling out to an in-process compiler such as Cranelift, which seems overly heavyweight. This means that every function used in a host module must have a first parameter that is VmCtx and so user code must use unsafe to access host data. With the type-level system we don't need unsafe, we can use Rust's closures to ensure safety.
  • Since we can't generate trampolines, we have to pass function pointers directly from user code to wasmtime-jit in a form that wasmtime-jit expects. Any implementation has to have some kind of bind-style annotation around functions, either a .bind call like my current implementation or an explicit as fn(_, _) -> _ (maybe with more underscores inside the ()) to cast to a function pointer.
  • We can't even have some slightly-less-type-level system that just removes HList - if we want to generate trampolines without using a runtime compiler we must generate a HostData struct and refer to it inside the trampoline. This means that generating the struct must be done at compile-time.

We can avoid the second by making the function pointers that wasmtime-jit expects monomorphic, but that would make calls slower and it would make the first problem even worse, since now every function would have to be a fn(VmCtx, &[WasmValue]), which is even less ergonomic. Plus, a non-type-level system will never be able to support any functions that aren't function pointers (so both fn-style functions and closures without casting) without generating trampolines via Cranelift. The final issue means that if we want to have an API anything like what we have in the current form of the library, the only reasonable way to do so is to have a HList-based implementation. Otherwise we lose control over the form of user functions and the benefit of the library is reduced to just function signature detection and slightly less typing.

@eira-fransham
Copy link
Contributor Author

Also, if you need a good walkthrough of how type-level computation works in Rust (especially W.R.T. HLists) here's an article about it: https://beachape.com/blog/2017/03/12/gentle-intro-to-type-level-recursion-in-Rust-from-zero-to-frunk-hlist-sculpting/

kubkon pushed a commit that referenced this pull request Nov 7, 2019
* fix Linux `isatty` implementation

* defer `WasiCtxBuilder` errors to `build()`; don't change API yet

This changes the fields on the builder to types that let the various `.arg()`, `.env()`, etc methods
infallible, so we don't have to worry about handling any errors till we actually build. This reduces
line noise when using a builder in a downstream application.

Deferring the processing of the builder fields also has the advantage of eliminating the opening and
closing of `/dev/null` for the default stdio file descriptors unless they're actually used by the
resulting `WasiCtx`.

Unicode errors when inheriting arguments and environment variables no longer cause a panic, but
instead go through `OsString`. We return `ENOTCAPABLE` at the end if there are NULs, or if UTF-8
conversion fails on Windows.

This also changes the bounds on some of the methods from `AsRef<str>` to `AsRef<[u8]>`. This
shouldn't break any existing code, but allows more flexibility when providing arguments. Depending
on the outcome of https://github.com/WebAssembly/WASI/issues/8 we may eventually want to require
these bytes be UTF-8, so we might want to revisit this later.

Finally, this fixes a tiny bug that could arise if we had exactly the maximum number of file
descriptors when populating the preopens.

* make `WasiCtxBuilder` method types less restrictive

This is a separate commit, since it changes the interface that downstream clients have to use, and
therefore requires a different commit of `wasmtime` for testing. That `wasmtime` commit is currently
on my private fork, so this will need to be amended before merging.

Now that failures are deferred until `WasiCtxBuilder::build()`, we don't need to have `Result` types
on the other methods any longer.

Additionally, using `IntoIterator` rather than `Iterator` as the trait bound for these methods is
slightly more general, and saves the client some typing.

* enforce that arguments and environment variables are valid UTF-8

* remove now-unnecessary platform-specific OsString handling

* `ENOTCAPABLE` -> `EILSEQ` for failed arg/env string conversions

* fix up comment style

* Apply @acfoltzer's fix to isatty on Linux to BSD
@alexcrichton alexcrichton changed the base branch from master to main June 25, 2020 18:45
@bjorn3
Copy link
Contributor

bjorn3 commented Jun 12, 2021

Does this PR still make sense now that we have wasmtime-witx?

@sunfishcode
Copy link
Member

Indeed, Wasmtime's API and bindings support has changed a lot.

dhil added a commit to dhil/wasmtime that referenced this pull request May 20, 2024
avanhatt pushed a commit to wellesley-prog-sys/wasmtime that referenced this pull request Oct 23, 2024
Implement conditional branch joining logic for if/else branches that
assign to distinct targets.

Until now all conditionals in ASLp programs have assigned to the same
targets on both branches. The merging logic or "phi nodes" only
implemented the logic for this simpler case. The `VecMisc` instructions
required for `popcnt` verification include `if` statements without
`else`, and therefore fail in this step. This PR refactors the code to
handle the distinct case.

The change is a no-op on current specs.

Updates #35 #34
avanhatt pushed a commit to wellesley-prog-sys/wasmtime that referenced this pull request Oct 23, 2024
Implement conditional branch joining logic for if/else branches that
assign to distinct targets.

Until now all conditionals in ASLp programs have assigned to the same
targets on both branches. The merging logic or "phi nodes" only
implemented the logic for this simpler case. The `VecMisc` instructions
required for `popcnt` verification include `if` statements without
`else`, and therefore fail in this step. This PR refactors the code to
handle the distinct case.

The change is a no-op on current specs.

Updates #35 #34
dicej pushed a commit to dicej/wasmtime that referenced this pull request Jun 3, 2025
…http

Migrate http tests to `wasmtime-wasi-http`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants