Add support to wit-component to polyfill WASI #338

Merged
alexcrichton merged 10 commits into bytecodealliance:main from alexcrichton:preview1
Oct 4, 2022

@alexcrichton
Member

This commit is an addition to the wit-component tool to be able to
polyfill WASI imports today using wasi_snapshot_preview1 with a
component-model-using interface in the future. This is a large extension
to the functionality of wit-component internally since the generated
component is much "fancier".

The support in this commit is modeled as the addition of "adapter
modules" into the wit-component tool. An adapter module is understood
to translate from some core-wasm ABI into a component-model-using ABI.
The intention is that for any previous API prior to the component model
an adapter module could be written which would translate from the prior
API to the new API. For example in WASI today there is:

(@interface func (export "random_get")
  (param $buf (@witx pointer u8))
  (param $buf_len $size)
  (result $error (expected (error $errno)))
)

whereas a component-model-using API would look more like:

random-get: func(size: u32) -> list<u8>

This component-model version can be adapted with a module such as:

(module $wasi_snapshot_preview1
  (import "new-wasi" "random_get" (func $new_random_get (param i32 i32)))
  (import "env" "memory" (memory 0))

  (global $last_ptr (mut i32) i32.const 0)

  (func (export "random_get") (param i32 i32) (result i32)
    ;; store buffer pointer in a saved global for `cabi_realloc`
    ;; later
    (global.set $last_ptr (local.get 0))

    ;; 1st argument: the `size: u32`
    local.get 1

    ;; 2nd argument: return pointer for `list<u8>`
    i32.const 8

    call $new_random_get

    ;; return a "success" return code
    i32.const 0
  )

  ;; When the canonical ABI allocates space for the list return value
  ;; return the original buffer pointer to place it directly in the
  ;; target buffer
  (func (export "cabi_realloc") (param i32 i32 i32 i32) (result i32)
    global.get $last_ptr)
)

With this adapter module, the generated component can be structured
internally so that everything is wired up in the right places: when the
original module calls wasi_snapshot_preview1::random_get it actually
calls this shim module, which then calls the actual
new-wasi::random_get import. There are a few details I'm glossing over
here, like the stack used by the shim module, but this suffices to
describe the general shape.
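
The pointer hand-off between `random_get` and `cabi_realloc` can be modeled on the host side to make the control flow easier to follow. This is only a sketch: `ReallocSlot`, `stash`, and `take` are hypothetical names that are not part of wit-component, and returning `None` stands in for the trap a real adapter would raise. Unlike the WAT above, the model clears the saved pointer after one use, mirroring the Rust adapter posted later in this thread.

```rust
use std::cell::Cell;

/// Host-side model of the adapter's saved-pointer global.
/// Hypothetical type for illustration; not part of wit-component.
pub struct ReallocSlot {
    last_ptr: Cell<usize>,
}

impl ReallocSlot {
    pub fn new() -> Self {
        ReallocSlot { last_ptr: Cell::new(0) }
    }

    /// What `random_get` does before calling the import: remember the
    /// caller's buffer so the list result is written directly into it.
    pub fn stash(&self, buf: usize) {
        self.last_ptr.set(buf);
    }

    /// What `cabi_realloc` does: hand out the stashed pointer exactly once.
    /// `None` marks the cases where the real adapter would trap.
    pub fn take(&self, old_ptr: usize, old_size: usize, align: usize) -> Option<usize> {
        if old_ptr != 0 || old_size != 0 {
            return None; // only fresh allocations are supported
        }
        // Read and clear the slot in one step so a second call fails.
        let ptr = self.last_ptr.replace(0);
        if ptr == 0 || align == 0 || ptr % align != 0 {
            return None;
        }
        Some(ptr)
    }
}
```

A single `stash` followed by a single `take` returns the caller's buffer; a second `take` without a new `stash` yields `None`, which is the situation the adapter treats as unreachable.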

My plan in the future is to use this support to generate a component
from all test cases that this repository supports. That means that,
specifically for wit-bindgen tests, a fresh new interface representing
"future WASI" will be created and the WASI functions used by tests will
be adapted via this adapter module. In this manner components will now
be generated for all tests and then the next step is #314, actually
ingesting these components into hosts.

@alexcrichton
Member Author

Some procedural notes:

  • I'm opening this as a draft since I still need to clean up, comment, and add some more tests
  • This is built on #337 ("Make encodings in wit-component harder to get wrong")
  • One major piece of this is that the adapter wasm modules provided to wit-component are automatically "shrunk" to the smallest adapter necessary with the exports used by the module in question. This means, for example, that if a preview1.wasm is supplied which polyfills every single function defined in WASI the final module won't actually import every single WASI function, only those necessary. This "gc" pass is relatively significant and I've done a fair bit of macro-magic to try to make it more robust in the face of new future wasm instructions.
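
The shrinking pass described in the last bullet is, at its core, a reachability computation. The sketch below is not the wit-component implementation: it assumes a hypothetical, pre-extracted call graph keyed by function name, whereas the real pass works on wasm instructions. It only illustrates how starting from the exports the main module uses leaves a smaller set of imports alive.

```rust
use std::collections::{BTreeSet, HashMap};

/// Returns the adapter imports still needed when only `used_exports` are
/// called by the main module.
///
/// `calls` maps each adapter function to the functions it calls and
/// `imports` names the adapter's imported functions. Both are hypothetical
/// inputs for this sketch.
pub fn live_imports(
    calls: &HashMap<&str, Vec<&str>>,
    imports: &BTreeSet<&str>,
    used_exports: &[&str],
) -> BTreeSet<String> {
    let mut seen: BTreeSet<&str> = BTreeSet::new();
    let mut worklist: Vec<&str> = used_exports.to_vec();
    let mut live = BTreeSet::new();

    while let Some(name) = worklist.pop() {
        if !seen.insert(name) {
            continue; // already visited
        }
        if imports.contains(name) {
            // Reached an import: it must stay in the shrunk adapter.
            live.insert(name.to_string());
            continue;
        }
        if let Some(callees) = calls.get(name) {
            worklist.extend(callees.iter().copied());
        }
    }
    live
}
```

If the main module only imports `fd_write` and `random_get`, an adapter function such as `clock_time_get` is never visited, so whatever it imports is dropped from the final component.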

@alexcrichton
Member Author

It's also worth saying that my goal is for preview1.wasm, or whatever adapter modules are supplied, to be compiled with wit-bindgen itself, meaning that they get source-level access to all the bindgen-generated niceties you'd expect from importing an API.

@alexcrichton
Member Author

alexcrichton commented Oct 3, 2022

Ok @pchickey this is ready to go. I've verified that this works locally with a larger integration with:

wasi_snapshot_preview1.wasm

#![allow(unused_variables)]
#![feature(asm_experimental_arch)]

wit_bindgen_guest_rust::import!("../../my-wasi.wit");

std::arch::global_asm!(
    "
    .globaltype cabi_realloc_ptr, i32
    .global cabi_realloc_ptr 
    cabi_realloc_ptr: 
"
);

fn get_cabi_realloc_ptr() -> usize {
    unsafe {
        let ret: usize;
        std::arch::asm!(
            "
                global.get cabi_realloc_ptr
                local.set {}
            ",
            out(local) ret, options(nostack)
        );
        ret
    }
}

fn set_cabi_realloc_ptr(val: usize) {
    unsafe {
        std::arch::asm!(
            "
                local.get {}
                global.set cabi_realloc_ptr
            ",
            in(local) val, options(nostack)
        );
    }
}

#[no_mangle]
pub unsafe extern "C" fn fd_write(
    fd: u32,
    iovecs: *const wasi::Ciovec,
    niovecs: usize,
    result: *mut usize,
) -> wasi::Errno {
    if fd != 1 {
        std::arch::wasm32::unreachable()
    }
    if niovecs == 0 {
        *result = 0;
    } else {
        let iovec = *iovecs;
        let slice = std::slice::from_raw_parts(iovec.buf, iovec.buf_len);
        *result = my_wasi::print(slice) as usize;
    }
    wasi::ERRNO_SUCCESS
}

#[no_mangle]
pub unsafe extern "C" fn environ_sizes_get(nvars: *mut usize, size: *mut usize) -> wasi::Errno {
    *nvars = 0;
    *size = 0;
    wasi::ERRNO_SUCCESS
}

#[no_mangle]
pub unsafe extern "C" fn environ_get(environ: *mut *mut u8, environ_buf: *mut u8) -> wasi::Errno {
    std::arch::wasm32::unreachable()
}

#[no_mangle]
pub unsafe extern "C" fn cabi_realloc(
    old_ptr: *mut u8,
    old_size: usize,
    align: usize,
    new_size: usize,
) -> *mut u8 {
    if !old_ptr.is_null() || old_size != 0 {
        std::arch::wasm32::unreachable()
    }
    let ret = get_cabi_realloc_ptr();
    set_cabi_realloc_ptr(0);
    if ret == 0 {
        std::arch::wasm32::unreachable()
    }
    if align == 0 {
        std::arch::wasm32::unreachable()
    }
    if ret % align != 0 {
        std::arch::wasm32::unreachable()
    }
    ret as *mut u8
}

#[no_mangle]
pub unsafe extern "C" fn proc_exit(code: u32) {
    std::arch::wasm32::unreachable()
}

#[no_mangle]
pub unsafe extern "C" fn random_get(buf: *mut u8, size: usize) -> wasi::Errno {
    set_cabi_realloc_ptr(buf as usize);
    let list = my_wasi::random_get(size as u32);
    std::mem::forget(list);
    wasi::ERRNO_SUCCESS
}

guest.wasm

wit_bindgen_guest_rust::import!("../../imports.wit");
wit_bindgen_guest_rust::export!("../../exports.wit");

use rand::Rng;

pub struct Exports;

impl exports::Exports for Exports {
    fn run() {
        imports::my_fancy_host_function();

        println!(
            "guest random number is: {}",
            rand::thread_rng().gen::<u32>()
        );
    }
}

*.wit files

// imports.wit
my-fancy-host-function: func()

// exports.wit
run: func()

// my-wasi.wit
print: func(data: list<u8>) -> u32
random-get: func(size: u32) -> list<u8>

host.rs

use anyhow::Result;
use rand::Rng;
use std::io::Write;
use wasmtime::component::*;
use wasmtime::{Config, Engine, Store, StoreContextMut};

fn main() -> Result<()> {
    // Set up our engine/store for executing the component
    let mut config = Config::new();
    config.wasm_component_model(true);
    let engine = Engine::new(&config)?;
    let mut store = Store::new(&engine, ());

    // Compile the output of our build script
    let component = Component::from_file(&engine, env!("GUEST"))?;

    // Define the host functionality within a `Linker`. Squint a bit and imagine
    // that this is eventually wit-bindgen-generated with a nice trait one day
    // for a better API.
    let mut linker = Linker::new(&engine);

    // This is the `imports.wit` interface that the guest imports -- a custom
    // instance import for just this component
    linker
        .instance("imports")?
        .func_wrap("my-fancy-host-function", || {
            println!("in a custom host function");
            Ok(())
        })?;

    // This is the `my-wasi.wit` interface which is not imported by the guest
    // but is instead imported by `wasi_snapshot_preview1.wasm`
    let mut wasi = linker.instance("my-wasi")?;
    wasi.func_wrap("random-get", |size: u32| {
        let mut data = vec![0u8; size as usize];
        rand::thread_rng().fill(data.as_mut_slice());
        Ok((data,))
    })?;
    wasi.func_wrap(
        "print",
        |mut store: StoreContextMut<'_, ()>, list: WasmList<u8>| {
            let list = list.as_le_slice(&store);
            Ok((std::io::stdout().write(list)? as u32,))
        },
    )?;

    // Instantiate and call the `run` function
    let instance = linker.instantiate(&mut store, &component)?;
    let func = instance.get_typed_func::<(), (), _>(&mut store, "run")?;
    func.call(&mut store, ())?;
    Ok(())
}

There's a gnarly build.rs that weaves this all together for now but the gist of it is:

  • Compile guest.wasm to wasm32-wasi as you normally would.
  • Compile wasi_snapshot_preview1.wasm with the wasm32-unknown-unknown target, release mode, and --import-memory passed to the linker.
  • Invoke wit-component --import ./imports.wit --interface ./exports.wit --adapt ./wasi_snapshot_preview1.wasm:./my-wasi.wit ./guest.wasm -o component.wasm
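
The last step above could be driven from a build script roughly as follows. This is a sketch, not the actual build.rs: the helper names `wit_component_args` and `wit_component_command` are hypothetical, the flags are copied from the invocation quoted above and may differ in other versions of the tool, and the cargo builds from the first two steps are assumed to have already produced the input files.

```rust
use std::process::Command;

/// Builds the argument list for the `wit-component` invocation quoted above.
/// `imports.wit` and `exports.wit` are the file names used in this thread.
pub fn wit_component_args(
    guest: &str,
    adapter: &str,
    adapter_wit: &str,
    output: &str,
) -> Vec<String> {
    vec![
        "--import".to_string(),
        "./imports.wit".to_string(),
        "--interface".to_string(),
        "./exports.wit".to_string(),
        // The adapter module and its interface are passed as `module:wit`.
        "--adapt".to_string(),
        format!("{}:{}", adapter, adapter_wit),
        guest.to_string(),
        "-o".to_string(),
        output.to_string(),
    ]
}

/// Wraps the arguments in a `Command`; the caller decides when to run it,
/// for example with `.status()`.
pub fn wit_component_command(
    guest: &str,
    adapter: &str,
    adapter_wit: &str,
    output: &str,
) -> Command {
    let mut cmd = Command::new("wit-component");
    cmd.args(wit_component_args(guest, adapter, adapter_wit, output));
    cmd
}
```

Keeping the argument list in its own function makes it possible to check the command line without having `wit-component` installed.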

As I wrote all this I realized there's still an issue with the global management with cabi_realloc, so I need to bottom that out and tweak the "initialize the stack pointer" code a bit, but that shouldn't be too hard. Otherwise this should be good to review.

@alexcrichton
Member Author

Ok the most recent commit is a pretty bad hack around finding the stack pointer (searching the name section). I was hoping that better could be done but I'm not sure we can at this time. At the very least this should enable C & Rust to be used to write these adapter modules.
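
A name-section search of this kind might look like the sketch below. It assumes the global names have already been decoded into `(index, name)` pairs, and that the toolchain names its stack pointer `__stack_pointer`, which is the LLVM convention; the function name `find_stack_pointer` is hypothetical and this is not the code from the commit.

```rust
/// Looks for the stack-pointer global among `(index, name)` pairs decoded
/// from a module's name section. Returns `None` when no global carries the
/// conventional name, for example when the name section was stripped.
pub fn find_stack_pointer(global_names: &[(u32, &str)]) -> Option<u32> {
    global_names
        .iter()
        .find(|(_, name)| *name == "__stack_pointer")
        .map(|(index, _)| *index)
}
```

The weakness is visible in the signature: a module that has a stack pointer but no names for its globals is indistinguishable from one that has no stack pointer at all.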

Contributor

@pchickey pchickey left a comment

I think I need a little more help understanding the validator, but the tests look good to me

Should help with debugging structure ideally
This unfortunately suffers greatly from false negatives, but at this
time it's unclear if this can be done better.
@alexcrichton
Member Author

Oops sorry I forgot to write documentation for the new structures, but I've added that all now.

@alexcrichton alexcrichton merged commit fc35377 into bytecodealliance:main Oct 4, 2022
@alexcrichton alexcrichton deleted the preview1 branch October 4, 2022 18:50
alexcrichton added a commit to alexcrichton/wit-bindgen that referenced this pull request Oct 5, 2022
This commit is the next step in integrating `wasm32-wasi`,
`wit-bindgen`, tests, and components all together. Tests are now again
compiled with `wasm32-wasi` and use a repo-specific adapter module
(with support from bytecodealliance#338) to support transforming the final module into
an actual component.

In supporting this feature the support from bytecodealliance#331 is refactored into a
new `extract` Rust module so the functionality can be shared between
the ingestion of the main module as well as ingestion of adapter
modules. Adapter modules now are also supported on the CLI as a
standalone file without having to specify other options.

Note that the actual `wasi_snapshot_preview1.wasm` adapter is
non-functional in this commit and doesn't do anything fancy. The tests
in this repository don't really need all that much and I suspect all
we'll really need to implement is `fd_write` for fd 1 (as that's
stdout).