Introduce pass to lower memory.copy and memory.fill by dschuff · Pull Request #7021 · WebAssembly/binaryen

dschuff · 2024-10-18T00:42:23Z

This pass lowers away memory.copy and memory.fill operations. It generates a function that implements each of the instructions and replaces the instructions with calls to those functions.
It does not handle other bulk memory operations (e.g. passive segments and table operations) because emscripten does not use them. The goal is to let emscripten target old browsers that don't support bulk memory.
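
As a rough model of what such a generated helper has to do, here is a minimal C++ sketch of memory.fill semantics over a single 32-bit memory. It is not the code the pass emits (the pass builds wasm IR). `LinearMemory` and `fillPolyfill` are hypothetical names, and a thrown exception stands in for a wasm trap.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a single 32-bit linear memory.
using LinearMemory = std::vector<uint8_t>;

// Models a memory.fill helper: bounds check first, then store the byte
// `size` times. Throwing models a wasm trap.
void fillPolyfill(LinearMemory& mem, uint32_t dst, uint32_t value,
                  uint32_t size) {
  // Written with a subtraction so that no addition can wrap:
  // the range [dst, dst + size) must lie inside the memory.
  if (size > mem.size() || dst > mem.size() - size) {
    throw std::out_of_range("out of bounds memory.fill");
  }
  for (uint32_t i = 0; i < size; ++i) {
    // memory.fill stores only the low 8 bits of the value operand.
    mem[std::size_t(dst) + i] = uint8_t(value);
  }
}
```

A call site that used `memory.fill` would then become a plain call to this helper with the same three operands.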

dschuff · 2024-10-18T22:54:01Z

@kripken I haven't actually tested this for real yet, but can you give it a quick look-over for general sanity? I've not got a lot of experience using these APIs.

dschuff · 2024-10-18T23:03:48Z

Hmph. I don't know if I agree with clang-format on the best way to format this, but I guess it's best to keep everything conforming :D

kripken

A few corner cases here, but otherwise looks right to me.

With the corner cases handled, this may be an annoying amount of code to write in a declarative way. Another option might be to write wat code (edit: or compile to wat) and merge that in. RemoveNonJSOps.cpp does that with wasm-intrinsics.wat. I'm not sure if it would be better though.

kripken · 2024-10-18T22:57:45Z

src/passes/MemoryCopyFillLowering.cpp

+
+    if (needsMemoryCopy) {
+      Index dst = 0, src = 1, size = 2, temp = 3;
+      Name memory = module->memories.front()->name;


We might need a separate copy function per memory?

It's worse than that, because there can also be cross-memory copies. And one or both memories could be 64-bit. There's a check up top that bails if the module has multi-memory or memory64 support.

Heh, right, cross-memory copies too... sgtm to limit to single memories, I missed that part.

kripken · 2024-10-18T23:01:30Z

src/passes/MemoryCopyFillLowering.cpp

+        b.makeBinary(BinaryOp::MulInt32,
+          b.makeMemorySize(memory),
+          b.makeConst(Memory::kPageSize))));
+      // if dst + size > memsize or src + size > memsize, then trap.


Overflows can also happen here, unfortunately. See

binaryen/src/wasm-interpreter.h

Lines 3928 to 3934 in 679c26f

    if (sourceVal + sizeVal > sourceMemorySize * Memory::kPageSize ||
        destVal + sizeVal > destMemorySize * Memory::kPageSize ||
        // FIXME: better/cheaper way to detect wrapping?
        sourceVal + sizeVal < sourceVal || sourceVal + sizeVal < sizeVal ||
        destVal + sizeVal < destVal || destVal + sizeVal < sizeVal) {
      trap("out of bounds segment access in memory.copy");
    }

kripken · 2024-10-18T23:04:24Z

src/passes/MemoryCopyFillLowering.cpp

+            b.makeBinary(BinaryOp::EqInt32,
+              b.makeLocalGet(dst, Type::i32),
+              b.makeLocalGet(temp, Type::i32))),
+          // --dst; --src;


I think we may need to pick the direction, up or down, based on the overlap? wasm-interpreter.h does that.

kripken · 2024-10-18T23:06:00Z

src/passes/MemoryCopyFillLowering.cpp

+      Name memory = module->memories.front()->name;
+      Block* body = b.makeBlock();
+
+      // if dst + size > memsize in bytes, then trap.


Wrapping is possible here too.

dschuff · 2024-10-18T23:21:51Z

Ah, so here is where we have to decide how general-purpose we want the pass to be. If this is just for lowering away LLVM-produced memcpy and supporting older browsers, then we don't need to support multi-memory, memory64, or threads (because browsers that support those always support bulk memory). Not supporting threads means that the order of byte copies doesn't have to match the spec, because it's not observable. If we further assume that clang is the source of the memory ops, then we don't have to worry about pointer overflow because it's UB.
For now I'm pretty happy with the former assumption because supporting more feature interactions would add a fair amount of complexity for no known use case. The latter I'm less sure about. I suppose it's possible we'd be running this lowering over mixed C+Rust code?

kripken · 2024-10-18T23:31:22Z

Those sound like reasonable assumptions to me.

But IIANM the order of copying matters when the ranges overlap, too:

[----------]
   [----------]

Copying the former over the latter in the forward direction is fine, but in reverse will smear.

But maybe LLVM never emits a memory.copy where this might matter? Does it lower llvm.memcpy and/or llvm.memmove?

Btw, an issue I realized with limiting these passes to LLVM's output is that we can't run the spec tests on them or fuzz them, not normally at least.
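
For reference, the overlap rule discussed above can be sketched in C++ in the style of memmove: copy forward when dst is at or below src, and backward otherwise, so no source byte is overwritten before it is read. `copyWithin` is a hypothetical name, the vector stands in for a single linear memory, and the bounds check is assumed to have happened already.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies `size` bytes from src to dst inside one buffer, picking the
// direction from the relative position of the two ranges.
void copyWithin(std::vector<uint8_t>& mem, uint32_t dst, uint32_t src,
                uint32_t size) {
  // Precondition (assumed checked by the caller): both ranges are in bounds.
  assert(uint64_t(dst) + size <= mem.size());
  assert(uint64_t(src) + size <= mem.size());
  if (dst <= src) {
    // Destination starts at or below the source: ascending order is safe.
    for (uint32_t i = 0; i < size; ++i) {
      mem[std::size_t(dst) + i] = mem[std::size_t(src) + i];
    }
  } else {
    // Destination starts above the source: descend, or an overlapping
    // tail of the source would be clobbered before it is read.
    for (uint32_t i = size; i > 0; --i) {
      mem[std::size_t(dst) + i - 1] = mem[std::size_t(src) + i - 1];
    }
  }
}
```

With a single memory and no threads, either direction gives the same result when the ranges do not overlap, so the choice only matters in the overlapping case.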

dschuff · 2024-10-18T23:41:26Z

Ah, good point about the order. LLVM does lower both to memory.copy (see WebAssemblySelectionDAGInfo.cpp).
Yeah, my way of testing was just to run the passes unconditionally and send them through all of the emscripten tests. But that's probably not as good as including spec tests or fuzzing.

sbc100 · 2024-11-07T22:13:29Z

src/passes/MemoryCopyFillLowering.cpp

+  void VisitTableFill(TableCopy* curr) {
+    Fatal() << "table.fill instruction found. Memory copy lowering is not "
+    "designed to work on modules with bulk table operations";
+  }


Same for table.init and memory.init?

The init and drop instructions are only valid when there are passive segments, so it should be sufficient to only check for passive segments.

dschuff · 2024-11-08T18:10:10Z

Btw, an issue I realized with limiting these passes to LLVM's output is that we can't run the spec tests on them or fuzz them, not normally at least.

I realized that another way to implement the trapping behavior inside wasm32 would be to extend the addresses to i64 and do the trap check in i64. Obviously it would add more complexity; do you think that would be worth it?

kripken · 2024-11-08T18:23:53Z

I don't follow, how would extending to i64 help here? Would we be able to check things we don't currently (what)?

dschuff · 2024-11-08T18:37:28Z

Sorry, I was thinking of the comments above about possible overflow when checking for out of bounds trapping. Currently we don't handle that because UB, but if we wanted to run spec tests or fuzz, we'd presumably need to handle traps that also overflow the i32 range.

kripken · 2024-11-08T18:47:16Z

I see. We can also test for that overflow in 32-bit (x + y < x implies overflow if both are unsigned), but I don't know what's simpler.

I'm also not sure if full fuzzing and spec tests are worth it. We can get some fuzzing through emcc (using csmith to fuzz from C code).
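
The 32-bit wrap test mentioned above (x + y < x for unsigned operands) can be sketched as follows. `outOfBounds32` is a hypothetical helper, not code from the pass. It relies on C++ unsigned arithmetic wrapping modulo 2^32, which matches i32.add in wasm.

```cpp
#include <cstdint>

// Returns true if the range [addr, addr + size) is not fully contained
// in a memory of memBytes bytes, including the case where addr + size
// wraps around 2^32.
constexpr bool outOfBounds32(uint32_t addr, uint32_t size,
                             uint32_t memBytes) {
  uint32_t end = addr + size;  // may wrap modulo 2^32
  bool wrapped = end < addr;   // the sum is below an operand only if it wrapped
  return wrapped || end > memBytes;
}
```

The same check would be needed once for src and once for dst in memory.copy, which is the extra condition being weighed here.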

dschuff · 2024-11-08T18:50:37Z

Yeah my thinking was that extending to i64 would be simpler than adding the extra condition you mentioned for both src and dst. But I'm also leaning toward not bothering with the spec tests.
I'm running all the emscripten tests with the pass enabled (and I'm considering adding another test mode that builds for MVP wasm which would have them enabled continuously? WDYT about that?)
To get extra testing beyond the emscripten tests, I'm also going to run them with the LLVM testsuite tests.
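
A minimal sketch of the i64-widening alternative described above, for comparison with the 32-bit wrap test. `outOfBounds64` is a hypothetical helper, and `kPageSize` is redefined locally here to mirror `Memory::kPageSize` from the quoted code.

```cpp
#include <cstdint>

constexpr uint64_t kPageSize = 65536;  // wasm page size in bytes

// Returns true if [addr, addr + size) is not fully contained in a
// 32-bit memory of `pages` pages. All arithmetic is done in 64 bits.
constexpr bool outOfBounds64(uint32_t addr, uint32_t size, uint32_t pages) {
  // Two 32-bit values sum to at most 2^33 - 2, so this cannot wrap.
  uint64_t end = uint64_t(addr) + uint64_t(size);
  // 65536 pages is 2^32 bytes, which a 32-bit multiply would wrap to 0.
  uint64_t memBytes = uint64_t(pages) * kPageSize;
  return end > memBytes;
}
```

This needs one comparison per address instead of two, at the cost of two extends and 64-bit arithmetic in the generated function.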

dschuff · 2024-11-08T18:58:40Z

OK, assuming we keep the current trapping behavior, and the current structure (i.e. of generating rather than merging the polyfill functions) I think this is ready.

kripken · 2024-11-08T19:37:16Z

I'm running all the emscripten tests with the pass enabled (and I'm considering adding another test mode that builds for MVP wasm which would have them enabled continuously? WDYT about that?)
To get extra testing beyond the emscripten tests, I'm also going to run them with the LLVM testsuite tests.

I like the idea to add an MVP test mode. Off on CircleCI, just running on the testsuite bot?

dschuff · 2024-11-08T19:40:37Z

I like the idea to add an MVP test mode. Off on CircleCI, just running on the testsuite bot?

Yeah, along with the ASan and similar modes.

kripken

I think we could add some execution tests for this, under test/lit/exec/. The tests can contain some copy/fill operations, and print memory contents after the operation. By running them with --lower --fuzz-exec we would verify that the output is unchanged after the lowering.

dschuff · 2024-11-12T22:03:46Z

I think we could add some execution tests for this, under test/lit/exec/. The tests can contain some copy/fill operations, and print memory contents after the operation. By running them with --lower --fuzz-exec we would verify that the output is unchanged after the lowering.

I'm trying this out, but I'm not sure how to print the memory. The execution engine in wasm-opt doesn't seem to have the same magic spectest.print imports that wasm-shell has.

kripken · 2024-11-12T22:11:24Z

You can call the logging imports, see

binaryen/test/lit/exec/fuzzing-api.wast

Lines 25 to 35 in 52cae11

    
    ;; CHECK:      [fuzz-exec] calling logging
    ;; CHECK-NEXT: [LoggingExternalInterface logging 42]
    ;; CHECK-NEXT: [LoggingExternalInterface logging 3.14159]
    (func $logging (export "logging")
     (call $log-i32
      (i32.const 42)
     )
     (call $log-f64
      (f64.const 3.14159)
     )
    )

and their definitions earlier in that file.

dschuff · 2024-11-12T23:53:08Z

Thanks, WDYT of the memory copy test? It's derived from the wast spectest but turned into a fuzz-exec test.

kripken · 2024-11-13T00:13:22Z

test/lit/exec/memory-copy.wat

@@ -0,0 +1,175 @@
+;; NOTE: Assertions have been generated by update_lit_checks.py --output=fuzz-exec and should not be edited.
+
+;; RUN: wasm-opt %s --enable-bulk-memory --fuzz-exec-before -q -o /dev/null 2>&1 | filecheck %s


No need for this line, I think: --fuzz-exec on the next line will run both before and after the pass, print, and compare.

Suggested change

;; RUN: wasm-opt %s --enable-bulk-memory --fuzz-exec-before -q -o /dev/null 2>&1 | filecheck %s

Is there an internal comparison, or just filecheck? One difference is that the memory.copy traps in the original and the polyfill executes an unreachable in the lowered version. This results in a different printout.

Nevermind, ignore what I just deleted; I think I have it right now.

I don't think we need to filecheck the second execution.

For me, --pass --fuzz-exec not erroring is proof that a pass does not break execution, and is the main benefit of this test.

Yeah; the test auto-update does actually include the second execution in the filecheck (and it has the different trap message), so that's also covered anyway for free.

kripken · 2024-11-13T00:17:18Z

Test looks great! lgtm with the same for memory.fill.

kripken reviewed Oct 18, 2024

dschuff added 6 commits November 5, 2024 17:43

Lower memory.copy

0ed5b59

update help tests

9bbd758

reverse loop condition

16285bb

add memory fill

47dc98f

copy in both directions, add var names

c373c13

Fix bugs

46aaa06

dschuff force-pushed the copy-fill-lowering branch from d2c174a to 46aaa06 Compare November 6, 2024 23:37

sbc100 reviewed Nov 7, 2024

use Fatal instead of throw; error on other bulk-memory instructions; update test

8c74874

sbc100 reviewed Nov 7, 2024

dschuff added 2 commits November 7, 2024 14:24

review

d0f3c87

clang-format

914def0

dschuff marked this pull request as ready for review November 7, 2024 22:27

dschuff changed the title ~~[WIP] Lower memory.copy~~ Introduce pass to lower memory.copy and memory.fill Nov 7, 2024

fix func names

c758b5f

sbc100 reviewed Nov 7, 2024

improve formatting

02201ef

dschuff added 2 commits November 8, 2024 10:52

suppress clang-format

2854ff3

revert formatting changes; not worth it

3ccb64f

kripken reviewed Nov 8, 2024

dschuff added 2 commits November 8, 2024 11:59

review suggestions

621e6b5

add explanation

7752824

Add memory copy test

0b7c98a

kripken reviewed Nov 13, 2024

update tests

91b60ad

kripken approved these changes Nov 13, 2024

dschuff enabled auto-merge (squash) November 13, 2024 01:10

dschuff merged commit 496d92b into main Nov 13, 2024

dschuff deleted the copy-fill-lowering branch November 13, 2024 01:33
