Shield compiled modules from their appended metadata#4609
Merged
alexcrichton merged 1 commit intobytecodealliance:mainfrom Aug 10, 2022
Merged
Shield compiled modules from their appended metadata#4609alexcrichton merged 1 commit intobytecodealliance:mainfrom
alexcrichton merged 1 commit intobytecodealliance:mainfrom
Conversation
This commit fixes bytecodealliance#4600 in a somewhat roundabout fashion. Currently the `main` branch of Wasmtime exhibits unusual behavior: * If `./ci/run-tests.sh` is run then the `cache_accounts_for_opt_level` test does not fail. * If `cargo test -p wasmtime --lib` is run, however, then the test fails. This test is indeed being run as part of `./ci/run-tests.sh` and it's also passing in CI. The exact failure is that part of the debuginfo support we have takes an existing ELF image, copies it, and then appends some information to inform profilers/gdb about the image. This code is all quite old at this point and not 100% optimal, but that's at least where we're at. The problem is that the appended `ProgramHeader64` is not aligned correctly during `cargo test -p wasmtime --lib`, which is the panic that happens causing the test to fail. The reason, however, that this test passes with `./ci/run-tests.sh` is that the alignment of `ProgramHeader64` is 1 instead of 8. The reason for that is that the `object` crate has an `unaligned` feature which forcibly unaligns all primitives to 1 byte instead of their natural alignment. During `cargo test -p wasmtime --lib` this feature is not enabled but during `./ci/run-tests.sh` this feature is enabled. The feature is currently enabled through inclusion of the `backtrace` crate which only happens for some tests in some crates. The alignment issue explains why the test fails on a single crate test but fails on the whole workspace tests. The next issue I investigated was if this test ever passed. It turns out that on v0.39.0 this test passed, and the regression to main was introduced during bytecodealliance#4571. That PR, however, has nothing to do with any of this! The reason that this showed up as causing a "regression" however is because it changed cranelift settings which changed the size of serialized metadata at the end of a Wasmtime cache object. Wasmtime compiled artifacts are ELF images with Wasmtime-specific metadata appended after them. This appended metadata was making its way all the way through to the gdbjit image itself which mean that while the end of the ELF file itself was properly aligned the space after the Wasmtime metadata was not aligned. This metadata changes in size over time as Cranelift settings change which explains why bytecodealliance#4571 was the "source" of the regression. The fix in this commit is to discard the extra Wasmtime metadata when creating an `MmapVec` representing the underlying ELF image. This is already supported with `MmapVec::drain` so it was relatively easy to insert that. This means that the gdbjit image starts with just the ELF file itself which is always aligned at the end, which gets the test passing with/without the `unaligned` feature in the `object` crate.
Subscribe to Label Actioncc @peterhuene DetailsThis issue or pull request has been labeled: "wasmtime:api"Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
jameysharp
approved these changes
Aug 9, 2022
Contributor
jameysharp
left a comment
There was a problem hiding this comment.
I needed several read-throughs of both the commit message and the patch to understand this, but eventually I was able to convince myself that this patch does what you said it needed to: limit the returned SerializedModule::artifacts to refer only to the ELF object, and not any trailers.
I think it was MmapVec::drain that caused me the most trouble as it only loosely resembles the corresponding method on Vec. Perhaps a future improvement would be to model that method after Vec::split_off instead?
alexcrichton
added a commit
to alexcrichton/wasmtime
that referenced
this pull request
Aug 10, 2022
As suggested on bytecodealliance#4609
alexcrichton
added a commit
that referenced
this pull request
Aug 15, 2022
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This commit fixes #4600 in a somewhat roundabout fashion. Currently the
mainbranch of Wasmtime exhibits unusual behavior:./ci/run-tests.shis run then thecache_accounts_for_opt_leveltest does not fail.
cargo test -p wasmtime --libis run, however, then the testfails.
This test is indeed being run as part of
./ci/run-tests.shand it'salso passing in CI. The exact failure is that part of the debuginfo
support we have takes an existing ELF image, copies it, and then appends
some information to inform profilers/gdb about the image. This code is
all quite old at this point and not 100% optimal, but that's at least
where we're at.
The problem is that the appended
ProgramHeader64is not alignedcorrectly during
cargo test -p wasmtime --lib, which is the panic thathappens causing the test to fail. The reason, however, that this test
passes with
./ci/run-tests.shis that the alignment ofProgramHeader64is 1 instead of 8. The reason for that is that theobjectcrate has anunalignedfeature which forcibly unaligns allprimitives to 1 byte instead of their natural alignment. During
cargo test -p wasmtime --libthis feature is not enabled but during./ci/run-tests.shthis feature is enabled. The feature is currentlyenabled through inclusion of the
backtracecrate which only happensfor some tests in some crates.
The alignment issue explains why the test fails on a single crate test
but fails on the whole workspace tests. The next issue I investigated
was if this test ever passed. It turns out that on v0.39.0 this test
passed, and the regression to main was introduced during #4571. That
PR, however, has nothing to do with any of this! The reason that this
showed up as causing a "regression" however is because it changed
cranelift settings which changed the size of serialized metadata at the
end of a Wasmtime cache object.
Wasmtime compiled artifacts are ELF images with Wasmtime-specific
metadata appended after them. This appended metadata was making its way
all the way through to the gdbjit image itself which mean that while the
end of the ELF file itself was properly aligned the space after the
Wasmtime metadata was not aligned. This metadata changes in size over
time as Cranelift settings change which explains why #4571 was the
"source" of the regression.
The fix in this commit is to discard the extra Wasmtime metadata when
creating an
MmapVecrepresenting the underlying ELF image. This isalready supported with
MmapVec::drainso it was relatively easy toinsert that. This means that the gdbjit image starts with just the ELF
file itself which is always aligned at the end, which gets the test
passing with/without the
unalignedfeature in theobjectcrate.