cranelift: Specialize StackAMode::FPOffset #8292
Merged
jameysharp merged 2 commits into bytecodealliance:main on Apr 3, 2024
The StackAMode::FPOffset address mode was always used together with fp_to_arg_offset, to compute addresses within the current stack frame's argument area.

Instead, introduce a new StackAMode::ArgOffset variant specifically for stack addresses within the current frame's argument area. The details of how to find the argument area are folded into the conversion from the target-independent StackAMode into target-dependent address-mode types.

Currently, fp_to_arg_offset returns a target-specific constant, so I've preserved that constant in each backend's address-mode conversion.

However, in general the location of the argument area may depend on calling convention, flags, or other concerns. Also, it may not always be desirable to use a frame pointer register as the base to find the argument area. I expect some backends will eventually need to introduce new synthetic addressing modes to resolve argument-area offsets after register allocation, when the full frame layout is known.

I also cleaned up a couple of minor things while I was in the area:

- Determining argument extension type was written in a confusing way and also had a typo in the comment describing it.
- riscv64's AMode::offset was only used in one place and is clearer when inlined.
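The conversion described above can be sketched roughly as follows. This is a simplified illustration, not the actual Cranelift code: the real `StackAMode` variants also carry a type, `TargetAMode` and `lower` are hypothetical names, and the constant 16 is assumed here only because the x64 review thread mentions a `+ 16`.

```rust
/// Simplified stand-in for the target-independent stack address mode.
/// (The real enum's variants also carry an IR type, omitted here.)
#[derive(Clone, Copy, Debug, PartialEq)]
enum StackAMode {
    /// Offset into the current frame's argument area.
    ArgOffset(i64),
    /// Offset from the stack pointer.
    SPOffset(i64),
}

/// Hypothetical target-dependent address mode: a base register plus a displacement.
#[derive(Clone, Copy, Debug, PartialEq)]
enum TargetAMode {
    FpRelative(i64),
    SpRelative(i64),
}

/// Assumed target-specific constant: distance from the frame pointer to the
/// start of the argument area. A real backend may need to compute this from
/// the calling convention or frame layout instead.
const FP_TO_ARG_OFFSET: i64 = 16;

/// Convert the target-independent mode into the target-dependent one.
/// Callers only say "argument area"; how to find it is decided here.
fn lower(mode: StackAMode) -> TargetAMode {
    match mode {
        StackAMode::ArgOffset(off) => TargetAMode::FpRelative(off + FP_TO_ARG_OFFSET),
        StackAMode::SPOffset(off) => TargetAMode::SpRelative(off),
    }
}
```

With this shape, shared ABI code no longer needs to call anything like `fp_to_arg_offset` itself; a backend that later wants a different base register only has to change its own `lower`.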
abrown
approved these changes
Apr 3, 2024
cranelift/codegen/src/isa/x64/abi.rs
Outdated
- StackAMode::FPOffset(off, _ty) => {
+ StackAMode::ArgOffset(off, _ty) => {
      let off = i32::try_from(off)
          .expect("Offset in FPOffset is greater than 2GB; should hit impl limit first");
Member
Suggested change
- .expect("Offset in FPOffset is greater than 2GB; should hit impl limit first");
+ .expect("Offset in ArgOffset is greater than 2GB; should hit implementation limit first");
Contributor
The + 16 to compute the final frame pointer offset can now overflow too, right?
Contributor
Author
Good catches! With regard to overflow, I'm moving this +16 inside the i32::try_from so the add happens at i64 instead. That makes its overflow behavior the same as current, which already had unchecked i64 addition. Similarly, the aarch64 and riscv64 targets are still doing the addition at i64 after this patch, and s390x doesn't add anything so can't overflow.
@bjorn3 correctly pointed out that I had changed the overflow behavior of this address computation.

The existing code always added the result of `fp_to_arg_offset` using `i64` addition. It used Rust's default overflow behavior for addition, which panics in debug builds and wraps in release builds.

In this commit I'm preserving that behavior:

- s390x doesn't add anything, so can't overflow.
- aarch64 and riscv64 use `i64` offsets in `FPOffset` address modes, so the addition is still using `i64` addition.
- x64 does a checked narrowing to `i32`, so it's important to do the addition before that, on the wider `i64` offset.
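To illustrate why the order matters on x64, here is a minimal sketch of the two orderings. The function names are hypothetical, the constant 16 is taken from the `+ 16` discussed in the review, and the sketch uses `checked_add`/`wrapping_add` and returns `Option` to make the behavior explicit, whereas the actual code uses plain `+` and `.expect(...)`.

```rust
/// Assumed x64 distance from the frame pointer to the argument area.
const FP_TO_ARG_OFFSET: i64 = 16;

/// Add at i64 width first, then do the checked narrowing to i32.
/// An out-of-range displacement is reported as `None`.
fn add_then_narrow(off: i64) -> Option<i32> {
    let sum = off.checked_add(FP_TO_ARG_OFFSET)?;
    i32::try_from(sum).ok()
}

/// Narrow to i32 first, then add at i32 width. `wrapping_add` models what a
/// plain `+` does in a release build; a debug build would panic instead.
fn narrow_then_add_wrapping(off: i64) -> Option<i32> {
    let narrow = i32::try_from(off).ok()?;
    Some(narrow.wrapping_add(FP_TO_ARG_OFFSET as i32))
}
```

For an offset just below `i32::MAX`, such as `i32::MAX as i64 - 8`, the narrowing in `narrow_then_add_wrapping` succeeds and the following add silently wraps to a negative displacement, while `add_then_narrow` rejects the value at the narrowing step.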
jameysharp
added a commit
to jameysharp/wasmtime
that referenced
this pull request
Apr 18, 2024
This reverts the key parts of e3a08d4 (bytecodealliance#8151), because it turns out that we didn't need that abstraction.

Several changes in the last month have enabled this:

- bytecodealliance#8292 and then bytecodealliance#8316 allow us to refer to either incoming or outgoing argument areas in a (mostly) consistent way
- bytecodealliance#8327, bytecodealliance#8377, and bytecodealliance#8383 demonstrate that we never need to delay writing stack arguments directly to their final location
github-merge-queue bot
pushed a commit
that referenced
this pull request
Apr 18, 2024
This reverts the key parts of e3a08d4 (#8151), because it turns out that we didn't need that abstraction.

Several changes in the last month have enabled this:

- #8292 and then #8316 allow us to refer to either incoming or outgoing argument areas in a (mostly) consistent way
- #8327, #8377, and #8383 demonstrate that we never need to delay writing stack arguments directly to their final location

prtest:full