Cranelift: Fix FPR saving and shadow space allocation for Windows x64. #1734
peterhuene merged 4 commits into bytecodealliance:master
This commit fixes both how FPR callee-saved registers are saved and how the shadow space allocation occurs when laying out the stack for Windows x64 calling convention.
Importantly, this commit removes the compiler limitation of stack size for Windows x64 that was imposed because FPR saves previously couldn't always be represented in the unwind information.
The FPR saves are now performed without using stack slots, much like how the callee-saved GPRs are saved. The total CSR space is given to `layout_stack` so that it is included in the frame size and to offset the layout of spills and explicit slots.
The FPR saves are now done via an RSP offset (post adjustment) and they always follow the GPR saves on the stack. A simpler calculation can now be made to determine the proper offsets of the FPR saves for representing the unwind information.
Additionally, the shadow space is no longer treated as an incoming argument, but an explicit stack slot that gets laid out at the lowest address possible in the local frame. This prevents `layout_stack` from putting a spill or explicit slot in this reserved space. In the future, `layout_stack` should take advantage of the *caller-provided* shadow space for spills, but this commit does not attempt to address that.
The shadow space is now omitted from the local frame for leaf functions.
Fixes bytecodealliance#1728.
Fixes bytecodealliance#1587.
Fixes bytecodealliance#1475.
See this comment for a "before and after" codegen.
iximeow left a comment
This is great, thank you! Taking sp as an argument to avoid dealing with stack slots is a good idea, wish I thought of it :)
"Requesting changes" because I left a few thoughts about alignment details, particularly for non-64-bit platforms. I'm also not sure if those platforms have callee-save FPRs in the first place, so it might be an academic concern. Either way, that's my only hesitance towards approving.
I think that this may be all that's necessary to remove this ignore block as well perhaps?
Good catch! I forgot we had disabled tests because of this issue. I'll re-enable.
48f8035 to 54f9cd2
iximeow left a comment
Thank you! (modulo happy CI run)
This commit prevents updating the XMM save unwind operation offsets when a frame pointer is not used, even though currently Cranelift always uses a frame pointer. This will prevent incorrect unwind information in the future when we start omitting frame pointers.
@bnjbvr would you mind reviewing or suggesting a reviewer from cranelift core? Thanks!
After discussing with Dan, I'm proceeding with merging this with @iximeow's sign-off.
Oops, sorry I didn't see this among the github-bot mentions; I entirely trust @iximeow's reviews when it comes to Cranelift, for what it's worth :)
This PR fixes both how FPR callee-saved registers are saved and how the
shadow space allocation occurs when laying out the stack for Windows x64
calling convention.
Importantly, this PR removes the compiler limitation of stack size for
Windows x64 that was imposed because FPR saves previously couldn't always be
represented in the unwind information.
The FPR saves are now performed without using stack slots, much like how the
callee-saved GPRs are saved. The total CSR space is given to `layout_stack` so
that it is included in the frame size and to offset the layout of spills and
explicit slots.
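To make the CSR accounting concrete, here is a minimal sketch of how the total callee-saved space might be computed and folded into the frame size. It is not the code from this PR: `csr_area_size` and `frame_size` are hypothetical helpers, and the 16-byte alignment of the XMM save area is an assumption based on the usual requirement for aligned 128-bit stores.

```rust
/// Bytes per saved GPR (one 64-bit register).
const GPR_SIZE: u32 = 8;
/// Bytes per saved XMM register (all 128 bits are preserved).
const FPR_SIZE: u32 = 16;

/// Round `value` up to the next multiple of `align` (a power of two).
fn align_up(value: u32, align: u32) -> u32 {
    debug_assert!(align.is_power_of_two());
    (value + align - 1) & !(align - 1)
}

/// Hypothetical helper: total bytes reserved for callee-saved registers.
/// The FPR area is assumed to start on a 16-byte boundary after the GPRs.
fn csr_area_size(num_gprs: u32, num_fprs: u32) -> u32 {
    let gpr_bytes = num_gprs * GPR_SIZE;
    if num_fprs == 0 {
        return gpr_bytes;
    }
    align_up(gpr_bytes, FPR_SIZE) + num_fprs * FPR_SIZE
}

/// Hypothetical stand-in for the frame-size part of stack layout:
/// spills and explicit slots are placed past the CSR area, and the
/// total is kept 16-byte aligned.
fn frame_size(csr_bytes: u32, spill_and_slot_bytes: u32) -> u32 {
    align_up(csr_bytes + spill_and_slot_bytes, 16)
}
```

With three saved GPRs and two saved XMM registers, this sketch reserves 64 bytes of CSR space, so any spill laid out afterwards cannot overlap a register save.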
The FPR saves are now done via an RSP offset (post adjustment) and they always
follow the GPR saves on the stack. A simpler calculation can now be made to
determine the proper offsets of the FPR saves for representing the unwind
information.
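The offset calculation can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `fpr_save_offset` and `encode_xmm_save` are hypothetical names, and the sketch assumes the XMM saves sit at the top of the region allocated by the RSP adjustment, directly below the pushed GPRs. The two encodings mirror the Windows x64 unwind codes `UWOP_SAVE_XMM128` (offset stored scaled by 16 in a 16-bit slot) and `UWOP_SAVE_XMM128_FAR` (unscaled 32-bit offset), which is why large frames no longer need to be rejected.

```rust
/// How an XMM save could be recorded in Windows x64 unwind information.
#[derive(Debug, PartialEq)]
enum XmmSaveOp {
    /// UWOP_SAVE_XMM128: offset / 16 stored in a single 16-bit slot.
    Near { scaled_offset: u16 },
    /// UWOP_SAVE_XMM128_FAR: unscaled 32-bit offset.
    Far { offset: u32 },
}

/// Hypothetical helper: RSP-relative offset (after the stack adjustment)
/// of the `index`-th saved XMM register. `local_size` is the number of
/// bytes subtracted from RSP; saves are assumed to be packed downward
/// from the top of that region.
fn fpr_save_offset(local_size: u32, index: u32) -> u32 {
    local_size - 16 * (index + 1)
}

/// Pick the compact encoding when the scaled offset fits in 16 bits,
/// otherwise fall back to the far form.
fn encode_xmm_save(offset: u32) -> XmmSaveOp {
    debug_assert!(offset % 16 == 0, "XMM save offsets are 16-byte aligned");
    let scaled = offset / 16;
    if scaled <= u16::MAX as u32 {
        XmmSaveOp::Near { scaled_offset: scaled as u16 }
    } else {
        XmmSaveOp::Far { offset }
    }
}
```

For a 64-byte adjustment, the first two XMM saves in this sketch land at RSP+48 and RSP+32, and both fit the compact form.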
Additionally, the shadow space is no longer treated as an incoming argument,
but an explicit stack slot that gets laid out at the lowest address possible in
the local frame. This prevents `layout_stack` from putting a spill or explicit
slot in this reserved space. In the future, `layout_stack` should take
advantage of the caller-provided shadow space for spills, but this PR does
not attempt to address that.
The shadow space is now omitted from the local frame for leaf functions.
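As a rough illustration of the shadow space handling, the sketch below reserves the 32-byte shadow space (four 8-byte register-argument home slots in the Windows x64 convention) at the lowest offsets of the local frame, and skips it for leaf functions. `layout_local_frame` and `Slot` are hypothetical and ignore per-slot alignment; this is not Cranelift's `layout_stack`.

```rust
/// Windows x64 shadow space: home slots for four register arguments.
const SHADOW_SPACE_SIZE: u32 = 32;

#[derive(Debug, PartialEq)]
struct Slot {
    name: &'static str,
    /// Offset from RSP once the frame has been set up.
    offset: u32,
    size: u32,
}

/// Hypothetical layout routine. When the function makes calls, the
/// shadow space for its callees occupies the lowest addresses, so no
/// spill or explicit slot can be placed there. Leaf functions call
/// nothing and therefore reserve no shadow space.
fn layout_local_frame(makes_calls: bool, requested: &[(&'static str, u32)]) -> Vec<Slot> {
    let mut slots = Vec::new();
    let mut offset = 0;
    if makes_calls {
        slots.push(Slot { name: "shadow", offset, size: SHADOW_SPACE_SIZE });
        offset += SHADOW_SPACE_SIZE;
    }
    for &(name, size) in requested {
        slots.push(Slot { name, offset, size });
        offset += size;
    }
    slots
}
```

In a non-leaf function the first spill in this sketch starts at RSP+32, while in a leaf function it starts at RSP+0.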
Fixes #1728.
Fixes #1587.
Fixes #1475.