Use F64X2 as type when saving and restoring XMM registers #1499
iximeow merged 1 commit into bytecodealliance:master
Subscribe to Label Action: cc @bnjbvr
This issue or pull request has been labeled: "cranelift"
cranelift/codegen/src/isa/x86/abi.rs
@@ -911,6 +907,7 @@ fn insert_common_prologue(
     for reg in csrs.iter(FPR) {
         // Append param to entry Block
         let csr_arg = pos.func.dfg.append_block_param(block, types::F64);
This line should be removed.
Rustfmt failed: Diff in /home/runner/work/wasmtime/wasmtime/cranelift/codegen/src/isa/x86/abi.rs at line 1031:
         let mut fpr_offset = 0;
         for reg in csrs.iter(FPR) {
-            let value = pos
-                .ins()
-                .load(types::F64X2, ir::MemFlags::trusted(), stack_addr, fpr_offset);
+            let value = pos.ins().load(
+                types::F64X2,
+                ir::MemFlags::trusted(),
+                stack_addr,
+                fpr_offset,
+            );
             fpr_offset += types::F64X2.bytes() as i32;
             if let Some(ref mut cfa_state) = cfa_state.as_mut() {
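The restore loop in the diff above advances `fpr_offset` by the byte width of the saved type, so moving from `F64` to `F64X2` doubles the spacing between save slots from 8 to 16 bytes. A minimal sketch of that arithmetic follows. It assumes nothing about Cranelift's real API: `fpr_slot_offsets` and the two constants are hypothetical names introduced for illustration only.

```rust
/// Byte width of one 64-bit float lane (what an `F64` slot would occupy).
pub const F64_BYTES: i32 = 8;
/// Byte width of a full 128-bit XMM register (what an `F64X2` slot occupies).
pub const F64X2_BYTES: i32 = 16;

/// Hypothetical helper: returns the stack offset of each of `count` saved
/// floating-point registers when every slot is `slot_bytes` wide.
/// It mirrors the `fpr_offset += <type>.bytes()` pattern from the diff.
pub fn fpr_slot_offsets(count: usize, slot_bytes: i32) -> Vec<i32> {
    let mut offsets = Vec::with_capacity(count);
    let mut fpr_offset = 0;
    for _ in 0..count {
        offsets.push(fpr_offset);
        // Advance to the next slot by the width of the saved type.
        fpr_offset += slot_bytes;
    }
    offsets
}
```

With three saved registers, 16-byte slots land at offsets 0, 16 and 32, whereas 8-byte slots would land at 0, 8 and 16 and leave no room for the upper half of each register.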
When adding floating-point registers as callee-saved registers to block and function parameter lists, add them as `F64X2` arguments.
iximeow left a comment
I hadn't even expected someone to pick this up so quickly - thank you!
  ; nextln: version: 1,
  ; nextln: flags: 0,
- ; nextln: prologue_size: 26,
+ ; nextln: prologue_size: 25,
The astute reader may ask, "why did prologue sizes change if this was supposed to be a no-op cleanup-style change?"
It turns out my `F64 causes a whole XMM register to be preserved anyway.` comment became a lie between when I wrote it and when #1216 landed. Or maybe I was wrong the whole time? It was generating `movsd` (for example, `f2 41 0f 11 33`), whereas with this change Cranelift generates `movups` (`41 0f 11 33`, one byte shorter).
This is why prologues are shorter, but more importantly, `movsd` only works with the low 64 bits of its xmm argument! `movups`, which moves the entire 128 bits, is the correct instruction (`movaps` would also be acceptable).
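To make the difference concrete, here is a small model of the two save strategies. It is only a sketch: the `Xmm` struct and the `spill_low64`, `spill_full128` and `reload` functions are hypothetical stand-ins for the hardware behavior, not Cranelift code. The two byte sequences are the encodings quoted in the comment above.

```rust
/// Hypothetical model of one XMM register as two 64-bit halves.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Xmm {
    pub low: u64,
    pub high: u64,
}

/// Models a `movsd`-style save: only the low 64 bits reach the stack slot.
pub fn spill_low64(reg: Xmm) -> [u8; 16] {
    let mut slot = [0u8; 16];
    slot[..8].copy_from_slice(&reg.low.to_le_bytes());
    slot
}

/// Models a `movups`-style save: all 128 bits reach the stack slot.
pub fn spill_full128(reg: Xmm) -> [u8; 16] {
    let mut slot = [0u8; 16];
    slot[..8].copy_from_slice(&reg.low.to_le_bytes());
    slot[8..].copy_from_slice(&reg.high.to_le_bytes());
    slot
}

/// Reads a 16-byte stack slot back into the register model.
pub fn reload(slot: &[u8; 16]) -> Xmm {
    let mut low = [0u8; 8];
    let mut high = [0u8; 8];
    low.copy_from_slice(&slot[..8]);
    high.copy_from_slice(&slot[8..]);
    Xmm {
        low: u64::from_le_bytes(low),
        high: u64::from_le_bytes(high),
    }
}

/// Encodings quoted in the review comment: `movsd` carries an extra
/// `f2` prefix byte compared with `movups`.
pub const MOVSD_STORE: [u8; 5] = [0xf2, 0x41, 0x0f, 0x11, 0x33];
pub const MOVUPS_STORE: [u8; 4] = [0x41, 0x0f, 0x11, 0x33];
```

In this model, a register saved with `spill_full128` round-trips unchanged through `reload`, while one saved with `spill_low64` comes back with its upper half cleared. That lost upper half is the correctness problem the comment describes, and the dropped prefix byte accounts for the prologue shrinking from 26 to 25 bytes.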
Aha! I was about to ask that very question. (Don't know how astute of a reader I am, though.) Thanks for the explanation.
When adding floating-point registers as callee-saved registers to block and function parameter lists, add them as `F64X2` arguments.
Fixes #1497