Use F64X2 as type when saving and restoring XMM registers#1499

Merged
iximeow merged 1 commit into bytecodealliance:master from samrat:fix-xmm-type
Apr 13, 2020

@samrat
Contributor

@samrat samrat commented Apr 11, 2020

When adding floating-point registers as callee-saved registers to block and function parameter lists, add them as F64X2 arguments.

Fixes #1497

@github-actions github-actions bot added the cranelift Issues related to the Cranelift code generator label Apr 11, 2020
@github-actions

Subscribe to Label Action

cc @bnjbvr

Details This issue or pull request has been labeled: "cranelift"

Thus the following users have been cc'd because of the following labels:

  • bnjbvr: cranelift

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

@@ -911,6 +907,7 @@ fn insert_common_prologue(
for reg in csrs.iter(FPR) {
// Append param to entry Block
let csr_arg = pos.func.dfg.append_block_param(block, types::F64);
Contributor

This line should be removed.

@bjorn3
Contributor

bjorn3 commented Apr 11, 2020

Rustfmt failed:

Diff in /home/runner/work/wasmtime/wasmtime/cranelift/codegen/src/isa/x86/abi.rs at line 1031:
         let mut fpr_offset = 0;
 
         for reg in csrs.iter(FPR) {
-            let value = pos
-                .ins()
-                .load(types::F64X2, ir::MemFlags::trusted(), stack_addr, fpr_offset);
+            let value = pos.ins().load(
+                types::F64X2,
+                ir::MemFlags::trusted(),
+                stack_addr,
+                fpr_offset,
+            );
             fpr_offset += types::F64X2.bytes() as i32;
 
             if let Some(ref mut cfa_state) = cfa_state.as_mut() {
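
The loop in the diff above advances `fpr_offset` by `types::F64X2.bytes()` for every callee-saved floating-point register, so each register gets a 16-byte slot. A minimal sketch of that offset arithmetic follows. It does not use Cranelift; the function `fpr_slot_offsets` and the constant `F64X2_BYTES` are hypothetical names introduced only for illustration, and the starting offset of 0 is taken from the `let mut fpr_offset = 0;` line in the diff.

```rust
/// Assumed size in bytes of one saved slot: an F64X2 value is two 64-bit lanes.
const F64X2_BYTES: i32 = 16;

/// Hypothetical helper: returns the offset of each saved register's slot,
/// mirroring the `fpr_offset += ...bytes()` pattern from the diff.
fn fpr_slot_offsets(count: usize, slot_bytes: i32) -> Vec<i32> {
    let mut offsets = Vec::with_capacity(count);
    let mut fpr_offset = 0;
    for _ in 0..count {
        offsets.push(fpr_offset);
        fpr_offset += slot_bytes;
    }
    offsets
}

fn main() {
    // Three saved registers with full 128-bit slots.
    println!("{:?}", fpr_slot_offsets(3, F64X2_BYTES));
    // The same three registers if only 8 bytes (an F64) were reserved each.
    println!("{:?}", fpr_slot_offsets(3, 8));
}
```

With 8-byte slots, consecutive 16-byte stores would overlap, which is why the slot size and the stored type have to agree.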

When adding floating-point registers as callee-saved register to
block- and function parameter lists add them as `F64X2` arguments.
Contributor

@iximeow iximeow left a comment

I hadn't even expected someone to pick this up so quickly - thank you!

 ; nextln: version: 1,
 ; nextln: flags: 0,
-; nextln: prologue_size: 26,
+; nextln: prologue_size: 25,
Contributor

The astute reader may ask, "why did prologue sizes change if this was supposed to be a no-op cleanup-style change?"

It turns out my "F64 causes a whole XMM register to be preserved anyway." comment became a lie between when I wrote it and when #1216 landed. Or maybe I was wrong the whole time? It was generating movsd (for example, f2 41 0f 11 33), whereas with this change Cranelift generates movups (41 0f 11 33, one byte shorter).

This is why prologues are shorter, but more importantly, movsd only works with the low 64 bits of its xmm argument! movups is the correct instruction (movaps would also be acceptable), which moves the entire 128 bits.
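
To make the difference concrete, here is a small model of the two spill strategies, written as a sketch rather than as Cranelift code. The `Xmm` struct and the functions `spill_low64`, `spill_full128`, and `reload` are hypothetical stand-ins for a 128-bit register, a movsd-style store, a movups-style store, and a full-width reload. The byte sequences are the two example encodings quoted in the comment above.

```rust
/// Hypothetical model of one 128-bit XMM register as two 64-bit lanes.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Xmm {
    lo: u64,
    hi: u64,
}

/// Example encodings quoted in the review comment.
const MOVSD_BYTES: [u8; 5] = [0xf2, 0x41, 0x0f, 0x11, 0x33];
const MOVUPS_BYTES: [u8; 4] = [0x41, 0x0f, 0x11, 0x33];

/// Models a movsd-style spill: only the low 64 bits reach the stack slot.
fn spill_low64(reg: Xmm, slot: &mut [u8; 16]) {
    slot[..8].copy_from_slice(&reg.lo.to_le_bytes());
}

/// Models a movups-style spill: all 128 bits reach the stack slot.
fn spill_full128(reg: Xmm, slot: &mut [u8; 16]) {
    slot[..8].copy_from_slice(&reg.lo.to_le_bytes());
    slot[8..].copy_from_slice(&reg.hi.to_le_bytes());
}

/// Reads both lanes back from the slot.
fn reload(slot: &[u8; 16]) -> Xmm {
    let mut lo = [0u8; 8];
    let mut hi = [0u8; 8];
    lo.copy_from_slice(&slot[..8]);
    hi.copy_from_slice(&slot[8..]);
    Xmm { lo: u64::from_le_bytes(lo), hi: u64::from_le_bytes(hi) }
}

fn main() {
    let original = Xmm { lo: 0x1111_1111_1111_1111, hi: 0x2222_2222_2222_2222 };

    // Low-64-only spill: the upper lane never reaches the (zeroed) slot.
    let mut slot = [0u8; 16];
    spill_low64(original, &mut slot);
    println!("low64 round trip preserved: {}", reload(&slot) == original);

    // Full-width spill: both lanes survive the round trip.
    let mut slot = [0u8; 16];
    spill_full128(original, &mut slot);
    println!("full128 round trip preserved: {}", reload(&slot) == original);

    // The two quoted encodings differ only by the leading f2 prefix byte.
    println!("same after prefix: {}", MOVSD_BYTES[1..] == MOVUPS_BYTES[..]);
}
```

The model zero-fills the slot before spilling, so the lost upper lane shows up as zero; the point is only that it no longer matches the original value.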

Member

Aha! I was about to ask that very question. (Don't know how astute of a reader I am, though.) Thanks for the explanation.

@iximeow iximeow merged commit 4d34c22 into bytecodealliance:master Apr 13, 2020
Labels

cranelift Issues related to the Cranelift code generator

Development

Successfully merging this pull request may close these issues.

Cranelift: express x86 callee-save FPR correctly

4 participants