Cranelift AArch64: Migrate Splat to ISLE#4521
cfallin
left a comment
LGTM, and thanks for the fixes to preferred/non-preferred vec regs as well -- seems to have helped in a few places!
Happy to merge once merge conflicts are resolved.
ff5054d to
02c8044
@cfallin I have also noticed that
@akirilov-arm yes, unfortunately... regalloc2 actually doesn't need any temporaries anymore, but aarch64 itself does. The reason is that a spillslot may be at a greater offset from If I recall correctly,
To follow up on that, I guess an alternative approach would be to reserve a small-offset slot to spill another victim to if we need to compute a spillslot address at a large distance away, so we can bootstrap our way there with no registers initially free. That's a little more complexity, but I wouldn't be averse to reviewing a PR if someone wants to attempt that.
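As a rough illustration of that bootstrap idea, here is a minimal Rust sketch. It is not Cranelift code: `Step`, `plan_spill`, `MAX_IMM_OFFSET`, and the register numbers are hypothetical, and the offset limit is only modeled on the scaled 12-bit unsigned-offset form of a 64-bit store. It only shows the order of operations: park a victim in the reserved near slot, borrow the victim as the address temporary, then restore it.

```rust
/// Illustrative limit on a directly encodable offset (assumption:
/// 12-bit unsigned immediate scaled by 8 for 64-bit accesses).
const MAX_IMM_OFFSET: u32 = 4095 * 8;

/// Hypothetical pseudo-instructions; not a real Cranelift type.
#[derive(Debug, PartialEq)]
enum Step {
    /// Store `reg` at `[base + offset]` using an immediate offset.
    StoreDirect { reg: u8, offset: u32 },
    /// Compute `base + offset` into `tmp`.
    ComputeAddr { tmp: u8, offset: u32 },
    /// Store `reg` through the address held in `tmp`.
    StoreViaTmp { reg: u8, tmp: u8 },
    /// Reload `reg` from `[base + offset]` using an immediate offset.
    LoadDirect { reg: u8, offset: u32 },
}

/// Plan a spill of `reg` to `offset` when no register is free.
/// `victim` is any other live register; `reserved_slot` is the
/// small-offset slot set aside for exactly this purpose.
fn plan_spill(reg: u8, offset: u32, victim: u8, reserved_slot: u32) -> Vec<Step> {
    if offset <= MAX_IMM_OFFSET {
        // Near slot: no temporary is needed at all.
        return vec![Step::StoreDirect { reg, offset }];
    }
    assert!(reserved_slot <= MAX_IMM_OFFSET, "reserved slot must be directly reachable");
    assert_ne!(reg, victim, "victim must differ from the register being spilled");
    vec![
        // 1. Park the victim in the reserved near slot.
        Step::StoreDirect { reg: victim, offset: reserved_slot },
        // 2. Reuse the victim as the address temporary.
        Step::ComputeAddr { tmp: victim, offset },
        // 3. Perform the far store through it.
        Step::StoreViaTmp { reg, tmp: victim },
        // 4. Restore the victim's original value.
        Step::LoadDirect { reg: victim, offset: reserved_slot },
    ]
}
```

The cost is two extra memory operations on the far path only; near spills are unchanged.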
Copyright (c) 2022, Arm Limited.
02c8044 to
2983e08
Another alternative is to reserve a vector register, which would give us the same space as 2 GPRs. One further idea - if the
Ah, that's a really interesting idea actually. I guess it's not needed for unwind (that starts from
The optimal solution would be to adjust the set of allocatable registers on a per-function basis, so that we could do the same for non-leaf functions (or leaf functions that use the stack) when
P.S. Actually I didn't mean to use
I'm curious about the small-offset spillslot idea. I haven't looked at aarch64 details and don't know enough about Cranelift internals yet, but my assumptions are:
Something along those lines?
@jameysharp yep, that's more or less it. @akirilov-arm re:
we could definitely change that! The only thing I want to hold as a hard requirement is that we don't build it dynamically per-function (because there are lots of tiny functions and that would be a nontrivial cost); right now we build it once when the compiler backend is constructed. We could perhaps build a few versions of it though, and return the right one in the
Anyway, we're getting off on quite a tangent here, but if you're interested, please feel free to file an issue to "reclaim spilltmp registers on aarch64" as a future enhancement to track this!
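For anyone picking this up, the "build a few versions once" idea could look roughly like the sketch below. Everything here is hypothetical and not the actual Cranelift API: `RegEnv`, `Backend`, `env_for`, the `may_need_spill_tmp` flag, and the register numbers in `SPILL_TMPS` are stand-ins chosen for illustration. The point is only that both variants are constructed when the backend is created, and each function just borrows one of them.

```rust
/// Hypothetical stand-in for a register environment:
/// just the list of allocatable register numbers.
#[derive(Debug)]
struct RegEnv {
    allocatable: Vec<u8>,
}

/// Registers treated as spill temporaries in this sketch (illustrative only).
const SPILL_TMPS: [u8; 2] = [16, 17];

/// Hypothetical backend holding every prebuilt variant.
struct Backend {
    /// Spill temporaries held back from the allocator.
    reserved: RegEnv,
    /// Spill temporaries handed to the allocator.
    reclaimed: RegEnv,
}

impl Backend {
    /// Build both variants once, at backend construction time.
    fn new() -> Self {
        let all: Vec<u8> = (0..=17).collect();
        let reserved: Vec<u8> = all
            .iter()
            .copied()
            .filter(|r| !SPILL_TMPS.contains(r))
            .collect();
        Backend {
            reserved: RegEnv { allocatable: reserved },
            reclaimed: RegEnv { allocatable: all },
        }
    }

    /// Per-function selection is a borrow, not a rebuild.
    /// How `may_need_spill_tmp` is decided is left open here.
    fn env_for(&self, may_need_spill_tmp: bool) -> &RegEnv {
        if may_need_spill_tmp {
            &self.reserved
        } else {
            &self.reclaimed
        }
    }
}
```

Since selection returns a reference to a prebuilt value, the per-function cost is a single branch, which keeps the requirement above for lots of tiny functions.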