Skip to content
This repository was archived by the owner on Jun 26, 2020. It is now read-only.

Allow pinning a certain register in Cranelift, and use it as the heap base#960

Merged
bnjbvr merged 3 commits intobytecodealliance:masterfrom
bnjbvr:pinned-reg
Sep 6, 2019
Merged

Allow pinning a certain register in Cranelift, and use it as the heap base#960
bnjbvr merged 3 commits intobytecodealliance:masterfrom
bnjbvr:pinned-reg

Conversation

@bnjbvr
Copy link
Member

@bnjbvr bnjbvr commented Aug 30, 2019

Based on #959. So this is very experimental, probably a bit sketchy. This allows pinning one register (r15 on x86 64 bits) so it's entirely in the control of the embedder. In particular, it tries to exclude it from register allocation. Assertions in this code plus all my tests passing make me confident it isn't horribly wrong, but it doesn't prove this is incredibly good either. (I'm a bit worried about local diversions in particular.) Also, this is not enough tested at the moment, I'll try to add more test cases and fuzz this a bit (?).

Second commit adds a second setting that enables use of the pinned register as the heap base when legalizing heap_addr instructions. If the embedder does the Right Thing, and correctly sets the heap base with set_pinned_reg, this is enough to avoid reloading the heap base before every single memory access, and is a nice win in Spidermonkey (up to 10% in some of our benchmarks).

@iximeow
Copy link
Collaborator

iximeow commented Aug 31, 2019

🎉🎉🎉 i was hoping to make time next week to do just this with the Spidermonkey changes you mentioned recently! instead i'll test run Lucet's tests + wasm spec tests with these changes instead :)

are you aware of use cases for pinning n>1 registers? there's a few we've talked about with Lucet, but we haven't measured what improvement we'd get yet.

@bnjbvr
Copy link
Member Author

bnjbvr commented Sep 2, 2019

tadatadatada i was hoping to make time next week to do just this with the Spidermonkey changes you mentioned recently! instead i'll test run Lucet's tests + wasm spec tests with these changes instead :)

Note you still need to use the new instruction to set the heap base, before entering code compiled with Cranelift, and in certain precise points (when translating a call to an import, or an indirect call, or after wasm's mem.grow, and maybe others I'm not thinking of right now).

are you aware of use cases for pinning n>1 registers? there's a few we've talked about with Lucet, but we haven't measured what improvement we'd get yet.

Well, ideally this would be its own mechanism in register allocation, maybe as a hint that some value is very important and should be preserved in its own register for the entire function's lifetime. So you could extend this mechanism to multiple heaps (when that comes in wasm), or you could imagine a very important value to live in its own register (emulated stack pointer, e.g.), or on some RISC platforms it could contain an interesting constant value (zero register, as on MIPS).

Spidermonkey also pins a local instance's storage (called thread local storage), that's the "VmCtx" parameter that's passed in the Baldrdash signatures. It was a bit more complicated to emulate what's done for VmCtx, especially to tell Cranelift how to recompute the heap base (this is done in the wasm->clif translation that's done on Spidermonkey's side).

Did you have other uses in mind?

@bnjbvr
Copy link
Member Author

bnjbvr commented Sep 2, 2019

This is in better shape, with even more restrictions compared to previous iterations:

  • the pinned register may not be the target of local moves generated by the solver (regmove/regfill).
  • the pinned register may only be a definition of the current instruction if it's get_pinned_reg.

Any other case will trigger an assertion. Thanks to Julian's recent patches, the pinned reg (r15) was much likely to be taken as a fill's target, so I've got more confidence in this new version, which passes all of Cranelift, Embenchen and Spidermonkey testing. This is ready for review.

@bnjbvr bnjbvr marked this pull request as ready for review September 2, 2019 16:55
@bnjbvr bnjbvr requested a review from lars-t-hansen September 2, 2019 16:55
@iximeow
Copy link
Collaborator

iximeow commented Sep 4, 2019

poppin' back in to say that Lucet's tests all pass, wasm spectests pass (well, as much as they did before), and most benchmarks look ~10-20% better! We have a few that measure slower, and I'm going to investigate why. I suspect it's unrelated to this change.

Well, ideally this would be its own mechanism in register allocation, maybe as a hint that some value is very important and should be preserved in its own register for the entire function's lifetime.

This is about what I expected, too, yeah. I've been wondering if it makes sense to think about a register-pinned value as a particular flavor of GlobalValue, but that's not right since a global value would have a (much) wider lifetime than a "keep this value in this register" guidance that pinning a register really is.

Did you have other uses in mind?

We have a bit of code in Lucet to optionally count the number of wasm operations executed, which currently works by adding to a counter tacked on to our VmCtx via wasm instrumented as we hand it to cranelift-wasm. It'd be interesting to measure the impact of keeping that counter in a pinned register as well. If memory serves, we go through ~4 memory accesses per counter update (so, control flow changes), which could be 2 if VmCtx was less spill-y, and probably a lower but non-zero average if we were more careful about not immediately writing the counter back to memory. Right now the performance is not great, especially in tight loops :(

Copy link
Collaborator

@lars-t-hansen lars-t-hansen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me, as far as my knowledge goes -- I know little about the solver.

@bnjbvr bnjbvr merged commit 8e8b83f into bytecodealliance:master Sep 6, 2019
@bnjbvr bnjbvr deleted the pinned-reg branch September 6, 2019 14:18
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants