Introduce deduced parameter attributes, and use them for deducing readonly on indirect immutable freeze by-value function parameters.
#103172
|
Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt |
|
(rust-highfive has picked a reviewer for you, use r? to override) |
|
The patch is updated to use the Visitor to detect mutations of parameters. I'll mark it as non-draft if there are no more comments once the tests pass locally. |
|
This seems ready. Those two failures confuse me—I don't mutate the MIR at all, and these aren't codegen tests… |
|
@bors try @rust-timer queue |
|
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
|
⌛ Trying commit bf18d564e3d54e08ba4372e1d4b20ef1b6e3afaa with merge 9077e397fbfc2a1a5945228575b6ce77985fd1f8... |
|
Updated the PR to address comments. I added a new test to ensure that we don't mark non-freeze types as readonly. I also added a comment explaining why I don't think that the fact that moves semantically store undef to the moved-from value invalidates the optimization. |
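To illustrate why non-freeze types have to be excluded, here is a minimal sketch. `Counter` and `bump_via_shared_ref` are hypothetical names and are not taken from the test added in this PR. The parameter is never assigned to, yet its memory changes through a shared reference, so the absence of a visible mutation would not justify `readonly`. Whether `Counter` is actually passed indirectly depends on the target ABI.

```rust
use std::cell::Cell;

/// Hypothetical non-freeze type: `Cell` permits mutation through `&self`.
pub struct Counter {
    pub hits: Cell<u64>,
    /// Padding so the type is plausibly passed indirectly (ABI-dependent assumption).
    pub padding: [u64; 16],
}

/// `counter` is not declared `mut` and is never assigned to, but the memory
/// behind it is still written via interior mutability.
pub fn bump_via_shared_ref(counter: Counter) -> u64 {
    let shared: &Counter = &counter;
    shared.hits.set(shared.hits.get() + 1);
    counter.hits.get() + counter.padding[0]
}
```

Calling `bump_via_shared_ref` with `hits` set to 41 and zeroed padding returns 42, which shows that the parameter's memory was written even though no direct mutation appears in the source.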
|
☀️ Try build successful - checks-actions |
|
Queued a5cf94e7f6c6d3272682f3eeeb831ec529decd2f with parent dcb3761, future comparison URL. |
|
Finished benchmarking commit (a5cf94e7f6c6d3272682f3eeeb831ec529decd2f): comparison URL.
Overall result: ❌✅ regressions and improvements - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This benchmark run did not return any relevant results for this metric.
Cycles: This benchmark run did not return any relevant results for this metric.
|
@bors delegate+ code and perf lgtm now. r=me with commits squashed |
|
✌️ @pcwalton can now approve this pull request |
…adonly` on
indirect immutable freeze by-value function parameters.
Right now, `rustc` only examines function signatures and the platform ABI when
determining the LLVM attributes to apply to parameters. This results in missed
optimizations, because there are some attributes that can be determined via
analysis of the MIR making up the function body. In particular, `readonly`
could be applied to most indirectly-passed by-value function arguments
(specifically, those that are freeze and are observed not to be mutated), but
it currently is not.
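As a rough illustration of which parameters this paragraph has in mind, consider the sketch below. `Big`, `sum`, and `sum_after_reset` are hypothetical names, and whether `Big` is passed indirectly is an assumption that depends on the target ABI.

```rust
/// Hypothetical large type with no interior mutability (so it is freeze).
#[derive(Clone, Copy)]
pub struct Big {
    pub words: [u64; 32],
}

/// `big` is only read in the body: a candidate for `readonly` under the
/// rule described above.
pub fn sum(big: Big) -> u64 {
    big.words.iter().sum()
}

/// `big` is written in the body, so it would not be a candidate.
pub fn sum_after_reset(mut big: Big) -> u64 {
    big.words[0] = 0;
    big.words.iter().sum()
}
```

Nothing in the signatures `fn(Big) -> u64` distinguishes the two functions from the caller's side, which is why a signature-only analysis cannot tell them apart.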
This patch introduces the machinery that allows `rustc` to determine those
attributes. It consists of a query, `deduced_param_attrs`, that, when
evaluated, analyzes the MIR of the function to determine supplementary
attributes. The results of this query for each function are written into the
crate metadata so that the deduced parameter attributes can be applied to
cross-crate functions. In this patch, we simply check the parameter for
mutations to determine whether the `readonly` attribute should be applied to
parameters that are indirect immutable freeze by-value. More attributes could
conceivably be deduced in the future: `nocapture` and `noalias` come to mind.
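The shape of the deduction can be sketched with a toy model. This is not the code in the patch: `Use`, `Param`, and `deduce_readonly` are hypothetical stand-ins for the MIR visitor and the `deduced_param_attrs` query, and the real analysis works on MIR place contexts rather than a list of uses. Treating a move as a mutation is an assumption taken from the review snippet discussed later in this thread.

```rust
/// Simplified stand-in for the ways a function body can use a parameter.
/// These variants are illustrative and are not rustc's `PlaceContext`.
#[derive(Clone, Copy, Debug)]
pub enum Use {
    Read,
    SharedBorrow,
    Write,
    MutBorrow,
    Move,
}

/// Hypothetical summary of one by-value parameter.
pub struct Param {
    pub is_freeze: bool,
    pub passed_indirectly: bool,
    pub uses: Vec<Use>,
}

/// Returns one flag per parameter: `true` if `readonly` could be deduced
/// in this toy model.
pub fn deduce_readonly(params: &[Param]) -> Vec<bool> {
    params
        .iter()
        .map(|p| {
            // Any write, mutable borrow, or move counts as a mutation here.
            let mutated = p
                .uses
                .iter()
                .any(|u| matches!(u, Use::Write | Use::MutBorrow | Use::Move));
            p.passed_indirectly && p.is_freeze && !mutated
        })
        .collect()
}
```

In this model a parameter that is only read or shared-borrowed gets the flag, while a moved-from, non-freeze, or directly passed parameter does not.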
Adding `readonly` to indirect function parameters where applicable enables some
potential optimizations in LLVM that are discussed in [issue 103103] and [PR
103070] around avoiding stack-to-stack memory copies that appear in functions
like `core::fmt::Write::write_fmt` and `core::panicking::assert_failed`. These
functions pass a large structure unchanged by value to a subfunction that also
doesn't mutate it. Since the structure in this case is passed as an indirect
parameter, it's a pointer from LLVM's perspective. As a result, the
intermediate copy of the structure that our codegen emits could be optimized
away by LLVM's MemCpyOptimizer if it knew that the pointer is `readonly
nocapture noalias` in both the caller and callee. We already pass `nocapture
noalias`, but we're missing `readonly`, as we can't determine whether a
by-value parameter is mutated by examining the signature in Rust. I didn't have
much success with having LLVM infer the `readonly` attribute, even with fat
LTO; it seems that deducing it at the MIR level is necessary.
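The source-level pattern described here looks roughly like the following sketch. `Payload`, `checksum`, and `forward` are hypothetical names standing in for the `write_fmt`-style functions mentioned above. Whether a stack-to-stack copy is actually emitted or removed depends on the compiler version, optimization level, and target, so the example only shows the shape of the code, not a guaranteed codegen result.

```rust
/// Hypothetical large argument bundle, assumed to be passed indirectly.
#[derive(Clone, Copy)]
pub struct Payload {
    pub bytes: [u8; 256],
}

/// Callee: reads the payload and never writes to it.
#[inline(never)]
pub fn checksum(payload: Payload) -> u32 {
    payload.bytes.iter().map(|&b| u32::from(b)).sum()
}

/// Caller: forwards its own by-value parameter unchanged. The intermediate
/// copy made for this call is the one the paragraph above hopes to remove.
#[inline(never)]
pub fn forward(payload: Payload) -> u32 {
    checksum(payload)
}
```

Neither function mutates `payload`, so under the rule in this patch both parameters would be candidates for `readonly`.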
No large benefits should be expected from this optimization *now*; LLVM needs
some changes (discussed in [PR 103070]) to more aggressively use the `noalias
nocapture readonly` combination in its alias analysis. I have some LLVM patches
for these optimizations and have had them looked over. With all the patches
applied locally, I enabled LLVM to remove all the `memcpy`s from the following
code:
```rust
fn main() {
    println!("Hello {}", 3);
}
```
which is a significant codegen improvement over the status quo. I expect that
if this optimization kicks in in multiple places even for such a simple
program, then it will apply to Rust code all over the place.
[issue 103103]: rust-lang#103103
[PR 103070]: rust-lang#103070
|
cc @rust-lang/wg-unsafe-code-guidelines From my understanding, this optimization won't change the behavior of any sound programs. If we create a pointer/reference to a function argument (e.g. However, I haven't seen any mention of |
|
☀️ Test successful - checks-actions |
|
Finished benchmarking commit (eecde58): comparison URL.
Overall result: ❌✅ regressions and improvements - no action needed
@rustbot label: -perf-regression
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
|
@Aaron1011 is there a summary of what happens here for someone who doesn't live and breathe LLVM IR?^^ My question is basically the same as in #103103: What requirements do we need to impose on the MIR level to justify this attribute? "indirect immutable freeze by-value function parameter" is using a lot of terms that don't exist in MIR so I don't understand what this means. How can a parameter be both indirect and by-value?!? |
    PlaceContext::MutatingUse(..)
    | PlaceContext::NonMutatingUse(NonMutatingUseContext::Move) => {
        // This is a mutation, so mark it as such.
        self.mutable_args.insert(local.index() - 1);
NonMutatingUseContext::Move is a mutation? Looks like naming went wrong somewhere...?
It's not a mutation for borrowck purposes, since you don't need let mut to move out. The opsem might disagree with that, but we should decide the opsem first and then consider renaming something here
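The borrow-checker side of this point can be shown with a small sketch. `take_first` is a hypothetical name. The function moves a field out of a by-value parameter that is not declared `mut`, and the compiler accepts it. Whether that move counts as a write to the moved-from memory at the operational-semantics level is the open question in this exchange, and the example does not settle it.

```rust
/// Moves a field out of a by-value parameter that is not declared `mut`.
/// Borrowck accepts this, which is why the use is classified as
/// "non-mutating" in the naming discussed above.
pub fn take_first(pair: (String, String)) -> String {
    pair.0
}
```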
What is |
|
@RalfJung as far as I can tell, this only affects |