Use scope for par_slice instead of join#152375
Conversation
|
r? @jieyouxu rustbot has assigned @jieyouxu. Use Why was this reviewer chosen?The reviewer was selected based on:
|
|
2 threads:
4 threads:
8 threads:
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Sanity check on single-threaded execution: @bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Use `scope` for `par_slice` instead of `join`
|
(Since we're doing a check, r=me if it comes back sane :D) |
This comment has been minimized.
This comment has been minimized.
|
The bot didn't want to tell us the results, but they're ready: neutral on icounts, but wall-time improves on the only @bors r=jieyouxu,lqd |
This comment has been minimized.
This comment has been minimized.
What is this?This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.Comparing ce0bf0b (parent) -> 4c37f6d (this PR) Test differencesShow 2 test diffs2 doctest diffs were found. These are ignored, as they are noisy. Test dashboardRun cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard 4c37f6d78c9c83e05caa0d1e43ce7d0c3f27538e --output-dir test-dashboardAnd then open Job duration changes
How to interpret the job duration changes?Job durations can vary a lot, based on the actual runner instance |
|
Finished benchmarking commit (4c37f6d): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)Results (primary 3.5%, secondary 7.2%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (secondary 2.4%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary -0.0%, secondary -0.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 479.824s -> 480.702s (0.18%) |
|
This change broke parallel frontend pretty severely, perhaps by unearthing some pre-existing issues perhaps not. Zulip discussion: #t-compiler/parallel-rustc > Steps to enable/stabilize multi-threaded rustc front-end @ 💬. |
| rustc_thread_pool::scope(|s| { | ||
| let proof = items.derive(()); | ||
| let group_size = std::cmp::max(items.len() / 128, 1); | ||
| for group in items.chunks_exact_mut(group_size) { |
There was a problem hiding this comment.
Hold on. chunks_exact_mut and not chunks_mut?
There was a problem hiding this comment.
Yes, looks like some items are simply not processed now.
There was a problem hiding this comment.
Well, that would skew performance numbers too.
|
Benchmark results above aren't real as this PR basically disables some code from running because of |
|
Updated benchmarks with 2 threads:
4 threads:
8 threads:
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Hmm, such large variance doesn't look good. Rcb benchmarks are very unreliable for multiple threads. Although all greens and above 5% stats gives me a reason to believe this is not a fluke. |
…rochenkov,jieyouxu Fix wrong par_slice implementation #152375 (comment)
…rochenkov,jieyouxu Fix wrong par_slice implementation rust-lang/rust#152375 (comment)
…rochenkov,jieyouxu Fix wrong par_slice implementation rust-lang/rust#152375 (comment)
…rochenkov,jieyouxu Fix wrong par_slice implementation rust-lang/rust#152375 (comment)
…,Zoxc Make `par_slice` consistent with single-threaded execution #152375 removed this consistency by switching from order preserving join to scope, which does not preserve order as stated in `par_fns` as well. This also makes `par_slice` behavior consistent with `par_fns`.
This uses
scopeinstead of nestedjoins inpar_sliceso that each group of items are independent and do not end up blocking on another.