Cleanup the xarch perfscore handling #127109

Merged
tannergooding merged 6 commits into dotnet:main from tannergooding:perf-score
Apr 19, 2026

Conversation

@tannergooding
Member

Several instructions had no perf score handling, some overwrote the memory latency altogether, some failed to account for EA_64BYTE, and the memory latency did not account for SIMD operations being more expensive.

This fixes those issues by:

  • adding the missing perf score info or handling
  • changing how the result is built, by tracking memory vs instruction info separately and combining them consistently at the end of the method before returning
  • updating the various handlers to check for opSize >= EA_32BYTE or to check for each op size with an assert
  • ensuring that memory latency factors in the more expensive SIMD and floating-point loads
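
As a rough illustration of the second and fourth bullets, the sketch below tracks memory and instruction characteristics in separate values and merges them once at the end. Everything in it is hypothetical: the `ExecCharacteristics` struct, the `Combine` and `Estimate` helpers, the combination rule (add latencies, take the larger reciprocal throughput), and all numeric values are assumptions made for the example and are not taken from the PR.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the emitter's result type; field names are illustrative.
struct ExecCharacteristics
{
    float throughput; // reciprocal throughput in cycles (lower is better)
    float latency;    // cycles until the result is available
};

// Assumed combination rule (not from the PR): the slower of the two
// throughputs limits the instruction, and the latencies add up.
inline ExecCharacteristics Combine(const ExecCharacteristics& mem, const ExecCharacteristics& ins)
{
    return {std::max(mem.throughput, ins.throughput), mem.latency + ins.latency};
}

// Computes both halves independently so that per-instruction handling
// can never overwrite the memory cost by accident.
inline ExecCharacteristics Estimate(bool hasMemoryOperand, bool isSimdOrFloat)
{
    ExecCharacteristics mem = {0.0f, 0.0f};  // no memory access by default
    ExecCharacteristics ins = {0.25f, 1.0f}; // placeholder instruction cost

    if (hasMemoryOperand)
    {
        mem.throughput = 0.5f;                      // placeholder load throughput
        mem.latency    = isSimdOrFloat ? 5.0f : 4.0f; // placeholder: SIMD/FP loads cost more
    }

    // Single merge point at the end of the method.
    return Combine(mem, ins);
}
```

With these placeholder numbers, `Estimate(true, true)` yields a latency one cycle higher than `Estimate(true, false)`, and the register-only case keeps the pure instruction cost.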

Copilot AI review requested due to automatic review settings April 18, 2026 19:36
@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 18, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR cleans up xarch perfscore modeling in the JIT emitter by filling in missing instruction perf metadata and making memory vs. instruction costs combine consistently (including better handling for larger operand sizes like EA_64BYTE and SIMD/FP memory indirections).

Changes:

  • Extend/adjust xarch instruction metadata (instrsxarch.h) to provide latency/throughput values for previously-illegal or missing cases.
  • Refactor emitter::getInsExecutionCharacteristics to compute memory and instruction characteristics separately, then combine them deterministically.
  • Add/adjust perfscore constants in emit.h to support newly modeled costs.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

File | Description
src/coreclr/jit/instrsxarch.h | Updates many instruction entries to have explicit perfscore latency/throughput instead of ILLEGAL.
src/coreclr/jit/emitxarch.cpp | Refactors perfscore computation to separate memory vs instruction modeling; adds/updates handling for more instructions and operand sizes.
src/coreclr/jit/emit.h | Adjusts/adds perfscore constant definitions used by the updated modeling logic.

Comment thread src/coreclr/jit/emit.h Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 18, 2026 19:40
Contributor

Copilot AI left a comment

Pull request overview

This PR cleans up and expands xarch perfscore modeling by filling in missing instruction perf data, separating memory vs. instruction cost modeling in getInsExecutionCharacteristics, and improving handling for larger operand sizes (including EA_64BYTE) and SIMD/floating-point memory indirections.

Changes:

  • Add/adjust per-instruction throughput/latency entries in instrsxarch.h to cover previously unhandled cases and prevent overwriting memory latency.
  • Refactor emitter::getInsExecutionCharacteristics to compute memory costs separately and combine them consistently with instruction costs at the end.
  • Extend perfscore constants and memory throughput helpers in emit.h to support newly-modeled instructions and costs.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.

File | Description
src/coreclr/jit/instrsxarch.h | Updates instruction table perfscore entries (latency/throughput) across many xarch instructions, including SIMD/APX additions.
src/coreclr/jit/emitxarch.cpp | Refactors perfscore computation to separate memory vs instruction cost, adds new per-instruction handlers for additional SIMD ops, and updates size handling.
src/coreclr/jit/emit.h | Fixes/extends perfscore throughput/latency constants and adds memory throughput helper defines for xarch.

Comment thread src/coreclr/jit/emit.h Outdated
Comment thread src/coreclr/jit/emitxarch.cpp
Comment thread src/coreclr/jit/emitxarch.cpp
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 18, 2026 19:44
Contributor

Copilot AI left a comment

Pull request overview

This PR updates xarch (x86/x64) perf score modeling in the JIT emitter to provide more complete/consistent instruction throughput/latency handling, with special attention to larger operand sizes (e.g., EA_64BYTE) and SIMD memory costs.

Changes:

  • Fill in missing instruction perf score table entries for various xarch instructions.
  • Refactor getInsExecutionCharacteristics to model memory and instruction costs separately and combine them consistently at the end.
  • Add new perf score constants and introduce explicit memory throughput macros (RD/WR/RW).
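
To make the idea of explicit read/write/read-write memory throughput macros concrete, here is a minimal sketch. The macro names, the `MemAccess` enum, the `MemoryThroughput` helper, and every value are invented for this example; the actual definitions in emit.h are not shown in this conversation and may look quite different.

```cpp
#include <cassert>

// Hypothetical macro names and placeholder values; not copied from emit.h.
#define SKETCH_MEM_RD_THROUGHPUT 0.5f // assumed: two loads per cycle
#define SKETCH_MEM_WR_THROUGHPUT 1.0f // assumed: one store per cycle
#define SKETCH_MEM_RW_THROUGHPUT 1.0f // assumed: read-modify-write is bounded by the store

// Illustrative classification of how an instruction touches memory.
enum class MemAccess
{
    None,
    Read,
    Write,
    ReadWrite
};

// Picks the memory throughput from the access kind alone, independently
// of whatever the per-instruction handler later decides.
inline float MemoryThroughput(MemAccess access)
{
    switch (access)
    {
        case MemAccess::Read:
            return SKETCH_MEM_RD_THROUGHPUT;
        case MemAccess::Write:
            return SKETCH_MEM_WR_THROUGHPUT;
        case MemAccess::ReadWrite:
            return SKETCH_MEM_RW_THROUGHPUT;
        case MemAccess::None:
            break;
    }
    return 0.0f; // no memory operand, so no memory cost
}
```

Naming the three cases separately keeps the read, write, and read-modify-write costs visible at each use site instead of relying on a single shared constant.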

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

File | Description
src/coreclr/jit/instrsxarch.h | Adds/updates per-instruction perf score latency/throughput metadata for many xarch instructions.
src/coreclr/jit/emitxarch.cpp | Refactors perf score calculation to separate memory vs instruction cost modeling and expands handling for more SIMD/size cases.
src/coreclr/jit/emit.h | Fixes/extends perf score constant definitions and adds memory throughput helper macros.

Comment thread src/coreclr/jit/emitxarch.cpp
Comment thread src/coreclr/jit/emitxarch.cpp Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 18, 2026 19:49
Contributor

Copilot AI left a comment

Pull request overview

This PR refines xarch perf score modeling in the JIT emitter by filling in missing instruction perf metadata and by restructuring getInsExecutionCharacteristics to compute memory vs. instruction costs separately, then combine them consistently. This improves throughput/latency estimates (notably for larger vector widths and SIMD/floating-point memory operations) and avoids previous cases where instruction handling accidentally overwrote memory costs.

Changes:

  • Expanded/updated per-instruction latency/throughput metadata in instrsxarch.h (replacing many ILLEGAL entries).
  • Refactored emitter::getInsExecutionCharacteristics to track memory vs. instruction throughput/latency independently and merge at the end, including SIMD-sensitive memory latency adjustments.
  • Added/adjusted perfscore constants and memory throughput helpers in emit.h to support new modeling cases.
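
The operand-size handling mentioned in the PR description (checking `opSize >= EA_32BYTE`, or checking each size with an assert) might look roughly like the following. The `OpSize` enum is a local stand-in whose enumerators reuse the `EA_*` names only for readability, and the helper names and latency values are placeholders assumed for the example.

```cpp
#include <cassert>

// Local stand-in for the JIT's operand-size attribute; enumerator values
// here are simply the byte counts and are an assumption of this sketch.
enum OpSize : unsigned
{
    EA_16BYTE = 16,
    EA_32BYTE = 32,
    EA_64BYTE = 64,
};

// Style 1: a range check. Using ">=" means a 64-byte operand takes the
// wide path too, instead of silently falling into the 16-byte case.
inline float LatencyByRange(OpSize opSize)
{
    return (opSize >= EA_32BYTE) ? 3.0f : 1.0f; // placeholder latencies
}

// Style 2: one case per size, with an assert guarding the fallback so an
// unexpected size is caught in checked builds.
inline float LatencyBySwitch(OpSize opSize)
{
    switch (opSize)
    {
        case EA_64BYTE:
            return 4.0f; // placeholder
        case EA_32BYTE:
            return 3.0f; // placeholder
        default:
            assert(opSize == EA_16BYTE);
            return 1.0f; // placeholder
    }
}
```

The range form is shorter when 32-byte and 64-byte operands share a cost; the per-size form is the natural choice once they diverge.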

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

File | Description
src/coreclr/jit/instrsxarch.h | Populates missing per-instruction perfscore latency/throughput values so more instructions can use table-driven perf modeling.
src/coreclr/jit/emitxarch.cpp | Refactors perfscore calculation to separate memory vs. instruction characteristics and combine them consistently, with new SIMD memory-latency adjustments and added instruction handlers.
src/coreclr/jit/emit.h | Fixes/adds perfscore constants and defines memory throughput helpers used by the updated emitter perf modeling.

Comment thread src/coreclr/jit/emitxarch.cpp
Comment thread src/coreclr/jit/emit.h Outdated
@tannergooding
Member Author

tannergooding commented Apr 18, 2026

CC. @dotnet/jit-contrib, @EgorBo, @kg for review

@tannergooding tannergooding merged commit f942875 into dotnet:main Apr 19, 2026
131 of 139 checks passed
@tannergooding tannergooding deleted the perf-score branch April 19, 2026 20:48
3 participants