JIT optimization: value re-numbering in CSE doesn't help span loops as much as array loops

CSE [updates the value numbers](https://github.com/dotnet/coreclr/blob/3c76c1fff9473f8a0051907453c7e1902a6c3647/src/jit/optcse.cpp#L2020-L2023), when a value being CSEd has a single definition, so that all uses have that same conservative value number.  This improves subsequent range check elimination, because assertions generated for those nodes will have the same value number.  This works just as well for spans as it does for arrays.
CSE also [updates the value numbers of comparisons against checked bounds](https://github.com/dotnet/coreclr/blob/3c76c1fff9473f8a0051907453c7e1902a6c3647/src/jit/optcse.cpp#L2035-L2065), since the code in range check opt that computes ranges for operands will consult the VNs on the compares, and only benefit if the VNs for the compares are functions of the same VNs for the bounds.  This part needs to be able to find the compares whose VNs need updating (which are compares that are functions of the expressions that become CSE uses).  For efficiency, rather than searching the trees at this point, CSE [builds a map](https://github.com/dotnet/coreclr/blob/3c76c1fff9473f8a0051907453c7e1902a6c3647/src/jit/optcse.cpp#L706-L712) eariler, during `optValnumCSE_Locate`, mapping checked bound uses to the compares that are functions of them.  Only compares that get inserted as values of this map get updated.  So ideally this map would include as keys all nodes that are occurrences of  any CSE candidate which has a checked bound VN.  But computing "does this candidate have a checked bound VN?" requires visiting all the occurrences of the candidate, and `optValnumCSE_Locate` runs while we are building the candidate lists, before we can enumerate them.  So this code would be [a bit more effective](https://github.com/dotnet/coreclr/pull/11521#issuecomment-300658768) if it were re-worked to update the compares that are functions of all the occurrences of relevant CSE candidates.

This optimization currently works better for arrays than for spans, since [this code](https://github.com/dotnet/coreclr/blob/3c76c1fff9473f8a0051907453c7e1902a6c3647/src/jit/valuenum.cpp#L3461-L3469) proactively reports all array length VNs as checked bound VNs, and so they all get included in the map built by `optValnumCSE_Locate`.  This special case for arrays could be removed if the update gets reworked.

I'm creating this issue so it doesn't get lost, but I'm putting it in the "Future" mile stone because it's not at all clear that it's worth pursuing; even adding this update for compares of arrays in the first place [was debatable](https://github.com/dotnet/coreclr/pull/9615#issuecomment-280144998), and the "motivating" case here is [a test in Span/IndexerBench](https://github.com/dotnet/coreclr/blob/3c76c1fff9473f8a0051907453c7e1902a6c3647/tests/src/JIT/Performance/CodeQuality/Span/Indexer.cs#L217-L220) not getting optimized as well as the equivalent array case, but with the subsequent change to promote implicit byrefs (#10453), we're already optimizing that "motivating" case at array parity without this change.
category:cq
theme:cse
skill-level:expert
cost:medium
impact:small

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JIT optimization: value re-numbering in CSE doesn't help span loops as much as array loops #8177

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

JIT optimization: value re-numbering in CSE doesn't help span loops as much as array loops #8177

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions