
Conversation

@kunalspathak
Contributor

@kunalspathak kunalspathak commented Jul 22, 2022

Create a hash table to store the mapping of AssertionDsc to AssertionIndex.

Design:

  1. Since vnBased is passed into the Equals() method of AssertionDsc, it was not possible to use it in the hash table's KeyFuncs type. I tried various approaches such as creating a template, function pointers, inheritance, etc., but none of them solved the problem. In the end, I simply included the bool vnBased in AssertionDsc. This field is redundant, since the value is already stored on the Compiler object, and it increases the size of AssertionDsc from 48 bytes to 56 bytes. However, since the most common maximum number of assertions is 64, this works out to about 512 extra bytes per method during jitting. I also tried making it a static field on AssertionDscKeyFuncs, but if multiple methods are compiled at the same time, the value of vnBased could become stale for one of the threads. (A sketch of this design follows the list.)
  2. The hash map will be used only for checking already-added assertions. There is scope for having separate hash maps for other usage sites where we try to find an assertion, but in the instrumentation I did, I didn't see them iterating too many times.
  3. Hash code description: TODO
  4. Debug check against existing linear approach.
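
Below is a minimal, purely illustrative sketch of the workaround described in item 1, using std::unordered_map and made-up field names rather than the JIT's actual AssertionDsc and hash table: because a stateless KeyFuncs-style functor has nowhere to hold per-compilation state, the vnBased flag is carried inside the key itself so that hashing and equality can see it.

```cpp
// Purely illustrative sketch -- not the JIT's real AssertionDsc or hash table.
#include <cstddef>
#include <cstdint>
#include <unordered_map>

struct AssertionDscSketch
{
    int      assertionKind; // e.g. equal / not-equal / subtype (made-up encoding)
    unsigned op1Kind;
    unsigned op2Kind;
    intptr_t op1Value;
    intptr_t op2Value;
    bool     vnBased; // redundant copy of the compiler-wide flag; in the real
                      // Equals(), this selects which fields get compared

    bool operator==(const AssertionDscSketch& other) const
    {
        return (vnBased == other.vnBased) && (assertionKind == other.assertionKind) &&
               (op1Kind == other.op1Kind) && (op2Kind == other.op2Kind) &&
               (op1Value == other.op1Value) && (op2Value == other.op2Value);
    }
};

struct AssertionDscSketchHash
{
    size_t operator()(const AssertionDscSketch& d) const
    {
        // Simple field mixing; the real hash would be tuned for distribution.
        size_t h = static_cast<size_t>(d.assertionKind);
        h = h * 31 + d.op1Kind;
        h = h * 31 + d.op2Kind;
        h = h * 31 + static_cast<size_t>(d.op1Value);
        h = h * 31 + static_cast<size_t>(d.op2Value);
        h = h * 31 + (d.vnBased ? 1u : 0u);
        return h;
    }
};

// Maps an assertion descriptor to its 1-based index in the assertion table.
using AssertionIndexMapSketch = std::unordered_map<AssertionDscSketch, unsigned, AssertionDscSketchHash>;
```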

Fixes: #10592

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 22, 2022
@ghost ghost assigned kunalspathak Jul 22, 2022
@ghost

ghost commented Jul 22, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.


@kunalspathak kunalspathak changed the title from "Assertion prop" to "Hash the AssertionDsc struct for faster lookup" Jul 23, 2022
@AndyAyersMS
Member

Some thoughts:

  • I wonder if you should revisit the analysis I did in JIT: look at cost impact of assertion dup detection #10592 and see if it still seems to ring true.
  • Might be interesting to extract MC's with high assertion counts and just look at the TP impact on those. There's no easy way to do this sort of extraction currently.
  • Did you verify the hash functions gave decent hash distributions?
  • There are other places where we search the assertion table where it would be nice to do this same sort of fast lookup.

@kunalspathak
Contributor Author

kunalspathak commented Jul 27, 2022

Some thoughts:

I did some instrumentation, which is not exactly the same as what you had.

Here is how I instrumented it. I added instrumentation for the 5 types of lookups we do today: AddAssertion, SubType, SubRange, EqualOrNotEqual and NoNull. For each of the 5 categories, I added two counters: one that measures the number of times we found a match (counter name ending with Count) and one that measures the number of iterations we did to find that match (counter name ending with Iter). From those two counters, Iter / Count gives the average number of iterations before we found a match. Correct me if that is not an accurate metric to measure.
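
A rough sketch of what the two counters per category could look like (hypothetical names; this is not the actual instrumentation from the linked commit):

```cpp
// Hypothetical per-category counters, illustrating the "...Count" / "...Iter" pair.
struct AssertionLookupStatsSketch
{
    unsigned long long matchCount = 0; // lookups that found a match ("...Count")
    unsigned long long matchIter  = 0; // assertions scanned across those lookups ("...Iter")

    void RecordMatch(unsigned iterationsScanned)
    {
        matchCount++;
        matchIter += iterationsScanned;
    }

    double AvgItersPerMatch() const
    {
        return (matchCount == 0) ? 0.0 : double(matchIter) / double(matchCount);
    }
};

// One instance each for AddAssertion, SubType, SubRange, EqualOrNotEqual, NoNull.
```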

With that instrumentation, I ran superpmi benchmarks, libraries.pmi and asp.net.

benchmarks

| Type | Count | Iter | Avg. (iteration / match) |
| --- | --- | --- | --- |
| AddAssertion | 304815 | 3238908 | 10.62581566 |
| SubType | 1226 | 5262 | 4.292006525 |
| SubRange | 66 | 671 | 10.16666667 |
| EqualOrNotEqual | 2718 | 4476 | 1.646799117 |
| NoNull | 5957 | 8479 | 1.423367467 |

And here is a graph that maps the number of methods (Y-axis) to the number of iterations / match.

[graph: number of methods (Y-axis) vs. iterations / match, benchmarks collection]

libraries.pmi

| Type | Count | Iter | Avg. (iteration / match) |
| --- | --- | --- | --- |
| AddAssertion | 2170199 | 14038080 | 6.46856809 |
| SubType | 5503 | 16903 | 3.071597311 |
| SubRange | 1182 | 21236 | 17.96615905 |
| EqualOrNotEqual | 26052 | 48974 | 1.879855673 |
| NoNull | 52621 | 83920 | 1.594800555 |

[graph: number of methods (Y-axis) vs. iterations / match, libraries.pmi collection]

asp

| Type | Count | Iter | Ratio |
| --- | --- | --- | --- |
| AddAssertion | 447038 | 4735534 | 10.59 |
| SubType | 2848 | 7971 | 2.80 |
| SubRange | 23 | 577 | 25.09 |
| EqualOrNotEqual | 2108 | 3578 | 1.70 |
| NoNull | 6893 | 9866 | 1.43 |

Essentially, if I am interpreting the data correctly, there is not much benefit in hashing categories other than AddAssertion for faster lookup.

I have not instrumented the places where we iterate through all the assertions and try to find the best matching assertion (e.g. optAssertionIsNonNullInternal, optGlobalAssertionIsEqualOrNotEqualZero, etc.), because more detailed filtering happens there that I cannot generalize inside a hash function. I might have to create separate hash table(s) to do the lookups for each of these categories, as well as others like SubType and SubRange. We can certainly brainstorm how to come up with such a hash function; IMO that would unblock removing the assertion count limit.

  • Might be interesting to extract MC's with high assertion counts and just look at the TP impact on those.

There is some noise, but overall I see improvements. Below are the results of running superpmi 5 times on a handful of method contexts that have a high assertion count/iteration (last 2 columns).

| BEFORE | AFTER | Diff | MethodID | AddAssertionCount | AddAssertionIter |
| --- | --- | --- | --- | --- | --- |
| 0.7833 | 0.7959 | 1.61% | 1288 | 1005 | 23305 |
| 0.8283 | 0.8188 | -1.15% | 1288 | 1005 | 23305 |
| 0.8287 | 0.8628 | 4.11% | 1288 | 1005 | 23305 |
| 0.8139 | 0.8304 | 2.03% | 1288 | 1005 | 23305 |
| 0.891 | 0.8553 | -4.01% | 1288 | 1005 | 23305 |
| 0.8704 | 0.8336 | -4.23% | 6792 | 954 | 30966 |
| 1.0456 | 0.848 | -18.90% | 6792 | 954 | 30966 |
| 0.7753 | 0.7897 | 1.86% | 6792 | 954 | 30966 |
| 0.7818 | 0.8297 | 6.13% | 6792 | 954 | 30966 |
| 0.7933 | 0.8978 | 13.17% | 6792 | 954 | 30966 |
| 0.8098 | 0.7786 | -3.85% | 7322 | 1619 | 163731 |
| 0.8415 | 0.7977 | -5.20% | 7322 | 1619 | 163731 |
| 0.7746 | 0.7546 | -2.58% | 7322 | 1619 | 163731 |
| 0.8081 | 0.7753 | -4.06% | 7322 | 1619 | 163731 |
| 0.7757 | 0.7963 | 2.66% | 7322 | 1619 | 163731 |
| 0.8003 | 0.7895 | -1.35% | 10982 | 588 | 20762 |
| 0.7804 | 0.7331 | -6.06% | 10982 | 588 | 20762 |
| 0.8053 | 0.9039 | 12.24% | 10982 | 588 | 20762 |
| 0.7354 | 0.7928 | 7.81% | 10982 | 588 | 20762 |
| 0.8371 | 0.8674 | 3.62% | 10982 | 588 | 20762 |
| 0.8473 | 1.2294 | 45.10% | 14206 | 669 | 19263 |
| 0.8092 | 0.8866 | 9.57% | 14206 | 669 | 19263 |
| 0.8201 | 0.8509 | 3.76% | 14206 | 669 | 19263 |
| 0.8275 | 0.7842 | -5.23% | 14206 | 669 | 19263 |
| 0.8763 | 0.777 | -11.33% | 14206 | 669 | 19263 |
| 0.7974 | 0.7645 | -4.13% | 16249 | 1142 | 94749 |
| 0.7687 | 0.8726 | 13.52% | 16249 | 1142 | 94749 |
| 0.8487 | 0.8185 | -3.56% | 16249 | 1142 | 94749 |
| 0.8022 | 0.8126 | 1.30% | 16249 | 1142 | 94749 |
| 0.8002 | 0.7818 | -2.30% | 16249 | 1142 | 94749 |

There's no easy way to do this sort of extraction currently.

If you think this instrumentation will be helpful in the future, I can send a PR for the instrumentation I have already done in kunalspathak@b549c14.

  • Did you verify the hash functions gave decent hash distributions?

I do see 13 entries sometimes...let me try to narrow down.

  • There are other places where we search the assertion table where it would be nice to do this same sort of fast lookup

I think AddAssertion is good enough for now. We should discuss how we can fit the lookups done during assertion prop into it.

@kunalspathak
Contributor Author

I do see 13 entries sometimes...let me try to narrow down.

In the benchmarks superpmi collection, I see 3-4 methods that have 13 entries, but the number of iterations we did for the match was 18497 across 122 matches, so roughly 150 iterations to find a match. So I guess something < 15 still looks good.
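
For reference, here is one way a bucket/chain-length distribution can be inspected on a standard container; this is illustrative only, since the JIT's own hash table does not necessarily expose the same accessors.

```cpp
#include <cstdio>
#include <map>
#include <unordered_map>

// Prints how many buckets have each chain length for an unordered container,
// which is one way to eyeball hash distribution (illustrative only).
template <typename Map>
void PrintBucketDistribution(const Map& map)
{
    std::map<size_t, size_t> lengthHistogram; // chain length -> number of buckets
    for (size_t b = 0; b < map.bucket_count(); b++)
    {
        lengthHistogram[map.bucket_size(b)]++;
    }
    for (const auto& entry : lengthHistogram)
    {
        std::printf("chain length %zu: %zu buckets\n", entry.first, entry.second);
    }
}
```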

@kunalspathak
Contributor Author

kunalspathak commented Jul 28, 2022

because more detailed filtering happens there that I cannot generalize inside a hash function.

In addition to that, we are essentially scanning through a bunch of AssertionDscs to find the best match rather than extracting a particular entry, which makes me feel that we could have some sort of 1-to-many data structure (again, a hashmap of key to list of values), where we iterate through only, e.g., the entries whose assertionKind == OAK_EQUAL.
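
A hypothetical sketch of that 1-to-many idea: group assertion indices under a coarse key such as the assertion kind, and run the existing detailed filtering only over the matching bucket. Names and types here are illustrative, not the JIT's real data structures.

```cpp
#include <unordered_map>
#include <vector>

// Illustrative coarse key.
enum class AssertionKindSketch
{
    Equal,
    NotEqual,
    Subtype,
    SubRange
};

// 1-to-many structure: coarse key -> indices of candidate assertions.
using AssertionBucketsSketch = std::unordered_map<AssertionKindSketch, std::vector<unsigned>>;

// Record a newly added assertion under its kind.
inline void RecordAssertion(AssertionBucketsSketch& buckets, AssertionKindSketch kind, unsigned index)
{
    buckets[kind].push_back(index);
}

// "Best match" style lookup: walk only the OAK_EQUAL-like bucket and apply the
// detailed, category-specific predicate that cannot be folded into a hash key.
template <typename Predicate>
unsigned FindBestEqualAssertion(const AssertionBucketsSketch& buckets, Predicate matches)
{
    auto it = buckets.find(AssertionKindSketch::Equal);
    if (it != buckets.end())
    {
        for (unsigned index : it->second)
        {
            if (matches(index))
            {
                return index; // first acceptable candidate; real code may rank candidates
            }
        }
    }
    return 0; // 0 used as a "no assertion" sentinel in this sketch
}
```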

Here is the data for the benchmarks collection; on average it takes 4 iterations to find a match, so I am not sure it will be worth doing. On the other hand, if we increase the assertion limit, we might see these average iterations / match increase.

| Type | Count | Iter | Ratio |
| --- | --- | --- | --- |
| PropLclVar | 1277662 | 5621756 | 4.40 |
| PropEqualOrNot | 2102 | 7371 | 3.51 |
| PropEqualZero | 109 | 2028 | 18.61 |
| PropNonNull | 133846 | 338982 | 2.53 |
| PropBndChk | 5 | 18 | 3.60 |

Here are the numbers for libraries.pmi:

| Type | Count | Iter | Ratio |
| --- | --- | --- | --- |
| PropLclVar | 2573831 | 10841450 | 4.21 |
| PropEqualOrNot | 18970 | 69778 | 3.68 |
| PropEqualZero | 411 | 8143 | 19.81 |
| PropNonNull | 804509 | 2387847 | 2.97 |
| PropBndChk | 5 | 39 | 7.80 |

@AndyAyersMS
Member

Correct me if that is not an accurate metric to measure

I'm not sure I fully understand what you are measuring. Is this right?

Seems like good metrics would be

  • number of times we look for an assertion (broken out say by the 5 categories of lookup)
  • number of times we match (which you have: count)
  • number of assertions scanned when we match (which you have: iter)
  • number of times we fail to match
  • number of assertions scanned when we fail to match

The interesting cases to me are the non-AddAssertion cases when we don't find a match -- in those cases we are doing work for nothing and would end up doing even more work for nothing if we made the table bigger.

@kunalspathak
Contributor Author

Correct me if that is not an accurate metric to measure

I'm not sure I fully understand what you are measuring. Is this right?

Seems like good metrics would be

  • number of times we look for an assertion (broken out say by the 5 categories of lookup)
  • number of times we match (which you have: count)

That's right.

  • number of assertions scanned when we match (which you have: iter)

That's right.

  • number of times we fail to match
  • number of assertions scanned when we fail to match

Yes, I realized that I should include these two and will do it today.

The interesting cases to me are the non-AddAssertion cases when we don't find a match -- in those cases we are doing work for nothing and would end up doing even more work for nothing if we made the table bigger.

Agree.

@kunalspathak
Contributor Author

Here is the aggregated information for the benchmarks collection, collected on windows/x64 using kunalspathak@f98c74b and sorted by call count.

  • Category: Call sites at which measurements were collected.
  • Call count: Number of times the method was called.
  • Calls / method: There are 37835 methods in the collection; this column represents the average calls per method.
  • Match count: Number of times we found a match.
  • Matched iterations: Number of iterations performed before we found a match.
  • Iteration / match: Average number of iterations it took to find the matching assertion.
  • Missed count: Number of times we did not find a match.
  • Missed iterations: Number of iterations performed before we realized there is no match.
  • Iteration / missed: Average number of iterations it took before we realized there is no match.

| Category | Call count | Calls / method | Match count | Matched iterations | Iteration / match | Missed count | Missed iterations | Iteration / missed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| optAssertionProp_LclVar | 2059107 | 54.42 | 114601 | 473927 | 4.14 | 1944506 | 8676449 | 4.46 |
| AddAssertion | 1218385 | 32.20 | 304815 | 3238908 | 10.63 | 913570 | 7026107 | 7.69 |
| optGlobalAssertionIsEqualOrNotEqualZero | 220768 | 5.84 | 109 | 2028 | 18.61 | 220659 | 1061317 | 4.81 |
| optAssertionIsNonNullInternal | 199981 | 5.29 | 139803 | 347461 | 2.49 | 60178 | 303178 | 5.04 |
| SubType | 96002 | 2.54 | 1226 | 5262 | 4.29 | 94776 | 424824 | 4.48 |
| optLocalAssertionIsEqualOrNotEqual | 68958 | 1.82 | 2718 | 4476 | 1.65 | 66240 | 149151 | 2.25 |
| optGlobalAssertionIsEqualOrNotEqual | 47196 | 1.25 | 2102 | 7371 | 3.51 | 45094 | 233733 | 5.18 |
| optAssertionProp_BndsChk | 34049 | 0.90 | 5 | 18 | 3.60 | 24860 | 199594 | 8.03 |
| SubRange | 5571 | 0.15 | 66 | 671 | 10.17 | 5505 | 26797 | 4.87 |

Observations:

  • AddAssertion seems to be the 2nd hottest lookup site, and having a hash table should improve its performance.
  • We should optimize the lookups made in optAssertionProp_LclVar, probably by adding another hash table for assertions that have (assertionKind == OAK_EQUAL) && (op1.kind == O1K_LCLVAR) (see the sketch after this list).
  • For the others, given that the calls / method is low and the query criteria differ for each of them, it is better not to touch them. I am, however, thinking of doing smaller optimizations for them, such as keeping a flag for each of those categories and short-circuiting the method if we never created those assertions.
  • Since I have put in a lot of work to add the instrumentation, I will probably open a separate PR to get that in.
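
As a rough illustration of the second observation above (all names are hypothetical), a side table could map a local variable number to the indices of its (assertionKind == OAK_EQUAL) && (op1.kind == O1K_LCLVAR) assertions, so optAssertionProp_LclVar would only scan that local's candidates:

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical side map: local variable number -> indices of assertions of the
// shape (assertionKind == OAK_EQUAL) && (op1.kind == O1K_LCLVAR).
using LclVarAssertionMapSketch = std::unordered_map<unsigned, std::vector<unsigned>>;

// Called when such an assertion is added (illustrative).
inline void RecordLclVarAssertion(LclVarAssertionMapSketch& map, unsigned lclNum, unsigned assertionIndex)
{
    map[lclNum].push_back(assertionIndex);
}

// Propagation over a given local then scans only that local's candidates
// instead of the full assertion table (illustrative).
inline const std::vector<unsigned>* GetLclVarAssertionCandidates(const LclVarAssertionMapSketch& map,
                                                                 unsigned                        lclNum)
{
    auto it = map.find(lclNum);
    return (it == map.end()) ? nullptr : &it->second;
}
```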

@kunalspathak
Contributor Author

I am inclined to merge this as is and I will have follow-up PRs for:

  • Hash table to improve optAssertionProp_LclVar
  • Instrumentation for assertion stats
  • Micro-optimizations for other assertion lookups.

@kunalspathak kunalspathak marked this pull request as ready for review July 28, 2022 23:43
@kunalspathak
Contributor Author

@dotnet/jit-contrib

@kunalspathak
Contributor Author

The spmi asmdiffs and replay errors are related to re-publishing the log files.

@JulieLeeMSFT JulieLeeMSFT added this to the 8.0.0 milestone Aug 1, 2022
@kunalspathak
Contributor Author

@dotnet/jit-contrib

Comment on lines +2027 to +2028
if (optAssertionDscMap->Set(*newAssertion, optAssertionCount + 1, AssertionDscMap::SetKind::SkipIfExist,
&fastAnswer))
Contributor

It is not obvious: why do we need SetKind::SkipIfExist if we have LookupPointer?

Contributor Author

Because you would call GetHashCode() twice: first through LookupPointer and then during Set. With SkipIfExist, you call it just once.
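
To illustrate the difference (using std::unordered_map as a stand-in for the JIT's hash table, so this is only an analogy): a lookup followed by an insert hashes and probes the key twice, while a single insert-if-absent does it once and reports whether the key already existed, which is what SetKind::SkipIfExist provides here.

```cpp
#include <unordered_map>

// Illustrative only: std::unordered_map stands in for the JIT's hash table.

// Lookup-then-insert: the key is hashed and probed twice.
inline bool AddIfAbsentTwoProbes(std::unordered_map<int, unsigned>& map, int key, unsigned value)
{
    if (map.find(key) != map.end()) // hash/probe #1
    {
        return false; // already present
    }
    map.emplace(key, value); // hash/probe #2
    return true;
}

// Insert-if-absent: the key is hashed once and the call reports whether it was
// already there -- the behavior SkipIfExist is providing above.
inline bool AddIfAbsentOneProbe(std::unordered_map<int, unsigned>& map, int key, unsigned value)
{
    auto result = map.try_emplace(key, value); // single hash/probe
    return result.second;                      // true if newly inserted
}
```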

@AndyAyersMS
Member

SPMI doesn't see this as a TP win. Are there reasons to believe this is not accurate?

@kunalspathak
Contributor Author

SPMI doesn't see this as a TP win. Are there reasons to believe this is not accurate?

Not sure what the noise range is, but I agree that for this PR we should expect to see at least some TP improvements. I will think about it.

@SingleAccretion
Contributor

Not sure what the noise range is

The instrumentation is very precise.

@kunalspathak
Contributor Author

Not sure what the noise range is

The instrumentation is very precise.

Thanks for the reminder. In that case, it could be that the hash function itself is expensive, or that the time is consumed in traversing the chain.

@SingleAccretion
Contributor

FWIW, I would also expect some slowdowns from the introduced copying of AssertionDscs, especially in the "few assertions" cases (common during local propagation).

@kunalspathak
Contributor Author

I will close this PR for now and get back to it when I have time.

@ghost ghost locked as resolved and limited conversation to collaborators Oct 26, 2022
