[CompilerPerf] Changed Map.count + Set.count from O(n) to O(1) (AVL logic based on size not height) #5365

manofstick · 2018-07-22T07:27:45Z

Splitting #5360.

Currently Map stores the height of each node, and uses that to determine if it should rebalance itself. This change stores the count of values instead of the height - and rebalances a node when one side is over twice the size of the other - I believe algorithmically equivalent - although in practice it may slightly change performance, as for particular cases a better/worse tree could be in an inner loop. Intuitively I think having more information should build better trees, but I haven't attempted to prove this mathematically.

This makes Map.count an O(1) operation rather than an O(n) operation.
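As a rough illustration of why the stored field matters for count, here is a minimal sketch. The types `HTree` and `STree` and the helper names are hypothetical simplifications, not the FSharp.Core definitions: with only a height per node the count needs a full traversal, whereas with a size per node it is a single field read at the root.

```fsharp
// Hypothetical, simplified trees; not the actual FSharp.Core MapTree.
type HTree<'T> =
    | HEmpty
    | HNode of HTree<'T> * 'T * HTree<'T> * int   // last field: height

type STree<'T> =
    | SEmpty
    | SNode of STree<'T> * 'T * STree<'T> * int   // last field: size

// With only heights stored, counting has to visit every node: O(n).
let rec countByTraversal t =
    match t with
    | HEmpty -> 0
    | HNode (l, _, r, _) -> countByTraversal l + 1 + countByTraversal r

// With sizes stored, the count is read straight off the root: O(1).
let countBySize t =
    match t with
    | SEmpty -> 0
    | SNode (_, _, _, n) -> n

// Smart constructor that keeps the size field consistent.
let mkS l x r = SNode (l, x, r, countBySize l + 1 + countBySize r)

let sTree = mkS (mkS SEmpty 1 SEmpty) 2 (mkS SEmpty 3 SEmpty)
let hTree = HNode (HNode (HEmpty, 1, HEmpty, 1), 2, HNode (HEmpty, 3, HEmpty, 1), 2)
printfn "%d %d" (countByTraversal hTree) (countBySize sTree)   // prints "3 3"
```

The trade-off is that every constructor call has to maintain the size, which the smart constructor `mkS` does here.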

manofstick · 2018-07-22T07:39:34Z

This gist was about the same:

https://gist.github.com/manofstick/11b5ac3c3cf993ce32e69d04a6549161

This gist was slightly better, but by less than 5% across the board (the improvement may be masked by the lack of #5307, as much more time than necessary is lost in IComparer.Compare):

https://gist.github.com/manofstick/275fe8ed62091aec52cd382548719f2a

And this gist was consistently better with the new balancing:

https://gist.github.com/manofstick/e97dc9775bf01fd22b2f238cac9f1c27

test	bittage	percent
construct	64-bit	88%
construct	32-bit	81%
access	64-bit	89%
access	32-bit	81%

forki · 2018-07-23T06:49:44Z

Can you please apply the same to set.fs and TaggedCollections.fs (2 times)?

forki · 2018-07-23T06:50:02Z

src/fsharp/FSharp.Core/map.fs

        let empty = MapEmpty 

-        let height  = function
+        let size  = function


Please inline the size function.

manofstick · 2018-07-23

Not really keen to inline whilst MapOne exists. size as it stands has to do two attempted type casts, which easily outweigh a function call.
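For readers without the source open, the shape of the function under discussion can be sketched as follows. The union layout is an assumption modelled on the MapEmpty and MapOne cases named in this thread, with the trailing int of the node case taken to hold the size; it is not copied from map.fs.

```fsharp
// Assumed shape, modelled on the MapEmpty / MapOne cases named in the thread.
type MapTree<'Key, 'Value> =
    | MapEmpty
    | MapOne of 'Key * 'Value
    | MapNode of 'Key * 'Value * MapTree<'Key, 'Value> * MapTree<'Key, 'Value> * int  // int = size

// Not a single field read: the case has to be identified first,
// which is the cost the reply weighs against a function call.
let size = function
    | MapEmpty -> 0
    | MapOne _ -> 1
    | MapNode (_, _, _, _, s) -> s

let example = MapNode ("b", 2, MapOne ("a", 1), MapEmpty, 2)
printfn "%d" (size example)   // prints "2"
```

If the single-element case were removed, `size` would reduce to one case test plus a field read, which is presumably when inlining becomes more attractive.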

manofstick · 2018-07-23T09:24:18Z

Gist for quick Set check: https://gist.github.com/manofstick/b25efcf69fb7791357918d5a780ac438

Seems to be ~10% faster in this gist (create + union + element check), but I haven't done exhaustive testing...

test type	bittage	test sets size	%
sequential	32-bit	0	103%
sequential	32-bit	25	85%
sequential	32-bit	50	92%
sequential	32-bit	75	85%
sequential	32-bit	100	91%
sequential	32-bit	125	90%
sequential	32-bit	150	92%
sequential	32-bit	175	92%
sequential	32-bit	200	93%
sequential	32-bit	225	91%
sequential	32-bit	250	92%
random	32-bit	0	100%
random	32-bit	25	83%
random	32-bit	50	88%
random	32-bit	75	86%
random	32-bit	100	89%
random	32-bit	125	88%
random	32-bit	150	87%
random	32-bit	175	88%
random	32-bit	200	89%
random	32-bit	225	90%
random	32-bit	250	89%
sequential	64-bit	0	100%
sequential	64-bit	25	88%
sequential	64-bit	50	89%
sequential	64-bit	75	88%
sequential	64-bit	100	91%
sequential	64-bit	125	89%
sequential	64-bit	150	90%
sequential	64-bit	175	89%
sequential	64-bit	200	89%
sequential	64-bit	225	89%
sequential	64-bit	250	90%
random	64-bit	0	102%
random	64-bit	25	95%
random	64-bit	50	100%
random	64-bit	75	93%
random	64-bit	100	97%
random	64-bit	125	93%
random	64-bit	150	93%
random	64-bit	175	94%
random	64-bit	200	92%
random	64-bit	225	93%
random	64-bit	250	93%

manofstick · 2018-07-23T10:05:08Z

@forki

You can see in TaggedCollections.fs that the optimization to remove the single node has already been done (at least I assume #if ONE is not set anywhere for the build...). I'll push my modifications up as soon as I get a green build from Set.fs... (unless I fall asleep, which is possible :-)

forki · 2018-07-23T10:09:38Z

Yes, SetOne and MapOne can probably go as well, but please do that in a separate PR. It will make things easier for the VF# team to accept.


manofstick · 2018-07-24T06:25:00Z

@forki yes, yes, I just meant I would push the TaggedCollections.fs changes after green... (and after sleep!)

cartermp · 2018-07-24T15:40:26Z

@manofstick Just curious about this statement:

This change stores the count of values instead of the height - and rebalances a node when one side is over twice the size of the other - I believe algorithmically equivalent - although in practice it may slightly change performance, as for particular cases a better/worse tree could be in an inner loop.

I don't think this is true, as the current implementation re-balances once the height of one sub-tree is 2 higher than the other. This implementation re-balances after it's twice as high. I'm not familiar with the performance of AVL trees beyond what I learned in college, so I don't know what the long-term ramifications of this would be. But I'm certainly not opposed to basing count off of size and the initial performance results 😄
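The difference between the two rules can be made concrete with a small sketch. The tree type and both checks below are hypothetical and read the descriptions in this thread literally ("over twice the size" for the new rule); they are not the code from either implementation. The example pairs a three-node subtree with a one-node sibling: the heights differ by only 1, yet one side is more than twice the size of the other.

```fsharp
// Hypothetical plain tree; height and size are recomputed here for clarity only.
type Tree =
    | Leaf
    | Node of Tree * Tree

let rec height t =
    match t with
    | Leaf -> 0
    | Node (l, r) -> 1 + max (height l) (height r)

let rec size t =
    match t with
    | Leaf -> 0
    | Node (l, r) -> size l + 1 + size r

let single = Node (Leaf, Leaf)       // 1 node, height 1
let three = Node (single, single)    // 3 nodes, height 2

let heightGap = abs (height three - height single)   // 2 - 1 = 1
let sizeRuleFires = size three > 2 * size single      // 3 > 2
printfn "height gap = %d, size rule fires = %b" heightGap sizeRuleFires
// prints "height gap = 1, size rule fires = true"
```

So, under this literal reading, the two rules can disagree about when a node needs attention, even though both keep the tree height logarithmic in its size.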

manofstick · 2018-07-24T19:59:53Z

@cartermp

Yes, I don't mean the same trees; that was stated. I was probably a bit strong with the word "equivalent", but I meant computational complexity: we still get balanced binary trees, and they are still created using the same AVL transforms. But yes, the rest is a bit hand-wavy! (I've actually sent it to a mate at the University of British Columbia to analyse, but he's usually pretty busy. We'll see... At the moment I'm trusting intuition and tests.)

forki · 2018-07-25T05:36:19Z

I wonder if we now rebalance more often.

manofstick · 2018-07-25T07:14:01Z

@forki

It's possible. I haven't run the numbers. Anyway, I am seeing a not-insignificant performance improvement for "smallish" trees (thousands of nodes), so it's possible that it's just doing better rebalancing. This is also why I'm running both random and sequential data in the tests, in case there are degenerate sequences.

...and I'm willing to accept that in 1962, when AVL trees were first described, they were more interested in saving memory, so the cost of carrying the size with each node would have been decadent. But considering we were already carrying an int height, I think repurposing it was OK.

Anyway, this branch only has this change in it, so it can be tested in isolation...

manofstick · 2018-07-25T07:22:08Z

Another test: https://gist.github.com/manofstick/f285aa7b16025aabd4f60a4b8413ab81

bittage	ayende #	%
32-bit	250	64.00%
32-bit	500	66.15%
32-bit	750	77.73%
32-bit	1000	76.56%
32-bit	1250	77.52%
32-bit	1500	77.93%
64-bit	250	54.39%
64-bit	500	69.40%
64-bit	750	84.39%
64-bit	1000	84.53%
64-bit	1250	84.18%
64-bit	1500	83.75%

forki · 2018-07-25T07:23:25Z

What about very large sizes? Are we now restricting the count, since we track the size and not the height as an int?

manofstick · 2018-07-25T07:24:48Z

@forki

The first gist mentioned in #5365 (comment) deals with large n, where it seems to return to about the same performance.

manofstick · 2018-07-25T07:27:20Z

@forki - but I'll create some more tests over the days ahead...

forki · 2018-07-25T07:27:36Z

I didn't mean perf. I meant: are we getting issues with the count of elements when the size integer overflows? In theory the height overflows much later.
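The gap between the two limits can be put in numbers with a small sketch (the helper names are hypothetical). A binary tree of height h holds at most 2^h - 1 nodes, so a tree with Int32.MaxValue elements needs a height of at least 31, while a size field stored as a 32-bit int wraps on the very next insertion.

```fsharp
// Smallest height h whose capacity 2^h - 1 can hold n nodes.
let minHeightFor (n: int) =
    let rec go h (capacity: int64) =
        if capacity >= int64 n then h
        else go (h + 1) (capacity * 2L + 1L)
    go 0 0L

// Default (unchecked) int arithmetic wraps instead of failing.
let increment (n: int) = n + 1

printfn "%d" (minHeightFor System.Int32.MaxValue)   // prints "31"
printfn "%d" (increment System.Int32.MaxValue)      // prints "-2147483648"
```

Whether a collection of more than two billion elements is reachable in practice is a separate question that this sketch does not answer.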

manofstick · 2018-07-25T07:31:05Z

@forki - yes. Could be a showstopper?

forki · 2018-07-25T07:36:52Z

dunno. Not even sure if it is a real problem or just imaginary

zpodlovics · 2018-08-01T08:44:04Z

This data structure looks like a Weight-balanced tree.

@forki It seems that the other weight-balanced tree implementations solved it by keeping the height logarithmic in the size:

"In order to ensure performance, the algorithm keeps the height of a tree
logarithmic to its size by balancing the sizes of the subtrees in each node." [1] [2]

How is the balancing implemented?

"A weight-balanced tree (WBT) is a binary search tree, whose balance is based on the sizes
of the subtrees in each node. Although purely functional implementations on a variant
WBT algorithm are widely used in functional programming languages, many existing
implementations do not maintain balance after deletion in some cases. The difficulty lies
in choosing a valid pair of rotation parameters: one for standard balance and the other for
choosing single or double rotation. This paper identifies the exact valid range of the rotation
parameters for insertion and deletion in the original WBT algorithm where one and only
one integer solution exists. Soundness of the range is proved using a proof assistant Coq.
Completeness is proved using effective algorithms generating counterexample trees. For two
specific parameter pairs, we also proved in Coq that set operations also maintain balance.
Since the difference between the original WBT and the variant WBT is small, it is easy to
change the existing buggy implementations based on the variant WBT to the certified original
WBT with a rational solution." [1] [2]

[1] https://yoichihirai.com/bst.pdf
[2] http://www.mew.org/~kazu/proj/weight-balanced-tree/
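To make the two rotation parameters in the quoted abstract concrete, here is a minimal sketch of weight-balanced insertion. It is not the code from this PR. The names are illustrative, and the values delta = 3 and gamma = 2 are an assumption: I believe they are the integer pair the paper [1] arrives at for the original algorithm, but check the paper before relying on them.

```fsharp
// Sketch of weight-balanced insertion; all names here are illustrative.
type WTree<'T> =
    | Tip
    | Bin of int * 'T * WTree<'T> * WTree<'T>   // size, value, left, right

let delta = 3   // assumed balance parameter
let gamma = 2   // assumed single-vs-double rotation parameter

let size t =
    match t with
    | Tip -> 0
    | Bin (n, _, _, _) -> n

let bin x l r = Bin (size l + size r + 1, x, l, r)

// Balance is judged on size + 1, so an empty subtree has weight 1.
let isBalanced a b = delta * (size a + 1) >= size b + 1
let isSingle a b = size a + 1 < gamma * (size b + 1)

let singleL x l r =
    match r with
    | Bin (_, y, rl, rr) -> bin y (bin x l rl) rr
    | Tip -> failwith "singleL: empty right subtree"

let doubleL x l r =
    match r with
    | Bin (_, z, Bin (_, y, rll, rlr), rr) -> bin y (bin x l rll) (bin z rlr rr)
    | _ -> failwith "doubleL: unexpected shape"

let singleR x l r =
    match l with
    | Bin (_, y, ll, lr) -> bin y ll (bin x lr r)
    | Tip -> failwith "singleR: empty left subtree"

let doubleR x l r =
    match l with
    | Bin (_, y, ll, Bin (_, z, lrl, lrr)) -> bin z (bin y ll lrl) (bin x lrr r)
    | _ -> failwith "doubleR: unexpected shape"

// Rebuild a node whose right side may have become too heavy.
let balanceL x l r =
    if isBalanced l r then bin x l r
    else
        match r with
        | Bin (_, _, rl, rr) when isSingle rl rr -> singleL x l r
        | _ -> doubleL x l r

// Rebuild a node whose left side may have become too heavy.
let balanceR x l r =
    if isBalanced r l then bin x l r
    else
        match l with
        | Bin (_, _, ll, lr) when isSingle lr ll -> singleR x l r
        | _ -> doubleR x l r

let rec insert v t =
    match t with
    | Tip -> bin v Tip Tip
    | Bin (_, x, l, r) ->
        if v < x then balanceR x (insert v l) r
        elif v > x then balanceL x l (insert v r)
        else t

let t = List.fold (fun acc v -> insert v acc) Tip [1 .. 1000]
printfn "%d" (size t)   // prints "1000"
```

Sequential insertion is used in the usage line because it is the kind of input that degenerates an unbalanced binary search tree into a list.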

manofstick · 2018-09-17T07:54:18Z

@KevinRansom

Closing this. Was always kind of a side thing. Better to follow the path to #5463

Changed Map.count from O(n) to O(1)

c1c30a8

forki approved these changes Jul 22, 2018

forki reviewed Jul 23, 2018

Modified Set to use size rather than height

63dc9b5

manofstick changed the title ~~Changed Map.count from O(n) to O(1)~~ [CompilerPerf] Changed Map.count + Set.count from O(n) to O(1) (AVL logic based on size not height) Jul 23, 2018

TIHan added Tenet-Performance Area-Library Issues for FSharp.Core not covered elsewhere labels Jul 23, 2018

Tagged collections Map + Set count from O(n) to O(1) and rebalance via size

cbc0d28

manofstick closed this Sep 17, 2018

davidglassborow mentioned this pull request Jan 15, 2020

Net Core 3 support to compare new implementation of hashmap krauthaufen/ImmutableHashCollections#2

Open
