Skip to content

Improve worst-case bounds#26

Merged
treeowl merged 2 commits intolspitzner:masterfrom
treeowl:stricter
Dec 7, 2021
Merged

Improve worst-case bounds#26
treeowl merged 2 commits intolspitzner:masterfrom
treeowl:stricter

Conversation

@treeowl
Copy link
Copy Markdown
Collaborator

@treeowl treeowl commented Nov 16, 2020

Previously, minView was amortized O(log n) but worst-case O(n).
Improve that to amortized and worst-case O(log n) (ignoring the
impact of repeated applications of mapMonotonic). In informal
testing, these changes lead to large performance improvements.

  • Previously, lots of things were suspended that didn't need to be.
    Document the actual laziness requirements with a debit invariant
    and be more eager where allowed.

  • Rework extractBin to calculate the minimum on the way down instead
    of on the way up. This avoids building a chain of thunks that
    (if forced) actually rebuilds the queue.

I chose to make the internal nodes of the binomial tree quite strict.
For most purposes, this is good. The only downside is that
mapMonotonic is now slower, since it cannot be "operationally fused"
with surrounding operations. This doesn't seem like a huge deal,
since I don't imagine mapping over priority queues is something
that happens all that much.

Closes #24

@treeowl treeowl force-pushed the stricter branch 5 times, most recently from 94d6426 to b6c2b22 Compare November 21, 2020 20:49
@treeowl treeowl mentioned this pull request Nov 22, 2020
Copy link
Copy Markdown
Owner

@lspitzner lspitzner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, all in all. Just some minor nitpicks and questions, in general this looks good to go.

But disclaimer: I did not benchmark this to any noteworthy degree, and I don't feel like I can make an educated comment on the "Debit invariants". So I am a bit lost on the exact implications of the strictness changes, sorry.

If you have any expressions to test the performance with / highlight the changes I can re-test them and play around a bit. But I would be fine with merging it; the optimisations make sense and otherwise I trust your judgement.

Comment thread src/Data/PQueue/Internals.hs
Comment on lines +165 to +164
-- The BinomTree and Succ constructors are entirely strict, primarily because
-- that makes it easier to make sure everything is as strict as it should
-- be. The downside is that this slows down `mapMonotonic`. If that's important,
-- we can do all the forcing manually; it will be a pain.
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. I tried looking for uses of mapMonotonic in the wild but could not find any (at least not in the pqueue revdeps on hackage or on github, although the github search may not be accurate). Anyway I don't think anyone will use mapMonotonic in some (inner) loop.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For key-value queues, this also affects fmap, but again, that's kind of a niche operation for a priority queue.

Comment thread src/Data/PQueue/Prio/Min.hs
Comment thread src/Data/PQueue/Internals.hs Outdated
Comment thread src/Data/PQueue/Prio/Internals.hs
@treeowl
Copy link
Copy Markdown
Collaborator Author

treeowl commented Nov 28, 2020

Looks good, all in all. Just some minor nitpicks and questions, in general this looks good to go.

But disclaimer: I did not benchmark this to any noteworthy degree, and I don't feel like I can make an educated comment on the "Debit invariants". So I am a bit lost on the exact implications of the strictness changes, sorry.

If you have any expressions to test the performance with / highlight the changes I can re-test them and play around a bit. But I would be fine with merging it; the optimisations make sense and otherwise I trust your judgement.

For the new approach to forcing on insert, look at the Hinze-Paterson finger tree paper. If you just build a new thunk around the old child thunk, you get thunk chains with O(n) worst-case resolution. An easy test is to build a big heap (some millions or tens of millions of elements) with insert, evaluate it, print a message, then print take 5 . toAscList of the thing. Previously, there'd be a significant lag (a few seconds) between the two outputs. With these changes, there is no lag.

@treeowl
Copy link
Copy Markdown
Collaborator Author

treeowl commented Nov 28, 2020

Could you explain what I was unclear about regarding the debit invariant? That's kind of important to understanding what's going on!

@konsumlamm konsumlamm mentioned this pull request Aug 14, 2021
@treeowl
Copy link
Copy Markdown
Collaborator Author

treeowl commented Aug 24, 2021

I think that should address all your concerns.

@lspitzner
Copy link
Copy Markdown
Owner

I have pushed a rebased version of this branch to this repository; there was only a very simple conflict. I can force-push to this branch as well, I just don't like to do that out of the blue.

Copy link
Copy Markdown
Owner

@lspitzner lspitzner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment thread src/Data/PQueue/Internals.hs Outdated
Comment thread src/Data/PQueue/Internals.hs Outdated
Comment thread src/Data/PQueue/Internals.hs Outdated
| minKey `lt` x -> incrExtract' le t ex
_ -> Extract x ts (Skip f)
where a `lt` b = not (b `le` a)
extractBin le0 = start le0
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it more efficient to pass around le0 instead of using it directly in the helper functions? At least the latter would be easier to read imo.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea why we don't just use Ord constraints to get specialization. Let's deal with that in a different PR.

Comment thread src/Data/PQueue/Max.hs Outdated
Comment thread src/Data/PQueue/Internals.hs Outdated
Comment thread src/Data/PQueue/Prio/Max.hs Outdated
@treeowl treeowl force-pushed the stricter branch 2 times, most recently from 53af002 to 6d6c414 Compare December 7, 2021 00:32
Previously, `minView` was amortized `O(log n)` but worst-case `O(n)`.
Improve that to amortized *and* worst-case `O(log n)` (ignoring the
impact of repeated applications of `mapMonotonic`). In informal
testing, these changes lead to large performance improvements.

* Previously, lots of things were suspended that didn't need to be.
  Document the actual laziness requirements with a debit invariant
  and be more eager where allowed.

* Rework `extractBin` to calculate the minimum on the way down instead
  of on the way up. This avoids building a chain of thunks that
  (if forced) actually rebuilds the queue.

I chose to make the internal nodes of the binomial tree quite strict.
For most purposes, this is good. The only downside is that
`mapMonotonic` is now slower, since it cannot be "operationally fused"
with surrounding operations. This doesn't seem like a huge deal,
since I don't imagine mapping over priority queues is something
that happens all that much.

* Force on cascade in `insertMin`.

* Expand the strictification to key-value queues. This should
  be good for performance of everything except `mapKeysMonotonic`,
  `mapWithKey`, and `fmap`.

Closes lspitzner#24

Improve list conversion

* Implement a strictly accumulating `fromAscList`.
* Use one less comparison per element in `fromList`.

Make min-replacement faster

Use `insertMin`/`incrMin` to avoid further comparisons when the
new key replaces the minimum.
@treeowl treeowl merged commit 69bcb19 into lspitzner:master Dec 7, 2021
@treeowl treeowl mentioned this pull request Dec 7, 2021
@konsumlamm konsumlamm mentioned this pull request Dec 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve worst-case bounds

3 participants