Add INLINABLE pragmas to most overloaded combinators #113
hvr merged 1 commit into haskell:master from lexi-lambda:inlinable-pragmas
I wonder if FWIW, it would be great if the description were part of the commit message. (The GitHub UI is nice enough that, when you create a PR from a single commit, it uses that commit's message as the PR description.) |
This adds INLINABLE pragmas to most exported combinators, which enables
cross-module specialization of the Stream constraint (which can in turn
enable further optimizations). This improves performance of these
combinators in scenarios where GHC chooses not to inline them, since
they may still be specialized instead.
This change is primarily in response to a performance regression
discovered by the GHC performance test suite when running haddock (since
haddock uses parsec). The full discussion is available here:
https://gitlab.haskell.org/ghc/ghc/-/merge_requests/3041
The gist is that, without these pragmas, performance relies too heavily
on inlining heuristics working out in our favor, and subtle changes in
the optimizer can cause regressions.
The GHC performance tests suggest this patch reliably reduces runtime of
haddock on base by 7–9% and allocation by 3–5%. Pretty good for doing
something so simple!
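To illustrate what the commit message describes, here is a minimal sketch of an overloaded combinator marked INLINABLE. It does not use parsec: `TokenSource`, `Input`, `takeWhileS`, and `digits` are hypothetical names invented for this example, with `TokenSource` standing in for a class constraint such as Stream. Everything lives in one file so the sketch is self-contained; the pragma matters most when the definition and the call site are in different modules.

```haskell
import Data.Char (isDigit)

-- Hypothetical stand-in for a Stream-like class: a source of Char tokens.
class TokenSource s where
  uncons :: s -> Maybe (Char, s)

-- Hypothetical concrete input type.
newtype Input = Input String

instance TokenSource Input where
  uncons (Input [])       = Nothing
  uncons (Input (c : cs)) = Just (c, Input cs)

-- An overloaded combinator: consume tokens while the predicate holds.
-- INLINABLE keeps the unfolding in the interface file, so a client module
-- that uses this at a known type can get a specialized copy even when GHC
-- decides not to inline it.
takeWhileS :: TokenSource s => (Char -> Bool) -> s -> (String, s)
takeWhileS p = go
  where
    go s = case uncons s of
      Just (c, rest) | p c -> let (cs, s') = go rest in (c : cs, s')
      _                    -> ([], s)
{-# INLINABLE takeWhileS #-}

-- A use at a concrete type; this is the kind of call site where
-- specialization of the class constraint can apply.
digits :: Input -> (String, Input)
digits = takeWhileS isDigit

main :: IO ()
main = putStrLn (fst (digits (Input "123abc")))  -- prints 123
```

Without the pragma, whether a client module gets an efficient version of `takeWhileS` depends on GHC deciding to expose and inline the definition, which is the reliance on heuristics that the commit message warns about.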
In this case, the important detail is that these combinators are available for cross-module specialization, which only happens with an
Yes, good point; I have dramatically extended the commit message. |
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#inlinable-pragma doesn't mention directly that I tried running This patch with Looks like that if one still slaps
Vanilla parsec GHC-8.8.3
Vanilla parsec GHC-8.10.1
With INLINABLE patch
With -fexpose-all-unfoldings
No effect:
With -fexpose-all-unfoldings (-fspecialise-aggressively in Cabal)
No effect:
With -fexpose-all-unfoldings, (-fexpose-all-unfoldings and -fspecialise-aggressively in Cabal)
Roughly the same numbers as with
Everything, INLINABLE and options |
It is documented a little further down, in the section titled |
Few more stats, for completeness: I compiled I don't see any drawbacks in adding |
Thanks everyone; this optimization-for-almost-nothing is a nice win indeed! |
See haskell/parsec#113 (comment) for benchmark results. This does speed up parsing.
This PR adds INLINABLE pragmas to most of the overloaded combinators exported by parsec, enabling cross-module specialization of the Stream constraint (which can in turn enable further optimizations). This improves performance of these combinators in scenarios where GHC chooses not to inline them, since they may still be specialized instead.
I took some rough measurements from running haddock on base (since haddock uses parsec), and I found that this patch reliably reduces runtime by 7–9% and allocation by 3–4%. A pretty good win for doing something so simple!
Adding INLINABLE pragmas is rather conservative, since they don’t affect inlining heuristics, they just ensure the (unoptimized) unfolding is exposed. megaparsec is much more aggressive in comparison, as it annotates many of its combinators with INLINE rather than INLINABLE. Some combinators in parsec might benefit from similar levels of inlining, but determining which inlinings are actually beneficial would require significantly more investigation, so this just makes the conservative change for now.
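The difference between the two pragmas can be sketched as follows. The functions `sumSquares` and `applyTwice` are hypothetical and unrelated to parsec or megaparsec; they only show where each pragma goes and what it asks of GHC.

```haskell
-- INLINABLE: expose the unfolding so the function can be specialized
-- (or inlined) elsewhere, but leave the inlining decision to GHC's
-- usual heuristics.
sumSquares :: Num a => [a] -> a
sumSquares = foldr (\x acc -> x * x + acc) 0
{-# INLINABLE sumSquares #-}

-- INLINE: additionally ask GHC to inline the function at its call sites,
-- which is the more aggressive choice.
applyTwice :: (a -> a) -> a -> a
applyTwice f = f . f
{-# INLINE applyTwice #-}

main :: IO ()
main = print (applyTwice (+ 1) (sumSquares [1, 2, 3 :: Int]))  -- prints 16
```

Both pragmas leave the program's result unchanged; they only influence how GHC optimizes it, which is why choosing between them calls for measurement rather than guesswork.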