Skip to content

Conversation

@firewave
Copy link
Collaborator

@firewave firewave commented Mar 20, 2022

This is the first of two PR introducing the usage of SmallVector in our code.

I adjusted the value locally based on profiling it with callgrind.
When using Clang and Boost the indicator that we need to increase the size of the vector are boost::container::vec_iterator<> boost::container::vector<>::priv_forward_range_insert_no_capacity<>(...) and operator delete(void*) calls in findAstNode(...).
When using no Boost and reserve() the indicator is amount of additional operator new(...) and operator delete(...) compared to the actual calls of findAstNode(...).
Update: With findAstNode() now being inlined it is now harder to trace this. You now need to follow the operator new() or operator delete() but you won't have a function or proper call count to compare against. It seem like memcpy@GLIBC_2.2.5 is now the indicator that the vector is being expanded and needs to reserve more space.

I still need to run a bigger test to see if we might need to increase the count even more.

Here's some numbers for a -O2 build using callgrind with --enable=all --inconclusive on the mame_regtest project, DISABLE_VALUEFLOW=1.

Clang 13
2,263,092,948 - std::vector
2,242,113,164 - std::vector (reserve(8))
2,263,074,159 - SmallVector (no Boost / size 8)
2,242,113,077 - SmallVector (no Boost / size 8 / reserve(8))
2,162,604,179 - SmallVector (Boost 1.74 / size 8)
2,167,203,248 - SmallVector (Boost 1.74 / size 8 / reserve(8))

GCC values coming soon.
I will file a ticket about the GCC difference soon.

Updated numbers below.

@firewave
Copy link
Collaborator Author

firewave commented Mar 25, 2022

The Visual Studio compilation failure was caused by the incorrect C++ standard being using by the *-PCRE configurations in the cli.vcxproj.

I opened #3937 for the compilation/selfcheck fixes.

@firewave
Copy link
Collaborator Author

@Ken-Patrick @pfultz2 @danmar

There's another intermediate step it seems.
Using reserve() on the underlying container helps to reduce unnecessary allocations if we are not using Boost.

    std::vector<T *> tokensContainer;
    tokensContainer.reserve(8);
    std::stack<T *, std::vector<T *>> tokens(std::move(tokensContainer));

Unfortunately that causes a performance regression when using Boost so that code requires to be conditional which is messy.

I wonder if what the reserve() call does could be achieved through the TaggedAllocator when using SmallVector without Boost as we are currently not using the given size. Unfortunately I have zero experience with custom std::allocator.

I provided values for the various cases in the description.

@firewave
Copy link
Collaborator Author

As std::vector::reserve() already provides a sizable improvement without need for any external dependencies I prepared #4158 which implements until SmallVector provides the same advantage in the non-Boost implementation.

@firewave
Copy link
Collaborator Author

firewave commented Jun 3, 2022

With GCC I still see (all?) calls to void boost::container::vector<>::priv_push_back<>(Token const* const&) during profiling. It seems boost::container::small_vector is not working as expected with GCC.

@firewave firewave changed the title astutils.h: use pre-sized SmallVector in visitAstNodes() astutils.h: use pre-sized SmallVector in visitAstNodes() Aug 19, 2022
@firewave
Copy link
Collaborator Author

This is almost done. 🙂
I still need to test it with the latest boost, add a build step to the CI and integrate it with the Visual Studio build (since it doesn't add any dependencies the official binaries should utilize it).

Here's a small preview of the performance win:

Clang 14 51,333,013,384 -> 37,000,473,137
GCC 12 55,161,312,808 -> 37,878,403,473

@firewave
Copy link
Collaborator Author

firewave commented Sep 8, 2022

I enabled the Boost usage for the sanitizer builds. That tests the build and runs the tests as well as (hopefully) providing a slight performance improvement for the sanitized runs.

The last remaining thing is to build the Windows version with Boost.

@firewave firewave force-pushed the smallvector branch 3 times, most recently from b76d0f8 to fe25c15 Compare September 28, 2022 11:16
@firewave
Copy link
Collaborator Author

This is finally ready for review. 🥳 I think the very slight Clang regression without Boost can be dismissed.

current -> SmallVector -> SmallVector + Boost 1.74

Performing a --enable=all --inconclusive scan on mame_regtest.

With DISABLE_VALUEFLOW=1:

Clang 14 2,026,151,486 -> 2,026,159,272 -> 1,854,855,377
GCC 12 1,994,489,219 -> 1,943,410,321 -> 1,842,068,039

With Valueflow enabled:

Clang 14 51,130,767,151 -> 51,131,952,779 -> 32,275,080,333
GCC 12 54,989,579,042 -> 49,360,382,494 -> 37,944,747,061

And scanning cli with --enable=all --inconclusive -Ilib -D __GNUC__ and DISABLE_VALUEFLOW=1

Clang 14 2,935,467,131 -> 2,935,473,066 -> 2,800,613,440
GCC 12 3,009,733,699 -> 2,968,684,311 -> 2,841,416,862

@firewave firewave marked this pull request as ready for review September 28, 2022 12:07
@danmar danmar merged commit 083efe6 into danmar:main Sep 29, 2022
@firewave firewave deleted the smallvector branch September 29, 2022 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants