@firewave
Collaborator

I tested this with cli/filelister.cpp. I wanted to test it with more files, but with all the recent performance regressions it is agonizingly slow when running in valgrind.

The initial Ir count was 353,306,619.

Reducing the std::stack operations by not adding nullptr entries reduced it to 334,276,673 even though the checks added some overhead.
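As a rough illustration of the idea (not the actual Cppcheck code), the sketch below walks a binary tree with an explicit stack and only pushes children that exist, so the loop never pops a nullptr entry. `Node`, `visitTree` and `collectValues` are hypothetical names used for this example only.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Hypothetical stand-in for an AST node; this is not Cppcheck's Token class.
struct Node {
    int value = 0;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Pre-order traversal with an explicit stack. Children are checked before
// they are pushed, which trades one branch per child for fewer push/pop calls.
template <class Visitor>
void visitTree(Node* root, Visitor visit) {
    if (!root)
        return;
    std::stack<Node*> pending;
    pending.push(root);
    while (!pending.empty()) {
        Node* node = pending.top();
        pending.pop();
        visit(*node);
        // Push right first so that the left child is visited first.
        if (node->right)
            pending.push(node->right);
        if (node->left)
            pending.push(node->left);
    }
}

// Helper for demonstration: collects node values in visiting order.
std::vector<int> collectValues(Node* root) {
    std::vector<int> out;
    visitTree(root, [&out](const Node& n) { out.push_back(n.value); });
    return out;
}
```

The extra `if` checks are the overhead mentioned above; they pay off when many nodes have missing children.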

Switching the backend of the std::stack from std::deque to std::vector reduced it to 293,371,232. The main problem is that the construction of a std::deque is much more expensive. The destruction cost stays the same.
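For readers unfamiliar with the mechanism: `std::stack` is a container adaptor whose second template argument selects the underlying container, with `std::deque` as the default. A minimal sketch follows; the alias names and `sumAndDrain` are made up for this example, and the relative cost of the two backends depends on the compiler and standard library.

```cpp
#include <cassert>
#include <deque>
#include <stack>
#include <type_traits>
#include <vector>

// Default backend: std::deque.
using DequeStack = std::stack<int>;
// Same interface, but backed by std::vector.
using VectorStack = std::stack<int, std::vector<int>>;

static_assert(std::is_same<DequeStack::container_type, std::deque<int>>::value,
              "std::stack defaults to std::deque");
static_assert(std::is_same<VectorStack::container_type, std::vector<int>>::value,
              "second template argument selects the backend");

// Works with either backend because only the adaptor interface is used.
template <class Stack>
int sumAndDrain(Stack& s) {
    int sum = 0;
    while (!s.empty()) {
        sum += s.top();
        s.pop();
    }
    return sum;
}
```

Because calling code only touches `push`, `top`, `pop` and `empty`, switching the backend is a one-line change at the declaration.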

So in total this saves us about 17% of the Ir count (353,306,619 down to 293,371,232), of which about 12% comes from the container switch alone.

The main remaining issue is the creation and destruction of the std::stack, which still amounts to more than 22% of the total Ir count. So it's the same issue we are seeing with short-lived small std::vector objects, as discussed in #3432. It is also still the single most expensive function we call.

@danmar danmar merged commit 2148b8b into danmar:main Jan 17, 2022
@firewave firewave deleted the visitast branch January 17, 2022 19:37
@firewave
Collaborator Author

firewave commented Jan 18, 2022

These numbers apply to Clang 13 - with GCC 11 the speed-up from switching the container does not happen. I will try to dig into it and report it upstream.

I have some further optimizations which improve the speed with GCC to match current Clang and also improve Clang.

@firewave
Collaborator Author

While looking into this I came across more code generation differences between the compilers, such as llvm/llvm-project#53268.
