|
Latest results with |
|
Because the partial sort speeds things up so much I took a look at profiling results with 45c91ed and after partial sort (83b555d) on OS X. In both cases, for one thread sampled, 22% of the thread is busy in Before:
After:
|
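For readers unfamiliar with the partial sort mentioned above, here is a minimal sketch of the general technique using `std::partial_sort`. The `top_k` helper and the use of plain `double` scores are assumptions for illustration; this is not the code from either commit.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: returns the k largest scores in descending order.
// Only the first k positions are fully ordered; the rest are left in an
// unspecified order, which is where the savings over a full sort come from.
std::vector<double> top_k(std::vector<double> scores, std::size_t k) {
    k = std::min(k, scores.size());
    auto middle = scores.begin() + static_cast<std::ptrdiff_t>(k);
    std::partial_sort(scores.begin(), middle, scores.end(),
                      std::greater<double>());
    scores.resize(k);  // drop everything past the sorted prefix
    return scores;
}
```

When only a small number of results is kept out of a large candidate set, this avoids ordering elements that are going to be discarded anyway.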
|
Sidenote as far as next steps here: |
|
Thanks @apendleton - per chat I'd love help on adjusting the benchmark to test the right thing. Feel free to commit to this branch or another with any fixes there. The other next steps here I see are:
|
|
Noting that while OS X results are impressive, results on linux/travis are not. So I need to see if travis is lying (very possible) or perhaps libstdc++ (default standard library on linux) is not as optimized for
master:
45c91ed (this branch, before partial_sort)
83b555d (this branch, after partial_sort) |
Yes! Also note that for deduping, the existing code in |
|
Update on my thinking on this PR:
I'm going to pause here since |
83b555d to 7ce6f0c
|
I've removed the experimental/not-quite-yet-ideal This means that I think this is ready to be tested in production since the remaining changes are clear improvements.
/cc @KaiBot3000 @aarthykc - would you be interested in running this out? My hope is that this will help reduce memory usage in production and therefore might also help performance. The steps would be:
If #96 is looking good, this could also be merged into master and tested alongside that as it is deployed. |
|
Note, after #116 lands I think it would be a good time to revisit this and get it landed. Basically making sure that the
+ Context() = delete;
+ Context(Context const& c) = delete;
+ Context& operator=(Context const& c) = delete;
Also as a future idea/ticket: it would be great to apply learnings on sorting optimizations from vtquery. Whoever wants to pick this up, reach out to @mapsam for tips on sorting optimizations in C++. |
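To illustrate what the deleted members in the diff buy, here is a minimal sketch of a move-only type. The `Context` below is a hypothetical stand-in: its `phrases_` member, its constructor, and the `collect` helper are assumptions for illustration, not the real class.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real Context; the member is illustrative only.
class Context {
public:
    explicit Context(std::vector<std::string> phrases)
        : phrases_(std::move(phrases)) {}

    // Same three deletions as in the diff: no default construction, no copies.
    Context() = delete;
    Context(Context const& c) = delete;
    Context& operator=(Context const& c) = delete;

    // Moves remain available and are marked noexcept.
    Context(Context&& c) noexcept : phrases_(std::move(c.phrases_)) {}
    Context& operator=(Context&& c) noexcept {
        phrases_ = std::move(c.phrases_);
        return *this;
    }

    std::size_t size() const { return phrases_.size(); }

private:
    std::vector<std::string> phrases_;
};

// Any accidental copy is now a compile error instead of a hidden allocation.
static_assert(!std::is_copy_constructible<Context>::value, "must not copy");
static_assert(!std::is_copy_assignable<Context>::value, "must not copy-assign");
static_assert(std::is_nothrow_move_constructible<Context>::value, "must move");

// Hypothetical helper: hands ownership of ctx to the container by moving it.
inline std::size_t collect(std::vector<Context>& out, Context&& ctx) {
    out.push_back(std::move(ctx));
    return out.size();
}
```

With the copy operations deleted, a call site that used to copy silently fails to build, which makes the remaining copies easy to find and convert to moves.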
|
We made Context noncopyable in #116. @springmeyer closing as I think ultimately that's all we wanted to carry over from here, but if there was other stuff from the earlier iterations of this PR that you wanted us to do as well, feel free to reopen. |
|
Sounds good @apendleton - that should help. I've ticketed #120 to log the idea of using |
This fixes an odd case (still need to research more into the why here) where Context objects were still being copied rather than moved (even though they had a move constructor enabled).
By moving these large objects we should avoid many allocations and help speed up performance.
On OS X this leads to a significant speedup. On Linux the speedup is smaller but still noticeable.
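The cause of the unexpected copies is not established in this thread. One well-known C++ behavior that produces the same symptom, offered here as an assumption and not a diagnosis of this code, is that `std::vector` copies elements during reallocation when the element's move constructor is not `noexcept`. The types and the `grow` helper below are hypothetical and only count which constructor runs.

```cpp
#include <cstddef>
#include <vector>

struct Counters {
    int copies = 0;
    int moves = 0;
};

// Hypothetical type: has a move constructor, but it is not noexcept.
struct ThrowingMove {
    static Counters counts;
    ThrowingMove() = default;
    ThrowingMove(ThrowingMove const&) { ++counts.copies; }
    ThrowingMove(ThrowingMove&&) { ++counts.moves; }
};
Counters ThrowingMove::counts;

// Hypothetical type: identical except the move constructor is noexcept.
struct NoexceptMove {
    static Counters counts;
    NoexceptMove() = default;
    NoexceptMove(NoexceptMove const&) { ++counts.copies; }
    NoexceptMove(NoexceptMove&&) noexcept { ++counts.moves; }
};
Counters NoexceptMove::counts;

// Grows a vector one element at a time and reports which constructor the
// reallocations used to relocate the existing elements.
template <typename T>
Counters grow(std::size_t n) {
    T::counts = Counters{};
    std::vector<T> v;
    for (std::size_t i = 0; i < n; ++i) {
        v.emplace_back();  // constructs in place; only reallocation relocates
    }
    return T::counts;
}
```

Calling `grow<ThrowingMove>(8)` reports copies and no moves, while `grow<NoexceptMove>(8)` reports moves and no copies. Deleting the copy constructor, as this PR does, removes the copy fallback entirely.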
OS X shows:
Master
This branch