
Conversation

@StephanTLavavej
Member

Our recursive tuple implementation is a challenge for compiler throughput. By profiling the compiler, @xiangfan-ms identified something that we were previously unaware of. Name lookup within enable_if_t constraints can be expensive in certain situations (enormous tuples, where the recursive inheritance and pack expansion lead to quadratic behavior within the compiler, combined with permissive mode). This affects the names of structs/classes, but doesn't affect alias/variable templates for a reason which was novel to me - struct/class names become injected-class-names, so the compiler remembers that they can be member names, but alias templates and variable templates aren't declared as member names. Within a large tuple with many bases, name lookup for is_meow has to consider whether it could be an injected-class-name (which will consider the many base classes in permissive mode), but name lookup for is_meow_v can be optimized because the compiler knows that it has never been seen as a member.
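
For illustration, here's a minimal sketch of the pattern (placeholder names like `is_meow`/`is_meow_v` and a `sketch` namespace standing in for `std`; this is not the actual `<tuple>` code):

```cpp
#include <type_traits>

namespace sketch { // stand-in for namespace std

    // Stand-ins for a library trait: a struct template and the matching variable template.
    template <class T, class U>
    struct is_meow : std::is_constructible<T, U> {};

    template <class T, class U>
    inline constexpr bool is_meow_v = is_meow<T, U>::value;

    template <class... Ts>
    struct Tup; // primary template

    template <>
    struct Tup<> {}; // empty base case

    template <class T, class... Rest>
    struct Tup<T, Rest...> : Tup<Rest...> { // recursive inheritance: one base per element
        T value{};

        Tup() = default;

        // The constraint names a struct template: under /permissive, unqualified
        // lookup of is_meow must rule out an injected-class-name in each of this
        // specialization's (potentially many) base classes.
        template <class U, std::enable_if_t<is_meow<T, U>::value, int> = 0>
        Tup(U&& u, int) : value(static_cast<U&&>(u)) {}

        // The constraint names a variable template: is_meow_v can never be an
        // injected-class-name, so the compiler skips the member-name search.
        template <class U, std::enable_if_t<is_meow_v<T, U>, int> = 0>
        explicit Tup(U&& u) : value(static_cast<U&&>(u)) {}
    };

} // namespace sketch
```

In an enormous `Tup<...>`, the lookup work for the first constraint grows with the number of base classes in permissive mode, while the second constraint stays cheap regardless of tuple size.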

This is significant enough to motivate an exception to our usual convention, where we don't use _STD qualification for non-functions (because only functions are vulnerable to ADL). I've gone through <tuple> and looked at each of its enable_if_t constraints, and I've qualified all of the mentioned structs/classes with _STD (including tuple itself when forming another specialization). I didn't bother modifying the !_HAS_CONDITIONAL_EXPLICIT codepaths (they are used for CUDA only, and hopefully not forever). Also, plain tuple still refers to the tuple's own injected-class-name; AFAIK this is not a bottleneck (because it should be found immediately, preventing any lookup into base classes in permissive mode).
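
Roughly, the before/after looks like this (an illustrative sketch with a placeholder trait in a `sketch` namespace, not the actual diff; in the real headers, `_STD` expands to `::std::`):

```cpp
#include <type_traits>

namespace sketch { // stand-in for namespace std

    // Placeholder struct-template trait.
    template <class T, class U>
    struct is_meow : std::is_constructible<T, U> {};

    template <class... Types>
    class tuple {
    public:
        tuple() = default;

        // Pre-PR spelling: is_meow and the converting tuple specialization are
        // unqualified, so permissive-mode lookup has to consider the bases of
        // the real recursively-inheriting tuple for each of those names.
        template <class... Other,
            std::enable_if_t<std::conjunction_v<is_meow<Types, const Other&>...>, int> = 0>
        tuple(const tuple<Other...>&, int) {}

        // Post-PR spelling: the struct/class names are qualified (sketch:: here,
        // _STD in <tuple>), so lookup goes straight to namespace scope.
        template <class... Other,
            std::enable_if_t<std::conjunction_v<sketch::is_meow<Types, Other&&>...>, int> = 0>
        tuple(sketch::tuple<Other...>&&) {}
    };

} // namespace sketch
```

In the real <tuple>, enable_if_t, conjunction_v, and the other alias/variable templates stay unqualified per the reasoning above; they're spelled `std::` here only because the sketch lives outside namespace std.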

@StephanTLavavej StephanTLavavej added the throughput Must compile faster label Nov 20, 2020
@StephanTLavavej StephanTLavavej requested a review from a team as a code owner November 20, 2020 05:40
@StephanTLavavej
Member Author

@xiangfan-ms has profiled this PR and confirmed that it improves tuple throughput, especially when there are more than 50 or so template arguments. (100-ish arguments have been encountered in real-world code.) Also, no throughput regressions were observed even for smaller tuples; a 32-argument tuple actually sped up slightly, from 299 ms to 290 ms.

Contributor

@CaseyCarter left a comment

Just a request for more documentation.

@StephanTLavavej StephanTLavavej merged commit debeb6f into microsoft:master Dec 3, 2020
@StephanTLavavej StephanTLavavej deleted the tuple_throughput branch December 3, 2020 23:24
@StephanTLavavej
Member Author

Thanks again for writing the explanation here, @CaseyCarter - it's the difference between a good codebase and an excellent codebase. 😻
