-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Description
When @BillyONeal ported the STL's build system from MSBuild to CMake, he discovered that we were building the release STL with /O2 /Os and replicated that, although we marked it with a TRANSITION comment since this seemed strange and undesirable. (The STL really wants inlining.)
Line 73 in bfa2cb2
| set(VCLIBS_RELEASE_OPTIONS "/O2;/Os") # TRANSITION: Potentially remove /Os |
Recently, @AlexGuteniev discovered that /Os was harming perf-critical codegen.
I attempted to remove /Os and found the following size effects on the primary DLL and static LIB (all sizes are in bytes):
| Arch | File | Before, /O2 /Os |
After, /O2 |
Change |
|---|---|---|---|---|
| x86 | msvcp140_oss.dll |
411,648 | 591,872 | +43.8% |
| x86 | libcpmt.lib |
29,416,878 | 31,426,992 | +6.8% |
| x64 | msvcp140_oss.dll |
553,984 | 648,192 | +17.0% |
| x64 | libcpmt.lib |
31,704,278 | 33,951,088 | +7.1% |
+43.8% for the x86 DLL is probably more than we can get away with. +17.0% for the x64 DLL is less extreme, although 5% or 10% combined with profiling data indicating speedups in different areas would be easier to justify.
We should investigate where this size increase is coming from, and whether /Os can be removed (combined with other changes) without increasing the size so much. The fact that x86 is strongly affected leads me to guess that EH machinery is involved.
MSVC-internal notes:
The corresponding MSBuild change is surprisingly simple. In src/vctools/crt/crt_build.settings.targets:
<ClOptimization Condition="'$(DebugBuild)' == 'true'">Disabled</ClOptimization>
<ClOptimization Condition="'$(DebugBuild)' == 'false'">MaxSpeed</ClOptimization>
+ <ClFavorSizeOrSpeed Condition="'$(DebugBuild)' == 'false'">Speed</ClFavorSizeOrSpeed>I believe that the /O2 /Os combination was unintentional - crt_build.settings.targets was setting ClOptimization to MaxSpeed, but there's a separate setting ClFavorSizeOrSpeed. This was being set to Size somewhere, I suspect by Microsoft.DevDiv.Native.targets.
The fix is to simply favor Speed for release mode. Changing crt_build.settings.targets affects the entire vctools/crt build which is exactly what I want to do here (although the STL cares the most about inlining, I believe that /O2 /Os is undesirable for all VCLibs). I verified that this affects the command lines being used to compile the STL (I searched for xrngdev as a uniquely-named source file):
C:\Temp\LOGS>grep -oP "((-|/)O\S+ )+(?=.*xrngdev)" plain-msbuild_x86chk.log | sort | uniq
/O2 /Os /Oy-
/O2 /Oy-
/Od /Oy-
C:\Temp\LOGS>grep -oP "((-|/)O\S+ )+(?=.*xrngdev)" updated-msbuild_x86chk.log | sort | uniq
/O2 /Ot /Oy-
/Od /Oy-
/O2 /Ot is fine as /O2 implies /Ot, see the docs. (There might be a way to omit /Ot entirely but I don't think that's worth the trouble of investigating.) Importantly, this verifies that /Od compilation is unaffected.
Note that both fe-components.settings.targets (for c1xx.dll) and utc.settings.targets (for c2.dll) favor Speed, so this isn't unusual, and those binaries care about performance to an extreme degree.