Update Phobos to 2.074.1 #539
}
else
    return sw & EXCEPTIONS_MASK;
@ibuclaw can you upstream your changes?
It's gdc-specific assembler syntax, but sure. We've already got some upstreamed to core.cpuid.
asm pure nothrow @nogc
{
    "ldmxcsr %0" : : "m" (mxcsr);
}
    }
}
else version (X86_64)
{
    asm pure nothrow @nogc
    {
        "xor %%rax, %%rax; fstcw %[cw];" : [cw] "=m" cont :: "rax";
        "fstcw %0;" : "=m" cont;
asm pure nothrow @nogc
{
    "ldmxcsr %0" : : "m" mxcsr;
}
@jpf91 and here are the relevant changes made to our asm.
jpf91 left a comment:
Interesting that toChars shows up here as well, I guess it triggers some quite specific DMD FE bug.
uint mxcsr;
asm pure nothrow @nogc
{
    "stmxcsr %0" : "=m" (mxcsr);
I wonder whether this works as expected on targets without SSE? IIRC on ARM the assembler complains if you try to assemble instructions not valid for the exact target. Not sure about X86 though.
Without looking, it should be guarded by haveSSE, which uses core.cpuid for x86.
Yes. I don't doubt this works fine at runtime but on ARM the assembler simply refuses to even assemble FPU instructions without passing certain FPU target flags to the assembler. For i386 the documentation says:
-march=CPU[+EXTENSION…]
This option specifies the target processor. The assembler will issue an error message if an attempt is made to assemble an instruction which will not execute on the target processor. The following processor names are recognized: i8086, i186, i286, i386, i486, i586, i686, pentium, pentiumpro, pentiumii, pentiumiii, pentium4, prescott, nocona, core, core2, corei7, l1om, k1om, iamcu, k6, k6_2, athlon, opteron, k8, amdfam10, bdver1, bdver2, bdver3, bdver4, znver1, btver1, btver2, generic32 and generic64.
In addition to the basic instruction set, the assembler can be told to accept various extension mnemonics. For example, -march=i686+sse4+vmx extends i686 with sse4 and vmx. The following extensions are currently supported: 8087, 287, 387, 687, no87, no287, no387, no687, mmx, nommx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, sse4, nosse, nosse2, nosse3, nossse3, nosse4.1, nosse4.2, nosse4, avx, avx2, noavx, noavx2, adx, rdseed, prfchw, smap, mpx, sha, rdpid, ptwrite, cet, prefetchwt1, clflushopt, se1, clwb, avx512f, avx512cd, avx512er, avx512pf, avx512vl, avx512bw, avx512dq, avx512ifma, avx512vbmi, avx512_4fmaps, avx512_4vnniw, avx512_vpopcntdq, noavx512f, noavx512cd, noavx512er, noavx512pf, noavx512vl, noavx512bw, noavx512dq, noavx512ifma, noavx512vbmi, noavx512_4fmaps, noavx512_4vnniw, noavx512_vpopcntdq, vmx, vmfunc, smx, xsave, xsaveopt, xsavec, xsaves, aes, pclmul, fsgsbase, rdrnd, f16c, bmi2, fma, movbe, ept, lzcnt, hle, rtm, invpcid, clflush, mwaitx, clzero, lwp, fma4, xop, cx16, syscall, rdtscp, 3dnow, 3dnowa, sse4a, sse5, svme, abm and padlock. Note that rather than extending a basic instruction set, the extension mnemonics starting with no revoke the respective functionality.
When the .arch directive is used with -march, the .arch directive will take precedent.
https://sourceware.org/binutils/docs/as/i386_002dOptions.html#i386_002dOptions
I'm just wondering how this works in practice. If we configure GCC with an SSE target, we know statically that SSE is available anyway. If we want to do a runtime check, we have to compile for a non-SSE target so that GCC does not emit SSE instructions in codegen, but then the assembler (as) will refuse to assemble these instructions, as far as I understand. Do we somehow have to emit the .arch directive?
Hmm, I didn't see anything like this when looking here initially.
https://github.com/bminor/glibc/blob/master/sysdeps/i386/fpu/fenv_private.h
This of course can be checked by actually building with different march options.
Yeah, I think this is a non-issue. I have no problems compiling a small test with -m32 -march=i386.
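For readers following the thread, the runtime guard mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: `readMxcsrOrZero` and the `AnyX86` version identifier are hypothetical names, the asm statement copies the GDC form from the diff, and on any compiler other than GDC the function simply returns 0.

```d
import core.cpuid : sse;   // runtime CPUID query from druntime

// Collapse both x86 flavours into one version identifier (illustrative name).
version (X86)         version = AnyX86;
else version (X86_64) version = AnyX86;

/// Reads MXCSR when the CPU reports SSE, otherwise returns 0.
/// `readMxcsrOrZero` is a hypothetical helper, not part of Phobos.
uint readMxcsrOrZero()
{
    uint mxcsr = 0;
    version (GNU) version (AnyX86)
    {
        if (sse)
        {
            // GDC extended asm, same form as the stmxcsr line in the diff.
            asm nothrow @nogc
            {
                "stmxcsr %0" : "=m" (mxcsr);
            }
        }
    }
    return mxcsr;
}
```

The check on `sse` only decides whether the instruction is executed. Whether the assembler accepts it for a given -march value is the separate question raised in this thread.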
/* In the FPU control register, rounding mode is in bits 10 and
   11. In MXCSR it's in bits 13 and 14. */
mxcsr &= ~(ROUNDING_MASK << 3);            // delete old rounding mode
mxcsr |= (newState & ROUNDING_MASK) << 3;  // write new rounding mode
Why did upstream use enum/immutable here? Was that a manual optimization for DMD? ;-)
enum ROUNDING_MASK_SSE = ROUNDING_MASK << 3;
immutable newRoundingModeSSE = (newState & ROUNDING_MASK) << 3;
Probably. Our code path doesn't have to be identical, just generate the same code.
Actually, what's more likely is that the names are self-documenting.
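To make the bit arithmetic in this thread concrete, here is a small self-contained sketch using the named constants from the upstream variant. It assumes the x87 layout described in the diff comment, so `ROUNDING_MASK` is taken to be 0x0C00. `withRoundingMode` is an illustrative name and not a Phobos function.

```d
// Hypothetical, self-contained sketch; the constant mirrors the x87 layout
// quoted in the diff comment (rounding mode in control-word bits 10 and 11).
enum uint ROUNDING_MASK = 0x0C00;

// The same field sits three bits higher in MXCSR (bits 13 and 14).
enum uint ROUNDING_MASK_SSE = ROUNDING_MASK << 3;
static assert(ROUNDING_MASK_SSE == 0x6000);

/// Returns `mxcsr` with its rounding field replaced by the one in `newState`.
/// `withRoundingMode` is an illustrative name, not a Phobos function.
uint withRoundingMode(uint mxcsr, uint newState) pure nothrow @nogc @safe
{
    immutable newRoundingModeSSE = (newState & ROUNDING_MASK) << 3;
    return (mxcsr & ~ROUNDING_MASK_SSE) | newRoundingModeSSE;
}

// 0x1F80 is the MXCSR power-on default; 0x0400 selects round-down on x87.
static assert(withRoundingMode(0x1F80, 0x0400) == 0x3F80);
// Bits outside the rounding field are left untouched.
static assert(withRoundingMode(0x7F80, 0x0000) == 0x1F80);
```

Both spellings (inline shifts or named constants) compute the same value, which matches the point that the code paths only need to generate the same code.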
I assume, as in 2.073, that the unresolved failures are due to deprecated functions that have since been removed.
All good to go. Let's merge.
I had to revert dlang/phobos#5017, as it broke separate compilation.
Also I made changes to the IEEE control functions to get/set the SSE flags as well.