This repository was archived by the owner on Jun 20, 2019. It is now read-only.

Update Phobos to 2.074.1 #539

Merged
ibuclaw merged 2 commits into D-Programming-GDC:master from ibuclaw:phobos2074
Aug 3, 2017

@ibuclaw ibuclaw commented Aug 3, 2017

I had to revert dlang/phobos#5017, as it broke separate compilation.

Also I made changes to the IEEE control functions to get/set the SSE flags as well.
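As a rough illustration of what reading the SSE flags in addition to the x87 flags amounts to, the sketch below merges an x87 status word with an MXCSR value. It is not the code from this PR: `combinedExceptionFlags` and its `haveSSE` parameter are hypothetical, the registers are passed in as plain integers instead of being read with inline asm, and the value of `EXCEPTIONS_MASK` is an assumption (the six exception flag bits, which sit in bits 0 to 5 of both registers on x86).

```d
// Sketch only. The helper name and the mask value are assumptions,
// chosen to mirror the identifiers visible in the diff fragments.
enum uint EXCEPTIONS_MASK = 0x3F; // assumed: IEEE exception flags, bits 0..5

/// Merge the x87 status word with MXCSR when SSE is available,
/// then keep only the exception flag bits.
uint combinedExceptionFlags(uint sw, uint mxcsr, bool haveSSE)
    pure nothrow @nogc @safe
{
    if (haveSSE)
        sw |= mxcsr; // both registers hold their exception flags in bits 0..5
    return sw & EXCEPTIONS_MASK;
}

unittest
{
    // x87 reports inexact (0x20), MXCSR reports divide-by-zero (0x04).
    assert(combinedExceptionFlags(0x20, 0x04, true) == 0x24);
    // Without SSE the MXCSR value is ignored.
    assert(combinedExceptionFlags(0x20, 0x04, false) == 0x20);
}
```

Keeping the helper free of asm makes the masking logic checkable on any target; the real functions would obtain `sw` and `mxcsr` from the hardware.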

}
else
return sw & EXCEPTIONS_MASK;
Member Author

@jpf91 here.

@ibuclaw can you upstream your changes?

Member Author

It's gdc-specific assembler syntax, but sure. We've already got some upstreamed to core.cpuid.

asm pure nothrow @nogc
{
"ldmxcsr %0" : : "m" (mxcsr);
}
Member Author

@jpf91 here

}
}
else version (X86_64)
{
asm pure nothrow @nogc
{
"xor %%rax, %%rax; fstcw %[cw];" : [cw] "=m" cont :: "rax";
"fstcw %0;" : "=m" cont;
Member Author

@jpf91 here

asm pure nothrow @nogc
{
"ldmxcsr %0" : : "m" mxcsr;
}
Member Author

@jpf91 and here are the relevant changes made to our asm.

Contributor

@jpf91 jpf91 left a comment

Interesting that toChars shows up here as well. I guess it triggers some quite specific DMD FE bug.

uint mxcsr;
asm pure nothrow @nogc
{
"stmxcsr %0" : "=m" (mxcsr);
Contributor

I wonder whether this works as expected on targets without SSE? IIRC on ARM the assembler complains if you try to assemble instructions not valid for the exact target. Not sure about X86 though.

Member Author

Without looking, it should be guarded by haveSSE, which uses core.cpuid for x86.

Contributor

Yes. I don't doubt this works fine at runtime but on ARM the assembler simply refuses to even assemble FPU instructions without passing certain FPU target flags to the assembler. For i386 the documentation says:

-march=CPU[+EXTENSION…]

This option specifies the target processor. The assembler will issue an error message if an attempt is made to assemble an instruction which will not execute on the target processor. The following processor names are recognized: i8086, i186, i286, i386, i486, i586, i686, pentium, pentiumpro, pentiumii, pentiumiii, pentium4, prescott, nocona, core, core2, corei7, l1om, k1om, iamcu, k6, k6_2, athlon, opteron, k8, amdfam10, bdver1, bdver2, bdver3, bdver4, znver1, btver1, btver2, generic32 and generic64.

In addition to the basic instruction set, the assembler can be told to accept various extension mnemonics. For example, -march=i686+sse4+vmx extends i686 with sse4 and vmx. The following extensions are currently supported: 8087, 287, 387, 687, no87, no287, no387, no687, mmx, nommx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, sse4, nosse, nosse2, nosse3, nossse3, nosse4.1, nosse4.2, nosse4, avx, avx2, noavx, noavx2, adx, rdseed, prfchw, smap, mpx, sha, rdpid, ptwrite, cet, prefetchwt1, clflushopt, se1, clwb, avx512f, avx512cd, avx512er, avx512pf, avx512vl, avx512bw, avx512dq, avx512ifma, avx512vbmi, avx512_4fmaps, avx512_4vnniw, avx512_vpopcntdq, noavx512f, noavx512cd, noavx512er, noavx512pf, noavx512vl, noavx512bw, noavx512dq, noavx512ifma, noavx512vbmi, noavx512_4fmaps, noavx512_4vnniw, noavx512_vpopcntdq, vmx, vmfunc, smx, xsave, xsaveopt, xsavec, xsaves, aes, pclmul, fsgsbase, rdrnd, f16c, bmi2, fma, movbe, ept, lzcnt, hle, rtm, invpcid, clflush, mwaitx, clzero, lwp, fma4, xop, cx16, syscall, rdtscp, 3dnow, 3dnowa, sse4a, sse5, svme, abm and padlock. Note that rather than extending a basic instruction set, the extension mnemonics starting with no revoke the respective functionality.

When the .arch directive is used with -march, the .arch directive will take precedent.
https://sourceware.org/binutils/docs/as/i386_002dOptions.html#i386_002dOptions

I'm just wondering how this works in practice. If we configure GCC with an SSE target, we know statically that SSE is available anyway. If we want to do a runtime check, we have to compile for a non-SSE target so that GCC does not emit SSE instructions in codegen, but then the assembler (as) will refuse to assemble these instructions, as far as I understand. Do we somehow have to emit the .arch directive?

Member Author

Hmm, I didn't see anything like this when looking here initially.

https://github.com/bminor/glibc/blob/master/sysdeps/i386/fpu/fenv_private.h

This of course can be checked by actually building with different march options.

Member Author

Yeah, I think this is a non-issue. I have no problems compiling a small test with -m32 -march=i386.

Contributor

OK, thanks for testing 👍

/* In the FPU control register, rounding mode is in bits 10 and
11. In MXCSR it's in bits 13 and 14. */
mxcsr &= ~(ROUNDING_MASK << 3); // delete old rounding mode
mxcsr |= (newState & ROUNDING_MASK) << 3; // write new rounding mode
Contributor

Why did upstream use enum/immutable here? Was that a manual optimization for DMD? ;-)

enum ROUNDING_MASK_SSE = ROUNDING_MASK << 3;
immutable newRoundingModeSSE = (newState & ROUNDING_MASK) << 3;

Member Author

Probably. Our code path doesn't have to be identical, just generate the same code.

Actually what's more likely is that the names are self documenting.
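To make the comparison in this thread concrete, here is a minimal sketch that puts the two spellings side by side and checks that they produce the same MXCSR value. The function names `withRoundingMode` and `withRoundingModeNamed` are hypothetical, the MXCSR value is passed in as a plain integer instead of being read with `stmxcsr`, and the value of `ROUNDING_MASK` is an assumption (bits 10 and 11 of the x87 control word, as the code comment in the diff describes).

```d
// Sketch only. ROUNDING_MASK's value is assumed from the comment in the diff:
// x87 control word bits 10 and 11, which map to MXCSR bits 13 and 14.
enum uint ROUNDING_MASK = 0xC00;

/// Inline form, as in the GDC code path quoted above.
uint withRoundingMode(uint mxcsr, uint newState) pure nothrow @nogc @safe
{
    mxcsr &= ~(ROUNDING_MASK << 3);           // delete old rounding mode
    mxcsr |= (newState & ROUNDING_MASK) << 3; // write new rounding mode
    return mxcsr;
}

/// Named form, using the identifiers quoted from upstream.
uint withRoundingModeNamed(uint mxcsr, uint newState) pure nothrow @nogc @safe
{
    enum ROUNDING_MASK_SSE = ROUNDING_MASK << 3;
    immutable newRoundingModeSSE = (newState & ROUNDING_MASK) << 3;
    return (mxcsr & ~ROUNDING_MASK_SSE) | newRoundingModeSSE;
}

unittest
{
    // 0x1F80 is the MXCSR reset value: all exceptions masked, round to nearest.
    assert(withRoundingMode(0x1F80, 0xC00) == 0x7F80); // round toward zero
    assert(withRoundingMode(0x7F80, 0x000) == 0x1F80); // back to nearest
    assert(withRoundingModeNamed(0x1F80, 0xC00) == withRoundingMode(0x1F80, 0xC00));
}
```

Both forms fold the shifted mask at compile time, so the choice between them is about readability, which matches the "self documenting" reading above.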

ibuclaw commented Aug 3, 2017

I assume, as with 2.073, that the unresolved failures are due to deprecated functions having been removed.

ibuclaw commented Aug 3, 2017

All good to go. Let's merge.

@ibuclaw ibuclaw merged commit fe8c157 into D-Programming-GDC:master Aug 3, 2017
@ibuclaw ibuclaw deleted the phobos2074 branch August 3, 2017 21:12

3 participants