Upgrade front-end & libs to v2.084.0 #2946
Conversation
std.internal.digest.sha_SSSE3 now supports Win64 too. It contains naked DMD-style asm which isn't easily portable to LLVM asm. For Win32, there are 2 more such modules - the reason we aren't shipping LTO libs for Win32 yet (but are for Win64). I took a pragmatic approach by compiling these modules separately - generating only the binary object file (skipping the bitcode one) and using it for both the binary and LTO libs.
It's been broken for core.math.ldexp since switching to the alias. The std.math.ldexp builtins were removed upstream, which makes sense, as they all rely on core.math.ldexp.
For proper OS and MODEL vars. E.g., OS is needed for the new stdcpp test, as OS=osx needs `-stdlib=libc++` in the clang command line.
A port from dmd/mars.d.
…-{invariants,preconditions,postconditions,contracts}
This is a breaking change, conforming to new DMD semantics.
The previous semantics were inconsistent, as -{enable,disable}-asserts
and -boundscheck (as well as new -{enable,disable}-switch-errors)
weren't overridden.
This conforms to a DMD breaking change too.
… etc. This is what DMD does AFAICT and fixes ldc-developers#599.
Nothing in particular, it's just the …
Cache length & ptr in DSliceValue, so that e.g. a pair constructed from a constant length and some ptr keeps returning a constant length instead of an extractvalue instruction every time the length is needed. This enables checking for matching constant lengths when copying slices and makes `test1()` in runnable/betterc.d work (memcpy instead of _d_array_slice_copy call):

```d
int[10] a1 = void;
int[10] a2 = void;
a1[] = a2[];
```

(more or less equivalent to `a1 = a2`, which is already optimized)
The slice copy optimization wasn't really needed either, but I thought it deserves a bit of attention.

```d
int[10] a1 = void;
int[10] a2 = void;
a1[] = a2[];
```

Old IR (v1.13):

```llvm
%a1 = alloca [10 x i32], align 4 ; [#uses = 2, size/byte = 40]
%a2 = alloca [10 x i32], align 4 ; [#uses = 1, size/byte = 40]
%1 = bitcast [10 x i32]* %a2 to i32* ; [#uses = 1]
%2 = insertvalue { i64, i32* } { i64 10, i32* undef }, i32* %1, 1 ; [#uses = 2]
%3 = bitcast [10 x i32]* %a1 to i32* ; [#uses = 1]
%4 = bitcast i32* %3 to i8* ; [#uses = 1]
%.ptr = extractvalue { i64, i32* } %2, 1 ; [#uses = 1]
%5 = bitcast i32* %.ptr to i8* ; [#uses = 1]
%.len = extractvalue { i64, i32* } %2, 0 ; [#uses = 1]
%6 = mul i64 %.len, 4 ; [#uses = 1]
call void @_d_array_slice_copy(i8* nocapture %4, i64 40, i8* nocapture %5, i64 %6) #1
%7 = bitcast [10 x i32]* %a1 to i32* ; [#uses = 1]
%8 = insertvalue { i64, i32* } { i64 10, i32* undef }, i32* %7, 1 ; [#uses = 0]
ret void
```

New IR:

```llvm
%a1 = alloca [10 x i32], align 4 ; [#uses = 2, size/byte = 40]
%a2 = alloca [10 x i32], align 4 ; [#uses = 1, size/byte = 40]
%1 = bitcast [10 x i32]* %a2 to i32* ; [#uses = 2]
%2 = insertvalue { i64, i32* } { i64 10, i32* undef }, i32* %1, 1 ; [#uses = 0]
%3 = bitcast [10 x i32]* %a1 to i32* ; [#uses = 1]
%4 = bitcast i32* %3 to i8* ; [#uses = 1]
%5 = bitcast i32* %1 to i8* ; [#uses = 1]
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %4, i8* align 1 %5, i64 40, i1 false)
%6 = bitcast [10 x i32]* %a1 to i32* ; [#uses = 1]
%7 = insertvalue { i64, i32* } { i64 10, i32* undef }, i32* %6, 1 ; [#uses = 0]
ret void
```

This optimization doesn't check for overlap, but neither does the existing optimization for static arrays on both sides (…). As to the actual …
Btw, the bounds are still checked for

```d
int[10] a1 = void;
int[10] a2 = void;
a1[0..256] = a2[0..256];
```

(but no subsequent …)
CircleCI OSX (64-bit only): Any idea what's missing? Incomplete Xcode installation?!
Btw @JohanEngelen, the strange LLVM 7.0 crashes for Travis/Linux are apparently back; master failed too...
One of those should be a …
How many functions are we talking about?
I don't know exactly, but inlining every non-templated runtime function usable/useful in …

I guess it depends on the new …
We could try forcing cross-module inlining of them (i.e. inlining them from D source)?
Nope, the regular D version throws Errors, and exceptions aren't supported by ….

This stuff (compiler support functions) needs to be tackled with a clear strategy IMO, for bare-metal and …
Wrt. the 32-bit Linux dmd-testsuite failures: running a single one of them manually (via SSH directly on the CI test machine) works. According to the logs, it seems like there are conflicts wrt. generated temp filenames for tests running in parallel (these files are removed immediately after each test). I didn't spot relevant changes in …
Argh, I think I found the culprit:

```d
private ulong bootstrapSeed() @nogc nothrow
{
    ulong result = void; // read below while still uninitialized!
    enum ulong m = 0xc6a4_a793_5bd1_e995UL; // MurmurHash2_64A constant.
    void updateResult(ulong x)
    {
        x *= m;
        x = (x ^ (x >>> 47)) * m;
        result = (result ^ x) * m;
    }
    import core.thread : getpid, Thread;
    import core.time : MonoTime;
    updateResult(cast(ulong) cast(void*) Thread.getThis());
    updateResult(cast(ulong) getpid());
    updateResult(cast(ulong) MonoTime.currTime.ticks);
    //printf(".: bootstrap seed [early]: %llx\n", result);
    result = (result ^ (result >>> 47)) * m;
    result = result ^ (result >>> 47);
    //printf(".: bootstrap seed: %llx\n", result);
    return result;
}
```

With …
Nice find. Small note: the optimizer plays no role. Using an uninitialized value is UB in D. https://dlang.org/spec/declaration.html#void_init
`core.thread` hasn't been touched in 2.084, but 3 assertions now fail with `-O`. I recall spurious `core.thread` issues on Win64 before; these may be related.
Finally green.
Upstreamed the std.random fix.
The … Although the callee …
I extracted the test into a separate module and looked at the optimized IR (not the asm) - it was fine, the register being set, then the …