Skip to content

Conversation

@stevengj
Copy link
Contributor

@stevengj stevengj commented Aug 3, 2018

Fixes two calls that were using fabs on a long double argument rather than fabsl, which looks like it is doing an unintentional truncation to double precision.

Fixes two calls that were using `fabs` on a `long double` argument rather than `fabsl`, which looks like it is doing an unintentional truncation to `double` precision.
@stevengj
Copy link
Contributor Author

stevengj commented Aug 3, 2018

Looks like there are similar usages in rotg.c, symv.c, zrotg.c, zsbmv.c, zhemv.c, rotmg.c, ztbsv_U.c, ztbsv_L.c, ztrti2_L.c, ztrti2_U.c, ...

@brianborchers
Copy link

brianborchers commented Aug 3, 2018

I was surprised to see that "long double" variables are being used anywhere in OpenBLAS. Although gcc on x86_64 uses 80 bit Intel extended precision for these, the performance is slow in comparison to AVX2 or AVX512 SIMD. Other x86_64 compilers might just use IEEE double precision for long double. On many other architectures there's no hardware support for an extended precision long double (and the compiler often turns long double into double precision.) It's also possible that some compiler might implement a quadruple precision long double in software, but that would be far slower than the x86_64 80 bit extended precision.

Thus there's no real way to count on this helping with precision, and furthermore, if you do get some kind of extended precision using long double it might have poor performance. A final concern is that users of BLAS/LAPACK routines don't expect extended precision and might be surprised by results which are too accurate.

Which of the BLAS/LAPACK functions in OpenBLAS make use of long double?

@martin-frbg
Copy link
Collaborator

At least some of these - most likely all - are unchanged from GotoBLAS2 (which predates AVX2).

@martin-frbg
Copy link
Collaborator

Also if you are concerned about excess precision on x86, you will probably want to set the FPU control word to use internal rounding to double precision (0x27 I think it was) if your operating system does not already default to that mode (*BSD does I believe). I have not tried what effect enforcing fpmath=sse would have, possibly this would be sufficient to demote the "long double" to "double" already.

@martin-frbg
Copy link
Collaborator

martin-frbg commented Aug 4, 2018

Perhaps the discussion could be moved to an issue ticket, if nobody actually questions the validity of the PR as such ? (Unfortunately I am not aware of a github function that would do this while keeping individual messages and attributions intact)

@brianborchers
Copy link

I opened a separate issue for this with a reference to the pull request.

@stevengj
Copy link
Contributor Author

stevengj commented Aug 4, 2018

If people agree that this PR is correct, it should probably be expanded to include the other calls to fabs that I mentioned before it is merged.

@martin-frbg
Copy link
Collaborator

The only instances of "long double" that I can find appear to be in interface/rotg.c and interface/zrotg.c, not sure why you think symv.c and others are affected ?

@martin-frbg martin-frbg merged commit e788102 into OpenMathLib:develop Aug 4, 2018
@brianborchers
Copy link

As I understand it, the previous code took long double values, rounded them to double, took the absolute value and then converted back to long double at the end, losing any extended precision. The new version should actually keep the extended precision.

@stevengj
Copy link
Contributor Author

stevengj commented Aug 5, 2018

@martin-frbg, I haven't looked at the other files in detail — I just listed the files for which I saw a similar compiler warning about fabsl. I would have to recompile to see the warnings again in detail.

@stevengj stevengj deleted the patch-1 branch August 5, 2018 02:08
@martin-frbg
Copy link
Collaborator

I did not see any other warnings with clang, nor any other uses of "long double".
@brianborchers the code is something like

void rotg(double *DA,...
long double da = *DA;
long double ada = fabsl(da);

@stevengj
Copy link
Contributor Author

stevengj commented Aug 5, 2018

@martin-frbg, if all the code using fabs(long double) is like that, i.e. operating on a value just converted from double, then fabs should be safe since (long double) ((double) ((long double) somedouble)) is the identity, and fabs may even be faster (though you would need to benchmark this, and in any case performance problaby doesn't matter for these lines).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants