FPTemporary update #2536

Closed
9il wants to merge 1 commit into dlang:master from 9il:math

Conversation

@9il
Member

@9il 9il commented Sep 19, 2014

Do not use x87 FPU stack when SSE2 is available.

FPTemporary update
@mihails-strasuns

Need someone more knowledgeable with FP details to verify this change.

@dnadlinger
Contributor

Pinging @donc @don-clugston-sociomantic. I'd need to think about this some more to figure out whether this is really the correct strategy.

Some quick thoughts: version (X86) seems troublesome, as the ABI might still mandate x87 stack returns and so on, even if the target supports SSE math. Also, making this depend on vector support seems a bit odd. For example, you can instruct GCC to emit x87 code on x86_64 and so on.

@9il
Member Author

9il commented Sep 21, 2014

Agreed. Is it possible to add version(x87code) or something like that?
The reason for this PR is to optimize std.complex, which uses FPTemporary. One algorithm in std.numeric also uses it.

Would it be better to reorganize std.complex so that it does not use FPTemporary?
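
For illustration only, the `version(x87code)` idea could look roughly like the sketch below. It assumes a user-defined version identifier, here called `UseX87Math`, which is a made-up name that no D compiler predefines; `Temporary` and `accumulate` are likewise hypothetical stand-ins and not the code from this PR or from Phobos.

```d
import std.traits : isFloatingPoint, Unqual;

// UseX87Math is a hypothetical, user-defined version identifier.
// No D compiler predefines it; with DMD it would have to be enabled
// explicitly by passing -version=UseX87Math on the command line.
version (UseX87Math)
    enum bool preferExtended = true;
else
    enum bool preferExtended = false;

// Hypothetical stand-in for FPTemporary: selects `real` for intermediates
// only when the build explicitly asks for x87-style math.
template Temporary(F)
if (isFloatingPoint!F)
{
    static if (preferExtended)
        alias Temporary = real;
    else
        alias Temporary = Unqual!F;
}

// Adds up a slice, accumulating in the selected intermediate type.
F accumulate(F)(const(F)[] values)
if (isFloatingPoint!F)
{
    Temporary!F acc = 0;
    foreach (v; values)
        acc += v;
    return cast(F) acc;
}

void main()
{
    import std.stdio : writeln;

    writeln(accumulate([0.5, 0.25, 0.125]));
}
```

Making the choice an explicit build flag sidesteps the question of whether `version (X86)` or vector support says anything about which unit the compiler actually uses for scalar math, at the cost of relying on the user to set the flag correctly.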

@don-clugston-sociomantic
Contributor

I'm not really sure about the concept of FPTemporary. A few observations:

  1. Using 'does it compile' as a test is a bit doubtful. Some 32-bit Intel CPUs support SSE2, even though AMD never did. Conceivably, some compilers could allow that code to compile. We could forbid compilers from doing that, but it would be a funky rule.
  2. Another architecture which has this behaviour is Itanium. It isn't x87.
  3. The other issue with intermediate precision being higher than final precision is support for FMA. That can give you effectively 128 bit precision in certain cases. This means that the concept of FPTemporary is a bit simplistic.
  4. Walter has advocated vague semantics, where a local variable of type 'float' or 'double' can be implemented as an 80 bit real on x87. That is, you may still get 'real' semantics even if you haven't specified it. In fact, to some extent there is no other option -- unless you do the silly thing that Java does, where you cripple performance by reducing precision. I have an ongoing dispute with Walter about point 4 (because you still need some way of enforcing extra precision to be abandoned) but certainly the C/C++ concept of 'sequence points' doesn't actually work.

Do you actually need to use FPTemporary? Because of point 4, it might not be necessary. Because of point 3, it might not be sufficient!
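
A small sketch may help show why intermediate precision matters for the points above. The function names `narrowDiff` and `wideDiff` are invented for this example. No expected output is given on purpose: as point 4 notes, a compiler may evaluate the `double` version at higher precision anyway, so the printed values depend on the compiler and target.

```d
import std.stdio : writefln;

// Difference computed with the intermediate declared as double.
// The compiler may still evaluate it at higher precision.
double narrowDiff(double a, double b)
{
    double t = a + b;
    return t - a;
}

// Same expression with the intermediate explicitly widened to real.
real wideDiff(double a, double b)
{
    real t = cast(real) a + b;
    return t - a;
}

void main()
{
    // mant_dig is the number of mantissa bits of each type on this target.
    writefln("double mantissa bits: %s, real mantissa bits: %s",
             double.mant_dig, real.mant_dig);

    // 1e16 + 1 is not representable in a 53-bit mantissa,
    // but it is representable in a 64-bit one.
    writefln("narrow: %s", narrowDiff(1e16, 1.0));
    writefln("wide:   %s", wideDiff(1e16, 1.0));
}
```

On a target where `real` is the 80-bit x87 format, `wideDiff` recovers the added `1.0`, while the `double` version may or may not, which is the ambiguity the observations describe.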

@9il
Member Author

9il commented Sep 22, 2014

I don't need FPTemporary. On the other hand, I do need std.complex. Can I make a PR to remove FPTemporary from std.complex?

@don-clugston-sociomantic
Contributor

I think you can probably remove FPTemporary from std.complex. See if it passes the test suite when you do.

@9il
Member Author

9il commented Sep 22, 2014

Thanks!

@9il 9il closed this Sep 22, 2014
@9il 9il deleted the math branch September 23, 2014 20:04
@9il 9il mentioned this pull request Oct 27, 2014

4 participants